CompiledArtifacts Contract
The `CompiledArtifacts` schema defines the output of Floe’s compilation pipeline: the resolved, validated configuration that the runtime uses for execution. In the current alpha, use `make compile-demo` for the Customer 360 artifact path or `floe platform compile` for platform manifest validation; the root data-team `floe compile` command is planned but not yet implemented.
Overview
CompiledArtifacts is the single source of truth for pipeline execution. It contains:
- Resolved plugin configuration (after inheritance)
- Compiled transforms
- Governance policies
- Observability settings
- Secret-free deployment bindings for composed platform services
- Optional Data Mesh contracts

    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                                    INPUT                                    │
    │                                                                             │
    │  ┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐          │
    │  │    Manifest     │───►│    Manifest     │───►│   DataProduct   │          │
    │  │  (enterprise)   │    │    (domain)     │    │   (floe.yaml)   │          │
    │  │   [optional]    │    │   [optional]    │    │   [required]    │          │
    │  └─────────────────┘    └─────────────────┘    └─────────────────┘          │
    │                                                                             │
    └──────────────────────────────────┬──────────────────────────────────────────┘
                                       │
                                       ▼
                              ┌──────────────────┐
                              │ compile pipeline │
                              └────────┬─────────┘
                                       │
                                       ▼
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                                   OUTPUT                                    │
    │                                                                             │
    │  ┌─────────────────────────────────────────────────────────────────────┐    │
    │  │  CompiledArtifacts                                                  │    │
    │  │                                                                     │    │
    │  │  • metadata: compilation info, source hash                          │    │
    │  │  • mode: "simple" | "centralized" | "mesh"                          │    │
    │  │  • inheritance_chain: [manifest refs...]                            │    │
    │  │  • plugins: resolved compute, orchestrator, catalog, etc.          │    │
    │  │  • transforms: compiled dbt models                                  │    │
    │  │  • governance: classification, quality gates                        │    │
    │  │  • observability: traces, metrics, lineage config                   │    │
    │  │  • output_ports: [optional, mesh only]                              │    │
    │  │  • data_contracts: [optional, mesh only]                            │    │
    │  └─────────────────────────────────────────────────────────────────────┘    │
    │                                                                             │
    └─────────────────────────────────────────────────────────────────────────────┘

Schema Definition

    from datetime import datetime
    from typing import Any, Literal

    from floe_core.schemas.versions import COMPILED_ARTIFACTS_VERSION
    from pydantic import BaseModel


    class CompiledArtifacts(BaseModel):
        """Output of the compilation pipeline - unified for all deployment modes."""

        # Schema version
        version: str = COMPILED_ARTIFACTS_VERSION  # currently "0.16.0"

        # Compilation metadata
        metadata: CompilationMetadata

        # Product identity (ADR-0030)
        identity: ProductIdentity

        # Deployment mode
        mode: Literal["simple", "centralized", "mesh"]

        # Inheritance chain (for auditing and debugging)
        inheritance_chain: list[ManifestRef]

        # Resolved plugin configuration
        plugins: PluginConfig

        # Core pipeline configuration
        transforms: list[CompiledTransform]
        schedule: ScheduleConfig | None
        dbt: DbtConfig

        # Governance (merged from all inheritance levels)
        governance: GovernanceConfig

        # Observability
        observability: ObservabilityConfig

        # Data Mesh (optional - only present in mesh mode)
        output_ports: list[OutputPort] | None = None
        input_ports: list[InputPort] | None = None
        data_contracts: list[DataContract] | None = None

        # Deployment bindings (secret-free, renderer consumed)
        deployment: DeploymentConfig | None = None

        # Platform-visible raw Iceberg outputs created by ingestion sources
        ingestion_outputs: list[IngestionOutputTable] = []


    class CompilationMetadata(BaseModel):
        """Information about the compilation process."""

        compiled_at: datetime
        floe_version: str
        source_hash: str  # SHA256 of input files
        product_name: str
        product_version: str


    class ProductIdentity(BaseModel):
        """Product identity information from catalog registration.

        See ADR-0030 for the namespace-based identity model.
        """

        product_id: str  # "sales.customer_360"
        domain: str  # "sales"
        repository: str  # "github.com/acme/sales-customer-360"
        namespace_registered: bool  # True if registered in catalog
        registration_timestamp: datetime | None  # When first registered


    class ManifestRef(BaseModel):
        """Reference to a manifest in the inheritance chain."""

        name: str
        version: str
        scope: Literal["enterprise", "domain"]
        ref: str  # OCI reference

Deployment Bindings
Deployment bindings are the resolved, secret-free contract between plugin composition and deployment renderers.

    class DeploymentConfig(BaseModel):
        storage: StorageDeploymentBinding | None = None
        catalog: CatalogDeploymentBinding | None = None
        ingestion: IngestionDeploymentBinding | None = None

Storage plugins emit neutral storage desired state. `floe-core` validates
storage/catalog compatibility and records typed deployment bindings. Catalog
plugins translate storage state into catalog-owned deployment config after
compatibility passes.
Catalog bindings have two layers:
- Provider-specific details such as `CatalogDeploymentBinding.polaris` are for renderers that deploy that provider.
- Neutral projections such as `CatalogDeploymentBinding.iceberg_rest` and `CatalogDeploymentBinding.dbt` are for runtime consumers. dlt and future ingestion plugins consume `iceberg_rest`; compute plugins consume `dbt` or the generic projection and translate it into their adapter-specific profile shape.
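
The split between the two layers can be sketched from the consumer's side. This is a minimal illustration, not floe-core code: the dataclasses are simplified stand-ins for the Pydantic models in this section, `polaris` is reduced to a plain dict, and `catalog_uri_for_consumer`, the warehouse name, and the `polaris` payload are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class IcebergRestCatalogBinding:
    uri: str
    warehouse: str
    catalog_name: str = "iceberg"


@dataclass
class DbtCatalogBinding:
    profile_fragment: dict[str, Any] = field(default_factory=dict)
    iceberg_rest: IcebergRestCatalogBinding | None = None


@dataclass
class CatalogDeploymentBinding:
    provider: str
    polaris: dict[str, Any] | None = None  # provider-specific; simplified to a dict here
    iceberg_rest: IcebergRestCatalogBinding | None = None
    dbt: DbtCatalogBinding | None = None


def catalog_uri_for_consumer(binding: CatalogDeploymentBinding) -> str:
    """Hypothetical helper: read the catalog URI from the neutral projection only."""
    if binding.iceberg_rest is None:
        raise ValueError(
            f"catalog provider {binding.provider!r} emitted no iceberg_rest projection"
        )
    # The provider-specific `polaris` field is deliberately never inspected here.
    return binding.iceberg_rest.uri


binding = CatalogDeploymentBinding(
    provider="polaris",
    polaris={"example_renderer_setting": "value"},  # made-up renderer detail
    iceberg_rest=IcebergRestCatalogBinding(
        uri="http://polaris:8181/api/catalog",
        warehouse="demo",  # made-up warehouse name
    ),
)
print(catalog_uri_for_consumer(binding))
```

Failing loudly when the neutral projection is absent keeps a consumer from quietly falling back to provider-specific fields.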

    class IcebergRestOAuth2Binding(BaseModel):
        secret_name: str = "iceberg"
        client_id_env: str
        client_secret_env: str
        oauth2_server_uri_env: str
        oauth2_scope_env: str | None = None
        oauth2_scope_default: str | None = None


    class IcebergRestCatalogBinding(BaseModel):
        catalog_name: str = "iceberg"
        uri: str
        warehouse: str
        properties: dict[str, str] = {}
        oauth2: IcebergRestOAuth2Binding | None = None


    class DbtCatalogBinding(BaseModel):
        profile_fragment: dict[str, Any] = {}
        env_refs: dict[str, str] = {}
        iceberg_rest: IcebergRestCatalogBinding | None = None


    class CatalogDeploymentBinding(BaseModel):
        provider: str
        polaris: PolarisCatalogDeploymentBinding | None = None
        iceberg_rest: IcebergRestCatalogBinding | None = None
        dbt: DbtCatalogBinding | None = None


    class IngestionDeploymentBinding(BaseModel):
        provider: str
        dlt: DltIngestionBinding

Rules:
- Raw secrets are forbidden. Use Kubernetes Secret refs, environment refs, workload identity refs, or `none`.
- Helm values are renderer output, not the semantic contract.
- Chart values must reference Kubernetes Secrets for storage credentials; raw storage credentials are not valid compiled artifact or generated chart content.
- Compile declares bucket requirements and compatibility issues; it does not create buckets or call live infrastructure.
- A new catalog plugin should add storage requirements and a catalog deployment translator without changing existing storage plugins.
- Consumers must not infer catalog runtime endpoints from provider-specific storage fields. Use `iceberg_rest.uri` so DuckDB, dlt, and future plugins point at the catalog service rather than the object-store endpoint.
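
The secret-free rule can be illustrated with the OAuth2 binding: the artifact carries only environment variable names, and a consumer resolves the values at runtime. The following is a sketch under assumptions. `resolve_oauth2`, the environment variable names, the scope value, and the keys of the returned dict are all hypothetical, and the dataclass is a simplified stand-in for `IcebergRestOAuth2Binding`.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class IcebergRestOAuth2Binding:
    client_id_env: str
    client_secret_env: str
    oauth2_server_uri_env: str
    oauth2_scope_env: str | None = None
    oauth2_scope_default: str | None = None
    secret_name: str = "iceberg"


def resolve_oauth2(
    binding: IcebergRestOAuth2Binding,
    env: Mapping[str, str] = os.environ,
) -> dict[str, str]:
    """Hypothetical helper: turn env refs into runtime values, never stored in the artifact."""
    required = (
        binding.client_id_env,
        binding.client_secret_env,
        binding.oauth2_server_uri_env,
    )
    missing = [name for name in required if name not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")

    resolved = {
        "client_id": env[binding.client_id_env],
        "client_secret": env[binding.client_secret_env],
        "oauth2_server_uri": env[binding.oauth2_server_uri_env],
    }

    # Prefer the scope from the environment, then fall back to the compiled default.
    scope = None
    if binding.oauth2_scope_env is not None:
        scope = env.get(binding.oauth2_scope_env)
    if scope is None:
        scope = binding.oauth2_scope_default
    if scope is not None:
        resolved["scope"] = scope
    return resolved


# Made-up variable names and values, for illustration only.
oauth2 = IcebergRestOAuth2Binding(
    client_id_env="CATALOG_CLIENT_ID",
    client_secret_env="CATALOG_CLIENT_SECRET",
    oauth2_server_uri_env="CATALOG_OAUTH2_SERVER_URI",
    oauth2_scope_default="example-scope",
)
fake_env = {
    "CATALOG_CLIENT_ID": "demo-client",
    "CATALOG_CLIENT_SECRET": "demo-secret",
    "CATALOG_OAUTH2_SERVER_URI": "http://polaris:8181/api/catalog/v1/oauth/tokens",
}
print(sorted(resolve_oauth2(oauth2, fake_env)))
```

Because the binding holds only names, serializing it into a compiled artifact or chart values cannot leak a credential.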
Composition Error Codes
Plugin composition diagnostics use `COMPOSITION_*` codes. These codes are
operator-facing and map to the action needed to fix the platform selection or
compiled artifact. Legacy numeric `E*` codes remain valid for broader
compilation stages outside plugin composition.
The canonical constant definitions live in
`floe_core.composition.error_codes`. `floe_core.compilation.errors` re-exports
them for existing compilation error consumers, but new composition code should
import from `floe_core.composition.error_codes`.
| Code | Meaning | Operator action |
|---|---|---|
| `COMPOSITION_PLUGIN_MISSING` | A selected plugin cannot be found or loaded. | Install the plugin package or fix the manifest plugin type. |
| `COMPOSITION_PLUGIN_INTERFACE_INVALID` | A registry entry does not implement the required plugin interface. | Register the plugin under the correct entry point group or fix the plugin class. |
| `COMPOSITION_PLUGIN_CONFIG_INVALID` | A plugin exists but its config or provider-owned binding is invalid. | Fix the plugin config in manifest.yaml. |
| `COMPOSITION_STORAGE_MISSING` | A storage-dependent plugin was selected without a storage plugin. | Select a storage plugin or remove the storage-dependent consumer. |
| `COMPOSITION_PROTOCOL_UNSUPPORTED` | Selected plugins do not share a required storage protocol. | Choose compatible storage/catalog providers or adjust provider config. |
| `COMPOSITION_CREDENTIAL_MODE_UNSUPPORTED` | Selected plugins do not share a required credential mode. | Choose compatible credential modes or update provider config. |
| `COMPOSITION_SECRET_PROJECTION_UNSUPPORTED` | Selected plugins do not share a required secret projection mode. | Choose compatible storage/catalog/secrets providers or adjust provider config. |
| `COMPOSITION_SECRET_PROVIDER_MISSING` | A catalog requires a secret provider but none was selected. | Select a secrets plugin that supports the required provider and projection mode. |
| `COMPOSITION_SECRET_PROVIDER_UNSUPPORTED` | The selected secrets plugin does not satisfy the catalog provider requirement. | Choose a compatible secrets provider or adjust catalog requirements. |
| `COMPOSITION_IDENTITY_PROVIDER_MISSING` | A catalog requires identity integration but none was selected. | Select an identity plugin that supports the required identity mode. |
| `COMPOSITION_IDENTITY_PROVIDER_UNSUPPORTED` | The selected identity plugin does not satisfy the catalog provider requirement. | Choose a compatible identity provider or adjust catalog requirements. |
| `COMPOSITION_IDENTITY_MODE_UNSUPPORTED` | Selected storage, catalog, and identity plugins do not share a required identity mode. | Choose compatible workload identity settings or update provider config. |
| `COMPOSITION_DEPLOYMENT_BINDING_MISSING` | A selected plugin does not emit the required typed deployment binding. | Upgrade or fix the plugin implementation. |
| `COMPOSITION_RENDERER_PRECONDITION_FAILED` | A renderer cannot render the compiled artifact shape. | Recompile with required deployment bindings or fix the artifact before rendering. |
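
To show how such codes could surface, here is a minimal sketch of a storage protocol compatibility check. Only the two code strings come from the table above. The `CompositionIssue` type, the `check_storage_protocol` function, and the protocol names are hypothetical and do not reflect floe-core's actual composition logic.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CompositionIssue:
    code: str
    message: str


def check_storage_protocol(
    storage_protocols: set[str] | None,
    catalog_protocols: set[str],
) -> list[CompositionIssue]:
    """Hypothetical check: pass None for storage_protocols when no storage plugin is selected."""
    if storage_protocols is None:
        return [
            CompositionIssue(
                "COMPOSITION_STORAGE_MISSING",
                "the catalog depends on storage, but no storage plugin is selected",
            )
        ]
    if not storage_protocols & catalog_protocols:
        return [
            CompositionIssue(
                "COMPOSITION_PROTOCOL_UNSUPPORTED",
                f"no shared storage protocol: storage={sorted(storage_protocols)}, "
                f"catalog={sorted(catalog_protocols)}",
            )
        ]
    return []


print(check_storage_protocol({"s3"}, {"s3", "gcs"}))       # []
print(check_storage_protocol({"abfs"}, {"s3"})[0].code)    # COMPOSITION_PROTOCOL_UNSUPPORTED
print(check_storage_protocol(None, {"s3"})[0].code)        # COMPOSITION_STORAGE_MISSING
```

Returning a list of issues rather than raising on the first one would let a compiler report every incompatibility in a single pass.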
Plugin Configuration

    class PluginConfig(BaseModel):
        """Resolved plugin configuration after inheritance."""

        compute_registry: ComputeRegistry  # All approved computes (multi-compute support)
        orchestrator: OrchestratorConfig
        catalog: CatalogConfig
        semantic_layer: SemanticLayerConfig
        ingestion: IngestionConfig
        secrets: SecretsConfig


    class ComputeRegistry(BaseModel):
        """Registry of all approved compute configurations.

        Platform teams define N approved compute targets.
        Data engineers select per-transform from this approved list.
        See ADR-0010 (Multi-Compute Pipeline Architecture).
        """

        configs: dict[str, ComputeConfig]  # name → config (e.g., {"duckdb": ..., "spark": ...})
        default: str  # Fallback compute when transform doesn't specify


    class ComputeConfig(BaseModel):
        """Configuration for a single compute target."""

        name: str  # "duckdb" | "spark" | "snowflake" | etc.
        connection_secret_ref: str | None = None
        properties: dict = {}


    class OrchestratorConfig(BaseModel):
        """Orchestrator plugin configuration."""

        type: str  # "dagster" | "airflow" | etc.


    class CatalogConfig(BaseModel):
        """Catalog plugin configuration."""

        type: str  # "polaris" | "glue" | "hive" | etc.
        uri: str | None = None


    class SemanticLayerConfig(BaseModel):
        """Semantic layer plugin configuration."""

        type: str  # "cube" | "dbt_semantic_layer" | "none"
        port: int | None = None


    class IngestionConfig(BaseModel):
        """Ingestion plugin configuration."""

        type: str  # "dlt" | "airbyte"

Transform Configuration

    class CompiledTransform(BaseModel):
        """Compiled dbt transform."""

        type: Literal["dbt"]
        path: str
        models: list[str]
        manifest_path: str
        compute: str | None = None  # Selected compute (None → uses default from registry)


    class DbtConfig(BaseModel):
        """dbt-specific configuration.

        Note: In multi-compute pipelines, multiple dbt profiles may be generated
        (one per approved compute). The `target` field maps to the compute name.
        """

        manifest_path: str
        project_path: str
        profiles_dir: str
        # Note: target is resolved per-transform from CompiledTransform.compute

Governance Configuration

    class GovernanceConfig(BaseModel):
        """Merged governance configuration."""

        classification: ClassificationConfig
        quality_gates: QualityGatesConfig
        data_architecture: DataArchitectureConfig


    class ClassificationConfig(BaseModel):
        """Data classification rules."""

        source: Literal["dbt_meta"]
        levels: list[str]  # e.g., ["public", "internal", "confidential", "pii"]


    class QualityGatesConfig(BaseModel):
        """Quality gate requirements."""

        minimum_test_coverage: int  # percentage
        required_tests: list[str]  # e.g., ["not_null", "unique", "freshness"]
        block_on_failure: bool


    class DataArchitectureConfig(BaseModel):
        """Data architecture pattern configuration."""

        pattern: Literal["medallion", "kimball", "data_vault", "hybrid"]
        layers: dict[str, LayerConfig]
        naming_enforcement: Literal["off", "warn", "strict"]


    class LayerConfig(BaseModel):
        """Layer-specific configuration."""

        prefix: str
        namespace: str | None = None
        quality_gates: list[str] = []

Observability Configuration

    class ObservabilityConfig(BaseModel):
        """Observability settings."""

        traces: bool = True
        metrics: bool = True
        lineage: bool = True
        namespace: str  # Lineage namespace (e.g., "my-project")

Data Mesh Configuration (Optional)

    class OutputPort(BaseModel):
        """Data product output port."""

        name: str
        description: str | None = None
        table: str
        sla: SLAConfig | None = None
        access: AccessConfig | None = None


    class InputPort(BaseModel):
        """Data product input port."""

        name: str
        description: str | None = None
        source: SourceConfig
        freshness_requirement: str | None = None


    class SLAConfig(BaseModel):
        """Service level agreement."""

        freshness: str  # e.g., "6h"
        availability: str  # e.g., "99.9%"
        quality: dict = {}

Data Contracts (ODCS v3)
Data contracts follow the Open Data Contract Standard (ODCS) v3.x. See ADR-0026 and ADR-0027 for background.

    class CompiledDataContract(BaseModel):
        """Compiled data contract (ODCS v3 format).

        Stored in CompiledArtifacts.data_contracts for runtime monitoring.
        """

        # ODCS required fields
        api_version: str = "v3.0.2"
        kind: Literal["DataContract"] = "DataContract"
        name: str
        version: str  # Semantic version, independent from data product

        # Ownership
        owner: str
        domain: str | None = None
        description: str | None = None

        # Lifecycle status
        status: Literal["active", "deprecated", "sunset", "retired"] = "active"
        deprecation: DeprecationInfo | None = None

        # Schema models
        models: list[ContractModel]

        # SLA properties
        sla: ContractSLA | None = None

        # Terms and governance
        terms: dict | None = None
        tags: list[str] = []

        # Floe-specific metadata
        source: ContractSource  # Where contract was defined
        generated_at: datetime
        generated_from: Literal["ports", "explicit", "merged"]


    class ContractModel(BaseModel):
        """Schema definition for a data model in the contract."""

        name: str
        description: str | None = None
        primary_key: list[str] | None = None
        elements: list[ContractElement]


    class ContractElement(BaseModel):
        """Single element (column) in a schema."""

        name: str
        type: str  # string, int, float, decimal, boolean, timestamp, date, etc.
        description: str | None = None
        required: bool = True
        primary_key: bool = False
        unique: bool = False
        classification: str | None = None  # pii, phi, sensitive, etc.
        format: str | None = None  # email, uri, uuid, date-time, etc.


    class ContractSLA(BaseModel):
        """Service level agreement for contract."""

        freshness_hours: float | None = None  # Max hours since last update
        availability_percent: float | None = None  # Target availability (e.g., 99.9)
        quality_score_min: float | None = None  # Minimum quality score (0-100)
        completeness_percent: float | None = None  # Minimum row completeness


    class DeprecationInfo(BaseModel):
        """Deprecation metadata for sunsetting contracts."""

        announced: str  # ISO date when deprecation was announced
        sunset_date: str  # ISO date when contract will be retired
        replacement: str | None  # Name of replacement contract
        migration_guide: str | None  # URL to migration documentation
        reason: str | None  # Why the contract is being deprecated


    class ContractSource(BaseModel):
        """Source of contract definition."""

        type: Literal["port", "file", "merged"]
        file_path: str | None = None  # Path to datacontract.yaml if explicit
        port_name: str | None = None  # Port name if auto-generated


    class ContractMonitoringConfig(BaseModel):
        """Configuration for runtime contract monitoring."""

        enabled: bool = True
        mode: Literal["scheduled", "continuous", "on_demand"] = "scheduled"
        freshness_check_interval: str = "15m"  # ISO duration
        schema_drift_check_interval: str = "1h"  # ISO duration
        quality_check_interval: str = "6h"  # ISO duration

Example: Simple Mode

    {
      "version": "0.16.0",
      "metadata": {
        "compiled_at": "2026-01-03T10:00:00Z",
        "floe_version": "0.3.0",
        "source_hash": "sha256:abc123...",
        "product_name": "my-pipeline",
        "product_version": "1.0.0"
      },
      "identity": {
        "product_id": "default.my_pipeline",
        "domain": "default",
        "repository": "github.com/acme/my-pipeline",
        "namespace_registered": true,
        "registration_timestamp": "2026-01-01T00:00:00Z"
      },
      "mode": "simple",
      "inheritance_chain": [],
      "plugins": {
        "compute_registry": {
          "configs": {
            "duckdb": { "name": "duckdb", "properties": { "threads": 8 } }
          },
          "default": "duckdb"
        },
        "orchestrator": { "type": "dagster" },
        "catalog": { "type": "polaris", "uri": "http://polaris:8181/api/catalog" },
        "semantic_layer": { "type": "cube", "port": 4000 },
        "ingestion": { "type": "dlt" },
        "secrets": { "type": "k8s" }
      },
      "transforms": [
        {
          "type": "dbt",
          "path": "./models",
          "models": ["bronze_customers", "silver_customers", "gold_revenue"],
          "manifest_path": "/app/target/manifest.json",
          "compute": null
        }
      ],
      "schedule": { "cron": "0 6 * * *", "timezone": "UTC" },
      "dbt": {
        "manifest_path": "/app/target/manifest.json",
        "project_path": "/app",
        "target": "prod",
        "profiles_dir": "/app/.dbt"
      },
      "governance": {
        "classification": {
          "source": "dbt_meta",
          "levels": ["public", "internal", "confidential", "pii"]
        },
        "quality_gates": {
          "minimum_test_coverage": 80,
          "required_tests": ["not_null", "unique"],
          "block_on_failure": true
        },
        "data_architecture": {
          "pattern": "medallion",
          "layers": {
            "bronze": { "prefix": "bronze_" },
            "silver": { "prefix": "silver_" },
            "gold": { "prefix": "gold_" }
          },
          "naming_enforcement": "strict"
        }
      },
      "observability": {
        "traces": true,
        "metrics": true,
        "lineage": true,
        "namespace": "my-pipeline"
      }
    }

Example: Centralized Mode

    {
      "version": "0.16.0",
      "metadata": {
        "compiled_at": "2026-01-03T10:00:00Z",
        "floe_version": "0.3.0",
        "source_hash": "sha256:def456...",
        "product_name": "customer-analytics",
        "product_version": "1.0.0"
      },
      "identity": {
        "product_id": "analytics.customer_analytics",
        "domain": "analytics",
        "repository": "github.com/acme/customer-analytics",
        "namespace_registered": true,
        "registration_timestamp": "2026-01-01T00:00:00Z"
      },
      "mode": "centralized",
      "inheritance_chain": [
        {
          "name": "acme-platform",
          "version": "1.2.3",
          "scope": "enterprise",
          "ref": "oci://registry.acme.com/floe-platform:v1.2.3"
        }
      ],
      "plugins": {
        "compute_registry": {
          "configs": {
            "duckdb": { "name": "duckdb", "properties": { "threads": 8 } },
            "spark": { "name": "spark", "properties": { "cluster": "spark-thrift.svc" } }
          },
          "default": "duckdb"
        },
        "orchestrator": { "type": "dagster" },
        "catalog": { "type": "polaris" },
        "semantic_layer": { "type": "cube" },
        "ingestion": { "type": "dlt" },
        "secrets": { "type": "k8s" }
      },
      "transforms": [
        { "type": "dbt", "path": "./models/staging", "models": [...], "compute": "spark" },
        { "type": "dbt", "path": "./models/marts", "models": [...], "compute": "duckdb" }
      ],
      "governance": {
        "classification": {...},
        "quality_gates": {
          "minimum_test_coverage": 80,
          "required_tests": ["not_null", "unique", "freshness"],
          "block_on_failure": true
        },
        "data_architecture": {...}
      },
      "observability": {...}
    }

Example: Data Mesh Mode

    {
      "version": "0.16.0",
      "metadata": {
        "compiled_at": "2026-01-03T10:00:00Z",
        "floe_version": "0.3.0",
        "source_hash": "sha256:ghi789...",
        "product_name": "customer-360",
        "product_version": "3.2.1"
      },
      "identity": {
        "product_id": "sales.customer_360",
        "domain": "sales",
        "repository": "github.com/acme/sales-customer-360",
        "namespace_registered": true,
        "registration_timestamp": "2026-01-01T00:00:00Z"
      },
      "mode": "mesh",
      "inheritance_chain": [
        {
          "name": "acme-enterprise",
          "version": "1.0.0",
          "scope": "enterprise",
          "ref": "oci://registry.acme.com/enterprise:v1.0.0"
        },
        {
          "name": "sales-domain",
          "version": "2.0.0",
          "scope": "domain",
          "ref": "oci://registry.acme.com/domains/sales:v2.0.0"
        }
      ],
      "plugins": {
        "compute_registry": {
          "configs": {
            "spark": { "name": "spark", "properties": { "cluster": "spark-thrift.svc" } },
            "duckdb": { "name": "duckdb", "properties": { "threads": 8 } }
          },
          "default": "duckdb"
        },
        "orchestrator": { "type": "dagster" },
        "catalog": { "type": "polaris" },
        "semantic_layer": { "type": "cube" },
        "ingestion": { "type": "dlt" },
        "secrets": { "type": "external-secrets" }
      },
      "transforms": [
        { "type": "dbt", "path": "./models/ingest", "models": [...], "compute": "spark" },
        { "type": "dbt", "path": "./models/marts", "models": [...], "compute": "duckdb" }
      ],
      "governance": {...},
      "observability": {
        "traces": true,
        "metrics": true,
        "lineage": true,
        "namespace": "sales.customer-360"
      },
      "output_ports": [
        {
          "name": "customers",
          "description": "Unified customer dimension",
          "table": "sales.gold.customers",
          "sla": { "freshness": "6h", "availability": "99.9%" },
          "access": {
            "default": "deny",
            "grants": [
              { "domain": "marketing", "access": "read" },
              { "domain": "finance", "access": "read" }
            ]
          }
        }
      ],
      "input_ports": [
        {
          "name": "crm_data",
          "source": {
            "type": "ingestion",
            "config": { "type": "dlt", "source": "salesforce" }
          }
        },
        {
          "name": "marketing_interactions",
          "source": {
            "type": "data_product",
            "ref": "marketing.campaign-attribution.customer_interactions"
          },
          "freshness_requirement": "4h"
        }
      ],
      "data_contracts": [
        {
          "api_version": "v3.0.2",
          "kind": "DataContract",
          "name": "sales-customer-360-customers",
          "version": "2.1.0",
          "owner": "sales-analytics@acme.com",
          "domain": "sales",
          "description": "Unified customer dimension from customer-360 data product",
          "status": "active",
          "models": [
            {
              "name": "customers",
              "description": "Customer master data",
              "primary_key": ["customer_id"],
              "elements": [
                {
                  "name": "customer_id",
                  "type": "string",
                  "required": true,
                  "primary_key": true,
                  "unique": true,
                  "description": "Unique customer identifier"
                },
                {
                  "name": "email",
                  "type": "string",
                  "required": true,
                  "unique": true,
                  "classification": "pii",
                  "format": "email",
                  "description": "Primary email address"
                },
                {
                  "name": "name",
                  "type": "string",
                  "required": true,
                  "classification": "pii",
                  "description": "Full name"
                },
                {
                  "name": "lifetime_value",
                  "type": "decimal",
                  "required": false,
                  "description": "Calculated customer lifetime value"
                }
              ]
            }
          ],
          "sla": {
            "freshness_hours": 6.0,
            "availability_percent": 99.9,
            "quality_score_min": 95.0,
            "completeness_percent": 99.0
          },
          "terms": {
            "usage": "Internal analytics only",
            "retention": "7 years per compliance"
          },
          "tags": ["customer-data", "gold-layer", "sales-domain"],
          "source": {
            "type": "merged",
            "file_path": "data-products/customer-360/datacontract.yaml",
            "port_name": "customers"
          },
          "generated_at": "2026-01-03T10:00:00Z",
          "generated_from": "merged"
        }
      ]
    }

Validation
Python

    from floe_core.schemas import CompiledArtifacts

    # Load and validate
    with open(".floe/artifacts.json") as f:
        artifacts = CompiledArtifacts.model_validate_json(f.read())

    # Access configuration
    print(artifacts.mode)  # "simple" | "centralized" | "mesh"
    print(artifacts.plugins.compute_registry.default)  # "duckdb"
    print(artifacts.plugins.orchestrator.type)  # "dagster"

    # Check inheritance
    for manifest in artifacts.inheritance_chain:
        print(f"{manifest.name} ({manifest.scope}): {manifest.ref}")

JSON Schema Inspection
The public CLI does not currently expose a schema export command. During alpha, contributors can inspect the current Pydantic schema from the repository:

    import json

    from floe_core.schemas import CompiledArtifacts

    print(json.dumps(CompiledArtifacts.export_json_schema(), indent=2))

Versioning
| Change | Version Impact |
|---|---|
| Add optional field | Minor (1.x.0) |
| Add required field | Major (x.0.0) |
| Remove field | Major (x.0.0) |
| Change field type | Major (x.0.0) |
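
The table can be read as a small decision rule. The sketch below classifies a schema change by comparing two field maps; `FieldSpec` and `version_impact` are hypothetical names used for illustration and are not part of floe-core. Changes the table does not list, such as flipping an existing field between optional and required, are left out on purpose.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    type: str
    required: bool


def version_impact(old: dict[str, FieldSpec], new: dict[str, FieldSpec]) -> str:
    """Return "major", "minor", or "none" following the versioning table."""
    removed = old.keys() - new.keys()
    added = new.keys() - old.keys()
    retyped = {
        name for name in old.keys() & new.keys() if old[name].type != new[name].type
    }

    # Removing a field, changing a type, or adding a required field is breaking.
    if removed or retyped or any(new[name].required for name in added):
        return "major"
    # Only optional fields were added.
    if added:
        return "minor"
    return "none"


v1 = {"mode": FieldSpec("str", required=True)}
v2 = {**v1, "deployment": FieldSpec("DeploymentConfig | None", required=False)}
v3 = {**v2, "identity": FieldSpec("ProductIdentity", required=True)}

print(version_impact(v1, v2))  # minor
print(version_impact(v2, v3))  # major
print(version_impact(v3, v3))  # none
```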
Related Documents
- Contracts Index - Overview of all contracts
- Observability Attributes - Telemetry conventions
- datacontract.yaml Reference - ODCS format reference
- Glossary - Terminology
- Four-Layer Overview - Architecture context
- ADR-0026: Data Contract Architecture
- ADR-0027: ODCS Standard Adoption
- ADR-0028: Runtime Contract Monitoring
- ADR-0030: Namespace-Based Identity