Four-Layer Architecture Overview
This document provides a comprehensive overview of floe’s four-layer architecture.
Architecture Diagram
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │  LAYER 4: DATA LAYER (Ephemeral Jobs)                                       │
    │  Owner: Data Engineers                                                      │
    │  K8s Resources: Jobs (run-to-completion)                                    │
    │  Config: floe.yaml                                                          │
    │                                                                             │
    │  ┌───────────────────────────────────────────────────────────────────────┐  │
    │  │ • dbt run pods         → Execute transformations                      │  │
    │  │ • dlt ingestion jobs   → Load external data                           │  │
    │  │ • Quality check jobs   → Validate data quality                        │  │
    │  │ • Orchestrator workers → Scaled by orchestrator as needed             │  │
    │  └───────────────────────────────────────────────────────────────────────┘  │
    │                                                                             │
    │  Lifecycle: Run-to-completion, stateless, per-execution pods                │
    └─────────────────────────────────────┬───────────────────────────────────────┘
                                          │
                                          │ Connects to (K8s Service Discovery)
                                          ▼
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │  LAYER 3: SERVICES LAYER (Long-lived)                                       │
    │  Owner: Platform Engineers                                                  │
    │  K8s Resources: Deployments, StatefulSets                                   │
    │  Deployment: `floe platform deploy`                                         │
    │                                                                             │
    │  ┌─────────────────────────────────────────────────────────────────────┐    │
    │  │ ORCHESTRATOR     CATALOG          SEMANTIC      OBSERVABILITY       │    │
    │  │ ┌─────────────┐  ┌─────────────┐  ┌──────────┐  ┌───────────┐       │    │
    │  │ │ Dagster     │  │ Polaris     │  │ Cube     │  │ OTLP      │       │    │
    │  │ │ Webserver   │  │ Server      │  │ Server   │  │ Collector │       │    │
    │  │ │ Daemon      │  │             │  │          │  │ Prometheus│       │    │
    │  │ │ PostgreSQL  │  │ PostgreSQL  │  │ Redis    │  │ Grafana   │       │    │
    │  │ └─────────────┘  └─────────────┘  └──────────┘  └───────────┘       │    │
    │  └─────────────────────────────────────────────────────────────────────┘    │
    │                                                                             │
    │  Lifecycle: Always running, rolling updates, stateful (databases, caches)   │
    └─────────────────────────────────────┬───────────────────────────────────────┘
                                          │
                                          │ Configured by
                                          ▼
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │  LAYER 2: CONFIGURATION LAYER (Enforcement)                                 │
    │  Owner: Platform Engineers                                                  │
    │  Storage: OCI Registry (immutable, versioned)                               │
    │  Config: manifest.yaml                                                      │
    │                                                                             │
    │  ┌─────────────────────────────────────────────────────────────────────┐    │
    │  │ • Plugin selection (compute, orchestrator, catalog, semantic)       │    │
    │  │ • Governance policies (classification, access control, retention)   │    │
    │  │ • Data architecture rules (medallion/kimball, naming conventions)   │    │
    │  │ • Quality gates (test coverage, required tests, block/warn/notify)  │    │
    │  └─────────────────────────────────────────────────────────────────────┘    │
    │                                                                             │
    │  Lifecycle: Versioned artifacts, rarely changes, published to OCI registry  │
    └─────────────────────────────────────┬───────────────────────────────────────┘
                                          │
                                          │ Built on
                                          ▼
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │  LAYER 1: FOUNDATION LAYER (Framework Code)                                 │
    │  Owner: floe Maintainers                                                    │
    │  Distribution: PyPI, Helm registry                                          │
    │                                                                             │
    │  ┌─────────────────────────────────────────────────────────────────────┐    │
    │  │ floe-core    │ Schemas, interfaces, enforcement engine              │    │
    │  │ floe-cli     │ CLI for Platform Team and Data Team                  │    │
    │  │ floe-dbt     │ dbt framework (ENFORCED); runtime PLUGGABLE          │    │
    │  │ floe-iceberg │ Iceberg utilities (ENFORCED)                         │    │
    │  │ plugins/*    │ Pluggable implementations                            │    │
    │  │ charts/*     │ Helm charts for deployment                           │    │
    │  └─────────────────────────────────────────────────────────────────────┘    │
    │                                                                             │
    │  ENFORCED STANDARDS: Iceberg, OTel, OpenLineage, dbt framework, K8s-native  │
    └─────────────────────────────────────────────────────────────────────────────┘

Layer Details
Layer 1: Foundation
The Foundation layer contains the core framework code that defines floe’s capabilities.
| Package | Purpose | Distribution |
|---|---|---|
| floe-core | Schemas, interfaces, enforcement engine | PyPI |
| floe-cli | CLI for both personas (Platform Team and Data Team) | PyPI |
| floe-dbt | dbt framework integration (enforced); runtime via DBTPlugin (pluggable) | PyPI |
| floe-iceberg | Iceberg utilities (enforced) | PyPI |
| plugins/* | Pluggable implementations | PyPI |
| charts/* | Helm charts | Helm registry |
Enforced Standards:
- Apache Iceberg (table format)
- OpenTelemetry (observability)
- OpenLineage (data lineage)
- dbt (transformation framework - SQL compilation ENFORCED; execution environment PLUGGABLE via ADR-0043)
- Kubernetes-native (deployment)
Layer 2: Configuration
The Configuration layer contains platform artifacts that enforce guardrails.
Artifact Storage: OCI Registry (immutable, versioned, signed)
Contents:
- Plugin selection configuration
- Governance policies (classification, access control)
- Data architecture rules (naming conventions, layer constraints)
- Quality gates (test coverage, required tests)
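
As a rough illustration of how a quality gate with the block/warn/notify levels from the diagram might be evaluated, here is a minimal Python sketch. The `QualityGate`, `GateResult`, and `evaluate_gate` names and their fields are hypothetical and are not taken from floe's actual schemas; the only assumption carried over from this document is that a gate combines a test-coverage threshold, a set of required tests, and an enforcement level.

```python
from dataclasses import dataclass
from enum import Enum


class Enforcement(Enum):
    """Enforcement levels named in the architecture diagram."""
    BLOCK = "block"
    WARN = "warn"
    NOTIFY = "notify"


@dataclass(frozen=True)
class QualityGate:
    """Hypothetical gate: minimum coverage plus tests that must be present."""
    min_test_coverage: float          # fraction in [0, 1]
    required_tests: frozenset[str]
    enforcement: Enforcement


@dataclass(frozen=True)
class GateResult:
    passed: bool
    blocks_pipeline: bool             # True only for a failed BLOCK gate
    violations: tuple[str, ...]


def evaluate_gate(gate: QualityGate, coverage: float, tests_present: set[str]) -> GateResult:
    """Collect violations, then let the enforcement level decide whether to block."""
    violations = []
    if coverage < gate.min_test_coverage:
        violations.append(
            f"coverage {coverage:.0%} below minimum {gate.min_test_coverage:.0%}"
        )
    missing = sorted(gate.required_tests - set(tests_present))
    if missing:
        violations.append("missing required tests: " + ", ".join(missing))
    passed = not violations
    blocks = (not passed) and gate.enforcement is Enforcement.BLOCK
    return GateResult(passed, blocks, tuple(violations))


# Example: a blocking gate that fails on both coverage and required tests.
strict = QualityGate(0.8, frozenset({"not_null", "unique"}), Enforcement.BLOCK)
result = evaluate_gate(strict, coverage=0.5, tests_present={"not_null"})
```

In this sketch a failed `warn` or `notify` gate still reports its violations but does not stop the pipeline; how floe distinguishes those two levels is not described here, so the sketch treats them alike.
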
Workflow:
    floe platform compile   # Build artifacts
    floe platform test      # Planned/stub: run policy tests
    floe platform publish   # Push to OCI registry

Layer 3: Services
The Services layer contains long-lived services that support data operations.
| Service Type | Examples | K8s Resource |
|---|---|---|
| Orchestrator | Dagster webserver, daemon | Deployment |
| Catalog | Polaris server | Deployment |
| Semantic Layer | Cube server | Deployment |
| Observability | OTLP Collector, Prometheus | Deployment |
| Databases | PostgreSQL | StatefulSet |
| Caches | Redis | StatefulSet |
| Storage | MinIO | StatefulSet |
Deployment:
    floe platform deploy   # Deploy all services
    floe platform status   # Check health
    floe platform logs     # View logs

Layer 4: Data
The Data layer contains ephemeral jobs that execute data operations.
| Job Type | Trigger | K8s Resource |
|---|---|---|
| dbt run | Schedule/manual | Job |
| dbt test | Post-run | Job |
| dlt ingestion | Schedule | Job |
| Quality checks | Post-run | Job |
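
Each row in the table above maps to a run-to-completion Kubernetes Job. The sketch below builds a plain `batch/v1` Job manifest as a Python dictionary to show what such a resource looks like. The `build_job_manifest` helper, the `floe-data` namespace, and the image name are hypothetical; floe's orchestrator plugins may generate Jobs differently. The manifest fields themselves (`backoffLimit`, `ttlSecondsAfterFinished`, `restartPolicy`) are standard Kubernetes Job fields.

```python
from typing import Any


def build_job_manifest(
    name: str,
    image: str,
    command: list[str],
    namespace: str = "floe-data",  # hypothetical namespace
) -> dict[str, Any]:
    """Return a minimal Kubernetes batch/v1 Job manifest for a single pod."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Do not retry failed pods; leave retries to the orchestrator.
            "backoffLimit": 0,
            # Let Kubernetes clean up the finished Job after one hour.
            "ttlSecondsAfterFinished": 3600,
            "template": {
                "spec": {
                    # Jobs require Never or OnFailure; Never keeps pods per-execution.
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": name, "image": image, "command": command}
                    ],
                }
            },
        },
    }


# Example: a dbt run followed by a post-run dbt test, each as its own Job.
dbt_run = build_job_manifest("dbt-run", "example.invalid/dbt-runner:latest", ["dbt", "run"])
dbt_test = build_job_manifest("dbt-test", "example.invalid/dbt-runner:latest", ["dbt", "test"])
```
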
Execution:
    floe init --platform=v1.2.3   # Planned: pull platform artifacts
    floe compile                  # Planned: validate against platform
    floe run                      # Planned: execute pipeline

Layer Boundaries
| Aspect | Layer 3 (Services) | Layer 4 (Data) |
|---|---|---|
| K8s Resource | Deployment, StatefulSet | Job |
| Lifecycle | Long-lived, upgraded | Run-to-completion |
| State | Stateful | Stateless |
| Scaling | Fixed replicas or HPA | Per-execution |
| Owner | Platform Team | Data Team (execution) |
| Deployment | floe platform deploy | Triggered by orchestrator |
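
The boundary in the table can be read as a simple decision rule: lifecycle and statefulness determine the Kubernetes resource kind, and the resource kind determines the layer. The following Python sketch encodes that rule. The `Lifecycle` enum and both function names are hypothetical and exist only for illustration; they are not part of floe.

```python
from enum import Enum


class Lifecycle(Enum):
    LONG_LIVED = "long-lived"
    RUN_TO_COMPLETION = "run-to-completion"


def k8s_resource_kind(lifecycle: Lifecycle, stateful: bool) -> str:
    """Pick the K8s resource kind the tables in this document associate with a workload."""
    if lifecycle is Lifecycle.RUN_TO_COMPLETION:
        if stateful:
            # Layer 4 workloads are described as stateless.
            raise ValueError("run-to-completion workloads are expected to be stateless")
        return "Job"
    # Long-lived services: databases, caches, and storage use StatefulSets.
    return "StatefulSet" if stateful else "Deployment"


def layer_for(kind: str) -> int:
    """Jobs belong to Layer 4 (Data); Deployments and StatefulSets to Layer 3 (Services)."""
    if kind == "Job":
        return 4
    if kind in ("Deployment", "StatefulSet"):
        return 3
    raise ValueError(f"unexpected resource kind: {kind}")


# Example: PostgreSQL is long-lived and stateful; a dbt run is neither.
postgres_kind = k8s_resource_kind(Lifecycle.LONG_LIVED, stateful=True)
dbt_run_kind = k8s_resource_kind(Lifecycle.RUN_TO_COMPLETION, stateful=False)
```
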
Ownership Model
| Layer | Owner | Responsibilities |
|---|---|---|
| Foundation | floe maintainers | Framework code, releases |
| Configuration | Platform Team | Plugin selection, policies, architecture |
| Services | Platform Team | Deploy, upgrade, operate |
| Data | Data Team | Pipeline code, transforms, schedules |
Data Mesh Extension
The four-layer architecture extends to support Data Mesh deployments with federated domain ownership:
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │  Layer 4: DATA PRODUCTS (Ephemeral)                                         │
    │  • Data product pipelines run as K8s Jobs                                   │
    │  • Each product has defined input/output ports                              │
    │  • Cross-domain dependencies tracked via data contracts                     │
    └────────────────────────────────────┬────────────────────────────────────────┘
                                         │
    ┌────────────────────────────────────┴────────────────────────────────────────┐
    │  Layer 3: DOMAIN SERVICES (Long-lived, per-domain)                          │
    │  • Orchestrator per domain (domain autonomy)                                │
    │  • Domain-specific semantic layer                                           │
    │  • Connected to shared catalog and observability                            │
    └────────────────────────────────────┬────────────────────────────────────────┘
                                         │
    ┌────────────────────────────────────┴────────────────────────────────────────┐
    │  Layer 2: FEDERATED CONFIGURATION                                           │
    │  • Enterprise manifest: Global governance, approved plugins                 │
    │  • Domain manifest: Domain-specific choices, namespace                      │
    │  • Inheritance: Enterprise → Domain → Product                               │
    └────────────────────────────────────┬────────────────────────────────────────┘
                                         │
    ┌────────────────────────────────────┴────────────────────────────────────────┐
    │  Layer 1: FOUNDATION (Same as centralized)                                  │
    │  • Additional schemas: EnterpriseManifest, DomainManifest, DataProduct      │
    │  • Additional CLI: floe enterprise/domain/product commands                  │
    └─────────────────────────────────────────────────────────────────────────────┘

See ADR-0021: Data Architecture Patterns for full Data Mesh documentation.
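
To make the Enterprise → Domain → Product inheritance more concrete, here is a minimal Python sketch of one way such a resolution could work. Everything in it is an assumption for illustration: the dictionary keys (`approved_plugins`, `governance`, `plugins`, `namespace`), the `resolve_effective_config` function, and the rule that lower levels may not redefine governance are hypothetical and are not taken from floe's `EnterpriseManifest`, `DomainManifest`, or `DataProduct` schemas. ADR-0021 remains the authoritative description.

```python
from typing import Any


def resolve_effective_config(
    enterprise: dict[str, Any],
    domain: dict[str, Any],
    product: dict[str, Any],
) -> dict[str, Any]:
    """Combine three manifest levels, with enterprise rules taking precedence."""
    # Assumption: a domain may only pick plugins the enterprise has approved.
    approved = enterprise.get("approved_plugins", {})
    for category, choice in domain.get("plugins", {}).items():
        if choice not in approved.get(category, []):
            raise ValueError(f"plugin {choice!r} is not approved for {category!r}")

    # Assumption: governance is global and cannot be redefined lower down.
    for level_name, level in (("domain", domain), ("product", product)):
        if "governance" in level:
            raise ValueError(f"{level_name} manifest may not redefine governance")

    return {
        "governance": dict(enterprise.get("governance", {})),
        "plugins": dict(domain.get("plugins", {})),
        "namespace": domain.get("namespace"),
        "product": dict(product),
    }


# Example with made-up manifests.
enterprise = {
    "governance": {"classification": "required"},
    "approved_plugins": {"orchestrator": ["dagster"], "catalog": ["polaris"]},
}
domain = {"namespace": "sales", "plugins": {"orchestrator": "dagster"}}
product = {"name": "orders"}
effective = resolve_effective_config(enterprise, domain, product)
```
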