ADR-0009: dbt Owns SQL Transformation

Accepted

Floe needs to handle SQL transformations across multiple compute targets (DuckDB, Snowflake, BigQuery, etc.). Key challenges:

  • SQL dialect differences between targets
  • Dependency resolution between models
  • Incremental processing logic
  • Data testing

Options considered:

  • Build custom SQL handling - Parse, transpile, manage dependencies ourselves
  • Use dbt - Leverage existing, proven tooling
  • Hybrid - Light wrapper around dbt
  • Target-specific code - Different implementations per target

dbt owns SQL transformation. Floe does not parse, transpile, or manage SQL dependencies.

Positive:

  • Proven tooling - dbt handles dialect translation via adapters
  • Dependency resolution - ref() and source() macros work out of the box
  • Incremental processing - is_incremental() macro handles complexity
  • Data testing - dbt tests validate data quality
  • Large ecosystem - dbt packages, community, documentation
  • Simpler Floe - Less code to maintain

Negative:

  • dbt dependency - Users must structure transforms as dbt projects
  • Learning curve - Users need dbt knowledge
  • Less flexibility - Can’t support non-dbt SQL patterns
  • Version coupling - Floe must keep up with dbt version changes

Neutral:

  • CompiledArtifacts only points to the dbt project; Floe does not rewrite SQL
  • Floe adds value in orchestration, data isolation, and observability
  • Future non-dbt transforms (Python, Flink) are handled separately

| Concern | Owner |
| --- | --- |
| SQL dialect translation | dbt (via adapters) |
| Dependency resolution | dbt (ref(), source()) |
| Incremental processing | dbt (is_incremental()) |
| Data tests | dbt |
| Orchestration | Dagster |
| Data isolation | Floe |
| Pipeline lifecycle | Floe |
| Observability | Floe |

While the dbt framework is enforced as the SQL transformation DSL, the execution environment in which dbt compiles and runs is pluggable via DBTPlugin (ADR-0043):

| Implementation | Description | Entry Point |
| --- | --- | --- |
| LocalDBTPlugin | dbt-core via CLI subprocess | floe.dbt |
| FusionDBTPlugin | dbt Fusion (Rust-based) via CLI subprocess | floe.dbt |
| CloudDBTPlugin | dbt Cloud API (deferred to Epic 8+) | floe.dbt |

Key Distinction:

  • dbt Framework (ENFORCED): SQL transformation DSL, models, tests, macros, Jinja templating
  • dbt Execution Environment (PLUGGABLE): WHERE dbt compiles (local dbt-core, dbt Fusion, dbt Cloud)

Platform teams select the execution environment in manifest.yaml:

plugins:
  dbt_compiler:
    provider: fusion  # or local, or cloud

Data engineers use dbt framework features (models, tests, macros) regardless of execution environment.

// Floe points to dbt project, doesn't rewrite SQL
type DBTConfig struct {
	ProjectDir string            // Path to dbt project
	Target     string            // dbt target (profiles.yml)
	EnvVars    map[string]string // Environment variables
	Commands   []string          // ["dbt run", "dbt test"]
	Select     string            // Model selection
	Exclude    string            // Model exclusion
}

Floe does NOT:

  • ❌ Parse SQL to understand structure
  • ❌ Transpile SQL between dialects
  • ❌ Manage model dependencies
  • ❌ Handle incremental logic
  • ❌ Run data tests directly

Floe DOES:

  • ✅ Orchestrate dbt runs via Dagster
  • ✅ Provide environment variables (connections, etc.)
  • ✅ Collect observability from dbt runs
  • ✅ Manage data isolation (namespace-based)
  • ✅ Provision compute targets
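
One plausible way to collect observability from dbt runs without parsing SQL is to read the run_results.json artifact that dbt writes after each invocation. The sketch below decodes a small subset of that file (elapsed_time, and per-result unique_id, status, and execution_time) and summarizes it. The Summary type, the Summarize function, and the sample payload are hypothetical; this ADR does not say which artifacts Floe actually consumes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RunResults mirrors a small subset of dbt's run_results.json artifact.
type RunResults struct {
	ElapsedTime float64 `json:"elapsed_time"`
	Results     []struct {
		UniqueID      string  `json:"unique_id"`
		Status        string  `json:"status"`
		ExecutionTime float64 `json:"execution_time"`
	} `json:"results"`
}

// Summary is a hypothetical roll-up that an observability layer might emit.
type Summary struct {
	ByStatus       map[string]int // number of nodes per dbt status
	Slowest        string         // unique_id of the longest-running node
	ElapsedSeconds float64        // wall-clock time reported by dbt
}

// Summarize parses raw run_results.json content and aggregates it.
func Summarize(raw []byte) (Summary, error) {
	var rr RunResults
	if err := json.Unmarshal(raw, &rr); err != nil {
		return Summary{}, fmt.Errorf("parse run results: %w", err)
	}
	s := Summary{ByStatus: map[string]int{}, ElapsedSeconds: rr.ElapsedTime}
	var slowest float64
	for _, r := range rr.Results {
		s.ByStatus[r.Status]++
		if r.ExecutionTime > slowest {
			slowest = r.ExecutionTime
			s.Slowest = r.UniqueID
		}
	}
	return s, nil
}

func main() {
	// Made-up sample data in the shape of run_results.json.
	sample := []byte(`{
	  "elapsed_time": 3.5,
	  "results": [
	    {"unique_id": "model.shop.stg_orders", "status": "success", "execution_time": 1.2},
	    {"unique_id": "model.shop.orders", "status": "error", "execution_time": 2.1}
	  ]
	}`)
	s, err := Summarize(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("statuses=%v slowest=%s elapsed=%.1fs\n", s.ByStatus, s.Slowest, s.ElapsedSeconds)
}
```

Reading dbt's own artifacts keeps the ownership boundary intact: dbt decides what ran and whether it succeeded, and Floe only reports on the outcome.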