
Data Quality Plugin Configuration Reference

The floe Data Quality Plugin system allows platform teams to enforce data quality standards across the enterprise while giving data teams the flexibility to define specific checks for their data products.

Configuration is managed at two levels:

  1. manifest.yaml: Managed by the Platform Team. Defines the quality provider, global dimension weights, and quality gate policies (Bronze, Silver, Gold).
  2. floe.yaml: Managed by Data Teams. Defines specific quality checks for models and selects the desired quality tier.

The quality section in manifest.yaml controls the global behavior of the quality system.

plugins:
  quality:
    provider: great_expectations   # required: great_expectations or dbt_expectations
    enabled: true                  # default: true
    check_timeout_seconds: 300     # default: 300

    # Layer 1: Global Dimension Weights (must sum to 1.0)
    dimension_weights:
      completeness: 0.25
      accuracy: 0.25
      validity: 0.20
      consistency: 0.15
      timeliness: 0.15

    # Layer 3: Score Calculation Parameters
    calculation:
      baseline_score: 70           # starting score before checks
      max_positive_influence: 30   # max increase (70 + 30 = 100)
      max_negative_influence: 50   # max decrease (70 - 50 = 20)
      severity_weights:
        critical: 3.0
        warning: 1.0
        info: 0.5

    # Global Thresholds
    thresholds:
      min_score: 70                # score below this blocks deployment/run
      warn_score: 85               # score below this emits warnings

    # Quality Gate Policies
    quality_gates:
      bronze:
        min_test_coverage: 0       # % of columns that must have tests
        required_tests: []         # tests that must be present
        min_score: 0               # minimum score for this tier
      silver:
        min_test_coverage: 80
        required_tests: ["not_null", "unique"]
        min_score: 75
      gold:
        min_test_coverage: 100
        required_tests: ["not_null", "unique", "accepted_values", "relationships"]
        min_score: 90
        overridable: false         # prevent lower levels from weakening gold tier
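The two numeric constraints stated in the comments above (weights summing to 1.0, and a score that can move at most 30 points up or 50 points down from the baseline) can be sketched in Python as follows. This is a minimal illustration, not floe's implementation: the function names are hypothetical, and the formula that turns individual check results into a raw score is not described in this reference, so only the validation and the resulting bounds are shown.

```python
import math


def validate_dimension_weights(weights: dict[str, float]) -> None:
    """Raise ValueError unless the dimension weights sum to 1.0."""
    total = sum(weights.values())
    # Compare with a tolerance: sums of decimal fractions such as 0.20 and
    # 0.15 are not guaranteed to be exactly 1.0 in binary floating point.
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"dimension weights sum to {total}, expected 1.0")


def score_bounds(baseline: float, max_positive: float, max_negative: float) -> tuple[float, float]:
    """Return the lowest and highest score reachable from the baseline."""
    return baseline - max_negative, baseline + max_positive


def clamp_score(raw: float, baseline: float, max_positive: float, max_negative: float) -> float:
    """Keep a raw score inside the range allowed by the calculation parameters."""
    low, high = score_bounds(baseline, max_positive, max_negative)
    return max(low, min(high, raw))


weights = {
    "completeness": 0.25,
    "accuracy": 0.25,
    "validity": 0.20,
    "consistency": 0.15,
    "timeliness": 0.15,
}
validate_dimension_weights(weights)  # passes silently
print(score_bounds(70, 30, 50))      # (20, 100)
```

With the defaults shown above, a model's score therefore always falls between 20 and 100.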
| Option | Type | Description |
|---|---|---|
| min_test_coverage | float | Minimum percentage of columns that must have at least one quality check. |
| required_tests | list[str] | List of test types (e.g., not_null, unique) that MUST be present for every model in this tier. |
| min_score | int | The minimum QualityScore (0-100) required for models in this tier. |
| overridable | bool | If false, lower levels (Domain/Product) cannot modify these settings. |
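To make the gate options concrete, the sketch below evaluates a model against a tier and reports which of the error codes from this reference would apply. It is an assumption-laden illustration: `QualityGate` and `evaluate_gate` are hypothetical names, checks are represented as simple `(test type, column)` pairs, and coverage is computed as the share of columns with at least one check, following the option descriptions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QualityGate:
    """One tier policy; defaults mirror the bronze tier."""
    min_test_coverage: float = 0.0
    required_tests: tuple[str, ...] = ()
    min_score: int = 0


def evaluate_gate(
    gate: QualityGate,
    columns: list[str],
    checks: list[tuple[str, Optional[str]]],  # (test type, column or None)
    score: float,
) -> list[str]:
    """Return the error codes a model would trigger for this gate."""
    violations = []
    if score < gate.min_score:
        violations.append("FLOE-DQ102")  # score below min_score

    # Coverage: percentage of columns with at least one check.
    tested = {column for _, column in checks if column is not None}
    covered = tested & set(columns)
    # Assumption: a model with no columns counts as fully covered.
    coverage = 100.0 * len(covered) / len(columns) if columns else 100.0
    if coverage < gate.min_test_coverage:
        violations.append("FLOE-DQ103")  # coverage violation

    present = {test_type for test_type, _ in checks}
    if any(required not in present for required in gate.required_tests):
        violations.append("FLOE-DQ104")  # missing required tests
    return violations


silver = QualityGate(min_test_coverage=80, required_tests=("not_null", "unique"), min_score=75)
columns = ["customer_id", "email", "country", "created_at", "updated_at"]
checks = [("not_null", "customer_id"), ("unique", "customer_id"), ("not_null", "email")]
print(evaluate_gate(silver, columns, checks, score=72))  # ['FLOE-DQ102', 'FLOE-DQ103']
```

In this example only two of five columns are tested (40% coverage) and the score of 72 is below the silver minimum of 75, while both required test types are present.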

Data teams define specific checks for their transforms in the quality section of floe.yaml.

transforms:
  - name: dim_customers
    type: dbt
    quality_tier: gold               # selects policy from manifest.yaml
    quality:
      checks:
        - name: id_not_null          # unique identifier for the check
          type: not_null             # check type (provider-specific)
          column: customer_id        # target column
          dimension: completeness    # completeness, accuracy, validity, consistency, timeliness
          severity: critical         # critical, warning, info
          parameters:                # optional parameters for the check
            custom_arg: value
          enabled: true              # default: true
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Unique identifier for the check within the model. |
| type | str | Yes | The type of check (e.g., not_null, unique, expect_column_values_to_be_between). |
| column | str | No | Target column name. Omit for table-level checks. |
| dimension | Dimension | Yes | The quality dimension: completeness, accuracy, validity, consistency, or timeliness. |
| severity | Severity | No | Impact of failure: critical (default), warning, or info. |
| custom_weight | float | No | Override the default severity weight (0.1 - 10.0). |
| parameters | dict | No | Dictionary of parameters passed to the quality provider. |
| enabled | bool | No | Whether the check should be executed (default: true). |
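The field rules in the table can be expressed as a small validating data class. This is a sketch only: `QualityCheck` here is a hypothetical stand-in, not floe's actual model, and it enforces just what the table states (required fields, allowed dimension and severity values, defaults, and the 0.1 to 10.0 range for custom_weight).

```python
from dataclasses import dataclass, field
from typing import Any, Optional

DIMENSIONS = {"completeness", "accuracy", "validity", "consistency", "timeliness"}
SEVERITIES = {"critical", "warning", "info"}


@dataclass
class QualityCheck:
    name: str                               # required
    type: str                               # required, provider-specific
    dimension: str                          # required
    column: Optional[str] = None            # None means a table-level check
    severity: str = "critical"              # default per the table above
    custom_weight: Optional[float] = None   # overrides the severity weight
    parameters: dict[str, Any] = field(default_factory=dict)
    enabled: bool = True

    def __post_init__(self) -> None:
        if not self.name or not self.type:
            raise ValueError("name and type must be non-empty")
        if self.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {self.dimension!r}")
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if self.custom_weight is not None and not 0.1 <= self.custom_weight <= 10.0:
            raise ValueError("custom_weight must be between 0.1 and 10.0")


# The check from the floe.yaml example, as it might look after YAML parsing.
raw = {
    "name": "id_not_null",
    "type": "not_null",
    "column": "customer_id",
    "dimension": "completeness",
    "severity": "critical",
    "parameters": {"custom_arg": "value"},
    "enabled": True,
}
check = QualityCheck(**raw)
print(check.name, check.dimension, check.severity)  # id_not_null completeness critical
```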

floe uses a hierarchical configuration model that allows for central governance with local flexibility.

  1. Enterprise Level (manifest.yaml): Sets global standards and “locked” policies.
  2. Domain Level (domain.yaml): (Optional) Refines standards for a specific business domain.
  3. Product Level (floe.yaml): Implements specific checks within the guardrails set by the Enterprise and Domain.

Settings can be locked using overridable: false. Any attempt to weaken a locked setting at a lower level will result in a FLOE-DQ107 error.
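The locking rule can be illustrated with a small merge function. Everything here is a hypothetical sketch rather than floe's resolver: it assumes a gate is a plain dictionary, that "weakening" means lowering min_score or min_test_coverage or dropping a required test, and that a gate without an overridable key can be overridden.

```python
class LockedSettingError(Exception):
    """Raised when a lower level weakens a locked setting."""
    code = "FLOE-DQ107"


def weakened_settings(parent: dict, child: dict) -> list[str]:
    """Names of the settings that the child makes less strict than the parent."""
    weakened = []
    for key in ("min_score", "min_test_coverage"):
        if key in child and child[key] < parent[key]:
            weakened.append(key)
    if "required_tests" in child:
        # Dropping any test the parent requires counts as weakening.
        if not set(parent["required_tests"]) <= set(child["required_tests"]):
            weakened.append("required_tests")
    return weakened


def apply_override(parent: dict, child: dict) -> dict:
    """Merge a lower-level override into a gate, honouring overridable: false."""
    locked = not parent.get("overridable", True)  # assumption: default is overridable
    weakened = weakened_settings(parent, child)
    if locked and weakened:
        raise LockedSettingError(
            f"{LockedSettingError.code}: cannot weaken locked settings {weakened}"
        )
    merged = {**parent, **child}
    if locked:
        merged["overridable"] = False  # a lower level cannot unlock the gate
    return merged


gold = {
    "min_test_coverage": 100,
    "required_tests": ["not_null", "unique", "accepted_values", "relationships"],
    "min_score": 90,
    "overridable": False,
}
print(apply_override(gold, {"min_score": 95})["min_score"])  # 95
try:
    apply_override(gold, {"min_score": 85})
except LockedSettingError as err:
    print(err)  # FLOE-DQ107: cannot weaken locked settings ['min_score']
```

Raising the gold minimum score to 95 is accepted because it tightens the policy; lowering it to 85 is rejected.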

The quality plugin automatically emits OpenLineage FAIL events when quality checks fail. This integration allows you to track data quality issues in tools like Marquez or Atlan.

To enable OpenLineage emission, ensure your LineageBackend is configured in manifest.yaml:

lineage:
  backend: marquez
  url: http://marquez:5000

The QualityPlugin will then use the OpenLineageEmitter to report:

  • The failed job_name and dataset_name.
  • Specific check results that caused the failure.
  • Impacted quality dimensions.
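As a rough picture of what such a report could contain, the sketch below builds a FAIL event as a plain dictionary. The top-level field names (eventType, eventTime, run, job, inputs, producer) follow the OpenLineage run event model, and dataQualityAssertions is modeled on the OpenLineage data quality assertions facet. The function name, the producer URL, the default namespace and the floe_qualityDimensions facet are hypothetical. The payload floe's OpenLineageEmitter actually sends is not documented here, and this sketch omits schema URLs and facet metadata, so it is not a spec-complete event.

```python
import uuid
from datetime import datetime, timezone


def build_fail_event(job_name: str, dataset_name: str, failed_checks: list[dict],
                     namespace: str = "floe") -> dict:
    """Build an OpenLineage-style FAIL event for a set of failed checks."""
    # Impacted dimensions, de-duplicated and sorted for stable output.
    dimensions = sorted({check["dimension"] for check in failed_checks})
    return {
        "eventType": "FAIL",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/floe-quality-plugin",  # hypothetical
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [
            {
                "namespace": namespace,
                "name": dataset_name,
                "inputFacets": {
                    "dataQualityAssertions": {
                        "assertions": [
                            {
                                "assertion": check["type"],
                                "success": False,
                                "column": check.get("column"),
                            }
                            for check in failed_checks
                        ]
                    },
                    # Hypothetical custom facet carrying the impacted dimensions.
                    "floe_qualityDimensions": {"dimensions": dimensions},
                },
            }
        ],
    }


event = build_fail_event(
    job_name="dim_customers",
    dataset_name="analytics.dim_customers",  # hypothetical dataset name
    failed_checks=[
        {"name": "id_not_null", "type": "not_null",
         "column": "customer_id", "dimension": "completeness"},
    ],
)
print(event["eventType"], event["job"]["name"])  # FAIL dim_customers
```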
| Code | Message | Resolution |
|---|---|---|
| FLOE-DQ001 | Quality Provider Not Found | Ensure the provider in manifest.yaml matches an installed plugin (great_expectations or dbt_expectations). |
| FLOE-DQ102 | Quality Check Failed | The achieved quality score was below min_score. Review the runtime results for specific check failures. |
| FLOE-DQ103 | Coverage Violation | Your model's test coverage is below the tier minimum (min_test_coverage). Add checks to more columns. |
| FLOE-DQ104 | Missing Required Tests | One or more tests required by the tier (e.g., unique) are missing from your configuration. |
| FLOE-DQ105 | Invalid Column Reference | The check refers to a column that does not exist in the source data or dbt model. |
| FLOE-DQ106 | Quality Check Timeout | The checks took longer than check_timeout_seconds. Optimize your queries or increase the timeout. |
| FLOE-DQ107 | Locked Setting Override | You tried to override a setting that the Platform Team has marked as overridable: false. |