A recent Collibra blog post outlining the “Only 4 Types of DQ Programs” offers a useful framing of how organizations currently approach data quality. It identifies four categories: programs driven by policy or regulatory mandates, ML-driven approaches, reports and roll-ups, and anomaly-detection-based systems. It’s a practical lens, reflecting both the business motivations and the common technical approaches in use today.
But as we move deeper into the AI era, we believe the conversation needs to go further. These four types are important, but they’re not sufficient. The modern enterprise doesn’t just need better data quality—it needs a more agentic, dynamic, and operationally embedded approach to managing data.
- Data environments are no longer static: Traditional data quality programs assume a relatively stable data ecosystem. But modern enterprises deal with dynamic pipelines, rapidly changing schemas, distributed ownership, and real-time demands. This requires a management model that is embedded into operations, not layered on top.
- Complexity demands system-wide intelligence: Issues with data quality are often symptoms of deeper problems—like pipeline failures, misaligned business logic, or cost-performance tradeoffs. A dynamic, operationally rooted agentic approach lets systems reason across those dimensions and act holistically.
- AI introduces new operational possibilities: With telemetry, LLMs, and autonomous agents, we’re no longer limited to dashboards and rules. We can now design systems that observe, learn, and act with context. That’s the essence of agency—not just running automations but making adaptive, goal-driven decisions.
Let’s explore how data quality—and data management more broadly—must evolve to support these new demands.
1. From Discrete Approaches to Layered Orchestration
Collibra’s framing of four DQ program types highlights the diversity of approaches in use today—some driven by external mandates, others by analytics or machine learning workflows. These approaches often coexist within the same organization. But in today’s fast-moving AI environments, coexistence isn't enough.
Modern data systems demand simultaneous orchestration. For example:
- A policy-driven DQ rule may prevent bad data from entering a pipeline.
- A report might flag a downstream data discrepancy.
- An ML model may detect drift in behavior patterns.
- An anomaly detection system may flag unusual spikes in usage or cost.
Each approach brings value—but only when coordinated do they create resilient, intelligent systems. In dynamic environments where schemas evolve overnight, volumes shift unpredictably, and AI models retrain continuously, DQ systems must operate as a layered fabric, not isolated tactics.
Success isn’t about picking one approach—it’s about orchestrating all of them in real time.
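To make that layering concrete, here is a minimal Python sketch of what coordinated checks can look like. Every name in it is an invented illustration, not a product API: the point is simply that a policy gate, an anomaly check, and any other layer feed one shared stream of findings instead of operating in isolation.

```python
# A minimal sketch of layered DQ orchestration. All names here are
# hypothetical illustrations, not a real product API: each check type
# (policy, anomaly, and so on) emits findings into one shared stream
# so downstream logic can reason over them together.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Finding:
    source: str      # which DQ layer raised it
    severity: str    # "block", "warn", "info"
    detail: str

def policy_gate(batch: list[dict]) -> list[Finding]:
    # Policy-driven rule: reject rows missing a mandated field.
    return [Finding("policy", "block", f"row {i} missing 'customer_id'")
            for i, row in enumerate(batch) if not row.get("customer_id")]

def anomaly_check(batch: list[dict], expected: float) -> list[Finding]:
    # Anomaly layer: flag unusual volume relative to a rolling expectation.
    if expected and abs(len(batch) - expected) / expected > 0.5:
        return [Finding("anomaly", "warn",
                        f"volume {len(batch)} vs ~{expected:.0f} expected")]
    return []

def orchestrate(batch: list[dict], checks: list[Callable]) -> list[Finding]:
    # Run every layer on the same batch and pool the findings, so
    # no single approach is the sole line of defense.
    findings: list[Finding] = []
    for check in checks:
        findings.extend(check(batch))
    return findings

batch = [{"customer_id": "a1"}, {}]
for f in orchestrate(batch, [policy_gate, lambda b: anomaly_check(b, 100)]):
    print(f.source, f.severity, f.detail)
```

The design choice worth noticing is the shared `Finding` stream: once every layer speaks the same language, an orchestrator (or an agent) can weigh a policy block against an anomaly warning instead of treating each alert in a vacuum.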
2. Autonomy is a Goal—Agency is the System
Autonomy, defined as systems acting independently without human intervention, is only part of the picture. It’s a necessary goal, but a goal still needs a system to achieve it.
Agency is that system, and it adds the missing dimensions: the ability to reason, to weigh trade-offs, to align actions with business goals, and to coordinate with other systems. Think of it as autonomy made meaningful, shaped by awareness, intentionality, and adaptability to evolving business needs.
In an agentic data management approach, intelligent agents:
- Learn from telemetry across the data lifecycle
- Correlate signals across quality, performance, cost, and lineage
- Decide and act (or recommend actions) based on business context
- Continuously evolve their behavior based on feedback
- Operate in dynamic, distributed, and real-time environments
- Adapt to shifting schemas, system topologies, and business needs
Agency supplies the contextual intelligence that makes autonomy effective, empowering systems to navigate ambiguity, make informed decisions, and keep adapting as conditions change.
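As a thought experiment, here is a minimal Python sketch of that loop. The signal names, thresholds, and actions are invented stand-ins for real telemetry, not anyone’s API: the agent correlates signals across domains, chooses an action in context, and adjusts its own thresholds from operator feedback, which is the learn-and-evolve behavior described above.

```python
# A hypothetical agentic decision loop: observe signals, correlate
# across domains, act or recommend, and learn from feedback.
# All names and thresholds are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class Signal:
    domain: str   # "quality", "cost", "performance", "lineage"
    name: str
    value: float

@dataclass
class Agent:
    # Feedback-adjusted thresholds let the agent's behavior evolve.
    thresholds: dict = field(
        default_factory=lambda: {"null_rate": 0.05, "cost_per_run": 20.0})

    def decide(self, signals: list[Signal]) -> list[str]:
        # Correlate signals across domains, then pick a context-aware action.
        by_name = {s.name: s.value for s in signals}
        actions: list[str] = []
        if by_name.get("null_rate", 0) > self.thresholds["null_rate"]:
            if by_name.get("cost_per_run", 0) > self.thresholds["cost_per_run"]:
                # Quality and cost degraded together: likely an upstream
                # pipeline problem, so recommend rather than blindly re-run.
                actions.append("recommend: pause pipeline, inspect upstream source")
            else:
                actions.append("act: quarantine failing partition, re-run checks")
        return actions

    def learn(self, name: str, false_positive: bool) -> None:
        # Loosen a threshold when operators mark its alerts as noise.
        if false_positive:
            self.thresholds[name] *= 1.1

agent = Agent()
print(agent.decide([Signal("quality", "null_rate", 0.12),
                    Signal("cost", "cost_per_run", 35.0)]))
agent.learn("null_rate", false_positive=True)
```

Even in this toy form, the difference from plain automation is visible: the same quality signal triggers different actions depending on what the cost signal says, and operator feedback reshapes future decisions.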
3. Static Metadata Can’t Keep Up with Dynamic Data Systems
Traditional governance frameworks are rooted in metadata—definitions, policies, ownership. These are essential, but in today’s complex ELT/ETL environments, data systems generate dynamic metadata every second:
- Warehouse performance metrics
- API error rates
- Drift in schema or data distributions
- Cost spikes on cloud workloads
Capturing this in motion and making sense of it requires observability—not just governance. It requires a reasoning engine, not just a catalog.
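A simple way to picture the difference: a catalog entry records what a table *is*, while dynamic metadata records how the system is *behaving* right now. The sketch below, using simplified statistics and invented numbers, treats a stream of hourly cost readings as rolling metadata and flags a spike when a reading drifts well outside recent history.

```python
# A minimal sketch of telemetry as dynamic metadata. The drift test
# (3 rolling standard deviations) is a simplified stand-in for what a
# real observability layer would compute; all values are illustrative.
import statistics
from collections import deque

class DynamicMetadata:
    """Rolling, in-motion metadata, as opposed to a static catalog entry."""
    def __init__(self, window: int = 50):
        self.values: deque[float] = deque(maxlen=window)

    def observe(self, value: float) -> bool:
        # Flag the observation when it drifts beyond 3 rolling standard
        # deviations of recent history, then fold it into the window.
        drifted = False
        if len(self.values) >= 10:
            mean = statistics.fmean(self.values)
            stdev = statistics.pstdev(self.values)
            drifted = stdev > 0 and abs(value - mean) > 3 * stdev
        self.values.append(value)
        return drifted

cost = DynamicMetadata()
for hourly_cost in [4.1, 4.3, 3.9, 4.0, 4.2, 4.1, 4.0, 3.8, 4.2, 4.1, 19.5]:
    if cost.observe(hourly_cost):
        print(f"cost spike detected: {hourly_cost}")
```

The same pattern applies to API error rates or distribution drift: the metadata is a moving window of behavior, not a field filled in once at design time.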
4. Data Quality is Only One Thread in a Much Larger System of Interdependencies
Modern data teams don’t just struggle with quality—they grapple with:
- Pipeline reliability
- Cloud cost overruns
- Data product SLAs
- Regulatory data traceability
These issues are interwoven. Fixing a data quality issue might require optimizing compute, changing ingestion patterns, or adjusting business logic. An agentic approach understands this interconnectedness and operates across it.
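One way to picture that interconnectedness is as a dependency graph over symptoms, where a quality or SLA issue is traced back through pipelines to its operational root. The graph below is an invented illustration, but it shows why an agentic system must reason across layers rather than patch the symptom it first observes.

```python
# A hypothetical dependency graph linking a data quality symptom to
# its operational root cause. The node names are invented for
# illustration; a real system would derive this graph from lineage
# and telemetry.
DEPENDS_ON = {
    "dashboard_sla_breach": ["table_freshness_low"],
    "table_freshness_low": ["ingestion_job_retrying"],
    "ingestion_job_retrying": ["warehouse_queue_saturated"],
    "warehouse_queue_saturated": [],  # root: a compute/cost issue
}

def root_causes(symptom: str) -> list[str]:
    # Walk the dependency chain from a surface symptom to its roots.
    causes = DEPENDS_ON.get(symptom, [])
    if not causes:
        return [symptom]
    roots: list[str] = []
    for cause in causes:
        roots.extend(root_causes(cause))
    return roots

# The quality symptom resolves to a compute problem, so the right fix
# is resizing the warehouse, not rewriting a quality rule.
print(root_causes("dashboard_sla_breach"))  # ['warehouse_queue_saturated']
```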
5. Acceldata’s View: Observability + Intelligence + Action
At Acceldata, we believe in a broader vision for data excellence. Our platform combines:
- Deep observability across data, pipelines, systems, and cost
- The xLake Reasoning Engine to correlate signals and identify root causes
- Intelligent Agents that drive preventive, corrective, and optimizing actions
This isn’t just about quality. It’s about operational excellence across the entire data stack.
In Closing: Evolving the Narrative
We applaud Collibra for framing a valuable discussion around data quality implementation models. But in today’s AI-powered, telemetry-rich, real-time environments, we must evolve that framing.
The future of data management is agentic.
Let’s not just detect issues. Let’s reason. Let’s act. Let’s build data systems that don’t just monitor themselves but improve themselves.
Acceldata is leading the way.