#AgenticDataStack

How Do Data Systems Evolve in an Agentic World?

A community exploring how modern data stacks evolve into agentic data stacks — enabling AI-powered data transformation, autonomous data engineering with data agents, and intelligent, agent-driven data systems.

Vendor-neutral · Practice-driven · Architecture-first

The Agentic Data Stack

A reference architecture for intelligent data systems designed for agents — enabling autonomous data pipelines, context-aware operations, AI-powered reasoning, and governed execution.

Agentic Experience Layer
Data Agents · Copilots · Analysts · Engineers · APIs · Chat Interfaces

1. Context & Semantic Intelligence
Catalog · Metadata · Lineage · Ownership · Semantic Layer · Metrics · Business Glossary · Policy Context
Makes data understandable for agents and provides shared language, constraints, and memory.

2. Planning, Orchestration & Skills
Planner · Workflow Engine · MCP / Tooling · Skills · Sub-agents · Human Approval · Memory Routing
Turns requests into structured plans, invokes tools, coordinates multi-step workflows, and manages control flow.

3. Execution & Data Operations
SQL Engines · Lakehouse / Warehouse · ETL / ELT · Transformations · Data Quality · Versioning · Time Travel
Executes queries, pipelines, model builds, backfills, validation, and operational actions across the data platform.

4. Storage, Systems & Governance Foundation
Table Formats · Object Storage · Databases · Stream Sources · IAM · Security · Observability · Cost Control
Provides durable data, scalable infrastructure, access control, telemetry, and governance guardrails.

Context → shared meaning · Planning → task routing · Execution → operational action · Foundation → control & trust

Continuous Feedback Loop: Evaluation · Monitoring · Cost Optimization · Reliability · Governance Improvement
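The layered flow above can be sketched as a single control loop: look up context, plan, execute, record feedback. This is purely illustrative — the function names and the in-memory "catalog" are assumptions, not part of any specific tool; a real stack would back each layer with a catalog service, an orchestrator, and a query engine.

```python
# Illustrative sketch of the four-layer agentic control flow.
# All names here are hypothetical.

CATALOG = {  # Layer 1: context & semantic intelligence
    "orders": {"owner": "sales-data", "metric": "revenue = SUM(amount)"},
}

def plan(request: str) -> list[str]:
    # Layer 2: turn a request into an ordered list of steps.
    return ["lookup_context", "run_query", "record_feedback"]

def execute(step: str, table: str) -> str:
    # Layer 3: perform the operational action for one step.
    if step == "lookup_context":
        return f"context: {CATALOG[table]['metric']}"
    if step == "run_query":
        # Layer 4 (storage/engine) would actually run this statement.
        return f"SELECT SUM(amount) FROM {table}"
    return "feedback recorded"  # feeds the continuous feedback loop

def handle(request: str, table: str = "orders") -> list[str]:
    return [execute(step, table) for step in plan(request)]
```

The point of the sketch is the separation: the planner never touches storage directly, and every step is observable for the feedback loop.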

What we focus on

The modern data stack was optimized for humans. The agentic data stack is optimized for data agents, autonomous data pipelines, and AI-powered analytics driven by data engineering agents.

Autonomous Data Engineering

Data engineering agents autonomously build models, pipelines, dashboards, and analytics from raw data — then continuously optimize, monitor, and maintain them with intelligent automation.

AI Agent Data Architecture

Build intelligent data systems designed for AI agents. Agentic data catalogs, semantic layers, lakehouse engines, and autonomous data pipelines — architected for agent-driven operations and data stack automation with AI.

Operating Agentic Data Systems

Learn from data platform teams running agentic workloads in production. Real architectures for agent-ready data quality, real trade-offs in AI-powered data transformation, real lessons from operating intelligent data systems.

Vendor-Neutral Architecture Patterns

Building an agentic data stack requires repeatable patterns and system-level thinking — not vendor pitches. We focus on architecture blueprints for catalog services for AI agents, agentic ETL pipelines, and data engineering automation.

People don't disappear — their roles evolve

Data Engineers

will focus on architecture, layering, and building better context for agents.

Data Analysts

will define directions, hypotheses, and the right questions, while agents explore and reason autonomously.

How to participate

Join a growing community of data engineers, analysts, and architects shaping the agentic future of data.

01

Follow

Follow the Agentic Data Stack LinkedIn page and Luma calendar to stay up to date on events, discussions, and new content.

02

Speak

Submit a talk via our Call for Proposals. Share your experience building, operating, or designing agentic data systems.

03

Learn

Explore past sessions and shared materials. Learn from real production practices across data platform teams worldwide.

Call for Proposals

10 tracks covering the full spectrum of agentic data systems. Share your experience — submit a talk.

1

Foundations of Agentic Data Stack

  • What makes a modern data stack agentic?
  • Agentic vs traditional data architectures
  • From passive data platforms to active, reasoning systems
  • Lessons learned when introducing agents into data stacks
2

Catalog, Metadata & Context

  • Catalog services as agent memory
  • Metadata, lineage, and ownership for agent reasoning
  • Making data catalogs actionable for agents
  • Context modeling for large-scale data systems
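"Catalog services as agent memory" can be made concrete with a small sketch: a catalog entry that carries lineage and ownership, plus a writable notes field an agent uses to persist what it learned between runs. The field names (`owner`, `lineage`, `notes`) and the `remember` helper are assumptions for illustration, not any real catalog API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a catalog entry doubling as agent memory.

@dataclass
class CatalogEntry:
    name: str
    owner: str
    lineage: list[str] = field(default_factory=list)   # upstream tables
    notes: list[str] = field(default_factory=list)     # agent-written memory

catalog: dict[str, CatalogEntry] = {
    "daily_revenue": CatalogEntry(
        name="daily_revenue", owner="finance", lineage=["orders", "refunds"]
    )
}

def remember(table: str, observation: str) -> None:
    # An agent writes back what it learned, so the next run starts with context.
    catalog[table].notes.append(observation)

remember("daily_revenue", "refunds arrive up to 3 days late; avoid same-day totals")
```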
3

Lake Formats, Versioning & Time Travel

  • Table formats and branching for agent workflows
  • Time travel as checkpoints for agents
  • Reproducibility and auditability in agent-driven analytics
  • Managing cost and performance with large numbers of versions
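"Time travel as checkpoints" can be sketched with the snapshot semantics that table formats like Apache Iceberg or Delta Lake provide: every commit is a snapshot, and a bad agent action is undone by restoring an earlier one. The in-memory `VersionedTable` and its method names are illustrative, not a real table-format client.

```python
# Sketch of snapshot-based checkpoints for agent workflows.

class VersionedTable:
    def __init__(self) -> None:
        self.rows: list[dict] = []
        self.snapshots: list[list[dict]] = []

    def commit(self) -> int:
        # Each commit is a checkpoint the agent can return to.
        self.snapshots.append(list(self.rows))
        return len(self.snapshots) - 1

    def rollback(self, snapshot_id: int) -> None:
        # Undo a bad agent action by restoring an earlier snapshot.
        self.rows = list(self.snapshots[snapshot_id])

t = VersionedTable()
t.rows.append({"id": 1, "amount": 100})
good = t.commit()
t.rows.append({"id": 1, "amount": -999})  # a faulty backfill by an agent
t.rollback(good)
```

Real table formats add what the sketch omits: atomic commits, branching, and retention policies — the last of which is exactly the cost question in the bullet above.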
4

Lakehouse/Warehouse Engines & Execution

  • Query engines as components for agents
  • Plan-aware and cost-aware execution for agent workloads
  • Feedback loops between agents and query engines
  • Optimizing execution for iterative and exploratory agents
5

Semantic Layer & Metrics

  • Metrics and semantic models as first-class agent knowledge
  • Metric reasoning, attribution, and explanation
  • Bridging BI semantics and agent planning
  • Semantic consistency across agents, dashboards, and pipelines
6

Data Agents & Agent Frameworks

  • Designing data agents over data stacks
  • Sub-agents, skills, and task-oriented architectures
  • Human-in-the-loop design for data agents
  • Evaluating reliability and correctness of data agents
7

ETL / ELT & Agentic Pipelines

  • Pipelines operated or assisted by agents
  • Agent-driven data quality checks and recovery
  • Backfills, schema evolution, and migration with agents
  • From DAGs to adaptive workflows
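"From DAGs to adaptive workflows" can be illustrated in a few lines: instead of a fixed edge between tasks, the agent retries a failing step and then switches to an alternative path. The task names and failure model are illustrative assumptions.

```python
# Sketch of an adaptive pipeline step: retry, then fall back.

def run_with_fallback(primary, fallback, retries: int = 2):
    for _ in range(retries):
        try:
            return primary()
        except RuntimeError:
            continue  # transient failure: retry the same task
    return fallback()  # adapt: switch to an alternative path

calls = {"n": 0}

def flaky_load():
    calls["n"] += 1
    raise RuntimeError("source unavailable")

def load_from_backup():
    return "loaded from backup"
```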
8

Agentic Data Stack in Production

  • Real-world agentic data stack architectures
  • Build vs buy decisions and trade-offs
  • Migration stories from traditional data stacks
  • What worked — and what didn't — in production
9

Platform Engineering, DevOps & SRE

  • Operating agents and data systems together
  • Cost control, isolation, and multi-tenancy
  • Observability for agent workflows (logs, traces, feedback)
  • Security, permissions, and access control for agents
10

Governance, Reliability & Evaluation

  • Guardrails and approval flows for agent actions
  • Governance and compliance in agentic data systems
  • Offline and online evaluation of agent workflows
  • Continuous improvement with feedback loops

Frequently Asked Questions

What is an agentic data stack?

An agentic data stack is a data architecture designed for AI agents to autonomously build, query, and optimize data pipelines. It includes MCP-enabled catalogs, versioned lake formats, API-first SQL engines, and intelligent orchestration tools that agents can interact with programmatically.

How does it differ from a traditional data stack?

Traditional data stacks require human data engineers to manually configure pipelines. Agentic data stacks provide API-first interfaces, declarative configurations, and self-service capabilities that allow AI agents to autonomously discover data, build transformations, and optimize queries without human intervention.

What are the components of an agentic data stack?

An agentic data stack typically consists of: (1) MCP-enabled catalog services for metadata discovery, (2) versioned lake formats like Apache Iceberg or Delta Lake for time travel and rollback, (3) API-first SQL engines like Trino or DuckDB, (4) semantic layers for business logic abstraction, (5) declarative ETL/ELT tools, and (6) data agents for autonomous operations.

Is the agentic data stack free and open-source?

Yes, most agentic data stack components are open-source and free to use. This includes Apache Iceberg, Delta Lake, DuckDB, Trino, dbt, Airbyte, Apache Airflow, Unity Catalog, Apache Superset, and many more. We maintain a directory of 27+ free tools across 8 categories.

Who should use an agentic data stack?

Agentic data stacks are ideal for data engineering teams building AI-powered analytics, organizations deploying autonomous data pipelines, and companies looking to reduce manual data engineering work through intelligent automation. Both startups and enterprises can benefit from agentic architecture patterns.