#AgenticDataStack

How Do Data Systems Evolve in an Agentic World?

A community exploring how modern data stacks evolve into agentic data stacks — enabling AI-powered data transformation, autonomous data engineering with data agents, and intelligent, agent-driven data systems.

Vendor-neutral · Practice-driven · Architecture-first

The Agentic Data Stack

A reference architecture for intelligent data systems designed for agents — enabling autonomous data pipelines, context-aware operations, AI-powered reasoning, and governed execution.

Agentic Experience Layer
Data Agents · Copilots · Analysts · Engineers · APIs · Chat Interfaces

1. Context & Semantic Intelligence
Catalog · Metadata · Lineage · Ownership · Semantic Layer · Metrics · Business Glossary · Policy Context
Makes data understandable for agents and provides shared language, constraints, and memory.

2. Planning, Orchestration & Skills
Planner · Workflow Engine · MCP / Tooling · Skills · Sub-agents · Human Approval · Memory Routing
Turns requests into structured plans, invokes tools, coordinates multi-step workflows, and manages control flow.

3. Execution & Data Operations
SQL Engines · Lakehouse / Warehouse · ETL / ELT · Transformations · Data Quality · Versioning · Time Travel
Executes queries, pipelines, model builds, backfills, validation, and operational actions across the data platform.

4. Storage, Systems & Governance Foundation
Table Formats · Object Storage · Databases · Stream Sources · IAM · Security · Observability · Cost Control
Provides durable data, scalable infrastructure, access control, telemetry, and governance guardrails.

Context → shared meaning · Planning → task routing · Execution → operational action · Foundation → control & trust

Continuous Feedback Loop: Evaluation · Monitoring · Cost Optimization · Reliability · Governance Improvement
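The layered flow above can be sketched as a single control loop: look up context, plan, execute, record feedback. This is purely illustrative — the function names and the in-memory "catalog" are assumptions, not part of any specific tool; a real stack would back each layer with a catalog service, an orchestrator, and a query engine.

```python
# Illustrative sketch of the four-layer agentic control flow.
# All names here are hypothetical.

CATALOG = {  # Layer 1: context & semantic intelligence
    "orders": {"owner": "sales-data", "metric": "revenue = SUM(amount)"},
}

def plan(request: str) -> list[str]:
    # Layer 2: turn a request into an ordered list of steps.
    return ["lookup_context", "run_query", "record_feedback"]

def execute(step: str, table: str) -> str:
    # Layer 3: perform the operational action for one step.
    if step == "lookup_context":
        return f"context: {CATALOG[table]['metric']}"
    if step == "run_query":
        # Layer 4 (storage/engine) would actually run this statement.
        return f"SELECT SUM(amount) FROM {table}"
    return "feedback recorded"  # feeds the continuous feedback loop

def handle(request: str, table: str = "orders") -> list[str]:
    return [execute(step, table) for step in plan(request)]
```

The point of the sketch is the separation: the planner never touches storage directly, and every step is observable for the feedback loop.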

What we focus on

The modern data stack was optimized for humans. The agentic data stack is optimized for data agents, autonomous data pipelines, and AI-powered analytics driven by data engineering agents.

Autonomous Data Engineering

Data engineering agents autonomously build models, pipelines, dashboards, and analytics from raw data — then continuously optimize, monitor, and maintain them with intelligent automation.

AI Agent Data Architecture

Build intelligent data systems designed for AI agents. Agentic data catalogs, semantic layers, lakehouse engines, and autonomous data pipelines — architected for agent-driven operations and data stack automation with AI.

Operating Agentic Data Systems

Learn from data platform teams running agentic workloads in production. Real architectures for agent-ready data quality, real trade-offs in AI-powered data transformation, real lessons from operating intelligent data systems.

Vendor-Neutral Architecture Patterns

Building an agentic data stack requires repeatable patterns and system-level thinking — not vendor pitches. We focus on architecture blueprints for catalog services for AI agents, agentic ETL pipelines, and data engineering automation.

People don't disappear — their roles evolve

Data Engineers

will focus on architecture, layering, and building better context for agents.

Data Analysts

will define directions, hypotheses, and the right questions, while agents explore and reason autonomously.

How to participate

Join a growing community of data engineers, analysts, and architects shaping the agentic future of data.

01

Follow

Follow the Agentic Data Stack LinkedIn page and Luma calendar to stay up to date on events, discussions, and new content.

02

Speak

Submit a talk via our Call for Proposals. Share your experience building, operating, or designing agentic data systems.

03

Learn

Explore past sessions and shared materials. Learn from real production practices across data platform teams worldwide.

Call for Proposals

10 tracks covering the full spectrum of agentic data systems. Share your experience — submit a talk.

1

Foundations of Agentic Data Stack

  • What makes a modern data stack agentic?
  • Agentic vs traditional data architectures
  • From passive data platforms to active, reasoning systems
  • Lessons learned when introducing agents into data stacks
2

Catalog, Metadata & Context

  • Catalog services as agent memory
  • Metadata, lineage, and ownership for agent reasoning
  • Making data catalogs actionable for agents
  • Context modeling for large-scale data systems
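"Catalog services as agent memory" can be made concrete with a small sketch: a catalog entry that carries lineage and ownership, plus a writable notes field an agent uses to persist what it learned between runs. The field names (`owner`, `lineage`, `notes`) and the `remember` helper are assumptions for illustration, not any real catalog API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a catalog entry doubling as agent memory.

@dataclass
class CatalogEntry:
    name: str
    owner: str
    lineage: list[str] = field(default_factory=list)   # upstream tables
    notes: list[str] = field(default_factory=list)     # agent-written memory

catalog: dict[str, CatalogEntry] = {
    "daily_revenue": CatalogEntry(
        name="daily_revenue", owner="finance", lineage=["orders", "refunds"]
    )
}

def remember(table: str, observation: str) -> None:
    # An agent writes back what it learned, so the next run starts with context.
    catalog[table].notes.append(observation)

remember("daily_revenue", "refunds arrive up to 3 days late; avoid same-day totals")
```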
3

Lake Formats, Versioning & Time Travel

  • Table formats and branching for agent workflows
  • Time travel as checkpoints for agents
  • Reproducibility and auditability in agent-driven analytics
  • Managing cost and performance with large numbers of versions
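"Time travel as checkpoints" can be sketched with the snapshot semantics that table formats like Apache Iceberg or Delta Lake provide: every commit is a snapshot, and a bad agent action is undone by restoring an earlier one. The in-memory `VersionedTable` and its method names are illustrative, not a real table-format client.

```python
# Sketch of snapshot-based checkpoints for agent workflows.

class VersionedTable:
    def __init__(self) -> None:
        self.rows: list[dict] = []
        self.snapshots: list[list[dict]] = []

    def commit(self) -> int:
        # Each commit is a checkpoint the agent can return to.
        self.snapshots.append(list(self.rows))
        return len(self.snapshots) - 1

    def rollback(self, snapshot_id: int) -> None:
        # Undo a bad agent action by restoring an earlier snapshot.
        self.rows = list(self.snapshots[snapshot_id])

t = VersionedTable()
t.rows.append({"id": 1, "amount": 100})
good = t.commit()
t.rows.append({"id": 1, "amount": -999})  # a faulty backfill by an agent
t.rollback(good)
```

Real table formats add what the sketch omits: atomic commits, branching, and retention policies — the last of which is exactly the cost question in the bullet above.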
4

Lakehouse/Warehouse Engines & Execution

  • Query engines as components for agents
  • Plan-aware and cost-aware execution for agent workloads
  • Feedback loops between agents and query engines
  • Optimizing execution for iterative and exploratory agents
5

Semantic Layer & Metrics

  • Metrics and semantic models as first-class agent knowledge
  • Metric reasoning, attribution, and explanation
  • Bridging BI semantics and agent planning
  • Semantic consistency across agents, dashboards, and pipelines
6

Data Agents & Agent Frameworks

  • Designing data agents over data stacks
  • Sub-agents, skills, and task-oriented architectures
  • Human-in-the-loop design for data agents
  • Evaluating reliability and correctness of data agents
7

ETL / ELT & Agentic Pipelines

  • Pipelines operated or assisted by agents
  • Agent-driven data quality checks and recovery
  • Backfills, schema evolution, and migration with agents
  • From DAGs to adaptive workflows
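"From DAGs to adaptive workflows" can be illustrated in a few lines: instead of a fixed edge between tasks, the agent retries a failing step and then switches to an alternative path. The task names and failure model are illustrative assumptions.

```python
# Sketch of an adaptive pipeline step: retry, then fall back.

def run_with_fallback(primary, fallback, retries: int = 2):
    for _ in range(retries):
        try:
            return primary()
        except RuntimeError:
            continue  # transient failure: retry the same task
    return fallback()  # adapt: switch to an alternative path

calls = {"n": 0}

def flaky_load():
    calls["n"] += 1
    raise RuntimeError("source unavailable")

def load_from_backup():
    return "loaded from backup"
```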
8

Agentic Data Stack in Production

  • Real-world agentic data stack architectures
  • Build vs buy decisions and trade-offs
  • Migration stories from traditional data stacks
  • What worked — and what didn't — in production
9

Platform Engineering, DevOps & SRE

  • Operating agents and data systems together
  • Cost control, isolation, and multi-tenancy
  • Observability for agent workflows (logs, traces, feedback)
  • Security, permissions, and access control for agents
10

Governance, Reliability & Evaluation

  • Guardrails and approval flows for agent actions
  • Governance and compliance in agentic data systems
  • Offline and online evaluation of agent workflows
  • Continuous improvement with feedback loops

Frequently Asked Questions

What is an agentic data stack?

An agentic data stack is a data architecture designed for AI agents to autonomously build, query, and optimize data pipelines. It includes MCP-enabled catalogs, versioned lake formats, API-first SQL engines, and intelligent orchestration tools that agents can interact with programmatically.

How does it differ from a traditional data stack?

Traditional data stacks require human data engineers to manually configure pipelines. Agentic data stacks provide API-first interfaces, declarative configurations, and self-service capabilities that allow AI agents to autonomously discover data, build transformations, and optimize queries without human intervention.

What are the components of an agentic data stack?

An agentic data stack typically consists of: (1) MCP-enabled catalog services for metadata discovery, (2) versioned lake formats like Apache Iceberg or Delta Lake for time travel and rollback, (3) API-first SQL engines like Trino or DuckDB, (4) semantic layers for business logic abstraction, (5) declarative ETL/ELT tools, and (6) data agents for autonomous operations.

Is the agentic data stack free and open-source?

Yes, most agentic data stack components are open-source and free to use. This includes Apache Iceberg, Delta Lake, DuckDB, Trino, dbt, Airbyte, Apache Airflow, Unity Catalog, Apache Superset, and many more. We maintain a directory of 27+ free tools across 8 categories.

Who should use an agentic data stack?

Agentic data stacks are ideal for data engineering teams building AI-powered analytics, organizations deploying autonomous data pipelines, and companies looking to reduce manual data engineering work through intelligent automation. Both startups and enterprises can benefit from agentic architecture patterns.