Apache Iceberg
Open table format for huge analytic datasets, with agent-friendly branching and time travel.
About Apache Iceberg
Apache Iceberg is an open table format for very large analytic datasets. Its branching and time travel features suit agent-driven workflows, and it integrates with the agentic data stack ecosystem to support autonomous data operations.
Key Features
- ACID transactions on cloud object storage ensuring data integrity
- Schema evolution without table rewrites (add, rename, reorder columns)
- Hidden partitioning that decouples physical layout from query filters
- Time travel and version rollback for reproducible queries and recovery
- Multi-engine compatibility (Spark, Trino, Flink, Presto, Hive, Impala)
- Partition evolution allowing partition schemes to change without rewriting data
- Scalable metadata handling for tables with tens of petabytes of data
- File-level statistics enabling query engines to skip irrelevant data files
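Two of the features above, time travel with rollback and file-level statistics, can be illustrated with a toy model. The sketch below is not Iceberg's implementation or API. `ToyTable`, `DataFile`, and the `ts` column bounds are hypothetical names invented for illustration, and the model assumes each snapshot is simply an immutable list of data files carrying min/max values for one column.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataFile:
    """Hypothetical data file entry with min/max stats for a `ts` column."""
    path: str
    min_ts: int
    max_ts: int


@dataclass
class ToyTable:
    """Toy model: every commit creates a new immutable snapshot (a tuple of files)."""
    snapshots: dict = field(default_factory=lambda: {0: ()})
    current: int = 0

    def append(self, *files: DataFile) -> int:
        # A commit never mutates an old snapshot; it derives a new one.
        new_id = max(self.snapshots) + 1
        self.snapshots[new_id] = self.snapshots[self.current] + tuple(files)
        self.current = new_id
        return new_id

    def rollback(self, snapshot_id: int) -> None:
        # Rollback only moves the "current" pointer; no data is rewritten.
        if snapshot_id not in self.snapshots:
            raise KeyError(f"unknown snapshot {snapshot_id}")
        self.current = snapshot_id

    def plan_scan(self, lo: int, hi: int, snapshot_id=None) -> list:
        # Time travel: read any retained snapshot instead of the current one.
        sid = self.current if snapshot_id is None else snapshot_id
        # File skipping: keep only files whose [min_ts, max_ts] overlaps [lo, hi].
        return [f.path for f in self.snapshots[sid]
                if f.max_ts >= lo and f.min_ts <= hi]


table = ToyTable()
s1 = table.append(DataFile("a.parquet", 0, 99))
s2 = table.append(DataFile("b.parquet", 100, 199))

pruned = table.plan_scan(150, 160)                   # a.parquet is skipped
as_of_s1 = table.plan_scan(0, 500, snapshot_id=s1)   # reads the older snapshot
table.rollback(s1)
after_rollback = table.plan_scan(0, 500)             # current state is s1 again
```

In this model the snapshot created by the second append stays in `table.snapshots` after the rollback, which is what makes the older and newer states both queryable.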
Agent Integration
MCP Server
cloudera/iceberg-mcp-server
External Links
- MCP server providing read-only access to Iceberg tables via Impala with LangChain/OpenAI SDK integration
- Official REST API specification for Iceberg catalogs (the standardized API agents use)
- Native Python library for programmatic access to Iceberg table metadata and data, no Spark/JVM required
- Curated list of Apache Iceberg resources, tools, and ecosystem projects
- Documentation for git-like branching, tagging, and fast-forward merge on Iceberg tables
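As a rough sketch of what talking to a REST catalog involves, the helper below builds the URL for the load-table route, `/v1/{prefix}/namespaces/{namespace}/tables/{table}`, as defined in the REST catalog specification. It only constructs the URL and sends nothing. The function name `load_table_url`, the base URL, and the table names are hypothetical; the sketch assumes multi-level namespaces are joined with the unit separator character (`0x1F`) before URL encoding, and it percent-encodes the optional prefix as a single path segment, which may not match every catalog.

```python
from urllib.parse import quote


def load_table_url(base_url: str, namespace: list, table: str, prefix: str = "") -> str:
    """Build the load-table URL for a REST catalog (illustrative helper)."""
    # Multi-level namespaces are joined with 0x1F, then percent-encoded.
    encoded_ns = quote("\x1f".join(namespace), safe="")
    parts = ["v1"]
    if prefix:
        # Assumption: the prefix is treated as one opaque path segment.
        parts.append(quote(prefix, safe=""))
    parts += ["namespaces", encoded_ns, "tables", quote(table, safe="")]
    return base_url.rstrip("/") + "/" + "/".join(parts)


# Hypothetical catalog endpoint and identifiers.
url = load_table_url("http://localhost:8181", ["analytics", "web"], "events")
url_with_prefix = load_table_url("http://localhost:8181/", ["analytics"], "events", prefix="prod")
```

An agent would typically issue a `GET` against such a URL, with whatever authentication the catalog requires, to retrieve the table's metadata.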