Lake Format
Open table formats that bring ACID transactions, schema evolution, and time travel to data lakes.
Open table formats are the storage foundation of the agentic data stack. By bringing ACID transactions, schema evolution, and time travel to data lakes, they give autonomous agents the guarantees they need to modify shared data safely. An agent can change a dataset transactionally, knowing that concurrent readers will see consistent snapshots and that a bad write can be rolled back to an earlier snapshot.
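The snapshot and rollback behavior described above can be illustrated with a minimal in-memory sketch. This is a toy model, not the API of any real table format: `ToyTable`, `CommitConflict`, and every method name here are hypothetical, and real formats persist snapshots as metadata and data files rather than Python tuples.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Row = Tuple[int, str]


class CommitConflict(Exception):
    """Raised when a writer's base snapshot is no longer the current one."""


@dataclass
class ToyTable:
    # Hypothetical model: every snapshot is an immutable tuple of rows,
    # and the list index serves as the snapshot id. Snapshot 0 is empty.
    snapshots: List[Tuple[Row, ...]] = field(default_factory=lambda: [()])
    current: int = 0

    def read(self, snapshot_id: Optional[int] = None) -> Tuple[Row, ...]:
        """Return the rows of one snapshot (the current one by default)."""
        sid = self.current if snapshot_id is None else snapshot_id
        return self.snapshots[sid]

    def commit(self, base_id: int, new_rows: Sequence[Row]) -> int:
        """Optimistic commit: succeed only if nobody committed since base_id."""
        if base_id != self.current:
            raise CommitConflict(f"base {base_id} is stale, current is {self.current}")
        self.snapshots.append(tuple(new_rows))
        self.current = len(self.snapshots) - 1
        return self.current

    def rollback(self, snapshot_id: int) -> None:
        """Point the table back at an earlier snapshot without deleting history."""
        if not 0 <= snapshot_id < len(self.snapshots):
            raise ValueError(f"unknown snapshot {snapshot_id}")
        self.current = snapshot_id


table = ToyTable()
s1 = table.commit(0, [(1, "a")])
reader_view = table.read()                    # a reader holds snapshot s1
s2 = table.commit(s1, [(1, "a"), (2, "b")])   # a writer adds a row
table.rollback(s1)                            # the write is undone
```

In this sketch the reader's view is unaffected by the later commit because snapshots are never mutated in place, and a second writer that still uses a stale `base_id` gets a `CommitConflict` instead of silently overwriting data.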
Time travel is particularly valuable in agentic workflows. When agents detect anomalies or unexpected results, they can read previous versions of the data to determine what changed and when. This can turn debugging from a manual investigation into an automated diagnostic step. Schema evolution support means that structural changes, such as adding a column, are typically tracked in table metadata, so agents can adapt to changing data structures without breaking downstream pipelines.
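As a rough illustration of how an agent might use two versions of a table for diagnosis, the sketch below compares an older and a newer snapshot by key and reads an old record under a newer schema. The helper names `diff_snapshots` and `project`, the `id` key, and the sample records are all assumptions made for this example; a real table format would supply the snapshots through its own reader.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def diff_snapshots(old: List[Record], new: List[Record], key: str = "id") -> Dict[str, list]:
    """Report which keys were added, removed, or changed between two versions."""
    old_by_key = {r[key]: r for r in old}
    new_by_key = {r[key]: r for r in new}
    shared = old_by_key.keys() & new_by_key.keys()
    return {
        "added": sorted(new_by_key.keys() - old_by_key.keys()),
        "removed": sorted(old_by_key.keys() - new_by_key.keys()),
        "changed": sorted(k for k in shared if old_by_key[k] != new_by_key[k]),
    }


def project(record: Record, columns: List[str]) -> Record:
    """Read a record under a newer schema: columns added later come back as None."""
    return {c: record.get(c) for c in columns}


# Hypothetical sample data standing in for two snapshots of the same table.
v1 = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
v2 = [{"id": 1, "amount": 10}, {"id": 2, "amount": -20}, {"id": 3, "amount": 5}]

report = diff_snapshots(v1, v2)
widened = project(v1[0], ["id", "amount", "currency"])
```

Here `report` points at the record whose value changed and the record that appeared between versions, which is the kind of narrowed-down evidence an automated diagnostic step would start from.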
The convergence of open table formats around shared specifications further strengthens the agentic data stack by reducing vendor lock-in and letting agents work against the same tables from different engines and tools. Whether an agent is orchestrating a batch transformation or serving a real-time query, open table formats keep the underlying data layer versioned, reliable, and interoperable.
Components & Frameworks (4)
Open table format for huge analytic datasets with agent-friendly branching and time travel.
Open storage framework that brings reliability and performance to data lakes.
Streaming data lakehouse platform with incremental processing and record-level updates.
Streaming data lake platform supporting high-speed ingestion and changelog tracking.