FORMATS '26 — Sunday, May 31, 2026.

1st International Workshop on Data FORMATS for Modern Architectures and Workloads

Where Research Meets Practice.

Sponsored by and held in conjunction with ACM SIGMOD/PODS 2026 in Bengaluru, India.

Workshop Program

Room: Scarlet 2

This program is tentative. Some talks may be omitted if speakers are unable to travel because of visa-related uncertainty, and timings may be updated closer to the workshop date.

9:10 – 9:15

Welcome and opening remarks

Session 1: Storage Layout, Compaction, and Pruning

9:45 – 10:00

Commutative Compaction

Chris Douglas (UC Berkeley), Joseph Hellerstein (UC Berkeley)

10:00 – 10:15

Amethyst: Adaptive Compaction for LSM Trees via Segment-Level Policy Selection

Suchitra Shankar (PES University), Nilin Rose (Jain University)

10:15 – 10:30

What No One Tells You About Page Pruning in Parquet: The Real Cost of Page Index Parsing

Faeze Faghih (Technical University of Darmstadt), Si Jun Kwon (Technical University of Darmstadt), Zsolt István (Technical University of Darmstadt)

10:30 – 11:00

Coffee Break

Session 2: Lakehouse Architecture and Cross-Format Interoperability

12:00 – 12:15

The Lance Lakehouse Format

Ayush Chaurasia (LanceDB), Jack Ye (LanceDB), Lu Qiu (LanceDB), Lei Xu (LanceDB), Weston Pace (LanceDB)

12:15 – 12:30

Polyglot: An LLM-Driven Semantic Control Plane for Cross-Format Type Interoperability Across Lakehouse Formats

Aastha Agrrawal (LinkedIn), Sumedh Sakdeo (LinkedIn), Afzal Afzal (LinkedIn), Ruolin Fan (LinkedIn), Kunal Narula (LinkedIn), Lenisha Gandhi (LinkedIn)

12:30 – 1:30

Lunch

Session 3: Metadata, Consistency, and Maintenance in Production Lakehouses

2:00 – 2:15

Zero-Scan Data Quality: Leveraging Table Format Metadata for Continuous Observability at Scale

Mohit Verma (LinkedIn), Shantanu Rawat (LinkedIn), Christian Bush (LinkedIn), Sumedh Sakdeo (LinkedIn), Lokesh Amarnath Ravindranathan (LinkedIn), Dwarak Bakshi (LinkedIn)

2:15 – 2:30

How Consistent and Fresh are Lake Tables Really

Zinuo Li (Renmin University of China), Dongyang Geng (Renmin University of China), Haoyue Li (Renmin University of China), Hailong Yu (eDaijia Automobile Technology), Qi Lei (Renmin University of China), Haoqiong Bian (Renmin University of China)

2:30 – 2:45

Metadata-as-Data in Apache Hudi: A Multi-Modal Index Substrate for Lakehouse Tables

Sagar Sumit (Anyscale), Prashant Wason (Uber), Sivabalan Narayanan (Onehouse)

2:45 – 3:00

Table Format Optimizations in Managed Spark for Dataproc

Isha Tarte (Google), Jayadeep Jayaraman (Google), Abhishek Modi (Google), Vishal Karve (Google), Jingwei Lu (Google), Sourabh Badhya (Google), Rajarshi Sarkar (Google), Aditya Shah (Google), Haymant Mangla (Google), Zihan Cao (Google), Huadong Liu (Google), Warren Zhu (Google), Wei Yan (OpenAI)

3:00 – 3:30

Coffee Break

Session 4: Emerging Directions and the Road Ahead

4:55 – 5:00

Closing remarks