Innovative lakehouse design atop Apache Iceberg, Apache Arrow and Apache Parquet

Gang Wu (吴刚), Xuwei Fu (付旭炜)

Chinese Session 2023-08-19 15:00 GMT+8  #datalake

Yunqi Tech Inc. is an emerging China-based startup that offers an enterprise multi-cloud data platform, ensuring ecosystem compatibility and data democracy with a seamless user experience. At its core is a cloud lakehouse: a converged data architecture with a single engine that supports both general-purpose and analytical workloads, across batch and streaming data analysis.

This talk will unveil the innovative architectural design of our lakehouse at Yunqi. In addition, we will show how we achieve ecosystem compatibility and blazing-fast performance by leveraging and improving Apache Iceberg (a high-performance format for huge analytic tables), Apache Arrow (a language-independent columnar memory format) and Apache Parquet (a column-oriented data file format).

Speakers:


Gang Wu, Software Engineer at Yunqi Tech Inc., is working on the Yunqi lakehouse. He is a PMC member of Apache ORC and a committer of Apache Arrow and Apache Parquet. Prior to Yunqi, he was a Staff Software Engineer at Alibaba working on the storage system of MaxCompute, and a Software Engineer at Uber working on Apache Spark.


Xuwei Fu, Software Engineer at Yunqi Tech Inc., is working on the storage system of the Yunqi lakehouse.