Big Data


Track Chairs : Lidong Dai, Gang Li

Big Data is transforming industries and has become inseparable from everyday life. It is also a major part of the Apache Software Foundation (ASF), which hosts many big data projects, including Hadoop, Hive, Spark, HBase, Kylin, Ozone, CarbonData, Doris, and Cassandra. In this track, you will learn about cutting-edge trends in these technologies, along with practical experience, design principles, and architecture analysis from front-line users.

2022-07-29

ROOM : A

13:30 GMT+8 New features for Apache Doris 1.x version and future plans for the cloud native era Chinese Session 杨政国

14:10 GMT+8 An extension of Apache Atlas’ data model and an alternative open source user interface English Session Wombacher, Andreas

14:50 GMT+8 Apache Druid cloud native architecture evolution Chinese Session 金嘉怡

15:30 GMT+8 Scaling Open Source Big Data Cloud Applications is Easy/Hard English Session Paul Brebner

16:10 GMT+8 Fine grained authorization to Cloud stores using Apache Ranger English Session Mukund

16:50 GMT+8 Big data infrastructure system evolution Chinese Session 张伟伟

2022-07-30

ROOM : A

13:30 GMT+8 Capturing per thread statistics for a job - Thread-level IOStatistics - HADOOP-17461 English Session Mehakmeet Singh

14:10 GMT+8 Sharing of recent progress and practices in Apache Ozone Chinese Session Yan Liu,Yi Chen

14:50 GMT+8 ByteDance data lake table optimization and management service based on Apache Hudi Chinese Session 喻兆靖

15:30 GMT+8 How can I use Apache SeaTunnel to simplify data synchronization Chinese Session 陶克路

16:10 GMT+8 Hadoop Vectored IO: your data just got faster! English Session Mukund Thakur

16:50 GMT+8 Apache Hive 4.0 Unreleased Features Chinese Session Yan Liu

2022-07-31

ROOM : A

13:30 GMT+8 Disaster Recovery in Apache Ozone English Session Sadanand Shenoy,Rakesh Radhakrishnan

14:10 GMT+8 Evolution of Apache Kylin technology -- second half of MOLAP technology Chinese Session 俞霄翔

14:50 GMT+8 Flink Table Store: Streaming data warehouse architecture and scenario Chinese Session 李劲松

15:30 GMT+8 DolphinScheduler + Notebook: an open-source big data studio Chinese Session 高楚枫

16:10 GMT+8 Huya application metadata platform practice based on graph data Chinese Session 邹磊

16:50 GMT+8 Tales at Scale: Analytics at 1000 QPS and Beyond English Session Gian Merlino

17:30 GMT+8 Optimization implementation and future planning of real-time writing to Apache Doris via Flink Chinese Session 杨勇强

2022-07-29

ROOM : B

13:30 GMT+8 Building a real-time analytics dashboard with Apache Kafka, Apache Pinot, and Streamlit English Session Dunith Dhanushka, Karin Wolok

14:10 GMT+8 Apache Ozone: Multi-Protocol aware system handles both Files and Objects efficiently English Session Rakesh Radhakrishnan, Mukul Kumar Singh

14:50 GMT+8 eBay's practice of building a unified and serverless Spark gateway based on Apache Kyuubi (Incubating) Chinese Session 王斐

15:30 GMT+8 Optimization and practice of Apache InLong in Tencent Cloud Chinese Session Yunqing Mo

16:10 GMT+8 Flink/Spark cloud native practices based on Zeppelin Chinese Session 陶克路,王正

16:50 GMT+8 An offline data discovery method based on data lineage Chinese Session 韩帅,孙科

2022-07-30

ROOM : B

13:30 GMT+8 Apache Ozone behind Simulation and AI industries English Session Kota Uenishi

14:10 GMT+8 What's new in Apache Impala 4.x Chinese Session Quanlong Huang(黄权隆)

14:50 GMT+8 HBase improvements and practices in Meituan Chinese Session 哈晓琳

15:30 GMT+8 Support for customized Kubernetes schedulers: providing customized scheduling capabilities for Spark on Kubernetes Chinese Session 姜逸坤,王雷博

16:10 GMT+8 How the Cloud Shuffle Service is used in Spark scenarios at ByteDance Chinese Session 魏中佳

16:50 GMT+8 Interactive data engineering workload execution using Livy session on Kubernetes cluster English Session Anmol Chaturvedi, Haripriya Bendapudi, Praneet Sharma

2022-07-31

ROOM : B

13:30 GMT+8 BIGTOP 3.0 with the upgraded Mpack: New era of BigData Distribution English Session Yuqi Gu(顾煜祺)

14:10 GMT+8 Large-scale migration to Parquet at Uber English Session Huicheng Song

14:50 GMT+8 Spark in practice at Xiaomi Chinese Session 王准

15:30 GMT+8 How does Linkis provide computing governance capabilities for diversified big data computing and storage engines Chinese Session 邸帅

16:10 GMT+8 Multi-engine virtual column technology based on Apache Calcite Chinese Session 谢佳君

16:50 GMT+8 Practice and reflections on the big data Python ecosystem in transmitting wisdom education Chinese Session 张敬存,赵晨杰