New Apache Bigtop 1.5 and Wikimedia: Empower BigData in the real world

Yuqi Gu, Luca Toscano

English Session 2021-08-08 16:50 GMT+8  (ROOM : B) #bigdata

Bigtop is the Apache project for infrastructure engineers and data scientists who need packaging, testing, and configuration of open-source Big Data components. Its latest version, 1.5.0, has been released as a Big Data software stack. As a real-world use case, Wikimedia’s Analytics/Data-Engineering team collaborated with the Bigtop project to migrate its Hadoop distribution from Cloudera CDH to Apache Bigtop. In this session, the speakers will give an overview of Bigtop’s new features and their implementation details, plus a brief introduction to the work done between Wikimedia and Bigtop to support the migration of Wikimedia’s data infrastructure.


Yuqi Gu: Yuqi Gu currently works for Arm and Linaro. He is a committer and PMC member of Apache Bigtop. He focuses mainly on performance optimization on Arm64.

Luca Toscano: Luca Toscano is a Site Reliability Engineer at the Wikimedia Foundation. He is interested in scalable systems design and reliability, process automation, and parallel, distributed, and low-level programming. He is passionate about software stacks that can manage and automate complex infrastructures.