Applying Apache SeaTunnel, a next-generation ultra-high-performance big data integration tool, in data lake scenarios

Dai Lidong (代立冬)

Chinese Session 2023-08-19 14:30 GMT+8 #datalake

Today there are hundreds of data sources: not only relational and non-relational databases, but also SaaS, log, and interface data. Offline batch synchronization alone can no longer meet business needs, and more and more businesses require real-time synchronization. The question is how to synchronize all of these data sources quickly and efficiently, both offline and in real time. Achieving data consistency and thorough monitoring while consuming the fewest possible resources is a great challenge for data integration.

Solution: The Apache SeaTunnel data synchronization pipeline, combined with Zeta, SeaTunnel’s own dedicated synchronization engine, addresses this difficult integration problem. Data synchronization can be completed with as few resources as possible, giving better performance for large-scale data integration.

Audience takeaways

Function and architecture design of Apache SeaTunnel
Why SeaTunnel developed its own sync engine, Zeta, instead of relying on Spark or Flink
User use cases and the upcoming roadmap

Speakers:

Dai Lidong: Co-founder of Beluga Open Source, Apache SeaTunnel PMC member, Apache DolphinScheduler PMC member, Apache Incubator Mentor