Resilient Data: Exploring Replication and Recovery in Apache Ozone

Sadanand Shenoy

English Session 2023-08-18 16:15 GMT+8  #datastorage

Data resilience is crucial in modern distributed systems to ensure data availability and durability. Apache Ozone, a scalable, distributed object store capable of handling billions of objects, addresses the need for resilient data storage through its replication and recovery mechanisms. This talk delves into the concepts and techniques Apache Ozone employs to achieve high data resilience.

The first part of the talk explores data replication in Apache Ozone. It discusses how Ozone maintains strong consistency by keeping copies of blocks consistent across nodes, and briefly touches on how to reduce data redundancy using the erasure coding feature.

The second part, which is the crux of the talk, deals with data backup and recovery. It discusses how to use effective backup strategies such as cross-cluster replication and Ozone snapshots. This talk serves as a comprehensive guide to the resilience aspects of Apache Ozone, enabling practitioners to leverage its capabilities effectively and make informed decisions when designing data-intensive applications.

Speakers:

Sadanand Shenoy: Cloudera, Software Engineer II. Sadanand is a committer in the Apache Ozone project and has a keen interest in distributed systems. He currently works at Cloudera and has been actively contributing to the Apache Ozone project for the past 3 years. He holds a B.E. in Information Science and Engineering from MSRIT Bangalore.