Apache Bookkeeper (as a Key Value Stire) and its use cases

Shivji Kumar Jha

English Session 2021-08-06 13:30 GMT+8 #messaging

In order to leverage the best performance characters of your stream backend, it is important to understand the nitty gritty details of how your stream server stores your data. Understanding this empowers you to design your use case solutioning so as to make the best use of resources at hand as well as get the optimum amount of consistency, availability, latency and throughput for a given amount of resources at hand.

With this underlying philosophy, in this talk, we will get to the bottom of storage tier of pulsar (apache bookkeeper), the barebones of the bookkeeper storage semantics, how it is used in different use cases ( even other than pulsar), understand the object models of storage in pulsar, different kinds of data structures and algorithms pulsar uses therein and how that maps to the semantics of the storage class shipped with pulsar by default. Oh yes, you can change the storage backend too with some additional code!

This session will empower you with the right background to map your data right with pulsar. The focus will be more on storage backend so as to not keep this tailored to pulsar specifically but to be able to apply it different data stores or streams.

Speakers:

Shivji Kumar Jha: Shiv is a senior software developer at Nutanix and works for the beam team helping Nutanix customers minimise cloud costs and security risks for hybrid cloud usage. Shiv works on, among other things, all things pulsar at nutanix managing 4 clusters ((30 nodes)) of pulsar and the use cases around it. Shiv loves spending time on data stores (databases, streams, analytics etc) and has contributed to MySQL and pulsar codebases. Shiv is an avid reader (tech, fiction, economics etc) and is always looking at ways to simplify software architectures.