An extension of Apache Atlas’ data model and an alternative open source user interface

Wombacher, Andreas

English Session 2022-07-29 14:10 GMT+8  (ROOM : A) #bigdata

Apache Atlas provides data governance functionality and is part of the Hadoop ecosystem, although it is not limited to it. The underlying data model is very generic and can be extended. This makes Apache Atlas very flexible, but it has consequences for the usability of the user interface. This talk presents an extension of the data model and an alternative open source user interface that can be used more intuitively by non-technical users, especially business users. In more detail, the presentation first motivates and explains the underlying extension of the data model. It then discusses which derived information business users need to increase usability. Next, the open source backend functionality is explained and the choice of underlying technologies is motivated. Finally, a short tutorial shows how to set up your own system with a related open source Helm chart and how to get started. The motivation for this talk is to promote this open source project and to find support and interest in the community.
Some related numbers:

  • we are adding 6 additional base types
  • we have a further extension of these 6 types for Elastic, Kafka and Kubernetes
  • the frontend is in Angular
  • the backend uses Apache Atlas, HBase, Apache Kafka, Apache Flink, Keycloak, Apache httpd, Elasticsearch and Elastic Enterprise Search

Speakers:


Wombacher, Andreas: Aurelius Enterprise B.V., CTO. Andreas has extensive expertise in workflow and data management, ranging from data integration, sensor data fusion, data mining, and data analysis to understanding the dependencies of data created in a workflow environment. This includes compliance testing, data provenance, process mining and process variance mining. Andreas has worked with data on different scales and abstraction levels, from time series sensor data to information system or human event data. As a consequence, Andreas has experience in techniques ranging from in-memory data analysis to Hadoop distributed data processing. Owing to this hands-on experience in various environments, Andreas has served as a data architect in various customer engagements.