Data Lake accelerator on Hadoop-COS in Tencent Cloud

Li Cheng

English Session 2021-08-08 14:10 GMT+8  (ROOM : A) #bigdata

A new game-changer from Tencent Cloud: the Data Lake accelerator GooseFS on Hadoop-COS.

Ever since COS, the Tencent Cloud Object Storage solution, submitted the Hadoop-compatible file system plugin Hadoop-COS to the Hadoop community in 2019, Tencent Cloud has gone full speed to support Data Lake on Tencent COS. In 2021, Tencent COS equipped Hadoop-COS with the multi-layer accelerator GooseFS, which not only greatly enhances COS performance in the Hadoop ecosystem, but also integrates Tencent COS more tightly with big data and AI platforms. From this talk, the audience will learn about:

  1. How GooseFS helps Hadoop-COS integrate with EMR and K8s on Tencent Cloud platforms.
  2. Transparent acceleration of IO performance brought by the new Hadoop-COS.
  3. How GooseFS loads namespace-level and table-level caches.
  4. Data Lake solutions from Tencent Cloud Storage enabled by the new Hadoop-COS.

GooseFS in the new Hadoop-COS is a distributed cache solution that can be deployed natively with EMR and K8s. GooseFS supports memory, SSD, and HDD as cache media, along with multiple cache policies. The GooseFS catalog also supports loading table-level caches, such as Hive tables, by table ID and table partition. Furthermore, GooseFS on Hadoop-COS will support Iceberg tables and loading by version. Tencent Cloud is building many more exciting features to expand the Data Lake realm.


Li Cheng: Senior Engineer who owns Big Data Storage at Tencent Cloud COS. Formerly worked on the AWS S3 and Huawei Storage teams. Also active in the open source community; currently an Apache Ozone PMC member and Hadoop committer.