The Apache Software Foundation (ASF) has graduated Apache Hudi from its incubator and promoted the data lake management tool to a top-level project.
With Apache Hudi, developers who manage analytical records on the Hadoop Distributed File System (HDFS) or in cloud stores gain a tool for managing data lakes in the petabyte range. Hudi supports companies in writing and reading large quantities of raw data for stream processing. To that end, it handles storage management on distributed file systems (DFS) such as HDFS or compatible cloud stores.
A look back at the beginnings
The abbreviation Hudi, pronounced "Hoodie", stands for Hadoop Upserts Deletes and Incrementals. Uber developed the project in 2016 and open-sourced the code in 2017. In January 2019, the open source project entered the incubation phase in the ASF's Apache Incubator. Meanwhile, companies such as the Alibaba Group, EMIS Health, Linknovate, Tathastu.ai, and Tencent rely on Apache Hudi; a list of further companies can be found on the Apache Hudi website. Amazon Web Services also supports it as part of Amazon Elastic MapReduce (EMR).
Further information about Apache Hudi can be found on the project's official website. The Apache Software Foundation has also published a post on the project's promotion to top level; interested readers can find it on the Apache blog.