Partition Management in Hadoop
Guest blog post written by Adir Mashiach
In this post I’ll talk about the problem of Hive tables with a lot of small partitions and files, and describe my solution in detail.
A little background
In my organization, we keep a lot of our data in HDFS. Most of it is raw data, but a significant amount is the final product of many data enrichment processes.
The post Partition Management in Hadoop appeared first on Cloudera Engineering Blog.
https://blog.cloudera.com/blog/2019/05/partition-management-in-hadoop/