Small Files, Big Foils: Addressing the Associated Metadata and Application Challenges
Small files are a common challenge in the Apache Hadoop world, and when not handled with care they can lead to a number of complications. The Hadoop Distributed File System (HDFS) was designed to store and process large data sets, ranging from terabytes to petabytes. HDFS stores small files inefficiently, however: they waste NameNode memory, inflate RPC call volume, degrade block-scanning throughput, and reduce application-layer performance. In this blog post, we look at these metadata and application challenges and how to address them.
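Before reaching for a fix, it helps to measure how widespread the problem is. The sketch below, which is not from the original post, uses the standard Hadoop FileSystem API to recursively count files smaller than the filesystem's default block size under a given path; the roughly 150 bytes of NameNode heap per metadata object cited in the comment is the commonly quoted estimate.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class SmallFileCounter {
    public static void main(String[] args) throws Exception {
        Path root = new Path(args.length > 0 ? args[0] : "/");
        // Picks up fs.defaultFS from the core-site.xml on the classpath.
        FileSystem fs = FileSystem.get(new Configuration());
        long blockSize = fs.getDefaultBlockSize(root);
        long small = 0, total = 0;
        // Recursively walk the tree and flag files under one block in size.
        RemoteIterator<LocatedFileStatus> it = fs.listFiles(root, true);
        while (it.hasNext()) {
            LocatedFileStatus status = it.next();
            total++;
            if (status.getLen() < blockSize) {
                small++;
            }
        }
        // Each file, directory, and block object consumes roughly 150 bytes
        // of NameNode heap, so millions of small files add up quickly.
        System.out.printf("%d of %d files are smaller than one block (%d bytes)%n",
                small, total, blockSize);
    }
}
```

Run against a suspect directory (for example, a landing zone fed by streaming ingests); a high ratio of sub-block files is a signal that compaction, container formats such as SequenceFile or HAR, or coarser ingest batching is worth considering.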
https://blog.cloudera.com/blog/2019/05/small-files-big-foils-addressing-the-associated-metadata-and-application-challenges/