Table 3 Definition of the number of buckets (for bucketing only)

From: Evaluating partitioning and bucketing strategies for Hive-based Big Data Warehousing systems

| Data model | SF | Table size (MB) | HDFS block (128 MB) | At least 1 GB |
| --- | --- | --- | --- | --- |
| SS | 30 | 5,088 | \( \frac{5088\ \text{MB}}{128\ \text{MB}} \cong 40\ \text{buckets} \) | \( \frac{5088\ \text{MB}}{1024\ \text{MB}} \cong 5\ \text{buckets} \) |
| SS | 100 | 16,533 | \( \frac{16533\ \text{MB}}{128\ \text{MB}} \cong 129\ \text{buckets} \) | \( \frac{16533\ \text{MB}}{1024\ \text{MB}} \cong 16\ \text{buckets} \) |
| SS | 300 | 49,700 | \( \frac{49700\ \text{MB}}{128\ \text{MB}} \cong 388\ \text{buckets} \) | \( \frac{49700\ \text{MB}}{1024\ \text{MB}} \cong 49\ \text{buckets} \) |
| DT | 30 | 14,650 | \( \frac{14650\ \text{MB}}{128\ \text{MB}} \cong 114\ \text{buckets} \) | \( \frac{14650\ \text{MB}}{1024\ \text{MB}} \cong 14\ \text{buckets} \) |
| DT | 100 | 46,800 | \( \frac{46800\ \text{MB}}{128\ \text{MB}} \cong 366\ \text{buckets} \) | \( \frac{46800\ \text{MB}}{1024\ \text{MB}} \cong 46\ \text{buckets} \) |
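
The arithmetic in the table can be reproduced with a short script. This is a minimal sketch, not the authors' tooling: the function name `bucket_count`, the constants, and the dictionary layout are hypothetical, and it assumes the bucket count is the table size divided by the target bucket size, rounded to the nearest integer, which is consistent with every value in the table.

```python
# Sketch only: names below are illustrative and not taken from the paper.
HDFS_BLOCK_MB = 128   # target bucket size equal to one HDFS block
ONE_GB_MB = 1024      # target bucket size of at least 1 GB

# Table sizes in MB as listed in the table, keyed by (data model, SF).
TABLE_SIZES_MB = {
    ("SS", 30): 5088,
    ("SS", 100): 16533,
    ("SS", 300): 49700,
    ("DT", 30): 14650,
    ("DT", 100): 46800,
}


def bucket_count(table_size_mb: float, target_bucket_mb: float) -> int:
    """Return table size / target bucket size, rounded to the nearest
    integer (assumed rounding rule), with a minimum of one bucket."""
    if table_size_mb <= 0 or target_bucket_mb <= 0:
        raise ValueError("sizes must be positive")
    return max(1, round(table_size_mb / target_bucket_mb))


if __name__ == "__main__":
    for (model, sf), size_mb in TABLE_SIZES_MB.items():
        per_block = bucket_count(size_mb, HDFS_BLOCK_MB)
        per_gb = bucket_count(size_mb, ONE_GB_MB)
        print(f"{model} SF={sf}: {per_block} buckets (128 MB), {per_gb} buckets (1 GB)")
```

The resulting count is the value that would be supplied as the number of buckets when a bucketed Hive table is declared.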