
Table 3 Definition of the number of buckets (for bucketing only)

From: Evaluating partitioning and bucketing strategies for Hive-based Big Data Warehousing systems

| Data model | SF | Table size (MB) | HDFS block (128 MB) | At least 1 GB |
| --- | --- | --- | --- | --- |
| SS | 30 | 5088 | \( \frac{5088\ \text{MB}}{128\ \text{MB}} \cong 40\ \text{buckets} \) | \( \frac{5088\ \text{MB}}{1024\ \text{MB}} \cong 5\ \text{buckets} \) |
| SS | 100 | 16,533 | \( \frac{16533\ \text{MB}}{128\ \text{MB}} \cong 129\ \text{buckets} \) | \( \frac{16533\ \text{MB}}{1024\ \text{MB}} \cong 16\ \text{buckets} \) |
| SS | 300 | 49,700 | \( \frac{49700\ \text{MB}}{128\ \text{MB}} \cong 388\ \text{buckets} \) | \( \frac{49700\ \text{MB}}{1024\ \text{MB}} \cong 49\ \text{buckets} \) |
| DT | 30 | 14,650 | \( \frac{14650\ \text{MB}}{128\ \text{MB}} \cong 114\ \text{buckets} \) | \( \frac{14650\ \text{MB}}{1024\ \text{MB}} \cong 14\ \text{buckets} \) |
| DT | 100 | 46,800 | \( \frac{46800\ \text{MB}}{128\ \text{MB}} \cong 366\ \text{buckets} \) | \( \frac{46800\ \text{MB}}{1024\ \text{MB}} \cong 46\ \text{buckets} \) |
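
The bucket counts in this table can be reproduced with a short calculation. The sketch below is illustrative only: the function name `bucket_count` and the dictionary `TABLE_SIZES_MB` are hypothetical, and the round-to-nearest rule is an assumption inferred from the listed values (for example, 49,700 / 1024 ≈ 48.5 is reported as 49), not a rule stated in the source.

```python
import math


def bucket_count(table_size_mb: float, target_bucket_mb: float) -> int:
    """Return the table size divided by the target bucket size,
    rounded to the nearest integer (assumed rule), with a minimum of 1."""
    return max(1, math.floor(table_size_mb / target_bucket_mb + 0.5))


# Table sizes in MB, keyed by (data model, SF), copied from the table.
TABLE_SIZES_MB = {
    ("SS", 30): 5088,
    ("SS", 100): 16533,
    ("SS", 300): 49700,
    ("DT", 30): 14650,
    ("DT", 100): 46800,
}

HDFS_BLOCK_MB = 128   # one bucket per HDFS block
ONE_GB_MB = 1024      # buckets of roughly 1 GB

for (model, sf), size_mb in TABLE_SIZES_MB.items():
    per_block = bucket_count(size_mb, HDFS_BLOCK_MB)
    per_gb = bucket_count(size_mb, ONE_GB_MB)
    print(f"{model} SF={sf}: {per_block} buckets (128 MB), {per_gb} buckets (1 GB)")
```

In Hive, a count computed this way would typically be supplied as the `INTO n BUCKETS` value of a `CLUSTERED BY` clause when the table is created.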