Table 17 Best results by partitioning and bucketing configuration and by SF

From: Evaluating partitioning and bucketing strategies for Hive-based Big Data Warehousing systems

Columns: SF, Data model, Without data organization strategies, Partitioning (P) and bucketing (B)
Partitioning and bucketing configurations:
P = Od_Year; B = Orderkey
P = S_Region; B = Suppkey
P = Od_Year; B = P_Brand
P = Od_Year, S_Region; B = Suppkey
30 SS 92 s 100 s (8%) 77 s (−16%) 81 s (−12%)
DT 63 s 46 s (−26%) 47 s (−24%) 33 s (−47%)
100 SS 262 s 256 s (−2%) 285 s (9%) 220 s (−16%)
DT 155 s 129 s (−17%) 119 s (−23%) 90 s (−42%)
300 SS 733 s 835 s (14%) 452 s (−38%) 650 s (−11%)
DT 472 s
  1. Italic values indicate the fastest processing time by SF, data model and configuration
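
The percentages in the table appear to be the change in processing time relative to the run without data organization strategies; that reading is an assumption inferred from the numbers, not something the table states. A minimal sketch of the arithmetic, using the SF = 300 SS row (the function name `relative_change` is hypothetical):

```python
def relative_change(baseline_s, time_s):
    """Percentage change of time_s relative to baseline_s, rounded to the nearest integer.

    Negative values mean the configuration was faster than the baseline.
    """
    return round((time_s - baseline_s) / baseline_s * 100)


# SF = 300, SS data model: baseline time and the three configuration times (seconds),
# copied from the table in the order they appear.
baseline = 733
config_times = [835, 452, 650]

changes = [relative_change(baseline, t) for t in config_times]
print(changes)  # [14, -38, -11]
```

Not every row reproduces exactly from the rounded seconds shown. For SF = 30 DT, 46 s against 63 s gives about −27%, while the table reports −26%, presumably because the published percentages were computed from unrounded times.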