From: Evaluating partitioning and bucketing strategies for Hive-based Big Data Warehousing systems
Data model | Attributes | SF | Time (s), Hive | Time (s), Presto | Increase along SF, Hive | Increase along SF, Presto
---|---|---|---|---|---|---
SS | None | 30 | 420 | 92 | - | -
SS | None | 100 | 982 | 262 | 2.34 | 2.85
SS | None | 300 | 4874 | 733 | 4.96 | 2.80
SS-P | Od_Year + S_Region | 30 | 375 | 63 | - | -
SS-P | Od_Year + S_Region | 100 | 760 | 149 | 2.03 | 2.37
SS-P | Od_Year + S_Region | 300 | 2849 | 399 | 3.75 | 2.68
SS-B | Orderdate + Custkey + Suppkey + Partkey | 30 | 420 | 121 | - | -
SS-B | Orderdate + Custkey + Suppkey + Partkey | 100 | 1047 | 305 | 2.49 | 2.52
SS-B | Orderdate + Custkey + Suppkey + Partkey | 300 | 5712 | 876 | 5.46 | 2.87
SS-B | Suppkey | 30 | 404 | 120 | - | -
SS-B | Suppkey | 100 | 676 | 321 | 1.67 | 2.68
SS-B | Suppkey | 300 | 1803 | 768 | 2.67 | 2.39
SS-PB | Od_Year + Orderkey | 30 | 378 | 100 | - | -
SS-PB | Od_Year + Orderkey | 100 | 865 | 256 | 2.29 | 2.56
SS-PB | Od_Year + Orderkey | 300 | 5166 | 835 | 5.97 | 3.26
SS-PB | Od_Year + S_Region + Suppkey | 30 | 362 | 81 | - | -
SS-PB | Od_Year + S_Region + Suppkey | 100 | 765 | 220 | 2.11 | 2.72
SS-PB | Od_Year + S_Region + Suppkey | 300 | 933 | 650 | 1.22 | 2.95
SS-PB | S_Region + Suppkey | 30 | 349 | 77 | - | -
SS-PB | S_Region + Suppkey | 100 | 908 | 285 | 2.60 | 3.70
SS-PB | S_Region + Suppkey | 300 | 982 | 452 | 1.08 | 1.59
DT | None | 30 | 349 | 63 | - | -
DT | None | 100 | 516 | 155 | 1.48 | 2.46
DT | None | 300 | 1090 | 472 | 2.11 | 3.05
DT-P | Od_Year + S_Region | 30 | 292 | 43 | - | -
DT-P | Od_Year + S_Region | 100 | 346 | 71 | 1.18 | 1.65
DT-P | Od_Year + S_Region | 300 | 602 | 299 | 1.74 | 4.21
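
The "Increase along SF" values appear to be the ratio of each execution time to the time at the preceding scale factor (for example, 982 / 420 ≈ 2.34 for Hive on SS from SF 30 to SF 100). The table does not state this definition, so it is an inference from the reported numbers. The following minimal sketch reproduces the column under that assumption; the function name `increase_along_sf` is hypothetical and not from the source.

```python
def increase_along_sf(times):
    """Ratio of each time to the time at the preceding scale factor.

    Assumes `times` is ordered by increasing SF (here: 30, 100, 300)
    and that the table rounds ratios to two decimals.
    """
    return [round(curr / prev, 2) for prev, curr in zip(times, times[1:])]


# SS data model without partitioning or bucketing: times in seconds
ss_hive = [420, 982, 4874]
ss_presto = [92, 262, 733]

print(increase_along_sf(ss_hive))    # [2.34, 4.96]
print(increase_along_sf(ss_presto))  # [2.85, 2.8]
```

A ratio well below the growth in SF (roughly 3.3x from 30 to 100 and 3x from 100 to 300) suggests that the time grows more slowly than the data volume for that configuration.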