Table 10 SSB execution times (in seconds): partitioning by “S_Region” and bucketing by “Suppkey”.

From: Evaluating partitioning and bucketing strategies for Hive-based Big Data Warehousing systems

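| Query | Hive SF = 30 SS | Hive SF = 30 SS-PB | Hive SF = 100 SS | Hive SF = 100 SS-PB | Hive SF = 300 SS | Hive SF = 300 SS-PB | Presto SF = 30 SS | Presto SF = 30 SS-PB | Presto SF = 100 SS | Presto SF = 100 SS-PB | Presto SF = 300 SS | Presto SF = 300 SS-PB |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Q1.1 | 25 | 22 | 31 | 32 | 44 | 48 | 5 | 6 | 13 | 24 | 36 | 36 |
| Q1.2 | 24 | 23 | 29 | 29 | 42 | 47 | 5 | 6 | 13 | 23 | 34 | 37 |
| Q1.3 | 24 | 23 | 29 | 30 | 43 | 46 | 4 | 6 | 13 | 22 | 35 | 37 |
| Q2.1 | 32 | 25 | 47 | 39 | 531 | 44 | 8 | 4 | 19 | 10 | 59 | 19 |
| Q2.2 | 31 | 21 | 46 | 39 | 531 | 143 | 7 | 4 | 18 | 9 | 51 | 14 |
| Q2.3 | 30 | 21 | 44 | 38 | 531 | 42 | 7 | 4 | 17 | 9 | 49 | 13 |
| Q3.1 | 35 | 23 | 59 | 37 | 651 | 67 | 8 | 4 | 29 | 15 | 81 | 28 |
| Q3.2 | 30 | 29 | 45 | 46 | 677 | 96 | 6 | 7 | 17 | 33 | 51 | 52 |
| Q3.3 | 33 | 33 | 219 | 219 | 665 | 77 | 5 | 7 | 15 | 29 | 43 | 46 |
| Q3.4 | 34 | 32 | 222 | 220 | 675 | 75 | 6 | 7 | 15 | 29 | 43 | 45 |
| Q4.1 | 38 | 30 | 86 | 61 | 226 | 118 | 13 | 7 | 43 | 22 | 119 | 36 |
| Q4.2 | 49 | 34 | 70 | 58 | 141 | 67 | 9 | 5 | 26 | 17 | 69 | 26 |
| Q4.3 | 34 | 34 | 54 | 60 | 116 | 110 | 8 | 11 | 23 | 44 | 63 | 63 |
| Total | 420 | 349 | 982 | 908 | 4874 | 982 | 92 | 77 | 262 | 285 | 733 | 452 |
| Diff | | −17% | | −8% | | −80% | | −16% | | 9% | | −38% |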
  1. Italic values indicate the fastest processing time by query, workload, tool and data model
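
The Diff row appears to be the relative change of the SS-PB total against the SS total for each tool and scale factor, rounded to a whole percent. The table does not state this formula, so it is an assumption here, but it reproduces all six printed values. A minimal Python sketch of the arithmetic (the names `totals` and `percent_diff` are illustrative only):

```python
# Totals copied from the "Total" row of Table 10, in seconds.
# Keys: (tool, scale factor); values: (SS total, SS-PB total).
totals = {
    ("Hive", 30): (420, 349),
    ("Hive", 100): (982, 908),
    ("Hive", 300): (4874, 982),
    ("Presto", 30): (92, 77),
    ("Presto", 100): (262, 285),
    ("Presto", 300): (733, 452),
}


def percent_diff(ss: float, ss_pb: float) -> int:
    """Relative change of SS-PB versus SS, in whole percent.

    Assumed formula: (SS-PB - SS) / SS * 100. Negative means SS-PB is faster.
    """
    return round((ss_pb - ss) / ss * 100)


diffs = {key: percent_diff(ss, ss_pb) for key, (ss, ss_pb) in totals.items()}

for (tool, sf), diff in diffs.items():
    print(f"{tool} SF = {sf}: {diff:+d}%")
```

A negative value means the partitioned and bucketed model (SS-PB) finished the workload faster than the plain star schema (SS); the only positive entry is Presto at SF = 100.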