Fig. 2
From: Data-aware optimization of bioinformatics workflows in hybrid clouds
Boxplots of the execution times across all 13 participating sites for the phylogenetic profiling workflow, executed with the internal scheduler subset size n set to 0.0010, 0.0025, 0.0050, 0.0100, 0.0250, 0.0500 and 0.2500 % of the total number of protein sequences, for three distinct datasets comprising 189,378, 264,088 and 368,949 protein sequences. The value of n that gives the fastest execution times is 0.01 % of the input dataset.