Table 1 Query datasets description

From: StreamAligner: a streaming based sequence aligner on Apache Spark

| Query genome | Number of reads | Bp per read | Size (MB) |
| --- | --- | --- | --- |
| 100k.fa | 100,000 | 36 | 4.18 |
| BRL.fastq | 3,958,076 | 100 | 1,100 |
| AML.fastq | 304,745 | 150 | 112 |
| ANL.fastq | 2,986,312 | 400 | 2,600 |
| NA12750/ERR000589 | \(12 \times 10^{6}\) | 51 | 3,400 |
| HG00096/SRR015390 | \(15.9 \times 10^{6}\) | 51 | 5,100 |
| HG00096/SRR062634 | \(24.1 \times 10^{6}\) | 100 | 11,800 |
| 150140/SRR642648 | \(98.8 \times 10^{6}\) | 100 | 48,300 |