See also: fastq_mergepairs command, FASTQ files, Quality scores

The process of merging paired reads is sometimes called overlapping or assembly of read pairs. The goal of merging is to convert a pair into a single read containing one sequence and one set of quality scores. A pair must overlap over a significant fraction of its length. Merging generates a single FASTQ file from FASTQ
files for paired forward reads and reverse reads. A pair is merged by aligning
the forward read sequence to the reverse-complement of the reverse read
sequence. In the overlap region, where both reads cover the same bases, a single
letter and Q score are derived from the aligned pair of letters and Q scores at
each position. If the forward and reverse reads agree on the base call, this
increases the confidence in the predicted base, increasing the Q score.
Conversely, if the reads disagree, this reduces confidence in the base call and
decreases the Q score. The adjusted Q scores for matches and mismatches are
calculated using Bayesian statistics. The merged Q often exceeds the maximum
allowed by the file format, in which case the maximum is used. The maximum Q
score for the result of a merge is set by the -fastq_qmaxout option (see
FASTQ options for details).
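
The adjustment described above can be illustrated with a minimal Python sketch. It is not taken from the program's source code. It assumes a uniform prior over the four bases, independent errors in the two reads, and an erroneous call that is equally likely to be any of the three other bases. The function names and the default cap of 41 are hypothetical and chosen for illustration only; the actual cap is whatever -fastq_qmaxout is set to.

```python
import math

def q_to_p(q):
    """Convert a Phred Q score to an error probability."""
    return 10 ** (-q / 10)

def p_to_q(p):
    """Convert an error probability to a Phred Q score."""
    return -10 * math.log10(p)

def merge_base(base_f, q_f, base_r, q_r, qmaxout=41):
    """Merge one aligned position of the overlap (illustrative only).

    base_r is assumed to be taken from the reverse-complemented
    reverse read, so both letters refer to the same strand.
    Returns (merged_base, merged_q), with merged_q capped at qmaxout.
    """
    p_f, p_r = q_to_p(q_f), q_to_p(q_r)
    if base_f == base_r:
        # Agreement: posterior probability that the shared call is wrong.
        base = base_f
        p = (p_f * p_r / 3) / (1 - p_f - p_r + 4 * p_f * p_r / 3)
    else:
        # Disagreement: keep the call with the higher Q (ties go to forward).
        if q_f >= q_r:
            base, p_x, p_y = base_f, p_f, p_r
        else:
            base, p_x, p_y = base_r, p_r, p_f
        # Posterior probability that the kept call is wrong.
        p = p_x * (1 - p_y / 3) / (p_x + p_y - 4 * p_x * p_y / 3)
    q = round(p_to_q(p))
    # Clamp to the allowed output range.
    return base, max(0, min(qmaxout, q))
```

Under these assumptions, two agreeing calls at Q30 give a merged Q well above 41, so the cap applies, while a Q30 call that disagrees with a Q10 call keeps the Q30 letter but with a Q score below 30.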