fastq_strip_barcode_relabel.py
Usage
python fastq_strip_barcode_relabel.py reads primer barcodes label_prefix > outputfile
python fastq_strip_barcode_relabel2.py reads primer barcodes label_prefix > outputfile
Description
Strips the primer and barcode and creates a new label for the read
containing the barcode sequence (fastq_strip_barcode_relabel.py) or barcode
label (fastq_strip_barcode_relabel2.py).
Generally used for 454 reads.
Assumes the read layout is <barcode><primer><gene>.
If your reads start with a control sequence (typically TCAG), it can be included at the start of each barcode sequence.
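The layout above implies that stripping amounts to dropping a fixed-length prefix from both the sequence and its quality string. The following is a minimal sketch of that idea, not the script's actual code; the function name strip_barcode_primer and the example sequences are hypothetical.

```python
def strip_barcode_primer(seq, qual, barcode, primer_len):
    """Drop the <barcode><primer> prefix from a read and its quality string.

    Assumes the barcode and primer have already been matched at the start
    of the read, so only their combined length matters here.
    """
    offset = len(barcode) + primer_len
    return seq[offset:], qual[offset:]


# Hypothetical example: the control sequence TCAG is folded into the barcode.
barcode = "TCAG" + "ACGT"
primer = "GGATTAGA"
gene = "CCCCTTTT"
seq = barcode + primer + gene
qual = "I" * len(seq)
stripped_seq, stripped_qual = strip_barcode_primer(seq, qual, barcode, len(primer))
```

Trimming the quality string by the same offset keeps the FASTQ record consistent, since sequence and quality must have equal length.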
The reads argument is a FASTQ file containing the reads.
The primer argument is the primer sequence. Wildcards such as N are allowed in the primer sequence. Up to 2 primer mismatches are allowed.
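As an illustration of wildcard-aware primer matching, the sketch below counts mismatches between the start of a sequence and a primer, treating IUPAC codes in the primer as sets of allowed bases. It is an assumption about how such matching can be done, not the script's own implementation; the names IUPAC, primer_mismatches, and primer_matches are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_mismatches(seq, primer):
    """Count positions where seq disagrees with primer.

    A wildcard in the primer matches any base in its IUPAC set.
    A sequence shorter than the primer counts as fully mismatched.
    """
    if len(seq) < len(primer):
        return len(primer)
    diffs = 0
    for base, code in zip(seq.upper(), primer.upper()):
        if base not in IUPAC.get(code, code):
            diffs += 1
    return diffs


def primer_matches(seq, primer, max_diffs=2):
    """True if the primer matches the start of seq with at most max_diffs mismatches."""
    return primer_mismatches(seq, primer) <= max_diffs
```

Here seq is the part of the read that follows the barcode, so the comparison starts at the first primer position.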
The barcodes argument is the name of a FASTA file containing barcodes. No mismatches are allowed with the barcode.
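Exact barcode matching can be sketched as a prefix test against every barcode in the FASTA file. This is a simplified illustration under that assumption; read_fasta and find_barcode are hypothetical helper names, and the barcode labels and sequences in the example are made up.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a {label: sequence} dict."""
    records = {}
    label = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            label = line[1:]
            records[label] = ""
        elif label is not None:
            records[label] += line.upper()
    return records


def find_barcode(seq, barcodes):
    """Return (label, barcode) for the barcode that exactly prefixes seq, else None."""
    seq = seq.upper()
    for label, barcode in barcodes.items():
        if seq.startswith(barcode):
            return label, barcode
    return None


# Hypothetical barcodes file contents, with the TCAG control sequence included.
barcodes = read_fasta([">bc1", "TCAGACGT", ">bc2", "TCAGTTGG"])
hit = find_barcode("TCAGTTGGGGATTAGACCCC", barcodes)
miss = find_barcode("TCAGAAAAGGATTAGACCCC", barcodes)
```

Because no mismatches are tolerated, a read with a single sequencing error in its barcode finds no match in this sketch.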
The label_prefix argument is a string of characters. The read label is replaced by:
label_prefixN;barcodelabel=xxx;
where N is the read number, counting from 1 as the first read in the file, and xxx is the barcode sequence (fastq_strip_barcode_relabel.py) or FASTA label (fastq_strip_barcode_relabel2.py).
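The label format can be reproduced with a single string substitution. The sketch below follows the template given above; the function name make_label and the example prefix and barcode values are hypothetical.

```python
def make_label(label_prefix, read_number, barcode_id):
    """Build a label of the form label_prefixN;barcodelabel=xxx;

    barcode_id is the barcode sequence or the barcode's FASTA label,
    depending on which variant of the script is being imitated.
    """
    return "%s%d;barcodelabel=%s;" % (label_prefix, read_number, barcode_id)


# Read numbers start at 1 for the first read in the file.
label_with_sequence = make_label("Read", 1, "ACGT")
label_with_name = make_label("Read", 2, "bc2")
```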
Output is written in FASTQ format.