Illumina unpaired reads
See also
UPARSE home page
UPARSE pipeline home page
Illumina unpaired reads of variable-length amplicons
This page gives an example UPARSE pipeline for Illumina unpaired reads. These
commands make the following assumptions, which are usually but not always true
in the datasets I've seen:
1. There are no non-biological bases in the read such as adapters or barcodes.
2. Sequences complementary to PCR primers are not included in the reads.
3. The reads have been demultiplexed, i.e. split into separate FASTQ files for each sample.
4. The FASTQ filenames start with the sample name.
5. The reads are all on the same strand.
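Assumptions 3 and 4 can be checked quickly before running anything. The following is a minimal sketch, not part of the UPARSE documentation: sample_name and list_samples are hypothetical helpers, and the rule that the sample name is everything before the first underscore is my assumption about the naming scheme. Adjust it if your filenames are laid out differently.

```shell
#!/bin/sh
# Hypothetical helper: derive a sample name from a FASTQ filename.
# ASSUMPTION: the sample name is everything before the first underscore.
sample_name() {
    base=$(basename "$1")
    printf '%s\n' "${base%%_*}"
}

# Print "sample<TAB>file" for each file given on the command line.
list_samples() {
    for f in "$@"; do
        [ -e "$f" ] || continue   # skip a glob pattern that matched nothing
        printf '%s\t%s\n' "$(sample_name "$f")" "$f"
    done
}

# Same file pattern as used in the commands on this page.
list_samples *_R1_*.fastq
```

If a sample name in the first column looks wrong or appears for files that belong to different samples, the filenames probably do not satisfy assumption 4 as interpreted here.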
Commands
usearch -fastq_filter *_R1_*.fastq -relabel @ -fastaout reads.fa
usearch -fastq_filter *_R1_*.fastq -fastq_maxee 1.0 -relabel Filt -fastaout filtered.fa
usearch -fastx_uniques filtered.fa -relabel Uniq -sizeout -fastaout uniques.fa
usearch -cluster_otus uniques.fa -minsize 2 -otus otus.fa -relabel Otu
usearch -usearch_global reads.fa -db otus.fa -strand plus -id 0.97 \
-otutabout otutab.txt -biomout otutab.json
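
After the commands above have run, it can help to see how many sequences survive each step. This is a rough sketch using only standard shell tools, assuming the output filenames used in the commands above; count_seqs is a hypothetical helper, not a usearch command.

```shell
#!/bin/sh
# Hypothetical helper: count FASTA records by counting header lines,
# i.e. lines that start with '>'.
count_seqs() {
    grep -c '^>' "$1"
}

# Report the record count for each pipeline output that exists.
for f in reads.fa filtered.fa uniques.fa otus.fa; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_seqs "$f")"
    fi
done
```

A large drop from reads.fa to filtered.fa suggests that many reads exceed the expected-error threshold, which may be worth investigating before interpreting the OTU table.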