Illumina unpaired reads
See also
UPARSE home page
UPARSE pipeline
This page gives an example UPARSE pipeline for Illumina unpaired reads. These commands make the
following assumptions, which are usually but not always true in the datasets
I've seen:
1. There are no non-biological bases in the read such as adapters or barcodes.
2. Sequences complementary to PCR primers are not included in the reads.
3. The reads have been demultiplexed, i.e. split into separate FASTQ files for each sample.
4. The FASTQ filenames start with the sample name.
5. The reads are all on the same strand.
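Assumption 4 matters because the sample name is taken from the FASTQ filename when reads are relabeled with -relabel @. The sketch below previews the sample name each file would get. It assumes the name is the part of the filename before the first underscore or period; that rule is an assumption here and should be checked against the USEARCH documentation. The function sample_name is a hypothetical helper, not part of usearch.

```shell
# Hypothetical helper: guess the sample name implied by a FASTQ filename.
# ASSUMPTION: the sample name is everything before the first "_" or ".".
sample_name() {
  base=$(basename "$1")            # drop any directory part
  printf '%s\n' "${base%%[_.]*}"   # truncate at the first "_" or "."
}

# List each forward-read file next to its inferred sample name.
for fq in *_R1_*.fastq; do
  [ -e "$fq" ] || continue         # skip if the pattern matched nothing
  printf '%s\t%s\n' "$fq" "$(sample_name "$fq")"
done
```

If two files map to the same name, or a name is not the one you expect, rename the files before running the pipeline.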
Commands
usearch -fastq_filter *_R1_*.fastq -relabel @ -fastaout reads.fa
usearch -fastq_filter *_R1_*.fastq -fastq_maxee 1.0 -relabel Filt -fastaout filtered.fa
usearch -derep_fulllength filtered.fa -relabel Uniq -sizeout -fastaout uniques.fa
usearch -cluster_otus uniques.fa -minsize 2 -otus otus.fa -relabel Otu
usearch -usearch_global reads.fa -db otus.fa -strand plus -id 0.97 \
-otutabout otutab.txt -biomout otutab.json
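After the commands above have run, a quick sanity check is to count the sequences in each FASTA file, since every FASTA record begins with a ">" line. This is a minimal sketch, not part of the pipeline itself; count_seqs is a hypothetical helper, and the filenames are the ones used in the commands above.

```shell
# Hypothetical helper: count FASTA records by counting header lines.
count_seqs() {
  grep -c '^>' "$1"
}

# Report counts for the pipeline outputs that exist in this directory.
for fa in reads.fa filtered.fa uniques.fa otus.fa; do
  [ -e "$fa" ] || continue
  printf '%s\t%s\n' "$fa" "$(count_seqs "$fa")"
done
```

Comparing reads.fa with filtered.fa shows how many reads the -fastq_maxee 1.0 filter discarded.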