See also
UPARSE home page
UPARSE algorithm
UPARSE pipeline
Python scripts
This page gives example command lines for constructing a UPARSE pipeline. Edit them as needed for your reads and file locations. The example assumes unpaired reads in FASTQ format. You will need the forward primer sequence (ATTACCGCGGCTGCTGG in the example below) and a FASTA file called barcodes.fa containing the barcodes that identify your samples. The FASTA label for each barcode should be a short name identifying the sample.
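To illustrate the barcodes.fa layout described above, here is a minimal Python sketch that parses a small made-up file and checks that every sample label is unique. The sample names (SampleA, SampleB) and barcode sequences are invented for this example, and parse_barcodes is a hypothetical helper, not one of the UPARSE Python scripts.

```python
import io

# Made-up barcodes.fa content (assumption, for illustration only):
# each FASTA label is a short sample name, each sequence is that sample's barcode.
EXAMPLE_BARCODES_FA = """\
>SampleA
ACGTACGT
>SampleB
TGCATGCA
"""


def parse_barcodes(handle):
    """Return {sample_name: barcode} read from a FASTA file handle."""
    barcodes = {}
    label = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            label = line[1:]
            if label in barcodes:
                raise ValueError("duplicate sample label: " + label)
            barcodes[label] = ""
        elif label is None:
            raise ValueError("sequence found before any FASTA label")
        else:
            # Sequences may span several lines; join them.
            barcodes[label] += line.upper()
    return barcodes


barcodes = parse_barcodes(io.StringIO(EXAMPLE_BARCODES_FA))
for sample, barcode in barcodes.items():
    print(sample, barcode)
```

To check a real file, pass an open file handle instead of the in-memory string, for example parse_barcodes(open("barcodes.fa")).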
# Set a variable to give a short name for the USEARCH binary
u=~/bin/usearch6.1.544_i86linux32
# Variable for directory containing input data (reads and ref. db)
d=~/data
# Strip barcodes; assumes the Python scripts are in directory py in your home dir.
# "Ex" is a prefix for the read labels, can be anything you like.
python ~/py/fastq_strip_barcode_relabel2.py $d/reads.fq ATTACCGCGGCTGCTGG barcodes.fa Ex > reads2.fq
# Quality filter, length truncate, convert to FASTA
$u -fastq_filter reads2.fq -fastq_maxee 0.5 -fastq_trunclen 250 -fastaout reads.fa
# Dereplication
$u -derep_fulllength reads.fa -output derep.fa -sizeout
# Abundance sort and discard singletons
$u -sortbysize derep.fa -output sorted.fa -minsize 2
# OTU clustering
$u -cluster_otus sorted.fa -otus otus1.fa
# Chimera filtering using reference database
$u -uchime_ref otus1.fa -db $d/gold.fa -strand plus -nonchimeras otus2.fa
# Label OTU sequences OTU_1, OTU_2...
python ~/py/fasta_number.py otus2.fa OTU_ > otus.fa
# Map reads (including singletons) back to OTUs
$u -usearch_global reads.fa -db otus.fa -strand plus -id 0.97 -uc map.uc
# Create OTU table
python ~/py/uc2otutab.py map.uc > otu_table.txt
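
The final step turns per-read hits into per-sample counts. The Python sketch below shows the idea on a few made-up .uc records; it is not the code of uc2otutab.py. It assumes that the read labels carry the sample name in a ;barcodelabel=NAME; annotation, and that the query and target labels are the 9th and 10th tab-separated fields of each H (hit) record. The function names and the OTUId header are hypothetical choices made for this example.

```python
import re
from collections import defaultdict

# Made-up .uc records with 10 tab-separated fields (H = hit, N = no hit).
EXAMPLE_UC = "\n".join([
    "H\t0\t250\t99.2\t+\t0\t0\t250M\tEx1;barcodelabel=SampleA;\tOTU_1",
    "H\t1\t250\t98.0\t+\t0\t0\t250M\tEx2;barcodelabel=SampleA;\tOTU_2",
    "H\t0\t250\t97.6\t+\t0\t0\t250M\tEx3;barcodelabel=SampleB;\tOTU_1",
    "N\t*\t*\t*\t.\t*\t*\t*\tEx4;barcodelabel=SampleB;\t*",
])


def otu_table_from_uc(lines):
    """Count hits per (OTU, sample); reads without a hit are ignored."""
    counts = defaultdict(lambda: defaultdict(int))
    samples = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10 or fields[0] != "H":
            continue  # keep only hit records
        match = re.search(r"barcodelabel=([^;]+)", fields[8])
        if match is None:
            continue  # read label has no sample annotation (assumed format)
        sample = match.group(1)
        counts[fields[9]][sample] += 1
        samples.add(sample)
    return counts, sorted(samples)


def format_table(counts, samples):
    """Render a tab-separated table: one row per OTU, one column per sample."""
    rows = ["\t".join(["OTUId"] + samples)]
    for otu in sorted(counts):  # plain string sort, so OTU_10 precedes OTU_2
        cells = [str(counts[otu].get(s, 0)) for s in samples]
        rows.append("\t".join([otu] + cells))
    return "\n".join(rows)


counts, samples = otu_table_from_uc(EXAMPLE_UC.splitlines())
table = format_table(counts, samples)
print(table)
```

Because reads that did not match any OTU at 97% identity produce no H record, they do not appear in the counts in this sketch.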