Part 1. UPARSE pipeline from reads to OTU table

This tutorial uses reads from 11 samples deposited in archive SRR058098 by the Human Microbiome Project. I took a random subset of 5% of the reads, giving 19,735 FASTQ records. See Part 2 for more information about the reads and how the UPARSE pipeline was implemented.
Download the tutorial files: hmptut.tar.gz

I'll assume the downloaded file is in ~/Downloads; if you downloaded it to a different path, substitute that path in the commands below.
Make a top-level directory for the tutorials, change to that directory, and extract the tutorial files using tar. See tutorial directories for a description of the subdirectories.

mkdir -p ~/tutorials
cd ~/tutorials
tar -zxvf ~/Downloads/hmptut.tar.gz
Create the utax database by running the setup_utax.bash script, like this:

cd ~/tutorials/hmptut/scripts
./setup_utax.bash

Notice the dot and slash (./) before setup_utax.bash. This tells the shell to look for the command file (script or binary) in the current directory (dot means current directory), which is needed if the current directory is not in your PATH. The tutorial scripts always assume they are run this way, i.e. from inside the scripts/ subdirectory.
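The PATH behavior can be demonstrated with a throwaway script (hello.sh and /tmp/dotdemo are made-up names for illustration, not tutorial files):

```shell
# Throwaway demonstration of why ./ is needed; hello.sh and
# /tmp/dotdemo are illustrative names, not part of the tutorial.
mkdir -p /tmp/dotdemo && cd /tmp/dotdemo
printf '#!/bin/bash\necho hello\n' > hello.sh
chmod +x hello.sh

./hello.sh        # runs: the path explicitly names the current directory
hello.sh || true  # usually fails: '.' is not in PATH on most systems
```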
The setup_utax.bash script uses curl to fetch the data. Some systems don't have curl, in which case you can use wget instead. The script includes a commented-out wget command, so it is a simple edit to uncomment wget and comment out curl.
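If you prefer not to edit the script by hand, the same choice can be made automatically by testing which tool is installed. A sketch of that idea, using a placeholder URL rather than the real database location:

```shell
# Hedged sketch: pick curl if installed, otherwise fall back to wget.
# The URL is a placeholder, not the actual file setup_utax.bash fetches.
url="http://example.com/placeholder.tar.gz"
if command -v curl >/dev/null 2>&1; then
  fetch="curl -O"
elif command -v wget >/dev/null 2>&1; then
  fetch="wget"
else
  echo "error: neither curl nor wget found" >&2
fi
echo "would run: $fetch $url"
```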
The UPARSE pipeline is implemented in the run_uparse.bash script. Run it like this:

cd ~/tutorials/hmptut/scripts
./run_uparse.bash
This should reproduce the
pre-computed files in the hmptut/out/ directory.
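To confirm a run reproduced the shipped results, diff -r can compare two directory trees; applied to this tutorial, you would compare a backup copy of hmptut/out/ (made before running the pipeline, under any name you choose) against the regenerated directory. A self-contained illustration on throwaway directories:

```shell
# Illustration of diff -r on two directory trees; /tmp/cmpdemo and its
# contents are made-up example data, not tutorial files.
mkdir -p /tmp/cmpdemo/a /tmp/cmpdemo/b
echo "otu_table" > /tmp/cmpdemo/a/result.txt
echo "otu_table" > /tmp/cmpdemo/b/result.txt
if diff -r /tmp/cmpdemo/a /tmp/cmpdemo/b >/dev/null; then
  echo "directories match"
fi
```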