Making an OTU table
An OTU table is made by running the usearch_global command with an appropriate
output file option, e.g. -otutabout. See Mapping reads to OTUs for details.
Read sequences must have sample identifiers
When you run usearch_global to make the OTU table, the FASTA file
or FASTQ file containing the reads must have sample identifiers in the
labels.
Sample identifier syntax
The sample name can be specified by putting sample=xxx; into the label. If sample= is not found,
the sample identifier is assumed to start at the beginning of the label and
continue up to, but not including, the first character in the label which is not
alphanumeric or an underscore. Put another way, any character which is not a letter,
number or underscore marks the end of the sample identifier. (For backwards compatibility, you can also use barcodelabel=xxx.)
The following labels have sample identifier S01. FASTA labels start with > at
the beginning of the line, FASTQ labels start with @.
>S01.123
>S01.123;size=14;
@M00967:43:000000000-A3JHG:1:1101:18327:1699;sample=S01;
In the first and second examples, the period (.) is the first character which is not a letter, number or underscore, so .123 is not part of the sample identifier. In the third example, the identifier is given explicitly by sample=S01;.
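The rule above can be sketched in Python as a way to check labels before running usearch_global. This is an illustration rather than usearch's own parser: the function name sample_id is hypothetical, and it assumes that sample= and barcodelabel= appear as semicolon-delimited fields and that the first one found wins.

```python
import re

# Assumption: an explicit annotation is a ';'-delimited field of the form
# sample=xxx; or barcodelabel=xxx; (usearch's exact matching rules may differ).
_ANNOTATION = re.compile(r"(?:^|;)(?:sample|barcodelabel)=([^;]+)")
# Leading run of letters, digits and underscores.
_LEADING = re.compile(r"[A-Za-z0-9_]*")


def sample_id(label: str) -> str:
    """Return the sample identifier implied by a FASTA or FASTQ label."""
    # Drop the leading '>' (FASTA) or '@' (FASTQ) marker if it is present.
    if label[:1] in (">", "@"):
        label = label[1:]
    match = _ANNOTATION.search(label)
    if match:
        return match.group(1)
    # Fallback: everything up to the first character that is not a
    # letter, digit or underscore.
    return _LEADING.match(label).group(0)


for label in (
    ">S01.123",
    ">S01.123;size=14;",
    "@M00967:43:000000000-A3JHG:1:1101:18327:1699;sample=S01;",
):
    print(sample_id(label))  # prints S01 for each of the three labels
```

Running a check like this over a handful of labels from each input file is a quick way to confirm that every read will be assigned to the sample you expect.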
How to get sample names into your labels
The simplest method is to use the -relabel option of
fastq_mergepairs,
fastq_filter,
derep_fulllength or
derep_prefix. If you process one file at a time,
you can do something like this:
usearch -derep_fulllength reads.fastq -relabel SampleName. -fastaout derep.fa
Note the period following SampleName -- you must have a character which is not a letter, number or underscore to separate the sample name from the read number.
If -relabel @ is specified, the sample name is constructed from the FASTQ filename by truncating at the first underscore or period. With typical Illumina FASTQ filenames, this is the sample name.
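To preview which sample name -relabel @ would derive from a given file, the truncation rule can be sketched as follows. The function name sample_from_filename is hypothetical, the example filenames are made up, and stripping the directory part of the path first is an assumption.

```python
import os
import re


def sample_from_filename(path: str) -> str:
    """Truncate a FASTQ filename at its first underscore or period."""
    # Assumption: only the filename counts, not the directories leading to it.
    name = os.path.basename(path)
    # Split once at the first '_' or '.' and keep the part before it.
    return re.split(r"[_.]", name, maxsplit=1)[0]


print(sample_from_filename("S01_S1_L001_R1_001.fastq"))  # prints S01
print(sample_from_filename("Mock.fastq"))                # prints Mock
```

One consequence of this rule is that a sample name which itself contains an underscore is cut short (for example, Gut_A_S5_L001_R1_001.fastq would give Gut), so in that case an explicit -relabel SampleName. is the safer choice.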
Alternatively, you could write your own script to do this task.
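As one possible starting point for such a script, here is a minimal Python sketch that relabels FASTA records as SampleName.1, SampleName.2 and so on. The function name relabel_fasta is hypothetical; the sketch handles FASTA only, and it discards the original labels, including any annotations such as size=, which may not be what you want.

```python
import re
from typing import Iterable, Iterator


def relabel_fasta(lines: Iterable[str], sample: str) -> Iterator[str]:
    """Yield FASTA lines with each label replaced by '>sample.N'."""
    # The sample name may contain only letters, digits and underscores;
    # any other character would end the sample identifier early.
    if not re.fullmatch(r"[A-Za-z0-9_]+", sample):
        raise ValueError(f"invalid sample name: {sample!r}")
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            count += 1
            # The period separates the sample name from the read number.
            yield f">{sample}.{count}"
        else:
            # Sequence lines are passed through unchanged.
            yield line


records = [">M00967:43:1", "ACGT", ">M00967:43:2", "GGCC"]
for out in relabel_fasta(records, "S01"):
    print(out)
```

For a real file, pass an open file handle instead of the list and write the yielded lines to the output file.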