See also
OTU / denoising pipeline
Read preparation
fastx_demux command
Cross-talk
How to demultiplex
If you have Illumina reads with one
FASTQ file per sample, then demultiplexing has already been done for you.
If you have 454 reads with barcodes, or Illumina paired or unpaired reads with i1 index reads, then you can use the fastx_demux command to perform demultiplexing. If you have raw Illumina dual index reads (i5 + i7 + r1 + r2), this is not currently supported in usearch -- let me know and I will add the feature for you.
Background
Several samples can be combined into a single
sequencer run by using "multiplexing", where a barcode sequence identifying
the sample is inserted into the sequencing construct. Barcodes are also
called index sequences.
With Illumina sequencing, the barcode is usually positioned before the sequencing primer, so it does not appear in the forward reads that contain the biological sequence. Barcodes are obtained by making one (single-indexing) or two (dual-indexing) additional reads, which are sometimes called i1 for single indexing and i5+i7 for dual indexing.
With other next-generation sequencers, the barcode sequence usually appears at the beginning of the read, possibly after a machine-specific sequence such as TCAG for 454.
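To make the in-read barcode layout concrete, here is a minimal Python sketch of assigning a read to a sample by its leading barcode. It is an illustration only, not how fastx_demux is implemented. The function name assign_by_prefix, the barcode sequences and the sample names are all made up for the example, and it assumes exact matching with barcodes of equal length.

```python
from typing import Dict, Optional, Tuple


def assign_by_prefix(
    read: str, barcodes: Dict[str, str], key: str = "TCAG"
) -> Tuple[Optional[str], str]:
    """Return (sample, trimmed_read); sample is None if no barcode matches.

    Hypothetical helper: exact prefix matching only, no mismatches allowed.
    """
    # Skip the machine-specific key sequence (e.g. TCAG for 454) if present.
    if key and read.startswith(key):
        read = read[len(key):]
    # The barcode is expected at the very start of what remains.
    for barcode, sample in barcodes.items():
        if read.startswith(barcode):
            return sample, read[len(barcode):]
    return None, read


# Made-up barcodes and sample names, for illustration only.
barcodes = {"ACGTAC": "sampleA", "TGCATG": "sampleB"}
print(assign_by_prefix("TCAGACGTACGGATTAGC", barcodes))
```

Reads whose prefix matches no barcode come back with sample None, so they can be set aside rather than forced into a sample.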
With current Illumina software and standard library preparation protocols,
the demultiplexing is usually done for you and the BaseSpace download
includes one FASTQ file for each sample; the index reads are not included.
However, it is sometimes useful to do the demultiplexing yourself, in which
case you can get "raw" i1, r1 and r2 reads.
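As a rough picture of what demultiplexing raw reads involves, the Python sketch below pairs each r1 record with its i1 record and groups the r1 records by barcode. It is not the fastx_demux implementation. It assumes the two files list the reads in the same order, uses exact barcode matching, and the helper names (read_fastq, demux_by_index), barcodes and sample names are hypothetical.

```python
import io
from collections import defaultdict


def read_fastq(handle):
    """Yield (label, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            return
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def demux_by_index(r1_handle, i1_handle, barcodes):
    """Group r1 records by the barcode in the i1 record at the same position.

    Assumption: r1 and i1 contain the same reads in the same order.
    """
    by_sample = defaultdict(list)
    records = zip(read_fastq(r1_handle), read_fastq(i1_handle))
    for (label, seq, qual), (_, index_seq, _) in records:
        sample = barcodes.get(index_seq, "unassigned")
        by_sample[sample].append((label, seq, qual))
    return dict(by_sample)


# Tiny in-memory stand-ins for the r1 and i1 files (made-up data).
r1 = io.StringIO("@read1\nGGATTAGC\n+\nIIIIIIII\n@read2\nCCTTAAGG\n+\nIIIIIIII\n")
i1 = io.StringIO("@read1\nACGTAC\n+\nIIIIII\n@read2\nTGCATG\n+\nIIIIII\n")
result = demux_by_index(r1, i1, {"ACGTAC": "sampleA", "TGCATG": "sampleB"})
print(result)
```

With paired reads, the r2 records would be carried along in the same loop so that both mates end up in the same sample.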
With both 454 and
Illumina, reads are assigned to the wrong sample due to incorrect barcode
sequences at a surprisingly high rate. I call this problem
cross-talk. A suggested strategy for reducing
cross-talk is to use a sparse dual index scheme where most pairs of indexes
are not assigned to samples.
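One way to see why a sparse scheme might help is sketched below in Python. When most index pairs are unassigned, a read whose two indexes are both valid but whose pair belongs to no sample can be flagged instead of being silently counted for the wrong sample. This is my reading of the strategy, not a description of how usearch handles it. The index names, sample names and the function classify_pair are hypothetical.

```python
def classify_pair(i5, i7, assigned, known_i5, known_i7):
    """Classify an observed (i5, i7) index pair under a sparse design."""
    if (i5, i7) in assigned:
        return assigned[(i5, i7)]
    # Both indexes are in use, but this combination was never assigned:
    # a likely sign that one of the two indexes is wrong for this read.
    if i5 in known_i5 and i7 in known_i7:
        return "cross-talk candidate"
    return "unknown barcode"


# Sparse design: 2 of the 4 possible pairs are assigned (made-up names).
assigned = {("A1", "B1"): "sample1", ("A2", "B2"): "sample2"}
known_i5 = {i5 for i5, _ in assigned}
known_i7 = {i7 for _, i7 in assigned}

print(classify_pair("A1", "B1", assigned, known_i5, known_i7))
print(classify_pair("A1", "B2", assigned, known_i5, known_i7))
print(classify_pair("A3", "B1", assigned, known_i5, known_i7))
```

In a dense design where every pair is assigned, the second case could not be distinguished from a genuine read of another sample.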