uncross command   

See also

  Cross-talk
  UNCROSS algorithm
  UNCROSS paper
  OTU table
  Example OTU tables and reports

The uncross command detects and filters cross-talk (sample mis-assignment) in an OTU table using the UNCROSS algorithm. In a typical run, about 2% of reads are assigned to the wrong sample. If some samples contain large numbers of reads for a given OTU, these often "bleed" into other samples which may not in fact contain that OTU. This can cause many spurious counts which should be zero, giving inflated estimates of richness, alpha diversity and beta diversity.

You can clearly see cross-talk in this GAIIx example and this MiSeq example. You can use this example data to try the uncross command.

Please note that I do not consider the UNCROSS algorithm to be a robust solution for cross-talk. The mechanism(s) causing cross-talk are not well understood. Many different indexing schemes are used. Cross-talk rates in your data may be quite different from the datasets on which UNCROSS was designed and tested, in which case the accuracy of UNCROSS on your data may be lower. Also, cross-talk may be hard or impossible to detect when the number of multiplexed samples is large, say around 100 or more. It is much better to use multiplexing strategies that are designed to reduce cross-talk. UNCROSS is best understood as a simplistic hack that is the best we can do with existing data.

Input is an OTU table in QIIME classic format generated from all of the reads in a single run. Runs should NOT be combined for this analysis. It is important to include ALL samples that were sequenced in the same run, even if they belong to different experiments.

If the run has mock community samples, mock sample names should start with "mock" (case-insensitive), e.g. Mock1, mock or mock_13. OTU identifiers for sequences that are in the designed mock community should contain one of the following strings (case-sensitive): ";mock=yes;", ";annot=perfect;" or ";annot=noisy;". The annot command can be used to generate these annotations in the sequence labels before generating the OTU table.
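Before running uncross, it can be useful to check that the table actually follows these naming conventions. The Python sketch below is not part of usearch; it only applies the two rules stated above (sample names starting with "mock" regardless of case, and OTU identifiers containing one of the three case-sensitive strings). The helper names and the toy table are hypothetical, and the sketch assumes a tab-separated table whose first row is a header with the OTU column label followed by sample names.

```python
import csv
import io

# Annotation strings listed in the text above (matched case-sensitively).
MOCK_OTU_TAGS = (";mock=yes;", ";annot=perfect;", ";annot=noisy;")


def is_mock_sample(name: str) -> bool:
    """True if the sample name starts with "mock", ignoring case."""
    return name.lower().startswith("mock")


def is_mock_otu(otu_id: str) -> bool:
    """True if the OTU identifier contains one of the mock annotations."""
    return any(tag in otu_id for tag in MOCK_OTU_TAGS)


def summarize_mock_labels(table_text: str):
    """Return (mock sample names, mock OTU ids) found in a tab-separated table.

    Assumes the first row is a header whose first cell labels the OTU column
    and whose remaining cells are sample names.
    """
    rows = [r for r in csv.reader(io.StringIO(table_text), delimiter="\t") if r]
    header, body = rows[0], rows[1:]
    mock_samples = [s for s in header[1:] if is_mock_sample(s)]
    mock_otus = [r[0] for r in body if is_mock_otu(r[0])]
    return mock_samples, mock_otus


# Hypothetical toy table, made up for illustration only.
example = (
    "#OTU ID\tMock1\tgut_A\tmock_13\n"
    "Otu1;annot=perfect;\t900\t3\t850\n"
    "Otu2;size=120;\t0\t410\t1\n"
)
mock_samples, mock_otus = summarize_mock_labels(example)
print(mock_samples)
print(mock_otus)
```

If either list comes back empty for a run that does include a mock community, the sample names or OTU labels probably need to be fixed before generating the OTU table.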

The -tabbedout option specifies an output file in tabbed text format.

The -report option specifies a text file name for a summary report.

The -otutabout option specifies a filename to store the filtered OTU table. By default, entries predicted to be spurious due to cross-talk are set to zero and undetermined entries are kept. Specify -uncross_undet_zero to set undetermined entries to zero.
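To see what the filter changed, the input table can be compared with the table written by -otutabout. The following Python sketch is not part of usearch and makes some assumptions: both tables are tab-separated with a header row of sample names, and the filtered table keeps the same OTU identifiers and sample names as the input. The function names and toy counts are hypothetical.

```python
import csv
import io


def read_otu_table(text: str) -> dict:
    """Parse a tab-separated OTU table into {(otu_id, sample): count}."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    samples = rows[0][1:]
    counts = {}
    for row in rows[1:]:
        for sample, value in zip(samples, row[1:]):
            # int(float(...)) tolerates counts written as "4" or "4.0".
            counts[(row[0], sample)] = int(float(value))
    return counts


def zeroed_entries(before: dict, after: dict) -> list:
    """List (otu_id, sample, original_count) for entries that became zero.

    Entries missing from the filtered table are ignored rather than
    treated as zero.
    """
    return sorted(
        (otu, sample, n)
        for (otu, sample), n in before.items()
        if n > 0 and after.get((otu, sample)) == 0
    )


# Hypothetical toy tables, made up for illustration only.
before_text = (
    "#OTU ID\tS1\tS2\tS3\n"
    "Otu1\t5000\t4\t0\n"
    "Otu2\t2\t3000\t2500\n"
)
after_text = (
    "#OTU ID\tS1\tS2\tS3\n"
    "Otu1\t5000\t0\t0\n"
    "Otu2\t0\t3000\t2500\n"
)
changed = zeroed_entries(read_otu_table(before_text), read_otu_table(after_text))
for otu, sample, n in changed:
    print(f"{otu}\t{sample}\t{n} -> 0")
```

With real files, the text of otutab.txt and result.txt would be read from disk and passed to read_otu_table in place of the toy strings.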

The following options specify user-settable parameters:

-uncross_maxxt (default 0.5). Maximum cross-talk frequency as a percentage.

-uncross_minvalid (default 2.0). Minimum valid frequency as a percentage.

-uncross_minvalidtotal (default 75.0). Minimum fraction of valid reads in an OTU as a percentage.

Example

usearch -uncross otutab.txt -tabbedout out.txt -report rep.txt -otutabout result.txt
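
When uncross is run from a script, the command line can be assembled from the options described above. This Python sketch is one possible wrapper, not an official interface. It assumes that the numeric parameters are passed as the option name followed by a value (the same form as -tabbedout in the example) and that -uncross_undet_zero is a bare flag. The function name build_uncross_command and the value 1.0 are hypothetical.

```python
import shlex


def build_uncross_command(
    otutab,
    tabbedout=None,
    report=None,
    otutabout=None,
    maxxt=None,
    minvalid=None,
    minvalidtotal=None,
    undet_zero=False,
):
    """Assemble a usearch -uncross argument list from the documented options.

    Options left as None are omitted so that usearch applies its defaults.
    """
    cmd = ["usearch", "-uncross", otutab]
    valued = [
        ("-tabbedout", tabbedout),
        ("-report", report),
        ("-otutabout", otutabout),
        ("-uncross_maxxt", maxxt),
        ("-uncross_minvalid", minvalid),
        ("-uncross_minvalidtotal", minvalidtotal),
    ]
    for flag, value in valued:
        if value is not None:
            cmd += [flag, str(value)]
    if undet_zero:
        # Assumption: -uncross_undet_zero is a bare flag with no value.
        cmd.append("-uncross_undet_zero")
    return cmd


cmd = build_uncross_command(
    "otutab.txt",
    tabbedout="out.txt",
    report="rep.txt",
    otutabout="result.txt",
    maxxt=1.0,  # illustrative value only, not a recommendation
)
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True), which requires the usearch binary to be on the PATH.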