
search_16s command (64-bit only)

See also
SEARCH_16S algorithm
SEARCH_16S paper

The search_16s command searches a long sequence such as a chromosome or contig for 16S genes.
It has exceptionally high accuracy, finding at least 99.9% of known 16S genes with few or no false positives.

A bit vector database is required and is specified with the -bitvec option. See the page on creating a bit vector file for the search_16s command.

Input can be in FASTQ or FASTA format.

-hitsout option
FASTA file containing "hits", i.e. regions with elevated density of signature words. These are candidate 16S genes with flanking sequence (see paper for details).

-fastaout option
FASTA file containing predicted 16S genes.

-fragout option
FASTA file containing probable fragments of 16S genes, i.e. sequences which lack one or both identifying motifs.

-tabbedout option
Tabbed text file containing records for query sequences, hits, full-length genes and fragments.

-start_motif option
Start motif. Default GNTTGATCNTGNC.

-end_motif option
End motif.

-min_gene_length option
Minimum gene length. Default 1200.

-max_gene_length option
Maximum gene length. Default 2000.

-maxstartdiffs option
Maximum number of mismatches with the start motif. Default 4.

-maxenddiffs option
Maximum number of mismatches with the end motif. Default 4.
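
To make the motif options more concrete, here is a minimal Python sketch of counting mismatches between a motif and a sequence window. It assumes that motif letters follow IUPAC ambiguity codes (so N matches any base) and that every disallowed base counts as one difference; the exact scoring used by search_16s may differ. The names IUPAC, count_diffs and find_motif are hypothetical and are not part of usearch.

```python
# IUPAC nucleotide codes mapped to the bases each one allows.
# Assumption: motifs such as GNTTGATCNTGNC use these codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_diffs(motif, window):
    """Count positions where the window base is not allowed by the motif letter."""
    if len(motif) != len(window):
        raise ValueError("motif and window must have equal length")
    diffs = 0
    for m, b in zip(motif.upper(), window.upper()):
        if b == "U":
            b = "T"  # treat RNA input as DNA
        # An ambiguous base in the window (e.g. N) counts as a difference here.
        if b not in IUPAC.get(m, ""):
            diffs += 1
    return diffs


def find_motif(seq, motif, max_diffs):
    """Return (position, diffs) for each window within max_diffs of the motif."""
    n = len(motif)
    matches = []
    for i in range(len(seq) - n + 1):
        d = count_diffs(motif, seq[i:i + n])
        if d <= max_diffs:
            matches.append((i, d))
    return matches
```

With the default start motif, find_motif(seq, "GNTTGATCNTGNC", 4) would list every position in seq that lies within the default -maxstartdiffs threshold under these assumptions.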


usearch -search_16s contigs.fa -bitvec gg97.bitvec -fastaout 16s.fa -tabbedout results.txt
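
When the command is run from a script, the same command line can be assembled programmatically. The sketch below uses only options documented on this page. The function names are hypothetical, and it assumes the usearch binary is on the PATH.

```python
import subprocess


def build_search_16s_cmd(input_path, bitvec_path, fastaout=None,
                         tabbedout=None, usearch="usearch"):
    """Build the argument list for a search_16s run (does not execute it)."""
    cmd = [usearch, "-search_16s", input_path, "-bitvec", bitvec_path]
    if fastaout:
        cmd += ["-fastaout", fastaout]
    if tabbedout:
        cmd += ["-tabbedout", tabbedout]
    return cmd


def run_search_16s(*args, **kwargs):
    """Run the command and raise CalledProcessError on a non-zero exit status."""
    subprocess.run(build_search_16s_cmd(*args, **kwargs), check=True)
```

For example, build_search_16s_cmd("contigs.fa", "gg97.bitvec", fastaout="16s.fa", tabbedout="results.txt") produces the same arguments as the command line above.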