See also
  OTU / denoising pipeline
  Read preparation
  Defining and interpreting OTUs
  Expected errors

To get good OTU sequences, low-quality reads should be discarded because they often cause spurious OTUs. I strongly recommend expected error filtering with the fastq_filter command, which is much more effective than most other quality filters.
Discarding singletons is also a quality filtering strategy.
Quality filtering should be performed after paired read merging, primer stripping and length trimming.
	Paired read merging should be done before quality filtering because the posterior Q scores in the overlapping region are more accurate. You should use the usearch fastq_mergepairs command to get this benefit, because most other paired read assemblers generate incorrect consensus Q scores, most notably PANDAseq, which systematically reduces Q scores at positions where both reads agree.
	Trimming should be done before quality filtering because trimming always reduces expected errors, so the number of expected errors (e.e.) will be overestimated if it is calculated before trimming.
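To make the trimming argument concrete, here is a minimal sketch of the expected error calculation: the sum over all bases of the error probability implied by each Q score, p = 10^(-Q/10). It assumes Phred+33 ASCII encoding, and the quality string is made up for illustration. It is not the usearch implementation.

```python
def expected_errors(quals: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities, p = 10^(-Q/10).

    Assumes Phred+33 encoding (offset 33); adjust for other encodings.
    """
    return sum(10 ** (-(ord(c) - offset) / 10) for c in quals)


# Hypothetical read: 5 bases at Q40 ('I') followed by 5 bases at Q10 ('+').
quals = "IIIII" + "+++++"

full = expected_errors(quals)         # e.e. of the untrimmed read
trimmed = expected_errors(quals[:7])  # e.e. after truncating to 7 bases

print(f"untrimmed e.e. = {full:.4f}, trimmed e.e. = {trimmed:.4f}")
```

Every base contributes a positive error probability, so removing bases can only lower the total. A read that would be judged on its untrimmed e.e. is therefore penalized for bases that will not be in the final sequence.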
	
I recommend setting the maximum expected error threshold to 1.0, regardless of the read length.
Example
usearch -fastq_filter trimmed.fq -fastq_maxee 1.0 -fastaout filtered.fa
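For readers who want to see what a maximum expected error filter does, the following is a rough sketch of the idea, not the usearch code. It assumes four-line FASTQ records with Phred+33 quality strings; the function names and the two demo reads are hypothetical.

```python
import io
from typing import Iterator, TextIO, Tuple


def expected_errors(quals: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities, p = 10^(-Q/10)."""
    return sum(10 ** (-(ord(c) - offset) / 10) for c in quals)


def filter_fastq(handle: TextIO, max_ee: float = 1.0) -> Iterator[Tuple[str, str]]:
    """Yield (label, sequence) for reads whose e.e. does not exceed max_ee.

    Assumes strict four-line FASTQ records (no wrapped sequences).
    """
    while True:
        header = handle.readline().rstrip()
        if not header:  # end of input
            break
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored
        quals = handle.readline().rstrip()
        if expected_errors(quals) <= max_ee:
            yield header[1:], seq  # drop the leading '@'


# Two made-up reads: one at Q40 throughout ('I'), one at Q2 throughout ('#').
demo = io.StringIO(
    "@good\nACGTACGT\n+\nIIIIIIII\n"
    "@bad\nACGTACGT\n+\n########\n"
)

kept = list(filter_fastq(demo, max_ee=1.0))
for label, seq in kept:
    print(f">{label}\n{seq}")  # FASTA output, as with -fastaout
```

The threshold applies to the whole read, not to individual bases, which is why a single threshold of 1.0 can be used regardless of read length.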
Validating quality filtering
The best way to validate the effectiveness of quality filtering, and of the other steps in your pipeline, is to use control samples with known composition. If you don't have control samples, the fastx_learn command can be used to estimate the error rate de novo. This can be used as a check that the error rate after quality filtering is low, i.e. that the Q scores give good predictions of base call errors. For some machines, e.g. 454 and Ion Torrent, the Q scores are much less effective than Illumina Q scores because they are estimates of the homopolymer length error, not of the base call error. Using fastx_learn can reveal this type of problem. If that happens, a more effective quality filtering strategy is to increase the minimum abundance threshold. The minimum abundance can be tuned using a mock community control sample.
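The effect of a minimum abundance threshold can be sketched as follows. This is an illustration under simple assumptions, not the usearch implementation: reads are dereplicated by exact full-length match, and the function name, the `min_size` parameter and the demo reads are all hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def abundant_uniques(seqs: Iterable[str], min_size: int = 2) -> List[Tuple[str, int]]:
    """Dereplicate reads and keep unique sequences seen at least min_size times.

    Results are ordered by decreasing abundance.
    """
    counts = Counter(seqs)
    return [(seq, n) for seq, n in counts.most_common() if n >= min_size]


# Made-up reads: one abundant sequence, one low-abundance variant, one singleton.
reads = ["ACGT"] * 5 + ["ACGA"] * 2 + ["TTTT"]

print(abundant_uniques(reads, min_size=2))  # discards the singleton only
print(abundant_uniques(reads, min_size=3))  # stricter: discards the variant too
```

With a mock community, the threshold could be raised step by step until the surviving sequences match the known composition, at the cost of losing genuine low-abundance sequences.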