Sensitivity / error plots comparing RDP and UTAX

**See also**
Taxonomy benchmark home

Results:

UTAX vs. RDP S/E plot for 16S V3-V5 region (~530nt)

UTAX vs. RDP S/E plot for 16S V5 region (120nt)

UTAX vs. RDP S/E plot for fungal ITS1

UTAX vs. RDP S/E plot for fungal ITS2

Sensitivity / error (S/E) plots are used to compare classifiers that give a confidence score. Currently, the only classifiers I know of which do this are the RDP Naive Bayesian Classifier (RDP) and UTAX. A plot is made by sorting all predictions on a benchmark test in order of decreasing confidence and calculating the accumulated sensitivity and error rate at each value of the score. The total number of predictions included thus increases from left to right in the plot. The sensitivity can only increase as more predictions are added, but the error rate can fluctuate, because lowering the threshold may bring in more or fewer errors as a fraction of the total. If the confidence score is a good predictor of error rates, we would expect the error rate to tend to increase as the threshold is lowered.
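The construction described above can be sketched in a few lines of Python. This is only an illustration, not the code used to make the plots. It assumes that sensitivity is the number of correct predictions counted so far divided by a fixed total (here, by default, the number of predictions in the test), and that the error rate is the number of incorrect predictions counted so far divided by the number of predictions counted so far. The function name `se_curve` and the `(confidence, is_correct)` input format are hypothetical.

```python
from itertools import groupby


def se_curve(predictions, n_possible=None):
    """Return a list of (score, sensitivity, error_rate) points.

    predictions: iterable of (confidence, is_correct) pairs (assumed format).
    n_possible:  denominator for sensitivity; defaults to the number of
                 predictions (an assumption, the benchmark may define it
                 differently).
    """
    # Sort by decreasing confidence so the threshold is lowered step by step.
    preds = sorted(predictions, key=lambda p: p[0], reverse=True)
    if n_possible is None:
        n_possible = len(preds)

    points = []
    n_pred = 0
    n_correct = 0
    # One point per distinct score value: all predictions sharing a score
    # enter the totals together.
    for score, group in groupby(preds, key=lambda p: p[0]):
        for _, is_correct in group:
            n_pred += 1
            n_correct += bool(is_correct)
        sensitivity = n_correct / n_possible
        error_rate = (n_pred - n_correct) / n_pred
        points.append((score, sensitivity, error_rate))
    return points
```

Because tied scores are added as one group, a classifier with many predictions at its maximum score gets a first point that already lies well away from the origin.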

If the line for classifier X is always below the line for classifier Y, i.e. X has the lower error rate at every sensitivity, this shows that the confidence score of X is more effective at sorting correct from incorrect predictions.

In the case of RDP, a substantial number of predictions have maximum confidence, which causes the RDP curve to start away from the origin of the plot, at the open red circle corresponding to 100% bootstrap. In the case of UTAX, only a small number of predictions have confidence 1.0 (to two significant figures), so the curve starts close to the origin and may fluctuate more strongly, as a single incorrect prediction may represent a larger fraction of the predictions counted up to that point. Thus, the apparently high error rates which are sometimes seen close to the left of the UTAX line correspond to high cutoffs, say around 0.99 or 0.98, which would not be used in practice. This is seen in the example plot below, where the UTAX error rate briefly jumps over 5% before settling down.
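The arithmetic behind this instability is simple enough to sketch. The counts below are made up for illustration and are not taken from the benchmark; `single_error_rate` is a hypothetical helper that gives the accumulated error rate when exactly one of the predictions counted so far is wrong.

```python
def single_error_rate(n_predictions):
    """Accumulated error rate when exactly one of n_predictions is wrong."""
    return 1 / n_predictions


# Illustrative counts only: the fewer predictions counted so far,
# the more a single mistake moves the curve.
for n in (10, 19, 20, 100, 1000):
    print(f"{n:5d} predictions, 1 error: {single_error_rate(n):.2%}")
```

Under this arithmetic, a single error puts the accumulated error rate above 5% whenever fewer than 20 predictions have been counted, and its effect shrinks quickly as more predictions are included.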

Suggested cutoffs are indicated by solid circles. The suggested cutoff for UTAX is 0.9. For RDP, I followed the recommendations on the RDP web site: use 80% for sequences > 250nt or 50% for sequences < 250nt. Note that the default used by QIIME with -m rdp is 50% regardless of length, so the default cutoff for the QIIME implementation of RDP is lower than the recommended cutoff on all datasets except V5.
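As a rough sketch, the cutoff rules above and the resulting point on an S/E curve might be written as follows. The function names are hypothetical, confidences are assumed to be expressed as fractions (0.8 rather than 80%), a prediction is assumed to be kept when its confidence is greater than or equal to the cutoff, and a length of exactly 250nt, which the recommendation does not cover, is arbitrarily given the lower cutoff.

```python
UTAX_CUTOFF = 0.9          # suggested UTAX cutoff
QIIME_RDP_DEFAULT = 0.5    # QIIME default for RDP, regardless of length


def rdp_recommended_cutoff(length_nt):
    """RDP cutoff by sequence length: 0.8 above 250nt, otherwise 0.5.

    Exactly 250nt is not covered by the recommendation; treating it as
    'short' is an assumption made here.
    """
    return 0.8 if length_nt > 250 else 0.5


def operating_point(predictions, cutoff):
    """Return (sensitivity, error_rate) for predictions kept at a cutoff.

    predictions: list of (confidence, is_correct) pairs (assumed format).
    Sensitivity is taken relative to the number of predictions supplied,
    which is an assumption about the benchmark's definition.
    """
    kept = [bool(ok) for conf, ok in predictions if conf >= cutoff]
    n_correct = sum(kept)
    sensitivity = n_correct / len(predictions) if predictions else 0.0
    error_rate = (len(kept) - n_correct) / len(kept) if kept else 0.0
    return sensitivity, error_rate
```

For example, `operating_point(preds, rdp_recommended_cutoff(530))` and `operating_point(preds, QIIME_RDP_DEFAULT)` would give the two points to compare for a V3-V5 dataset, where `preds` is whatever list of scored predictions is available.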