Taxonomy overclassification and underclassification errors

See also
 
Taxonomy benchmark home
 
Validating taxonomy classifiers
  Training taxonomy classifiers
  UTAX algorithm
  Splitting a taxonomy reference set
  Defining "accuracy" of a taxonomy classifier
  Taxonomy classification errors

Lowest Common Rank (LCR)
The lowest common rank (LCR) is the lowest level (genus, family, etc.) of a query sequence's taxonomy that is present in the reference set (training set). For example, if the genus is not present but the family is present, then the LCR is family. Predicting the LCR is the most difficult challenge for a taxonomy classification algorithm. For example, if the identity of the top hit is 88%, should the LCR be class, family, genus, or something else?
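The definition above can be illustrated with a minimal sketch. It is not part of usearch. The rank list, the function name lowest_common_rank, and the dictionary layout are all assumptions made for this example, and it assumes a name is unique within its rank and that the true taxonomy of the query is known (as it is in a benchmark).

```python
# Illustrative sketch only; names and data layout are assumptions, not usearch's.
# Ranks ordered from highest (most general) to lowest (most specific).
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def lowest_common_rank(query_taxonomy, reference_taxonomies):
    """Return the lowest rank at which the query's true name is also
    present in the reference set, or None if no rank is shared."""
    lcr = None
    for rank in RANKS:
        name = query_taxonomy.get(rank)
        if name is None:
            break  # query is not annotated below this rank
        if any(ref.get(rank) == name for ref in reference_taxonomies):
            lcr = rank  # this rank is shared; keep descending
        else:
            break  # name is absent from the reference set; stop here
    return lcr


# Hypothetical example: the query's genus is missing from the reference set,
# but its family is present.
query = {
    "domain": "Bacteria",
    "phylum": "Proteobacteria",
    "class": "Gammaproteobacteria",
    "order": "Enterobacterales",
    "family": "Enterobacteriaceae",
    "genus": "Escherichia",
}
reference = [dict(query, genus="Salmonella")]

example_lcr = lowest_common_rank(query, reference)
print(example_lcr)
```

In this example the genus Escherichia does not occur in the reference set but the family Enterobacteriaceae does, so the sketch reports family as the LCR.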

See also overclassification benchmark results.

Overclassification error
If a classifier predicts a level lower than the LCR, then this is an overclassification error. For example, the LCR is family but a genus is predicted.

Underclassification error
If a classifier predicts higher levels but does not predict the LCR, then this is an underclassification error. For example, the LCR is family but the lowest predicted level is order. All false negatives are underclassification errors, so there is really no need for a new term.
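As a rough sketch of the two definitions above, the comparison can be written as a check on rank positions. The function name rank_error and the rank list are assumptions made for this example, not usearch code. The sketch compares only the depth of the prediction with the LCR and does not check whether the predicted names are correct.

```python
# Illustrative sketch only; names are assumptions, not usearch's.
# Ranks ordered from highest (most general) to lowest (most specific).
RANKS = ["domain", "phylum", "class", "order", "family", "genus", "species"]


def rank_error(lowest_predicted_rank, lcr):
    """Compare the lowest predicted rank with the LCR.

    Only the depth of the prediction is compared; whether the predicted
    name at each rank is correct is a separate question.
    """
    predicted = RANKS.index(lowest_predicted_rank)
    expected = RANKS.index(lcr)
    if predicted > expected:
        return "overclassification"  # predicted a level lower than the LCR
    if predicted < expected:
        return "underclassification"  # stopped above the LCR
    return "rank matches LCR"


# The two examples from the text: the LCR is family in both cases.
over_example = rank_error("genus", "family")
under_example = rank_error("order", "family")
print(over_example, under_example)
```

Keeping the ranks in a single ordered list means the test reduces to comparing two indices, which makes it easy to tally both error types over a benchmark set.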