Muscle5

Column confidence

Column confidence (CC) is a measure of how well a column is reproduced across an ensemble. It is calculated as the fraction of MSAs in the ensemble where the column appears. The value ranges from zero (the column is not found) to one (the column appears in all MSAs).
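The definition can be written as a formula. The symbols are notation introduced here for illustration, not taken from Muscle5: N is the number of MSAs in the ensemble, A_i is the i-th MSA viewed as a set of columns, and c is the column being scored.

```latex
\mathrm{CC}(c) \;=\; \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\left[\, c \in A_i \,\right],
\qquad 0 \le \mathrm{CC}(c) \le 1
```

Here the indicator is 1 when column c appears in A_i and 0 otherwise, so CC(c) = 0 means the column is found in no MSA and CC(c) = 1 means it is found in all of them.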

CC is similar to the Felsenstein bootstrap confidence of a tree edge, which is measured as the fraction of tree replicates where an edge is found.

Columns are compared by sequence position as well as by letter, so the same pattern of letters in two MSAs counts as a match only if the letters come from the same positions in their sequences.
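The calculation can be sketched in Python as follows. This is a minimal illustration, not Muscle5's implementation: the function names `column_keys` and `column_confidence` are hypothetical, each MSA is assumed to be a list of equal-length gapped strings with rows in the same sequence order, and a gap is assumed to be recorded simply as "no letter" regardless of where in the sequence it falls. A column is identified by the position of each letter in its ungapped sequence, which is how the position-aware matching is modeled here.

```python
from collections import Counter

def column_keys(msa, gap="-"):
    """Return one key per column of the MSA.

    A key is a tuple holding, for each row, the 0-based position of the
    letter in its ungapped sequence, or None where the row has a gap.
    """
    next_pos = [0] * len(msa)          # next ungapped position per row
    keys = []
    for column in zip(*msa):           # iterate over alignment columns
        key = []
        for row, letter in enumerate(column):
            if letter == gap:
                key.append(None)
            else:
                key.append(next_pos[row])
                next_pos[row] += 1
        keys.append(tuple(key))
    return keys

def column_confidence(reference, ensemble):
    """CC of each column of `reference`: the fraction of MSAs in
    `ensemble` that contain a column with the same key."""
    counts = Counter()
    for msa in ensemble:
        counts.update(set(column_keys(msa)))   # count each MSA at most once
    n = len(ensemble)
    return [counts[key] / n for key in column_keys(reference)]

# Hypothetical two-replicate ensemble of the sequences FAC and FCD.
msa_a = ["FAC-",
         "F-CD"]
msa_b = ["FAC-",
         "FC-D"]
ensemble = [msa_a, msa_b]
cc = column_confidence(msa_a, ensemble)
print(cc)  # [1.0, 0.5, 0.5, 1.0]
```

In this made-up input, the first column aligns the two F letters in both replicates and scores 1.0, while the two middle columns of `msa_a` appear in only one of the two replicates and score 0.5.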

See also letter confidence.

On benchmark tests, CC is strongly predictive of correctness, but it over-estimates the chance that a column is correct if you interpret it as a probability.


Toy example showing how CC is calculated.
This is an ensemble with two replicates. The first green column is a perfectly conserved F; it is found in 2/2 replicates, so CC=1.0. The first red column differs between the replicates; it is found in 1/2 replicates, so CC=0.5.