The Bejerano Lab developed the Mendelian Clinically Applicable Pathogenicity (M-CAP) Score. M-CAP is the first pathogenicity classifier for rare missense variants in the human genome that is tuned to the high sensitivity required in the clinic (see Table). By combining previous pathogenicity scores (including SIFT, PolyPhen-2 and CADD) with novel features and a powerful model, we attain the best classifier at all thresholds, reducing a typical exome/genome list of rare (<1%) missense variants of uncertain significance (VUS) from 300 to 120, while still classifying 95% of known pathogenic variants as possibly pathogenic rather than benign. Further details can be found here.
Q1. Which mutations does M-CAP score?
M-CAP scores rare missense variants from the Ensembl 75 gene set.
Q2. How do you distinguish common variants from rare variants?
The frequency of a variant is determined by examining the ExAC and 1000 Genomes Project control populations. A variant is defined as common if its allele frequency is greater than 1% in ExAC, in 1000 Genomes, or in any subpopulation of the two datasets.
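The rule above can be sketched in a few lines of Python. This is only an illustration, not the lab's pipeline: the function name, the population labels, and the dictionary layout are all hypothetical.

```python
# Illustrative sketch only; names and input layout are assumptions.
COMMON_THRESHOLD = 0.01  # 1% allele frequency, as stated in the answer above


def is_common(allele_freqs):
    """Return True if the allele frequency exceeds 1% in any population.

    `allele_freqs` maps a hypothetical population or subpopulation label
    (from ExAC or 1000 Genomes) to the observed allele frequency.
    Entries set to None (no observation) are ignored.
    """
    return any(
        freq is not None and freq > COMMON_THRESHOLD
        for freq in allele_freqs.values()
    )


# Rare overall, but above 1% in one subpopulation, so treated as common.
example = {"ExAC_all": 0.0004, "ExAC_AFR": 0.013, "1000G_all": 0.002}
print(is_common(example))
```

Taking the maximum over all subpopulations, rather than the overall frequency, means a variant that is frequent in a single subpopulation is still excluded from scoring.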
Q3. What does the color coding in the browser track correspond to?
Mutations that we score are colored red or green. Green variants are those that the classifier considers likely benign, and red variants are those that M-CAP classifies as possibly pathogenic. These classifications are based on the score the M-CAP classifier assigns to each scored variant. Scored variants with an M-CAP score above 0.025 are marked as possibly pathogenic, while those below 0.025 are marked likely benign. Non-coding variants are not listed in the browser. Stopgain and stoploss mutations are colored dark gray, and synonymous variants and common nonsynonymous variants are colored light gray.
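As a rough illustration of the color scheme, the mapping could be written as follows. The function name, argument names, and type strings are hypothetical, and the answer above does not say how a score of exactly 0.025 is colored; this sketch assumes green.

```python
# Illustrative sketch only; names and the tie-breaking at 0.025 are assumptions.
MCAP_THRESHOLD = 0.025


def track_color(variant_type, is_common=False, mcap_score=None):
    """Return the browser track color for a variant, or None if not listed."""
    if variant_type == "noncoding":
        return None  # non-coding variants are not listed in the browser
    if variant_type in ("stopgain", "stoploss"):
        return "dark gray"
    if variant_type == "synonymous":
        return "light gray"
    if variant_type == "nonsynonymous":
        if is_common:
            return "light gray"  # common variants are not scored
        if mcap_score is None:
            raise ValueError("rare nonsynonymous variants need an M-CAP score")
        # Assumption: a score of exactly 0.025 is treated as likely benign.
        return "red" if mcap_score > MCAP_THRESHOLD else "green"
    raise ValueError(f"unknown variant type: {variant_type!r}")


print(track_color("nonsynonymous", mcap_score=0.3))
print(track_color("stopgain"))
```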
Q4. How are M-CAP scores assigned?
M-CAP scores are between 0 and 1. A threshold of 0.025 is used to separate likely benign from possibly pathogenic variants. The threshold of 0.025 corresponds to a 95% confidence threshold that variants marked likely benign are in fact benign. Stopgain and stoploss mutations, which our classifier does not score, are labeled with a score of +10, while common nonsynonymous variants and synonymous variants are assigned a score of -10.
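When reading the scores back, the placeholder values +10 and -10 need to be separated from real classifier output. A minimal sketch, assuming the score arrives as a plain number (the function name and the returned labels are hypothetical):

```python
# Illustrative sketch only; function name and label strings are assumptions.
MCAP_THRESHOLD = 0.025


def interpret_score(score):
    """Translate a track score into a human-readable label."""
    if score == 10:
        return "stopgain/stoploss (not scored)"
    if score == -10:
        return "common nonsynonymous or synonymous (not scored)"
    if 0 <= score <= 1:
        # Assumption: a score of exactly 0.025 falls on the benign side.
        if score > MCAP_THRESHOLD:
            return "possibly pathogenic"
        return "likely benign"
    raise ValueError(f"not a valid M-CAP track score: {score!r}")


for s in (0.4, 0.01, 10, -10):
    print(s, "->", interpret_score(s))
```

Because both placeholder values lie outside the 0 to 1 range, they cannot be confused with a genuine classifier score.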
Q5. How is the mutation type selected if the mutation occurs on a gene with multiple isoforms, and the type of mutation corresponding to different isoforms is different?
In this case, we follow a hierarchy whose order is (1) nonsynonymous, (2) stopgain, (3) stoploss, (4) synonymous, (5) noncoding. That is, if multiple isoforms with different corresponding mutation types exist, the mutation type that comes first in the hierarchy is displayed. For example, if the mutation is nonsynonymous in one isoform and stopgain in another, the mutation type displayed will be nonsynonymous. Nonsynonymous is at the top of the hierarchy because the classifier scores only nonsynonymous variants, and stopgain and stoploss are prioritized over synonymous because they are thought to be more likely pathogenic than synonymous mutations.
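The hierarchy amounts to picking the highest-priority type among the isoforms. A minimal sketch, assuming each isoform's mutation type is given as one of the five strings below (the function and variable names are hypothetical):

```python
# Illustrative sketch only; names are assumptions, the order is from the text.
HIERARCHY = ["nonsynonymous", "stopgain", "stoploss", "synonymous", "noncoding"]
RANK = {name: position for position, name in enumerate(HIERARCHY)}


def displayed_type(isoform_types):
    """Return the mutation type shown when isoforms disagree.

    `isoform_types` is a non-empty list with one mutation type per isoform.
    The type that appears earliest in HIERARCHY wins.
    """
    if not isoform_types:
        raise ValueError("at least one isoform is required")
    return min(isoform_types, key=lambda t: RANK[t])


print(displayed_type(["stopgain", "nonsynonymous"]))
print(displayed_type(["synonymous", "stoploss", "noncoding"]))
```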