A. Understanding the color code of the scoring matrix.
1. The row header of the matrix denotes the position and the column header denotes the amino acid.
A green cell in the row header indicates an anchor position.
2. The body of the matrix contains floating-point numbers, which are the values of an SMM prediction matrix.
A green cell indicates that the residue (named in the column header) is a preferred residue at that position (indicated by the row header).
A red cell indicates that the residue is a deleterious residue at that position.
A white cell indicates that the residue is a tolerated residue at that position.
B. Algorithm to derive color code.
1. Determine the canonical length of the MHC motif
The MHC class I molecule can accommodate peptides of variable length, but the variation is rather limited compared to MHC class II. Here we use an empirical approach to determine the canonical motif length from available binding affinity data. For each allele, the length with the highest number of binders is identified as the canonical length.
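The length selection described above can be sketched in a few lines of Python. This is only an illustration: the function name `canonical_length` is hypothetical, the input is assumed to be a list of peptides already classified as binders for one allele (the affinity cutoff that defines a binder is not stated here), and ties are resolved by whichever length is seen first, which is also an assumption.

```python
from collections import Counter

def canonical_length(binder_peptides):
    """Return the peptide length that has the most binders for one allele.

    binder_peptides: list of peptide strings already classified as binders
    (assumption: the binder cutoff is applied before calling this function).
    """
    counts = Counter(len(peptide) for peptide in binder_peptides)
    # most_common(1) returns the (length, count) pair with the highest count;
    # equal counts keep first-seen order (tie handling is an assumption).
    length, _ = counts.most_common(1)[0]
    return length

example_binders = ["SIINFEKL", "GILGFVFTL", "NLVPMVATV", "KLGGALQAK"]
print(canonical_length(example_binders))
```

With three 9-mers and one 8-mer in the example list, the sketch reports 9 as the canonical length.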
2. Determine the anchor positions
The anchor positions of an MHC allele were determined solely from the SMM prediction matrix.
For each position along the MHC binding motif, a spread factor (SF) was calculated.
For each position, the corresponding column of the SMM matrix was extracted, and the SF was derived by subtracting the smallest value in that column from the largest value in that column.
After the SFs were calculated for all positions, the positions were ordered by SF, with the highest SF at the top.
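As a rough sketch of the spread factor calculation, assume the SMM matrix is held as a dictionary that maps each position to a dictionary of residue values. That layout, the function name `spread_factors`, and the toy numbers below are illustrative assumptions, not the actual data format.

```python
def spread_factors(matrix):
    """Compute the spread factor (largest minus smallest value) per position.

    matrix: dict mapping position -> dict mapping residue -> matrix value
    (hypothetical layout). Returns (position, SF) pairs, highest SF first.
    """
    sf = {
        position: max(values.values()) - min(values.values())
        for position, values in matrix.items()
    }
    # Order positions by SF, with the highest SF at the top.
    return sorted(sf.items(), key=lambda item: item[1], reverse=True)

# Toy matrix with two residues per position, for illustration only.
toy_matrix = {
    1: {"A": 0.5, "L": -0.5},
    2: {"A": 1.5, "L": -1.0},
    3: {"A": 0.25, "L": 0.0},
}
for position, sf in spread_factors(toy_matrix):
    print(position, sf)
```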
The following steps were taken to determine the anchor positions.
Step 1: Examine all positions. If a position has an SF greater than 2, it is designated as an anchor position.
Step 2a: If there are 2 or more anchor positions after step 1, stop and go to step 3.
Step 2b: If there are fewer than 2 anchor positions and the next position in the ordered list has an SF greater than 1, designate that position as an anchor. Repeat step 2b until there are 2 anchors.
Step 3: Compare the SF of each non-anchor position to the SFs of the anchor positions. If its SF is within 0.1 of that of an anchor position, designate this position as an anchor.
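The steps above can be sketched as follows. The function name `anchor_positions` and its parameter names are hypothetical. Two details are not specified in the text and are assumptions here: step 2b stops as soon as the next-ranked position fails the SF > 1 test, and step 3 is a single pass that compares against the anchors found in steps 1 and 2 only (with "within 0.1" treated as inclusive).

```python
def anchor_positions(sf_by_position, primary=2.0, secondary=1.0,
                     tolerance=0.1, min_anchors=2):
    """Pick anchor positions from a dict mapping position -> spread factor."""
    ranked = sorted(sf_by_position.items(), key=lambda item: item[1], reverse=True)

    # Step 1: every position with SF greater than 2 is an anchor.
    anchors = [pos for pos, sf in ranked if sf > primary]

    # Step 2: with fewer than 2 anchors, walk down the ranked list and accept
    # positions with SF greater than 1 until there are 2 anchors.
    for pos, sf in ranked:
        if len(anchors) >= min_anchors:
            break
        if pos in anchors:
            continue
        if sf > secondary:
            anchors.append(pos)
        else:
            break  # assumption: stop at the first position that fails

    # Step 3: promote non-anchors whose SF is within 0.1 of an anchor's SF.
    anchor_sfs = [sf_by_position[pos] for pos in anchors]
    for pos, sf in ranked:
        if pos not in anchors and any(abs(sf - a) <= tolerance for a in anchor_sfs):
            anchors.append(pos)

    return sorted(anchors)

toy_sf = {1: 0.5, 2: 2.5, 3: 0.3, 5: 1.45, 6: 0.9, 9: 1.5}
print(anchor_positions(toy_sf))
```

In the toy input, position 2 qualifies in step 1, position 9 is added in step 2b, and position 5 is promoted in step 3 because its SF is within 0.1 of that of position 9.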
3. Determine residue preference at each position
The residue preference at each position of an MHC allele was determined solely from the SMM prediction matrix.
For anchor positions:
Step 1: Determine the best value at this position from the corresponding column of the SMM matrix.
Step 2: If a residue's value in this column of the SMM matrix is within 3 fold of the best value, designate this residue as a preferred residue.
Step 3: If a residue's value in this column of the SMM matrix is within 10 fold of the best value (but not within 3 fold), designate this residue as a tolerated residue.
Step 4: The remaining residues are designated as deleterious residues.
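A minimal sketch of the anchor-position rules is given below. It rests on an assumption that is not stated in this article: matrix values are taken to be on a log10 scale where a lower value means stronger predicted binding, so the "best" value is the minimum and a fold difference is 10 raised to the difference in values. If the matrix uses another scale, the fold conversion must change. The function name `classify_anchor_residues` is hypothetical.

```python
def classify_anchor_residues(column, preferred_fold=3.0, tolerated_fold=10.0):
    """Label each residue at an anchor position.

    column: dict mapping residue -> matrix value.
    Assumption: values are log10-scaled and lower means better binding.
    """
    best = min(column.values())  # Step 1: best value in the column
    labels = {}
    for residue, value in column.items():
        fold_worse = 10 ** (value - best)  # fold difference from the best
        if fold_worse <= preferred_fold:       # Step 2
            labels[residue] = "preferred"
        elif fold_worse <= tolerated_fold:     # Step 3
            labels[residue] = "tolerated"
        else:                                  # Step 4
            labels[residue] = "deleterious"
    return labels

toy_column = {"L": -1.0, "M": -0.8, "V": -0.2, "D": 0.9}
print(classify_anchor_residues(toy_column))
```

Under these assumptions, L and M fall within 3 fold of the best value, V falls within 10 fold, and D lies beyond 10 fold.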
For non-anchor positions:
Step 1: Determine the median value at this position from the corresponding column of the SMM matrix.
Step 2: If a residue's value in this column of the SMM matrix is within 3 fold of the median value, designate this residue as a tolerated residue.
Step 3: If a residue's value in this column of the SMM matrix is more than 3 fold above the median value, designate this residue as a preferred residue.
Step 4: If a residue's value in this column of the SMM matrix is more than 3 fold below the median value, designate this residue as a deleterious residue.
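The non-anchor rules can be sketched in the same way. As before, this assumes log10-scaled matrix values where lower means stronger predicted binding, and it reads "above the median" as "better than the median"; neither point is confirmed by this article, so treat both as assumptions. The function name `classify_non_anchor_residues` is hypothetical.

```python
from statistics import median

def classify_non_anchor_residues(column, fold_cutoff=3.0):
    """Label each residue at a non-anchor position.

    column: dict mapping residue -> matrix value.
    Assumption: values are log10-scaled and lower means better binding.
    """
    med = median(column.values())  # Step 1: median value in the column
    labels = {}
    for residue, value in column.items():
        # Greater than 1 means better than the median, under the assumption above.
        fold_better = 10 ** (med - value)
        if fold_better > fold_cutoff:            # Step 3
            labels[residue] = "preferred"
        elif fold_better < 1.0 / fold_cutoff:    # Step 4
            labels[residue] = "deleterious"
        else:                                    # Step 2
            labels[residue] = "tolerated"
    return labels

toy_column = {"A": 0.0, "L": -0.6, "G": 0.1, "P": 0.7, "S": -0.1}
print(classify_non_anchor_residues(toy_column))
```

Here the median is 0.0, so L (about 4 fold better) is labeled preferred, P (about 5 fold worse) is labeled deleterious, and the rest are tolerated.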