Here we report the results of the prediction of HLA class II restricted T cell epitopes for the Ebola virus (EBOV) envelope glycoprotein (GP). These results are the first installment of a series of analyses whose ultimate goal is to provide a comprehensive analysis of the molecular targets of the immune response to Ebola virus, to assist the scientific community in evaluating laboratory results and designing further investigations. Using the IEDB Analysis Resource and the procedure described below, we identified a subset of 142 15-mer peptides from GP that met specific selection criteria and are predicted to be the most dominant CD4+ epitopes (that is, predicted to account for the majority of responses). Of these 142 peptides, 38 were conserved at 80% sequence identity across all analyzed species of EBOV. As a further validation step, we then screened all EBOV-specific T cell response data captured to date in the IEDB (positive and negative responses) against these 38 peptides.
Methods
MHC Class II prediction
All sequences of the EBOV GP protein were extracted from the UniProt database [search script = "Ebolavirus [186536]" AND reviewed: yes]. Of the 11 unique sequences, 4 represented Zaire EBOV, 3 Sudan EBOV, 3 Reston EBOV, and 1 Ivory Coast EBOV. For each of these sequences, we generated 15-mers overlapping by 10 amino acids (aa) using the multiple sequence alignment produced with ClustalW (for the alignment, see the attached file). The epitope prediction analysis was then performed using the IEDB MHC class II prediction tool (IEDB recommended method) on each peptide for each of seven selected HLA class II alleles: DRB1*03:01, DRB1*07:01, DRB1*15:01, DRB3*01:01, DRB3*02:02, DRB4*01:01, DRB5*01:01. Peptides with a median percentile rank of ≤ 20.0 across the seven alleles were selected as the top predicted binders. Previous work [Paul et al., 2014, unpublished] has identified these selection criteria (peptide length, aa overlap, 7 alleles, and the prediction score threshold) as optimal for capturing 50% of class II human T cell responses.
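The peptide generation and selection steps described above can be illustrated with a minimal Python sketch. It is not the IEDB tool itself: the function names and the example percentile values are hypothetical, the input is a plain ungapped sequence rather than the ClustalW alignment used in the report, and the percentile ranks are assumed to have already been obtained from the prediction tool.

```python
from statistics import median

def overlapping_peptides(sequence, length=15, overlap=10):
    """Return (1-based start, peptide) pairs for windows of `length`
    residues in which consecutive windows share `overlap` residues."""
    step = length - overlap  # 15-mers overlapping by 10 advance 5 residues at a time
    return [(start + 1, sequence[start:start + length])
            for start in range(0, len(sequence) - length + 1, step)]

def top_binders(percentiles_by_peptide, threshold=20.0):
    """Keep peptides whose median percentile rank across alleles is <= threshold.

    `percentiles_by_peptide` maps a peptide to its list of per-allele
    percentile ranks (hypothetical input format)."""
    selected = {}
    for peptide, ranks in percentiles_by_peptide.items():
        median_rank = median(ranks)
        if median_rank <= threshold:
            selected[peptide] = median_rank
    return selected

# Toy sequence (not a real GP sequence): windows start at positions 1, 6, 11.
toy_sequence = "ACDEFGHIKLMNPQRSTVWYACDEF"
for start, peptide in overlapping_peptides(toy_sequence):
    print(start, peptide)

# Made-up percentile ranks for seven alleles per peptide.
toy_ranks = {
    "PEP_A": [1, 5, 10, 20, 30, 40, 50],
    "PEP_B": [25, 30, 35, 40, 45, 50, 60],
}
print(top_binders(toy_ranks))
```

Using the median rather than the mean means a peptide is selected when at least four of the seven alleles have a percentile rank at or below the threshold, so a single very poor prediction does not exclude it.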
Conservancy analysis
The conservancy of each predicted GP peptide was analyzed using the Epitope Conservancy Analysis Tool across all EBOV GP protein sequences, using a threshold of 80% sequence identity (i.e., a peptide was considered conserved in a GP protein if at least 12 of its 15 amino acids were identical in that protein).
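The 80% identity criterion can be sketched as follows. This is an illustrative simplification, not the Epitope Conservancy Analysis Tool: it assumes an ungapped sliding-window comparison, the function names are hypothetical, and the "proteins" are short toy fragments built from Table 1 peptides with made-up flanking residues.

```python
def best_match_count(peptide, protein):
    """Highest number of identical positions between the peptide and any
    ungapped window of the same length in the protein."""
    best = 0
    for start in range(len(protein) - len(peptide) + 1):
        window = protein[start:start + len(peptide)]
        matches = sum(a == b for a, b in zip(peptide, window))
        best = max(best, matches)
    return best

def is_conserved(peptide, protein, min_identity_pct=80):
    """True if the best window reaches the identity threshold.
    Integer arithmetic avoids floating-point edge cases at exactly 80%."""
    return best_match_count(peptide, protein) * 100 >= min_identity_pct * len(peptide)

def conserved_in_all(peptide, proteins, min_identity_pct=80):
    """True if the peptide is conserved in every protein in the list."""
    return all(is_conserved(peptide, p, min_identity_pct) for p in proteins)

# Toy fragments: Table 1 peptides #3 and #25 with hypothetical "AAA" flanks.
toy_proteins = [
    "AAA" + "RWGFRSGVPPKVVNY" + "AAA",
    "AAA" + "RWGFRSGVPPQVVSY" + "AAA",
]
query = "RWGFRSGVPPKVVSY"  # Table 1 peptide #1
print([best_match_count(query, p) for p in toy_proteins])
print(conserved_in_all(query, toy_proteins))
```

For a 15-mer, the threshold works out to 12 matching residues, so up to three substitutions are tolerated.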
Results
In all, 142 of the peptides analyzed were predicted to have a median percentile rank of ≤ 20 (range 4.33-19.88). Seventy-eight of these were found to be conserved in more than half of the GP proteins specifically from the Zaire EBOV species, and 60 of these peptides were conserved in more than half of all EBOVs. Supplemental Table 1 (the attached Excel file) provides details for these peptides, including correspondence to the validated T cell epitopes, which are highlighted in red.
Applying the conservancy analysis, we found a total of 38 peptides that were conserved at 80% sequence identity across all EBOV GP sequences. Of this optimized subset of predicted class II epitopes, two were reported in the IEDB as generating positive CD4+ T cell responses in mice: the H-2-IAb-restricted GP peptides RWGFRSGVPPKVVSY (85-99 Sudan) and RWGFRSGVPPKVVNY (85-99 Zaire) (Table 1, in bold; IEDB IDs 187512 and 187511, respectively). Only one of the predicted peptides, GP RQLANETTQALQLFL (559-573) (IEDB ID 187504), was shown to be negative in the limited experimental conditions tested thus far (Table 1). Of note, this sequence is identical in both Zaire and Sudan EBOVs.
We also analyzed the conservancy of the predicted peptides in the EBOV strains currently circulating in West Africa. On Aug 28th, 2014, there were 101 complete sequences of the GP protein available in GenBank: 98 from Sierra Leone (100% identical at the protein sequence level) and 3 from Guinea (sequences were updated in GenBank on Aug 26th), 2 of which were identical. The current GP protein sequence diversity within the outbreak is limited to 1-2 amino acid mutations and can be represented by 3 strains: H.sapiens-wt/GIN/2014/Gueckedou-C05, H.sapiens-wt/GIN/2014/Kissidougou-C15 and H.sapiens-wt/SLE/2014/ManoRiver-G3825.1. All of the peptides provided in Table 1 were conserved in these sequences at the 80% sequence identity threshold. Among the 12 peptides in Table 1 from the Zaire EBOV lineage, all except peptide #27 are 100% conserved from 1976 to the 2014 outbreak.
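Reducing a large set of outbreak sequences to a few representative ones amounts to grouping identical protein sequences. A minimal sketch of that step, assuming sequences are available as a mapping from identifier to protein string (the identifiers and short sequences below are made up for illustration and are not GenBank records):

```python
from collections import defaultdict

def group_identical(sequences_by_id):
    """Group sequence identifiers by exact protein sequence.

    Returns a dict mapping each unique sequence to the list of
    identifiers that share it."""
    groups = defaultdict(list)
    for seq_id, sequence in sequences_by_id.items():
        groups[sequence].append(seq_id)
    return dict(groups)

# Hypothetical identifiers and toy sequences.
toy_sequences = {
    "isolate_1": "MGVTGILQLP",
    "isolate_2": "MGVTGILQLP",
    "isolate_3": "MGVTGILQLA",
}
groups = group_identical(toy_sequences)
print(len(groups), "unique sequences")
for sequence, ids in groups.items():
    print(sequence, ids)
```

Each unique sequence can then be checked once for peptide conservancy instead of repeating the check for every identical copy.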
Analysis in progress for future release
Here we report a first installment of the analysis, focused on prediction of HLA class II restricted T cell epitopes for the Ebola virus (EBOV) envelope glycoprotein (GP). In future releases, we plan to focus on the other EBOV proteins and provide further analysis of HLA class II epitopes, with predicted binding affinities for the most common HLA alleles and a breakdown of population coverage provided by these predicted epitopes in different ethnic groups, as provided by the IEDB population coverage tool.
The analysis of HLA class I/CD8 epitopes for the most common HLA class I molecules and analysis of predicted B cell epitopes will also be pursued, in parallel with a meta-analysis of all experimental data related to the Ebola virus curated in the IEDB.
Table 1. The 38 peptides predicted as the most dominant HLA class II restricted epitopes and found to be conserved across all EBOV species. The two positive epitopes reported in the IEDB, #1 and #3, and the only negative epitope, #35, are in bold font.
| # | Peptide | UniProt Accession | UniProt Organism | Peptide Start Position | Median Consensus Percentile |
| --- | --- | --- | --- | --- | --- |
| 1 | RWGFRSGVPPKVVSY | Q91DD8 | Reston ebolavirus (strain Philippines-96) | 86 | 4.33 |
| 2 | FLYDRLASTIIYRGT | Q66810 | Ivory Coast ebolavirus (strain Cote d'Ivoire-94) | 161 | 4.35 |
| 3 | RWGFRSGVPPKVVNY | P87671 | Zaire ebolavirus (strain Eckron-76) | 86 | 4.53 |
| 4 | DYAFHKDGAFFLYDR | Q66814 | Sudan ebolavirus (strain Boniface-76) | 151 | 5.05 |
| 5 | LRTYTILNRKAIDFL | Q66814 | Sudan ebolavirus (strain Boniface-76) | 586 | 5.28 |
| 6 | AGIGIIGVIIAIIAL | Q91DD8 | Reston ebolavirus (strain Philippines-96) | 661 | 5.68 |
| 7 | FLYDRLASTVIYRGV | Q66814 | Sudan ebolavirus (strain Boniface-76) | 161 | 5.79 |
| 8 | TGVIIAIIALLCICK | Q66810 | Ivory Coast ebolavirus (strain Cote d'Ivoire-94) | 666 | 6.21 |
| 9 | IGVIIAIIALLCICK | Q91DD8 | Reston ebolavirus (strain Philippines-96) | 666 | 6.21 |
| 10 | LRTYSLLNRKAIDFL | Q91DD8 | Reston ebolavirus (strain Philippines-96) | 586 | 6.26 |
| 11 | FLYDRLASTVIYRGT | P87671 | Zaire ebolavirus (strain Eckron-76) | 161 | 6.29 |
| 12 | LRTFSILNRKAIDFL | P87671 | Zaire ebolavirus (strain Eckron-76) | 586 | 6.45 |
| 13 | RWGFRAGVPPKVVNY | Q66810 | Ivory Coast ebolavirus (strain Cote d'Ivoire-94) | 86 | 6.56 |
| 14 | LQLFLRATTELRTFS | P87671 | Zaire ebolavirus (strain Eckron-76) | 576 | 7.18 |
| 15 | LQLFLRATTELRTYT | Q66814 | Sudan ebolavirus (strain Boniface-76) | 576 | 7.97 |
| 16 | LQLFLRATTELRTYS | Q91DD8 | Reston ebolavirus (strain Philippines-96) | 576 | 8.19 |
| 17 | AGIGITGVIIAIIAL | Q66810 | Ivory Coast ebolavirus (strain Cote d'Ivoire-94) | 661 | 8.85 |
| 18 | DLAFHKNGAFFLYDR | Q91DD8 | Reston ebolavirus (strain Philippines-96) | 151 | 10.09 |
| 19 | KEGAFFLYDRLASTI | Q66810 | Ivory Coast ebolavirus (strain Cote d'Ivoire-94) | 156 | 11.14 |
| 20 | KNGAFFLYDRLASTV | Q91DD8 | Reston ebolavirus (strain Philippines-96) | 156 | 11.47 |
| 21 | KEGAFFLYDRLASTV | P87671 | Zaire ebolavirus (strain Eckron-76) | 156 | 11.52 |
| 22 | KDGAFFLYDRLASTV | Q66814 | Sudan ebolavirus (strain Boniface-76) | 156 | 11.53 |
| 23 | GLAFHKEGAFFLYDR | Q66810 | Ivory Coast ebolavirus (strain Cote d'Ivoire-94) | 151 | 11.77 |
| 24 | DFAFHKEGAFFLYDR | P87671 | Zaire ebolavirus (strain Eckron-76) | 151 | 12.12 |
| 25 | RWGFRSGVPPQVVSY | Q66814 | Sudan ebolavirus (strain Boniface-76) | 86 | 12.23 |
| 26 | RATTELRTFSILNRK | P87671 | Zaire ebolavirus (strain Eckron-76) | 581 | 13.41 |
| 27 | TEGLMHNQNGLICGL | P87671 | Zaire ebolavirus (strain Eckron-76) | 551 | 14.34 |
| 28 | AGIGITGIIIAIIAL | Q66814 | Sudan ebolavirus (strain Boniface-76) | 661 | 14.44 |
| 29 | RATTELRTYSLLNRK | Q91DD8 | Reston ebolavirus (strain Philippines-96) | 581 | 14.51 |
| 30 | LICGLRQLANETTQA | P87671 | Zaire ebolavirus (strain Eckron-76) | 561 | 17.18 |
| 31 | LVCGLRQLANETTQA | Q66814 | Sudan ebolavirus (strain Boniface-76) | 561 | 17.18 |
| 32 | ILNRKAIDFLLRRWG | Q66814 | Sudan ebolavirus (strain Boniface-76) | 591 | 17.71 |
| 33 | RATTELRTYTILNRK | Q66814 | Sudan ebolavirus (strain Boniface-76) | 581 | 18.16 |
| 34 | IYRGTTFAEGVVAFL | P87671 | Zaire ebolavirus (strain Eckron-76) | 171 | 18.27 |
| 35 | RQLANETTQALQLFL | P87671 | Zaire ebolavirus (strain Eckron-76) | 566 | 19.41 |
| 36 | ETTQALQLFLRATTE | P87671 | Zaire ebolavirus (strain Eckron-76) | 571 | 19.73 |
| 37 | DKLSSTSQLKSVGLN | Q66810 | Ivory Coast ebolavirus (strain Cote d'Ivoire-94) | 56 | 19.73 |
| 38 | LLNRKAIDFLLQRWG | Q91DD8 | Reston ebolavirus (strain Philippines-96) | 591 | 19.88 |
__________________________________________________________________________________
The report was prepared by Julia Ponomarenko, Kerrie Vaughan, Sinu Paul, Alessandro Sette, and Bjoern Peters. Analysis of 2014 outbreak sequences was conducted by Sebastian Maurer-Stroh of Bioinformatics Institute, A*STAR, Singapore.