
IEDB Antigens 3.0

Ward Fleri
posted this on August 7, 2012, 4:40 PM

An antigen is the natural source from which an epitope is derived. Antigens can be searched by organism name or by antigen name, using the search fields on the home page:



Both fields offer auto-complete suggestions as soon as one begins typing in the search boxes. The IEDB contains epitope data related to infectious diseases, allergy, autoimmunity, and transplant antigens. Therefore, the organisms from which epitopes are derived may be infectious agents, such as viruses, bacteria, and fungi; allergen sources, such as trees and cats; or self-organisms, such as humans or mice. The antigens derived from these organisms include proteins, such as haemagglutinin from influenza or Fel d 1 from cat, as well as non-peptidic structures, such as lipopolysaccharide (LPS) from bacterial cell walls.

If the free text search does not display the organism or antigen of interest, the Molecule Finder can be used to browse all antigens that have data in the IEDB. To open it, click the "Tree View" icon on the Search Results page of the IEDB:



Finders are available to facilitate selections and control vocabulary usage, which improves query results. The list of possible selections can be quite extensive, and the finders make it easier to choose from large lists. Multiple selections can be made when using finders in a query.

The Molecule Finder is used to facilitate the selection of source antigens, immunogens, and epitopes. Records in the Source Organism Finder, which is contained within the Molecule Finder, come from GenPept, ChEBI, UniProt, and IEDB curators.

The Molecule Finder is designed to include two parallel trees, one for non-peptidic structures and the other for protein molecules.  The first contains the structures curated by the Chemical Entities of Biological Interest (ChEBI) database.  An example is shown below.



The development team determined that the most logical way to group the proteins was by organism. To accomplish this, the NCBI species was determined for each protein in the database. For viruses and bacteria, this involved traversing the NCBI taxonomy from the sub-species (strain) level up to the species level. For each species, a set of reference proteins was selected from UniProt based upon the availability of a complete reference proteome for the species. All GenBank entries used as protein sources for epitopes in the IEDB were BLASTed against the reference proteome set to determine their homologs. These data were used to build the protein tree in a way that mirrors a pruned version of the NCBI taxonomy. The result is a coherent tree that is divided along major taxonomic categories and can be traversed quickly, with proteins grouped logically below each species.

The user can perform a free text search on Name and can specify the source species with the Organism Finder. The figure below shows the results for all Influenza A haemagglutinin (HA) proteins. The user can click Add to populate the Current Selection box with the desired molecule, or click Highlight in Tree to see where it appears in the Protein tree, as shown below.

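For readers who find code easier to follow, the grouping steps described above can be sketched roughly as below. This is only an illustration under stated assumptions, not the IEDB's actual implementation: the taxonomy table, node IDs, accession strings, and function names (species_of, build_protein_tree) are all hypothetical, and the homolog lookup that the IEDB performs with BLAST is replaced here by a precomputed "best reference match" column.

```python
from collections import defaultdict

# Toy taxonomy: node id -> (parent id, rank, name).
# All IDs and names are made up for illustration; they are not NCBI taxonomy IDs.
TAXONOMY = {
    1: (None, "superkingdom", "Viruses"),
    10: (1, "species", "Influenza A virus"),
    101: (10, "no rank", "Influenza A strain X"),
    102: (10, "no rank", "Influenza A strain Y"),
}


def species_of(tax_id, taxonomy=TAXONOMY):
    """Walk parent links upward until a node of rank 'species' is found."""
    current = tax_id
    while current is not None:
        parent, rank, _name = taxonomy[current]
        if rank == "species":
            return current
        current = parent
    return None  # no species-level ancestor exists


# Hypothetical epitope source proteins:
# (source accession, taxonomy id of the strain, best-matching reference protein).
# The third column stands in for the result of the homology search.
SOURCE_PROTEINS = [
    ("SRC_0001", 101, "REF_HA"),
    ("SRC_0002", 102, "REF_HA"),
    ("SRC_0003", 101, "REF_NA"),
]


def build_protein_tree(sources):
    """Group source entries as species -> reference protein -> [source accessions]."""
    tree = defaultdict(lambda: defaultdict(list))
    for accession, tax_id, reference in sources:
        species = species_of(tax_id)
        tree[species][reference].append(accession)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sp: dict(refs) for sp, refs in tree.items()}


protein_tree = build_protein_tree(SOURCE_PROTEINS)
```

In this toy example, both strain-level entries roll up to the same species node, so their source proteins end up grouped under one reference protein below that species.
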

The user can thus select all Influenza A haemagglutinin (HA) proteins by selecting the UniProt parent node of the tree, rather than individually clicking on each of the 100+ different GenBank entries for HA proteins used as epitope sources in the database, as shown below:


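Conceptually, selecting a parent node amounts to selecting that node together with everything beneath it. A minimal sketch of that idea follows; the tree contents, entry names, and the function select_with_descendants are hypothetical and do not reflect how the Molecule Finder is actually built.

```python
# Hypothetical protein tree: parent node -> list of child nodes.
# Leaves stand in for individual source entries; all names are illustrative.
CHILDREN = {
    "Influenza A virus": ["Haemagglutinin (reference)", "Neuraminidase (reference)"],
    "Haemagglutinin (reference)": ["HA_entry_1", "HA_entry_2", "HA_entry_3"],
    "Neuraminidase (reference)": ["NA_entry_1"],
}


def select_with_descendants(node, children=CHILDREN):
    """Return the node and every node below it, in depth-first order."""
    selected = [node]
    for child in children.get(node, []):
        selected.extend(select_with_descendants(child, children))
    return selected


# One selection on the parent node covers all of its source entries.
ha_selection = select_with_descendants("Haemagglutinin (reference)")
```

A single selection on the reference node therefore covers every entry below it, which is what saves the user from clicking each source entry separately.
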

Stars are used to communicate the status of the UniProt reference proteome that was used. More details on the protein tree and the star grading system can be found here:

