Individual GenPept proteins used in IEDB data are assigned to parent proteins from reference proteomes by sequence homology. These reference proteomes are graded with a star system that reflects the quality and completeness of each proteome.
★★★ For some well-studied species, UniProt provides reference proteomes that contain the full set of proteins expressed by the species. For some bacterial species with inconsistent protein expression, additional proteins have been added to the reference proteome to create metaproteomes. These reference proteomes or metaproteomes are designated by three stars.
★★ For other species that have been completely sequenced, UniProt provides complete proteomes. In addition, for some species expressing allergens, formal nomenclature designated by the International Union of Immunological Societies (IUIS) exists to describe these allergens. Complete proteomes that are not considered reference proteomes, or ones that contain formal IUIS allergen nomenclature for a subset of proteins, are designated by two stars.
★ For some species, a proteome does not currently exist in UniProt, but GenBank provides a set of proteins representative of the species. These GenBank proteomes are designated by a single star.
☆ For species that have no proteome in UniProt or GenBank, and no IUIS nomenclature, UniProt may still contain some records that can be used as parents. This case is designated with an unfilled star.
Species having no proteome in either UniProt or GenBank are designated by no stars.
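The species-level grading above can be read as a simple decision cascade. The following is a minimal sketch of that reading, not the IEDB implementation: the class name, field names, and function name are hypothetical, and it assumes that the highest applicable grade wins and that IUIS nomenclature alone is enough for two stars.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeciesEvidence:
    """Hypothetical flags describing what is available for one species."""
    has_reference_proteome: bool = False   # UniProt reference proteome or metaproteome
    has_complete_proteome: bool = False    # UniProt complete (non-reference) proteome
    has_iuis_nomenclature: bool = False    # formal IUIS allergen nomenclature exists
    has_genbank_proteome: bool = False     # representative GenBank protein set
    has_uniprot_records: bool = False      # stray UniProt records usable as parents


def proteome_stars(evidence: SpeciesEvidence) -> str:
    """Return the star grade, checking the best-supported case first (assumed order)."""
    if evidence.has_reference_proteome:
        return "★★★"
    if evidence.has_complete_proteome or evidence.has_iuis_nomenclature:
        return "★★"
    if evidence.has_genbank_proteome:
        return "★"
    if evidence.has_uniprot_records:
        return "☆"
    return ""  # no stars


if __name__ == "__main__":
    print(proteome_stars(SpeciesEvidence(has_genbank_proteome=True)))
```

Checking the cases from best to worst keeps each species at the highest grade its data supports, which is how the list above is ordered.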
Within each species’ proteome, individual “parent” proteins serve to group multiple distinct GenPept sequences. These GenPept entries are the “children” for each proteome protein in the Molecule Tree. This allows users to search IEDB data by selecting the parent protein from the reference proteome, rather than having to select each individual GenPept entry. The “parent” proteins within each proteome also use stars to denote the quality of information provided by each.
★★ UniProt reviewed proteins or proteins having official IUIS allergen nomenclature have two stars.
★ UniProt unreviewed proteins or proteins from GenBank have a single star.
☆ Nodes of the protein branch of the Molecule Tree that contain GenPept and IEDB internal protein accessions with no homology to any protein within a reference proteome are designated with an unfilled star.
Organizational nodes, used by the Molecule Tree to clarify the relationship between groups of similar proteins, have no stars. An example of these nodes is “Immunoglobulin”, which groups all immunoglobulin proteins from a single species.
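The protein-level stars and the parent/child grouping described above can be sketched as a lookup table plus a mapping from each parent to its GenPept children. This is an illustrative sketch only: the enum, the function names, and the placeholder accessions such as PARENT_A and GENPEPT_1 are hypothetical and do not correspond to real IEDB, UniProt, or GenBank identifiers.

```python
from enum import Enum


class NodeKind(Enum):
    """Hypothetical categories for nodes in the protein branch of the Molecule Tree."""
    UNIPROT_REVIEWED = "uniprot_reviewed"
    IUIS_ALLERGEN = "iuis_allergen"
    UNIPROT_UNREVIEWED = "uniprot_unreviewed"
    GENBANK = "genbank"
    NO_HOMOLOG = "no_homolog"          # GenPept / IEDB internal accessions without a parent
    ORGANIZATIONAL = "organizational"  # grouping nodes such as "Immunoglobulin"


# Star grade for each node category, following the list above.
NODE_STARS = {
    NodeKind.UNIPROT_REVIEWED: "★★",
    NodeKind.IUIS_ALLERGEN: "★★",
    NodeKind.UNIPROT_UNREVIEWED: "★",
    NodeKind.GENBANK: "★",
    NodeKind.NO_HOMOLOG: "☆",
    NodeKind.ORGANIZATIONAL: "",
}


def node_stars(kind: NodeKind) -> str:
    """Return the star grade for a node category."""
    return NODE_STARS[kind]


# Placeholder parent -> children mapping (all accessions are made up).
CHILDREN = {
    "PARENT_A": ["GENPEPT_1", "GENPEPT_2"],
    "PARENT_B": ["GENPEPT_3"],
}


def expand_parent(parent: str, children: dict = CHILDREN) -> list:
    """Selecting a parent stands in for selecting every one of its GenPept children."""
    return list(children.get(parent, []))


if __name__ == "__main__":
    print(node_stars(NodeKind.GENBANK), expand_parent("PARENT_A"))
```

In this sketch, a search on a parent protein is expanded to all of its child entries, which mirrors the stated purpose of the grouping: users select one parent instead of each individual GenPept entry.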