Access Improved Carbohydrate Data at the PDB
A new data representation for carbohydrates in PDB entries and reference data improves the Findability and Interoperability of these molecules in macromolecular structures. The PDB archive now reflects:
- Standardized Chemical Component Dictionary nomenclature following IUPAC-IUBMB recommendations
- Uniform representation for oligosaccharides
- Adoption of linear descriptors commonly used by the glycoscience community, generated with community tools
- Annotated glycosylation sites in PDB structures
Detailed information about this project, including a list of remediated PDB entries, is available at the wwPDB website. Developers of software packages that produce, access, or visualize PDB data are encouraged to review this information and adapt their software as soon as possible, as originally highlighted in the February 2020 announcement.
The wwPDB has created a new ‘branched’ entity representation for polysaccharides that describes each individual monosaccharide component in the PDB entry. As part of this process, we have standardized the atom nomenclature of >1,000 monosaccharides in the Chemical Component Dictionary (CCD) and applied the branched entity representation to oligosaccharides (>8,000 PDB entries). To guarantee an unambiguous chemical description of oligosaccharides in the affected PDB entries, the covalent linkages between their monosaccharide units are described explicitly. In addition, wwPDB validation reports provide a consistent representation for these oligosaccharides and include 2D representations based on the Symbol Nomenclature for Glycans (SNFG).
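As a rough illustration of how software might consume explicit linkage information, the sketch below groups link records into a parent-to-children map for one branched entity. The record layout and key names (`num_1`, `comp_id_1`, `atom_id_1`, and so on) are hypothetical stand-ins loosely modeled on mmCIF-style linkage items, and the example trisaccharide is made up for demonstration. Consult the PDBx/mmCIF dictionary for the actual category and item names before adapting it.

```python
from collections import defaultdict

# Hypothetical link records for one branched entity (a NAG-NAG-BMA trisaccharide).
# Key names are assumptions for illustration, not confirmed mmCIF item names.
links = [
    {"num_1": 2, "comp_id_1": "NAG", "atom_id_1": "C1",
     "num_2": 1, "comp_id_2": "NAG", "atom_id_2": "O4"},
    {"num_1": 3, "comp_id_1": "BMA", "atom_id_1": "C1",
     "num_2": 2, "comp_id_2": "NAG", "atom_id_2": "O4"},
]


def build_linkage_tree(link_records):
    """Map each parent unit number to a list of (child unit number, label)."""
    children = defaultdict(list)
    for rec in link_records:
        # Unit 1 of each record is treated as the child attached to unit 2.
        label = (
            f'{rec["comp_id_1"]}{rec["num_1"]} '
            f'{rec["atom_id_1"]}-{rec["atom_id_2"]} '
            f'{rec["comp_id_2"]}{rec["num_2"]}'
        )
        children[rec["num_2"]].append((rec["num_1"], label))
    return dict(children)


tree = build_linkage_tree(links)
for parent, attached in sorted(tree.items()):
    for child, label in attached:
        print(f"unit {parent} <- unit {child}: {label}")
```

Keeping the atom names in each label preserves the linkage position (here C1 to O4), which is what makes the description of a branched oligosaccharide unambiguous.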
To support the remediation of carbohydrate representation, software tools providing linear descriptors were developed in collaboration with the glycoscience community to enable easy translation of PDB data to other representations commonly used by glycobiologists. These include Condensed IUPAC from GMML at the University of Georgia, WURCS from PDB2Glycan at The Noguchi Institute, Japan, and LINUCS from pdb-care in Germany.
wwPDB has also used this opportunity to improve the organization of chemical synonyms in the CCD by introducing a new _pdbx_chem_comp_synonyms data category. This will enable more comprehensive capture of alternative names for small molecules in the PDB. To minimize disruption to users, the legacy data item, _chem_comp.pdbx_synonyms, will be retained for a transition period through 2021.
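During the transition period, software that reads synonyms may want to prefer the new category and fall back to the legacy item. The sketch below shows one way to do that on an already-parsed component. The dictionary layout, the `name` key inside each synonym row, and the semicolon separator for the legacy string are all assumptions made for illustration; check them against the actual CCD files and the PDBx/mmCIF dictionary.

```python
def get_synonyms(component, legacy_separator=";"):
    """Return synonym names, preferring the new category over the legacy item.

    `component` is a hypothetical parsed representation:
      {"pdbx_chem_comp_synonyms": [{"name": ...}, ...],
       "chem_comp": {"pdbx_synonyms": "..."}}
    """
    rows = component.get("pdbx_chem_comp_synonyms") or []
    names = [row["name"] for row in rows if row.get("name")]
    if names:
        return names

    # Fall back to the legacy single-string item.
    legacy = component.get("chem_comp", {}).get("pdbx_synonyms")
    if not legacy or legacy in ("?", "."):  # mmCIF markers for unknown / not applicable
        return []
    # The separator is an assumption; pass a different one if the data differ.
    return [part.strip() for part in legacy.split(legacy_separator) if part.strip()]


# Usage with made-up data
new_style = {"pdbx_chem_comp_synonyms": [{"name": "synonym A"}, {"name": "synonym B"}]}
old_style = {"chem_comp": {"pdbx_synonyms": "synonym A; synonym B"}}
print(get_synonyms(new_style))
print(get_synonyms(old_style))
```

Returning a list in both cases lets downstream code ignore which representation a given file uses.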
The carbohydrate remediation project is a wwPDB collaborative project carried out principally by RCSB PDB at Rutgers, The State University of New Jersey. It is funded by the NIH Common Fund Glycoscience Program through National Cancer Institute cooperative agreement U01 CA221216 to Dr. Robert Woods at the Complex Carbohydrate Research Center at the University of Georgia, in collaboration with Dr. Jasmine Young as sub-awardee at RCSB PDB at Rutgers.