Improving carbohydrates in the PDB for 2020
In July 2020, the wwPDB will roll out updated PDB structures and reference data files with a standardized representation of carbohydrate molecules, improving the Findability and Interoperability of PDB data. Detailed information about this work is available from the wwPDB website, including PDBx/mmCIF dictionary extensions and over 500 example files. We encourage developers of software packages that produce, access, or visualize PDB data to review this information and adapt their software accordingly.
Through collaboration with the glycoscience community, software tools were developed to standardize the atom nomenclature of nearly 800 monosaccharides in the Chemical Component Dictionary (CCD) and to apply a branched polymeric representation to oligo- and polysaccharides within the PDB archive, enabling easy translation to other representations commonly used by glycobiologists. To guarantee an unambiguous chemical description of the oligo-/polysaccharides in each of the nearly 12,000 affected PDB entries, we have included an explicit description of the covalent linkages between their monomeric units. To ensure continued Findability of common oligosaccharides (e.g., sucrose, Lewis X factor), we have expanded the Biologically Interesting molecule Reference Dictionary (BIRD), which will contain the covalent linkage information and common synonyms for such molecules.
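As an illustration of why explicit linkage records make a branched description unambiguous, the following minimal sketch models the common N-glycan core (two NAG units, one BMA, two MAN) as numbered units plus one record per covalent linkage. The `Linkage` class, its field names, and the helper functions are hypothetical and chosen only for this example; they are not the PDBx/mmCIF data items, which are defined in the dictionary extensions on the wwPDB website.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Linkage:
    """One covalent bond between two monosaccharide units.

    Hypothetical record for illustration; not the wwPDB schema.
    """
    child: int        # unit number of the residue donating its anomeric carbon
    child_atom: str   # atom name on the child unit
    parent: int       # unit number of the residue it is attached to
    parent_atom: str  # atom name on the parent unit


# Units of the N-glycan core, keyed by an arbitrary unit number.
units = {1: "NAG", 2: "NAG", 3: "BMA", 4: "MAN", 5: "MAN"}

# One record per glycosidic bond; unit 3 carries two children (a branch).
linkages = [
    Linkage(2, "C1", 1, "O4"),
    Linkage(3, "C1", 2, "O4"),
    Linkage(4, "C1", 3, "O3"),
    Linkage(5, "C1", 3, "O6"),
]


def root_unit(units, linkages):
    """Return the single unit that is never a child (the reducing end)."""
    children = {link.child for link in linkages}
    roots = [number for number in units if number not in children]
    if len(roots) != 1:
        raise ValueError("expected exactly one root unit")
    return roots[0]


def branch_points(linkages):
    """Return unit numbers that have more than one child attached."""
    children_of = defaultdict(list)
    for link in linkages:
        children_of[link.parent].append(link.child)
    return sorted(p for p, kids in children_of.items() if len(kids) > 1)


def describe(units, linkages):
    """Render each linkage as a readable line of text."""
    return [
        f"{units[l.child]} {l.child} {l.child_atom} -> "
        f"{l.parent_atom} {units[l.parent]} {l.parent}"
        for l in linkages
    ]


if __name__ == "__main__":
    print("root unit:", root_unit(units, linkages))
    print("branch points:", branch_points(linkages))
    for line in describe(units, linkages):
        print(line)
```

Because every bond names both units and both atoms, the tree can be rebuilt without relying on residue order or interatomic distances, which is what makes conversion to other glycan notations straightforward.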
The wwPDB is also taking this opportunity to improve the organization of chemical synonyms in the CCD by introducing a new _pdbx_chem_comp_synonyms data category. This will enable more comprehensive capture of alternative names for small molecules in the PDB. To minimize disruption to users, there will be an initial transition period during which the legacy data item, _chem_comp.pdbx_synonyms, will be retained.
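Software that reads synonyms may want to handle both forms during the transition period. The sketch below prefers the new category and falls back to the legacy item. It assumes the CCD file has already been parsed into a plain dictionary mapping category names to lists of row dictionaries (a hypothetical in-memory form, not the output of any particular mmCIF library), that each row of the new category carries a `name` field, and that the legacy value is a single semicolon-separated string; check these assumptions against the dictionary extensions before relying on them.

```python
# mmCIF placeholders for "unknown" and "not applicable".
_MISSING = {None, "", "?", "."}


def collect_synonyms(ccd):
    """Return synonym names for one chemical component.

    `ccd` is a hypothetical parsed form: {category_name: [row_dict, ...]}.
    The new pdbx_chem_comp_synonyms category is preferred; the legacy
    chem_comp.pdbx_synonyms item is used only when the new one is absent.
    """
    rows = ccd.get("pdbx_chem_comp_synonyms", [])
    names = [row["name"] for row in rows if row.get("name") not in _MISSING]
    if names:
        return names

    # Fallback: legacy item, assumed to be one semicolon-separated string.
    chem_comp_rows = ccd.get("chem_comp", [])
    legacy = chem_comp_rows[0].get("pdbx_synonyms") if chem_comp_rows else None
    if legacy in _MISSING:
        return []
    return [part.strip() for part in legacy.split(";") if part.strip()]


# Example inputs (illustrative values only).
new_style = {
    "chem_comp": [{"id": "GLC", "pdbx_synonyms": "alpha-D-glucose; D-glucose"}],
    "pdbx_chem_comp_synonyms": [
        {"comp_id": "GLC", "name": "alpha-D-glucose"},
        {"comp_id": "GLC", "name": "D-glucose"},
        {"comp_id": "GLC", "name": "glucose"},
    ],
}
legacy_only = {
    "chem_comp": [{"id": "GLC", "pdbx_synonyms": "alpha-D-glucose; D-glucose"}],
}
no_synonyms = {"chem_comp": [{"id": "XXX", "pdbx_synonyms": "?"}]}

if __name__ == "__main__":
    print(collect_synonyms(new_style))
    print(collect_synonyms(legacy_only))
    print(collect_synonyms(no_synonyms))
```

Preferring the new category means the code keeps working once the legacy item is eventually withdrawn, while older files without the category are still handled.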