About RCSB PDB: Enabling Breakthroughs in Scientific and Biomedical Research and Education
RCSB PDB (RCSB.org) is the US data center for the global Protein Data Bank (PDB) archive of 3D structure data for large biological molecules (proteins, DNA, and RNA) essential for research and education in fundamental biology, health, energy, and biotechnology.
The Protein Data Bank (PDB) was established as the first open-access digital data resource in all of biology and medicine (Historical Timeline). Today it is a leading global resource for experimental data central to scientific discovery.
Through an internet information portal and downloadable data archive, PDB provides access to 3D structure data for the molecules of life, found in all organisms on the planet.
Knowing the 3D structure of a biological macromolecule is essential for understanding its role in human and animal health and disease, its function in plants and food and energy production, and its importance to other topics related to global prosperity and sustainability.
The enormous wealth of 3D structure data stored in the PDB has underpinned significant advances in our understanding of protein architecture, culminating in recent breakthroughs in protein structure prediction accelerated by artificial intelligence and machine learning methods, including deep learning.
RCSB PDB (Research Collaboratory for Structural Bioinformatics PDB) operates the US data center for the global PDB archive, and makes PDB data available at no charge to all data consumers without limitations on usage (Policies).
The Vision of the RCSB PDB is to enable open access to the accumulating knowledge of 3D structure, function, and evolution of biological macromolecules, expanding the frontiers of fundamental biology, biomedicine, and biotechnology.
Recognized experts in fields including, but not limited to, structural biology, cell and molecular biology, computational biology, information technology, and education serve as advisors to the RCSB PDB.
PDB Archive contains >1 TB of Structure Data for Proteins, DNA, and RNA
The cost to replicate the contents of the PDB archive is estimated at $18 billion (USD) (Analysis)
The PDB Archive
- Grows at the rate of nearly 10% per year
- Serves >2 million structure data file downloads per day
- Managed by an international collaboration across the US, Asia, and Europe
- Manages “Big Data” as a global Public Good
- Enables research in subject areas from Agriculture to Zoology (Analysis)
- Contributed data to approximately 1 million published research papers
- Used by >400 biological data resources
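As a rough illustration of what a growth rate of nearly 10% per year implies, the sketch below computes a doubling time under compound growth. It assumes a constant rate of exactly 10%, which is a simplification of the figure quoted above; the function names are hypothetical and chosen only for this example.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed for a quantity to double under constant compound growth."""
    if annual_growth_rate <= 0:
        raise ValueError("annual_growth_rate must be positive")
    # Solve (1 + r) ** t == 2 for t.
    return math.log(2) / math.log(1 + annual_growth_rate)


def projected_size(current_size: float, annual_growth_rate: float, years: float) -> float:
    """Size after `years` of constant compound growth."""
    return current_size * (1 + annual_growth_rate) ** years


# Assumption: growth is treated as exactly 10% per year, every year.
print(f"Doubling time at 10% per year: {doubling_time_years(0.10):.1f} years")
```

Under that assumption the archive would double in size roughly every seven years; actual growth varies from year to year.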
PDB Data Impact
- Basic and applied research
- Patent applications
- Discovery of lifesaving drugs
- Innovations that can lead to new product development and company formation
- STEAM education: PDB-101 provides curricula and online tools for teachers and students
Millions of Data Consumers worldwide served every year
Researchers, scientists, educators, students, curious public, medical professionals, patients, and patient advocates
Public and Private sectors, including pharmaceutical and biotechnology companies
Return on investment of federal funding (Analysis)
Supporting Access to the Biological Molecules of the PDB Archive
- Deposition/Biocuration Services support Data Depositors who deposit the results of their structural studies of biological macromolecules to the PDB. All data deposited undergo expert review. Each structure is examined for self-consistency, standardized using controlled vocabularies, cross-referenced with other biological data resources, and validated for scientific/technical accuracy.
- Archive Management/Access Services support PDB Data Consumers by maintaining the PDB archive, developing and standardizing data dictionaries, enabling global data delivery and DOI registration, and integrating PDB data with other available information.
- Data Exploration Services support PDB Data Consumers in the US and around the world through our open-access web portal RCSB.org that provides tools for structure visualization and analysis.
- Outreach/Education Services for teachers, students, and the general public are primarily delivered via our PDB-101 website (“101”, as in an entry-level course).
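For Data Consumers who prefer programmatic access to the downloadable archive, retrieving a single entry might look like the minimal sketch below. It assumes the public file-download URL pattern https://files.rcsb.org/download/<ID>.<format> and classic four-character PDB IDs; the helper names structure_url and fetch_structure are hypothetical and not part of any RCSB PDB library.

```python
from urllib.request import urlopen

# Assumption: public download endpoint pattern for individual entries.
DOWNLOAD_BASE = "https://files.rcsb.org/download"
SUPPORTED_FORMATS = ("cif", "pdb")


def structure_url(pdb_id: str, fmt: str = "cif") -> str:
    """Build the download URL for one entry (hypothetical helper)."""
    pdb_id = pdb_id.strip().upper()
    # Assumption: classic four-character alphanumeric PDB IDs only.
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError(f"not a four-character PDB ID: {pdb_id!r}")
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{DOWNLOAD_BASE}/{pdb_id}.{fmt}"


def fetch_structure(pdb_id: str, fmt: str = "cif", timeout: float = 30.0) -> str:
    """Download one structure file and return its text (requires network access)."""
    with urlopen(structure_url(pdb_id, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_structure("4HHB") would request the mmCIF file for entry 4HHB (hemoglobin). For bulk downloads, consult the RCSB.org documentation rather than looping over this helper.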
RCSB PDB is supported by grants from the National Science Foundation (DBI-1832184), the US Department of Energy (DE-SC0019749), and the National Cancer Institute, National Institute of Allergy and Infectious Diseases, and National Institute of General Medical Sciences of the National Institutes of Health under grant R01GM133198.
In the past, RCSB PDB was also funded by the National Library of Medicine, the National Center for Research Resources, the National Institute of Biomedical Imaging and Bioengineering, and the National Institute of Neurological Disorders and Stroke.
Other funding awards to RCSB PDB by the NSF and to PDBe by the UK Biotechnology and Biological Sciences Research Council are jointly supporting development of a Next Generation PDB archive (DBI-2019297, PI: S.K. Burley; BB/V004247/1, PI: Sameer Velankar) and new Mol* features (DBI-2129634, PI: S.K. Burley; BB/W017970/1, PI: Sameer.
RCSB PDB supports an international community of users, including biologists (in fields such as structural biology, biochemistry, genetics, and pharmacology); other scientists (such as bioinformaticians and developers of software for data analysis and visualization); students and educators (all levels); media writers, illustrators, and textbook authors; and the general public.
RCSB PDB services have broad impact across research and education. The inaugural RCSB PDB citation (Berman et al., Nucleic Acids Research 2000) is one of the top-cited scientific publications of all time. A 2017 bibliometric analysis performed by Clarivate Analytics showed that the PDB motivated high-quality research throughout the world. Papers citing the PDB had a citation-based impact exceeding the world average in 16 scientific fields, including Biology & Biochemistry, Computer Science, Plant & Animal Sciences, Physics, Environment/Ecology, Mathematics, and Geosciences.
A 2017 economic analysis performed by the Rutgers Office of Research Analytics noted that a reasonable estimate to replicate the PDB data archive at the time was $12 billion.
- Impact of PDB Structures on US FDA Drug Approvals 2010-2016 (PDF)
- Supporting the NSF Big Ideas (PDF)
- Supporting NIH in Medical Research (PDF)
- Supporting the Research Goals of DOE (PDF)
- Impact of PDB Structures on Anti-Cancer Drug Approvals (PDF)
- PDB Structures and the Pandemic (PDF)
- Protein Data Bank and 50 years of Molecular Structures (PDF)
- PDB Citation MeSH Network Explorer
Worldwide Protein Data Bank (wwPDB)
The Worldwide Protein Data Bank (wwPDB) was formed to maintain a single PDB archive of macromolecular structural data that is freely and publicly available to the global community. It consists of organizations that act as deposition, data processing and distribution centers for PDB data. As the US Data Center, RCSB PDB biocurates structures submitted from the Americas and Oceania.
PDB-Dev is a prototype archiving system for structural models obtained using integrative or hybrid modeling.
EMDataResource provides access to 3DEM density maps and metadata, news, events, software tools, data standards, and validation methods.