PDB Statistics: Growth in Number of Unique Protein Sequences in Released PDB Structures (Cumulative) at Identity 30%

This chart shows the annual and cumulative numbers of unique protein sequences in released PDB structures since the beginning of the PDB archive. It can be viewed at several levels of sequence identity. The cumulative bars show the growth in unique protein sequences (number of polymeric entities) over time. The yearly bars (dark blue) show how many new protein sequences were added in a given year.

Note: The total number of sequence clusters in the statistics table links to the sequence cluster group search result page. To balance performance against accuracy, the statistics are calculated with a default precision threshold. As a result, the statistics count may differ slightly from the actual non-redundant group search result when the result count approaches or exceeds 10,000. The group search result page provides the accurate count; the statistics page shows the trend.

Year  Number of New Protein Sequences  Total Number of Protein Sequences
1976  11                               11
1977  11                               22
1978  3                                25
1979  1                                26
1980  2                                28
1981  7                                35
1982  17                               52
1983  5                                57
1984  9                                66
1985  7                                73
1986  7                                80
1987  7                                87
1988  16                               103
1989  27                               130
1990  27                               157
1991  35                               192
1992  46                               238
1993  131                              369
1994  279                              648
1995  227                              875
1996  260                              1,135
1997  388                              1,523
1998  453                              1,976
1999  609                              2,585
2000  726                              3,311
2001  724                              4,035
2002  771                              4,806
2003  1,046                            5,852
2004  1,482                            7,334
2005  1,561                            8,895
2006  1,806                            10,701
2007  1,955                            12,656
2008  1,878                            14,534
2009  1,858                            16,392
2010  1,870                            18,262
2011  1,665                            19,927
2012  1,762                            21,689
2013  1,840                            23,529
2014  2,183                            25,712
2015  1,981                            27,693
2016  2,112                            29,805
2017  2,268                            32,073
2018  2,280                            34,353
2019  2,392                            36,745
2020  2,875                            39,620
2021  2,322                            41,942
2022  2,907                            44,849
2023  277                              45,126
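
The cumulative column in the table is the running sum of the yearly column. A minimal Python sketch of that relationship follows, using only the first five rows of the table as input. The function name `cumulative_totals` and the dictionary layout are illustrative assumptions, not part of any PDB tool or API.

```python
from itertools import accumulate

# First five (year -> new sequences) rows copied from the table.
yearly_new = {1976: 11, 1977: 11, 1978: 3, 1979: 1, 1980: 2}


def cumulative_totals(new_by_year):
    """Return {year: running total} with years taken in ascending order.

    Hypothetical helper for illustration only.
    """
    years = sorted(new_by_year)
    totals = accumulate(new_by_year[year] for year in years)
    return dict(zip(years, totals))


totals = cumulative_totals(yearly_new)
print(totals)
# {1976: 11, 1977: 22, 1978: 25, 1979: 26, 1980: 28}
```

The printed totals match the "Total Number of Protein Sequences" column for 1976 through 1980.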