AlphaFold DB provides open access to over 200 million protein structure predictions to accelerate scientific research.

Background

AlphaFold is an AI system developed by Google DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment.

Google DeepMind and EMBL’s European Bioinformatics Institute (EMBL-EBI) have partnered to create AlphaFold DB to make these predictions freely available to the scientific community. The latest database release contains over 200 million entries, providing broad coverage of UniProt (the standard repository of protein sequences and annotations). We provide individual downloads for the human proteome and for the proteomes of 47 other key organisms important in research and global health. We also provide a download for the manually curated subset of UniProt (Swiss-Prot).

Q8I3H7: May protect the malaria parasite against attack by the immune system. Mean pLDDT 85.57.

In CASP14, AlphaFold was the top-ranked protein structure prediction method by a large margin, producing predictions with high accuracy. While the system still has some limitations, the CASP results suggest AlphaFold has immediate potential to help us understand the structure of proteins and advance biological research.

Let us know how the AlphaFold Protein Structure Database has been useful in your research, or if you have questions not answered in the FAQs, at alphafold@deepmind.com.

If your use case isn’t covered by the database, you can generate your own AlphaFold predictions using Google DeepMind’s Colab notebook or open source code. Both resources also support multimer prediction.

Q8W3K0: A potential plant disease resistance protein. Mean pLDDT 82.24.

What’s new?

Integration of TED data - March 2025

The AlphaFold Protein Structure Database has two significant updates. Firstly, AlphaFold predictions are now enriched with domain assignments from TED (The Encyclopedia of Domains), linking them to CATH classifications for improved interpretability and comparative analysis. You can visualise TED domains alongside predicted aligned error (PAE) plots to analyse domain interactions in complex proteins.

Secondly, we've introduced bulk file downloads, a highly requested feature designed to streamline research workflows. Users can now download up to 100 files at once from search pages and the Foldseek table, in multiple formats including mmCIF, PDB, CSV, and PAE (JSON). Additionally, search results now display pLDDT scores and sequence lengths for quick assessment, and a new pLDDT slider lets users filter for high-confidence structures.
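
As a rough illustration of the kind of confidence filtering the pLDDT slider performs, the sketch below computes a mean pLDDT from PDB-format text and keeps only the structures at or above a cutoff. It assumes that per-residue pLDDT is stored in the B-factor column of the ATOM records, as in AlphaFold DB PDB downloads. The function names, the input dictionary and any cutoff value are hypothetical choices for this example and are not part of the AlphaFold DB site or API.

```python
from statistics import mean


def residue_plddt(pdb_text: str) -> list[float]:
    """Return one pLDDT value per residue, read from C-alpha ATOM records.

    Assumption: PDB fixed-column format, with pLDDT stored in the
    B-factor field (columns 61-66) of each ATOM record.
    """
    scores = []
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name; "CA" is the alpha carbon.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def mean_plddt(pdb_text: str) -> float:
    """Average the per-residue pLDDT values of one structure."""
    scores = residue_plddt(pdb_text)
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    return mean(scores)


def filter_by_plddt(structures: dict[str, str], threshold: float) -> list[str]:
    """Return the IDs whose mean pLDDT is at or above the threshold.

    `structures` maps an arbitrary ID to PDB-format text (hypothetical
    input shape, e.g. files from a bulk download read into memory).
    """
    return [sid for sid, text in structures.items()
            if mean_plddt(text) >= threshold]
```

For example, calling filter_by_plddt(structures, 70.0) on a dictionary of downloaded PDB files would return the IDs of entries whose mean pLDDT is at least 70. Only alpha carbons are read so that each residue contributes exactly once to the mean, regardless of how many atoms it has.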

AF-P86938-F1: Thymine dioxygenase JBP1

What’s next?

We plan to continue updating the database with structures for newly discovered protein sequences, and to improve features and functionality in response to user feedback. Please follow Google DeepMind's and EMBL-EBI’s social channels for updates.

Licence and attribution

All of the data provided is freely available for both academic and commercial use under Creative Commons Attribution 4.0 (CC-BY 4.0) licence terms.

If you use this resource, please cite the following papers:
Jumper, J et al. Highly accurate protein structure prediction with AlphaFold. Nature (2021).
Varadi, M et al. AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences. Nucleic Acids Research (2024).

If you use data from AlphaMissense in your work, please cite the following paper:
Cheng, J et al. Accurate proteome-wide missense variant effect prediction with AlphaMissense. Science (2023).

The structures and data provided in this resource are predictions with varying levels of confidence and should be interpreted carefully. The information is for theoretical modelling only. It is not intended, validated or approved for any clinical use.

EMBL-EBI training

Recorded webinar

Accessing and interpreting predicted protein structures from AlphaFold database

AlphaFold database (AlphaFold DB) provides open access to over 200 million protein structure predictions to accelerate scientific research. This...

Online tutorial

AlphaFold

Proteins are essential components of life; predicting their 3D structure gives researchers insight into their function and role. AlphaFold...