2025-04-30 Deep Learning

AI Model Generates Novel Protein Structures and Sequences

2025-04-08 Source: bair.berkeley.edu/blog/

Repurposing Protein Folding Models for Generation with Latent Diffusion

PLAID is a multimodal generative model that simultaneously generates a protein's 1D sequence and 3D structure by learning the latent space of a protein folding model. It addresses the multimodal co-generation problem: producing both the discrete sequence and the continuous all-atom structural coordinates. The generative model is a diffusion model trained over the latent space of a protein folding model such as ESMFold, so training requires only sequences. This gives access to sequence databases, which are 2-4 orders of magnitude larger than structure databases, and enables compositional prompts by function and organism. A compression model, CHEAP, compresses the joint embedding of protein sequence and structure. The authors report that PLAID samples show better diversity.
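
To make the idea of "a diffusion model over a latent space" concrete, here is a minimal sketch of the forward (noising) step that such a model is trained to invert. It is illustrative only and is not PLAID's code: the cosine schedule, the function names `alpha_bar` and `add_noise`, and the random 8-dimensional vector standing in for a folding-model latent are all assumptions made for this example.

```python
import math
import random

def alpha_bar(t, s=0.008):
    """Cumulative signal fraction for t in [0, 1] (cosine schedule, an assumption)."""
    f = lambda u: math.cos((u + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0.0)

def add_noise(latent, t, rng):
    """Forward diffusion on a flat latent: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps."""
    a = alpha_bar(t)
    eps = [rng.gauss(0.0, 1.0) for _ in latent]
    noisy = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(latent, eps)]
    return noisy, eps

rng = random.Random(0)
# Hypothetical stand-in for a (compressed) folding-model embedding.
# In a real pipeline this would come from a frozen encoder, not random numbers.
latent = [rng.gauss(0.0, 1.0) for _ in range(8)]
noisy, eps = add_noise(latent, t=0.5, rng=rng)
# A denoiser network would be trained to predict eps from (noisy, t).
```

At sampling time the learned denoiser is run in reverse, from pure noise toward a clean latent, which a decoder would then map back to sequence and structure. That decoding stage is outside this sketch.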