Model Description

ESM-2 (Evolutionary Scale Modeling) is a family of transformer-based protein language models developed by Meta AI’s Fundamental AI Research (FAIR) Protein Team. Trained on large-scale protein sequence datasets such as UniRef50, ESM-2 learns representations that encode structural and functional information about proteins without requiring multiple sequence alignments.

This implementation supports a range of ESM-2 model sizes and includes variable sequence length (VSL) support for improved training efficiency on shorter protein sequences. Models are pretrained with a masked language modeling (MLM) objective similar to BERT’s.
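
As a rough sketch of that objective (not the ModelZoo's actual data pipeline), the PyTorch snippet below masks a fraction of amino-acid token IDs and computes the cross-entropy loss over the masked positions. The token IDs, vocabulary size, and mask probability here are illustrative assumptions; in practice these settings typically come from the YAML configuration files described below.

```python
import torch

# Toy constants; the real ESM-2 vocabulary, mask token, and padding ID differ.
PAD_ID, MASK_ID, VOCAB_SIZE = 0, 32, 33
MASK_PROB = 0.15  # BERT-style masking rate

def mask_tokens(input_ids: torch.Tensor):
    """Mask ~15% of non-padding tokens (always with <mask>, omitting BERT's 80/10/10 split)."""
    labels = input_ids.clone()
    candidates = (input_ids != PAD_ID) & (torch.rand_like(input_ids, dtype=torch.float) < MASK_PROB)
    labels[~candidates] = -100          # unmasked positions are ignored by the loss
    masked = input_ids.clone()
    masked[candidates] = MASK_ID
    return masked, labels

# Batch of already-tokenized protein sequences (batch=4, seq_len=64), IDs chosen arbitrarily.
batch = torch.randint(4, 24, (4, 64))
masked_ids, labels = mask_tokens(batch)

# With model logits of shape (batch, seq_len, vocab), the pretraining loss is
# cross-entropy over the masked positions only.
logits = torch.randn(4, 64, VOCAB_SIZE)
loss = torch.nn.functional.cross_entropy(
    logits.view(-1, VOCAB_SIZE), labels.view(-1), ignore_index=-100
)
print(loss.item())
```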

Code Structure

The code for this model is located in the esm2 directory within the ModelZoo. It reuses the ModelZoo's shared training infrastructure and adds data processors tailored to protein sequence modeling; a simplified sketch of such a processor follows the file list below.

  • configs/: YAML configuration files for training various ESM-2 model sizes.
  • model.py: Top-level wrapper for initializing ESM-2 model instances and integrating with training.
  • esm2_pretrain_models.py: Core model architecture implementation.
  • utils.py: Helper utilities for config parsing and data formatting.
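
As a rough illustration of what a protein-sequence data processor does, the sketch below maps raw amino-acid strings to token IDs and pads them to a fixed length. The vocabulary, special tokens, and maximum length shown here are assumptions for illustration, not the ModelZoo's actual implementation.

```python
# Illustrative amino-acid tokenizer; the real ESM-2 vocabulary and special tokens differ.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VOCAB = {"<pad>": 0, "<cls>": 1, "<eos>": 2, "<unk>": 3}
VOCAB.update({aa: i + 4 for i, aa in enumerate(AMINO_ACIDS)})

def encode(sequence: str, max_len: int = 128) -> list[int]:
    """Map a protein sequence to token IDs: <cls> SEQ <eos>, padded to max_len."""
    ids = [VOCAB["<cls>"]]
    ids += [VOCAB.get(aa, VOCAB["<unk>"]) for aa in sequence.upper()[: max_len - 2]]
    ids.append(VOCAB["<eos>"])
    ids += [VOCAB["<pad>"]] * (max_len - len(ids))
    return ids

print(encode("MKTAYIAKQR", max_len=16))
```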

Available Configurations

  • params_esm2_t12_35M_UR50D.yaml: ESM-2 model with 12 layers and ~35M parameters.
  • params_esm2_t33_650M_UR50D.yaml: ESM-2 model with 33 layers and ~650M parameters.
  • params_esm2_t33_650M_UR50D_vsl.yaml: ESM-2 650M model with Variable Sequence Length (VSL) enabled for efficient training.
  • params_esm2_t36_3B_UR50D.yaml: ESM-2 model with 36 layers and ~3B parameters.
  • params_esm2_t48_15B_UR50D.yaml: ESM-2 model with 48 layers and ~15B parameters.
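
A run is typically launched by pointing the training entry point at one of these YAML files. The snippet below is a minimal sketch of inspecting such a file with PyYAML before launching; the top-level section names mentioned in the comment are assumptions about the schema, not guaranteed field names.

```python
import yaml  # PyYAML

# Path assumes you are inside the esm2 model directory of a ModelZoo checkout.
config_path = "configs/params_esm2_t33_650M_UR50D.yaml"

with open(config_path) as f:
    params = yaml.safe_load(f)

# Print the top-level sections; the exact schema (e.g. trainer/model/data blocks)
# varies between ModelZoo releases, so this only inspects whatever is present.
for section, value in params.items():
    print(f"{section}: {type(value).__name__}")
```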

Workflow

For example workflows using language models from the Cerebras Model Zoo, see our tutorials on pretraining and fine-tuning.

For a complete list of Cerebras ModelZoo CLI commands, see the command reference.
