ESM3

ESM3 is a multimodal generative protein language model that jointly models sequence, structure, and function. It enables controllable generation of novel proteins by conditioning on any combination of these modalities.

Protein Language Model • Version 2024-06

Get Started

Quickstart Guide

1. Install the esm Python package

Shell
pip install esm@git+https://github.com/Biohub/esm.git@main

2. Create an API key

3. Connect to the Biohub Platform API

Python
from esm.sdk.forge import ESM3ForgeInferenceClient

# Replace the placeholder with the API key created in step 2.
client = ESM3ForgeInferenceClient(model="esm3-medium-2024-08", url="https://biohub.ai", token="<your API token>")

4. Run your inference
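
The final step is not spelled out on this page. As a minimal sketch, assuming the `ESMProtein` and `GenerationConfig` classes from `esm.sdk.api` and the client's `generate` method behave as in the esm package's published examples, a sequence prompt can fix a motif and mark every position to be generated with `_`. The helper names `build_masked_prompt` and `generate_sequence`, the `ESM_API_TOKEN` environment variable, and the `num_steps` and `temperature` values are illustrative assumptions, not part of the platform documentation.

```python
import os


def build_masked_prompt(length: int, motif: str, start: int) -> str:
    """Return a prompt of `length` residues with `motif` fixed at `start`.

    Every other position is "_", assumed here to be the mask character for
    residues the model should generate.
    """
    if start < 0 or start + len(motif) > length:
        raise ValueError("motif does not fit inside the requested length")
    return "_" * start + motif + "_" * (length - start - len(motif))


def generate_sequence(prompt: str, model: str = "esm3-medium-2024-08"):
    """Send the prompt to the API and return whatever the client returns.

    The esm imports live inside the function so the helper above can be used
    without the package installed. Requires network access and a valid token.
    """
    from esm.sdk.api import ESMProtein, GenerationConfig
    from esm.sdk.forge import ESM3ForgeInferenceClient

    client = ESM3ForgeInferenceClient(
        model=model,
        url="https://biohub.ai",
        token=os.environ["ESM_API_TOKEN"],  # hypothetical variable name
    )
    # num_steps and temperature are illustrative values, not recommendations.
    config = GenerationConfig(track="sequence", num_steps=8, temperature=0.7)
    return client.generate(ESMProtein(sequence=prompt), config)


# Fix a five-residue motif at position 5 of a 20-residue design.
prompt = build_masked_prompt(length=20, motif="HEAGH", start=5)
print(prompt)
```

Calling `generate_sequence(prompt)` would then submit the request; inspect the returned object before use, since a failed request may not yield a protein.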

Model Details

Model Card


Version

2024-03

Architecture

Transformer

Supported Modalities

Sequence, structure, function

Training Data

3 billion+ sequences and 700 billion+ unique training tokens

Intended Use

ESM3 is designed for prompt-driven generation of sequences and structures based on inputs of motifs, partial coordinates, secondary structure (SS) constraints, or function keywords.
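
To make one of these prompt types concrete, here is a minimal sketch of assembling a secondary structure constraint as a per-residue string. It assumes the eight-state DSSP alphabet (for example `H` for alpha helix, `E` for beta strand, `C` for coil) with `_` marking unconstrained positions, and that the resulting string would be supplied through the `secondary_structure` field of `ESMProtein`; confirm both against the esm package before relying on them. `ss_prompt` is a hypothetical helper.

```python
SS8_STATES = set("GHITEBSC")  # DSSP eight-state codes; "C" used for coil
MASK = "_"  # assumed marker for "no constraint at this position"


def ss_prompt(segments: list[tuple[str, int]]) -> str:
    """Expand (state, length) pairs into one character per residue."""
    parts = []
    for state, length in segments:
        if state != MASK and state not in SS8_STATES:
            raise ValueError(f"unknown secondary structure state: {state!r}")
        if length < 0:
            raise ValueError("segment length must be non-negative")
        parts.append(state * length)
    return "".join(parts)


# A helix, an unconstrained loop, then a strand.
constraint = ss_prompt([("H", 8), (MASK, 4), ("E", 6)])
print(constraint)
```

The constraint string has one character per residue, so its length should match the length of any sequence prompt it is combined with.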

Limitations & Risks

Novel sequence generation can produce designs with hazardous properties. Model proposals may not be physically realizable; the pLDDT and pTM confidence scores help flag unreliable designs but are imperfect. ESM3 is not intended for clinical or therapeutic applications without further validation.
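
Because the confidence scores are imperfect, they are better suited to coarse filtering ahead of further validation than to proving that a design folds. A minimal sketch of such a filter follows; the `Candidate` record, the `passes_confidence` helper, the 0-to-1 score scale, and both thresholds are assumptions chosen for illustration, not values from this model card.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Candidate:
    name: str
    plddt: list[float]  # per-residue confidence, assumed on a 0-1 scale
    ptm: float          # global confidence, assumed on a 0-1 scale


def passes_confidence(c: Candidate, min_plddt: float = 0.8, min_ptm: float = 0.7) -> bool:
    """Keep a design only if both scores clear their (hypothetical) cutoffs."""
    return mean(c.plddt) >= min_plddt and c.ptm >= min_ptm


candidates = [
    Candidate("design_a", [0.9, 0.85, 0.95], ptm=0.82),
    Candidate("design_b", [0.6, 0.7, 0.65], ptm=0.75),  # low mean pLDDT
    Candidate("design_c", [0.9, 0.9, 0.9], ptm=0.4),    # low pTM
]
kept = [c.name for c in candidates if passes_confidence(c)]
print(kept)
```

Designs that pass a filter like this still need independent validation; the cutoffs only reduce the number of candidates worth examining.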

This model is released under the MIT License.