Enterprise AI Analysis

AbAffinity: A Large Language Model for Predicting Antibody Binding Affinity against SARS-CoV-2

Machine learning-based antibody design is emerging as one of the most promising approaches to combating infectious diseases, owing to significant advances in artificial intelligence and an exponential surge in experimental antibody data (in particular, data related to COVID-19). The ability of an antibody to bind to an antigen (called binding affinity) is one of the most critical properties in designing neutralizing antibodies. In this study we introduce Ab-Affinity, a new large language model that can accurately predict the binding affinity of antibodies against a target peptide, e.g., the SARS-CoV-2 spike protein. Code and model are available at https://github.com/ucrbioinfo/AbAffinity.

Executive Impact & Key Findings

Ab-Affinity fine-tunes a protein language model to predict antibody binding affinity against a SARS-CoV-2 peptide directly from sequence, reaching a Pearson correlation of 0.652 on the test set.

71,834 Unique Antibodies Used for Training

0.652 Highest Pearson Correlation (Ab-Affinity)

0.712 Highest Spearman Correlation (Ab-Affinity)

Deep Analysis & Enterprise Applications

Model Architecture

Dataset & Training

Experimental Results

Ab-Affinity's architecture is based on BERT, adapted for amino acid sequences. It stacks N sequential encoder blocks, each containing multi-head attention and feed-forward layers. The output of the last encoder layer serves as the sequence embedding, and a fully connected layer predicts binding affinity from this embedding. Three model sizes were tested, following the ESM-2 study: 8M, 35M, and 650M parameters (N = 6, 12, and 33, respectively).
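
To make the data flow concrete, here is a minimal sketch in plain Python of one attention step followed by a regression head. It is an illustration under stated assumptions, not the Ab-Affinity implementation: the attention uses a single head with identity projections, the per-residue vectors are averaged into one embedding (the pooling choice is assumed here), and all names and numbers are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over one row of attention scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: Matrix) -> Matrix:
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the token vectors themselves
    (identity projections); a real encoder learns separate projections.
    """
    dim = len(tokens[0])
    output = []
    for query in tokens:
        # Similarity of this residue to every residue, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        # Weighted average of all residue vectors.
        output.append(
            [sum(w * value[j] for w, value in zip(weights, tokens)) for j in range(dim)]
        )
    return output


def mean_pool(tokens: Matrix) -> Vector:
    """Collapse per-residue vectors into one sequence embedding (assumed pooling)."""
    count = len(tokens)
    return [sum(row[j] for row in tokens) / count for j in range(len(tokens[0]))]


def regression_head(embedding: Vector, weights: Vector, bias: float) -> float:
    """Fully connected layer with one output: the predicted affinity score."""
    return sum(e * w for e, w in zip(embedding, weights)) + bias


# Toy input: three residues, each a 2-dimensional vector (made-up numbers).
residues = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(residues)
embedding = mean_pool(attended)
prediction = regression_head(embedding, weights=[0.5, -0.5], bias=0.1)
```

A full encoder would repeat the attention and feed-forward steps N times with learned weights; the sketch keeps only the shape of the computation.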

The model was trained on a dataset of single-chain fragment variable (scFv) antibody sequences and associated binding scores (KD values) against a SARS-CoV-2 HR2 region peptide. This dataset included variants generated by introducing 1-3 amino acid changes into three seed antibodies. KD values were preprocessed by taking the arithmetic mean of the two closest replicates. In total, 71,834 unique antibodies were used for training. Mean squared error (MSE) was used as the loss function, with the Adam optimizer for parameter optimization. The data were split 85% for training and 15% for validation. Training ran on NVIDIA A100 GPUs with a batch size of 128 for 100 epochs. The pre-trained ESM-2 model was fine-tuned, and a model with randomly initialized weights was also trained for comparison. The best-performing model was the one with the highest Pearson correlation on the validation set.
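
The preprocessing and loss described above can be sketched in a few lines of plain Python. This is an illustration only: the function names are hypothetical, the replicate values are made up, and the seeded random shuffle is an assumption, since the summary does not say how the 85/15 split was drawn.

```python
import random
from itertools import combinations
from typing import List, Sequence, Tuple


def mean_of_closest_pair(replicates: Sequence[float]) -> float:
    """Average the two replicate measurements that differ the least."""
    if len(replicates) < 2:
        raise ValueError("need at least two replicates")
    a, b = min(combinations(replicates, 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2.0


def train_validation_split(
    items: Sequence, train_fraction: float = 0.85, seed: int = 0
) -> Tuple[List, List]:
    """Shuffle with a fixed seed, then cut into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def mean_squared_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """MSE loss: the average squared difference between prediction and target."""
    if len(predicted) != len(observed):
        raise ValueError("inputs must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)


# Three made-up replicate measurements for one antibody: 1.0 and 1.2 are closest.
target = mean_of_closest_pair([1.0, 1.2, 3.0])

# Split 100 placeholder items into 85 for training and 15 for validation.
train_items, validation_items = train_validation_split(range(100))
```

Averaging only the two closest replicates discards a measurement that disagrees with the other two, which makes the target less sensitive to a single outlying replicate.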

Ab-Affinity predicted binding affinity more accurately than other LLM-based methods (DG-Affinity, ESM-2, AbLang). t-SNE visualizations showed that Ab-Affinity embeddings produced a smooth gradient of binding affinity, unlike ESM-2 embeddings. The model achieved the highest Pearson (0.652) and Spearman (0.712) correlation coefficients on the test set. Ab-Affinity embeddings also proved highly effective for downstream classification tasks, such as assigning binding affinity classes (High, Medium, Low) and identifying improved binding, with significantly higher AUC values than ESM-2. Attention maps revealed that the model focuses on CDRs and adjacent regions when predicting binding. The model also implicitly captured thermostability: in t-SNE plots, antibodies separated into clusters according to their thermostability values.
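
For readers less familiar with the two reported metrics, the following plain-Python sketch shows how Pearson and Spearman correlations are computed from predicted and measured values. The function names and the toy numbers are illustrative assumptions; Spearman is implemented here as Pearson on average ranks, which is the standard definition when ties are present.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Linear correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Rank correlation: Pearson applied to the ranks of each variable."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values: the prediction preserves the order but not the spacing.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 4.0, 9.0, 16.0]
linear_agreement = pearson(measured, predicted)
rank_agreement = spearman(measured, predicted)
```

In this toy case the rank agreement is perfect while the linear agreement is not, which is why the two coefficients can rank models differently.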

Enterprise Process Flow

Antibody Sequence Input → Positional Embedding → N Encoder Layers (BERT-like) → Multi-Head Attention → Feed Forward Layers → Sequence Embedding Representation → Fully Connected Layer → Binding Affinity Prediction
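
The first step of the flow, turning an antibody sequence into model input, can be sketched as follows. The 20 one-letter amino acid codes are standard, but the integer ids, the reserved unknown id, and the function name are hypothetical and do not reflect the tokenizer actually used by Ab-Affinity or ESM-2.

```python
from typing import Dict, List, Tuple

# The 20 standard amino acids, written as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical ids: 0 is reserved for any character outside the 20 letters.
UNKNOWN_ID = 0
TOKEN_IDS: Dict[str, int] = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence: str) -> List[Tuple[int, int]]:
    """Return (position, token_id) pairs for an amino acid sequence.

    The position index is what a positional embedding would look up;
    the token id is what a residue embedding would look up.
    """
    return [
        (position, TOKEN_IDS.get(residue, UNKNOWN_ID))
        for position, residue in enumerate(sequence.upper())
    ]


# A short made-up fragment, not taken from the training data.
encoded = encode("EVQ")
```

Each pair would then be mapped to a vector and passed through the encoder layers listed in the flow.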

Ab-Affinity's Predictive Accuracy

0.652 Pearson Correlation Coefficient

Ab-Affinity Performance vs. Other LLMs (14H Dataset)

Model	Ref	Pearson	Spearman
Ens-Grad	(Liu et al. 2020)	0.601	0.476
ESM-F	(He et al. 2024)	0.634	0.516
AntiBERTa2	(Barton, Galson, and Leem 2024)	0.623	0.545
AbMAP	(Singh et al. 2023)	0.606	0.510
A2Binder	(He et al. 2024)	0.642	0.553
Ab-Affinity	[this]	0.652	0.526

Impact on Antibody Design

Ab-Affinity's predictive capability can streamline the antibody design process. By predicting binding affinity directly from sequence data, it allows candidate antibodies to be screened rapidly in silico, reducing the amount of costly and time-consuming experimental validation. This can accelerate the development of therapeutic antibodies, especially against rapidly evolving pathogens like SARS-CoV-2. The model's attention maps are also interpretable: they help identify the residues and regions most relevant to binding, which can guide rational design efforts.

Implementation Roadmap

Our structured approach ensures a smooth integration of Ab-Affinity into your existing R&D workflows, maximizing your return on investment.

Phase 1: Initial Consultation & Data Integration

Engage with our experts to understand your specific antibody design challenges and data landscape. We'll integrate your existing sequence data and experimental results into the Ab-Affinity platform, ensuring seamless compatibility and secure data handling.

Phase 2: Model Customization & Initial Prediction Run

Based on your project's goals, we'll fine-tune Ab-Affinity with your proprietary data to optimize its performance for your specific targets. An initial prediction run will then generate binding affinity scores for your candidate antibodies, along with embedding visualizations and attention maps.

Phase 3: Validation, Iteration & Optimized Design

We'll collaborate to validate initial predictions against your experimental benchmarks. Using the model's insights, we'll iterate on candidate antibody sequences, leveraging the attention maps to guide targeted modifications for improved binding affinity and thermostability. This phase culminates in a refined list of highly promising antibody designs ready for experimental testing.
