Enterprise AI Analysis: AbAffinity: A Large Language Model for Predicting Antibody Binding Affinity against SARS-CoV-2

Machine learning-based antibody design is emerging as one of the most promising approaches to combating infectious diseases, driven by significant advances in artificial intelligence and an exponential surge in experimental antibody data (in particular data related to COVID-19). The ability of an antibody to bind to an antigen, called binding affinity, is one of the most critical properties in designing neutralizing antibodies. In this study we introduce Ab-Affinity, a new large language model that can accurately predict the binding affinity of antibodies against a target peptide, e.g., the SARS-CoV-2 spike protein. Code and model are available at https://github.com/ucrbioinfo/AbAffinity.

Executive Impact & Key Findings

Ab-Affinity fine-tunes a protein language model to predict antibody binding affinity against a SARS-CoV-2 peptide directly from sequence, and it outperformed the other LLM-based methods it was compared with.

71,834 Unique Antibodies Trained
0.652 Highest Pearson Correlation (Ab-Affinity)
0.712 Highest Spearman Correlation (Ab-Affinity)

Deep Analysis & Enterprise Applications

Model Architecture

Ab-Affinity's architecture is based on BERT, adapted for amino acid sequences. It stacks N encoder blocks, each consisting of multi-head attention followed by feed-forward layers. The output of the last encoder layer serves as the sequence embedding, and a fully connected layer predicts binding affinity from this embedding. Three model sizes were tested, with 8M, 35M, and 650M parameters (N = 6, 12, and 33 respectively), following the ESM-2 study.
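As a rough sanity check on the three sizes, a transformer encoder layer with hidden width d holds about 12·d² weights: 4·d² in the attention projections plus 8·d² in a feed-forward block whose inner width is 4·d. The sketch below applies that rule of thumb. The hidden widths (320, 480, 1280) are those of the public ESM-2 checkpoints with 6, 12, and 33 layers; they are not stated in this summary and are included as an assumption. Embeddings, biases, and layer norms are ignored.

```python
# Back-of-the-envelope parameter count for a BERT-style encoder stack.
# Assumption: hidden widths come from the public ESM-2 checkpoints;
# they do not appear in the text above.
ESM2_CONFIGS = {
    "8M": {"layers": 6, "hidden": 320},
    "35M": {"layers": 12, "hidden": 480},
    "650M": {"layers": 33, "hidden": 1280},
}


def approx_encoder_params(layers: int, hidden: int) -> int:
    """Estimate the number of weights in the encoder blocks only."""
    attention = 4 * hidden * hidden      # query, key, value, output projections
    feed_forward = 8 * hidden * hidden   # two linear maps, inner width 4 * hidden
    return layers * (attention + feed_forward)


for name, cfg in ESM2_CONFIGS.items():
    total = approx_encoder_params(cfg["layers"], cfg["hidden"])
    print(f"{name}: ~{total / 1e6:.1f}M encoder weights")
```

Under these assumptions the estimates come out at roughly 7.4M, 33.2M, and 648.8M, slightly below the nominal sizes because the ignored terms are small but not zero.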

Dataset & Training

The model was trained on a dataset of single-chain fragment variable (scFv) antibody sequences and associated binding scores (KD values) against a peptide from the SARS-CoV-2 HR2 region. The dataset consists of variants generated by introducing 1-3 amino acid changes into three seed antibodies. KD values were preprocessed by taking the arithmetic mean of the two closest replicates. 71,834 unique antibodies were used for training. Mean squared error (MSE) was used as the loss function and Adam as the optimizer. The data was split 85% for training and 15% for validation. Training ran on NVIDIA A100 GPUs with a batch size of 128 for 100 epochs. A pre-trained ESM-2 model was fine-tuned, and a model with randomly initialized weights was also trained for comparison. The best-performing model was the one that achieved the highest Pearson correlation on the validation set.
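The preprocessing and split described above can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the function names, the random seed, the error raised for fewer than two replicates, and the sample values are all hypothetical.

```python
import random
from itertools import combinations


def mean_of_closest_pair(replicates):
    """Average the two replicate measurements that differ the least."""
    if len(replicates) < 2:
        # Assumption: the source does not say how single measurements are handled.
        raise ValueError("need at least two replicate measurements")
    a, b = min(combinations(replicates, 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2.0


def split_train_val(records, val_fraction=0.15, seed=0):
    """Shuffle a copy of the records and split it into (train, validation)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def mse(predicted, target):
    """Mean squared error between two equal-length sequences of numbers."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Hypothetical replicate KD scores for three made-up sequences.
raw = {"seq_a": [1.0, 1.2, 3.0], "seq_b": [2.5, 2.4, 0.9], "seq_c": [0.7, 0.8]}
labels = {name: mean_of_closest_pair(values) for name, values in raw.items()}
train, val = split_train_val(sorted(labels.items()))
```

Taking the closest pair rather than the mean of all replicates discards an outlying measurement, such as the 3.0 in `seq_a`, before it can distort the label.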

Experimental Results

Ab-Affinity outperformed other LLM-based methods (DG-Affinity, ESM-2, AbLang) at predicting binding affinity. t-SNE visualizations showed that Ab-Affinity embeddings produced a smooth gradient of binding affinity, unlike ESM-2 embeddings. The model achieved the highest Pearson (0.652) and Spearman (0.712) correlation coefficients on the test set. Ab-Affinity embeddings also proved highly effective for downstream classification tasks, such as assigning binding affinity classes (High, Medium, Low) and identifying improved binding, with significantly higher AUC values than ESM-2. Attention maps showed that the model focuses on CDRs and adjacent regions when predicting binding. The model also implicitly captured thermostability: in t-SNE plots, antibodies separated into clusters according to their thermostability values.
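For readers who want to reproduce the two headline metrics on their own predictions, the sketch below computes Pearson correlation directly and Spearman correlation as Pearson on ranks, with tied values sharing an average rank. It is a dependency-free illustration; the measured and predicted values are made up and do not come from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical measured vs. predicted binding scores.
measured = [1.1, 2.4, 0.7, 3.3, 2.0]
predicted = [1.3, 2.1, 0.9, 2.8, 2.2]
print(f"Pearson:  {pearson(measured, predicted):.3f}")
print(f"Spearman: {spearman(measured, predicted):.3f}")
```

Pearson rewards predictions that are linearly related to the measurements, while Spearman only checks that the ordering is preserved, so the two can rank methods differently.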

Enterprise Process Flow

Antibody Sequence Input
Positional Embedding
N Encoder Layers (BERT-like)
Multi-Head Attention
Feed Forward Layers
Sequence Embedding Representation
Fully Connected Layer
Binding Affinity Prediction
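The flow above can be sketched as a small PyTorch module. This is a toy illustration under stated assumptions, not the released Ab-Affinity code: the class name, the 20-letter vocabulary, the learned positional embedding, the mean pooling over residues, and the tiny default sizes are all choices made here for brevity. Sequences in a batch are assumed to have equal length, so no padding mask is shown.

```python
import torch
import torch.nn as nn

# Assumption: a 20-letter amino acid vocabulary with no special tokens;
# the real tokenizer is not described in the text above.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_IDS = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequences):
    """Turn equal-length sequences into a (batch, length) tensor of token ids."""
    return torch.tensor([[TOKEN_IDS[aa] for aa in seq] for seq in sequences])


class AffinityRegressor(nn.Module):
    """Toy BERT-like encoder with a linear regression head (hypothetical class)."""

    def __init__(self, n_layers=6, d_model=64, n_heads=4, max_len=512):
        super().__init__()
        self.token_embedding = nn.Embedding(len(AMINO_ACIDS), d_model)
        self.position_embedding = nn.Embedding(max_len, d_model)
        block = nn.TransformerEncoderLayer(
            d_model=d_model,
            nhead=n_heads,
            dim_feedforward=4 * d_model,
            batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(block, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)

    def embed(self, token_ids):
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        hidden = self.token_embedding(token_ids) + self.position_embedding(positions)
        hidden = self.encoder(hidden)  # output of the last encoder layer
        return hidden.mean(dim=1)      # assumption: mean-pool residues into one vector

    def forward(self, token_ids):
        return self.head(self.embed(token_ids)).squeeze(-1)


model = AffinityRegressor(n_layers=2).eval()
with torch.no_grad():
    scores = model(encode(["ACDEFGHIK", "LMNPQRSTV"]))  # one score per sequence
```

Keeping `embed` separate from `forward` mirrors the way the sequence embedding is reused for visualization and downstream classification, independently of the regression head.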

Ab-Affinity's Predictive Accuracy

0.652 Pearson Correlation Coefficient

Ab-Affinity Performance vs. Other LLMs (14H Dataset)

Model        | Reference                     | Pearson | Spearman
Ens-Grad     | Liu et al. 2020               | 0.601   | 0.476
ESM-F        | He et al. 2024                | 0.634   | 0.516
AntiBERTa2   | Barton, Galson, and Leem 2024 | 0.623   | 0.545
AbMAP        | Singh et al. 2023             | 0.606   | 0.510
A2Binder     | He et al. 2024                | 0.642   | 0.553
Ab-Affinity  | this work                     | 0.652   | 0.526

Impact on Antibody Design

Ab-Affinity's predictive accuracy can streamline the antibody design process. By predicting binding affinity directly from sequence data, it enables rapid in silico screening of candidate antibodies, so fewer candidates need costly and time-consuming experimental validation. This can accelerate the development of therapeutic antibodies and vaccines, especially for rapidly evolving pathogens like SARS-CoV-2. The model's attention maps, which focus on CDRs and adjacent regions, further help identify the residues that matter for binding and can guide rational design efforts.
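A screening loop of the kind described above might look like the following sketch. It enumerates every single-residue substitution of a seed sequence and ranks the variants with a scoring function. The function names are hypothetical, and `toy_predictor` is a stand-in that has nothing to do with the real model. It also assumes the predictor returns a KD-like score, where a lower value means tighter binding.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(seed):
    """Yield every sequence that differs from the seed at exactly one position."""
    for pos, original in enumerate(seed):
        for aa in AMINO_ACIDS:
            if aa != original:
                yield seed[:pos] + aa + seed[pos + 1:]


def screen(seed, predict_kd, top_k=5):
    """Rank single-substitution variants by predicted KD, lowest (tightest) first."""
    scored = [(predict_kd(variant), variant) for variant in single_substitutions(seed)]
    return sorted(scored)[:top_k]


def toy_predictor(sequence):
    """Placeholder score, NOT a real model: counts non-aromatic residues."""
    return float(sum(aa not in "FWY" for aa in sequence))


for score, variant in screen("ACD", toy_predictor, top_k=3):
    print(variant, score)
```

In practice `predict_kd` would wrap a trained model, and the shortlist would go on to experimental testing rather than being treated as final.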

Implementation Roadmap

Our structured approach ensures a smooth integration of Ab-Affinity into your existing R&D workflows, maximizing your return on investment.

Phase 1: Initial Consultation & Data Integration

Engage with our experts to understand your specific antibody design challenges and data landscape. We'll integrate your existing sequence data and experimental results into the Ab-Affinity platform, ensuring seamless compatibility and secure data handling.

Phase 2: Model Customization & Initial Prediction Run

Based on your project's goals, we'll fine-tune Ab-Affinity with your proprietary data to optimize its performance for your specific targets. An initial prediction run will then generate binding affinity scores for your candidate antibodies, along with embedding visualizations and attention maps.

Phase 3: Validation, Iteration & Optimized Design

We'll collaborate to validate initial predictions against your experimental benchmarks. Using the model's insights, we'll iterate on candidate antibody sequences, leveraging the attention maps to guide targeted modifications for improved binding affinity and thermostability. This phase culminates in a refined list of highly promising antibody designs ready for experimental testing.
