Enterprise AI Analysis

A Gated Attention-Based Multiple Instance Learning and Test-Time Augmentation Approach for Diagnosing Active Sacroiliitis in Sacroiliac Joint MRI Scans

Authors: Zeynep Keskin, Onur İnan, Ömer Özberk, Reyhan Bilici, Sema Servi, Selma Özlem Çelikdelen, Mehmet Yıldırım

**Background and Objective:** Axial spondyloarthritis (axSpA) is a group of chronic inflammatory diseases that primarily affect the sacroiliac joints. Early diagnosis is crucial for preventing irreversible structural damage. Magnetic Resonance Imaging (MRI) is the gold standard for detecting early inflammatory changes such as sacroiliitis. However, conventional MRI interpretation is inherently subjective and susceptible to both intra- and inter-observer variability. Therefore, artificial intelligence (AI)-driven diagnostic solutions are increasingly being explored. Among them, the Gated Attention Multiple Instance Learning (MIL) framework holds strong potential in modeling heterogeneous inflammatory distributions, thanks to its slice-level attention mechanism. This study aims to evaluate the diagnostic performance of a deep learning model based on Gated Attention MIL for automated sacroiliitis detection. Furthermore, its results are compared with a baseline deep learning architecture (standard ResNet-18), and its consistency with radiologist annotations is analyzed.

**Materials and Methods:** The dataset included 554 subjects, comprising 276 patients diagnosed with axSpA and 278 healthy controls. All MRI data were derived from axial T2-weighted fat-suppressed (T2_TSE_TRA_FS) sequences. Patient-wise data splitting was employed to construct training, validation, and independent test sets. The proposed model architecture integrates ResNet-18-based feature extraction, a gated attention mechanism for instance-level weighting, and bag-level classification. Additionally, Test-Time Augmentation (TTA) was implemented to enhance robustness during inference.

**Results:** On the independent test set, the model achieved an accuracy of 85.88%, sensitivity of 92.86%, specificity of 79.07%, and an F1-score of 86.67%. Attention heatmaps generated by the MIL module showed strong spatial overlap with bone marrow edema regions annotated by expert radiologists. Implementation of TTA led to an approximate 10% improvement in overall classification accuracy.

**Conclusions:** The Gated Attention MIL framework demonstrated high diagnostic performance for sacroiliitis detection, indicating its value as a reliable decision support tool for early axSpA diagnosis. Validation on larger, multi-center datasets is warranted to ensure generalizability and to support clinical integration in routine radiology workflows.

Executive Impact & Key Performance Indicators

This study highlights significant advancements in AI-driven medical diagnostics, offering superior accuracy and interpretability for critical disease detection. Key metrics demonstrate the immediate value for healthcare enterprises seeking to enhance diagnostic efficiency and precision.

85.88% Overall Accuracy

92.86% Disease Sensitivity

79.07% Healthy Specificity

86.67% F1-Score

Deep Analysis & Enterprise Applications

Active Sacroiliitis Diagnosis

Active sacroiliitis is a key indicator of axial spondyloarthritis (axSpA), a group of chronic inflammatory diseases that primarily affect the sacroiliac joints. Early and accurate diagnosis is crucial to prevent irreversible structural damage and improve long-term functional outcomes. Magnetic Resonance Imaging (MRI) is the gold standard for detecting early inflammatory changes such as subchondral bone marrow edema (BME).

92.86% Model Sensitivity in Diagnosing Active Sacroiliitis

Gated Attention Multiple Instance Learning (MIL)

Gated Attention MIL is a powerful framework for modeling heterogeneous inflammatory distributions. Unlike traditional CNNs that process all slices uniformly, MIL treats each patient as a 'bag' of instances (MRI slices) and learns the importance of each slice dynamically. The gated attention mechanism generates attention weights across slices, allowing the model to emphasize inflamed regions and assign low weights to non-informative slices, significantly improving diagnostic accuracy and interpretability.
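The slice-weighting step can be illustrated with a minimal sketch in plain Python. It assumes the standard gated attention formulation, in which a tanh branch is multiplied element-wise by a sigmoid gate before a softmax over slices; the summary does not give the exact layer sizes, so the matrices `V`, `U` and the vector `w` here are hypothetical random stand-ins for learned parameters, and the feature dimension is kept tiny for readability.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gated_attention_pool(H, V, U, w):
    """Pool slice features H (K x D) into one bag vector.

    V and U are (L x D) projections and w is a length-L vector; in a
    trained model all three would be learned (here they are assumptions).
    """
    scores = []
    for h in H:
        tanh_branch = [math.tanh(x) for x in matvec(V, h)]
        gate = [1.0 / (1.0 + math.exp(-x)) for x in matvec(U, h)]
        scores.append(sum(wi * t * g for wi, t, g in zip(w, tanh_branch, gate)))
    # Softmax over slices, shifted by the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    a = [e / total for e in exps]
    # Attention pooling: z = sum_k a_k * h_k
    D = len(H[0])
    z = [sum(a[k] * H[k][d] for k in range(len(H))) for d in range(D)]
    return a, z


rng = random.Random(0)
K, D, L = 30, 8, 4  # 30 slices per bag; D and L are illustrative only
H = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]
V = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(L)]
U = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(L)]
w = [rng.gauss(0, 1) for _ in range(L)]
attention, bag_vector = gated_attention_pool(H, V, U, w)
```

Because the weights pass through a softmax, they are positive and sum to one across the bag, so a slice can only gain influence at the expense of the others.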

13.95 Percentage-Point Specificity Improvement with Gated Attention MIL (65.12% to 79.07%)

Test-Time Augmentation (TTA)

To enhance model robustness against noise and acquisition variability, Test-Time Augmentation (TTA) was implemented during inference. Each patient’s slices were presented to the model in both original and horizontally flipped forms, with final predictions obtained by averaging the outputs. This strategy significantly boosted the model’s generalization capacity and strengthened diagnostic consistency despite clinical variability.
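The flip-and-average step might look like the sketch below. `toy_predict` is a hypothetical stand-in for the trained model (it simply averages the left-most pixel column), chosen only so that flipping visibly changes the output; whether the study averaged probabilities or raw scores is not stated, so the code averages whatever the prediction function returns.

```python
def hflip(slice_2d):
    """Horizontally flip one slice stored as a list of pixel rows."""
    return [row[::-1] for row in slice_2d]


def predict_with_tta(predict_fn, bag):
    """Average the model output on the original and the flipped bag."""
    p_original = predict_fn(bag)
    p_flipped = predict_fn([hflip(s) for s in bag])
    return 0.5 * (p_original + p_flipped)


def toy_predict(bag):
    """Hypothetical model: mean intensity of the left-most column."""
    left_column = [row[0] for s in bag for row in s]
    return sum(left_column) / len(left_column)


bag = [[[0.0, 1.0], [0.0, 1.0]]]  # one 2x2 slice, dark left, bright right
tta_score = predict_with_tta(toy_predict, bag)
```

Here the toy model scores the original bag 0.0 and the flipped bag 1.0, so the averaged output is 0.5; a model that is sensitive to left-right orientation is pulled toward a more symmetric decision.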

Approximately 10% Accuracy Improvement with TTA

Diagnostic Framework Flow

The diagnostic framework utilizes a Gated Attention Multiple Instance Learning (MIL) architecture, consisting of three main stages: Feature Extraction (ResNet-18), Gated Attention Mechanism, and Bag-Level Classification. This approach allows for dynamic weighting of MRI slices based on their diagnostic relevance.

MRI Bag (X) (30 Slices) → Feature Extraction (ResNet-18) → Features (h) (512-dim) → Gated Attention Module → Attention (a) (Softmax) → Attention Pooling (z = Σ_k a_k h_k) → Classification (MLP) → Prediction (Diagnosis)
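Written as equations, the flow above can be summarized as follows. The pooling step z = Σ a_k h_k comes from the diagram; the expression for a_k is the standard gated attention form and is an assumption here, as are the symbols V, U, w (learned parameters), σ (the logistic sigmoid), and ⊙ (element-wise product).

```latex
\begin{aligned}
h_k &= f_{\text{ResNet-18}}(x_k) \in \mathbb{R}^{512}, \qquad k = 1, \dots, K \quad (K = 30) \\
a_k &= \frac{\exp\!\left( w^{\top} \left[ \tanh(V h_k) \odot \sigma(U h_k) \right] \right)}
            {\sum_{j=1}^{K} \exp\!\left( w^{\top} \left[ \tanh(V h_j) \odot \sigma(U h_j) \right] \right)} \\
z &= \sum_{k=1}^{K} a_k h_k \\
\hat{y} &= \mathrm{MLP}(z)
\end{aligned}
```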

Dataset and Preprocessing Overview

The study included 554 individuals (276 axSpA patients and 278 healthy controls). All MRI data were T2-weighted fat-suppressed sequences. To prevent data leakage, the data were split by patient into training (387 subjects), validation (82 subjects), and independent test (85 subjects) sets, so that no patient's slices appear in more than one set. Images were preprocessed, resized, normalized, and augmented.
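A patient-wise split of this kind can be sketched as below. The patient IDs, the seed, and the helper name `split_patients` are hypothetical, and the sketch shuffles IDs without stratifying by diagnosis, which may differ from the procedure the authors used.

```python
import random


def split_patients(patient_ids, n_train, n_val, n_test, seed=0):
    """Shuffle unique patient IDs and cut them into three disjoint sets."""
    ids = sorted(set(patient_ids))
    if n_train + n_val + n_test != len(ids):
        raise ValueError("split sizes must sum to the number of patients")
    random.Random(seed).shuffle(ids)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    return train, val, test


# Hypothetical IDs for the 554 subjects; the sizes are those reported above.
patient_ids = [f"P{i:04d}" for i in range(554)]
train_ids, val_ids, test_ids = split_patients(patient_ids, 387, 82, 85)
```

Splitting on IDs rather than on individual slices is what keeps all slices from one patient inside a single set.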

554 Total Subjects in Dataset (276 axSpA, 278 Healthy)

Model Performance Metrics

On the independent test set, the Gated Attention MIL model significantly outperformed the baseline ResNet-18, achieving an accuracy of 85.88%, sensitivity of 92.86%, specificity of 79.07%, and an F1-score of 0.8667. The integration of TTA further boosted accuracy by approximately 10%.

Metric	Baseline ResNet-18	Proposed System
Accuracy	75.29%	85.88%
Sensitivity	85.71%	92.86%
Specificity	65.12%	79.07%
F1-Score	0.7742	0.8667
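The table can be reproduced from confusion-matrix counts. The counts below are not reported in the text; they are inferred as integers consistent with the published percentages on an 85-subject test set, assuming 42 axSpA patients and 43 healthy controls.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity, and F1 from counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Inferred counts (assumption): 42 positives and 43 negatives in the test set.
proposed = binary_metrics(tp=39, fn=3, tn=34, fp=9)
baseline = binary_metrics(tp=36, fn=6, tn=28, fp=15)

for name, m in (("Baseline ResNet-18", baseline), ("Proposed System", proposed)):
    print(
        f"{name}: acc={m['accuracy']:.2%} sens={m['sensitivity']:.2%} "
        f"spec={m['specificity']:.2%} f1={m['f1']:.4f}"
    )
```

Under these counts, the proposed system correctly classifies 73 of 85 subjects against 64 for the baseline, and the specificity gap of 13.95 percentage points corresponds to six fewer false positives among the 43 controls.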

Interpretability: Attention Heatmaps

Visual inspection by an expert radiologist confirmed qualitative agreement between the Grad-CAM activation regions and clinically identified inflammatory areas. This suggests the model not only performs accurate classification but also provides clinically meaningful insights into its predictions, focusing on subchondral bone marrow edema patterns critical for active sacroiliitis diagnosis.

Gated Attention Pinpoints Inflammation

The Gated Attention MIL framework demonstrated strong spatial interpretability. Attention heatmaps generated by the MIL module showed significant overlap with bone marrow edema regions annotated by expert radiologists. The model effectively assigns high attention scores to clinically relevant sacroiliac joint regions, while suppressing uninformative slices, thereby enhancing clinical decision-making support.
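One simple way to use slice-level attention in review is to rank slices by their weight, as in the sketch below. The weights are made-up illustrative values, not outputs of the study's model, and ranking slices says nothing about where within a slice the model is looking; spatial heatmaps require a separate method.

```python
def top_slices(attention, k=3):
    """Return (slice_index, weight) pairs for the k highest weights."""
    order = sorted(range(len(attention)), key=lambda i: attention[i], reverse=True)
    return [(i, attention[i]) for i in order[:k]]


# Hypothetical attention weights for a six-slice bag (they sum to 1).
weights = [0.02, 0.05, 0.40, 0.30, 0.03, 0.20]
ranked = top_slices(weights, k=3)
```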

Comparison with Existing AI Approaches

The proposed Gated Attention MIL model achieved the highest independent test accuracy (85.88%) among the reviewed machine learning and deep learning approaches for sacroiliitis detection. This highlights the effectiveness of the MIL approach in modeling heterogeneous, slice-distributed inflammatory patterns, where classical CNNs and other methods often struggle with focal pathologies.

Study	Method	Independent Test Accuracy (Acc)
Nicolaes et al. (2025)	Deep Learning (731 patients)	74.00%
Bressem et al. (2022)	Deep Learning (EULAR abstract)	75.00%
Faleiros et al. (2020)	Classical ML (MLP)	75-82%
Liu et al. (2025)	Semi-supervised Radiomics	81.20%
Roels et al. (2023)	Machine Learning (ResNet-18)	81.40%
Zhang et al. (2024)	Radiomics + Clinical Hybrid Model	85.60%
This Study	Gated Attention MIL	85.88%

Single-Center Data Limitation

To establish broad clinical reliability, the model requires validation on larger, multi-center datasets with diverse scanners and imaging protocols. Additionally, the study primarily focused on T2-weighted fat-suppressed sequences, potentially limiting sensitivity to lesion diversity. Further research will incorporate hybrid CNN–Transformer architectures and advanced attention mechanisms for improved precision.

Need for Multi-Center Validation

Despite robust performance on a diverse patient population, the dataset was sourced from a single institution, limiting the model's ultimate generalizability across different MRI scanners and institutional protocols. Future studies will prioritize multi-center validation using external datasets to ensure broad clinical reliability.

Your AI Implementation Roadmap

We guide enterprises through a structured process to ensure successful AI integration and maximize value.

Phase 1: Discovery & Strategy

Comprehensive assessment of current workflows, data infrastructure, and business objectives to define a tailored AI strategy and identify high-impact use cases.

Phase 2: Pilot & Proof-of-Concept

Develop and deploy a small-scale AI solution to validate its technical feasibility and demonstrate initial ROI, gathering critical feedback for refinement.

Phase 3: Full-Scale Integration

Seamlessly integrate the AI solution into existing enterprise systems, ensuring scalability, security, and robust performance across all relevant departments.

Phase 4: Optimization & Scaling

Continuous monitoring, performance optimization, and iterative enhancements. Expand AI capabilities to new areas for sustained competitive advantage and long-term value.

Ready to Transform Your Enterprise with AI?

Leverage cutting-edge AI research to drive innovation, improve efficiency, and gain a significant competitive edge. Our experts are ready to design a custom solution for your unique challenges.
