
Enterprise AI Analysis

Evaluating large transformer models for anomaly detection in resource-constrained IoT devices for intrusion detection systems

Authored by Ahmad Almadhor, Shtwai Alsubai, Natalia Kryvinska, Abdullah Al Hejaili, Mohamed Ayari, Belgacem Bouallegue, Sidra Abbas
Published in Scientific Reports | (2025) 15:37972

This research pioneers the integration of Large Transformer Models (LTMs) into Intrusion Detection Systems (IDS) for IoT environments. By transforming IoT traffic data into a text-based format, the study demonstrates that fine-tuned BERT, DistilBERT, and RoBERTa can achieve superior anomaly detection performance. BERT, in particular, showcased exceptional learning stability and generalization, making LTMs a powerful tool for real-time threat mitigation in resource-constrained IoT devices.

0.0211 BERT Lowest Training Loss
0.0677 BERT Lowest Validation Loss
11.42 min DistilBERT Training Time
243.87 samples/s DistilBERT Eval Speed

Deep Analysis & Enterprise Applications

Select a topic to dive deeper, then explore the specific findings from the research, rebuilt as interactive, enterprise-focused modules.

OTAD Framework
Data Preprocessing
Model Performance

The Optimized Transformer for Anomaly Detection (OTAD) framework systematically adapts and fine-tunes state-of-the-art Large Transformer Models (LTMs) such as BERT, DistilBERT, and RoBERTa for IoT attack classification. This approach focuses on optimizing existing LTMs within the IoT IDS context rather than developing new architectures.

A structured preprocessing pipeline was developed for the RT_IoT2022 dataset to make it compatible with NLP-based modeling. This included data cleaning, handling missing values, categorical encoding of attack types, text transformation, and balanced sampling. The dataset was converted to Hugging Face format for seamless integration.
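To make the text-transformation step concrete, here is a minimal sketch that serializes one flow record into a "feature: value" string for NLP-based modeling; the column names are hypothetical placeholders, since the paper's exact serialization template is not reproduced here.

```python
# Minimal sketch of the traffic-to-text step. Column names are hypothetical
# placeholders; the paper's exact serialization template is not reproduced here.
import pandas as pd

def row_to_text(row: pd.Series) -> str:
    """Serialize one flow record as a 'feature: value' string for an NLP model."""
    return " ".join(f"{col}: {row[col]}" for col in row.index)

df = pd.DataFrame({
    "proto": ["tcp", "udp"],
    "flow_duration": [1.92, 0.04],
    "fwd_pkts_tot": [12, 3],
})
df["text"] = df.apply(row_to_text, axis=1)
print(df["text"].iloc[0])  # proto: tcp flow_duration: 1.92 fwd_pkts_tot: 12
```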

Extensive experiments across multiple LTMs within the OTAD framework showed BERT achieving the best performance. It recorded a minimum training loss of 0.0211 at epoch 34 and the lowest validation loss of 0.0677 at epoch 49, demonstrating effective learning and robust generalization capabilities for diverse IoT attack types.

Enterprise Process Flow: IoT Network Intrusion Attack Process

Reconnaissance → Initial Access → Lateral Movement → Data Exfiltration
0.0211 Lowest Training Loss (BERT, Epoch 34)

Computational Baseline Comparison of Transformer Models

Model               | Parameters | Training Time (50 epochs) | Validation Time   | Eval Speed (samples/s)
BERT-base-LLM       | 110M       | 1367.72 s (22.80 min)     | 2.01 s (0.03 min) | 122.26
DistilBERT-base-LLM | 66M        | 684.95 s (11.42 min)      | 1.01 s (0.02 min) | 243.87
RoBERTa-base-LLM    | 125M       | 1264.91 s (21.08 min)     | 1.78 s (0.03 min) | 138.44
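The eval-speed column is a throughput figure. A rough way to measure the same quantity for a fine-tuned checkpoint is sketched below; the "./otad-bert" path, the batch of 256 duplicated texts, and the single-batch timing are assumptions for illustration, not values from the paper.

```python
# Rough throughput (samples/s) check for a saved checkpoint; "./otad-bert"
# is a placeholder path, and the 256 duplicated texts are synthetic.
import time
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./otad-bert")
model = AutoModelForSequenceClassification.from_pretrained("./otad-bert")
model.eval()

texts = ["proto: tcp flow_duration: 1.92 fwd_pkts_tot: 12"] * 256
start = time.perf_counter()
with torch.no_grad():
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    model(**batch)
print(f"{len(texts) / (time.perf_counter() - start):.2f} samples/s")
```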

Enhancing IoT Security with LTMs

This research successfully applied Large Transformer Models (LTMs) to improve anomaly detection in resource-constrained IoT devices. By converting raw IoT traffic into a text-based format and fine-tuning models like BERT, DistilBERT, and RoBERTa, the framework achieved high accuracy in classifying diverse IoT attack types. Specifically, DistilBERT excelled in efficiency, making it ideal for real-time deployment, while BERT showed superior generalization with the lowest validation loss, demonstrating the untapped potential of LTMs in robust IoT security solutions.

Quantify Your AI Advantage

Estimate the potential savings and efficiency gains for your organization by implementing advanced AI solutions like those explored in this research.


Your AI Implementation Roadmap

A strategic phased approach to integrate LTMs for enhanced IoT security in your enterprise.

Phase 1: Data Acquisition & Encoding

Gather raw IoT network traffic data (RT_IoT2022 dataset) and encode attack categories into numerical labels for machine learning compatibility.
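A minimal sketch of this encoding step is below; the CSV path and the "Attack_type" column name are assumptions about the RT_IoT2022 export, not details confirmed in the text above.

```python
# Encode attack categories into integer labels. The CSV path and the
# "Attack_type" column name are assumptions about the RT_IoT2022 export.
import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.read_csv("RT_IoT2022.csv")
encoder = LabelEncoder()
df["label"] = encoder.fit_transform(df["Attack_type"])
print(dict(zip(encoder.classes_, encoder.transform(encoder.classes_))))
```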

Phase 2: Data Preprocessing & Transformation

Perform comprehensive data preprocessing including random sampling, handling missing values, and transforming all numerical and categorical features into a structured text format suitable for NLP models.
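The cleaning and balanced-sampling steps might look like the sketch below; the per-class cap of 1,000 rows is an illustrative assumption, not the paper's setting.

```python
# Sketch of cleaning plus balanced sampling. The per-class cap (1,000 rows)
# is illustrative; the paper's exact sampling scheme is not reproduced here.
import pandas as pd

def clean_and_balance(df: pd.DataFrame, label_col: str = "label",
                      per_class: int = 1000) -> pd.DataFrame:
    df = df.dropna().drop_duplicates()  # handle missing values, drop duplicates
    # Cap each attack class at `per_class` rows so no class dominates training
    return pd.concat(
        group.sample(min(len(group), per_class), random_state=42)
        for _, group in df.groupby(label_col)
    )
```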

Phase 3: Dataset Formatting & Splitting

Convert the processed dataset into the Hugging Face Dataset format and split it into training and validation sets (80/20 ratio) to ensure efficient model integration and evaluation.
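Converting to the Hugging Face Dataset format and making the 80/20 split could look like this sketch; the two-row toy DataFrame stands in for the real processed data, and the tokenizer choice mirrors the paper's BERT variant.

```python
# Convert processed records to a Hugging Face Dataset, tokenize, and split
# 80/20. The toy DataFrame stands in for the real processed RT_IoT2022 data.
import pandas as pd
from datasets import Dataset
from transformers import AutoTokenizer

df = pd.DataFrame({"text": ["proto: tcp flow_duration: 1.92",
                            "proto: udp flow_duration: 0.04"] * 5,
                   "label": [0, 1] * 5})

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
ds = Dataset.from_pandas(df)
ds = ds.map(lambda batch: tokenizer(batch["text"], truncation=True,
                                    padding="max_length"), batched=True)
splits = ds.train_test_split(test_size=0.2, seed=42)
train_ds, val_ds = splits["train"], splits["test"]
```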

Phase 4: Transformer Model Fine-tuning

Apply fine-tuning to pre-trained transformer architectures (BERT, DistilBERT, RoBERTa) for IoT attack classification, optimizing hyperparameters (learning rate, batch size, epochs, weight decay) for efficient and stable learning.
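A hedged sketch of this stage using the Trainer API, continuing from the Phase 3 split; the hyperparameter values are illustrative defaults, with only the 50 epochs taken from the paper's reported setup.

```python
# Fine-tune BERT for attack classification. Hyperparameters are illustrative
# (only the 50 epochs comes from the paper); train_ds/val_ds are from Phase 3.
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

num_labels = len(set(train_ds["label"]))  # one output per attack class
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=num_labels)

args = TrainingArguments(
    output_dir="otad-bert",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=50,
    weight_decay=0.01,
    eval_strategy="epoch",  # "evaluation_strategy" in older transformers releases
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=val_ds)
trainer.train()
```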

Phase 5: Performance Evaluation & Deployment

Evaluate model performance through training and validation loss, identify the best-performing model, and prepare the fine-tuned model and tokenizer for potential real-time deployment in IoT intrusion detection systems.
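The closing step, continuing from the trainer sketched in Phase 4: report the validation loss, then persist the fine-tuned model and tokenizer for deployment.

```python
# Report the validation loss and save the fine-tuned artifacts for
# deployment; trainer and tokenizer continue from the earlier sketches.
metrics = trainer.evaluate()
print("validation loss:", metrics["eval_loss"])

trainer.save_model("otad-bert-final")         # weights + config
tokenizer.save_pretrained("otad-bert-final")  # tokenizer files
```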

Ready to Transform Your IoT Security?

Leverage cutting-edge Large Transformer Models to build a resilient and intelligent IDS for your resource-constrained IoT infrastructure.

Ready to Get Started?

Book Your Free Consultation.
