Enterprise AI Analysis
Evaluating large transformer models for anomaly detection of resource-constrained IoT devices for intrusion detection system
Authored by Ahmad Almadhor, Shtwai Alsubai, Natalia Kryvinska, Abdullah Al Hejaili, Mohamed Ayari, Belgacem Bouallegue, Sidra Abbas
Published in Scientific Reports | (2025) 15:37972
This research pioneers the integration of Large Transformer Models (LTMs) into Intrusion Detection Systems (IDS) for IoT environments. By transforming IoT traffic data into a text-based format, the study demonstrates that fine-tuned BERT, DistilBERT, and RoBERTa can achieve superior anomaly detection performance. BERT, in particular, showcased exceptional learning stability and generalization, making LTMs a powerful tool for real-time threat mitigation in resource-constrained IoT devices.
Deep Analysis & Enterprise Applications
The following modules rebuild the specific findings from the research as enterprise-focused analyses.
The Optimized Transformer for Anomaly Detection (OTAD) framework systematically adapts and fine-tunes state-of-the-art Large Transformer Models (LTMs) such as BERT, DistilBERT, and RoBERTa for IoT attack classification. This approach focuses on optimizing existing LTMs within the IoT IDS context rather than developing new architectures.
A structured preprocessing pipeline was developed for the RT_IoT2022 dataset to make it compatible with NLP-based modeling. This included data cleaning, handling missing values, categorical encoding of attack types, text transformation, and balanced sampling. The dataset was converted to Hugging Face format for seamless integration.
Extensive experiments across multiple LTMs within the OTAD framework showed BERT achieving the best performance. It recorded a minimum training loss of 0.0211 at epoch 34 and the lowest validation loss of 0.0677 at epoch 49, demonstrating effective learning and robust generalization capabilities for diverse IoT attack types.
| Model | Parameters | Training Time (50 epochs) | Validation Time | Eval Speed (samples/s) |
|---|---|---|---|---|
| BERT-base-LLM | 110M | 1367.72 s (22.80 min) | 2.01 s (0.03 min) | 122.26 |
| DistilBERT-base-LLM | 66M | 684.95 s (11.42 min) | 1.01 s (0.02 min) | 243.87 |
| RoBERTa-base-LLM | 125M | 1264.91 s (21.08 min) | 1.78 s (0.03 min) | 138.44 |
Enhancing IoT Security with LTMs
This research successfully applied Large Transformer Models (LTMs) to improve anomaly detection in resource-constrained IoT devices. By converting raw IoT traffic into a text-based format and fine-tuning models like BERT, DistilBERT, and RoBERTa, the framework achieved high accuracy in classifying diverse IoT attack types. Specifically, DistilBERT excelled in efficiency, making it ideal for real-time deployment, while BERT showed superior generalization with the lowest validation loss, demonstrating the untapped potential of LTMs in robust IoT security solutions.
Your AI Implementation Roadmap
A strategic phased approach to integrate LTMs for enhanced IoT security in your enterprise.
Phase 1: Data Acquisition & Encoding
Gather raw IoT network traffic data (RT_IoT2022 dataset) and encode attack categories into numerical labels for machine learning compatibility.
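The label-encoding step can be sketched as follows. This is a minimal illustration, not the study's exact implementation; the category names used here are placeholders, not the actual RT_IoT2022 label set.

```python
# Minimal sketch: map attack-category strings to integer labels.
# Category names below are illustrative placeholders.
def encode_labels(categories):
    """Assign a stable integer id to each distinct category (sorted order)."""
    label2id = {cat: idx for idx, cat in enumerate(sorted(set(categories)))}
    id2label = {idx: cat for cat, idx in label2id.items()}
    encoded = [label2id[cat] for cat in categories]
    return encoded, label2id, id2label

labels = ["normal", "dos", "scan", "dos", "normal"]
encoded, label2id, id2label = encode_labels(labels)
# encoded -> [1, 0, 2, 0, 1]
```

Keeping both `label2id` and `id2label` mappings makes it straightforward to translate model predictions back to human-readable attack names at deployment time.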
Phase 2: Data Preprocessing & Transformation
Perform comprehensive data preprocessing including random sampling, handling missing values, and transforming all numerical and categorical features into a structured text format suitable for NLP models.
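The text-transformation step above can be sketched as serializing each flow record into a single sentence that an NLP model can tokenize. The feature names here are illustrative, not the actual RT_IoT2022 schema.

```python
# Minimal sketch of the feature-to-text transformation.
# Feature names are illustrative, not the real dataset schema.
def row_to_text(row):
    """Render 'feature is value' clauses joined into one string."""
    return ", ".join(f"{name} is {value}" for name, value in row.items())

flow = {"proto": "tcp", "duration": 0.42, "fwd_pkts": 18}
text = row_to_text(flow)
# -> "proto is tcp, duration is 0.42, fwd_pkts is 18"
```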
Phase 3: Dataset Formatting & Splitting
Convert the processed dataset into the Hugging Face Dataset format and split it into training and validation sets (80/20 ratio) to ensure efficient model integration and evaluation.
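The 80/20 split can be sketched with a deterministic shuffle; when the Hugging Face `datasets` library is used, `Dataset.train_test_split(test_size=0.2)` performs the equivalent step. The example below is a stdlib stand-in, not the study's exact code.

```python
import random

# Minimal sketch of the 80/20 train/validation split.
def train_val_split(examples, val_ratio=0.2, seed=42):
    """Shuffle deterministically, then slice off the validation tail."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - val_ratio))
    return shuffled[:cut], shuffled[cut:]

data = [{"text": f"flow {i}", "label": i % 3} for i in range(100)]
train, val = train_val_split(data)
```

A fixed seed keeps the split reproducible across runs, which matters when comparing loss curves between model variants.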
Phase 4: Transformer Model Fine-tuning
Apply fine-tuning to pre-trained transformer architectures (BERT, DistilBERT, RoBERTa) for IoT attack classification, optimizing hyperparameters (learning rate, batch size, epochs, weight decay) for efficient and stable learning.
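A fine-tuning setup of this kind can be sketched with the Hugging Face `Trainer` API. This is a configuration sketch, not the paper's exact code: the hyperparameter values are illustrative, and `num_classes`, `train_ds`, and `val_ds` are assumed to come from the earlier preprocessing and splitting phases.

```python
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)

# Configuration sketch; hyperparameter values are illustrative,
# not the exact settings reported in the paper.
model_name = "bert-base-uncased"   # or distilbert-base-uncased / roberta-base
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=num_classes)  # num_classes: assumed from preprocessing

args = TrainingArguments(
    output_dir="otad-bert",
    num_train_epochs=50,                 # matches the 50-epoch runs in the study
    per_device_train_batch_size=16,      # illustrative
    learning_rate=2e-5,                  # illustrative
    weight_decay=0.01,                   # illustrative
    evaluation_strategy="epoch",         # evaluate after every epoch
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=val_ds)
trainer.train()
```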
Phase 5: Performance Evaluation & Deployment
Evaluate model performance through training and validation loss, identify the best-performing model, and prepare the fine-tuned model and tokenizer for potential real-time deployment in IoT intrusion detection systems.
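Selecting the best-performing checkpoint reduces to tracking the epoch with the lowest validation loss, which is how a result like BERT's 0.0677 at epoch 49 would be identified. The loss values below are illustrative, not the study's numbers.

```python
# Minimal sketch of model selection by validation loss.
# Loss values are illustrative only.
def best_epoch(val_losses):
    """Return (epoch, loss) for the minimum validation loss; epochs are 1-based."""
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]

losses = [0.91, 0.42, 0.18, 0.21, 0.07, 0.09]
epoch, loss = best_epoch(losses)
# -> epoch 5, loss 0.07
```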
Ready to Transform Your IoT Security?
Leverage cutting-edge Large Transformer Models to build a resilient and intelligent IDS for your resource-constrained IoT infrastructure.