Enterprise AI Analysis: Open-set convolutional neural network for infrared spectral classification of environmentally sourced microplastics

This research introduces an open-set convolutional neural network (CNN) for classifying infrared spectra of environmentally sourced microplastics (MPs). To generalize across real monitoring conditions, the model is trained on spectra from diverse sources and uses a targeted data augmentation strategy that covers varying spectral ranges. It also incorporates OpenMax to identify unknown MP classes, which are common in real-world environmental samples. At the optimal uncertainty threshold (0.87 ± 0.01), the model achieves 93.1% overall accuracy across known and unknown classes. On known classes it reaches 96.0% accuracy on Test Set I, compared with 47.6% and 56.4% for the closed-set 1D and 2D CNN benchmarks. The approach offers a flexible, generalizable solution for MP identification, with potential applications beyond environmental science, such as food safety and pharmaceutical quality control.

Key Metrics & Impact

Explore the critical performance indicators demonstrating the advanced capabilities of this AI model.

93.1% Overall Classification Accuracy
Up to 4.6% Accuracy Improvement from Targeted Augmentation

Deep Analysis & Enterprise Applications

OpenMax is integrated into the CNN to identify 'unknown' microplastic classes: those absent from the training data but common in real-world environmental samples. Unlike traditional closed-set models, which force every input into a predefined category, OpenMax uses Weibull-calibrated probabilities to separate known classes from genuine unknowns. This makes the model more robust in environmental monitoring, where novel or rare polymer types would otherwise be misclassified as a known class.
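A minimal sketch of the OpenMax recalibration step is given below, assuming the per-class Weibull parameters have already been fitted. The function names, the parameter values, and `alpha` (the number of top-ranked classes to revise) are illustrative and not taken from the study; the revision rule follows the general OpenMax formulation rather than the authors' exact code, and it assumes positive activations.

```python
import math

def weibull_cdf(distance, shape, scale):
    """CDF of a two-parameter Weibull distribution at a non-negative distance."""
    if distance <= 0:
        return 0.0
    return 1.0 - math.exp(-((distance / scale) ** shape))

def openmax_probabilities(activations, distances, weibull_params, alpha=3):
    """Return [P(unknown), P(class 0), P(class 1), ...].

    activations    : one positive activation score per known class
    distances      : distance from the input to each class's mean activation vector
    weibull_params : (shape, scale) per class, assumed already fitted
    alpha          : number of top-ranked classes whose scores are revised
    """
    n = len(activations)
    ranked = sorted(range(n), key=lambda i: activations[i], reverse=True)
    weights = [1.0] * n
    for rank, i in enumerate(ranked[:alpha]):
        shape, scale = weibull_params[i]
        outlier_prob = weibull_cdf(distances[i], shape, scale)
        # The top-ranked class is discounted most strongly.
        weights[i] = 1.0 - ((alpha - rank) / alpha) * outlier_prob
    revised = [a * w for a, w in zip(activations, weights)]
    # Score mass removed from known classes is assigned to "unknown".
    unknown = sum(a * (1.0 - w) for a, w in zip(activations, weights))
    scores = [unknown] + revised
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values for three known classes.
params = [(2.0, 1.0)] * 3
near = openmax_probabilities([8.0, 2.0, 1.0], [0.1, 0.1, 0.1], params)
far = openmax_probabilities([8.0, 2.0, 1.0], [3.0, 3.0, 3.0], params)
print(near.index(max(near)))  # index 1: the first known class
print(far.index(max(far)))    # index 0: the unknown slot
```

An input close to its class center keeps its original ranking, while an input far from every class center has most of its score moved to the unknown slot.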

A targeted data augmentation strategy was developed to improve model performance across varying infrared spectral ranges. This 'Type I' augmentation reuses broad-range spectra by selecting narrower subsets, simulating instruments that record different spectral widths. It boosted classification accuracy by up to 4.6%, especially when fewer spectral variables were available. The technique directly addresses the effect of input-length disparity on accuracy, keeping the CNN robust across different data collection methods.

A well-generalized model requires substantial intra-class diversity in the training data. The study sourced microplastic spectra from multiple origins (e.g., OpenSpecy, lab-collected) to capture a wide variety of real-world conditions, including different instruments, aging stages, and surface contaminations. Quantitative analysis using Euclidean distance confirmed that this dataset has higher intra-class diversity than the datasets used in benchmark studies. As a result, the model learns features that generalize to unseen, environmentally aged, and varied microplastic samples.
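One way to put a number on intra-class diversity is the mean pairwise Euclidean distance between spectra of the same class. The sketch below assumes that formulation; the page only states that Euclidean distance was used, so the exact metric in the study may differ, and the function names are illustrative.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equal-length spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_intra_class_distance(spectra):
    """Average distance over all unordered pairs of spectra in one class."""
    pairs = list(combinations(spectra, 2))
    if not pairs:
        return 0.0
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)

# Toy two-point "spectra" for a single class.
toy_class = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
print(mean_intra_class_distance(toy_class))
```

A larger value means the spectra within a class are more spread out, which is the property the study compares against benchmark datasets.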

The OpenMax-enhanced CNN achieved 93.1% overall accuracy, balancing identification of known and unknown classes. The study also used Shapley Additive Explanations (SHAP) to interpret model decisions, particularly misclassifications. SHAP analysis showed that the model sometimes over-relies on a few localized spectral intervals rather than global features, which leads to high-confidence misclassifications when unknown spectra share those local patterns. This insight informs future strategies for making the model more robust to ambiguous samples.

93.1% Overall Classification Accuracy for both known and unknown classes.

Open-Set CNN Methodology Flow

Diverse Data Collection (OpenSpecy, Lab)
Targeted Data Augmentation (Type I & II)
1D CNN Model Training (18 Known Classes)
OpenMax Integration (Weibull Calibration)
Uncertainty Threshold Application (Optimal 0.87 ± 0.01)
Robust Classification (Known & Unknown MPs)
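The "OpenMax Integration (Weibull Calibration)" step needs, for each known class, the distances between training samples and the class center. The sketch below computes cosine distances to a mean activation vector and keeps the largest ones as the tail for Weibull fitting. Cosine distance and a tail size of 35 are the settings named in Phase 3 of the roadmap on this page; the function names and toy vectors are assumptions, and non-zero vectors are assumed.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; both vectors are assumed to be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def tail_distances(activation_vectors, tail_size=35):
    """Largest `tail_size` distances from one class's samples to its mean vector."""
    center = mean_vector(activation_vectors)
    distances = sorted(cosine_distance(v, center) for v in activation_vectors)
    return distances[-tail_size:]

# Toy activation vectors for one class (hypothetical values).
toy_activations = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
tail = tail_distances(toy_activations, tail_size=2)
print(tail)
```

The returned tail is what a Weibull distribution would then be fitted to, one fit per known class.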

Model Performance vs. Benchmarks

| Model Type | Known-Class Accuracy (Test Set I) | Note |
| --- | --- | --- |
| Our SoftMax model | 97.3% | |
| Our OpenMax model | 96.0% | |
| Closed-set 1D CNN (pretrained) | 47.6% | Benchmark model for 1D CNN by Liu et al. (ref. 23). |
| Closed-set 2D CNN (reproduced) | 56.4% | Benchmark model for 2D CNN by Zhu et al. (ref. 35). |

Addressing Spectral Range Variability

Problem: Traditional CNN models trained on fixed spectral ranges struggle with real-world microplastic data collected from diverse instruments, which often produce spectra of varying widths. This input-length disparity can significantly compromise classification accuracy, especially for narrower spectral inputs.

Solution: The study introduced a novel 'Type I' data augmentation strategy. This involved reusing spectra with broad spectral ranges by selecting narrower, specific subsets, effectively simulating the different spectral widths encountered in practice. This allowed the model to be trained on a more diverse set of input lengths.

Result: The targeted data augmentation significantly improved model robustness, particularly for spectrally limited inputs. Classification accuracy saw an increase of up to 4.6% in scenarios with fewer spectral variables. This demonstrates a practical solution for integrating data from diverse instruments and sources, making the CNN model adaptable to varied real-world spectral conditions.
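The Type I idea can be sketched as follows, assuming spectra are stored on a shared wavenumber grid and that a narrower instrument range is emulated by zero-filling points outside a chosen window. The zero-fill choice, the window limits, and the function names are assumptions for illustration; the page does not say how the study represents the narrower subsets.

```python
def crop_to_range(wavenumbers, intensities, low, high, fill=0.0):
    """Keep intensities inside [low, high] cm^-1 and replace the rest with `fill`."""
    return [y if low <= w <= high else fill
            for w, y in zip(wavenumbers, intensities)]

def type_one_augment(wavenumbers, intensities, windows):
    """Create one narrower-range copy of a broad-range spectrum per window."""
    return [crop_to_range(wavenumbers, intensities, low, high)
            for low, high in windows]

# Toy broad-range spectrum and two hypothetical instrument windows.
grid = [400, 1300, 2200, 3100, 4000]
spectrum = [0.1, 0.5, 0.9, 0.4, 0.2]
copies = type_one_augment(grid, spectrum, [(650, 4000), (1000, 3200)])
for copy in copies:
    print(copy)
```

Each copy keeps the original vector length, so the same network input layer can be trained on broad and narrow spectral ranges together.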

Your AI Implementation Roadmap

A phased approach to integrate open-set CNN capabilities for microplastic classification into your operations.

Phase 1: Data Acquisition & Preprocessing

Consolidate diverse microplastic spectral datasets from OpenSpecy, lab-collected samples, and benchmark studies. Implement a data cleaning pipeline to remove ambiguous entries. Apply min-max normalization and standardize the resolution of all spectra (451-dimensional vectors). This phase yields a rich, clean, and harmonized dataset for training.
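A preprocessing step of this kind might look like the sketch below: linear interpolation onto a fixed 451-point grid followed by min-max normalization. The grid limits (400 to 4000 cm^-1 in steps of 8) are an assumption chosen only because they give 451 points; the page states the vector length but not the grid, and clamping to the edge values outside the measured range is likewise an illustrative choice.

```python
from bisect import bisect_left

def resample_linear(x, y, new_x):
    """Linearly interpolate (x, y) onto new_x; x must be strictly increasing.
    Queries outside the measured range take the nearest edge value."""
    out = []
    for q in new_x:
        if q <= x[0]:
            out.append(y[0])
        elif q >= x[-1]:
            out.append(y[-1])
        else:
            j = bisect_left(x, q)  # x[j-1] < q <= x[j]
            t = (q - x[j - 1]) / (x[j] - x[j - 1])
            out.append(y[j - 1] + t * (y[j] - y[j - 1]))
    return out

def min_max_normalize(values):
    """Scale values to [0, 1]; a flat spectrum maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Assumed target grid: 451 points from 400 to 4000 cm^-1.
target_grid = [400 + 8 * i for i in range(451)]

# Toy three-point spectrum.
raw_x = [400, 2200, 4000]
raw_y = [2.0, 6.0, 4.0]
vector = min_max_normalize(resample_linear(raw_x, raw_y, target_grid))
print(len(vector), vector[0], vector[225], vector[450])
```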

Phase 2: Targeted Data Augmentation & Model Training

Implement Type I augmentation by reusing broad-range spectra to generate narrower subsets, enhancing robustness across varying spectral ranges. Apply Type II augmentation (noise, scaling, baseline shifts) to expand data diversity. Train the 1D CNN model on 18 known classes, utilizing batch normalization and dropout to prevent overfitting. Monitor cross-entropy loss with EarlyStopping on Validation Set I.
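Type II augmentation can be sketched as a random intensity scaling, a constant baseline shift, and additive Gaussian noise. All magnitudes below (`noise_std`, `scale_range`, `max_baseline_shift`) are hypothetical defaults, not values from the study, and a constant offset is only one simple way to model a baseline shift.

```python
import random

def type_two_augment(spectrum, rng, noise_std=0.01,
                     scale_range=(0.9, 1.1), max_baseline_shift=0.05):
    """Return a perturbed copy of `spectrum`.

    rng : a random.Random instance, so results are reproducible
    """
    scale = rng.uniform(*scale_range)                             # intensity scaling
    shift = rng.uniform(-max_baseline_shift, max_baseline_shift)  # baseline offset
    return [scale * v + shift + rng.gauss(0.0, noise_std) for v in spectrum]

# Toy spectrum; a fixed seed makes the example repeatable.
spectrum = [0.0, 0.2, 0.8, 1.0, 0.3]
augmented = type_two_augment(spectrum, random.Random(0))
print(augmented)
```

Passing the generator explicitly keeps augmentation runs reproducible, which helps when comparing models trained with and without a given perturbation.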

Phase 3: Open-Set Integration & Threshold Optimization

Integrate OpenMax into the trained SoftMax model. Conduct a grid search to select the distance measure (cosine distance was chosen) and the tail size (35 was chosen) for Weibull distribution fitting. Perform 10-fold cross-validation on Validation Set II to determine the optimal uncertainty threshold (0.87 ± 0.01) that balances accuracy for both known and unknown classes.
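The threshold search can be illustrated with a simple sweep. The sketch assumes that a prediction is rejected as unknown when its top OpenMax probability falls below the threshold, and that "balancing" means maximizing the lower of the known-class and unknown-class accuracies. Both are assumptions, as are the toy samples and candidate thresholds; the study's own criterion and its 10-fold procedure are not reproduced here.

```python
def classify(top_prob, predicted_label, threshold):
    """Reject a prediction as 'unknown' when its top probability is too low."""
    return predicted_label if top_prob >= threshold else "unknown"

def accuracies(samples, threshold):
    """Return (known-class accuracy, unknown-class accuracy).

    samples : (top_prob, predicted_label, true_label) tuples; the set is
              assumed to contain both known and unknown examples
    """
    known_hits = known_total = unknown_hits = unknown_total = 0
    for top_prob, predicted, truth in samples:
        decision = classify(top_prob, predicted, threshold)
        if truth == "unknown":
            unknown_total += 1
            unknown_hits += decision == "unknown"
        else:
            known_total += 1
            known_hits += decision == truth
    return known_hits / known_total, unknown_hits / unknown_total

def best_threshold(samples, candidates):
    """Pick the candidate that maximizes the weaker of the two accuracies."""
    return max(candidates, key=lambda t: min(accuracies(samples, t)))

# Toy validation samples (hypothetical).
samples = [
    (0.99, "PE", "PE"), (0.95, "PP", "PP"), (0.90, "PS", "PS"),
    (0.80, "PE", "unknown"), (0.60, "PP", "unknown"), (0.92, "PS", "unknown"),
]
chosen = best_threshold(samples, [0.5, 0.7, 0.85, 0.93, 0.97])
print(chosen, accuracies(samples, chosen))
```

In a cross-validated setting the same sweep would be repeated per fold and the chosen thresholds summarized, which is one way a value such as 0.87 ± 0.01 could arise.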

Phase 4: Model Evaluation & Interpretability

Evaluate the OpenMax model's performance on a combined test set of known and unknown classes. Analyze the confusion matrix, classification report, and F1 scores. Use Shapley Additive Explanations (SHAP) to interpret misclassifications and identify the spectral intervals that most influence model decisions. Refine the model based on these interpretability insights.
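For the evaluation step, per-class F1 scores can be computed directly from true and predicted labels, treating "unknown" as one more class. This is a generic sketch with toy labels, not the study's evaluation code, and it leaves out the SHAP analysis.

```python
from collections import Counter

def per_class_f1(true_labels, predicted_labels):
    """Return {label: F1}, where F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores

# Toy labels; "unknown" is scored like any other class.
truth = ["PE", "PE", "PP", "unknown", "unknown"]
preds = ["PE", "PP", "PP", "unknown", "PE"]
f1 = per_class_f1(truth, preds)
print(f1)
```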

Ready to Transform Your Operations?

Book a complimentary consultation to explore how our advanced AI solutions can drive efficiency and innovation in your enterprise.
