Enterprise AI Analysis
Deep Learning Models for Coral Bleaching Classification in Multi-Condition Underwater Image Datasets
Julio Jerison E. Macrohon, PhD, Gordon Hung
Executive Impact Summary
This research introduces a novel machine-learning framework for classifying coral bleaching in multi-condition underwater images. By leveraging a diverse global dataset with samples from deep seas, marshes, and coastal zones, the study benchmarks and compares state-of-the-art models including Residual Neural Network (ResNet), Vision Transformer (ViT), and Convolutional Neural Network (CNN). After comprehensive hyperparameter tuning, the CNN model achieved the highest accuracy of 88%, outperforming existing benchmarks. This robust framework offers significant insights for autonomous coral monitoring, demonstrating how deep learning can address urgent environmental challenges in marine ecosystems without requiring substantial computational power.
Deep Analysis & Enterprise Applications
Coral reefs support numerous marine organisms and are an important source of coastal protection from storms and floods, representing a major part of marine ecosystems. However, coral reefs face increasing threats from pollution, ocean acidification, and sea temperature anomalies, making efficient monitoring and protection urgent. Therefore, this study presents a novel machine-learning-based coral bleaching classification system built on a diverse global dataset with samples of healthy and bleached corals under varying environmental conditions, including deep seas, marshes, and coastal zones. We benchmarked and compared three state-of-the-art models: Residual Neural Network (ResNet), Vision Transformer (ViT), and Convolutional Neural Network (CNN). After comprehensive hyperparameter tuning, the CNN model achieved the highest accuracy of 88%, outperforming existing benchmarks. Our findings offer important insights into autonomous coral monitoring and present a comprehensive analysis of the most widely used computer vision models.
Coral reefs are marine ecosystems made up of colonies of invertebrates called corals. They are usually found in tropical and subtropical seas, offering protection, food, and feeding grounds for a large number of fish populations [1]. For example, the Great Barrier Reef, the largest living thing on the planet, is home to over 9,000 known species [2]. Furthermore, according to the United Nations Environment Programme (UNEP), coral reefs cover less than 0.1 percent of the ocean but support over 25 percent of all marine creatures, hence playing a crucial role in ensuring biodiversity [3].
Coral bleaching occurs when corals expel their zooxanthellae algae due to environmental stress, causing the corals to turn white. Bleached corals are susceptible to disease and suffer reproductive issues that often result in death [4, 5, 6]. These mass bleaching events are heavily disruptive to the marine ecosystem. Furthermore, coral reefs are essential to industries such as tourism and fishing, while also offering protection against dangerous storms and waves. In recent years, coral bleaching events have been increasing at an exponential rate [7]. Studies have shown that over 50 percent of the corals in the Great Barrier Reef experienced severe bleaching from 2016 to 2017 [8]. Furthermore, reports indicate that the occurrence of bleaching is likely to increase with global warming and ocean acidification.
Objectives
- Presenting a comprehensive analysis of the accuracy of ResNet, ViT, and CNN.
- Demonstrating a robust framework for accurate classification in multi-condition coral imagery with minimal computational power requirements.
- Offering a detailed comparison between our results and existing benchmarks.
Data
The dataset for this study was collected from Flickr using the Flickr API. It consists of a total of 923 labeled multi-condition underwater images, 438 of which are healthy and 485 of which are bleached, indicating minimal data imbalance. This dataset consists of a wide variety of coral species under different conditions, presenting a greater challenge to our deep learning models. The images were resized to a maximum of 300 pixels for both height and width. After preprocessing, the dataset was split into three distinct sets: training (70%), validation (15%), and testing (15%).
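The resizing and 70/15/15 split described above can be reproduced with a short preprocessing script. The following is a minimal sketch, assuming a torchvision `ImageFolder` directory layout (e.g. `data/corals/healthy`, `data/corals/bleached`) and a fixed 300x300 resize; both the folder layout and the exact resize policy are assumptions rather than details confirmed by the study.

```python
# Minimal preprocessing sketch, not the study's actual pipeline.
# Assumptions: images live in class folders under data/corals/, and the
# "maximum of 300 px" resize is approximated by a fixed 300x300 resize.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((300, 300)),  # resize policy is an assumption
    transforms.ToTensor(),
])

dataset = datasets.ImageFolder("data/corals", transform=transform)

n_total = len(dataset)
n_train = int(0.70 * n_total)
n_val = int(0.15 * n_total)
n_test = n_total - n_train - n_val  # remaining ~15%

train_set, val_set, test_set = random_split(
    dataset,
    [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(42),  # reproducible split
)
```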
Evaluation Metrics
To thoroughly assess our models, we utilized four common evaluation measures: precision, recall, F1 score, and accuracy. Precision measures how many of the predicted positive cases were truly correct, indicating fewer false positives.
Precision = TP / (TP + FP)
Recall measures how many actual bleaching cases were correctly identified, indicating fewer false negatives.
Recall = TP / (TP + FN)
F1-score balances precision and recall by including both false positives and false negatives, with values ranging from 0 to 1 (1 indicating perfect precision and recall).
F1-Score = 2 * (Precision * Recall) / (Precision + Recall)
Accuracy measures the proportion of all instances in the dataset that are correctly predicted, giving an overall measure of model performance.
Accuracy = (TP + TN) / Total Samples
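The four formulas above map directly onto a few lines of NumPy. The sketch below uses toy labels (1 = bleached, 0 = healthy) purely for illustration and is not tied to the study's data.

```python
# Computing precision, recall, F1-score, and accuracy from binary labels.
# Toy example only; 1 = bleached, 0 = healthy.
import numpy as np

def binary_metrics(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, f1, accuracy

# Example usage with toy predictions
print(binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```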
Classification Models
Residual Neural Network (ResNet)
ResNet is a deep learning neural network built on residual connections, making it robust to the vanishing gradient problem. It utilizes skip connections to pass gradients through numerous layers, enabling the training of much deeper networks. Rather than learning a direct transformation y = F(x), the network learns the residual, so the output is represented as y = x + F(x). While effective for very deep networks, it can be memory intensive and prone to overfitting.
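The identity-plus-residual formulation is easiest to see in a single block. The sketch below is a generic residual block in PyTorch, not the exact bottleneck design of the ResNet-50 used in the study.

```python
# Minimal sketch of a residual block illustrating y = x + F(x).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # F(x): two 3x3 convolutions with batch normalization
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # Skip connection: gradients also flow directly through the identity path
        return torch.relu(x + self.f(x))
```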
Vision Transformer (ViT)
Unlike traditional convolutional architectures, ViT processes an image as a sequence of patches and applies a self-attention mechanism to capture long-range dependencies. An input image is divided into non-overlapping patches, which are flattened and linearly projected into an embedding space. The model then applies multi-head self-attention. The final classification is performed using a fully connected layer. ViT captures global relationships between patches but requires large datasets for training, as it lacks the built-in inductive biases of convolutions.
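The patch-and-project step can be illustrated with a small PyTorch module. The patch size and embedding dimension below are illustrative assumptions, not the configuration reported in the study.

```python
# Minimal sketch of ViT-style patch embedding.
import torch.nn as nn

class PatchEmbedding(nn.Module):
    def __init__(self, img_size=224, patch_size=16, in_ch=3, embed_dim=768):
        super().__init__()
        self.n_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to flattening each patch
        # and applying a shared linear projection.
        self.proj = nn.Conv2d(in_ch, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                      # x: (B, 3, H, W)
        x = self.proj(x)                       # (B, embed_dim, H/P, W/P)
        return x.flatten(2).transpose(1, 2)    # (B, n_patches, embed_dim)

# The resulting patch sequence is then fed to transformer encoder blocks
# (multi-head self-attention + MLP) and a classification head.
```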
Convolutional Neural Network (CNN)
CNNs process visual data using layers of convolutional filters that extract hierarchical features. The fundamental operation is convolution, where an image is processed using a kernel to capture spatial patterns. A non-linearity (ReLU) is applied, and pooling layers (e.g., max pooling) reduce spatial dimensions. Extracted features are then flattened and passed through fully connected layers for classification. CNNs have shown remarkable success in vision tasks due to their ability to learn spatial hierarchies efficiently.
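A compact CNN classifier of this kind might look like the sketch below; the layer sizes are illustrative assumptions and do not reproduce the study's tuned architecture.

```python
# Minimal sketch of a small CNN for binary coral classification
# (bleached vs. healthy). Layer sizes are illustrative assumptions.
import torch.nn as nn

class SmallCoralCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # halve spatial dimensions
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),              # global pooling -> (B, 32, 1, 1)
            nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```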
Performance Evaluations
After hyperparameter tuning and model training, the three models were evaluated using the standard metrics described above. The results are summarized in Table 1:
TABLE 1. PERFORMANCE EVALUATIONS

| Model | Precision | Recall | F1-Score | Accuracy |
|---|---|---|---|---|
| ResNet-50 | 0.86 | 0.86 | 0.86 | 0.86 |
| ViT | 0.64 | 0.64 | 0.64 | 0.64 |
| CNN | 0.89 | 0.88 | 0.88 | 0.88 |

As seen from Table 1, a standard CNN achieved a superior accuracy of 88%, outperforming ResNet-50 (86%) and ViT (64%). Standard CNNs are highly effective at capturing local spatial hierarchies and dependencies, which is beneficial for many computer vision tasks. ResNet-50, while deeper, relies on residual connections that may not be as beneficial depending on the dataset characteristics. ViT, which relies on self-attention, struggled to capture local features without extensive tuning, leading to lower performance.
Confusion Matrices & ROC Curve
Confusion matrices for ResNet-50, ViT, and CNN were generated. All three models demonstrated a greater capability to identify bleached corals compared to healthy corals. The performance of ViT significantly lagged behind ResNet-50 and CNN.
The Receiver Operating Characteristic (ROC) curve further illustrated the models' ability to distinguish between classes. CNN achieved the highest Area Under the Curve (AUC) value of 0.96, closely followed by ResNet-50 with an AUC of 0.95. ViT had the lowest AUC of 0.75, confirming its weaker performance in this context.
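Given per-image probabilities for the bleached class, the ROC curve and AUC can be computed with scikit-learn. The scores in the sketch below are toy values, not the study's outputs.

```python
# Minimal sketch of ROC/AUC computation from predicted probabilities.
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                     # 1 = bleached, 0 = healthy
y_score = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3]    # predicted P(bleached)

fpr, tpr, thresholds = roc_curve(y_true, y_score)     # points on the ROC curve
auc = roc_auc_score(y_true, y_score)                  # area under that curve
print(f"AUC = {auc:.2f}")
```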
Building upon existing literature and state-of-the-art models and techniques, this study presents a robust and comprehensive framework for accurately classifying coral bleaching in multi-condition underwater images. Specifically, we employed three deep learning computer vision models (ResNet-50, ViT, and CNN) to classify bleached and healthy corals. For a comprehensive evaluation, we utilized four standard metrics: precision, recall, F1-score, and accuracy. Our top model, the CNN, achieved an accuracy of 88%, demonstrating superior performance compared to previous studies. Moreover, our proposed framework is lightweight and flexible, offering invaluable insights for researchers and biologists alike.
Coral Bleaching Classification Process
| Model | Precision | Recall | F1-Score | Accuracy | Key Advantage | Primary Limitation |
|---|---|---|---|---|---|---|
| ResNet-50 | 0.86 | 0.86 | 0.86 | 0.86 | Skip connections mitigate the vanishing gradient problem, enabling very deep networks | Memory intensive and prone to overfitting |
| ViT | 0.64 | 0.64 | 0.64 | 0.64 | Self-attention captures global relationships between patches | Requires large datasets; lacks the built-in inductive biases of convolutions |
| CNN | 0.89 | 0.88 | 0.88 | 0.88 | Learns local spatial hierarchies efficiently with minimal computational power | Required comprehensive hyperparameter tuning to reach peak accuracy |
Your AI Implementation Roadmap
Embark on a structured journey to integrate cutting-edge AI. Our phased approach ensures seamless adoption, minimal disruption, and maximum impact across your enterprise.
Discovery & Strategy
Comprehensive analysis of current operations, identification of AI opportunities, and development of a tailored strategy aligned with your business objectives.
Pilot & Prototyping
Development of proof-of-concept solutions, rapid prototyping, and real-world testing in a controlled environment to validate effectiveness and refine models.
Full-Scale Integration
Seamless deployment of AI solutions across your enterprise infrastructure, including data migration, system integrations, and performance optimization.
Monitoring & Optimization
Continuous monitoring of AI system performance, regular updates, and iterative enhancements to ensure sustained efficiency and evolving capabilities.
Ready to Transform Your Enterprise with AI?
Connect with our AI specialists to explore how these insights can be practically applied to drive innovation and efficiency within your organization.