Enterprise AI Analysis
Deep Learning Models for Coral Bleaching Classification in Multi-Condition Underwater Image Datasets
Julio Jerison E. Macrohon, PhD, Gordon Hung
Executive Impact Summary
This research introduces a novel machine-learning framework for classifying coral bleaching in multi-condition underwater images. By leveraging a diverse global dataset with samples from deep seas, marshes, and coastal zones, the study benchmarks and compares state-of-the-art models including Residual Neural Network (ResNet), Vision Transformer (ViT), and Convolutional Neural Network (CNN). After comprehensive hyperparameter tuning, the CNN model achieved the highest accuracy of 88%, outperforming existing benchmarks. This robust framework offers significant insights for autonomous coral monitoring, demonstrating how deep learning can address urgent environmental challenges in marine ecosystems without requiring substantial computational power.
Deep Analysis & Enterprise Applications
Coral reefs support numerous marine organisms and are an important source of coastal protection from storms and floods, representing a major part of marine ecosystems. However, coral reefs face increasing threats from pollution, ocean acidification, and sea temperature anomalies, making efficient monitoring and protection urgent. Therefore, this study presents a novel machine-learning-based coral bleaching classification system built on a diverse global dataset with samples of healthy and bleached corals under varying environmental conditions, including deep seas, marshes, and coastal zones. We benchmarked and compared three state-of-the-art models: Residual Neural Network (ResNet), Vision Transformer (ViT), and Convolutional Neural Network (CNN). After comprehensive hyperparameter tuning, the CNN model achieved the highest accuracy of 88%, outperforming existing benchmarks. Our findings offer important insights into autonomous coral monitoring and present a comprehensive analysis of the most widely used computer vision models.
Coral reefs are marine ecosystems made up of colonies of invertebrates called corals. They are usually found in tropical and subtropical seas, offering protection, food, and feeding grounds for a large number of fish populations [1]. For example, the Great Barrier Reef, the largest living thing on the planet, is home to over 9,000 known species [2]. Furthermore, according to the United Nations Environment Programme (UNEP), coral reefs cover less than 0.1 percent of the ocean but support over 25 percent of all marine creatures, hence playing a crucial role in ensuring biodiversity [3].
Coral bleaching occurs when corals expel their zooxanthellae algae due to environmental stress, causing the corals to turn white. Bleached corals are susceptible to disease and suffer reproductive issues that often result in death [4, 5, 6]. These mass bleaching events are heavily disruptive to the marine ecosystem. Furthermore, coral reefs are essential to industries such as tourism and fishing, while also offering protection against dangerous storms and waves. In recent years, coral bleaching events have been increasing at an exponential rate [7]. Studies have shown that over 50 percent of the corals in the Great Barrier Reef experienced severe bleaching from 2016 to 2017 [8]. Furthermore, reports indicate that the occurrence of bleaching is likely to increase with global warming and ocean acidification.
Objectives
- Presenting a comprehensive analysis of the accuracy of ResNet, ViT, and CNN.
- Demonstrating a robust framework for accurate classification in multi-condition coral imagery with minimal computational power requirements.
- Offering a detailed comparison between our results and existing benchmarks.
Data
The dataset for this study was collected from Flickr using the Flickr API. It consists of a total of 923 labeled multi-condition underwater images, 438 of which are healthy and 485 of which are bleached, indicating minimal data imbalance. This dataset consists of a wide variety of coral species under different conditions, presenting a greater challenge to our deep learning models. The images were resized to a maximum of 300 pixels for both height and width. After preprocessing, the dataset was split into three distinct sets: training (70%), validation (15%), and testing (15%).
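The resizing and 70/15/15 split described above can be reproduced with a short preprocessing script. The following is a minimal sketch, assuming a torchvision `ImageFolder` directory layout (e.g. `data/corals/healthy`, `data/corals/bleached`) and a fixed 300x300 resize; both the folder layout and the exact resize policy are assumptions rather than details confirmed by the study.

```python
# Minimal preprocessing sketch, not the study's actual pipeline.
# Assumptions: images live in class folders under data/corals/, and the
# "maximum of 300 px" resize is approximated by a fixed 300x300 resize.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((300, 300)),  # resize policy is an assumption
    transforms.ToTensor(),
])

dataset = datasets.ImageFolder("data/corals", transform=transform)

n_total = len(dataset)
n_train = int(0.70 * n_total)
n_val = int(0.15 * n_total)
n_test = n_total - n_train - n_val  # remaining ~15%

train_set, val_set, test_set = random_split(
    dataset,
    [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(42),  # reproducible split
)
```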
Evaluation Metrics
To thoroughly assess our models, we utilized four common evaluation measures: precision, recall, F1 score, and accuracy. Precision measures how many of the predicted positive cases were truly correct, indicating fewer false positives.
Precision = TP / (TP + FP)
Recall measures how many actual bleaching cases were correctly identified, indicating fewer false negatives.
Recall = TP / (TP + FN)
F1-score balances precision and recall by including both false positives and false negatives, with values ranging from 0 to 1 (1 indicating perfect precision and recall).
F1-Score = 2 * (Precision * Recall) / (Precision + Recall)
Accuracy measures the proportion of all instances in the dataset that are correctly predicted, giving an overall measure of model performance.
Accuracy = (TP + TN) / Total Samples
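The four formulas above map directly onto a few lines of NumPy. The sketch below uses toy labels (1 = bleached, 0 = healthy) purely for illustration and is not tied to the study's data.

```python
# Computing precision, recall, F1-score, and accuracy from binary labels.
# Toy example only; 1 = bleached, 0 = healthy.
import numpy as np

def binary_metrics(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, f1, accuracy

# Example usage with toy predictions
print(binary_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
```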
Classification Models
Residual Neural Network (ResNet)
ResNet is a deep learning neural network built on residual connections, making it robust to the vanishing gradient problem. It utilizes skip connections to pass gradients through numerous layers, enabling the training of much deeper networks. Rather than learning a direct transformation y = F(x), the network learns the residual, so the output is represented as y = x + F(x). While effective for very deep networks, it can be memory intensive and prone to overfitting.
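The identity-plus-residual formulation is easiest to see in a single block. The sketch below is a generic residual block in PyTorch, not the exact bottleneck design of the ResNet-50 used in the study.

```python
# Minimal sketch of a residual block illustrating y = x + F(x).
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # F(x): two 3x3 convolutions with batch normalization
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # Skip connection: gradients also flow directly through the identity path
        return torch.relu(x + self.f(x))
```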
Vision Transformer (ViT)
Unlike traditional convolutional architectures, ViT processes an image as a sequence of patches and applies a self-attention mechanism to capture long-range dependencies. An input image is divided into non-overlapping patches, which are flattened and linearly projected into an embedding space. The model then applies multi-head self-attention. The final classification is performed using a fully connected layer. ViT captures global relationships between patches but requires large datasets for training, as it lacks the built-in inductive biases of convolutions.
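The patch-and-project step can be illustrated with a small PyTorch module. The patch size and embedding dimension below are illustrative assumptions, not the configuration reported in the study.

```python
# Minimal sketch of ViT-style patch embedding.
import torch.nn as nn

class PatchEmbedding(nn.Module):
    def __init__(self, img_size=224, patch_size=16, in_ch=3, embed_dim=768):
        super().__init__()
        self.n_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to flattening each patch
        # and applying a shared linear projection.
        self.proj = nn.Conv2d(in_ch, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                      # x: (B, 3, H, W)
        x = self.proj(x)                       # (B, embed_dim, H/P, W/P)
        return x.flatten(2).transpose(1, 2)    # (B, n_patches, embed_dim)

# The resulting patch sequence is then fed to transformer encoder blocks
# (multi-head self-attention + MLP) and a classification head.
```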
Convolutional Neural Network (CNN)
CNNs process visual data using layers of convolutional filters that extract hierarchical features. The fundamental operation is convolution, where an image is processed using a kernel to capture spatial patterns. A non-linearity (ReLU) is applied, and pooling layers (e.g., max pooling) reduce spatial dimensions. Extracted features are then flattened and passed through fully connected layers for classification. CNNs have shown remarkable success in vision tasks due to their ability to learn spatial hierarchies efficiently.
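A compact CNN classifier of this kind might look like the sketch below; the layer sizes are illustrative assumptions and do not reproduce the study's tuned architecture.

```python
# Minimal sketch of a small CNN for binary coral classification
# (bleached vs. healthy). Layer sizes are illustrative assumptions.
import torch.nn as nn

class SmallCoralCNN(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # halve spatial dimensions
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),              # global pooling -> (B, 32, 1, 1)
            nn.Flatten(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```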
Performance Evaluations
After hyperparameter tuning and model training, the three models were evaluated using the standard metrics described above. The results are summarized in Table 1:
TABLE 1. PERFORMANCE EVALUATIONS

| Model | Precision | Recall | F1-Score | Accuracy |
|---|---|---|---|---|
| ResNet-50 | 0.86 | 0.86 | 0.86 | 0.86 |
| ViT | 0.64 | 0.64 | 0.64 | 0.64 |
| CNN | 0.89 | 0.88 | 0.88 | 0.88 |

As seen from Table 1, a standard CNN achieved a superior accuracy of 88%, outperforming ResNet-50 (86%) and ViT (64%). Standard CNNs are highly effective at capturing local spatial hierarchies and dependencies, which is beneficial for many computer vision tasks. ResNet-50, while deeper, relies on residual connections that may not be as beneficial depending on the dataset characteristics. ViT, which relies on self-attention, struggled to capture local features without extensive tuning, leading to lower performance.
Confusion Matrices & ROC Curve
Confusion matrices for ResNet-50, ViT, and CNN were generated. All three models demonstrated a greater capability to identify bleached corals compared to healthy corals. The performance of ViT significantly lagged behind ResNet-50 and CNN.
The Receiver Operating Characteristic (ROC) curve further illustrated the models' ability to distinguish between classes. CNN achieved the highest Area Under the Curve (AUC) value of 0.96, closely followed by ResNet-50 with an AUC of 0.95. ViT had the lowest AUC of 0.75, confirming its weaker performance in this context.
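Given per-image probabilities for the bleached class, the ROC curve and AUC can be computed with scikit-learn. The scores in the sketch below are toy values, not the study's outputs.

```python
# Minimal sketch of ROC/AUC computation from predicted probabilities.
from sklearn.metrics import roc_curve, roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                     # 1 = bleached, 0 = healthy
y_score = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3]    # predicted P(bleached)

fpr, tpr, thresholds = roc_curve(y_true, y_score)     # points on the ROC curve
auc = roc_auc_score(y_true, y_score)                  # area under that curve
print(f"AUC = {auc:.2f}")
```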
Building upon existing literature and state-of-the-art models and techniques, this study presents a robust and comprehensive framework for accurately classifying coral bleaching in multi-condition underwater images. Specifically, we employed three deep learning computer vision models (ResNet-50, ViT, and CNN) to classify bleached and healthy corals. For a comprehensive evaluation, we utilized four standard metrics: precision, recall, F1-score, and accuracy. Our top model, the CNN, achieved an accuracy of 88%, demonstrating superior performance compared to previous studies. Moreover, our proposed framework is lightweight and flexible, offering invaluable insights for researchers and biologists alike.
Coral Bleaching Classification Process
| Model | Precision | Recall | F1-Score | Accuracy | Key Advantage | Primary Limitation |
|---|---|---|---|---|---|---|
| ResNet-50 | 0.86 | 0.86 | 0.86 | 0.86 | Skip connections mitigate the vanishing gradient problem, enabling very deep networks | Memory intensive and prone to overfitting |
| ViT | 0.64 | 0.64 | 0.64 | 0.64 | Self-attention captures global relationships between patches | Requires large datasets; lacks the built-in inductive biases of convolutions |
| CNN | 0.89 | 0.88 | 0.88 | 0.88 | Learns local spatial hierarchies efficiently with minimal computational power | Required comprehensive hyperparameter tuning to reach peak accuracy |
Your AI Implementation Roadmap
Embark on a structured journey to integrate cutting-edge AI. Our phased approach ensures seamless adoption, minimal disruption, and maximum impact across your enterprise.
Discovery & Strategy
Comprehensive analysis of current operations, identification of AI opportunities, and development of a tailored strategy aligned with your business objectives.
Pilot & Prototyping
Development of proof-of-concept solutions, rapid prototyping, and real-world testing in a controlled environment to validate effectiveness and refine models.
Full-Scale Integration
Seamless deployment of AI solutions across your enterprise infrastructure, including data migration, system integrations, and performance optimization.
Monitoring & Optimization
Continuous monitoring of AI system performance, regular updates, and iterative enhancements to ensure sustained efficiency and evolving capabilities.
Ready to Transform Your Enterprise with AI?
Connect with our AI specialists to explore how these insights can be practically applied to drive innovation and efficiency within your organization.