To evaluate the effectiveness of the proposed FRCNN-VGG16-SPPNet model, this study compares it with the widely used single VGG16 model using the Confusion Matrix, Precision, Recall, F1-Score, Accuracy, and Learning Curves. Figure 5 illustrates the Confusion Matrix for both models, while Table 3 presents the performance of VGG16 and FRCNN-VGG16-SPPNet for each species during the training stage. The FRCNN-VGG16-SPPNet model exhibits superior training results, with Precision, Recall, F1-Score, and Accuracy all reaching 0.9993, whereas the single VGG16 model obtains 0.9754, 0.9748, 0.9747, and 0.9749, respectively. Both models fit the training data very well; however, the overall performance of FRCNN-VGG16-SPPNet exceeds that of the single VGG16 model by approximately 2.06%.
Table 3
Performance evaluation indicators for model training
Method | Species | Precision | Recall | F1-Score | Accuracy |
VGG16 | Pomadasys argenteus | 0.9811 | 0.9630 | 0.9720 | - |
VGG16 | Mugil cephalus | 0.9824 | 0.9964 | 0.9894 | - |
VGG16 | Acanthopagrus latus | 0.9790 | 0.9859 | 0.9825 | - |
VGG16 | Carangoides hedlandensis | 0.9416 | 0.9928 | 0.9665 | - |
VGG16 | Caranx sexfasciatus | 0.9928 | 0.9358 | 0.9635 | - |
VGG16 | All | 0.9754 | 0.9748 | 0.9747 | 0.9749 |
The proposed FRCNN-VGG16-SPPNet | Pomadasys argenteus | 1.0000 | 1.0000 | 1.0000 | - |
The proposed FRCNN-VGG16-SPPNet | Mugil cephalus | 1.0000 | 1.0000 | 1.0000 | - |
The proposed FRCNN-VGG16-SPPNet | Acanthopagrus latus | 1.0000 | 1.0000 | 1.0000 | - |
The proposed FRCNN-VGG16-SPPNet | Carangoides hedlandensis | 1.0000 | 0.9964 | 0.9982 | - |
The proposed FRCNN-VGG16-SPPNet | Caranx sexfasciatus | 0.9966 | 1.0000 | 0.9983 | - |
The proposed FRCNN-VGG16-SPPNet | All | 0.9993 | 0.9993 | 0.9993 | 0.9993 |
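As an illustration of how the per-species indicators relate to a Confusion Matrix, the following minimal sketch derives Precision, Recall, F1-Score, and Accuracy with NumPy. The function name `per_class_metrics`, the row/column convention, and the 3-class matrix are assumptions made for this example and are not taken from the study. The "All" rows of Table 3 are close to the unweighted mean of the per-species values, so macro averaging is assumed here.

```python
import numpy as np

def per_class_metrics(cm):
    """Per-class precision, recall, F1 and overall accuracy.

    Assumes cm[i, j] counts samples of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)            # correctly classified samples per class
    predicted = cm.sum(axis=0)  # column sums: samples predicted as each class
    actual = cm.sum(axis=1)     # row sums: samples truly in each class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    accuracy = tp.sum() / cm.sum()
    return precision, recall, f1, accuracy

# Hypothetical 3-class confusion matrix (not data from the study).
cm = [[8, 1, 1],
      [0, 9, 1],
      [2, 0, 8]]
precision, recall, f1, accuracy = per_class_metrics(cm)

# Macro averages: unweighted means over classes (assumed for the "All" rows).
macro_precision = precision.mean()
macro_recall = recall.mean()
macro_f1 = f1.mean()
```

Classes that are never predicted, or that have no samples, receive a score of zero rather than raising a division error, which is one common convention among several.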
A robust classification model produces accurate and reliable predictions that generalize well to new data. Beyond classification accuracy, generalizability, i.e., the ability to perform well on data not encountered during training, is a critical attribute of a reliable model. Consistency between training and validation performance is an important indicator of reliability. Training accuracy typically exceeds validation accuracy, but a large discrepancy may indicate overfitting, which can lead to overestimated performance and erroneous predictions. Independent testing on unseen data is therefore an essential step in machine learning model development, as it enables an unbiased evaluation of the model's performance.
Independent testing confirms the model's generalization capacity and can mitigate data leakage concerns and reduce model selection errors. Without an independent testing dataset, the validation outcomes may be unreliable, and overfitting or underfitting may go undetected. Table 4 shows that the proposed FRCNN-VGG16-SPPNet achieved superior testing results, with Precision, Recall, F1-Score, and Accuracy reaching 0.9382, 0.9260, 0.9294, and 0.9318, respectively, whereas the single VGG16 model obtained only 0.7430, 0.7350, 0.7323, and 0.7396. Comparing the training and testing results, the Precision, Recall, F1-Score, and Accuracy of the VGG16 model dropped by 0.2324 to 0.2424 between training and testing. In contrast, the proposed FRCNN-VGG16-SPPNet model dropped by only 0.0611 to 0.0733, indicating comparatively consistent performance and superior generalization ability. These findings suggest that the VGG16 model may be subject to overfitting, while the proposed FRCNN-VGG16-SPPNet model offers enhanced robustness, reliability, and stability.
Table 4
Performance validation indicators for model testing
Method | Species | Precision | Recall | F1-Score | Accuracy |
VGG16 | Pomadasys argenteus | 0.6735 | 0.6111 | 0.6408 | - |
VGG16 | Mugil cephalus | 0.8684 | 0.9429 | 0.9041 | - |
VGG16 | Acanthopagrus latus | 0.6782 | 0.8310 | 0.7468 | - |
VGG16 | Carangoides hedlandensis | 0.6486 | 0.6957 | 0.6713 | - |
VGG16 | Caranx sexfasciatus | 0.8462 | 0.5946 | 0.6984 | - |
VGG16 | All | 0.7430 | 0.7350 | 0.7323 | 0.7396 |
The proposed FRCNN-VGG16-SPPNet | Pomadasys argenteus | 1.0000 | 0.8113 | 0.8958 | - |
The proposed FRCNN-VGG16-SPPNet | Mugil cephalus | 0.9333 | 1.0000 | 0.9655 | - |
The proposed FRCNN-VGG16-SPPNet | Acanthopagrus latus | 0.9577 | 0.9577 | 0.9577 | - |
The proposed FRCNN-VGG16-SPPNet | Carangoides hedlandensis | 0.8553 | 0.9420 | 0.8966 | - |
The proposed FRCNN-VGG16-SPPNet | Caranx sexfasciatus | 0.9444 | 0.9189 | 0.9315 | - |
The proposed FRCNN-VGG16-SPPNet | All | 0.9382 | 0.9260 | 0.9294 | 0.9318 |
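The training-to-testing differences quoted in the text can be reproduced from the "All" rows of Tables 3 and 4. The short sketch below is only a worked check of that arithmetic; the dictionary layout and the helper name `generalization_gap` are assumptions introduced for illustration.

```python
# "All" rows copied from Table 3 (training) and Table 4 (testing).
train = {
    "VGG16": {"Precision": 0.9754, "Recall": 0.9748, "F1-Score": 0.9747, "Accuracy": 0.9749},
    "FRCNN-VGG16-SPPNet": {"Precision": 0.9993, "Recall": 0.9993, "F1-Score": 0.9993, "Accuracy": 0.9993},
}
test = {
    "VGG16": {"Precision": 0.7430, "Recall": 0.7350, "F1-Score": 0.7323, "Accuracy": 0.7396},
    "FRCNN-VGG16-SPPNet": {"Precision": 0.9382, "Recall": 0.9260, "F1-Score": 0.9294, "Accuracy": 0.9318},
}

def generalization_gap(train_scores, test_scores):
    """Training score minus testing score for each metric, rounded to 4 decimals."""
    return {metric: round(train_scores[metric] - test_scores[metric], 4)
            for metric in train_scores}

gaps = {model: generalization_gap(train[model], test[model]) for model in train}

for model, gap in gaps.items():
    print(model, gap, "range:", min(gap.values()), "to", max(gap.values()))
```

A smaller gap indicates that the training performance is a better predictor of performance on unseen images.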
Table 5 presents the performance improvement rates of the proposed hybrid model relative to the conventional approach of using only VGG16 for classification. On the testing data, Precision, Recall, F1-Score, and Accuracy improved by 26.27%, 25.99%, 26.92%, and 25.99%, respectively, highlighting the superiority of the proposed hybrid model over the traditional method. Figures 6 and 7 show the learning curves of the two models. The proposed FRCNN-VGG16-SPPNet model demonstrates consistent training and testing results, indicating its stability and generalizability. Moreover, the model converges quickly and achieves high classification accuracy. These findings suggest that the proposed FRCNN-VGG16-SPPNet model is capable of effective fish species classification.
In conclusion, the results of the study affirm the superior performance of the proposed hybrid model over the conventional VGG16 approach. The findings provide empirical support for the potential of the FRCNN-VGG16-SPPNet model as a robust tool for accurate fish species classification.
Table 5
Improvement rate for the proposed FRCNN-VGG16-SPPNet model
Method | Species | Precision (%) | Recall (%) | F1-Score (%) | Accuracy (%) |
VGG16 | Pomadasys argenteus | 48.48 | 32.76 | 39.79 | - |
VGG16 | Mugil cephalus | 7.47 | 6.06 | 6.79 | - |
VGG16 | Acanthopagrus latus | 41.21 | 15.25 | 28.24 | - |
VGG16 | Carangoides hedlandensis | 31.87 | 35.40 | 33.56 | - |
VGG16 | Caranx sexfasciatus | 11.60 | 54.54 | 33.38 | - |
VGG16 | All | 26.27 | 25.99 | 26.92 | 25.99 |
Note: values are the FRCNN-VGG16-SPPNet improvement rate over the VGG16 model. Improvement Rate (%) = (FRCNN-VGG16-SPPNet value - VGG16 value) / VGG16 value × 100
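The improvement-rate formula can be checked directly against the testing results. The sketch below applies it to the "All" rows of Table 4; the function name `improvement_rate` and the variable names are assumptions chosen for this illustration.

```python
def improvement_rate(proposed, baseline):
    """Relative improvement of the proposed model over the baseline, in percent."""
    return (proposed - baseline) / baseline * 100

# "All" rows of Table 4 (testing stage).
vgg16_test = {"Precision": 0.7430, "Recall": 0.7350, "F1-Score": 0.7323, "Accuracy": 0.7396}
proposed_test = {"Precision": 0.9382, "Recall": 0.9260, "F1-Score": 0.9294, "Accuracy": 0.9318}

rates = {metric: round(improvement_rate(proposed_test[metric], vgg16_test[metric]), 2)
         for metric in vgg16_test}
print(rates)
```

Because the rate is relative to the VGG16 score, species on which VGG16 performed poorly can show large improvement rates even when the absolute gain is moderate.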
The performance of a classification model is influenced by many factors, and its design must be tailored to the requirements of its users. The aim of this study is to develop a model architecture that is convenient, reliable, stable, and highly accurate, for use on mobile devices in fish species recognition by both the general public and marine conservationists. Two major factors that significantly affect the classification accuracy of machine learning models are data quality and feature selection/extraction. Poor data quality, including data skewness, noise, imbalanced samples, and missing values, can negatively impact model training and hinder the learning of effective features and patterns from the data. Thus, data preprocessing and cleaning are necessary to improve data quality prior to model training. Feature selection and extraction directly influence the model's classification ability: selecting and extracting effective features enhances classification performance, whereas inappropriate selection or failure to extract crucial features results in poor classification outcomes. The methods for feature selection and extraction must therefore be chosen based on the characteristics of the data.
In this study, we propose the FRCNN-VGG16-SPPNet framework, which integrates algorithms with unique advantages in image object detection and localization, classification, and feature vector transformation. FRCNN and SPPNet play crucial roles in this framework and provide a multiplying effect that effectively enhances the performance of conventional single VGG16 models. FRCNN automatically detects fish in images containing other objects and crops the images so that they are centered on the fish, which significantly improves the recognizability of the image and reduces the complexity of model training and classification. The Spatial Pyramid Pooling of SPPNet transforms images of various sizes into feature vectors of the same size. This addresses the fixed input image size required by VGG16 Transfer Learning and enables effective processing of images of different sizes, which is more convenient for model developers and users.
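To illustrate how Spatial Pyramid Pooling yields feature vectors of the same size from inputs of different sizes, the following minimal NumPy sketch max-pools a feature map over grids of several resolutions and concatenates the results. It is not the implementation used in this study: the function name `spatial_pyramid_pool`, the pyramid levels (1, 2, 4), and the random feature maps are assumptions made for the example. The 512 channels mirror the last convolutional block of VGG16.

```python
import math
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into a fixed-length vector.

    For each pyramid level n, the map is split into an n x n grid and each
    cell is max-pooled per channel, giving C * n * n values for that level.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # Cell boundaries use floor/ceil so every cell is non-empty.
            top = math.floor(i * height / n)
            bottom = math.ceil((i + 1) * height / n)
            for j in range(n):
                left = math.floor(j * width / n)
                right = math.ceil((j + 1) * width / n)
                cell = feature_map[:, top:bottom, left:right]
                pooled.append(cell.max(axis=(1, 2)))
    return np.concatenate(pooled)

# Hypothetical feature maps from two images of different sizes.
rng = np.random.default_rng(0)
small = spatial_pyramid_pool(rng.random((512, 7, 7)))
large = spatial_pyramid_pool(rng.random((512, 13, 10)))
print(small.shape, large.shape)
```

Both vectors have length 512 × (1 + 4 + 16) = 10752 regardless of the spatial size of the input, so a fully connected classifier with a fixed input width can follow the pooling stage.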
Based on the research analysis presented in this chapter, it is evident that the proposed FRCNN-VGG16-SPPNet framework can highlight the image features of target objects, handle images of different sizes, and exhibit exceptional classification performance.