Oil spills are environmental disasters caused by human activities. They are described as the release of liquid petroleum hydrocarbons into the environment, especially marine areas (Briggs and Briggs 2018), originating from refineries, oil tankers that have an accident or “clean” their tanks in the ocean, and operational discharges from ships (Huz, Lastra, and López 2018; Brekke and Solberg 2005). Indeed, considering pollution by liquid petroleum, the illicit discharges of ballast and tank-cleaning oily residues from oil tankers and ships are the main causes of the contamination of seas and oceans (Huz et al. 2018; Ansell et al. 2001). Oil spills have major economic, ecological, and social impacts (Negreiros et al. 2022), and they are costly to companies due to the waste of spilled oil and the fines imposed by governments for pollution (Singh et al. 2020; Krata and Jachowski 2020; Krestenitis et al. 2019b; Beyer et al. 2016). As an ecological concern, these disasters affect much of the natural marine environment (Krata and Jachowski 2020; Krestenitis et al. 2019a) and even human health (Webler and Lord 2010; D’Andrea and Reddy 2018). Thus, early detection and immediate warning of an oil spill are crucial to attenuate the environmental consequences, control oil dispersion, and ensure that human lives are not in danger (Ribeiro et al. 2020).
In the last decades, oil spill accidents have increased considerably, as illustrated by the well-known disasters of the Amoco Cadiz (France, 1978), Exxon Valdez (Alaska – USA, 1989), “Gulf War” (Kuwait, 1991), Aegean Sea (Spain, 1992), Erika (France, 1999), Prestige (Spain and France, 2002), and the British Petroleum platform Deepwater Horizon (Gulf of Mexico, 2010) (Topouzelis 2008). In 2019, a huge oil spill reached the Brazilian coast (Araújo et al. 2020), impacting the environment, the tourism business, and fishermen. In total, 1009 affected locations were identified across 130 municipalities in all nine northeastern states and two southeastern states only five months after this tragic event (Ribeiro et al. 2020).
Safety barriers are necessary to prevent such accidents; they may be preventative measures (i.e., avoiding the accident) or consequence reduction measures (i.e., acting after the accident happens). This work involves techniques useful to the second group, since the intention is to rapidly detect the oil spill and mitigate its disastrous consequences: a considerable time gap between the oil spill incident and the cleaning procedure generally accentuates the negative impacts (Bubbico et al. 2020).
In this context, image-based oil spill detection is mainly separated into two approaches: manual/visual and automated (Røed and Bjerga 2017). In manual detection, most of the process is performed by humans, for whom contextual information such as the location of oil rigs and pipelines and the wind speed and direction is important (Brekke and Solberg 2005). However, this approach is time-consuming and labor-intensive due to the large number of images that must be analyzed in a short period for effective oil spill monitoring (Shu et al. 2010). In addition, manual detection is highly dependent on the knowledge and experience of operators, which makes it subjective. According to Jiao et al. (2018), as manual detection cannot rapidly detect oil spills, enterprises’ operating costs remain high, and their detection methods hardly prevent oil pollution.
In turn, automated oil spill detection automatically identifies patterns in the images to classify them. Nevertheless, a well-known problem in detecting oil spills is their resemblance to natural ocean phenomena called “look-alikes” (e.g., currents, eddies), which, like the oil, appear as dark spots in images (Krestenitis et al. 2019b; Topouzelis 2008). Additionally, oil spills often appear especially dark because the remote sensing radars are mounted at a distance from the target area (e.g., on satellites and aircraft). According to Kerf et al. (2020), during the nighttime, the odds of detecting an oil spill decrease significantly since not every part of the water is illuminated. Even so, an efficient automatic oil spill detection system is generally faster, cheaper, and more reliable than a manual system (Shu et al. 2010).
Automatic image feature extraction techniques involving computer vision (CV) approaches have become common because of their efficiency and practical applications (Huang et al. 2021; Liu et al. 2021; Ghahremani et al. 2021; Xiao et al. 2021). Distinct methods may be applied to extract features and detect oil spills from images efficiently. In this context, features are small interesting, descriptive, or informative patches in images (Mahony et al. 2020). For example, Local Binary Pattern (LBP) (Ojala et al. 2002), Gray Level Co-occurrence Matrix (GLCM) (Haralick et al. 1973), and Local Tetra Patterns (LTrP) (Murala et al. 2012) are well-known feature extraction methods already applied in different areas to extract important small features that are then fed into classification models. Likewise, methods based on deep learning (DL), such as convolutional neural networks (CNNs) (Li et al. 2021), are increasingly useful and, according to Mahony et al. (2020), usually improve prediction results when big data and abundant computing resources are available. Nevertheless, depending on the application, such as identifying the objects in an image (Marcus 2018), traditional CV techniques with global features, which use the context of the image as a whole (Murphy et al. 2006), remain an important alternative to DL-based methods (Mahony et al. 2020; Zhang et al. 2021; Nikan et al. 2021).
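To make the notion of a hand-crafted texture feature concrete, the following is a minimal sketch of the basic 8-neighbour LBP operator, assuming a grayscale image stored as a nested list of intensities. It is not the rotation-invariant, multiresolution variant of Ojala et al. (2002), and the function names `lbp_codes` and `lbp_histogram` are hypothetical, introduced only for illustration.

```python
def lbp_codes(img):
    """Basic 8-neighbour LBP code for every interior pixel.

    `img` is assumed to be a rectangular list of lists of gray levels.
    """
    rows, cols = len(img), len(img[0])
    if rows < 3 or cols < 3:
        raise ValueError("image must be at least 3x3")
    # Neighbour offsets, clockwise starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = img[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # A neighbour at least as bright as the center sets its bit.
                if img[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(img):
    """Normalized 256-bin histogram of LBP codes (the feature vector)."""
    hist = [0] * 256
    count = 0
    for row in lbp_codes(img):
        for code in row:
            hist[code] += 1
            count += 1
    return [h / count for h in hist]
```

The resulting 256-dimensional histogram summarizes local texture and could be passed to any classifier; a dark, homogeneous patch concentrates its mass in a few bins, whereas a textured sea surface spreads it over many.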
The number of publications addressing oil spills has increased over the last 20–30 years (Vasconcelos et al. 2020), with a similar trend for works proposing methods to automate the detection of oil spills. Most use CV techniques combined with machine learning (ML) (Amato et al. 2022) models or apply DL (Ahmed et al. 2022) for feature extraction and oil spill detection. For instance, Xu et al. (2020) proposed a model that uses only ML methods. The authors used CV techniques in the preprocessing, capturing morphological features of targets, and support vector machines (SVM) to classify the wave information about oil spills, which were detected by a local adaptive threshold and displayed on an electronic chart based on a geographic information system (GIS). Mera et al. (2017) used a CV feature extraction step comprising the computation of 52 types of features (geometrical, textural, and physical). They applied five feature selection (FS) methods to improve the feature set. FS methods are preprocessing techniques that discard features with minor impact, resulting in a reduced set of relevant features (Guyon et al. 2006). The selected features feed an SVM model, which indicates the presence or absence of oil spills in images. Also, Singha et al. (2013) performed CV feature extraction by combining traditional and polarimetric features for object-based discrimination between oil spills and look-alikes. They applied a multilayer perceptron (MLP) model for image segmentation and feature classification.
Regarding DL, Chen et al. (2017) compared the results of Stacked Auto-Encoders (SAE) and Deep Belief Networks (DBN) with results achieved by SVM and MLP. Gallego et al. (2018) used a deep neural autoencoder to segment oil spills from Side-Looking Airborne Radar (SLAR) imagery. Jiao et al. (2018) proposed a CNN followed by two post-processing steps (filtering and detection box) to improve accuracy. In addition, they proposed a methodology utilizing unmanned aerial vehicle (UAV) images to inspect the areas of interest. Cantorna et al. (2019) applied clustering, logistic regression, and CNN models to detect oil spills in images. Krestenitis et al. (2019a) combined a deep CNN and Synthetic Aperture Radar (SAR) imagery to perform a multi-class classification, including oil spills, look-alikes, land areas, ships, and sea surfaces. Kerf et al. (2020) proposed a framework based on a UAV, a thermal infrared (IR) camera, and a CNN. Zeng and Wang (2020) proposed a CNN, named the Oil Spill Convolutional Network (OSCNet), to detect oil spills in SAR imagery.
In this work, we propose a new feature extraction method, based on the q-Exponential distribution, capable of obtaining complex information from SAR images and detecting the presence of oil spills. It is a distribution-based feature extractor, which means that it uses a probability distribution model to extract features from images. The q-Exponential is a probabilistic model that stems from the Tsallis non-extensive entropy (Tsallis 1988). It is already used, for example, in reliability engineering (Negreiros et al. 2020; Sales Filho et al. 2016), finance (Ludescher and Bunde 2014), and urban agglomeration (Malacarne et al. 2002). An important characteristic of this distribution is its ability to model rare events due to its power-law behavior (Picoli et al. 2003). As oil spills generally occupy small portions of the images, they can be seen as “rare events”. Although a few studies relate the Tsallis non-extensive statistical mechanics models to images (Ferraro et al. 2019), there is no study in which the q-Exponential probability distribution has been used to extract features from images. We name this approach q-Exponential feature extraction (q-EFE). The proposed q-EFE is coupled to an ML model to perform the classification task. Several ML models were tested (Support Vector Machine, Multilayer Perceptron, Extreme Gradient Boosting, Logistic Regression, and Random Forest) and compared. For comparison, we also applied the well-known ResNet50 model, deemed successful for image recognition, three other DL-based models relying on CNN architectures, and two classical CV techniques, namely LBP and GLCM.
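As a brief illustration of the power-law behavior mentioned above, the sketch below evaluates the q-Exponential density in its commonly used form, f(x) = ((2 − q)/η) [1 − (1 − q) x/η]^(1/(1 − q)) with q < 2 and η > 0, together with the corresponding survival function. It illustrates only the distribution, not the q-EFE procedure itself; the function names and the parameter names `q` and `eta` are assumptions made for this example.

```python
import math


def q_exp_pdf(x, q, eta):
    """q-Exponential density; reduces to the Exponential when q -> 1."""
    if not (q < 2 and eta > 0):
        raise ValueError("requires q < 2 and eta > 0")
    if x < 0:
        return 0.0
    if abs(q - 1.0) < 1e-12:
        return math.exp(-x / eta) / eta
    base = 1.0 - (1.0 - q) * x / eta
    if base <= 0:  # beyond the bounded support that arises when q < 1
        return 0.0
    return (2.0 - q) / eta * base ** (1.0 / (1.0 - q))


def q_exp_survival(x, q, eta):
    """P(X > x): decays as a power law for 1 < q < 2."""
    if not (q < 2 and eta > 0):
        raise ValueError("requires q < 2 and eta > 0")
    if x < 0:
        return 1.0
    if abs(q - 1.0) < 1e-12:
        return math.exp(-x / eta)
    base = 1.0 - (1.0 - q) * x / eta
    if base <= 0:
        return 0.0
    return base ** ((2.0 - q) / (1.0 - q))


# Tail probability at x = 10 with eta = 1: heavy-tailed (q = 1.5)
# versus the ordinary Exponential limit (q = 1).
heavy_tail = q_exp_survival(10.0, 1.5, 1.0)
exponential_tail = q_exp_survival(10.0, 1.0, 1.0)
```

With q = 1.5 and η = 1, the survival function simplifies to 1/(1 + x/2), so values far from the bulk retain a non-negligible probability, whereas under the Exponential limit they are practically impossible. This is the sense in which the distribution can accommodate rare events.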
The remainder of this work is organized as follows: Section 2 presents the q-Exponential distribution and its related functions. Section 3 describes q-EFE, the novel feature extractor proposed in this article, whose output feeds the ML model. Section 4 presents the DL architectures used in this work for comparison purposes. Section 5 describes the oil spill-related SAR image dataset. Section 6 provides the results for q-EFE, LBP, and GLCM, each coupled to the ML models listed above, and for the deep models (ResNet50 and three CNN architectures). Finally, Section 7 presents the conclusions, limitations, and ongoing and future research.