Our analysis uses an agricultural dataset covering India from 1997 to 2020. The dataset comprises the following features:
Crop: This field identifies the crop type. The dataset includes a diverse array of 55 crops, reflecting India's rich agricultural variety.
Crop Year: The dataset covers crop years from 1997 to 2020, providing a comprehensive temporal view of agricultural trends over 24 years.
Season: The data categorizes cultivation into 6 distinct seasons, including major seasons like Kharif and Rabi, facilitating an analysis of seasonal impacts on agriculture.
State: Encompassing data from 30 Indian states, this feature offers a wide geographical perspective, crucial for understanding regional agricultural patterns.
Area: The land area under cultivation, in hectares. The mean area is approximately 179,926 hectares, with values ranging from 0.5 hectares to 50.8 million hectares. This spread reflects the varied scale of farming across regions.
Production: The quantity of crop production, measured in metric tons, averages around 16.4 million tons. The maximum recorded value is about 6.3 billion tons, several hundred times the mean, which highlights the variability in agricultural productivity.
Annual Rainfall: This feature, measured in millimeters, indicates the climatic conditions affecting crop growth. The average annual rainfall is about 1,438 mm, ranging from 301.3 mm to a significant 6,552.7 mm.
Fertilizer: The total amount of fertilizer used, in kilograms, averages around 24.1 million kg. It shows a wide range, suggesting diverse nutrient management strategies across different crops and regions.
Pesticide: This field records the total pesticide usage in kilograms. The average is around 48,848 kg, with a maximum of about 15.75 million kg.
Yield: Calculated as production per unit area, yield averages approximately 79.95 and reaches a maximum of 21,105. This metric is crucial for evaluating the efficiency of agricultural practices.
These summary statistics highlight the scale and diversity of agricultural practices in India. They provide a starting point for developing targeted strategies to enhance crop productivity and sustainability.
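Summary figures of this kind can be reproduced with a few lines of pandas. The following is a minimal sketch, not the code used for this analysis: the column names (Crop, Crop_Year, Season, State, Area, Production, Yield) are assumed, and the rows are toy values for illustration only, not records from the dataset.

```python
import pandas as pd

# Toy rows for illustration only; these values are NOT taken from the dataset.
# Column names are assumed and may differ in the actual file.
df = pd.DataFrame({
    "Crop": ["Rice", "Rice", "Wheat", "Coconut"],
    "Crop_Year": [1997, 1998, 1997, 1998],
    "Season": ["Kharif", "Kharif", "Rabi", "Whole Year"],
    "State": ["Assam", "Assam", "Punjab", "Kerala"],
    "Area": [1000.0, 1200.0, 800.0, 500.0],
    "Production": [2000.0, 3000.0, 2400.0, 4000.0],
})

# Yield as production per unit area, as defined in the feature list.
df["Yield"] = df["Production"] / df["Area"]

# Number of distinct values for each categorical feature.
category_counts = df[["Crop", "Season", "State"]].nunique()

# Mean, minimum, and maximum for each numeric feature.
numeric_summary = df[["Area", "Production", "Yield"]].agg(["mean", "min", "max"])

# Length of the covered period in crop years, counting both endpoints.
n_years = df["Crop_Year"].max() - df["Crop_Year"].min() + 1

print(category_counts)
print(numeric_summary)
print(n_years)
```

Applied to the full dataset, the same year calculation gives 2020 - 1997 + 1 = 24 crop years, matching the period stated above.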
The scatter plot matrix visualizes the relationships between key features of the agricultural dataset, with crop types differentiated by color (Figure 1). Each off-diagonal subplot compares a pair of features, such as Area vs Production or Annual Rainfall vs Fertilizer. Instead of comparing a feature with itself, the diagonal plots show each feature's distribution for the different crops.
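A matrix of this kind could be drawn with seaborn's pairplot. The sketch below is one possible construction, not necessarily how Figure 1 was produced. The column names in FEATURES are assumed, and top_crops_subset is a hypothetical helper that restricts the plot to the most frequent crops so that the color legend stays readable.

```python
import pandas as pd

# Assumed column names; adjust to match the actual dataset.
FEATURES = ["Area", "Production", "Annual_Rainfall", "Fertilizer", "Pesticide", "Yield"]


def top_crops_subset(df: pd.DataFrame, n: int = 5) -> pd.DataFrame:
    """Keep rows for the n crops with the most records (hypothetical helper)."""
    top = df["Crop"].value_counts().nlargest(n).index
    return df[df["Crop"].isin(top)]


def plot_scatter_matrix(df: pd.DataFrame):
    """Pairwise scatter plots colored by crop, with per-crop densities on the diagonal."""
    import seaborn as sns  # imported here so the data preparation runs without seaborn

    return sns.pairplot(df, vars=FEATURES, hue="Crop", diag_kind="kde")
```

With the dataset loaded into a DataFrame df, calling plot_scatter_matrix(top_crops_subset(df)) would draw the matrix. Because Area, Production, and Fertilizer span several orders of magnitude, log-scaled axes may make the panels easier to read.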
From the scatter plot matrix, we can observe the following:
Area vs Production: There appears to be a positive correlation between the area of cultivation and the production for most crops, which is expected as larger cultivation areas generally lead to higher production volumes.
Annual Rainfall vs Production: The relationship between annual rainfall and production varies among crops, suggesting that some crops may be more sensitive to rainfall than others.
Fertilizer vs Production: There seems to be a positive correlation for some crops, indicating that increased fertilizer usage may correspond with higher production. However, this relationship does not hold uniformly across all crop types.
Pesticide vs Production: Pesticide usage does not show a clear correlation with production in this visualization, suggesting that the effectiveness or necessity of pesticides may vary greatly depending on the crop.
Yield: The yield scatter plots across different features show varied patterns for different crops, indicating that yield is influenced by a complex interplay of factors, not just a single feature.
Each crop type, represented by a unique color, exhibits its own pattern of distribution and correlation across the different features, which can inform targeted agricultural practices and policies. The data points for crops like Coconut are notably distinct in plots involving production due to their high volume output, which skews the distribution.
Overall, this scatter plot matrix is a powerful exploratory tool, revealing complex relationships and highlighting the diversity of agricultural dynamics across different crops. It provides a visual basis for further statistical analysis and hypothesis generation.
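One way to follow up on these visual impressions is to compute a correlation coefficient separately for each crop. The sketch below uses the Pearson correlation through pandas. The function name per_crop_correlation and the column names are assumptions, and the rows are toy values, not records from the dataset.

```python
import pandas as pd


def per_crop_correlation(df: pd.DataFrame, x: str, y: str) -> pd.Series:
    """Pearson correlation between columns x and y, computed separately for each crop."""
    result = {crop: group[x].corr(group[y]) for crop, group in df.groupby("Crop")}
    return pd.Series(result, name=f"corr({x}, {y})")


# Toy rows for illustration only; these values are NOT taken from the dataset.
toy = pd.DataFrame({
    "Crop": ["Rice"] * 4 + ["Wheat"] * 4,
    "Area": [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0],
    "Production": [2.0, 4.0, 6.0, 8.0, 5.0, 3.0, 4.0, 1.0],
})

corr = per_crop_correlation(toy, "Area", "Production")
print(corr)
```

Comparing the resulting coefficients across crops gives a numerical check on statements such as "a positive correlation for most crops". Pearson correlation is sensitive to extreme values, so a rank-based alternative (method="spearman" in Series.corr) may be worth considering for heavily skewed features.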
The "Aggregated Pesticide Usage by Crop and Year" bar chart (Figure 2) shows trends in pesticide use across crops over the years. Aggregating the data reveals how pesticide usage has varied over time for each crop type and which crops have seen increases or decreases in pesticide application. The chart helps identify patterns and potential correlations between pesticide use and other factors such as crop yield, cultivation practices, or environmental changes. It can also support the development of more sustainable and efficient farming practices.
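The aggregation behind such a chart can be sketched as a group-by over crop and year. The example below assumes that "aggregated" means the sum of pesticide usage over all records for a given crop and year, which the text does not state. The column names and function names are also assumptions, and the rows are toy values, not records from the dataset.

```python
import pandas as pd


def aggregate_pesticide(df: pd.DataFrame) -> pd.DataFrame:
    """Sum pesticide usage per crop and year; rows are years, columns are crops."""
    totals = df.groupby(["Crop_Year", "Crop"])["Pesticide"].sum()
    return totals.unstack("Crop", fill_value=0.0)


def plot_pesticide(table: pd.DataFrame):
    """Grouped bar chart: one group of bars per year, one bar per crop."""
    import matplotlib.pyplot as plt  # imported here so aggregation runs without matplotlib

    fig, ax = plt.subplots(figsize=(10, 5))
    table.plot(kind="bar", ax=ax)
    ax.set_xlabel("Crop year")
    ax.set_ylabel("Total pesticide (kg)")
    ax.set_title("Aggregated Pesticide Usage by Crop and Year")
    return ax


# Toy rows for illustration only; these values are NOT taken from the dataset.
toy = pd.DataFrame({
    "Crop": ["Rice", "Rice", "Wheat", "Rice"],
    "Crop_Year": [1997, 1997, 1997, 1998],
    "Pesticide": [10.0, 5.0, 7.0, 20.0],
})

table = aggregate_pesticide(toy)
print(table)
```

Calling plot_pesticide(table) would then draw the chart. Crop and year combinations with no records appear as zero in this sketch, which may or may not be the desired treatment of missing data.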