Researchers from Northeastern University and Oak Ridge National Laboratory developed a machine learning approach for mapping crop type distributions in the Continental US (CONUS) during the growing season and determined, for each crop type, the earliest date at which classification becomes accurate. Using routinely collected, moderate-resolution MODIS satellite Normalized Difference Vegetation Index (NDVI) data and the US Department of Agriculture’s Cropland Data Layer (CDL) product, they trained a semi-supervised machine learning classifier on data from prior years and then applied it to identify and map crops across the CONUS before harvest.
This study indicates that the within-season trajectory of satellite-derived NDVI can be used to successfully map and identify crop types at a continental scale before harvest through the use of a semi-supervised machine learning classifier trained on data from prior years. This technique enables near real-time monitoring of crop health and vigor, and it could be used to inform estimates of crop yield over wide areas, especially for dominant commodity crops such as corn, soybeans, and winter wheat.
Timely and accurate knowledge of crop distributions at regional to continental scales is essential for forecasting crop production and estimating crop fertilizer and irrigation needs. The goal of this study was to map crops across the CONUS before harvest and to determine the earliest date of accurate classification within the growing season. A cluster-then-label machine learning classifier was trained by clustering MODIS NDVI data and assigning CDL crop labels for prior years to the resulting phenoregions within separately derived ecoregions, which were developed by clustering synoptic climate and soil properties. Pixel-wise classification accuracy for the eight major crops by area was around 70% across the major corn-, soybean-, and winter wheat-producing areas, whereas regions with high crop diversity exhibited slightly lower accuracy. For dominant crops such as corn, soybeans, winter wheat, fallow/idle cropland, and other hay/non-alfalfa, classification accuracy improved throughout the growing season, reaching 90% of the year-end accuracy by the end of August. For corn and soybeans, the earliest dates of classification were found to be much earlier in the central regions of the Corn Belt (parts of Iowa, Illinois, and Indiana) than in peripheral areas. This big data analytics approach permits near real-time monitoring of crop health at continental scales and could be applied to estimate crop yields and plan harvest and transport logistics.
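The cluster-then-label idea described above can be illustrated with a minimal Python sketch. It is not the study's implementation: it uses a plain k-means written from scratch, skips the ecoregion stratification, and runs on made-up NDVI values. All function names (`kmeans`, `cluster_then_label`, `classify`, `earliest_date`) are hypothetical, and reading the "earliest date" as the first date on which accuracy reaches 90% of the year-end value is an assumption made here for illustration.

```python
from collections import Counter
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids


def cluster_then_label(trajectories, labels, k):
    """Cluster NDVI trajectories, then give each cluster its majority label."""
    centroids = kmeans(trajectories, k)
    votes = [Counter() for _ in range(k)]
    for traj, lab in zip(trajectories, labels):
        votes[nearest(traj, centroids)][lab] += 1
    cluster_labels = [v.most_common(1)[0][0] if v else None for v in votes]
    return centroids, cluster_labels


def classify(traj, centroids, cluster_labels):
    """Label a new trajectory with the label of its nearest cluster."""
    return cluster_labels[nearest(traj, centroids)]


def earliest_date(accuracy_by_date, fraction=0.9):
    """First date whose accuracy reaches fraction * the final (year-end) accuracy.

    accuracy_by_date: chronological list of (date, accuracy) pairs.
    """
    target = fraction * accuracy_by_date[-1][1]
    for date, acc in accuracy_by_date:
        if acc >= target:
            return date
    return None


# Hypothetical partial-season NDVI trajectories from prior years (toy values).
prior_ndvi = [
    (0.20, 0.30, 0.60, 0.80), (0.20, 0.35, 0.65, 0.80),  # corn-like
    (0.20, 0.25, 0.40, 0.70), (0.20, 0.20, 0.45, 0.70),  # soybean-like
    (0.60, 0.70, 0.40, 0.20), (0.55, 0.70, 0.35, 0.20),  # winter-wheat-like
]
prior_labels = ["corn", "corn", "soybeans", "soybeans",
                "winter wheat", "winter wheat"]

centroids, cluster_labels = cluster_then_label(prior_ndvi, prior_labels, k=3)
prediction = classify((0.58, 0.70, 0.38, 0.20), centroids, cluster_labels)

# Hypothetical accuracy curve for one crop over one growing season.
curve = [("Jun", 0.30), ("Jul", 0.50), ("Aug", 0.64), ("Sep", 0.68), ("Dec", 0.70)]
first_ok = earliest_date(curve)
```

In this toy setup the clustering step never sees the crop labels; they are used only afterwards to name each cluster by majority vote, which is what makes the approach semi-supervised. A current-season pixel is then classified from its partial NDVI trajectory alone, so the same procedure can be repeated as each new observation date arrives.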