
CROP MAPPING APPLICATIONS AT SCALE: USING GOOGLE EARTH ENGINE TO ENABLE
GLOBAL CROP AREA AND STATUS MONITORING USING FREE AND OPEN DATA SOURCES

Guido Lemoine and Olivier Léo

European Commission, Joint Research Centre, Institute for Environment and Sustainability, Ispra, Italy

ABSTRACT

The confluence of rapidly growing streams of “free and open” satellite imagery at 10-30 m spatial
resolution, expanding libraries of sophisticated open source software components for geospatial data
processing, and the increase in publicly available open data sets is driving major changes in
agricultural monitoring activities. In the next few years, we can expect a step change in the scale of
derived crop area and status information at parcel level from the combined use of global sensors such
as Landsat-8 and Sentinel-1 and -2. To handle the unprecedented flow of such data into value-adding
agricultural mapping and monitoring applications, novel approaches need to be developed to ensure
globally consistent use in a “knowledge inference” context in support of, for instance, food security
analysis. We demonstrate the use of Google Earth Engine (GEE) as a prototype environment that could
support such a context, with 3 different examples.

Index Terms: open data, Sentinel-1, Landsat-8, Google Earth Engine

1. INTRODUCTION

The growing availability of “free and open” satellite imagery at 10-30 m spatial resolution is a boon
for agricultural mapping and monitoring applications aimed at deriving wide area and whole-season crop
production information. Whereas access to US Landsat imagery has been “freed and opened” already since
2010, the introduction of Landsat-8 in early 2013 is recognised as a significant step-up, mostly due to
quality issues with Landsat-7 and the limited currency of imagery from previous Landsat sensors. The
European Copernicus programme [1] has introduced a C-band SAR in the “free and open” domain with the
launch of Sentinel-1 in April 2014 and operational imagery provision since early October 2014. On
June 12, 2015 Sentinel-2 will launch with a wide swath, 11-band optical and infrared sensor. The
Sentinel-1 and -2 sensors produce imagery in the 10-60 m resolution range. Combined, the Landsat and
Sentinel sensors will provide a capacity to monitor agricultural production at parcel level, rather
than at the aggregated, spectrally mixed scale of low resolution (100-300 m) sensors like MODIS,
SPOT-VEGETATION or PROBA-V.

The Copernicus programme foresees simultaneous operation of dual sensors with interleaved 12-day
revisit frequency (i.e. < 6 days site revisit) from 2016 onwards, for the next 15 years. Thus,
methodologies that rely on consistent inter-annual comparison of time series can now consider the high
resolution domain, leading to a dramatic increase in the detail of the inferred agronomic information
(e.g. crop area, biophysical parameters, anomaly detection, etc.).

The introduction of “free and open” satellite imagery coincides with the increased public release of
ever more detailed ancillary data sets that are required to produce appropriately georeferenced and
robust indicators and their spatial and temporal statistics. Furthermore, these data layers are
becoming essential for creating added value information for specific agronomic applications (e.g.
support to farm management, logistics and trade, environmental impact assessment).

A third phenomenon that contributes to the potential widespread take-up of “free and open” data flows
is the release of relevant processing algorithms for both image analysis and geospatial integration in
the open source domain. It is relatively straightforward nowadays to compose highly automated
workflows for post-processing, data analytics and reporting purposes with open source components. This
includes the possibility to scale to very significant data flows (e.g. several Gb/day) with parallel
processing solutions or hardware-specific modules (e.g. using GPUs [2]) on cheap commodity platforms.
Example solutions exist that can handle data flows and analysis tasks at the size of a country [3], a
large region or specific agricultural production areas. However, in our work at the JRC, we are
typically interested in wide area regional and continental applications, e.g. for the monitoring of
European agriculture, large arable crop areas outside Europe and, in the context of our support to
food security, agricultural area and crop conditions in crisis areas. Thus, our interest is directed
towards scalable solutions that provide access to global data sets and the computing infrastructure to
rapidly generate crop area and crop status indicators, both on a continuous and an ad-hoc request
basis.

978-1-4799-7929-5/15/$31.00 ©2015 IEEE 1496 IGARSS 2015


Figure 1. Summer crop decrease (in green colours) and increase (in pink colours) in Ukraine in 2014 compared to
2013 using thresholded Landsat-8 NDVI composites from the period July 10-August 31, and aggregated to
Ukrainian rajon administrative boundaries. The maximum decrease (darkest green) is 113299 ha, the maximum
increase (darkest pink) is 65856 ha. All analysis is performed within the JavaScript API of Google Earth Engine.
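
The processing chain summarised in the caption of Figure 1 (NDVI per scene, maximum-value composite, a fixed NDVI threshold, a cropland mask and aggregation per district) can be illustrated with a small, library-free sketch. This is not the authors' Earth Engine script: the flat per-pixel data layout and all function names are assumptions made for illustration, and only the 0.2 threshold and the 30 m pixel size are taken from the paper.

```python
from collections import defaultdict

PIXEL_AREA_HA = 30 * 30 / 10_000  # one 30 m Landsat-8 pixel covers 0.09 ha
NDVI_THRESHOLD = 0.2              # at or below this a pixel is treated as bare soil


def ndvi(nir, red):
    """Normalised difference vegetation index for one pixel."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def max_ndvi_composite(scenes):
    """Per-pixel maximum NDVI over several scenes.

    Hypothetical layout: each scene is a list of (nir, red) reflectance
    pairs, with the same pixel order in every scene.
    """
    return [max(ndvi(nir, red) for nir, red in pixel_stack)
            for pixel_stack in zip(*scenes)]


def crop_area_by_district(composite, cropland_mask, district_ids):
    """Area (ha) of cropland pixels above the NDVI threshold, per district."""
    area = defaultdict(float)
    for value, is_crop, district in zip(composite, cropland_mask, district_ids):
        if is_crop and value > NDVI_THRESHOLD:
            area[district] += PIXEL_AREA_HA
    return dict(area)


def area_change(area_first_year, area_second_year):
    """Second-year minus first-year crop area (ha); negative means a decrease."""
    districts = set(area_first_year) | set(area_second_year)
    return {d: area_second_year.get(d, 0.0) - area_first_year.get(d, 0.0)
            for d in districts}


# Toy data: 4 pixels in 2 districts, two scenes in year 1 and one in year 2.
year1 = [[(0.40, 0.10), (0.30, 0.25), (0.45, 0.08), (0.20, 0.18)],
         [(0.50, 0.10), (0.28, 0.26), (0.30, 0.20), (0.22, 0.20)]]
year2 = [[(0.42, 0.10), (0.45, 0.10), (0.25, 0.22), (0.50, 0.10)]]
mask = [True, True, True, False]
districts = ["A", "A", "B", "B"]

area1 = crop_area_by_district(max_ndvi_composite(year1), mask, districts)
area2 = crop_area_by_district(max_ndvi_composite(year2), mask, districts)
print(area_change(area1, area2))
```

In Earth Engine itself the same steps are expressed as operations on image and feature collections, so the per-pixel loops above are only a stand-in for what the platform evaluates in parallel.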

2. METHODOLOGY

In this paper, we focus on the use of “closely-coupled” parallel processing approaches, which are of
increasing interest in global Earth Observation (EO) applications. With “closely-coupled” we mean data
processing and analytics solutions where scalable processing capacity is logically co-located with
access to complete image and data archives. Although we use Google Earth Engine (GEE, [4]) as an
example, more generic “cloud computing” providers are increasingly looking into solutions that address
specific needs of EO users.

Our (current) preference for GEE is based on the relative maturity of the functionality that is
accessible through simple application programming interfaces (APIs). We use both the JavaScript and
the Python API. Both APIs effectively hide the complexity of parallel computing and large data storage
access and permit the user to focus on the logic of data selection and programmable workflow. For
this, a large set of callable functions is provided in the GEE algorithm library. The expanding
library combines many of the standard image processing routines (filters, band math, feature
detection, etc.) with specific data access, composition (e.g. ImageCollections) and reduction
routines, seamlessly integrating geospatial features (FeatureCollections) as well. It is important to
note that most of the processing libraries in GEE are similar to existing open source components, such
as OpenCV, GDAL, PyTables, etc. The Python API facilitates further integration with post-processing,
analytics and visualisation components. One of the most appreciated aspects of GEE is the way analysis
results can be shared as scripts or code, rather than through the exchange of bulky downloads. The
recipient of the shared code simply re-runs it to generate the result. Applications of GEE code
anywhere on the globe are then only limited by the availability of the appropriate data.

3. RESULTS

In this section, we demonstrate 3 use cases with which we illustrate some aspects of GEE that address
(future) application needs in agricultural monitoring. We do not make a direct comparison to existing
methods using low resolution data, as this is not always sensible (e.g. crop classification). To
appreciate processing speed in GEE, we can only present run-times that are equivalent to the time the
user is waiting for the result to be rendered. In GEE, there is (currently) no way for users to know
the exact CPU (or GPU) run-time used. Rendering delays for any of the cases are never above the order
of 1 minute (they may fluctuate depending on the overall user load of GEE).

In the first case, we try to estimate differences in crop area of summer cultivations in Ukraine
between 2013 and 2014. For this, we select all available Landsat-8 data in the period July 10 –
August 31 for the entire territory of Ukraine, masked for crop land pixels in the 2000 GlobCover land
use classification. We then generate the NDVI from top-of-atmosphere reflectance and compose a single
image as a maximum-value composite of the NDVI values. For both years, we choose NDVI = 0.2 as a
threshold, under which a pixel is considered to belong to bare soil, i.e. cannot be representative of
a summer crop. Finally, we sum the pixels that are above the threshold, at 30 m resolution, for all
administrative boundaries at Ukrainian rajon (district) level. The latter have been extracted from
OpenStreetMap (www.openstreetmap.org). The result can be

Table 1. Confusion matrix for the full set of large (> 6 ha) 2013 reference parcels from Zeeland province, the
Netherlands, using the RifleSerialClassifier of Google Earth Engine on a Landsat-8 scene of 12 July 2013. Random
training samples, classification results and the confusion matrix are generated with the Python API of GEE.

                     ALF   FRU   GRA   GRS   MAI   ONI   POT   SBT   SWH   VEG   WWH  row totals  user accuracy
ALF                   33     0     7     0     0     0     1     0     0     1     0          42           0.79
FRU                    0    83     7     0    13     0     0     0     0     1     0         104           0.80
GRA                    7    24   405    16    40     3     8     2     5    18     2         530           0.76
GRS                    0     0    14   210     0     0     1     0     0     1     0         226           0.93
MAI                   11     4     2     0   150     0     5     1     1    30     1         205           0.73
ONI                    0     1     1     1     0   266     4     2     2     7     3         287           0.93
POT                   27     0    30    29     6     4   841     5     0    47     1         990           0.85
SBT                    1     0     1     6     2     3     6   446     1     6     6         478           0.93
SWH                    0     0     0     0     0     0     0     0    37     0    12          49           0.76
VEG                    5     2     1     0    10     6     3     1     0    59     0          87           0.68
WWH                    0     0     2    11     1     7     8     2   157     3  1239        1430           0.87
column totals         84   114   470   273   222   289   877   459   203   173  1264        4428      OA = 0.85
producer accuracy   0.39  0.73  0.86  0.77  0.68  0.92  0.96  0.97  0.18  0.34  0.98                   κ = 0.82
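
The summary figures in Table 1 follow directly from the matrix counts. The sketch below recomputes overall accuracy (OA), Cohen's kappa (κ) and the per-class user and producer accuracies, reading the table as laid out: user accuracy along rows, producer accuracy along columns. The function and variable names are illustrative and are not part of the GEE API.

```python
CLASSES = ["ALF", "FRU", "GRA", "GRS", "MAI", "ONI",
           "POT", "SBT", "SWH", "VEG", "WWH"]

# Counts copied from Table 1, one list per table row.
MATRIX = [
    [33, 0, 7, 0, 0, 0, 1, 0, 0, 1, 0],
    [0, 83, 7, 0, 13, 0, 0, 0, 0, 1, 0],
    [7, 24, 405, 16, 40, 3, 8, 2, 5, 18, 2],
    [0, 0, 14, 210, 0, 0, 1, 0, 0, 1, 0],
    [11, 4, 2, 0, 150, 0, 5, 1, 1, 30, 1],
    [0, 1, 1, 1, 0, 266, 4, 2, 2, 7, 3],
    [27, 0, 30, 29, 6, 4, 841, 5, 0, 47, 1],
    [1, 0, 1, 6, 2, 3, 6, 446, 1, 6, 6],
    [0, 0, 0, 0, 0, 0, 0, 0, 37, 0, 12],
    [5, 2, 1, 0, 10, 6, 3, 1, 0, 59, 0],
    [0, 0, 2, 11, 1, 7, 8, 2, 157, 3, 1239],
]


def accuracy_metrics(matrix):
    """Return (overall accuracy, kappa, user accuracies, producer accuracies)."""
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    diagonal = [matrix[i][i] for i in range(len(matrix))]

    overall = sum(diagonal) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (overall - expected) / (1 - expected)

    user = [d / r for d, r in zip(diagonal, row_totals)]
    producer = [d / c for d, c in zip(diagonal, col_totals)]
    return overall, kappa, user, producer


overall, kappa, user, producer = accuracy_metrics(MATRIX)
print(f"OA = {overall:.2f}, kappa = {kappa:.2f}")
for name, ua, pa in zip(CLASSES, user, producer):
    print(f"{name}: user {ua:.2f}, producer {pa:.2f}")
```

Rounded to two decimals, this gives the OA of 0.85 and κ of 0.82 reported in the table. It also makes the weak classes easy to spot: SWH and VEG have producer accuracies of 0.18 and 0.34.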

mapped (Fig. 1) or presented in tabular format. We find a distinct geo-spatial pattern with large
differences in the South-Eastern and Eastern districts, which can be related to differences in
agro-meteorological conditions and the ongoing conflict in the area (e.g. the Crimea peninsula).

Our second example is an implementation of image classification in GEE, which we apply to a Landsat-8
image of July 12, 2013 for the south-western Netherlands province of Zeeland. In the Netherlands, as
in other EU member states, farmers declare crop cultivation on an annual basis to apply for support
under the EU's Common Agricultural Policy. The Netherlands has released the (anonymized) information
into the public domain since 2009 (www.pdok.nl) as annual sets of approximately 770,000 parcel
boundaries with crop code. This turns out to be a rather unique data set for large scale
experimentation with image classification. In our example, we test GEE classifiers for 5% random
selections of training parcels that we draw from the set of approximately 35,000 parcels in Zeeland.
The remainder of the set is then used to verify the classifier majority result in a classical
confusion matrix (Table 1). In the example, we limit the analysis to fields larger than 6 ha and to
crop classes that have at least 40 parcels in the selection. Even though the timing of the Landsat-8
image is not optimal for the delineation of specific crop types (in particular grass, maize), the
overall accuracy of 0.85 is already quite impressive. This approach is easily extendable to include
other image combinations (e.g. multi-temporal sets, combinations of SAR and optical, etc.).

The last case is a test in crop delineation in central New South Wales (NSW, Australia, centered
around (lon, lat) = (146.03, -34.81)) using a combination of 9 Sentinel-1A (S1) dual polarization
(VV/VH) interferometric wide mode ground range detected (GRD) images, including ascending and
descending frames, and 4 Landsat-8 (L8) data sets for the period September 29 – December 12, 2014.
Because Sentinel-1A has only been operational since October 2014, crop delineation in the Northern
Hemisphere is not yet fully feasible, as imagery does not yet cover the critical part of the growing
season. In the example, we use GEE to automatically select a random sample of points within a
delineated arable crop production area. The points are filtered on a thresholded ratio of standard
deviation to mean, computed for a local circular kernel of 60 m radius. In this way, only “pure”
radiometric samples are retained, which typically pertain to individual crop parcels or homogeneous
background land use cases (see Fig. 2). The time series of SAR calibrated backscattering coefficients
and L8 surface reflectance values for each filtered point, and their local statistics, are extracted
from GEE into Python geopandas data structures for further cluster analysis and classification.
Clusters can be labelled based on inferred radiometric signatures, which is still at an experimental
stage. In particular, the L8 NDVI temporal composites are very useful to stratify the cluster analysis
into broad crop classes, such as winter and early and late summer crop types. Within these

Figure 2. Multi-temporal composite of Sentinel-1A IW GRD VH images (9 Oct, 14 Nov, 8 Dec 2014, left) and
Landsat-8 NDVI (30 Sep, 1 Nov, 11 Dec 2014, right) prepared in Google Earth Engine. Sentinel imagery is
integrated via Google Maps Engine. Derived statistics are used to infer crop classes.
clusters, further refinement is then based on SAR signatures, which separate classes on the basis of
their surface preparation (differences in surface roughness) and vegetation cover. The
cross-polarized (VH) backscattering coefficients are especially useful in this respect. We also pick
up significant rain events in the SAR signatures (drastic soil moisture change).

4. CONCLUSIONS

We have presented an approach demonstrating the use of GEE in rapid prototyping and testing of crop
mapping and monitoring applications which have the potential to scale up global activities in this
domain to the 10-30 m resolution range. This is likely to have a crucial impact on our capacity to
enumerate crop production statistics that are relevant in our food security monitoring work. We
believe this development addresses a number of crucial requirements that are defined in [5]. Although
we have focused on the use of GEE, other computing configurations, e.g. based on collaborative
networks of local workstations, can be expected to be equally able to address the data handling and
processing needs to scale up. Both environments will benefit hugely from rapidly developing open
source software solutions and should aim at maximising open access to results in order to establish an
evolutionary knowledge base which will help grow the user base through sharing of expertise and
inferred information, stimulate optimised co-variable collection and integration of analysis
techniques, and so provide a basis for long term monitoring of global agricultural production.

5. ACKNOWLEDGEMENT

The results presented in this paper are, partly or wholly, generated with the use of Google Earth
Engine under a trusted tester agreement between the authors’ organization and Google.

6. REFERENCES

[1] European Commission, The European Earth Observation Programme Copernicus,
http://www.copernicus.eu/pages-principales/infrastructure/sentinels/
[2] G. Lemoine and M. Giovalli, “Geo-Correction of High-Resolution Imagery Using Fast Template Matching
on a GPU in Emergency Mapping Contexts”, Remote Sens., Vol. 5, 4488-4502, 2013.
[3] P. van der Voet, C. Bielski and E. van Valkengoed, “Fast Processing Workflow to Provide High
Resolution Sentinel-like Data Extracts at Parcel Level”, Proc. Sentinel-2 2014 Workshop, ESA-ESRIN,
Frascati, Italy, 20-22 May 2014.
[4] Google Inc., “Earth Engine Documentation”, https://sites.google.com/site/earthengineapidocs/
[5] C. Atzberger, “Advances in Remote Sensing of Agriculture: Context Description, Existing Operational
Monitoring Systems and Major Information Needs”, Remote Sens., Vol. 5, 949-981, 2013.
