Devis Peressutti

Jul 31, 2018

Introducing eo-learn
Bridging the gap between Earth Observation and Machine Learning

The availability of open Earth Observation (EO) data through the Copernicus and
Landsat programs represents an unprecedented resource for many EO applications,
ranging from land use and land cover (LULC) monitoring, crop monitoring and
yield prediction, to disaster control, emergency services and humanitarian relief.
Given the large amount of high spatial resolution data at high revisit frequency,
frameworks able to automatically extract complex patterns in such spatio-temporal
data are required. eo-learn aims at providing a set of tools to make prototyping of
complex EO workflows as easy, fast, and accessible as possible.

Example of a remote sensing workflow that can be built using eo-learn. This workflow is used to create a
global service for water-level monitoring of reservoirs and water bodies.

So, what is eo-learn? eo-learn is an open-source Python library that acts as a bridge
between Earth Observation/Remote Sensing and the Python ecosystem for data science and
machine learning (ML). On the one hand, it aims to make entry into the field of remote
sensing easier for non-experts. On the other, it aims to bring the state-of-the-art tools for
computer vision, machine learning, and deep learning that exist in the Python
ecosystem to remote sensing experts.

eo-learn is easy to use, modular in design, and encourages collaboration: sharing
and reusing specific tasks in a typical EO value-extraction workflow, such as
cloud masking, image co-registration, feature extraction and classification.
Everyone is free to use any of the available tasks and is encouraged to improve upon
them, develop new ones and share them with the rest of the community. The library
is shared under the MIT license, so one can use it even if they do not want to share.
There is so much untapped potential in remote sensing that we are not too
concerned about competition using our tools. Who knows, perhaps someone will
save the Planet with it. Everyone wins. That being said, we believe there should be
more sharing in EO, so we’d love to see it done here as well.

In a nutshell
The library uses NumPy arrays and Shapely geometries to store and handle remote
sensing data. It is currently available on our GitHub and coming soon to the Python
Package Index. You can find documentation on ReadTheDocs.

The building blocks of eo-learn are EOPatch, EOTask and EOWorkflow objects. All
data are stored in EOPatch instances, where dictionaries store NumPy arrays and
Shapely geometries for time-dependent spatial information (e.g. Sentinel-2, Landsat
8 or Sentinel-1 bands, cloud masks, etc.), time-independent spatial information (e.g.
Digital Elevation Model, target LULC maps, count of valid pixels, etc.) and
time-dependent and time-independent scalar information (e.g. labels for change
detection, sun angles, etc.). An EOPatch instance is uniquely defined by the coordinates
of a bounding box and the time interval the stored data refer to. Information in any
format readable by Python packages can also be stored in EOPatch objects.

Example of spatial data that can be stored in an EOPatch in raster and vector format. These data are needed
to build a machine learning model for LULC map classification. In addition, non-spatial data as well as any
data format readable in Python can be stored in an EOPatch.
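
To make the layout described above more concrete, here is a minimal sketch of a patch-like container. It is not the eo-learn class: the name PatchSketch and its field names are hypothetical, and nested lists stand in for NumPy arrays so that the example has no dependencies.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class PatchSketch:
    """Hypothetical stand-in for an EOPatch (not the eo-learn class)."""

    bbox: tuple           # (min_x, min_y, max_x, max_y)
    time_interval: tuple  # (start_date, end_date)
    # One dictionary per kind of information, each mapping a name to values.
    temporal_rasters: dict = field(default_factory=dict)  # [time][row][col]
    timeless_rasters: dict = field(default_factory=dict)  # [row][col]
    temporal_scalars: dict = field(default_factory=dict)  # [time]
    timeless_scalars: dict = field(default_factory=dict)  # single values

    def key(self):
        """A patch is identified by its bounding box and time interval."""
        return (self.bbox, self.time_interval)


patch = PatchSketch(
    bbox=(13.35, 45.40, 13.45, 45.50),
    time_interval=(date(2017, 1, 1), date(2017, 12, 31)),
)
# Two acquisition dates of a 2x2 pixel band (illustrative values only).
patch.temporal_rasters["red_band"] = [
    [[0.10, 0.12], [0.11, 0.09]],
    [[0.20, 0.22], [0.21, 0.19]],
]
# A time-independent raster, e.g. a digital elevation model.
patch.timeless_rasters["dem"] = [[310.0, 312.5], [308.0, 311.0]]
# One scalar per acquisition date, e.g. a sun angle.
patch.temporal_scalars["sun_zenith"] = [35.2, 41.7]
```

Keeping one dictionary per kind of information means a task can ask for, say, all time-dependent rasters without inspecting every stored item.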

Any operation on EOPatch instances is performed by EOTask instances. Tasks are
grouped by scope and packaged into separate Python sub-packages, which
currently are:

eo-learn-core: The core sub-package which implements the basic building
blocks (EOPatch, EOTask and EOWorkflow) and commonly used functionalities.

eo-learn-io: Input/output sub-package that deals with obtaining data from
Sentinel Hub services and Geopedia.

eo-learn-mask: Collection of tasks used for masking of data and calculation of
cloud masks.

eo-learn-features: A collection of tasks for extracting data properties and
feature manipulation. Examples include tasks for computing spatio-temporal
and Haralick features, as well as interpolation tasks.

eo-learn-geometry: Sub-package to handle geometric transformations, such as
vector to raster conversion, and sampling of label masks for generating training
sets for ML methods.

eo-learn-ml-tools: Collection of ML utility tasks useful to set up or validate an
ML model.

eo-learn-coregistration: Collection of tasks that implement different image
co-registration techniques.

For a list of currently implemented EOTask classes, have a look here. If the task you are
looking for is not yet implemented, worry not! Creating a new EOTask is simple.
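
As a rough sketch of what a custom task could look like, assume that a task is a class whose execute method receives a patch and returns it. The TaskSketch base class, the AddValidCount task and the dict-based patch below are hypothetical stand-ins written for this illustration, not the eo-learn API.

```python
class TaskSketch:
    """Hypothetical stand-in for the EOTask base class."""

    def execute(self, patch):
        raise NotImplementedError

    def __call__(self, patch):
        # Calling a task simply runs its execute method.
        return self.execute(patch)


class AddValidCount(TaskSketch):
    """Count, for every pixel, on how many dates the pixel is valid."""

    def __init__(self, mask_name, output_name):
        self.mask_name = mask_name
        self.output_name = output_name

    def execute(self, patch):
        masks = patch[self.mask_name]  # [time][row][col] of booleans
        rows, cols = len(masks[0]), len(masks[0][0])
        patch[self.output_name] = [
            [sum(1 for frame in masks if frame[r][c]) for c in range(cols)]
            for r in range(rows)
        ]
        return patch


# A plain dict stands in for the patch: two dates of a 1x2 pixel validity mask.
patch = {"valid": [[[True, False]], [[True, True]]]}
patch = AddValidCount("valid", "valid_count")(patch)
```

Parameters go into the constructor and the work happens in execute, so the same task instance can be reused on any number of patches.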

EOTask classes created by users can then be added to the code-base with a simple
pull request, adding new tools and functionalities that can benefit the entire
community.

Example of NDVI trends derived from Sentinel-2 over a year of observations. Red shows values for cultivated
land, blue for built-up area, and green for grassland. eo-learn provides tasks to handle spatio-temporal
processing such as masking and filtering of cloudy observations (empty circles), and interpolation of valid
data (filled circles) to generate an interpolated time-series (continuous line). Different interpolation methods
(e.g. linear, univariate spline, B-spline, Akima) have been implemented.
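
The processing described in the caption can be sketched for a single pixel as follows. This is an illustration in plain Python rather than eo-learn code: the function names and sample values are made up, only linear interpolation is shown, and acquisition days are assumed to be sorted and distinct.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def interpolate_valid(days, values, valid, query_days):
    """Linearly interpolate using valid observations only.

    Query days outside the range of valid observations take the value of
    the nearest valid observation.
    """
    points = [(d, v) for d, v, ok in zip(days, values, valid) if ok]
    if not points:
        raise ValueError("no valid observations to interpolate from")
    result = []
    for q in query_days:
        if q <= points[0][0]:
            result.append(points[0][1])
        elif q >= points[-1][0]:
            result.append(points[-1][1])
        else:
            for (d0, v0), (d1, v1) in zip(points, points[1:]):
                if d0 <= q <= d1:
                    weight = (q - d0) / (d1 - d0)
                    result.append(v0 + weight * (v1 - v0))
                    break
    return result


# Illustrative values for one pixel; day 10 is flagged as cloudy.
days = [0, 10, 20, 30]
nir = [0.5, 0.9, 0.6, 0.7]
red = [0.1, 0.8, 0.2, 0.1]
cloudy = [False, True, False, False]

series = [ndvi(n, r) for n, r in zip(nir, red)]
valid = [not c for c in cloudy]
smooth = interpolate_valid(days, series, valid, [0, 10, 20, 25, 30])
```

The cloudy observation on day 10 is ignored, and its value is replaced by one interpolated between the valid observations on days 0 and 20.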

Finally, a complete pipeline is built by connecting tasks using EOWorkflow.
EOWorkflow allows the definition of a workflow in the form of an acyclic graph, where
EOTask instances are vertices of the graph and EOPatch instances flow through the
edges connecting the vertices. Once the workflow has been defined, it can be run in
parallel on different input EOPatch instances, making it possible to automatically process large
amounts of spatio-temporal data. EOWorkflow also provides execution monitoring
reports and logs, such as input parameters of each EOTask, elapsed times, memory usage
and raised exceptions, facilitating execution control and versioning of complete ML
pipelines.
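
The idea of running tasks as an acyclic graph can be sketched with the Python standard library alone (graphlib requires Python 3.9 or later). This is not how EOWorkflow is implemented: run_workflow and run_on_many are hypothetical helpers, plain dicts stand in for patches, and for simplicity a single patch is passed through the tasks in topological order instead of being routed along individual edges.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies, patch):
    """Run tasks in an order that respects the dependency graph.

    tasks: maps a task name to a callable taking and returning a patch.
    dependencies: maps a task name to the set of tasks that must run first.
    Returns the final patch and a report of (task name, elapsed seconds).
    """
    # static_order raises graphlib.CycleError if the graph is not acyclic.
    order = TopologicalSorter(dependencies).static_order()
    report = []
    for name in order:
        start = time.perf_counter()
        patch = tasks[name](patch)
        report.append((name, time.perf_counter() - start))
    return patch, report


def run_on_many(tasks, dependencies, patches, workers=4):
    """Apply the same workflow to several input patches in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: run_workflow(tasks, dependencies, p), patches))


# Toy tasks: each returns a new dict so that inputs are never mutated.
tasks = {
    "load": lambda p: {**p, "values": [1, 2, 3]},
    "mask": lambda p: {**p, "values": [v for v in p["values"] if v != 2]},
    "sum": lambda p: {**p, "total": sum(p["values"])},
}
dependencies = {"load": set(), "mask": {"load"}, "sum": {"mask"}}

results = run_on_many(tasks, dependencies, [{"id": 0}, {"id": 1}])
```

Each entry in results holds the processed patch together with a small execution report, which is the kind of information a monitoring log could be built from.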

Check the README and the documentation for more technical information on how
eo-learn works.

Example applications
eo-learn was designed to provide the most common operations for processing
spatio-temporal data, so that complete remote sensing applications can be built with it. In
order to showcase the potential of eo-learn in more detail, we will shortly post two
blog series: one on land use and land cover classification at a country level using
machine learning, and one on the creation of a complete service for automatic global
water-level monitoring, both using eo-learn and Copernicus data. Some
material to get you started on these use cases can already be found in the examples.

Example of water-level segmentation using multiple sources, in particular Sentinel-1 (left), Sentinel-2
(middle) and Digital Elevation Model (right). Using multiple sources leads to a more accurate delineation of
the water body.

Given our well-known interest in working with time-series and creating time-lapses,
in this blog post we share a simple EOWorkflow to automatically generate time-lapses
given a bounding box and a time-range. To generate a time-lapse like the one shown
below, the required tasks are S2L1CWCSInput, AddCloudMaskTask, SimpleFilterTask
and a custom MakeGIFTask.
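
The filtering step in such a workflow can be illustrated with a small sketch. It assumes that frames are dropped when too large a fraction of their pixels is flagged as cloudy. The function names and the threshold are hypothetical, and this is not the code of SimpleFilterTask.

```python
def cloud_coverage(cloud_mask):
    """Fraction of pixels flagged as cloudy in a 2-D boolean mask."""
    pixels = [flag for row in cloud_mask for flag in row]
    return sum(pixels) / len(pixels)


def filter_frames(frames, cloud_masks, max_coverage):
    """Keep only the frames whose cloud coverage does not exceed max_coverage."""
    return [
        frame
        for frame, mask in zip(frames, cloud_masks)
        if cloud_coverage(mask) <= max_coverage
    ]


# Acquisition dates stand in for the image frames themselves.
frames = ["2017-01-05", "2017-01-15", "2017-01-25"]
cloud_masks = [
    [[False, False], [False, False]],  # clear
    [[True, True], [True, False]],     # mostly cloudy
    [[False, False], [False, True]],   # one cloudy pixel out of four
]
kept = filter_frames(frames, cloud_masks, max_coverage=0.3)
```

Only the frames that pass the filter would then be handed to the step that assembles the animation.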

Time-lapse of the Ouarzazate Solar Power Station in Morocco. Originally created by Simon Gascoin on Twitter.

And if the time-series is affected by orthorectification issues, as is often the case for
Sentinel-2 images acquired prior to 2017, one can add a RegistrationTask to
estimate and compensate for the misalignment between time-frames, as
shown below. The script used to generate these GIFs can be found here.

Time-lapse after frame co-registration using a rigid transformation. Misalignments can be seen for initial
frames as registration errors accumulate over the time-series.
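
One way to see why such errors can accumulate is sketched below. It assumes that each frame is registered to its neighbour and that the pairwise shifts are then composed to reach a common reference frame. Both that assumption and the numbers are illustrative and are not taken from the RegistrationTask implementation.

```python
def chain_shifts(pairwise_shifts):
    """Compose frame-to-frame (dx, dy) shifts into shifts relative to the reference."""
    total_x = total_y = 0.0
    cumulative = []
    for dx, dy in pairwise_shifts:
        total_x += dx
        total_y += dy
        cumulative.append((total_x, total_y))
    return cumulative


# True motion: each frame is shifted by 1 pixel in x relative to its neighbour.
true_pairwise = [(1.0, 0.0)] * 4
# Estimated motion: every pairwise estimate is off by a quarter of a pixel.
estimated_pairwise = [(1.25, 0.0)] * 4

true_total = chain_shifts(true_pairwise)
estimated_total = chain_shifts(estimated_pairwise)

# Error in x for each frame, relative to the reference frame.
errors = [est[0] - true[0] for est, true in zip(estimated_total, true_total)]
```

In this toy case the error grows with every step away from the reference, so the frames furthest from the reference end up the most misaligned.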

A key resource for the success of eo-learn is, of course, the community, both of
remote sensing and machine learning experts. We therefore invite anyone with
an interest in developing large-scale remote sensing applications using spatio-temporal
satellite imagery to try eo-learn out, give us feedback, and possibly contribute to it.
We welcome code improvements, new EOTask classes, and new workflow examples.
Users have already contributed some tasks, as is the case for the Haralick features
created by developers at Magellium.

We are constantly working on new functionalities and on improving the stability and
efficiency of tasks and workflows, so some things are likely to change in the future as the library
grows. However, we will try to minimise breaking changes as much as possible in
future releases. The first beta release on PyPI is planned in a couple of weeks.

We will be showcasing eo-learn at the International Conference on Knowledge
Discovery and Data Mining in London on 19th-23rd August, so please stop by if you
are planning to attend. Stay tuned for our series on land use and land cover
classification and on how to set up a complete service for global water-level
monitoring of reservoirs and water bodies.

eo-learn is a by-product of the Perceptive Sentinel European project. The project has
received funding from the European Union’s Horizon 2020 Research and Innovation
Programme under Grant Agreement 776115.

