Summary Paper #1 Daniel


01 // summary : paper #1

Tags

Created: 9 October 2023, 21:11

Modified: 12 October 2023, 19:32

Introduction

📌 Link to the paper: https://www-sciencedirect-com.sid2nomade-1.grenet.fr/science/article/pii/S0034425722003285

A dictionary of frequently used terms


LST: land surface temperature. It is the main indicator/statistic in the paper. It is usually expressed as a function of time.

LCC: land cover change. A change in the LC (land cover) statistic.

Decomposition: expressing a statistic as a sum of three other statistics (trend, seasonality, and error): Yt = Tt + St + et. Each term in the sum is called a "component".

Trend (Tt): long-term gradual variation of a statistic. It usually takes the form of a piecewise linear function. For the LST, it responds to climate change, land management, or land degradation.

Seasonality (St): periodic regular variations associated with the stages of a cycle. It takes the form of a piecewise harmonic function. For the LST, it is associated with the stages of a solar year.

Remainder (et): the difference of Yt with respect to Tt + St, i.e. et = Yt − (Tt + St).

Abrupt changes: variations of large magnitude relative to the time over which they occur. They can appear in the trend, the seasonality, or both simultaneously. They are usually caused by natural or anthropogenic disturbances.
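To make the decomposition Yt = Tt + St + et concrete, here is a minimal sketch in Python (all numbers are made up for illustration, not taken from the paper): a series is built as trend + seasonality + remainder, and any one component can be recovered from Yt given the other two.

```python
import math

# Hypothetical decomposition Y_t = T_t + S_t + e_t on a 46-step "year".
PERIOD = 46                      # one year at 8-day resolution
n = 3 * PERIOD                   # three years of data

trend = [288.0 + 0.01 * t for t in range(n)]                            # T_t: slow drift
season = [20.0 * math.sin(2 * math.pi * t / PERIOD) for t in range(n)]  # S_t: harmonic
remainder = [0.0] * n                                                   # e_t: zero here, for clarity

y = [trend[t] + season[t] + remainder[t] for t in range(n)]             # Y_t = T_t + S_t + e_t

# Given Y_t and any two components, the third is just the residual:
recovered_season = [y[t] - trend[t] - remainder[t] for t in range(n)]
print(max(abs(recovered_season[t] - season[t]) for t in range(n)))      # ~0.0
```

The three algorithms below differ precisely in how they estimate Tt and St when only Yt is observed.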

Key points
Previous efforts in LST time-series analysis have mainly employed interannual LST data to characterize trends and abrupt changes on an annual scale.

Interannual time series ignore the intra-year variability of LST and cannot provide the specific timings of abrupt changes. Using intra-annual LST time series (e.g., 8-day, 16-day, and monthly data) can simultaneously capture trend, seasonality, and abrupt changes. Intra-annual time series are consequently preferred over interannual data.

Methods with intra-annual time series can be roughly grouped into two categories: change detection with unknown
component forms and change detection with known component forms.

Change detection with unknown component form: this type of method first iteratively estimates trend and
seasonality, and then detects abrupt changes by segmenting them. Here, we are interested in DBEST.

Change detection with known component form: this type of method defines the trend using a piecewise linear
model and approximates the seasonality using a harmonic model. Here, we are interested in BFAST and
BEAST.

DBEST (unknown component form)

Function: 1) trend and seasonality decomposition; 2) abrupt change detection in the trend.

Advantages: characterizes both abrupt and non-abrupt changes in the trend component.

Disadvantages: 1) accuracy decreases in the presence of high-frequency variation; 2) many thresholds and parameters can lead to different performance; 3) cannot deal with missing data.

Applications: vegetation change detection based on VI time series.

BFAST (known component form)

Function: 1) trend and seasonality decomposition; 2) abrupt change detection in the trend and seasonality.

Advantages: distinguishes trend abrupt changes from seasonal abrupt changes.

Disadvantages: 1) sensitive to the specification of the parameters; 2) cannot deal with missing data; 3) computationally expensive.

Applications: LCC, fire detection, or seasonal decomposition using time series of VI, reflectance, and LST.

BEAST (known component form)

Advantages: 1) distinguishes trend abrupt changes from seasonal abrupt changes; 2) reduces the uncertainty of the single best model in traditional methods by combining many competing models; 3) able to deal with missing values; 4) offers a rich set of diagnostic statistics to assist interpretation.

Disadvantages: 1) sensitive to the specification of parameters; 2) computationally expensive.

Applications: LCC, ice storm, and fire detection based on VI time series; reconstruction of LST time series.

Brief review of previous methods


Tool needed: STL
"Seasonal-Trend decomposition procedure based on Loess (locally weighted regression)".

DBEST and BFAST deduce their seasonality using STL. They modify its trend output.

http://www.nniiem.ru/file/news/2016/stl-statistical-model.pdf
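STL itself (the loess-based, iterative procedure) is available in libraries such as statsmodels; as a rough stand-in for the idea, here is a classical decomposition in plain Python — estimate the trend with a centered moving average over one period, then average the detrended values by phase to get the seasonality. This is a simplification for intuition, not the actual STL algorithm.

```python
# Minimal classical decomposition (a crude stand-in for STL, illustration only).
import math

PERIOD = 12
n = 5 * PERIOD
y = [10 + 0.1 * t + 3 * math.sin(2 * math.pi * t / PERIOD) for t in range(n)]

# 1) Trend: centered moving average spanning one full period (even period ->
#    window of PERIOD+1 points with half weight on the two edge points).
half = PERIOD // 2
trend = [None] * n
for t in range(half, n - half):
    window = y[t - half:t + half + 1]
    trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / PERIOD

# 2) Seasonality: average the detrended values at each phase of the cycle.
detrended = [(y[t] - trend[t], t % PERIOD) for t in range(n) if trend[t] is not None]
season = []
for phase in range(PERIOD):
    vals = [d for d, p in detrended if p == phase]
    season.append(sum(vals) / len(vals))
mean_s = sum(season) / PERIOD
season = [s - mean_s for s in season]   # center the seasonal component

# 3) Remainder: whatever is left over.
resid = [y[t] - trend[t] - season[t % PERIOD] for t in range(n) if trend[t] is not None]
```

On this clean toy series the moving average recovers the linear trend exactly and the residuals are ~0; STL's loess smoothing and iteration make the same idea robust to noise and outliers.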

DBEST
Obtains linear segments of the trend component via trend estimation and trend segmentation.

In trend estimation, major discontinuities of a time series are firstly identified by level-shift points. Then, STL
decomposition is performed for each segment separated by the level-shift points to estimate the trend and
seasonality.

In trend segmentation, the turning points (i.e., breakpoints) are first identified based on rank statistics and sorted according to the local change in the trend. Then, the breakpoints are specified by minimizing the Bayesian information criterion (BIC) for least-squares fitting. The number of output breakpoints is determined by the maximum number of breakpoints m or the change-magnitude threshold ε.

Trend estimation method


It requires three parameters: θ1, θ2, and D.

θ1 captures a large absolute difference ΔYt = |Yt − Yt+1| between a data point Yt and the next data point Yt+1.

Once a point Yt* presenting a large difference with the next data point is located, θ2 captures a large difference ΔȲt,D in the mean value of Y before and after t. Specifically, the difference between:

The mean value of Y on the interval [t − D, t], noted Ȳ[t−D,t], and

The mean value of Y on the interval [t, t + D], noted Ȳ[t,t+D].

D is then the size of the window over which we take the two mean values of Y, before t and after t.

Yt* is marked as a "candidate level-shift point" (or LSP) if it verifies:

1. ΔYt > θ1, and

2. ΔȲt,D > θ2

The candidate LSPs are sorted in descending order by the value of their ΔYt, the first candidate LSP being marked as the "most important LSP". Then, from the second most important LSP onwards, going from the earliest time step t to the latest (from left to right, basically), we check that the current candidate LSP is at a distance in time of D or greater from the previously retained LSP; otherwise it is discarded.

With the definitive list of LSPs, we divide the series with the LSPs as cutpoints and perform an STL decomposition on every segment. We then obtain our trend function and seasonality function.
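A literal reading of the two conditions can be sketched as follows (toy data and thresholds; the real DBEST orders and spaces candidates with more care, so treat this as an illustration only):

```python
# Sketch of DBEST's candidate level-shift-point (LSP) test.
# theta1: jump threshold on |Y_t - Y_{t+1}|; theta2: threshold on the
# difference of windowed means before/after t; D: window size.
def candidate_lsps(y, theta1, theta2, D):
    candidates = []
    for t in range(D, len(y) - D - 1):
        jump = abs(y[t] - y[t + 1])                      # condition 1: ΔY_t > θ1
        mean_before = sum(y[t - D:t + 1]) / (D + 1)      # mean of Y on [t-D, t]
        mean_after = sum(y[t:t + D + 1]) / (D + 1)       # mean of Y on [t, t+D]
        shift = abs(mean_after - mean_before)            # condition 2: ΔȲ_{t,D} > θ2
        if jump > theta1 and shift > theta2:
            candidates.append((t, jump))
    # Sort by ΔY_t (most important LSP first), then keep points at least D apart.
    candidates.sort(key=lambda c: -c[1])
    kept = []
    for t, _ in candidates:
        if all(abs(t - k) >= D for k in kept):
            kept.append(t)
    return sorted(kept)

# Toy series: flat at 0 K, then a clean level shift to 5 K at t = 20.
y = [0.0] * 20 + [5.0] * 20
print(candidate_lsps(y, theta1=1.0, theta2=1.0, D=5))    # → [19]
```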

Segmentation method: turning points detection


Here, we focus on the obtained trend function.

We classify all points in the time series as a "peak", a "valley", or neither. A peak at t is a point where the function was increasing from t−1 and is decreasing towards t+1; a valley is the symmetric case. These points are called "turning points" for reasons that will become clear later.

We start an iterative loop involving another function: the turning point criterion function.

1. We create lines that connect every turning point to the next one. Between two turning points, and for each side of the line (over and under), we detect the point with the maximum perpendicular distance to the line. These are called "turning point candidates".

2. We pass every turning point candidate through a "turning point function" g(i). This function returns 1 if the point is a peak or a valley, or if the turning point candidate has a perpendicular distance to the line greater than the parameter ε, and 0 otherwise. For the non-candidates, it also returns 0.

Notice that, when the first iteration ends, the resulting set of turning points is usually bigger than the set that only contained the peaks/valleys: we add points.

We start again from 1) until we end an iteration with no new turning points. Normally, when the iteration ends, the level-shift points are almost always naturally included as turning points. If this is not the case, we manually add the level-shift points to the set of turning points.

Finally, the set of turning points contains: the starting point of the time series, all level-shift points, and all points for which the turning point criterion is fulfilled.
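The "maximum perpendicular distance to the connecting line" step is the same geometric test used in Ramer-Douglas-Peucker line simplification; a sketch (toy data, illustration only):

```python
# Sketch of the "maximum perpendicular distance to the chord" step between
# two turning points (same geometry as Ramer-Douglas-Peucker simplification).
def farthest_from_chord(y, i, j):
    """Between turning points i and j, find the point with the largest
    perpendicular distance to the straight line joining (i, y[i]) and (j, y[j])."""
    dx, dy = j - i, y[j] - y[i]
    norm = (dx * dx + dy * dy) ** 0.5
    best_t, best_d = None, 0.0
    for t in range(i + 1, j):
        # Distance from (t, y[t]) to the line through the two endpoints.
        d = abs(dy * (t - i) - dx * (y[t] - y[i])) / norm
        if d > best_d:
            best_t, best_d = t, d
    return best_t, best_d

# Toy trend: a straight line except for a bump at t = 5.
trend = [0, 1, 2, 3, 4, 9, 6, 7, 8, 9, 10]
t, d = farthest_from_chord(trend, 0, 10)
print(t, round(d, 2))   # → 5 2.83 (the bump sticks out the most)
```

If that distance exceeds ε, the point becomes a new turning point and the chords are rebuilt, which is exactly what drives the iteration above.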
Segmentation method: turning points validation
"Valid turning points" are defined as turning points that significantly reduce the residual sum of squares of a least-squares fit to the trend time series (explained in the next paragraph) and do not result in overfitting.

The number of valid turning points can be determined by minimizing an information criterion. The Bayesian Information Criterion (BIC) is a criterion for optimal model selection among a finite set of models. When computing the least-squares fit, adding more turning points reduces the residual sum of squares but also increases the number of model parameters, which may result in an over-fitting problem. BIC resolves this problem by introducing a penalty term for the number of parameters in the model. Given a finite set of estimated models, the preferred or optimal model is the one with the lowest BIC value.

We pass all points through a "trend local change" function h(i). For a turning point at time t, this function computes the difference Tt−1 − Tt. We then sort all turning points into descending order according to the magnitude of their trend local change (in absolute value).

Notice: for the last turning point from left to right, we compute its trend local change with respect to the last point in the time series.

There should be u total turning points.

We index their trend local change with j: j = 1 is the turning point with the greatest trend local change, j = 2 is the turning point with the second greatest, and so on until j = u.

We start doing linear regression by picking a subset of size s of the u total turning points. We pick by prioritizing the turning points with the greatest trend local change. Once we find the subset of turning points whose linear regression minimizes the BIC, those s points are "valid turning points" or "breakpoints".

We run a piecewise linear regression on every interval whose bounds are the breakpoints. Note that the resulting regression does not necessarily pass through all breakpoints. We could also choose to make a piecewise linear regression on only m breakpoints out of all the s breakpoints.
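The BIC trade-off in this validation step can be sketched numerically (the RSS values below are hypothetical; for a least-squares fit with Gaussian errors, BIC = n·ln(RSS/n) + k·ln(n), with k the number of free parameters):

```python
import math

def bic(rss, n, k):
    """Bayesian information criterion for a least-squares fit:
    n data points, k free parameters, residual sum of squares rss."""
    return n * math.log(rss / n) + k * math.log(n)

# Adding breakpoints lowers the RSS but pays a log(n) penalty per extra
# parameter (here, hypothetically, 2 parameters per added breakpoint).
n = 100
fits = {0: 50.0, 1: 10.0, 2: 9.5, 3: 9.4}   # hypothetical RSS per breakpoint count
scores = {s: bic(rss, n, k=2 + 2 * s) for s, rss in fits.items()}
best = min(scores, key=scores.get)
print(best)   # → 1: the extra breakpoints beyond the first stop paying for themselves
```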

BFAST
BFAST outputs trend and seasonality, and the number, timing, and confidence interval of abrupt changes in the two components.

Before we start, we need to mention two mathematical tools used here:

The Ordinary Least Squares (OLS) residuals-based MOving SUM (OLS-MOSUM) test and least squares are used to determine the number and position of the breakpoints.

The component coefficients are estimated using a robust regression based on M-estimation.

Note that, since the OLS-MOSUM test for examining breakpoints requires regularly spaced data, BFAST cannot handle time series with missing values (the gaps break the equal spacing).

The algorithm is as follows:

1. First, we fix the parameters. All parameters have a heavy impact on the final result.

Maximum number of breakpoints in trend (m) and seasonality (q). This is the most influential parameter.

Minimum separation interval ϕ (between adjacent breakpoints).

Maximum number of iterations.

2. Get the seasonality from STL.

3. Detect the trend breakpoints. We use OLS-MOSUM here.

4. Estimate the trend coefficients αi, βi. We use a robust regression based on M-estimation. Finally, we adjust the previous trend with the new coefficients.

5. Detect the seasonal breakpoints. We use OLS-MOSUM here.

6. Estimate the seasonality coefficients γj,h, δj,h. We use a robust regression based on M-estimation. We adjust the previous seasonality with the new coefficients.

7. We repeat steps 3 to 6. The process ends when the number and position of breakpoints are unchanged.
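The OLS-MOSUM test in steps 3 and 5 is, at its core, a moving sum of OLS residuals: when a window of residuals drifts far from zero, the fitted structure has broken somewhere in that window. A bare-bones sketch (the real test compares the maximum statistic against a critical value from a boundary-crossing distribution, which is omitted here):

```python
# Sketch of a moving sum (MOSUM) of OLS residuals for break detection.
def mosum(residuals, h):
    """Moving sums of residuals over windows of length h, scaled by a
    crude estimate of the residual standard deviation."""
    n = len(residuals)
    sigma = (sum(r * r for r in residuals) / n) ** 0.5 or 1.0
    stats = []
    for start in range(n - h + 1):
        s = sum(residuals[start:start + h])
        stats.append(s / (sigma * h ** 0.5))
    return stats

# Residuals of a model that misses a level shift two thirds of the way through:
resid = [0.1, -0.1] * 10 + [2.0] * 10   # structure breaks at index 20
stats = mosum(resid, h=5)
peak = max(range(len(stats)), key=lambda i: abs(stats[i]))
print(peak)   # → 20: windows covering the break give the largest moving sum
```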
BEAST
BEAST infers the number and positions of the breakpoints, and the coefficients in trend and seasonality, in one step. BEAST also adopts a piecewise linear model and a piecewise harmonic model to parameterize the trend and seasonality. However, BEAST does not require the specification of the harmonic order.

As a result, BEAST outputs trend and seasonality; the number, timing, occurrence probability, and confidence interval of abrupt changes in the two components; and the time-series seasonal harmonic orders.

BEAST takes the following inputs:

Maximum number of breakpoints in trend (m) and seasonality (q). Again, this is the most influential parameter.

Minimum separation interval ϕ (between adjacent breakpoints).

Maximum harmonic order (H) (optional).

Sample number. This is specifically a parameter for the reversible-jump Markov chain Monte Carlo (MCMC) sampling.

BEAST uses a Bayesian model to find the optimal value and posterior probabilities of the following elements:

Model structure parameters M: this includes the number (effective, not maximum) and timing of breakpoints in trend and seasonality. It also includes the harmonic order. For the breakpoints, BEAST infers them using Bayesian Model Averaging (BMA).

Segment-specific coefficient parameters φ: trend coefficients {αi, βi} and seasonality coefficients {γj,h, δj,h}.

Summary table

DBEST

Mechanism: Yt = Tt + St + et. Decomposes Tt and St using separate STL runs. Detects abrupt changes in Tt based on trend segmentation.

Input: level-shift thresholds (θ1 and θ2) and duration threshold (D); maximum number of breakpoints in trend (m) or change-magnitude threshold (ε).

Output: Tt and St; abrupt changes and their duration in Tt.

BFAST

Mechanism: Yt = Tt + St + et. Tt: piecewise linear model; St: piecewise harmonic model (harmonic order = 3). Iteratively decomposes data into Tt and St and detects abrupt changes.

Input: maximum numbers of breakpoints in Tt (m) and St (q); minimum separation interval (ϕ); number of iterations.

Output: Tt and St; abrupt changes and their confidence intervals in Tt and St.

BEAST

Mechanism: Yt = Tt + St + et. Tt: piecewise linear model; St: piecewise harmonic model. Simultaneously infers abrupt changes and Tt and St based on BMA.

Input: maximum numbers of breakpoints in Tt (m) and St (q); maximum harmonic order (H); minimum separation interval (ϕ); sample number.

Output: Tt and St; abrupt changes, their probabilities, and their confidence intervals in Tt and St.

Data description
Simulated data
An LST time series was simulated with 10 years of length and a yearly period of 46 samples, which means an 8-day temporal resolution.

Trend, seasonality and remainder were all simulated. The LST time series is the sum of those three. This result is called in this section the "basic time series", particularly because other datasets are created from this time series.

The trend is a piecewise linear function that changes its slope (from −0.14 K to 0.14 K) every decade. Its intercept (initial LST at t = 0) was set to 288 K.

The seasonality was simulated by a harmonic function. The harmonic order (i.e. the multiple of the "fundamental" frequency, the base frequency) was 1st or 2nd. The amplitudes changed from 20 K to 40 K. The phases varied from 1/9 period to 1/3 period.

The remainder is just random numbers from −3 K to 3 K with a mean value of 0 K.

The scenario where data is missing is also simulated. Specifically, random data points are removed with certain proportions (0–40%). These scenarios imitate data discontinuities caused by clouds, snow, or the LST retrieval algorithm.
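The basic time series can be reproduced in spirit with a short script (the parameter values come from this section, but the per-step slope, the uniform noise, and the exact random draws are assumptions of this sketch):

```python
import math
import random

random.seed(0)
PERIOD = 46                     # 46 samples per year = 8-day resolution
YEARS = 10
n = YEARS * PERIOD

slope = 0.14 / PERIOD           # assumed: 0.14 K per year, expressed per time step
trend = [288.0 + slope * t for t in range(n)]                  # intercept 288 K
amp, phase = 30.0, (1 / 9) * 2 * math.pi                       # within the stated ranges
season = [amp * math.sin(2 * math.pi * t / PERIOD + phase) for t in range(n)]
remainder = [random.uniform(-3.0, 3.0) for _ in range(n)]      # noise in [-3 K, 3 K]

lst = [trend[t] + season[t] + remainder[t] for t in range(n)]  # the "basic time series"

# Missing-data scenario: drop a random 20% of the points (clouds, snow,
# or gaps from the LST retrieval algorithm).
missing = set(random.sample(range(n), int(0.2 * n)))
lst_gappy = [None if t in missing else lst[t] for t in range(n)]
print(len(lst), sum(v is None for v in lst_gappy))   # 460 points, 92 missing
```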

Six datasets were then simulated using the basic time series :

Dataset 1: only contains basic time series. They were directly employed to evaluate the performance of
component decomposition.

Dataset 2: this dataset contains one abrupt change in the trend component of the basic time series. The
abrupt change in different time series had different timing coordinates, different change magnitudes (5–10K),
and different trend slopes after the abrupt change.

Dataset 3: same as dataset 2, but with two abrupt changes on the trend component instead of one.

Dataset 4: one seasonal abrupt change was simulated by changing the harmonic parameters (order,
amplitude, and phase) after the abrupt change.

Dataset 5: same as dataset 4, but with two abrupt changes on the seasonality component instead of one.

Dataset 6: this dataset presents one trend abrupt change and one seasonal abrupt change.

Real data (MODIS)


Apart from the simulated data, the three algorithms were tested on real data from around the world, particularly data that presented significant disturbances. The statistic studied here was land cover (LC). Examples of the events that occurred in the places of interest were:

Deforestation with LC changing from forest to savannas (SAV)

Agricultural expansion with LC changing from barren (BAR) to cropland (CRO)

Farmland abandonment with LC changing from CRO to SAV

Urbanization with LC changing from grassland (GRA) to urban and built-up land (UBL)

Design of the comparison and criteria


Setting all the parameters
For DBEST, the parameters were fixed following the recommendation of Jamali et al. (2015): θ1 = 0.1, θ2 = 0.2 and D = 2 years. Also, missing data was filled with interpolation.

Then, the common parameters for BFAST and BEAST are set. These are the maximum number of breakpoints for trend (m) and seasonality (q), and the minimum separation interval (ϕ). In the paper, m = q = 3 and ϕ = 0.5 years. Then,

For BFAST, the maximum number of iterations was set to 5.

For BEAST, the maximum seasonal harmonic order was set to 3 and the sample number of the MCMC simulation was 10,000.
Measuring performance
We are interested in measuring two criteria: detection accuracy of abrupt changes, and component decomposition
accuracy.

These criteria are differently measured for the simulated data and the real data.

Detection accuracy of abrupt changes

Detection accuracy of abrupt changes (simulated data)


If the simulated dataset did not contain abrupt changes, the commission error (calculated by dividing the
number of false detections by the total number of datasets) was employed to compare the three methods.

Otherwise, we use two performance indicators: the F1 score and the MAE∂t.

For the three methods, a breakpoint detected within half a period of the actual breakpoint (i.e., the simulated
breakpoint in the simulated data) was determined to be a correct detection; otherwise, it was defined as a false
detection.

F1 score, calculated based on the user's accuracy (UA) and producer's accuracy (PA):

F1 = 2 × (PA × UA) / (PA + UA), with PA = TD / TN and UA = TD / DN

where TD is the number of correct detections, TN the total number of real breakpoints, and DN the total number of detections.

PA is the total of correct breakpoints detected over all the real breakpoints.

UA is the total of correct breakpoints detected over all the detections.

The highest value of F1 is 1.0, indicating perfect PA and UA.

The lowest value of F1 is 0.0, indicating PA = 0 or UA = 0.


The second indicator is MAE∂t, which is just the average of the absolute distance in time between a detected breakpoint and a real breakpoint:

MAE∂t = (1/n) × Σᵢ₌₁ⁿ ∂tᵢ, with ∂tᵢ = |t_ref − t̂|

Detection accuracy of abrupt changes (real data)


For MODIS LST time series, owing to the small amount of data and the unknown underlying disturbances, the
ability of the change detection method to detect disturbances was evaluated by comparing the number of correct
detections.

Considering the complexity of disturbances that occur in reality, a breakpoint detected within one year of an actual LCC disturbance, or within six months of an actual small disturbance, is determined to be a correct detection.

Component decomposition accuracy

Component decomposition accuracy (simulated data)


In the simulated dataset, the root mean square error (RMSE) values and correlation coefficient (R) values between each predicted and simulated (i.e., actual) component (i.e., trend and seasonality) were employed to measure the decomposition accuracy of each component.

The RMSE of the predicted trend component is denoted RMSE_Tt, and the RMSE of the predicted seasonality is denoted RMSE_St.
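Both accuracy measures in a short pure-Python sketch (toy data, for illustration):

```python
import math

def rmse(pred, actual):
    """Root mean square error between two equal-length series."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# A hypothetical predicted trend with a constant 0.1 K bias against the actual one:
actual = [288 + 0.01 * t for t in range(100)]
pred = [a + 0.1 for a in actual]
print(round(rmse(pred, actual), 3), round(pearson_r(pred, actual), 3))  # → 0.1 1.0
```

Note that a constant bias shows up in the RMSE but not in R, which is one reason the paper reports both.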


Component decomposition accuracy (real data)


The true trend and seasonality are not known in the MODIS LST time series, so the RMSE between the sum of the predicted trend and seasonality and the time-series observations (RMSE_Tt+St) was employed to measure the overall fitting accuracy of Tt + St.


Comparison with simulated data


Results of detecting breakpoints

This finding suggests that as the complexity of the data increases, BEAST and DBEST exhibited more stable F1 than BFAST. On the contrary, BFAST was ineffective for detecting trend abrupt changes when seasonal abrupt changes occur simultaneously.

When there are no abrupt changes in the trend, BEAST performed equally well for both components. BFAST had a good estimation of the breaks in seasonality, but bad estimations of the breaks in the trend, which suggests that BFAST has low robustness to data with high complexities. DBEST had bad results all round, especially detecting false breakpoints in every simulated dataset.

With respect to the MAE∂t indicator, BEAST performed the best in all cases. BFAST had results comparable to BEAST on datasets that didn't present trend breakpoints, and worse but still acceptable results when the data did present trend breakpoints. DBEST again had the worst performance.
Results of decomposition accuracy
Here again, BEAST had the best performance, with the lowest RMSE (0.28 K and 0.27 K) and the best R for trend and seasonality, respectively: 0.90 and 0.99.

DBEST had a better decomposition stage than BFAST, with RMSEs of 0.64 K and 1.37 K and R of 0.78 and 0.98 for trend and seasonality, respectively.

BFAST had the worst all-around results, with RMSEs of 1.34 K and 1.46 K and R values of 0.69 and 0.97, respectively.

For datasets 1, 2, and 3, BFAST showed better performance than DBEST.

For datasets 4, 5 and 6, BFAST exhibited larger RMSEs and smaller R values than DBEST. This reduction in accuracy suggests that BFAST is inefficient for data with high seasonal variation.

Additionally, it is interesting to note that the R values of the trend components were smaller than those of the seasonal components. This is mainly because the high seasonality of the LST produces a spurious ("mirage") correlation between the predicted and actual seasonal components.

Improvement of BEAST
Problems with BEAST and modifications
Although the accuracy of BEAST was significantly higher than that of the other change detection methods, this
method exhibited non-negligible commission errors for trend breakpoint detection (mean value on simulated data:
18.6%).
To accurately describe long-term variations in LST, it is necessary to further eliminate false breakpoints
detected by BEAST.

True abrupt changes are always accompanied by sudden increases or decreases in Tt, or by significant changes in the slope of Tt. Meanwhile, BEAST outputs the occurrence probability of each breakpoint and the remainder (et). A high probability indicates a high likelihood of the breakpoint, and abnormal et values tend to occur when a true abrupt change happens.

Based on these facts, four features were selected to describe the above characteristics:

The change magnitude of trend abrupt changes (Δtrend).

The change in the slope of the trend before and after the abrupt change, measured by the angle between the two trends (angle).

The occurrence probability of the detected breakpoints (Prob).

The proportion of abnormal remainders in the confidence interval of the abrupt change (Pp). An abnormal remainder is detected when the remainder is larger than three times the RMSE_Tt+St.

Then, four criteria are used to judge the false breakpoints: Δtrend < T1, angle < T2, Prob < T3 and Pp < T4. The thresholds were defined empirically by observing the data where false breakpoints were detected. Most false breakpoints had Δtrend < 1 K, angle < 0.3°, Prob < 0.5, and Pp ≈ 0.

Playing with the parameters revealed that a higher F1 may come at the cost of a smaller TD. The following final thresholds were set as a compromise between F1 and TD, and also to avoid excessively harsh threshold conditions: Δtrend ≤ 1 K, angle < 1°, Prob < 0.5, and Pp ≤ 1%.
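The four-way screening can be sketched as a simple filter (thresholds from this section; reading the four criteria as a joint condition — a breakpoint is discarded only when all four hold — is an interpretation, and the per-breakpoint feature values would come from BEAST's outputs):

```python
# Sketch of iBEAST's false-breakpoint screening. Each detected breakpoint
# carries four features derived from BEAST's output.
THRESH = {"dtrend": 1.0, "angle": 1.0, "prob": 0.5, "pp": 0.01}

def is_false_breakpoint(bp):
    return (bp["dtrend"] <= THRESH["dtrend"]      # small change magnitude (K)
            and bp["angle"] < THRESH["angle"]     # small slope-angle change (degrees)
            and bp["prob"] < THRESH["prob"]       # low occurrence probability
            and bp["pp"] <= THRESH["pp"])         # almost no abnormal remainders

# Hypothetical breakpoints: one clearly real, one likely spurious.
breakpoints = [
    {"t": 120, "dtrend": 6.0, "angle": 4.0, "prob": 0.9, "pp": 0.10},
    {"t": 300, "dtrend": 0.4, "angle": 0.2, "prob": 0.3, "pp": 0.00},
]
kept = [bp["t"] for bp in breakpoints if not is_false_breakpoint(bp)]
print(kept)   # → [120]
```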

These modifications resulted in a general improvement of F1 of 0.03 and a reduction of the commission error of 13.23 percentage points.

Application of iBEAST
iBEAST does actually correct BEAST
The algorithms were applied to MODIS LST data with three different types of LCC, including deforestation, urbanization, and forest gain and loss.

Again, BEAST outperformed both BFAST and DBEST. iBEAST very significantly reduced the false positives from BEAST, though it also missed two true breakpoints that the original BEAST detected. Still, iBEAST had a better F1 than BEAST for every category.

BEAST had the best ratio of detected breakpoints over true disturbances. DBEST actually had a better ratio than BFAST.

All three methods exhibited poor detection efficiency for small disturbances (wildfires, heatwaves, cold, etc.)

When it comes to RMSE_Tt+St, DBEST actually had the lowest value of all the methods, by a small margin. This doesn't mean that the components obtained by DBEST are accurate; it is just because the trend and seasonality decomposed by DBEST tend to contain many meaningless detailed fluctuations and turning points.

Conclusion
iBEAST rises above all
BEAST. It exhibited the highest detection accuracy for abrupt changes in both trend (F1 = 0.83) and seasonality (F1 = 0.95), and can characterize the process of abrupt changes. In addition, it can accurately decompose time-series data into trend and seasonality, with mean RMSE values of 0.28 K and 0.27 K, respectively.

However, BEAST lacks sensitivity to subtle changes, with non-negligible omission errors in simulated data and around 50% precision on low-magnitude, short-lived disturbances.

BFAST. It demonstrated poorer performance than BEAST, with lower mean F1s for breakpoints (0.56 and 0.52 in trend and seasonality, respectively) and higher RMSEs of components (1.34 K and 1.46 K for the two components, respectively) in simulated data. This method appeared to be more affected by data complexities. Moreover, BFAST tends to detect incorrect trend dynamics when the time-series data have long-lasting gaps of missing values.

DBEST. Out of the three, DBEST had the poorest accuracy on simulated LST data: the mean F1 was 0.37 and the mean RMSEs of trend and seasonality were 0.64 K and 1.37 K, respectively. Additionally, the trend trajectory estimated by DBEST contains a quantity of non-essential change information, and it cannot detect seasonal abrupt changes.

iBEAST. Compared with BEAST, the user's accuracy of the improved BEAST was significantly increased, by 13.9% in the simulated data, resulting in an F1 increase of 0.04, and 15 false breakpoints (out of a total of 53 detections) were eliminated in the MODIS LST time series.

The improved version of BEAST was the best of all four at detecting changes in LST time series. Moreover, the optimal seasonal harmonic order was determined by the improved BEAST.
