
Classification and Regionalization of rainfall over the Peruvian Pacific Coast

Pedro Rau1, Luc Bourrel1, David Labat1, Pablo Melo1, Boris Dewitte2, Frederic Frappart1,

Waldo Lavado3, Oscar Felipe3

1 UMR 5563 GET, Université de Toulouse – CNRS-IRD-OMP-CNES, 14 Avenue Edouard Belin, 31400 Toulouse, France.

2 UMR 5566 LEGOS, Université de Toulouse - CNRS - IRD - OMP - CNES, 14 Avenue Edouard Belin, 31400 Toulouse, France.

3 SENAMHI, Jirón Cahuide 785, Lima 11, Peru.

Corresponding author: Pedro Rau (pedro.rau@get.obs-mip.fr)

Abstract

The climate of the Peruvian coast is strongly influenced by the Pacific atmospheric and oceanic circulation. This region is one of the main economic zones of the country and concentrates almost 50% of its population. Documenting the heterogeneity of precipitation regimes is thus key for resource management and for mitigating the risks associated with extreme weather events.

This study focuses on the definition of homogeneous rainfall regions over the Peruvian Pacific coast. The approach is a two-step process consisting of a classification based on k-means clustering followed by an iterative statistical regionalization using the Regional Vector Methodology (RVM). A network of 145 rainfall stations, homogeneously distributed both meridionally and with altitude, is used, which allows a high-resolution (0.5 km) gridded rainfall data product to be derived at an annual time scale over the period 1964-2011. Nine coherent regions identified by the iterative methodology are characterized. They exhibit distinct seasonal rainfall variability that reflects, on the one hand, the meridional transition between the main climatic influences and, on the other hand, the interaction with the steep topography.

Overall, the results show the advantages of combining the k-means clustering technique with the Regional Vector Methodology (RVM) for regionalization purposes.

1. Introduction

Rainfall along the South American coast is characterized by a complex pattern of spatial and seasonal variability, as part of the climate variability of a continent that exhibits a considerable meridional extension and prominent topography (Garreaud et al., 2009). The Peruvian Pacific coast is located at tropical latitudes, and its rainfall is mainly influenced by oceanic, atmospheric and orographic conditions because of the narrowness of the Pacific drainage basin.

This region concentrates more than 50% of the population of Peru, yet it is not well documented in terms of rainfall regionalization. Recent works (Suarez, 2007; Lavado et al., 2012; Bourrel et al., 2014; Ochoa, 2014) mostly focused on principal stations or major watersheds, where the main cities are located. This motivates the present paper, which treats rainfall regionalization as the decomposition of a large, complex and narrow area into smaller homogeneous regions, for research and applications in climatology and hydrology in this important region.

This complex situation leads us to propose a method to determine homogeneous rainfall regions and to identify and analyze their climatic behavior in the study area. In 1999, a technical report (BCEOM - SOFI Consult - ORSTOM, 1999) proposed a rainfall regionalization for the Peruvian Pacific coast based on the Regional Vector Methodology (Brunet-Moret 1979). In this report, nine regions were delineated, mainly located on the northern coast.

Multivariate analysis techniques have proved their efficiency in delineating homogeneous regions based on climatic features such as rainfall data. Many authors have used factor analysis, principal components, clustering techniques or a mixture of them to define climatic zones or rainfall regions more precisely (Ünal et al., 2003; Raziei et al., 2008), to classify rainfall stations (Stooksbury & Michaels, 1991; Jackson & Weinand, 1995), or to analyze rainfall variability or distribution patterns (Sneyers et al., 1989; Ramos, 2001; Muñoz-Diaz & Rodrigo, 2004; Dezfuli, 2010). Recently, Sönmez and Kömüşcü (2011) proposed a rainfall reclassification for Turkey based on the k-means methodology. These studies highlight the benefit of using clustering methods for regionalization purposes, although they differ in focus and results. They also indicate that minor differences between methodologies are worthy of consideration when geographical and climatological interpretation is undertaken (Jackson and Weinand, 1995).

We propose here to use the k-means technique to derive a primary classification of rainfall, considering that precipitation in this region is influenced by a complex set of parameters associated with non-linear processes. First, the region is characterized by a steep topography that influences the mesoscale atmospheric circulation (Garreaud et al., 2009). Second, the main climatic influence on rainfall in Peru is associated with the El Niño Southern Oscillation (ENSO) phenomenon, which is characterized by a strong positive asymmetry (An and Jin, 2004; Boucharel et al., 2011). These phenomena thus relate regionally to rainfall over Peru in a way that may not be well captured by linear techniques.

We propose here a joint method based on k-means cluster analysis and the Regional Vector Methodology (RVM) to define homogeneous regions, using an iterative delineation process and statistical criteria for merging rainfall data. After a spatial delimitation step that refines the limits of the homogeneous regions using a rainfall co-kriging interpolation, a regional characterization of rainfall is carried out to describe the principal features of each region (i.e. the annual and monthly precipitation values, regime, distribution and altitudinal ranges). In the last part, we also document the interannual variability of the temporal distribution of rainfall over the defined regions.

2. Data

2.1 Study area

The study area comprises the Pacific coastal region of Peru, which covers an area of ~280,500 km². This region is bounded by the Andes mountains to the east (69.8° W) and extends west to the Pacific Ocean (81.3° W). It borders Ecuador in the north (3.4° S) and Chile in the south (18.4° S). Its maximum width, perpendicular to the coastline, is 230 km in the southern part and decreases to 100 km in the northern part. The area is characterized by a significant altitudinal gradient, ranging from 0 to ~6500 m asl, and includes 53 main river watersheds that cover nearly 90% of the region. The rivers generally flow from east to west, from the Andes towards the Pacific Ocean, with bare and steep slopes that favor significant rising, flooding and erosion during highly rainy episodes (Lavado et al., 2012).

The Peruvian hydroclimatic system is also influenced by the Andes Cordillera, contrasting oceanic boundary conditions and landmass distribution (Garreaud et al., 2009), which explain much of its seasonal and interannual rainfall variability. This region shows greater rainfall variations than the two other main hydrological regions of Peru: the Amazon and the endorheic Titicaca drainages (Lavado et al., 2012).

2.2 Rainfall data set

The database includes monthly rainfall records from 139 meteorological stations managed by SENAMHI (Servicio Nacional de Meteorologia e Hidrologia del Peru) and 6 meteorological stations managed by INAMHI (Instituto Nacional de Meteorologia e Hidrologia del Ecuador). It quickly proved necessary to extend the area into the foothills of the northern Andes, which cover bi-national river watersheds between Peru and Ecuador. Monthly rainfall data cover the 1964-2011 period. Of the 145 stations, 124 are located in the Pacific coastal region of Peru (see Figure 1), 11 belong to the Peruvian Atlantic drainage and 4 to the Titicaca drainage. Data quality over this period was carefully assessed using the Regional Vector Methodology, RVM (Brunet-Moret, 1979). This method yielded a consistent dataset in which 76% of stations have more than 45 years of continuous records, 20% have between 20 and 45 years, and only 4% have between 15 and 20 years.

Figure 1. Geographical distribution of stations over the Peruvian Pacific coast, which is outlined by the black line. The rainfall record length of the stations is shown in graduated color. Stations with more than 15 years of records were used in this study. A Digital Terrain Model (SRTM, 90 m) shows the topographical characteristics and altitudes of the study area.

3. Methods

Figure 2 is a schematic summarizing the applied method. The methodology comprises three steps: the first is data preparation, which includes the review, homogenization and completion of monthly rainfall data; the second is the regionalization process, including clustering and regional vector analysis; and the last is a detailed characterization of the defined regions.

[Figure 2 flowchart. Boxes: Peruvian Pacific monthly rainfall data; data preparation, homogenization and validation; monthly rainfall database (145 stations); clustering process (k-means); Regional Vector analysis of predefined clusters; validated rainfall regions by Regional Vector Methodology (RVM); Digital Elevation Model (DEM) 90-m; annual rainfall interpolation (co-kriging); rainfall spatialization; region boundaries definition; characterization of rainfall patterns by regions.]

Figure 2. Methodology schema applied for rainfall regionalization of the Peruvian Pacific region

3.1 Data preparation, homogenization and validation

Data preparation was carried out in three steps:

1) The analysis period was chosen to be as long as possible for a significant number of stations over the Peruvian Pacific coast and the extensions explained in section 2.2. We also required the selected stations to have continuous records longer than 15 years.

2) The homogeneity of the datasets was evaluated in order to identify inconsistent information arising from quality issues such as station microenvironment, instrumentation, and variations in time and position (Changnon and Kenneth, 2006), using the RVM analysis (Brunet-Moret, 1979). The RVM relies on the principle of pseudo-proportionality of a rainfall index calculated from the values of neighboring rainfall stations, which characterizes a homogeneous rainfall pattern over a predetermined area. It is based on the calculation of an extended rainfall vector over the study period. This concept refers to the calculation of a weighted average of precipitation anomalies for each station, which overcomes the effects of stations with extreme rainfall values or short records. The regional annual pluviometric indexes Zi and the extended average rainfall Pj are then found using the least squares technique, by minimizing the sum in Equation (1),

$$\sum_{i=1}^{N} \sum_{j=1}^{M} \left( \frac{P_{ij}}{P_j} - Z_i \right) \qquad (1)$$

where i is the year index, j the station index, N the number of years, and M the number of stations. Pij stands for the annual rainfall at station j in year i; Pj is the extended average rainfall over the period of N years; and Zi is the regional pluviometric index of year i. The complete set of Zi values over the entire period is known as the “regional annual pluviometric indexes vector”. Being iterative, this method calculates the vector of each predefined region (RV), then compares the behavior of each station with that vector, and finally discards the stations that are not consistent with the regional vector. This process is repeated as many times as necessary. A “regional vector” (RV) is therefore associated with each defined region, and it represents the behavior of all the stations that are part of the region. The calculated vector can be considered a suitable index of the climatic variability in the region.

3) Stations that passed the homogenization process but had missing monthly data were subjected to an information completion process, once their spatial representativeness proved significant. This was done using the rainfall index values calculated from the RV and the mean monthly rainfall of the station concerned. A more detailed description can be found in Bourrel et al. (2014).

Through these three stages, 145 pluviometric stations were validated. The geographical location of the 124 Peruvian Pacific coastal stations is depicted in Figure 1, which also shows the rainfall record length for each station.
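As an illustration of the least-squares idea in step 2, the following Python sketch estimates a regional vector from a complete station matrix. It is a simplified sketch under stated assumptions, not the Brunet-Moret (1979) procedure nor the software used in this study: it assumes squared residuals in Equation (1), no missing data, strictly positive annual totals, and a normalization of Z to unit mean to fix the scale. The function name `regional_vector` and its parameters are hypothetical.

```python
import numpy as np

def regional_vector(P, n_iter=50, tol=1e-10):
    """Alternating least-squares sketch of a regional vector.

    P is an (N years x M stations) array of annual rainfall P_ij.
    Returns (z, p_mean): the yearly indexes Z_i and the extended means P_j.
    Assumptions (not from the paper): squared residuals, complete data,
    positive rainfall, and Z rescaled to unit mean at every iteration.
    """
    P = np.asarray(P, dtype=float)
    p_mean = P.mean(axis=0)              # initial guess for P_j
    z = np.ones(P.shape[0])
    for _ in range(n_iter):
        # For fixed P_j, the least-squares Z_i is the mean of P_ij / P_j over stations.
        z_new = (P / p_mean).mean(axis=1)
        z_new /= z_new.mean()            # fix the arbitrary scale of Z
        # For fixed Z_i, minimize over u_j = 1 / P_j (a linear least-squares problem).
        inv_p = (P * z_new[:, None]).sum(axis=0) / (P ** 2).sum(axis=0)
        p_mean = 1.0 / inv_p
        converged = np.max(np.abs(z_new - z)) < tol
        z = z_new
        if converged:
            break
    return z, p_mean

# Hypothetical example: three stations that follow the same yearly index exactly.
z_example = np.array([0.5, 1.0, 1.5, 1.0])
p_example = np.array([100.0, 400.0, 900.0])
z_hat, p_hat = regional_vector(np.outer(z_example, p_example))
```

With perfectly proportional stations, as in this hypothetical example, the sketch recovers the index and the station means; with real data the residuals P_ij / P_j - Z_i are what would be inspected to flag inconsistent stations.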

3.2 Classification and Regionalization Process

3.2.1 K-means clustering technique

K-means cluster analysis is a tool designed to assign objects to a fixed number of groups (clusters) based on a set of specified variables. It is a commonly used technique for classifying a large amount of data. For example, Sönmez and Kömüşcü (2011) proposed a new reclassification of rainfall regions over Turkey using the k-means methodology. Dezfuli et al. (2010) suggested a rainfall regionalization based on the k-means technique coupled with Principal Components Analysis. Ramachandra Rao and Srinivas (2006) used the k-means technique as part of a hybrid clustering test to identify groups of similar catchments based on flow data. One of the principal advantages of the k-means technique is its cluster identification performance, which allows the obtained clusters to be ranked according to their representativeness. The process partitions the data into a previously defined number k of clusters. Objects within a cluster must be as similar as possible to the other members of their own group and as dissimilar as possible to the objects in the other clusters. Similarity depends on correlation, average difference or another type of metric. By definition, each cluster is characterized by its own centroid, with the cluster members located around it. According to Sönmez and Kömüşcü (2011), a k-means clustering process involves three principal steps:

a) k objects are randomly selected from the whole data set, each one serving as the initial centroid of one of the k clusters.

b) Every object is compared with each centroid using a previously defined similarity metric.

c) Each object is assigned to the cluster whose centroid it is most similar to. Every time an object is added to a group, the centroid of that group is recalculated immediately.

The whole procedure is iterative and continues until all the objects finally belong to a particular cluster. The k-means algorithm assigns objects to groups well because it strengthens intracluster similarities while maximizing intercluster dissimilarities.
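The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this study: it assumes squared Euclidean distance as the dissimilarity metric and uses the batch variant, in which centroids are recalculated once per pass rather than after each individual assignment as described in step c). The function name `kmeans` and its parameters are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Batch k-means sketch on an (n objects x d variables) array X."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Step a): pick k distinct objects at random as initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Step b): squared Euclidean distance of every object to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Step c): assign each object to its closest centroid.
        labels = d2.argmin(axis=1)
        new_centroids = centroids.copy()
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:          # keep the old centroid if a cluster empties
                new_centroids[c] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break                         # assignments are stable
        centroids = new_centroids
    return labels, centroids

# Hypothetical example: two well-separated groups of three objects each.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels_demo, centroids_demo = kmeans(X_demo, k=2)
```

Because the initial centroids are random, different seeds can lead to different partitions on less clearly separated data, which is one reason the number of clusters is checked with a separate criterion.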

A key part of applying k-means is defining an optimum number of clusters. To define the partitioning groups successfully, the silhouette value must be estimated for each candidate number of groups. As stated by Kaufman and Rousseeuw (1990), the silhouette value is calculated by Equation (2):

$$S(i) = \frac{\min_k\{b(i,k)\} - a(i)}{\max\{a(i),\ \min_k\{b(i,k)\}\}} \qquad (2)$$

where a(i) is the average dissimilarity between the ith object and the other objects of the same group, and b(i,k) is the average dissimilarity between the ith object and the members of the kth cluster, k being any cluster other than its own. The silhouette index ranges between -1 and +1. A value close to +1 means that the object corresponds well to its own cluster, while a negative value indicates that the object is not located in the appropriate cluster. A value of 0 means that the object could equally belong to another cluster. An average silhouette width, the mean of S(i) over all the objects of the k clusters, is also computed; it can be used to choose the best number of clusters, by taking the value of k for which this average is maximal.
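Equation (2) can be evaluated directly from a matrix of pairwise dissimilarities. The sketch below is illustrative only: it assumes Euclidean distance as the dissimilarity, at least two clusters, and the common convention S(i) = 0 for an object that is alone in its cluster. The function name `silhouette_values` is hypothetical.

```python
import numpy as np

def silhouette_values(X, labels):
    """Silhouette S(i) of every object, following Equation (2)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances (assumed dissimilarity metric).
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    clusters = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue                      # singleton cluster: S(i) = 0 by convention
        # a(i): mean distance to the other members of the same cluster
        # (dist[i, i] is 0, so dividing by n_own - 1 excludes the object itself).
        a = dist[i, own].sum() / (n_own - 1)
        # min over k of b(i, k): mean distance to the nearest other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s

# Hypothetical example: compare a sensible and a poor labelling of six objects.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
s_good = silhouette_values(X_demo, [0, 0, 0, 1, 1, 1])
s_poor = silhouette_values(X_demo, [0, 1, 0, 1, 0, 1])
```

The average of these values for each candidate k, together with the count of negative values, gives the two quantities used to compare numbers of clusters.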

3.2.2 Regionalization Analysis

After k-means clustering, regionalization was conducted using the Regional Vector Methodology (RVM), which is generally oriented to: a) rainfall regionalization (establishing representative vectors of homogeneous rainfall zones) and b) assessing rainfall data quality based on the homogeneity within a predetermined region (Espinoza et al., 2009). The regionalization process is similar to the one explained in section 3.1. It depends on determining a “mean station” or “vector” from all the data involved in the study area, which is then compared with each pluviometric station (Brunet-Moret 1979). Before using the RVM, it is necessary to define the regions whose stations will be validated. There are different ways to predefine regions: the definition can be based on geographical patterns, on topographical constraints related to isohyets, or on clusters of rainfall stations. Here, the rainfall station clusters are used as predefined regions. Once calculated, the RV is compared iteratively with the station data in order to discard the stations whose data are not consistent with the RV, after which the process is repeated. The rejection of a given station often means that it belongs to a neighboring region with which it is more consistent. Therefore, in many cases, stations or areas are regrouped or divided in order to obtain regions with homogeneous features. It should be noted that the RV mainly represents the behavior or climatic regime of a given region. The main statistical criteria for regrouping stations into homogeneous regions are a standard deviation less than 0.4 and a correlation coefficient greater than 0.7 between the RV and the stations. Rainfall database management and RVM were carried out using the software HYDRACCESS (Vauchel, 2005).
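The two acceptance criteria can be expressed as a simple check of one station against a regional vector. This is a hedged sketch, not the HYDRACCESS procedure: it assumes that the standard deviation refers to the differences between the station's annual index (annual rainfall divided by its own mean) and the RV, which the text does not state explicitly. The function name `station_fits_region` and its parameters are hypothetical.

```python
import numpy as np

def station_fits_region(station_annual, z, max_std=0.4, min_corr=0.7):
    """Check a station against a regional vector z with the two thresholds.

    Assumption: the standard deviation is taken over the deviations between
    the station's annual index and the regional vector.
    Returns (accepted, std_dev, corr).
    """
    station_annual = np.asarray(station_annual, dtype=float)
    z = np.asarray(z, dtype=float)
    index = station_annual / station_annual.mean()   # station annual index
    std_dev = (index - z).std(ddof=1)                # spread of the deviations
    corr = np.corrcoef(index, z)[0, 1]               # Pearson correlation with the RV
    return bool(std_dev < max_std and corr > min_corr), std_dev, corr

# Hypothetical regional vector and two hypothetical stations.
z_demo = np.array([0.5, 1.0, 1.5, 1.0])
consistent = station_fits_region(200.0 * z_demo, z_demo)
inconsistent = station_fits_region(200.0 * (2.0 - z_demo), z_demo)
```

A station rejected by such a check would then be tested against the vector of a neighboring region, in line with the iterative regrouping described above.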

3.2.3 Rainfall data interpolation

In order to delineate the regions, the spatial distribution of rainfall was considered together with topographic features. Annual rainfall was interpolated incorporating elevation data using a geostatistical approach. Geostatistical techniques have proven to be quite efficient in data prediction by minimizing estimation variances, and their use is widespread in the hydrometeorological field (Dingman et al., 1988). Many authors consider that optimal interpolation techniques based on geostatistical approaches (e.g. kriging) give better estimates of rainfall distribution than classical methods such as Inverse Distance Weighting or Thiessen polygons (Phillips et al., 1992; Tabios and Salas, 1985). Moreover, one of the principal differences between classical methods and kriging is that the latter is based on the so-called semivariogram, which depicts the spatial autocorrelation of the measured sample points (Tabios and Salas, 1985). Cokriging, a multivariate version of the kriging technique, takes into account correlated secondary information (e.g. a digital elevation model, DEM) (Goovaerts, 2000). For example, Hevesi et al. (1992a, 1992b) and Daly et al. (1994) consider that, in mountainous regions, precipitation tends to increase with altitude, mainly because of the orographic effect. In this research, cokriging was chosen as the interpolation method. A DEM with a spatial resolution of 90 m, provided by the NASA-NGA Shuttle Radar Topographic Mission (SRTM) data (http://srtm.csi.cgiar.org/SELECTION/inputCoord.asp), was used as the secondary variable, or correlated predictor, with the universal co-kriging methodology (Buytaert et al., 2006; Diodato, 2005), based on a spherical variogram, which is widely used in rainfall interpolation studies (Goovaerts, 2010; Mair et al., 2011). The cokriging calculation was performed using the Geostatistical module available in ArcGIS 10.2 and reviewed with an R script.
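For reference, the standard form of the spherical semivariogram model mentioned above can be written as a short function. This is a generic textbook sketch, not the ArcGIS or R code used in this study; the parameter names (`nugget`, `sill`, `a_range`) are hypothetical, and `sill` is taken here as the total sill, nugget included.

```python
import numpy as np

def spherical_semivariogram(h, nugget, sill, a_range):
    """Spherical semivariogram model gamma(h).

    h       : lag distance(s), same units as a_range
    nugget  : semivariance just above zero lag
    sill    : total sill reached at and beyond the range (assumed to include the nugget)
    a_range : distance at which the sill is reached
    """
    h = np.asarray(h, dtype=float)
    ratio = np.clip(h / a_range, 0.0, 1.0)   # beyond the range the model stays flat
    gamma = nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)
    return np.where(h == 0, 0.0, gamma)      # gamma(0) = 0 by definition

# Hypothetical parameters: no nugget, unit sill, 50 km range.
lags_km = np.array([0.0, 25.0, 50.0, 100.0])
gamma_demo = spherical_semivariogram(lags_km, nugget=0.0, sill=1.0, a_range=50.0)
```

In practice the three parameters would be fitted to the empirical semivariogram of the station data before kriging or cokriging.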

This rainfall interpolation map was used for regional delineation, considering the shape of the isohyets with a geometrical approach (perpendicular and bisector criteria for limits traversing isohyets and stations) and a statistical approach (revalidation of the newly defined areas with RVM, checking the proper fit of the stations inside each region).

4. Results

4.1 Initial Rainfall Classification

A cluster analysis of the precipitation data was performed by applying the k-means technique to the 124 rainfall stations previously selected. The optimal number of clusters was determined from the average silhouette value and the number of negative silhouette values, for a number of clusters varying from 3 to 10 (Table 1).

The highest average silhouette values are obtained for cluster-three (0.64), cluster-four (0.60) and cluster-six (0.55); following Kononenko and Kukar (2007), a silhouette value greater than 0.50 is considered to indicate a reasonable structure and a value less than 0.50 a weak structure. The number of negative silhouette values is minimal for cluster-three (6), cluster-four (4) and cluster-six (6). After plotting the cluster groups on a map showing their spatial distribution, we selected cluster-three and cluster-six; these two groupings give some indication of a rainfall classification according to topographical and latitudinal variation (Figure 3.a and 3.b). Cluster-four was an intermediate group that corresponds to one sub-region in the north.

Table 1. Results of the K-means analysis for number of clusters varying from 3 to 10.

Number of Clusters           3     4     5     6     7     8     9     10
Average Silhouette Value     0.64  0.60  0.54  0.55  0.54  0.54  0.46  0.45
Negative Silhouette Number   6     4     9     6     8     6     11    9

The two cluster groups (cluster-three and cluster-six) exhibit a similar spatial distribution. Pluviometric stations from both groups present an altitudinal distribution along the coast, defining three regions: stations located in lowlands (green triangles), in middle watersheds (white circles) and in highlands (black points). The cluster-six group presents three additional regions, two of them closely related to northern precipitation features for the middle watershed (cluster 4, red triangles) and highlands (cluster 6, yellow circles). Two stations are considered isolated (cluster 5, blue circles).

Figure 3. a) Spatial distribution of the cluster-three group after the k-means process. The silhouette value for each cluster group is also shown in the graph below. b) Same for the cluster-six group after the k-means process.

Even if the cluster-six group appears less representative than the cluster-three group in terms of silhouette value, it is considered acceptable because it correctly represents the behavior of northern precipitation, offering an initial classification of the rainfall regime along the Peruvian Pacific coast.

4.2 Regionalization

After cluster definition, a Regional Vector analysis was performed over these preliminary regions as the first step of an iterative regional refinement procedure, adding and removing stations from regions according to the criteria described in section 3.2.2 and the coefficient of variation (CV) of the stations. In Figure 4, Group 1, located in the western area of the coast (lowlands), presents greater CV values (> 1.8) than the stations located in middle watersheds and in highlands. The northern region presents higher CV values in lowlands and in the middle watersheds. Highlands present lower CV values (< 0.8) along the coast, independently of latitude.

Figure 4. Spatial distribution and range of the coefficient of variation (CV) for all stations of the pluviometric network of the Peruvian Pacific Coast.

High CV values in the northern region correspond to strong variability of the rainfall (> 1000 mm/yr). High CV values are also observed at southern latitudes, where they are mostly caused by small fluctuations around a near-zero annual average. These fluctuations are due to the large-scale mid-tropospheric subsidence over the southeastern subtropical Pacific Ocean, enhanced by the coastal upwelling of cold water (Lavado et al., 2012; Garreaud et al., 2002).
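The effect of a near-zero mean on the CV can be seen with a short calculation. The sketch below assumes the usual definition of the CV as the sample standard deviation divided by the mean; both annual series are hypothetical values chosen for illustration and are not station data from this study.

```python
import numpy as np

def coefficient_of_variation(annual_rainfall):
    """CV = sample standard deviation / mean of an annual rainfall series."""
    x = np.asarray(annual_rainfall, dtype=float)
    return x.std(ddof=1) / x.mean()

# Hypothetical annual totals in mm/yr (illustrative values only).
dry_station = [0, 0, 1, 0, 40, 2]                   # near-zero mean, one wetter year
wet_station = [800, 1000, 1200, 900, 1100, 1000]    # large totals, moderate spread

cv_dry = coefficient_of_variation(dry_station)
cv_wet = coefficient_of_variation(wet_station)
```

In this hypothetical case the dry series has a much larger CV than the wet one even though its absolute fluctuations are far smaller, because the mean in the denominator is close to zero.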

Based on the iterative RV reanalysis of the clusters obtained with the k-means methodology, we identify nine homogeneous rainfall patterns (see Figure 5). Rainfall stations from clusters 1, 2 and 4, located in the coastal zone and in the northern Andes (see Figure 3.b), exhibit higher coefficients of variation, inversely related to their proximity to the highlands. Cluster 1 includes regions 1, 4 and 7, which divide the coastal zone. Cluster 4 defines region 2; in this case the clustering process successfully assigned each station, and the RV analysis confirmed them as separate from the other regions. Clusters 5 and 6 are regrouped into region 3. Finally, cluster 3 defines regions 5, 6, 8 and 9; in this case the low variability and the latitudinal extension define these four regions.

Following the scheme proposed in Figure 2, a spatial approach was necessary to delineate geographical boundaries, thereby regionalizing the classification obtained by k-means clustering and the regionalization obtained by the statistical approach combining k-means and RVM. For this step, an interpolated surface of annual rainfall over the period 1964-2011 was calculated using the co-kriging methodology, considering topographical features as explained in section 3.2.3. Annual rainfall exhibits a relationship with altitude and latitude: rainfall is higher at low latitudes and, at southern latitudes, at high altitudes, as shown in Figure 6. Given these rainfall features, a spherical semivariogram model was used for the kriging rainfall interpolation before the adjustment with the DEM. Applying the methodology described in section 3.2.3, the nine regions were delineated taking into account the rainfall interpolation map, as shown in Figure 5.

The correlation coefficient between the stations and the regional vector of each region was calculated separately, and the spatial distribution of these correlation coefficients is shown in Figure 7. The purpose of this analysis is to assess how representative the regional vector is and to identify locally the areas within a region where it is most representative. For regions 4 and 7, the correlation coefficient is between 0.5 and 0.7. These coefficients are considered acceptable given the dry conditions, with more than 90% of the records near 0 mm of rainfall throughout the year, so that any value greater than 0 mm causes strong variability and weakens the relationship with the RV. For the northern regions 1 and 2, the mean correlation is greater than 0.9, a very good representation by the RV, and the most representative areas are shown in red. Regions 3, 5, 6, 8 and 9, located in the highlands, have correlations greater than 0.7, a good representation by the RV, with the most representative areas shown in orange.

Figure 5. The nine homogeneous rainfall regions after the regionalization process of clustering and RVM. The interpolated surface of annual rainfall (isohyets obtained using the cokriging method) is also shown to illustrate rainfall differences between regions.

[Figure 6 plot. Y axis: total annual average rainfall (mm/y), 0 to 1200. X axis: latitude (degrees), -4.0 to -18.0. Points: regions R1 to R9, in two series (upstream regions, downstream regions). Fitted lines: y = 52.73x + 1167.3 (R² = 0.5836) and y = 21.802x + 339.73 (R² = 0.5251).]

Figure 6. Relationship between total annual rainfall for nine regions versus latitude, grouped in
upstream and downstream regions

Figure 7. Coefficient of correlation with the regional vector, recalculated for each final region. A global correlation value is also shown for each region in bold, together with the spatial distribution of the correlation with the regional vector.

4.3 Regions Characterization

Region 1 extends over the northern lowlands, from 4.2°S to 7.3°S, covering an area of ~20,300 km². Its altitude ranges between 0 and 500 m asl. Average annual rainfall for this region is about 90 mm/yr, and it includes drier areas such as the Sechura desert. Maximum monthly rainfall is observed in March (see Figure 8.a.1), with precipitation of less than 50 mm/month, which represents nearly 90% of the annual rainfall and shows the unimodal behavior of the rainfall regime. The rest of the year is considered dry, with precipitation near or equal to zero, corroborating the monthly intermittency of the rainfall regime on the coast (Garreaud et al., 2002; Lavado et al., 2012).

Region 2 comprises an area of ~27,600 km², characterized by a middle-watershed altitudinal gradient ranging from 0 to 1500 m asl and a latitudinal extent from 3.4°S to 7.3°S. A large part of this area belongs to the foothills of the northern Andes, irrespective of the political border that crosses the bi-national river watersheds of Peru and Ecuador. For this reason, six rainfall stations from INAMHI (Meteorological and Hydrological National Service of Ecuador) are included in the database. This zone exhibits a monthly intermittent regime similar to that of a coastal region, mostly influenced by oceanic and continental air masses (Buytaert et al., 2006; Takahashi, 2004). The maximum annual rainfall is around 370 mm/yr. The wettest period occurs between January and April (JFMA), accumulating nearly 90% of total rainfall. Northern coastal regions such as regions 1 and 2 are significantly affected by strong events, represented by two peaks reaching 413 mm/month in March 1983 and 299 mm/month in March 1998 for region 1 (see Figure 8.a.2), and 746 mm/month in March 1983 and 708 mm/month in March 1998 for region 2 (see Figure 8.b.2). Most of this rainfall variability, also reflected in higher CV values (see Figure 4), is directly due to the El Niño Southern Oscillation (ENSO) phenomenon (Wang and Fiedler, 2006), which is one of the main climate anomalies driving hydroclimatic behavior on the coast of Ecuador and in northwestern Peru (Lagos et al., 2008; Lavado et al., 2012; Bourrel et al., 2014), with its associated climate mechanisms during strong events (Horel and Cornejo-Garrido, 1986; Goldberg et al., 1987; Bendix and Bendix, 2006).
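The seasonal shares quoted for each region (for example, JFMA accumulating nearly 90% of total rainfall) are simple ratios of a monthly climatology. The sketch below shows the arithmetic; the function name `season_share` and the twelve monthly values are hypothetical and are not data from this study.

```python
import numpy as np

def season_share(monthly_climatology, months):
    """Fraction of annual rainfall falling in the given calendar months.

    monthly_climatology : 12 mean monthly totals, January first
    months              : calendar month numbers (1 = January)
    """
    clim = np.asarray(monthly_climatology, dtype=float)
    idx = [m - 1 for m in months]        # convert month numbers to array positions
    return clim[idx].sum() / clim.sum()

# Hypothetical mean monthly rainfall in mm/month, January to December.
clim_demo = [10, 20, 40, 20, 2, 1, 1, 1, 1, 1, 1, 2]
jfma_share = season_share(clim_demo, months=[1, 2, 3, 4])
```

The same function applies to the other seasons mentioned in this section, such as JFM or DJFMA, by changing the list of months.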

Region 3 covers the third part of the northern area (~27,200 km²), including Ecuadorian stations. It extends from 3.6°S at the border with Ecuador in the north to 8.3°S in the south, and is bounded entirely by the Amazon basin to the east. This is also the wettest region (see Figure 8.c.1 and 8.c.2). It corresponds to a zone of high altitudes varying from 1500 to 3500 m asl and shows a homogeneous rainfall regime. Rainfall amount decreases southward without showing intermittent characteristics. Rainfall distribution is well defined, with a rainy season from January to April (JFMA) that represents nearly 70% of the annual rainfall. Mean annual rainfall reaches 1024 mm/yr, five times the mean annual rainfall of regions 1 and 2. This corroborates the effect of high altitudes with tropical Amazon influence, which attenuates the effects of strong ENSO events such as 1982/1983 and 1997/1998.

Region 4 is the longest region, extending from 7.3°S to 15.5°S between the

coastal plain and the foothills of the western Andes; it borders region 1 to the north

and region 7 to the south. Covering an area of almost 48,600 km2,

this region contains some of the principal coastal cities, including the capital Lima, and has an

altitudinal range from 0 m.asl to 1500 m.asl. This region corresponds to a zone influenced

by the large-scale mid-tropospheric subsidence over the southeastern subtropical Pacific Ocean,

enhanced by the coastal upwelling of cold water (Lavado et al., 2012; Garreaud et al., 2002).

As a result, mean annual rainfall is only 16 mm.yr-1, making this the driest region in the

country, with the monthly intermittency characteristic of coastal regions. The wet period from

January to March (JFM) represents nearly 75% of the annual rainfall. Owing to local

conditions, a slight increase in rainfall may occur in August, but it does not

constitute a peak in the annual regime (see Figure 8.d.1). The southern part

contains drier areas such as the Nazca desert.

Region 5 covers ~32,500 km2. This area extends

from 7°S at the boundary with regions 2 and 3 in the north to the boundary with region 6 near

11°S in the south, and to the boundary with region 3 and the Amazon basin in the east. The

mean annual rainfall reaches 492 mm.yr-1 and the wet period occurs between December and

April (DJFMA), accumulating nearly 80% of total rainfall. No peaks corresponding to

strong El Niño events were identified, resulting in a homogeneous rainfall pattern (see Figure 8.e.2).

The altitudinal range varies with latitude, from a lower limit of 1000 m.asl in the north and 2000 m.asl

in the south up to 5000 m.asl. Because of the strong altitudinal variation, the narrow central area

covering regions 4, 5 and 6 does not contain an intermediate region as in the north (regions

1, 2 and 3) and south (regions 7, 8 and 9). After the RV procedure, it was not possible to

establish the intermediate region proposed by the cluster analysis.

Region 6 covers ~30,400 km2 and extends from 11°S at the boundary with region 5 in the

north to 15°S at the boundary with region 9 in the south; it is bounded along its entire eastern side by the

Amazon basin. It is located in highlands ranging from 2000 to 5000 m.asl

and also shows a homogeneous rainfall regime. Rainfall distribution is well defined,

with a rainy season from December to March (DJFM) that accounts for nearly 85% of the

annual rainfall (see Figure 8.f.1), and mean annual rainfall reaches 366 mm.year-1. No peaks

corresponding to the strong El Niño events can be distinguished in the temporal variability

of rainfall (see Figure 8.f.2).

Region 7 is the southern coastal region. It extends from approximately 15.5°S to 18.4°S,

covers ~49,300 km2 and ranges in altitude from 0 to 2500 m.asl. It is bounded

by region 4 in the north and by the Chilean border in the south. This region is

characterized by a low rainfall regime typical of a coastal region, with a rainy season from January to

March (JFM) accounting for 65% of the annual rainfall. Furthermore, this region is one of the

driest areas in the country, where the annual rainfall (23 mm.year-1) is recorded with monthly

intermittency (see Figure 8.g.1 and 8.g.2). This region could be considered an extension of

region 4, as it is also influenced by the large-scale mid-tropospheric subsidence over the southeastern

subtropical Pacific Ocean, but it differs in the occurrence of regular rainfall events in the last decade,

as can be seen in Figure 8: region 7 shows a succession of peaks in the last decade,

whereas no such peaks are visible in region 4.

Region 8 comprises an area of ~25,400 km2, characterized by a mid-watershed altitudinal

range from 2500 m to 4000 m asl and a latitudinal extent from 14.6°S to 17.8°S.

It is bounded mainly by region 6 in the north and by the Chilean

border and the Titicaca basin in the south. Although much of its area belongs to the foothills

of the southern Andes, this zone exhibits a monthly intermittent regime similar to that of a

coastal region, mostly influenced by oceanic and continental air masses (Garreaud et

al., 2002). However, rainfall depth is higher than in region 7, reaching 296 mm·yr-1.

The wettest period occurs between December and March (DJFM), accumulating nearly 90% of

total rainfall (see Figure 8.h.1).

Finally, region 9 covers ~30,100 km2. This area extends from 14.4°S at the boundary with

region 6 in the north to the border with the Titicaca basin in the south and east, around 17.7°S,

and is also bounded by the Amazon basin to the east. The altitudinal range is from 3500 m to 5500 m.asl.

The mean annual rainfall reaches 594 mm.yr-1 and the wet period occurs between December

and March (DJFM), accumulating nearly 80% of total rainfall. No peaks corresponding to

the strong El Niño events were identified, resulting in a homogeneous rainfall pattern (see Figure 8.i.2).

The major characteristics of rainfall are summarized in Table 2 for each region and

represented as a box plot in Figure 9, where outliers are shown as small circles and

correspond to values lying more than 1.5 times the interquartile range (IQR) beyond the quartiles. All regions have

observations that exceed Q3 + 1.5(IQR); the northern coastal regions 1 and 2 have a

greater number of anomalous values than the other regions, which is also reflected

in Figure 4.
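
The outlier rule quoted above, values exceeding Q3 + 1.5(IQR), can be sketched in Python as follows. This is only an illustration under stated assumptions. The function names and the sample series are hypothetical. Quartiles are computed with the "inclusive" method of the standard library, and other quartile conventions (possibly including the one used for Figure 9) may give slightly different fences.

```python
import statistics


def upper_fence(values):
    """Return Q3 + 1.5 * IQR, the threshold above which a value is an outlier."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + 1.5 * (q3 - q1)


def high_outliers(values):
    """Return the values lying strictly above the upper fence."""
    fence = upper_fence(values)
    return [v for v in values if v > fence]


# Hypothetical monthly rainfall series (mm); illustrative values, NOT from the paper.
sample_mm = [0, 2, 4, 6, 8, 10, 12, 14, 200]

flagged = high_outliers(sample_mm)
print("upper fence:", upper_fence(sample_mm), "outliers:", flagged)
```

In a series dominated by dry months, a single extreme month such as the one in this sample lies far above the fence, which is the pattern described for regions 1 and 2.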

Figure 8. Rainfall regime (1964-2011) for the nine identified regions. A rainfall time

series is shown by region, [...]

[Figure 8 panels: a) R1, b) R2, c) R3, d) R4, e) R5, f) R6, g) R7, h) R8, i) R9. Panels .1: P (mm) by month, September to August (S O N D J F M A M J J A). Panels .2: P (mm) time series, 1965-2010.]

Figure 9. Boxplot of monthly rainfall for the nine identified regions.

Table 2. Minimum, Maximum and Average of annual rainfall for the nine identified regions.

Region   Annual minimum   Annual maximum   Annual average   CV    Std Dev.   Regime
         rainfall (mm)    rainfall (mm)    rainfall (mm)          (mm)

1        3.2              1345.2           89.7             2.6   233.3      Unimodal
2        17.3             2772.2           366.5            1.5   534.2      Unimodal
3        533.0            1812.9           1023.7           0.3   294.4      Unimodal
4        1.6              62.2             15.5             0.7   11.4       Unimodal
5        174.1            825.8            492.4            0.3   145.8      Unimodal
6        75.0             693.5            365.9            0.4   133.3      Unimodal
7        5.1              54.9             23.2             0.6   13.5       Unimodal
8        23.2             528.8            296.1            0.4   111.8      Unimodal
9        220.5            833.2            594.0            0.2   143.2      Unimodal
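
The CV column of Table 2 can be checked against the other columns, assuming the usual definition of the coefficient of variation as the standard deviation divided by the mean. The Python sketch below uses the annual averages and standard deviations from Table 2; the variable and function names are illustrative only.

```python
# (annual average rainfall in mm, standard deviation in mm) per region, from Table 2.
TABLE2 = {
    1: (89.7, 233.3),
    2: (366.5, 534.2),
    3: (1023.7, 294.4),
    4: (15.5, 11.4),
    5: (492.4, 145.8),
    6: (365.9, 133.3),
    7: (23.2, 13.5),
    8: (296.1, 111.8),
    9: (594.0, 143.2),
}


def coefficient_of_variation(mean, std):
    """CV = standard deviation / mean (dimensionless)."""
    return std / mean


# Round to one decimal place, as in the CV column of Table 2.
cv_by_region = {
    region: round(coefficient_of_variation(mean, std), 1)
    for region, (mean, std) in TABLE2.items()
}

for region, cv in cv_by_region.items():
    print(f"region {region}: CV = {cv}")
```

Under this definition the recomputed values agree with the tabulated CVs to one decimal place. Regions 1 and 2 stand out with CV above 1, consistent with the strong El Niño influence described in the text.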

5. Conclusions

Rainfall fluctuations over the Peruvian Pacific coast exhibit high variability at both spatial

and temporal scales. A method is proposed that allows nine homogeneous regions to be defined.

The approach is based on a two-step process consisting of a preliminary cluster analysis

(k-means) followed by a Regional Vector Methodology (RVM) analysis. K-means clustering

allows for an initial classification into three regions for lowlands, middle basins and

highlands. The method also highlights the complex situation of the northern area, where

three additional regions were delineated. A regional definition was further proposed based on

the results of the RVM applied to the inferred clusters. Finally, the delineation of the regions

and data density issues could be addressed by means of rainfall co-kriging interpolation.
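
As an illustration of the first step only, the sketch below implements plain k-means (Lloyd's algorithm) in Python on a handful of stations. It is not the authors' implementation. The station features (altitude and mean annual rainfall, rescaled), their values and the initial centroids are hypothetical assumptions chosen to give three well-separated groups, and the RVM step is not reproduced.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm). Returns (labels, centroids)."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged: centroids no longer move
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical stations: (altitude in km, mean annual rainfall in hundreds of mm).
stations = [
    (0.1, 0.2), (0.3, 0.1), (0.2, 0.3),   # lowland-like
    (1.8, 3.0), (2.0, 3.4), (2.2, 2.9),   # middle-basin-like
    (4.0, 5.8), (4.2, 6.1), (3.9, 5.9),   # highland-like
]

# Deterministic start: one station from each apparent group.
labels, final_centroids = kmeans(stations, [stations[0], stations[3], stations[6]])
print(labels)
```

With real station data the result depends on the chosen features, their scaling and the initial centroids, so the grouping would need to be checked, which in the approach described here is the role of the subsequent RVM analysis.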

The two northern coastal regions, region 1 and region 2, were very well represented by the

Regional Vector (RV), reflecting the influence of strong El Niño events. Highland regions

(regions 3, 5, 6, 8 and 9) were represented by the RV, showing homogeneous rainfall

behavior without a strong El Niño influence, as reflected in a low coefficient of variation. In

contrast, the representation of the coastal lowland regions (regions 4 and 7) by the RV is only

acceptable, reflecting the drier conditions along the coast due to upwelling

conditions. The monthly seasonal cycle of rainfall in the southern regions (region 7, region 8

and region 9) differs from that of the other regions, with, in particular, a one-month shift

in maximum rainfall. Rainfall peaks in February in regions 7 and 8 and in

January in region 9, whereas it peaks in March in the other regions. Such heterogeneity

in temporal and spatial variability will be discussed in future research following a

hydroclimatic approach. Overall, we have provided a regional analysis that can be used in

future research to study the relationship between rainfall variability at local scales and

some aspects of the regional oceanic and atmospheric circulation in Peru.

6. Acknowledgments

This work was supported by the Peruvian Ministry of Education (MINEDU-PRONABEC,

scholarship). The authors would like to thank SENAMHI (Meteorological and Hydrological

Service of Peru) for providing the complete raw rainfall dataset.

References

An S.I, Jin F.F. 2004. Nonlinearity and asymmetry of ENSO. Journal of Climate 17:2399–
2412.

Bendix A, Bendix J. 2006. Heavy rainfall episodes in Ecuador during El Niño events and
associated regional atmospheric circulation and SST patterns. Adv. Geosci 6:43–49.
BCEOM. 1999. Estudio hidrológico-meteorológico en la vertiente del Pacífico del Perú con
fines de evaluación y pronóstico del fenómeno El Niño para prevención y mitigación de
desastres. Asociación BCEOM-Sofi Consult S.A. -ORSTOM, Programa de apoyo a la
emergencia Fenómeno del Niño. Contrato de préstamo n°4250-PE-BIRF, Presidencia de la
Republica, Perú. Volumen I.

Boucharel J, Dewitte B, Du Penhoat Y et al. 2011. ENSO nonlinearity in a warming climate.
Clim Dyn. doi:10.1007/s00382-011-1119-9.
Bourrel L, Rau P, Dewitte B et al. 2014. Low-frequency modulation and trend of the
relationship between ENSO and precipitation along the northern to centre Peruvian Pacific
coast. Hydrological Processes. doi: 10.1002/hyp.10247.
Brunet-Moret Y. 1979. Homogénéisation des précipitations. Cahiers ORSTOM. Serie Hydr
3–4.
Buytaert W, Celleri R, Willems P et al. 2006. Spatial and temporal rainfall variability in
mountainous areas : A case study from the south Ecuadorian Andes. Journal of Hydrology
329:413–421. doi:10.1016/j.jhydrol.2006.02.031.
Changnon S, Kenneth K. 2006. Changes in Instruments and Sites Affecting Historical
Weather Records : A Case Study. J. Atmos. Oceanic Technol 23:825–828.
Daly C, Neilson R.P, Phillips D.L. 1994. A Statistical-Topographic Model for Mapping
Climatological Precipitation over Mountainous Terrain. J. Appl. Meteor 33:140–158.
Dezfuli A.K. 2010. Spatio-temporal variability of seasonal rainfall in western equatorial
Africa. Theor. Appl. Climatol 104(1-2): 57–69. doi:10.1007/s00704-010-0321-8.
Dingman S.L, Seely-Reynolds D.M, Reynolds, R.C. 1988. Application of kriging to
estimating mean annual precipitation in a region of orographic influence. JAWRA Journal of
the American Water Resources Association 24(2): 329–339. doi:10.1111/j.1752-
1688.1988.tb02991.x.

Diodato N. 2005. The influence of topographic co-variables on the spatial variability of
precipitation over small regions of complex terrain. Int. J. Climatol 25:351–363. doi:
10.1002/joc.1131.
Enfield D.B. 1981. Thermally Driven Wind Variability in the Planetary Boundary Layer
Above Lima, Peru. J. Geophys. Res 86:2005-2016.

Espinoza J.C, Ronchail J, Guyot J.L, Cochonneau G et al. 2009. Spatio-temporal rainfall
variability in the Amazon basin countries (Brazil, Peru, Bolivia, Colombia and Ecuador). Int.
J. Climatol 29:1574–1594. doi:10.1002/joc.
Garreaud R, Rutllant J, Fuenzalida H. 2002. Coastal lows along the Subtropical West Coast of
South America: Mean Structure and Evolution. Mon.Wea. Rev 130:75-88.
Garreaud R.D, Vuille M, Compagnucci R, Marengo J. 2009. Present-day South American
climate. Palaeogeography, Palaeoclimatology, Palaeoecology. 281 (3–4):180–195.
Goldberg R.A, Tisnado G, Scofield R.A. 1987. Characteristics of extreme rainfall events in
north-western Peru during the 1982– 1983 El Niño period, J. Geophys. Res 92:C14 225–241.
Goovaerts P. 2000. Geostatistical approaches for incorporating elevation into the spatial
interpolation of rainfall. Journal of Hydrology 228:113–129.
Hevesi J, Flint A, Istok J. 1992. Precipitation estimation in mountainous terrain using
multivariate geostatistics. Part II: Isohyetal maps. J. Appl. Meteor. Climatol 31:677-688.
Hevesi J, Istok J, Flint A. 1992. Precipitation estimation in mountainous terrain using
multivariate geostatistics. Part I: structural analysis. J. Appl. Meteor. Climatol 31:661-676.
Horel J.D, Cornejo-Garrido A.G. 1986. Convection along the coast of northern Peru during
1983: Spatial and temporal variation of clouds and rainfall. Mon.Wea. Rev 114:2091–2105.
Jackson I.J, Weinand H. 1995. Classification of tropical rainfall stations: A comparison of
clustering techniques. Int. J. Climatol 15(9):985–994. doi:10.1002/joc.3370150905.
Kaufman L, Rousseeuw P. 1990. Finding Groups in Data: An Introduction to Cluster
Analysis. John Wiley & Sons, Inc, Hoboken.

Kononenko I, Kukar M. 2007. Machine learning and data mining: Introduction to principles
and algorithms. Horwood Publishing, Chichester.

Lavado W.S, Ronchail J, Labat D, Espinoza J.C, Guyot J.L. 2012. Basin-scale analysis of
rainfall and runoff in Peru (1969–2004): Pacific, Titicaca and Amazonas drainages.
Hydrological Sciences Journal 57 (4): 1–18.
Lagos P, Silva Y, Nickl E, Mosquera K. 2008. El Niño – related precipitation variability in
Peru. Adv. Geosci 14:231–237.

Mair A, Fares A. 2011. Comparison of rainfall interpolation methods in a Mountainous
Region of a Tropical Island. Journal of Hydrologic Engineering 16(4): 371-383.
Muñoz-Diaz D, Rodrigo F. 2004. Spatio-temporal patterns of seasonal rainfall in Spain
(1912-2000) using cluster and principal component analysis: comparison. Ann. Geophys
1435–1448.

Ochoa A, Pineda L, Crespo P, Willems P. 2014. Evaluation of TRMM 3B42 precipitation
estimates and WRF retrospective precipitation simulation over the Pacific–Andean region of
Ecuador and Peru. Hydrol. Earth Syst. Sci 18:3179–3193, 2014.
Phillips D.L, Dolph J, Marks D. 1992. A comparison of geostatistical procedures for spatial
analysis of precipitation in mountainous terrain. Agricultural and Forest Meteorology.
doi:10.1016/0168-1923(92)90114-J.
Ramachandra Rao, Srinivas V.V. 2006. Regionalization of watersheds by hybrid-cluster
analysis. Journal of Hydrology 318(1-4): 37–56. doi:10.1016/j.jhydrol.2005.06.003.
Ramos M. 2001. Divisive and hierarchical clustering techniques to analyse variability of
rainfall distribution patterns in a Mediterranean region. Atmospheric Research 123–138.
Raziei T, Bordi I, Pereira L.S. 2008. A precipitation-based regionalization for Western Iran
and regional drought variability. Hydrol. Earth Syst. Sci doi:10.5194/hess-12-1309-2008.
Sneyers R, Vandiepenbeeck M, Vanlierde R. 1989. Principal component analysis of Belgian
rainfall. Theor. Appl. Climatol 204:199–204.
Sönmez İ, Kömüşcü A.Ü. 2011. Reclassification of rainfall regions of Turkey by K-means
methodology and their temporal variability in relation to North Atlantic Oscillation (NAO).
Theor. Appl. Climatol 106(3-4):499–510. doi:10.1007/s00704-011-0449-1.
Stooksbury D, Michaels P. 1991. Cluster analysis of southeastern US climate stations. Theor.
Appl. Climatol 150:143–150.
Suarez W. 2007. Le bassin versant du fleuve Santa (Andes du Pérou): dynamique des
écoulements en contexte glacio-pluvio-nival. Dissertation, Université Montpellier II.
Takahashi K. 2004. The atmospheric circulation associated with extreme rainfall events in
Piura, Peru, during the 1997—1998 and 2002 El Niño events. Ann. Geophys 22:3917-3926.
Tabios G.Q, Salas J.D. 1985. A Comparative Analysis of Techniques for Spatial Interpolation
of Precipitation. JAWRA Journal of the American Water Resources Association 21: 365–380.
doi:10.1111/j.1752-1688.1985.tb00147.x.
Ünal Y, Kindap T, Karaca M. 2003. Redefining the climate zones of Turkey using cluster
analysis. Int. J. Climatol 23:1045–1055.
Vauchel P. 2005. Hydraccess: Software for Management and processing of Hydro –
meteorological data software, Version 2.1.4. Free download
www.mpl.ird.fr/hybam/utils/hydracces.html.
Wang C, Fiedler P. 2006. ENSO variability and the eastern tropical Pacific: A review.
Progress in Oceanography 69:239-266.
