Characterization of Solar Irradiance Profiles For Photovoltaic System Studies Through Data Rescaling in Time and Amplitude

Gianfranco Chicco, Valeria Cocina, Filippo Spertino
Politecnico di Torino, Energy Department
corso Duca degli Abruzzi 24, 10129 Torino, Italy
gianfranco.chicco@polito.it, valeria.cocina@polito.it, filippo.spertino@polito.it

Abstract-This paper addresses the representation of the data coming from solar irradiance measurements, to be used in evaluations referring to the operation of photovoltaic systems. Starting from the consideration that for different days of the year the sunrise and sunset timings change and the solar irradiance patterns at clear sky conditions occur with different maximum amplitude, a bi-normalization procedure is applied in order to produce comparable normalized patterns for the various days. The normalized patterns are then subject to clustering in order to obtain a meaningful grouping of similar days. Finally, from the clustering results a day-type succession matrix is constructed, whose entries are interpreted as the conditional probability of finding a given day type provided that the type of the preceding day is known. Data used in the analysis are taken from real sites.

Index Terms--photovoltaic systems, solar irradiance, Moon-Spencer model, time and amplitude rescaling, distributed generation, data normalization.

I. INTRODUCTION
In the analysis of the productivity and performance of photovoltaic (PV) systems, availability of actual environmental data is a key aspect. These data mainly include solar irradiance and temperature, while further data on humidity, wind speed and wind direction may be useful to set up refined models. At a given site, measurement of solar irradiance provides a set of data at a given sampling rate. For the purpose of characterizing the trends of the irradiation obtained from the measured solar irradiance, a common reference is the irradiation produced from the solar irradiance data considered in the ideal conditions of a clear sky.
The theoretical instantaneous values of solar irradiance obtained at clear sky on a surface orientated in any direction are given by the Moon-Spencer model [1,2]. This model uses a set of geometrical angles and the time equation representing the lag or lead position of the Sun with respect to the clock. In the Moon-Spencer model, the contributions to the solar irradiance come from the global horizontal irradiance, the horizontal diffuse irradiance and the beam normal irradiance.
This paper deals with the characterization of the solar irradiance evolution in time, with the aim of identifying the main features that can be used within a tool to represent the photovoltaic production of a given site. The data analysed are taken from two locations in the South of Italy. The time period is one year, and the data have been gathered at a rate of one sample per minute.
One of the main aspects is the creation of a framework to compare the measured solar irradiance data with reference data set up in a general way. For this purpose, a dedicated variable space is created, in which the time axis is normalized so as to map the time interval between the sunrise and the sunset onto the (0,1) interval. The practice of rescaling the time axis has been used in other references such as [3]. In the same way, the solar irradiance values are normalized in such a way that the unity value corresponds to the conditions at clear sky taken from the Moon-Spencer model. The combination of the two ways of performing rescaling is a specific contribution of this paper. On these bases, the measured data are handled in the new variable space and may be characterized in a statistical way. The results of the statistical characterization are used to provide information to be sent to forecasting tools and to specific tools in which suitable scenarios are constructed for analysing the grid integration of the distributed generation from PV sources.
The paper has three additional sections. In the second section the evaluation of the solar irradiance is presented, with the description of the experimental setup. In the third section the representative results from the simulations performed are presented and discussed. The final section contains the concluding remarks.
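The rescaling in time and amplitude outlined in the Introduction can be sketched as follows. This is only an illustration under stated assumptions, not the authors' routine: the function names (`linear_interp`, `bi_normalize`) and the parameter `clear_sky_peak` are hypothetical, the amplitude is assumed to be divided by the clear-sky maximum of the day, and plain linear interpolation stands in for the interpolation procedure of [5], which is not reproduced here.

```python
from bisect import bisect_right


def linear_interp(x, xs, ys):
    """Piecewise-linear interpolation of the points (xs, ys) at x; xs must be increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    k = bisect_right(xs, x)  # index such that xs[k-1] <= x < xs[k]
    w = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
    return ys[k - 1] + w * (ys[k] - ys[k - 1])


def bi_normalize(times_min, irradiance, sunrise_min, sunset_min,
                 clear_sky_peak, n_points=1000):
    """Map one day of samples onto n_points values on a (0, 1) time axis.

    Assumption: amplitude is divided by the clear-sky peak of the same day,
    so that unity corresponds to the clear-sky maximum.
    """
    span = sunset_min - sunrise_min
    tau, g_norm = [], []
    for t, g in zip(times_min, irradiance):
        if sunrise_min <= t <= sunset_min:        # drop samples outside daylight
            tau.append((t - sunrise_min) / span)  # time rescaled to [0, 1]
            g_norm.append(g / clear_sky_peak)     # amplitude rescaled
    # Common grid: centres of n_points equal bins of the unit interval.
    grid = [(k + 0.5) / n_points for k in range(n_points)]
    return [linear_interp(x, tau, g_norm) for x in grid]
```

With this layout every day is described by the same number of points regardless of its length, which is what makes the daily patterns directly comparable.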

II. SOLAR IRRADIANCE EVALUATION
A. Description of the Experimental Setup
Measurements collected in two meteorological stations, referred to for brevity as “Gi” and “Ma”, placed in grid-connected PV systems at latitude around 40° N, have been used. The equipment of each station is the following:
• 1 pyranometer (ISO Secondary Standard) for measuring the horizontal global irradiance Ghpyr, with spectral range 0.285-2.8 μm; sensitivity ≈ 10 μV/(W⋅m⁻²); response time ≤ 5 s; zero offset < 10 W/m²; directional error (up to 80° with 1000 W/m² beam) < 10 W/m²;
• 2 reference solar cells in crystalline silicon for measuring the horizontal global irradiance;
• 2 reference solar cells in polycrystalline silicon with South orientation for measuring the 30° tilted global irradiance Gtcell;
• 1 thermo-hygrometer for measuring the ambient temperature and the relative humidity;
• 1 anemometer with vane for measuring the wind speed.
All parameters have been recorded for the whole year 2012 with a time step of 1 min. The global irradiance data from the solar cells on the tilted plane have been validated through the comparison with the pyranometer uncertainty. Typical values of uncertainties are ≤ ±10 kWh/m² on a monthly basis in the spring/summer period. Furthermore, these values are the threshold within which the measurements of the solar cell must fall in order to be labelled as “acceptable values”. The missing values are limited to a few readings in some days. A comprehensive analysis of the experimental results for the meteorological stations is presented in [4].

B. Correlations between the two sites
The “Gi” and “Ma” sites are located at a distance of about 70 km in a flat territorial region. The overall correlations between the data of the two sites (for variable time step), calculated with one-minute data and with the average of groups of successive data (at 10 min, 15 min, 30 min and one hour) are reported in the second row of Table I. The correlation values are high, also because the set of data used in this case considers all the data gathered over time (including the night period), and the presence of the night period contributes strongly to increasing the correlation values. The night period can be eliminated by deleting the corresponding data. In order to keep the number of data uniform for each day, it is possible to eliminate the data uniformly from 19:30 to 5:30 for all the days. This is not equivalent to taking into account only the periods from the sunrise to the sunset, but makes it possible to use the raw data gathered at each minute to compute the correlations. The correlation results are indicated in the third row of Table I. By eliminating the null values during the night, the correlation values are clearly reduced, even though they remain significantly high. On the basis of these considerations, the calculations presented in this paper will refer to only one of the sites, namely, the “Gi” site.

TABLE I. CORRELATION VALUES CALCULATED FROM MEASURED DATA AT 1 MIN AND AVERAGED AT DIFFERENT TIME INTERVALS
correlation                  1-min measured   10-min average   15-min average   30-min average   1-hour average
24-hour data                 0.928            0.945            0.949            0.956            0.963
excluding 19:30 to 5:30      0.905            0.930            0.934            0.943            0.952

C. Data rescaling in time and solar irradiance amplitude
C.1. Rationale for applying the rescaling procedure
For the “Gi” site, Fig. 1 shows the superposition of the solar irradiance plots obtained for each clear sky day of the year by using the Moon-Spencer model with photovoltaic modules located on the horizontal plane and on a tilted plane (angle of 30°). It is apparent that the clear sky conditions are not represented by the same line throughout the year. There are two major sources of variation, one on the horizontal axis (due to the differences among the sunset and sunrise hours along the year), and one on the vertical axis (due to the different Sun position at noon during the year).
These differences express the rationale for using the specific bi-normalization proposed in this paper, in which both the time (horizontal) axis and the solar irradiance (vertical) axis are normalized to the range (0,1).
From Fig. 1 it can be seen that higher advantages of the normalization appear for surfaces orientated with low tilt angle (as may occur in some building-integrated solutions), for which the difference between the solar irradiance at clear sky conditions for the days with the lowest and the highest values of solar irradiance at noon becomes more significant (e.g., on the horizontal plane the ratio between the highest and the lowest value would be close to 2).

Fig. 1. Evolution in time of the solar irradiance in clear sky conditions.

The results of using normalized solar irradiance are shown in Fig. 2 for two days in March and July, respectively. The variables nGtcell and nGt(θ) are the normalized tilted global irradiance found from the measurements and from the Moon-Spencer model, respectively, where θ is the angle between the solar beam and the direction orthogonal to the tilt plane [1]. Correspondingly, on the horizontal axis the time period before the sunrise and after the sunset has been cut off. From these plots, a number of aspects can be noticed:
- The number of minutes represented during the day is quite different. In order to obtain comparable patterns, the number of points describing the patterns has to be the same. For this purpose, a common number of points N has to be chosen, applying a suitable routine based on data interpolation (such as the one illustrated in [5]) in order to form the new data sets for the daily patterns. The patterns from the Moon-Spencer model are generated for each minute and are treated in the same way as the patterns with measured data.
- When the data sampling is relatively fast, the phenomenon of broken clouds appears [4], according to which there are measured solar irradiance values that significantly exceed the values at clear sky conditions, followed by other values much lower than the values at clear sky conditions (as can be seen in Fig. 2a). This effect can be mitigated by averaging a number of successive points in order to represent the pattern in a smoothed way with a lower number of points.
- The occurrence of cloudy sky conditions even in days with prevailing clear sky is not subject to any regularity throughout the year, so that different patterns representing the same qualitative sky condition may differ by the position of the cloudy portion of the pattern. This issue can be mitigated by representing the pattern with an ordered set of points, losing the succession in time of the points but preserving the qualitative representation of the sky conditions.

Fig. 2. Normalized solar irradiance for two days: a) March day; b) July day.

C.2. Data input for identifying groups of similar days
In order to prepare the data input for the clustering procedure, two types of patterns have been considered:
- Normalized patterns (horizontal and vertical).
- Differences between the normalized patterns and the normalized Moon-Spencer model.
In addition, the patterns have been taken both in the initial succession of points, and as an ordered sequence of points (in ascending order). Subsequently, four types of data input for clustering with normalized patterns are considered:
1. NP: clustering with normalized patterns (N points for each pattern).
2. ONP: clustering with ordered normalized patterns (N points for each pattern).
3. NPR: clustering with base patterns with reduced number of points (e.g., by grouping Ns successive points, the clustering is run with N/Ns points). In order to prepare the data input properly, namely, grouping the entire set of N points, Ns has to be chosen such that N is a multiple of Ns.
4. ONPR: clustering with ordered patterns determined from the normalized patterns with reduced number of points (with the same type of grouping of the NPR case).
When the differences between the normalized patterns and the normalized Moon-Spencer model are considered as inputs, the four data inputs are indicated with ND, OND, NDR and ONDR, respectively.

C.3. Clustering procedure
The k-means clustering [6] procedure is used for creating the groups of days, resulting in K clusters. This clustering method has the advantage of creating averaged groups, while other clustering algorithms (such as hierarchical clustering) tend to perform better in identifying the outliers [7].
The output of the clustering algorithm is the vector c = {c_m} ∈ ℕ^(M,1) with length M equal to the number of days, containing for each day the number of the cluster to which the pattern corresponding to that day has been assigned by the clustering process.
On the basis of the clustering results, the attributes associated to the clusters are defined, typically resorting to a terminology that makes it possible to identify for each cluster the characteristics of the days, e.g., clear sky, quasi-clear sky, …, up to cloudy sky. An application example is shown in Section III.B. The corresponding attributes are included in the vector a = {a_1, …, a_K} ∈ ℕ^(K,1).

C.4. Day-type succession matrix
In order to provide useful information to represent the relations between successive days on the basis of the input data and of the clustering results, the day-type succession matrix D = {d_ij} ∈ ℝ^(K,K) is introduced here. In this matrix, the row i indicates the preceding day with attribute a_i, the column j indicates the day under consideration with attribute a_j, and the value d_ij represents the number of occurrences for which a day with attribute a_i has been followed by a day with attribute a_j in the data analysed. For example, if a_1 identifies clear sky and a_2 indicates quasi-clear sky, d_12 = 15 means that 15 occurrences have been found in which a day denoted as clear sky is followed by a day with quasi-clear sky.
In practice, the entries of the matrix D are determined by starting from a matrix of zeros and adding a unity value to the position defined by components of the vector c, namely, the row c_(m-1) and the column c_m, for m = 2, …, M (the component m = 1 is not used as a column index on its own, since there is no preceding day). The sum of the entries of the matrix D is M-1.
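The construction of the day-type succession matrix of Section II.C.4 from the cluster label vector c can be sketched in a few lines. The function name `succession_matrix` and the short label sequence are illustrative assumptions; cluster numbers are taken to run from 1 to K, as in the text.

```python
def succession_matrix(labels, n_clusters):
    """Count how often a day of type i is followed by a day of type j.

    labels: cluster number (1..n_clusters) of each day, in calendar order.
    Returns an n_clusters x n_clusters list of lists of counts.
    """
    d = [[0] * n_clusters for _ in range(n_clusters)]
    # Pair each day with the following one: (c_1, c_2), (c_2, c_3), ...
    for prev, curr in zip(labels, labels[1:]):
        d[prev - 1][curr - 1] += 1  # row = preceding day, column = current day
    return d


# Illustrative sequence of M = 8 days and K = 3 day types (made-up data).
c = [1, 1, 2, 3, 3, 3, 2, 1]
D = succession_matrix(c, 3)
# D == [[1, 1, 0], [1, 0, 1], [0, 1, 2]]; the entries sum to M - 1 = 7
```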
III. SIMULATION STUDY RESULTS
A. Data input
One year of data gathered from the “Gi” site at one-minute rate has been used for the case study application. The solar irradiance patterns for all the days are rescaled in order to be represented with N points in the horizontal interval (0,1). In the examples presented in this paper, the common number of points to represent the patterns has been chosen as N = 1000 points (of the same order as the number of minutes from the sunrise to the sunset for the site analysed), and the interpolation has been applied by using the procedure presented in [5]. The reduced number of points has been chosen as N/Ns = 20, i.e., with a value relatively low, but higher than the number of hours ranging from the sunset to the sunrise in the Summer season. The reduced points have been found by averaging groups of Ns = 50 successive points from each pattern represented with 1000 points.
In the following part of this paper, for the sake of clarity and of comparison among the different types of pre-processing, the normalized time interval is represented in terms of the number of points used for the pattern representation. Fig. 3a shows the normalized patterns of solar irradiance (in which the values higher than unity occur because of the broken clouds) and Fig. 3b shows the ordered normalized solar irradiance patterns.

Fig. 3. Pattern representations with normalized solar irradiance: a) normalized solar irradiance patterns; b) ordered normalized solar irradiance patterns.

B. Clustering results
The k-means clustering has been run for all the types of data input. The results are shown in the following figures (from Fig. 4 to Fig. 11) and tables (from Table II to Table IX). For each type of data input, the clustering results are represented by using the normalized patterns also when different data (e.g., the ordered normalized patterns) are used as inputs for the clustering procedure.
From the above results, it emerges that the solutions with visually clearer distinction among the patterns are obtained by using the ordered normalized patterns with 20 points (ONPR, Fig. 7) or the ordered normalized pattern differences with 20 points (ONDR, Fig. 11) as features to run the clustering procedure. In practice, the days with cloudy sky are identified in a similar way by using all the types of pre-processing. In fact, in all the resulting partitions there is a cluster composed of a number of patterns ranging from 54 to 59, containing the days far from the clear sky conditions. However, for visualization purposes, with 1000 points it is less immediate to see the differences of this cluster with respect to the other ones, and the use of a reduced number of points facilitates the identification of this type of cluster.
Focusing on the four clusters, the following definitions can be provided for the types of day:
- Clear sky (cluster 2 ONPR and cluster 4 ONDR)
- Quasi-clear sky (cluster 4 ONPR and cluster 3 ONDR)
- Quasi-cloudy sky (cluster 1 ONPR and cluster 1 ONDR)
- Cloudy sky (cluster 3 ONPR and cluster 2 ONDR)

C. Day-type succession matrix
From the clustering results, the day-type succession matrix D, built from the 365 day-to-day successions among the 366 days (as the first day has no preceding day), is shown in Table X by using ordered normalized patterns and reduced data (ONPR), and in Table XI by using ordered differences between the normalized patterns and reduced data (ONDR). The results show that the matrix D entries depending on the clustering results are relatively different. This is an effect of the k-means algorithm, which tends to create homogeneous groups. The variety of the input data in the case studied is such that, by changing the type of data input, the results may change significantly.

TABLE X. DAY-TYPE SUCCESSION MATRIX (ONPR DATA INPUT)
(rows: preceding day; columns: day under analysis)
               clear   quasi-clear   quasi-cloudy   cloudy
clear          42      15            16             7
quasi-clear    13      24            30             15
quasi-cloudy   17      25            96             11
cloudy         8       18            7              21

TABLE XI. DAY-TYPE SUCCESSION MATRIX (ONDR DATA INPUT)
(rows: preceding day; columns: day under analysis)
               clear   quasi-clear   quasi-cloudy   cloudy
clear          13      17            18             15
quasi-clear    12      112           33             9
quasi-cloudy   20      28            22             10
cloudy         18      9             7              22

Starting from a given day, the entries of the matrix D make it possible to assess the conditional probability that the successive day will be of any of the types of day defined. For this purpose, the entries of the matrix D are processed by calculating within each row the conditional probabilities of finding a given type of day under analysis, provided that the day type of the preceding day is known. As an example, Table XII shows the results referring to the ONPR data input and to the results of Table X. These results can be used to construct different scenarios with meaningful successions of day types in each scenario.
TABLE II. NORMALIZED PATTERNS, 1000 POINTS
Cluster (NP data)        1     2     3     4
number of components     50    43    216   57

Fig. 4. Clustering results from normalized irradiance patterns (1000 points). [Panels: cluster 1, cluster 2, cluster 3, cluster 4]

TABLE III. ORDERED NORMALIZED PATTERNS, 1000 POINTS
Cluster (ONP data)       1     2     3     4
number of components     81    54    77    154

Fig. 5. Results of the clustering based on the ordered normalized patterns (1000 points), and normalized patterns corresponding to the same clusters. [Panels: cluster 1, cluster 2, cluster 3, cluster 4]

TABLE IV. NORMALIZED PATTERNS, 20 POINTS
Cluster (NPR data)       1     2     3     4
number of components     56    87    78    145

Fig. 6. Clustering results based on ordered normalized patterns (20 points). [Panels: cluster 1, cluster 2, cluster 3, cluster 4]

TABLE V. ORDERED NORMALIZED PATTERNS, 20 POINTS
Cluster (ONPR data)      1     2     3     4
number of components     82    81    54    149

Fig. 7. Results of the clustering based on the ordered normalized patterns (20 points), and normalized patterns corresponding to the same clusters. [Panels: cluster 1, cluster 2, cluster 3, cluster 4]

TABLE VI. NORMALIZED PATTERN DIFFERENCES, 1000 POINTS
Cluster (ND data)        1     2     3     4
number of components     216   53    41    56

Fig. 8. Clustering results from normalized pattern differences (1000 points). [Panels: cluster 1, cluster 2, cluster 3, cluster 4]

TABLE VII. ORDERED NORMALIZED DIFFERENCES, 1000 POINTS
Cluster (OND data)       1     2     3     4
number of components     69    58    158   81

Fig. 9. Results of the clustering based on the ordered normalized pattern differences (1000 points) and corresponding normalized patterns. [Panels: cluster 1, cluster 2, cluster 3, cluster 4]

TABLE VIII. NORMALIZED PATTERN DIFFERENCES, 20 POINTS
Cluster (NDR data)       1     2     3     4
number of components     59    70    186   51

Fig. 10. Clustering results from normalized pattern differences (20 points). [Panels: cluster 1, cluster 2, cluster 3, cluster 4]

TABLE IX. ORDERED NORMALIZED DIFFERENCES, 20 POINTS
Cluster (ONDR data)      1     2     3     4
number of components     63    56    80    167

Fig. 11. Results of the clustering based on the ordered normalized pattern differences (20 points) and corresponding normalized patterns. [Panels: cluster 1, cluster 2, cluster 3, cluster 4]
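The data inputs compared in Tables II to IX can be derived from a single normalized daily pattern as sketched below. The function names and the dictionary layout are illustrative assumptions. In particular, the ordered reduced input is obtained here by sorting the already averaged points, which is one possible reading of the ONPR definition. The difference-based inputs (ND, OND, NDR, ONDR) follow by passing the normalized clear-sky reference of the same day.

```python
def reduce_points(pattern, group_size):
    """Average successive groups of group_size points (the Ns of the text)."""
    if len(pattern) % group_size != 0:
        raise ValueError("pattern length must be a multiple of group_size")
    return [
        sum(pattern[k:k + group_size]) / group_size
        for k in range(0, len(pattern), group_size)
    ]


def build_inputs(pattern, group_size, reference=None):
    """Return the four clustering inputs for one day.

    Without reference: NP, ONP, NPR, ONPR.
    With reference (normalized clear-sky pattern): ND, OND, NDR, ONDR.
    """
    if reference is not None:
        base = [p - r for p, r in zip(pattern, reference)]
        names = ("ND", "OND", "NDR", "ONDR")
    else:
        base = list(pattern)
        names = ("NP", "ONP", "NPR", "ONPR")
    reduced = reduce_points(base, group_size)
    # Ordering discards the time succession but keeps the value distribution.
    return dict(zip(names, (base, sorted(base), reduced, sorted(reduced))))


# Toy pattern with N = 6 points and groups of 2 points (made-up values).
inputs = build_inputs([0.25, 0.75, 1.0, 0.5, 0.5, 0.0], group_size=2)
# inputs["NPR"] == [0.5, 0.75, 0.25]; inputs["ONPR"] == [0.25, 0.5, 0.75]
```

For the case study, N = 1000 and groups of 50 points would give the 20-point inputs used in Tables IV, V, VIII and IX.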
TABLE XII. PROBABILITY OF FINDING A GIVEN TYPE OF DAY UNDER ANALYSIS KNOWING THE TYPE OF PRECEDING DAY (ONPR DATA INPUT)
(rows: preceding day; columns: day under analysis)
               clear   quasi-clear   quasi-cloudy   cloudy   total
clear          53%     19%           20%            9%       100%
quasi-clear    16%     29%           37%            18%      100%
quasi-cloudy   11%     17%           64%            7%       100%
cloudy         15%     33%           13%            39%      100%

D. Variation of the number of clusters
Further details can be observed by increasing the number of clusters. For example, Fig. 12 shows the results obtained by running the clustering procedure with K = 12 by using the ONPR data input. In this case, the definitions of the types of days become more varied, but clearly the pattern grouping is better defined. The number of cells of the matrix D becomes K² = 144. Since the number of entries to be located in the matrix is M-1 = 365, many of these cells will contain very low numbers or even zeros, making it difficult to associate a probabilistic meaning to the succession of the days. The situation could be improved by analysing data for very long time periods, i.e., many years. As such, conceptually the construction of a “large size” D matrix may be useful when the input data contain measurements taken for several years.

Fig. 12. Clustering results with ONPR data input and K = 12 clusters. Horizontal axis: number of points. Vertical axis: normalized solar irradiance.

IV. CONCLUDING REMARKS
Dealing with photovoltaic systems data affected by uncertainty, daily and seasonal effects, as well as the presence of a number of null values during the 24 hours, raises the issue of comparability among the data in order to analyse the characteristics of sunny, partially clouded or cloudy days occurring during the year. This paper has illustrated a novel way to pre-process the solar irradiance data by carrying out normalization on both the horizontal (time) axis in order to make the time periods comparable, and the vertical (amplitude) axis to make the solar irradiance comparable.
With this normalization, followed by suitable interpolation of the data gathered in order to express all the patterns with the same number of points within the normalized time interval, the patterns have been sent to a clustering procedure in order to find an appropriate grouping of the days and to discover the characteristics of the time periods in which there is no clear sky.
Different constructions of the features representing the patterns that are sent to the clustering procedure have been considered, and the best solutions have been indicated as the ordered normalized patterns and the ordered normalized pattern differences (with respect to the reference Moon-Spencer model of solar irradiance at clear sky). An illustrative clustering with four clusters has been run by using the k-means method, identifying four day types, and posterior probabilities of transition between the different types of day identified by clustering have been calculated. A further case with a larger number of clusters has shown the possibility of gaining details on the representation of the characteristics of different days, reducing in this case the possibility of using information obtained from the day-type succession matrix D to carry out a statistical analysis of the conditional probability that starting from a known type of day the successive day will be of a given type of day. Indeed, the choice of the number of clusters to be used may be discussed on the basis of the compromise between the need of having a low number of clusters to make the definition of the different types of day more intuitive and to get better statistical information from the day-type succession matrix, and the need for representing the types of day in more detail in order to define the characteristics of the clusters more clearly.
The results are being used by the authors to generate relevant scenarios to express the patterns of solar irradiance at the given sites. Furthermore, by associating the model of the solar arrays, it will be possible to create suitable patterns of power production that take into account the real evolution of the ambient variables.

ACKNOWLEDGMENTS
The research leading to these results has received funding from the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement no. 309048, project SiNGULAR (Smart and Sustainable Insular Electricity Grids Under Large-Scale Renewable Integration).

REFERENCES
[1] P. Moon and D.E. Spencer, “Illumination from a non uniform sky”, Trans. of the Illumination Engineering Society, vol. 37 (12), pp. 707-726, 1942.
[2] F. Batrinu, E. Carpaneto, G. Chicco, S. Gagliano, F. Spertino and G.M. Tina, “Assessing the performance of photovoltaic sites and grid-connected plants: a study case”, Proc. VI World Energy System Conference, Torino, Italy, pp. 386-393, 10-12 July 2006.
[3] T. Xu and G. Gross, “A production simulation tool for systems with an integrated concentrated solar plant with thermal energy storage”, Proc. Bulk Power System Dynamics and Control - IX Optimization, Security and Control of the Emerging Power Grid (IREP), Crete, Greece, 25-30 August 2013.
[4] F. Spertino, P. Di Leo and V. Cocina, “Accurate measurements of solar irradiance for evaluation of photovoltaic power profiles”, Proc. IEEE PowerTech, Grenoble, France, 16-20 June 2013.
[5] G. Chicco, V. Cocina, A. Mazza and F. Spertino, “Data Pre-Processing and Representation for Energy Calculations in Net Metering Conditions”, Proc. IEEE Energycon 2014, Dubrovnik, Croatia, 13-16 May 2014, paper 262.
[6] J.T. Tou and R.C. Gonzalez, Pattern recognition principles, Addison-Wesley, 1974.
[7] G. Chicco, “Overview and performance assessment of the clustering methods for electrical load pattern grouping”, Energy, vol. 42 (1), pp. 68-80, June 2012.
