Probability Distributions of The Aggregated Residential Load


9th International Conference on Probabilistic Methods Applied to Power Systems KTH, Stockholm, Sweden June 11-15, 2006

Probability distributions of the aggregated residential load


Enrico Carpaneto and Gianfranco Chicco
Abstract: This paper deals with the characterization of the probability distributions of the aggregated residential load. A detailed statistical study has been performed on a set of data referring to single-house extra-urban customers, in order to assess the time evolution of the average value and standard deviation of the aggregated load and its possible representation with some typical probability distributions. The results show that the aggregated residential load data can be satisfactorily represented by a Gamma probability distribution whose parameters vary as a function of time and of the number of customers.

Index Terms: Residential aggregated load, Probability distributions, Goodness-of-fit, Monte Carlo simulations.

E. Carpaneto and G. Chicco are with the Dipartimento di Ingegneria Elettrica, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Torino, Italy (e-mail: enrico.carpaneto@polito.it, gianfranco.chicco@polito.it).

I. INTRODUCTION

The recent evolution of the electricity systems towards time-dependent tariff rates and integration of distributed generation is increasing the importance of assessing the time evolution of the electricity consumption. Taking into account the effects of the aggregation of residential loads is now essential for studying the time evolution of the load in the distribution system feeders. In fact, the electricity consumption of a single residential customer is too variable in time to allow a sound estimate of its individual load pattern [1]. The residential load aggregation can be obtained either by working directly at the distribution system level (if the results of measurements carried out on several feeders are available) [2], or by resorting to a bottom-up approach in which the aggregated load patterns of single-house customers are computed on the basis of information obtained from real case investigations on customer behaviour, lifestyle, and usage of the appliances [3].

In particular, it is important to assess not only the average value of the aggregated load, but also how its probability distribution varies during the day and as a function of the number of residential customers. Previous studies have shown that the time evolution of the average power, normalized with respect to the total contract power of the customers, has a predictable behaviour, especially when the number of customers is relatively high (e.g., over 100) [4]. Yet, when the number of customers is low, the possible variations of the load power at any given time instant are significantly high and strongly depend on the number of customers and on the randomness in the customer composition and lifestyle [5].

This paper presents the results of a study aimed at characterizing the time evolution of the probability distributions of the aggregated residential load when the number of residential customers varies. Starting from the statistical characterization of the aggregated load patterns of single-house customers carried out in [4] by using the bottom-up approach, the aggregated load power has been assessed as a function of the number of residential customers by using Monte Carlo simulations. Then, a goodness-of-fit analysis has been extensively performed with several probability distributions (Normal, Log-Normal, Gamma, Gumbel, Inverse Normal, Beta, Exponential, Rayleigh and Weibull) in order to assess which probability distribution fits the distribution of the load power at each time instant most satisfactorily. The results obtained establish a sound basis to be used within a comprehensive probabilistic evaluation of the residential load or to be integrated into more general procedures of analysis where detailed knowledge of the variability of the residential load patterns is required [6].

Running Monte Carlo simulations is essential to obtain the customer data for variable numbers of customers. In fact, field measurements [2,7-9] would be possible only on a preselected set with a fixed number of customers. In addition, it is sometimes difficult to gather only the data of the residential customers, without superposition of other loads (e.g., building services). This difficulty emerges in particular in urban areas, where the residential load and the general services of the buildings are supplied by the same feeder.

Section II of the paper deals with the formation of the data set for extra-urban customers. Section III illustrates the characteristics of the statistical tests used. Section IV provides the numerical results and their discussion. Section V contains the concluding remarks.

II. FORMATION OF THE DATA SET

The analysis has been structured on the basis of the results obtained for extra-urban customers in a previous study [4], carried out by using a comprehensive approach including two phases. In the first phase, a direct investigation of the customers' electricity usage has been performed for a real set of single-house extra-urban customers. The results were related to the presence at home of the family members and to the detailed usage of the appliances, and were processed and validated on the subset of customers who gave acceptable information. In the second phase, an overall Monte Carlo simulation was conducted to form the data set for the subsequent statistical study. Different types of days (working days and weekend days) and periods (summer and winter) were considered. For space limitations, only the results obtained for winter working days are presented here.

Copyright KTH 2006


III. PROBABILITY DENSITY FUNCTIONS FOR THE STATISTICAL TESTS

Various probability distributions have been used for the goodness-of-fit statistical tests [10], including two one-parameter distributions (Exponential and Rayleigh), six two-parameter distributions (Gamma, Gumbel, Weibull, Normal, Log-Normal and Inverse Normal), and the three-parameter Beta distribution (whose third parameter has been set to the maximum value of the sample data). Table I contains some details on the probability distributions tested, with the corresponding expressions of the Probability Density Function (PDF) f(P) and Cumulative Distribution Function (CDF) F(P). The Chi-square, Kolmogorov-Smirnov (KS) and geometrical adaptation statistical tests have been used for investigating the goodness-of-fit of the various probability distributions as a function of the load power P.

TABLE I
PROBABILITY DISTRIBUTIONS USED FOR THE GOODNESS-OF-FIT TESTS

Beta (parameter limits: $0 \le P \le c$, $a > 0$, $b > 0$)
PDF: $f(P) = \dfrac{\Gamma(a+b)\, P^{a-1} (c-P)^{b-1}}{\Gamma(a)\,\Gamma(b)\, c^{a+b-1}}$
CDF: $F(P) = \dfrac{\int_0^{P/c} x^{a-1} (1-x)^{b-1}\, dx}{\int_0^{1} x^{a-1} (1-x)^{b-1}\, dx}$ = Beta incomplete(P/c, a, b)

Exponential (parameter limits: $P \ge 0$, $b > 0$)
PDF: $f(P) = \dfrac{1}{b}\, e^{-P/b}$
CDF: $F(P) = 1 - e^{-P/b}$

Gamma (parameter limits: $P \ge 0$, $a > 0$, $b > 0$)
PDF: $f(P) = \dfrac{P^{a-1}\, e^{-P/b}}{b^{a}\, \Gamma(a)}$
CDF: $F(P) = \dfrac{1}{b^{a}\, \Gamma(a)} \int_0^{P} x^{a-1} e^{-x/b}\, dx$ = Gamma incomplete(P/b, a)

Gumbel (parameter limits: $b > 0$)
PDF: $f(P) = \dfrac{1}{b}\, e^{\frac{P-a}{b}}\, e^{-e^{\frac{P-a}{b}}}$
CDF: $F(P) = 1 - e^{-e^{\frac{P-a}{b}}}$

Inverse Normal (parameter limits: $P > 0$, $a > 0$, $b > 0$)
PDF: $f(P) = \dfrac{a}{\sqrt{2 \pi b P^{3}}}\, e^{-\frac{(P-a)^2}{2 b P}}$
CDF: $F(P) = \Phi\!\left(\dfrac{P-a}{\sqrt{bP}}\right) + e^{\frac{2a}{b}}\, \Phi\!\left(-\dfrac{P+a}{\sqrt{bP}}\right)$

Log-Normal (parameter limits: $P > 0$, $a > 0$)
PDF: $f(P) = \dfrac{1}{a P \sqrt{2\pi}}\, e^{-\frac{(\ln P - b)^2}{2 a^2}}$
CDF: $F(P) = \dfrac{1}{2} \left[ 1 + \mathrm{erf}\!\left( \dfrac{\ln P - b}{a \sqrt{2}} \right) \right]$

Normal (parameter limits: $\sigma > 0$)
PDF: $f(P) = \dfrac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(P-\mu)^2}{2 \sigma^2}}$
CDF: $F(P) = \dfrac{1}{2} \left[ 1 + \mathrm{erf}\!\left( \dfrac{P - \mu}{\sigma \sqrt{2}} \right) \right]$

Rayleigh (parameter limits: $P \ge 0$, $b > 0$)
PDF: $f(P) = \dfrac{2P}{b^2}\, e^{-\frac{P^2}{b^2}}$
CDF: $F(P) = 1 - e^{-\frac{P^2}{b^2}}$

Weibull (parameter limits: $P \ge 0$, $a > 0$, $b > 0$)
PDF: $f(P) = \dfrac{a\, P^{a-1}}{b^{a}}\, e^{-\left(\frac{P}{b}\right)^{a}}$
CDF: $F(P) = 1 - e^{-\left(\frac{P}{b}\right)^{a}}$

In the Inverse Normal CDF, $\Phi(\cdot)$ denotes the standard normal CDF, $\Phi(z) = \frac{1}{2}\left[1 + \mathrm{erf}(z/\sqrt{2})\right]$.

A. Chi-square test

The parametric chi-square test has been used, adopting the same data sample for the statistical test. The Yates correction [11] has been introduced in order to better estimate the significance level. The results of the test depend on the prespecified number and structure (uniform or non-uniform) of the classes, and on the level of significance $\alpha$%. If the value of $\alpha$% is specified in advance, the observed value $\chi^2$ is compared to the critical range of values ($\chi^2_{min}$, $\chi^2_{max}$), depending on the number of degrees of freedom, so that the test is successful if $\chi^2 \le \chi^2_{min}$, unsuccessful if $\chi^2 > \chi^2_{max}$, whereas for $\chi^2_{min} < \chi^2 \le \chi^2_{max}$ the result is undefined. Alternatively, the maximum level of significance $\alpha_{max}$% corresponding to $\chi^2 = \chi^2_{min}$ sets the limits of acceptance of the chi-square test results.

B. Kolmogorov-Smirnov (KS) test

The error $\varepsilon$ of the KS test is given by the maximum mismatch between the Empirical CDF (ECDF) obtained from the set of data under analysis and the CDF of the probability distribution under test. The error is compared to a critical value $\varepsilon_{crit}$ and the test is successful if $\varepsilon \le \varepsilon_{crit}$. If the CDF under test is fully specified by assigning all its parameters, the result of the test is independent of the distribution, and the generic critical values $\varepsilon_{crit}^{gen}$ are found in specific tables as a function of the level of significance (see [11] p.797 and Table 1 of [12]). However, if the CDF parameters are estimated from the data, these critical values are no longer valid and must be determined by simulation or by specific tables. Specific tables have been found for the Exponential distribution (p.798 of [11] and Table 1 of [13]) and for the Normal distribution (p.799 of [11] and Table 1 of [14]). For the other cases in which the distribution parameters are extracted from the data, the critical values corresponding to a generic distribution can be seen only as upper bounds of the actual critical values.

The assessment of the critical values is then performed by using a Monte Carlo simulation. At first, a set of M values of the load power, indexed by m = 1, ..., M, is specified, at which the CDF under test is calculated. Then, a specified number K of Monte Carlo simulations is performed. At each simulation, a vector of length H is filled with H random values extracted from the CDF under test. The extraction is carried out by drawing H random values in the interval (0,1) from a uniform probability distribution; these are taken as values of the CDF under test, and the corresponding load power values are obtained as the abscissae at which the CDF under test takes those values. Then, the simulated empirical CDF is built at the M predefined locations, and the KS error is computed as the maximum absolute difference between the points of the simulated empirical CDF and of the CDF under test referred to the same values m = 1, ..., M. The K errors of the KS test are then used to build the related CDF, and the critical value is evaluated for a given level of significance. The Monte Carlo simulations performed in this paper assume M = 1000, K = 5000 and H set to the sample set size.

C. Geometrical adaptation tests

The assessment of the fitness of the Empirical CDF (ECDF) to a reference CDF has been performed by using two graphical representations, with suitable functions of the generic CDF value F. The first representation transforms each value F into

$a_E = -\ln(1 - F)$   (1)

and plots $a_E$ versus the power P, so that an Exponential CDF would be represented by a straight line in the (P, $a_E$) plane. The second representation transforms each value F into

$a_W = \ln(-\ln(1 - F))$   (2)

and plots $a_W$ versus ln(P), so that a Weibull CDF would be represented by a straight line in the (ln(P), $a_W$) plane. In addition, plotting $a_W$ versus P would represent a Gumbel CDF as a straight line in the (P, $a_W$) plane.

IV. NUMERICAL TESTS AND RESULTS

A. Tests for a 100-customer case

A first set of tests has been performed on the data obtained from the Monte Carlo simulation at the same time instant. An example is presented for hour 12:00 of a winter working day with N = 100 customers. The number of Monte Carlo simulations carried out to obtain the ECDF for this situation is 100. The characteristics of the complete data sample include the minimum value 23.69 kW, maximum value 51.69 kW, average value 39.21 kW, standard deviation 5.77 kW (14.7%), and skewness -0.0004.

The results of the KS test with level of significance 5% are shown in Table II. The critical values of the KS test have been computed by running 5000 Monte Carlo simulations for each probability distribution (other than Exponential and Normal), resulting in values lower (that is, more restrictive) than the generic critical value $\varepsilon_{crit}^{gen}$ = 0.1360. In particular, in this case the tests are not accepted only for the Exponential and Rayleigh probability distributions, whereas all the other distributions exhibit acceptable goodness-of-fit. Fig. 1 reports the various CDFs.

More details are reported for the Gamma CDF with average value and standard deviation equal to the ones of the data sample (shape factor a = 46.2 and scale factor b = 848.5 W according to Table I). Fig. 2 shows the details of the KS test. Fig. 3 reports the results of the chi-square test with 7 degrees of freedom (maximum acceptable error 14.07). Applying the Yates correction, the observed error (6.85) is acceptably low and the test is passed with a maximum level of significance of 44.5%, and without excessive adaptation (the corresponding lower critical value being 2.17).
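The Gamma parameters quoted above follow from matching the average value and the standard deviation of the sample. With the Gamma parameterization of Table I (mean a·b, variance a·b²), this gives a = (mean/std)² and b = std²/mean. The following Python lines are a minimal sketch of that calculation; the function name `gamma_from_moments` is hypothetical and only the sample statistics are taken from the text.

```python
def gamma_from_moments(mean, std):
    """Shape a and scale b of a Gamma distribution with the given mean and
    standard deviation (mean = a*b, variance = a*b**2)."""
    shape = (mean / std) ** 2
    scale = std ** 2 / mean
    return shape, scale


# Sample statistics reported for hour 12:00 with N = 100 customers (in kW).
a_100, b_100_kw = gamma_from_moments(39.21, 5.77)
print(f"a = {a_100:.1f}, b = {1000.0 * b_100_kw:.1f} W")
```

Starting from the rounded statistics, the shape factor agrees with the reported 46.2, while the scale factor differs from the reported 848.5 W by less than 1 W, presumably because the reported mean and standard deviation are themselves rounded.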

TABLE II
KS TEST ERRORS FOR THE POWER AT HOUR 12:00 WITH 100 CUSTOMERS (LEVEL OF SIGNIFICANCE 5%)

CDF              KS test error   critical value   result
Beta             0.0832          0.1255           accepted
Exponential      0.5101          0.1060           rejected
Gamma            0.0800          0.1313           accepted
Gumbel           0.1160          0.1314           accepted
Inverse-Normal   0.0659          0.1225           accepted
Log-Normal       0.0887          0.1230           accepted
Normal           0.0653          0.0886           accepted
Rayleigh         0.3434          0.1318           rejected
Weibull          0.0855          0.1306           accepted
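The simulated KS critical values can be illustrated with a small Monte Carlo experiment in the spirit of Section III.B. The Python sketch below treats only the Exponential case and rests on several assumptions that are not stated in the paper: the scale is re-estimated from every simulated sample, the M abscissae form a uniform grid up to ten times the scale, and K and M are reduced to keep the run time short. The function name and its arguments are hypothetical.

```python
import bisect
import math
import random


def ks_critical_value_exponential(H=100, K=1000, M=1000, alpha=0.05, seed=0):
    """Monte Carlo estimate of the KS critical value for an Exponential CDF
    whose scale is re-estimated from each simulated sample (assumption)."""
    rng = random.Random(seed)
    true_scale = 1.0  # arbitrary choice: the KS error does not depend on it here
    # M fixed abscissae at which ECDF and CDF are compared (assumed uniform grid).
    grid = [10.0 * true_scale * (m + 1) / M for m in range(M)]
    errors = []
    for _ in range(K):
        # Inverse-transform sampling: u ~ U(0,1) is read as a CDF value and
        # mapped back to the abscissa P = -b * ln(1 - u).
        sample = sorted(-true_scale * math.log(1.0 - rng.random()) for _ in range(H))
        fitted_scale = sum(sample) / H  # scale estimated from the simulated sample
        err = 0.0
        for p in grid:
            ecdf = bisect.bisect_right(sample, p) / H  # fraction of samples <= p
            cdf = 1.0 - math.exp(-p / fitted_scale)
            err = max(err, abs(ecdf - cdf))
        errors.append(err)
    errors.sort()
    # Critical value: (1 - alpha) quantile of the K simulated KS errors.
    return errors[min(K - 1, math.ceil((1.0 - alpha) * K) - 1)]


if __name__ == "__main__":
    print(ks_critical_value_exponential())
```

The value returned for H = 100 and a 5% level of significance can be compared with the tabulated Exponential entry 0.1060 of Table II; the grid evaluation and the finite K introduce a small discrepancy.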

Fig. 1. ECDF of the load at hour 12:00 for N = 100 and CDFs of various probability distributions with the same average value and standard deviation.

Fig. 2. KS test for the Gamma CDF.

Fig. 3. Results of the chi-square test for the Gamma CDF for N = 100.
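The chi-square figures quoted for the Gamma CDF (7 degrees of freedom, limits 14.07 and 2.17, observed error 6.85, maximum level of significance 44.5%) can be cross-checked from the chi-square distribution itself. The Python sketch below uses only the standard library and evaluates the regularized incomplete gamma function through its power series; the helper names are hypothetical, and the Yates correction and the class definition used in the paper are not reproduced.

```python
import math


def regularized_lower_gamma(a, x, terms=200):
    """Regularized lower incomplete gamma function P(a, x), series expansion."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi_square_significance(observed, dof):
    """Probability that a chi-square variable with `dof` degrees of freedom
    exceeds the observed value."""
    return 1.0 - regularized_lower_gamma(dof / 2.0, observed / 2.0)


for value in (2.17, 6.85, 14.07):
    print(value, round(100.0 * chi_square_significance(value, 7), 1))
```

The observed error 6.85 corresponds to a significance of about 44.5%, while 14.07 and 2.17 correspond to about 5% and 95%, respectively.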

Fig. 4 shows the results of the geometrical adaptation tests. The acceptability of the Normal distribution is also confirmed by the high value of the shape factor of the Gamma distribution. As indicated in Fig. 5, for the Gamma CDF the KS observed error during the day never exceeds the 5% acceptance threshold.

Fig. 4. Geometrical adaptation tests for the Gamma CDF.
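The two transformations used in the geometrical adaptation tests, a_E = -ln(1 - F) and a_W = ln(-ln(1 - F)), can be checked numerically on exact CDF values. The Python sketch below is only an illustration: the parameter values and the helper `fitted_slope` are hypothetical. It verifies that a Weibull CDF gives a straight line of slope a in the (ln(P), a_W) plane and that an Exponential CDF gives a straight line of slope 1/b in the (P, a_E) plane.

```python
import math


def a_E(F):
    """Transformation (1): linearizes an Exponential CDF versus P."""
    return -math.log(1.0 - F)


def a_W(F):
    """Transformation (2): linearizes a Weibull CDF versus ln(P)."""
    return math.log(-math.log(1.0 - F))


def fitted_slope(xs, ys):
    """Least-squares slope of ys versus xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


powers = [10.0 + 5.0 * i for i in range(13)]  # illustrative powers, 10 to 70

# Weibull CDF with illustrative shape a = 2.5 and scale b = 40.
a, b = 2.5, 40.0
F_weibull = [1.0 - math.exp(-((P / b) ** a)) for P in powers]
weibull_slope = fitted_slope([math.log(P) for P in powers], [a_W(F) for F in F_weibull])

# Exponential CDF with illustrative scale b = 25.
b_exp = 25.0
F_exp = [1.0 - math.exp(-P / b_exp) for P in powers]
exponential_slope = fitted_slope(powers, [a_E(F) for F in F_exp])

print(weibull_slope, exponential_slope)
```

With an ECDF in place of the exact CDF values, departures of the plotted points from a straight line indicate a poor fit to the corresponding distribution.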

Fig. 5. Results of the KS 5% test with the Gamma CDF for N = 100.

B. Tests for a 10-customer case

The results of the same tests indicated in the previous subsection are shown here for the data obtained from 100 Monte Carlo simulations referred to hour 12:00 of a winter working day with N = 10 customers. The characteristics of the complete data sample include the minimum value 1.508 kW, maximum value 11.011 kW, average value 3.805 kW, standard deviation 1.618 kW (42.5%), and skewness 0.1109. Table III shows the results of the KS test with level of significance 5%. The Log-Normal and Inverse Normal distributions exhibit the best goodness-of-fit. In this case, the Normal probability distribution no longer fits the data. The Gamma CDF has shape factor 5.53 (much lower than in the previous case) and scale factor 688 W.

TABLE III
KS TEST ERRORS FOR THE POWER AT HOUR 12:00 WITH 10 CUSTOMERS (LEVEL OF SIGNIFICANCE 5%)

CDF              KS test error   critical value   result
Beta             0.1134          0.1263           accepted
Exponential      0.3721          0.1060           rejected
Gamma            0.1007          0.1325           accepted
Gumbel           0.2108          0.1334           rejected
Inverse-Normal   0.0851          0.1280           accepted
Log-Normal       0.0889          0.1263           accepted
Normal           0.1469          0.0886           rejected
Rayleigh         0.1588          0.1346           rejected
Weibull          0.1208          0.1322           accepted

The results obtained for hour 12:00 can be generalized by considering the results of the KS test with the various CDFs for the 1440 minutes of the day. Fig. 6 shows that the Exponential and Rayleigh distributions do not fit the ECDF satisfactorily. A zoom into a specific time interval (from hour 11:40 to hour 12:20, Fig. 7) allows for identifying the Log-Normal, Inverse Normal and Gamma CDFs as the ones exhibiting the best values of goodness-of-fit, compared to the corresponding thresholds. However, as indicated for the Gamma CDF in Fig. 8, in some hours of the day the KS observed error could exceed the 5% acceptance threshold.

Fig. 6. Observed error to critical value ratio of the KS test with significance level 5% for a winter working day with N = 10.

Fig. 7. Observed errors of the KS test with significance level 5% from hour 11:40 to hour 12:20 for N = 10.

Fig. 8. Results of the KS 5% test with the Gamma CDF for N = 10.

V. EXTENDED TESTS AND RESULTS

A. Extended tests for variable numbers of customers

The same set of tests specified in the previous subsections has been carried out for different numbers of customers (from 10 to 300), for all the 1440 minutes of the day, and for the 4 types of days considered (working day and weekend day in the summer and winter seasons). Each customer has a contract power of 3 kW. Some significant results are summarized below.

B. Time evolution of the aggregated load patterns and standard deviations (winter working days)

A first result can be achieved by comparing the time evolution of the aggregated load power for different numbers of customers. Fig. 9, Fig. 10 and Fig. 11 show the load patterns for a winter working day with N = 10, N = 100 and N = 300, respectively. The internal filled band represents the region ($\mu - \sigma$, $\mu + \sigma$), where $\mu$ is the average value and $\sigma$ is the standard deviation of the data concerning each minute. The upper and lower lines represent the maximum and minimum values obtained in the Monte Carlo simulation. It is evident that, when N increases, the range of variation of the aggregated load power is reduced and the probability distributions tend to become more symmetrical. In particular, Fig. 12 shows how the uncertainty of the aggregated load (represented by the standard deviation in per cent of the average value) depends on the number of aggregated customers and varies during the day.

C. Evaluations at specific hours

Further evaluations have been carried out by comparing the evolution of the load as a function of the number of customers. A first case is presented in Fig. 13, considering the CDF of the load at hour 12:00. When the number of customers increases, the CDFs move from left to right, but the standard deviation does not increase in proportion to the average value. This fact is well highlighted by the representation of the specific power (W/customer) shown in Fig. 14, where it is clear that when N varies the average value of the specific power remains within a narrow range, whereas the standard deviation varies considerably. This fact is important to establish a reference value of specific power that can be used to make good estimates of the consumption of the group of customers tested. Extending the calculation to all the time instants shows that the specific load power profile, reported in Fig. 15, remains very similar.
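The behaviour of the specific power can be checked with the sample statistics reported in Section IV for hour 12:00 (N = 10 and N = 100, 100 Monte Carlo simulations each). The Python lines below are a short worked example; the function name `specific_power_w` is hypothetical.

```python
def specific_power_w(aggregate_kw, n_customers):
    """Specific power in W/customer from an aggregated power in kW."""
    return 1000.0 * aggregate_kw / n_customers


# (average value, standard deviation) of the aggregated load in kW at hour 12:00.
cases = {10: (3.805, 1.618), 100: (39.21, 5.77)}

specific = {
    n: (specific_power_w(mean_kw, n), specific_power_w(std_kw, n))
    for n, (mean_kw, std_kw) in cases.items()
}
for n, (mean_w, std_w) in specific.items():
    print(f"N = {n}: average {mean_w:.1f} W/customer, std {std_w:.1f} W/customer")
```

The average specific power changes only from about 380 to about 392 W/customer, whereas the standard deviation drops from about 162 to about 58 W/customer.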
Fig. 9. Aggregated load patterns for N = 10.

Fig. 10. Aggregated load patterns for N = 100.

Fig. 11. Aggregated load patterns for N = 300.

Fig. 12. Time evolution of the standard deviation of the load power in per cent of the average value.

Fig. 13. CDF of the load at hour 12:00 for N = 10 to 100.

Fig. 14. CDFs of the aggregated specific load power at 16:00 for different numbers of customers.

Fig. 15. Specific load power profiles (winter working day).

In order to identify the most suitable probability distribution, a comparison has been made using as indicator the ratio between the observed error of the KS test and the KS critical value for the corresponding probability distribution. For each number of customers, the probability distribution for which this ratio is the lowest has been identified at each time instant. The results are summarized in Table IV, which shows the percentage of winning time instants for the various probability distributions. From this point of view, the Gamma distribution emerges as the most promising one for the various numbers of customers. Only the Inverse Normal distribution could be a viable alternative for a low number of customers.
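The selection rule can be illustrated on a single time instant with the KS errors and critical values of Tables II and III (hour 12:00). The Python sketch below is only an illustration of the ratio criterion; the function name `winning_distribution` is hypothetical, and Table IV itself requires repeating the comparison for all 1440 minutes of the day.

```python
# (KS test error, critical value) at hour 12:00, from Table II (N = 100).
table_ii = {
    "Beta": (0.0832, 0.1255), "Exponential": (0.5101, 0.1060),
    "Gamma": (0.0800, 0.1313), "Gumbel": (0.1160, 0.1314),
    "Inverse-Normal": (0.0659, 0.1225), "Log-Normal": (0.0887, 0.1230),
    "Normal": (0.0653, 0.0886), "Rayleigh": (0.3434, 0.1318),
    "Weibull": (0.0855, 0.1306),
}
# (KS test error, critical value) at hour 12:00, from Table III (N = 10).
table_iii = {
    "Beta": (0.1134, 0.1263), "Exponential": (0.3721, 0.1060),
    "Gamma": (0.1007, 0.1325), "Gumbel": (0.2108, 0.1334),
    "Inverse-Normal": (0.0851, 0.1280), "Log-Normal": (0.0889, 0.1263),
    "Normal": (0.1469, 0.0886), "Rayleigh": (0.1588, 0.1346),
    "Weibull": (0.1208, 0.1322),
}


def winning_distribution(results):
    """Return the distribution with the lowest error-to-critical-value ratio,
    together with all the ratios."""
    ratios = {name: err / crit for name, (err, crit) in results.items()}
    return min(ratios, key=ratios.get), ratios


for label, table in (("N = 100", table_ii), ("N = 10", table_iii)):
    best, ratios = winning_distribution(table)
    print(label, best, round(ratios[best], 3))
```

At this particular instant the lowest ratio belongs to the Inverse Normal distribution in both tables, which does not contradict Table IV: the percentages there count the winners over the whole day.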
TABLE IV PERCENTAGE OF WINNING TIME INSTANTS FOR THE VARIOUS PROBABILITY DISTRIBUTIONS

probability distributions and percentages of winning time instants Log-Normal Exponential Weibull Gumbel 0 0.1 0 0.3 0 0 0.1 0.1 0 0 Gamma Normal Inverse Normal N Rayleigh

Beta

10 20 30 40 50 60 70 80 90 100

43.1 58.9 74.9 70.5 86.0 81.5 80.8 76.0 80.3 80.1

0 0 0 0 0 0 0 0 0 0

2.9 4.7 7.1 11.5 7.8 12.5 11.3 13.5 11.1 12.1

0 0 0 0 0 0 0 0 0 0

46.0 32.4 14.5 8.7 3.2 3.5 3.1 5.5 5.1 5.1

2.2 1.7 1.7 3.2 1.5 1.6 2.7 2.8 2.1 1.9

4.8 2.2 1.8 5.8 1.5 0.9 2.0 2.1 1.4 0.8

1.0 0 0 0 0 0 0 0 0 0

VI. CONCLUDING REMARKS

The results presented in this paper form a useful basis for preparing a meaningful set of probabilistic data concerning the time evolution of the aggregated residential load, to be used in any tool for probabilistic simulation (e.g., based on analytical or Monte Carlo simulations). The results shown refer specifically to groups of residential customers in extra-urban areas. It has to be stressed that these values cannot be generally applied to all residential customers, regardless of their location. In fact, for urban areas generally lower values of specific power are expected for the same number of customers, due to the reduced size of the houses, reduced number of persons per house, and different types of activity.

The whole analysis could be repeated with a different data set, with initial parameters concerning the composition of the residential customer set (e.g., including new appliances, customer preferences, or customer willingness to participate in tariff-driven programs), to perform scenario studies and assess the effects of the penetration of new technologies on the time evolution of the aggregated residential consumption. Examples are the assessment of the distributed generation impact on residential districts, distribution system load forecasting, and simulation of unbalanced loads.

The Gamma probability distribution has clearly emerged as the one with the best goodness-of-fit. This result is particularly interesting, since the simple relationships between the Gamma parameters and the average value and standard deviation, as well as the existence and easy formulation of its characteristic function, make the Gamma distribution particularly flexible and powerful for many applications.

VII. REFERENCES

[1] C.F. Walker and J.L. Pokoski, "Residential load shape modeling based on customer behavior," IEEE Trans. on Power Apparatus and Systems, Vol. PAS-104, No. 7, July 1985, pp. 1703-1711.
[2] I.C. Schick, P.B. Usoro, M.F. Ruane and J.A. Hausman, "Residential end-use load shape estimation from whole-house metered data," IEEE Trans. on Power Systems, Vol. 3, No. 3, August 1988, pp. 986-991.
[3] A. Capasso, W. Grattieri, R. Lamedica and A. Prudenzi, "A bottom-up approach to residential load modeling," IEEE Trans. on Power Systems, Vol. 9, No. 2, May 1994, pp. 957-964.
[4] A. Cagni, E. Carpaneto, G. Chicco and R. Napoli, "Characterisation of the aggregated load patterns for extra-urban residential customer groups," Proc. IEEE Melecon 2004, Dubrovnik, Croatia, May 12-15, 2004, Vol. 3, pp. 951-954.
[5] ISTAT, 14th general census of the population and of the houses (in Italian), 2001. Available: http://dawinci.istat.it/MD/
[6] R. Herman and J.J. Kritzinger, "The statistical description of grouped domestic electrical load currents," Electric Power Systems Research, Vol. 27, May 1993, pp. 43-48.
[7] J.P. Ross and A. Meier, "Whole-house measurements of standby power consumption," International Conference on Energy Efficiency in Appliances, Report LBNL-45967, Berkeley Lab, September 2000. Available: http://standby.lbl.gov/articles.html
[8] K. Rosen and A. Meier, "Energy use of televisions and videocassette recorders in the U.S.," Report LBNL-42393, Berkeley Lab, March 1999. Available: http://eetd.lbl.gov/EA/Reports
[9] B. Lebot, A. Meier and A. Anglade, "Global implications of standby power use," ACEEE Summer Study on Energy Efficiency in Buildings, Report LBNL 46019, Berkeley Lab, August 2000. Available: http://standby.lbl.gov/articles.html
[10] R.B. D'Agostino and M.A. Stephens (ed.), Goodness-of-fit techniques, Dekker, New York, 1996.
[11] K.S. Trivedi, Probability and Statistics With Reliability, Queuing and Computer Science Applications, Wiley, New York, 2002.
[12] L.H. Miller, "Table of percentage points of Kolmogorov statistics," J. Am. Stat. Assoc. 51 (1956) 113.
[13] H.W. Lilliefors, "On the Kolmogorov-Smirnov test for exponential with mean unknown," J. Am. Stat. Assoc. 64 (1969) 388.
[14] H.W. Lilliefors, "On the Kolmogorov-Smirnov test for normality with mean and variance unknown," J. Am. Stat. Assoc. 62 (1967) 400.