Professional Documents
Culture Documents
A Fast Technique For Smart Home Management ADP With Temporal Difference Learning
A Fast Technique For Smart Home Management ADP With Temporal Difference Learning
A Fast Technique For Smart Home Management ADP With Temporal Difference Learning
SOC is fixed, because the energy out of the TES unit can not Algorithm 1 : ADP Using Temporal Difference Learning
be controlled, which is the thermal demand of the household. TD(1)
We can overcome this by either removing the end-of-day TES 1: Initialize V̄k0 , k ∈ K,
constraint or by having a range of values. However, this means 2: Set r = 1 and k = 1,
we end up with a sub optimal level of TES SOC at the end 3: Set s1 .
of the day or require user interaction to adjust the TES SOC, 4: while r ≤ R do
which we should avoid in practical applications. 5: Choose a sample path ωr .
However, the required computation to generate value func- 6: for k = 0, . . . , K do
tions using DP grows exponentially with the size of the state, 7: Solve the deterministic problem (17).
action and outcome spaces. One way of overcoming this prob- 8: for i = 1, . . . , I do
lem is to approximate the value function, while maintaining 9: Find right and left marginal contributions (18).
the benefits of DP. 10: end for
11: if k < K then
III. A PPROXIMATE DYNAMIC P ROGRAMMING (ADP) 12: Find the post decision states (11) and the next
pre decision states (12).
ADP, also known as forward DP, is an algorithmic strategy
13: end if
for approximating a value function, which steps forward in
14: end for
time, compared to backward induction used in value iteration.
15: for k = K, . . . , 0 do
Policies in ADP are extracted from these VFAs [29]. Similar
16: Calculate the marginal values (19).
to DP, ADP operates on an MDP formulation of the prob-
17: Update the estimates of the marginal values (20).
lem, so all the non-liner constraints and transition functions in
18: Update the VFAs using CAVE algorithm.
Section II-B can be incorporated with the same computational
19: Combine value functions of each controllable
burden as modelling linear transition functions and constraints.
device (16)
ADP is an anytime optimisation solution technique2 so we
20: end for
always obtain a solution regardless of the constraints and
21: r = r + 1.
the inputs. In this instance, the problem is formulated as a
22: end while
maximisation problem for convenience.
23: Return the value function approximations V̄kR ∀k.
TABLE I
DAILY O PTIMISATION R ESULTS FOR THE T HREE S CENARIOS
C. Quality of the Solutions From ADP, DP and Fig. 9. The total electricity cost of PV-battery systems of scenarios 1-3
Stochastic MILP from deterministic and stochastic ADP for a range of possible PV output and
demand profiles, which are generated by adding Gaussian noise with vary-
ADP and DP results in better quality solutions than stochas- ing standard deviation and zero mean to the actual PV output and electrical
demand.
tic MILP as they both incorporate stochastic input variables
using appropriate probabilistic models and non-linear con-
straints [24]. However, ADP results in slightly lower quality ADP enables us to incorporate the stochastic input variables
schedules compared to the optimal DP solutions because the without a noticeable increase in the computational effort over
value functions used are approximations. This is evident in a deterministic ADP. Moreover, stochastic ADP requires less
Table I. Note that the DP based SHEMS with two controllable computational effort than deterministic DP. An ADP-based
devices (battery and thermal) is computationally intractable stochastic optimisation for a PV-battery system can reduce the
to be use in an existing smart meter, and we only use it to total electricity cost by 13.62%, 0.16%, and 94.67% for sce-
compare solutions with ADP. narios 1, 2, and 3, respectively, in instances without Gaussian
noise. The benefits from Scenarios 1 and 3 are noticeable
D. Benefits of a Stochastic Optimisation Over a and their forecast errors are what we can expect from elec-
Deterministic Optimisation trical demand prediction algorithms [39], [40]. The benefits
A stochastic optimisation always performs better over a from the stochastic optimisation is minimal when the fore-
deterministic one and we investigate this using a range of pos- cast errors are very high (Scenario 2) or very low, which are
sible PV output and the demand profiles as shown in Fig. 9. highly unlikely. In Scenario 2, the stochastic optimisation gives
The Fig. 9 is obtained as follows: first we obtain VFAs from slightly lower quality results after 0.2 kW standard deviation of
both the stochastic and deterministic optimisations using ADP. Gaussian noise because both the forecast and kernel estimation
The stochastic optimisation uses the kernel estimated proba- errors are high. However, these scenarios are highly unlikely
bility distributions of the PV output and electrical demand and the resulting cost is negligible compared to the benefits
while the deterministic ADP only uses the predicted mean PV from a stochastic optimisation. Moreover, the initial point of
output and the electrical demand (i.e., all the random vari- Scenario 2 already has a mean absolute error of 0.696, which is
ables are zero). Second we obtain the total electricity cost highly unlikely to increase any further in a practical scenario.
for different possible PV output and demand profiles, which In summary, even though the benefits vary with the scenario
are generated by adding Gaussian noise with varying standard and the forecast error, a stochastic optimisation performs bet-
deviation and zero mean to the actual PV output and elec- ter or same as a deterministic one, and ADP provides these
trical demand. The mean absolute errors of the actual (i.e., benefits without a noticeable increase in the computational
zero Gaussian noise) and the predicted mean electrical demand effort.
are 0.238%, 0.696%, and 0.117%, for Scenarios 1, 2, and 3,
respectively, which are the initial points. Note that the forecast E. Computational Aspects
errors associated with residential electrical demand predictions In our second set of simulation results, we discuss the
are typically very high and our aim here is not to minimise computational performance of the three solution techniques
forecast errors but to find a suitable stochastic optimisation (Section V-E) and the benefits of extending the decision
technique that performs well under uncertainty. horizon (Section V-F) of households with PV-battery systems.
KEERTHISINGHE et al.: A FAST TECHNIQUE FOR SMART HOME MANAGEMENT: ADP WITH TEMPORAL DIFFERENCE LEARNING 3301
TABLE II
Y EARLY O PTIMISATION R ESULTS FOR T WO H OUSEHOLDS OVER 3 Y EARS
Fig. 10. Effects of extending the decision horizon for PV-battery systems,
(a) computational time of ADP, DP and stochastic MILP against the length
of the decision horizon, and (b) normalised electricity cost with error bars
against the length of the decision horizon.
that a daily DP results in a significant cost savings of 12.04% [2] A. C. Chapman, G. Verbič, and D. J. Hill, “Algorithmic and strategic
($107.35) and 1.91% ($39.35) for Households 1 and 2, respec- aspects to integrating demand-side aggregation and energy manage-
ment methods,” IEEE Trans. Smart Grid, vol. 7, no. 6, pp. 2748–2760,
tively, compared to a daily stochastic MILP approach. The Nov. 2016.
difference in the savings is because of the following reasons. [3] Time-of-Use Pricing, Ausgrid, Sydney, NSW, Australia. [Online].
In scenarios with high demand (i.e., Household 2), most of the Available: http://www.ausgrid.com.au/Common/Customer-Services/
Homes/Meters/Time-of-use-pricing.aspx#.WDO97lwpHIU
time the battery will discharge its maximum possible power [4] Flexible Pricing FAQs, Energy Australia, Melbourne, VIC,
to the household during peak periods, so the battery and the Australia. [Online]. Available: https://www.energyaustralia.com.au/
inverter will operate in the maximum efficiencies even though faqs/residential/flexiblepricing-plan-information
MILP solver does not consider the non-linear constraints. The [5] S. Lu et al., “Centralized and decentralized control for demand
response,” in Proc. IEEE PES Innov. Smart Grid Technol. (ISGT),
converse is happening for scenarios with low demand (i.e., Anaheim, CA, USA, 2011, pp. 1–8.
Household 1). [6] M. Pipattanasomporn, M. Kuzlu, and S. Rahman, “An algorithm for
For demonstration purposes we show optimisation results intelligent home energy management and demand response analysis,”
IEEE Trans. Smart Grid, vol. 3, no. 4, pp. 2166–2173, Dec. 2012.
for two households over three years in Table II. The average [7] S. Li, D. Zhang, A. B. Roget, and Z. O’Neill, “Integrating home energy
yearly electricity cost for a residential PV-battery system with simulation and dynamic electricity price for demand response study,”
the proposed ADP based SHEMS can reduce the total electric- IEEE Trans. Smart Grid, vol. 5, no. 2, pp. 779–788, Mar. 2014.
[8] M. Muratori and G. Rizzoni, “Residential demand response: Dynamic
ity cost over 3 years by 57.27% and 50.95% for Households energy management and time-varying electricity pricing,” IEEE Trans.
1 and 2, respectively, compared to the 22.80% and 28.60% Power Syst., vol. 31, no. 2, pp. 1108–1117, Mar. 2016.
improvements by having only a PV system. It is important to [9] Australian PV Institute. (Jan. 2015). Australian PV Market Since April
note that a DP based SHEMS over a two-day long decision 2001. [Online]. Available: http://pv-map.apvi.org.au/analyses
[10] “The integrated grid realizing the full value of central and distributed
horizon may result in a slightly better solution. However, it is energy resources,” Elect. Power Res. Inst. (EPRI), Palo Alto, CA, USA,
computationally difficult and the computational power avail- Tech. Rep., 2014.
able will be limited as it won’t be worthwhile investing in a [11] “Emerging technologies information,” Aust. Energy
Market Operat. (AEMO), Tech. Rep., 2015. [Online].
specialised equipment to solve this problem given the savings Available: https://www.aemo.com.au/Media-Centre/2015-Emerging-
on offer. Technologies-Information-Paper
[12] “The economics of grid defection when and where distributed solar gen-
eration plus storage competes with traditional utility service,” Rocky
VI. C ONCLUSION Mountain Inst., Boulder, CO, USA, Tech. Rep., 2014.
[13] Z. Chen, L. Wu, and Y. Fu, “Real-time price-based demand response
This paper shows the benefits having a smart home energy management for residential appliances via stochastic optimization
management system and presented an approximate dynamic and robust optimization,” IEEE Trans. Smart Grid, vol. 3, no. 4,
pp. 1822–1831, Dec. 2012.
programming approach for implementing a computationally
[14] C. Keerthisinghe, G. Verbič, and A. C. Chapman, “Evaluation of a
efficient smart home energy management system with sim- multi-stage stochastic optimisation framework for energy management
ilar quality schedules as with dynamic programming. This of residential PV-storage systems,” in Proc. Aust. Univ. Power Eng.
approach enables us to extend the decision horizon to up Conf. (AUPEC), Perth, WA, Australia, Sep./Oct. 2014, pp. 1–6.
[15] M. C. Bozchalui, S. A. Hashmi, H. Hassen, C. A. Canizares, and
to a week with high resolution while considering multiple K. Bhattacharya, “Optimal operation of residential energy hubs in smart
devices. Our results indicate that these improvements pro- grids,” IEEE Trans. Smart Grid, vol. 3, no. 4, pp. 1755–1766, Dec. 2012.
vide financial benefits to households employing them in a [16] F. De Angelis et al., “Optimal home energy management under dynamic
electrical and thermal constraints,” IEEE Trans. Ind. Informat., vol. 9,
smart home energy management system. Moreover, stochas- no. 3, pp. 1518–1527, Aug. 2013.
tic approximate dynamic programming always performs better [17] J. Wang, Z. Sun, Y. Zhou, and J. Dai, “Optimal dispatching model of
over deterministic approximate dynamic programming under smart home energy management system,” in Proc. IEEE Innov. Smart
Grid Technol. Asia (ISGT Asia), Washington, DC, USA, 2012, pp. 1–5.
uncertainty without a noticeable increase in the computational
[18] K. C. Sou, J. Weimer, H. Sandberg, and K. H. Johansson, “Scheduling
effort. smart home appliances using mixed integer linear programming,” in
In practical applications, we can use value function approx- Proc. 50th IEEE Conf. Decis. Control Eur. Control Conf. (CDC-ECC),
imations generated from approximate dynamic programming Orlando, FL, USA, 2011, pp. 5144–5149.
[19] M. A. Pedrasa, E. D. Spooner, and I. F. MacGill, “Robust schedul-
during offline planning phase to make faster online solu- ing of residential distributed energy resources using a novel energy
tions. This is not possible with stochastic mixed-integer linear service decision-support tool,” in Proc. IEEE PES Innov. Smart Grid
programming and generating value functions using dynamic Technol. (ISGT), Anaheim, CA, USA, 2011, pp. 1–8.
[20] M. A. A. Pedrasa, T. D. Spooner, and I. F. MacGill, “Coordinated
programming is computationally difficult. In the future, we scheduling of residential distributed energy resources to optimize
will learn initial value function approximations of approximate smart home energy services,” IEEE Trans. Smart Grid, vol. 1, no. 2,
dynamic programming using historical results, which will fur- pp. 134–143, Sep. 2010.
ther speed up the value function approximation convergence. [21] D. P. Bertsekas, Dynamic Programming and Optimal Control, vol. 1,
3rd ed. Belmont, MA, USA: Athena Sci., 2005.
Given the benefits outlined in this paper and the possible [22] H. Tischer and G. Verbič, “Towards a smart home energy management
future work, we recommend the use of approximate dynamic system—A dynamic programming approach,” in Proc. IEEE PES Innov.
programming in smart home energy management systems. Smart Grid Technol. Asia (ISGT), Anaheim, CA, USA, 2011, pp. 1–7.
[23] C. Keerthisinghe, G. Verbič, and A. C. Chapman, “Addressing the
stochastic nature of energy management in smart homes,” in Proc. Power
Syst. Comput. Conf. (PSCC), Wrocław, Poland, Aug. 2014, pp. 1–7.
R EFERENCES [24] C. Keerthisinghe, G. Verbič, and A. C. Chapman, “Energy manage-
ment of PV-storage systems: ADP approach with temporal difference
[1] “Australian government bureau of resources and energy economics,” learning,” in Proc. Power Syst. Comput. Conf. (PSCC), Genoa, Italy,
Energy in Australia, Tech. Rep., 2014. Jun. 2016, pp. 1–7.
KEERTHISINGHE et al.: A FAST TECHNIQUE FOR SMART HOME MANAGEMENT: ADP WITH TEMPORAL DIFFERENCE LEARNING 3303
[25] D. F. Salas and W. B. Powell, “Benchmarking a scalable approximate Chanaka Keerthisinghe (S’10) received the
dynamic programming algorithm for stochastic control of multidimen- B.E. (Hons.) and M.E. (Hons.) degrees in electrical
sional energy storage problems,” Working Paper, Dept. Oper. Res., and electronic engineering from the University of
Financial Eng., Princeton, NJ, USA, 2013. Auckland, New Zealand, in 2011 and 2012, respec-
[26] D. R. Jiang, T. V. Pham, W. B. Powell, D. F. Salas, and W. R. Scott, tively. He is currently pursuing the Ph.D. degree with
“A comparison of approximate dynamic programming techniques on the Centre for Future Energy Networks, University
benchmark energy storage problems: Does anything work?” in Proc. of Sydney, Australia. His research interests include
IEEE Symp. Adap. Dyn. Program. Reinforcement Learn. (ADPRL), demand response, energy management in residential
Orlando, FL, USA, Dec. 2014, pp. 1–8. buildings, wireless charging of electric vehicles and
[27] R. N. Anderson, A. Boulanger, W. B. Powell, and W. Scott, “Adaptive applying approximate dynamic programming and
stochastic control for the smart grid,” Proc. IEEE, vol. 99, no. 6, optimization techniques in power systems.
pp. 1098–1115, Jun. 2011.
[28] W. B. Powell et al., “SMART: A stochastic multiscale model for the
analysis of energy resources, technology, and policy,” INFORMS J.
Comput., vol. 24, no. 4, pp. 665–682, 2012.
[29] W. B. Powell, Approximate Dynamic Programming: Solving the Curses
of Dimensionality. Hoboken, NJ, USA: Wiley, 2007.
[30] W. B. Powell and H. Topaloglu, Approximate Dynamic Programming
for Large-Scale Resource Allocation Problems. Princeton, NJ, USA:
Gregor Verbič received the B.Sc., M.Sc., and
Princeton Univ. Press, 2005.
Ph.D. degrees in electrical engineering from the
[31] F. Borghesan, R. Vignali, L. Piroddi, M. Prandini, and M. Strelec,
University of Ljubljana, Ljubljana, Slovenia, in
“Approximate dynamic programming-based control of a building cool-
1995, 2000, and 2003, respectively. In 2005, he
ing system with thermal storage,” in Proc. 4th IEEE/PES Innov. Smart
was a North Atlantic Treaty Organization Natural
Grid Technol. Europe (ISGT EUROPE), Oct. 2013, pp. 1–5.
Sciences and Engineering Research Council of
[32] M. Strelec and J. Berka, “Microgrid energy management based on
Canada Post-Doctoral Fellow with the University of
approximate dynamic programming,” in Proc. 4th IEEE/PES Innov.
Waterloo, Waterloo, ON, Canada. Since 2010, he has
Smart Grid Technol. Europe (ISGT EUROPE), Oct. 2013, pp. 1–5.
been with the School of Electrical and Information
[33] P. Samadi, H. Mohsenian-Rad, V. W. S. Wong, and R. Schober, “Real-
Engineering, University of Sydney, Sydney, NSW,
time pricing for demand response based on stochastic approximation,”
Australia. His expertise is in power system oper-
IEEE Trans. Smart Grid, vol. 5, no. 2, pp. 789–798, Mar. 2014.
ation, stability and control, and electricity markets. His current research
[34] E. L. Ratnam, S. R. Weller, C. M. Kellett, and A. T. Murray,
interests include integration of renewable energies into power systems and
“Residential load and rooftop PV generation: An Australian dis-
markets, optimization and control of distributed energy resources, demand
tribution network dataset,” Int. J. Sustain. Energy, pp. 1–20,
response, and energy management in residential buildings. He was a recipient
Oct. 2015. [Online]. Available: http://www.tandfonline.com/doi/abs/
of the IEEE Power and Energy Society Prize Paper Award in 2006. He is an
10.1080/14786451.2015.1100196
Associate Editor of the IEEE T RANSACTIONS ON S MART G RID.
[35] (Jan. 2015). Smart-Grid Smart-City Customer Trial Data.
[Online]. Available: https://data.gov.au/dataset/smart-grid-smart-
city-customer-trial-data
[36] T. Jaakkola, M. I. Jordan, and S. P. Singh, “Convergence of stochastic
iterative dynamic programming algorithms,” Neural Comput., vol. 6,
no. 6, pp. 1185–1201, 1994.
[37] G. A. Godfrey and W. B. Powell, “An adaptive, distribution-free
algorithm for the newsvendor problem with censored demands, with
applications to inventory and distribution,” Manag. Sci., vol. 47, no. 8, Archie C. Chapman (M’14) received the B.A.
pp. 1101–1112, 2001. degree in math and political science and the B.Econ.
[38] N. Sharma, P. Sharma, D. Irwin, and P. Shenoy, “Predicting solar gen- (Hons.) degree from the University of Queensland,
eration from weather forecasts using machine learning,” in Proc. IEEE Brisbane, QLD, Australia, in 2003 and 2004, respec-
Int. Conf. Smart Grid Commun. (SmartGridComm), Brussels, Belgium, tively, and the Ph.D. degree in computer science
Oct. 2011, pp. 528–533. from the University of Southampton, Southampton,
[39] K. Bao, F. Allerding, and H. Schmeck, “User behavior prediction U.K., in 2009. He is currently a Research Fellow
for energy management in smart homes,” in Proc. 8th Int. Conf. in Smart Grids with the School of Electrical and
Fuzzy Syst. Knowl. Disc. (FSKD), vol. 2. Shanghai, China, Jul. 2011, Information Engineering, Centre for Future Energy
pp. 1335–1339. Networks, University of Sydney, Sydney, NSW,
[40] D. Lachut, N. Banerjee, and S. Rollins, “Predictability of energy use in Australia. He has expertise is in game-theoretic and
homes,” in Proc. Int. Green Comput. Conf. (IGCC), Dallas, TX, USA, reinforcement learning techniques for optimization and control in large dis-
Nov. 2014, pp. 1–10. tributed systems. His research focuses on integrating renewables into legacy
[41] “Water heating data collection and analysis,” E3–Equipment power networks, using distributed energy and load scheduling methods, and on
Energy Efficiency, Tech. Rep., 2012. [Online]. Available: http:// designing tariffs and market mechanisms that support efficient use of existing
www.energyefficient.com.au/reports-technical.html infrastructure and new controllable devices.