Predicting The Wind: Data Science in Wind Resource Assessment

P edic i g he Wi d: Da a Scie ce i Wi d Re ce A e me
Se f- d e i
F ia R chec , Da a Scie i
2020-03-11, P Da a San Diego Mee p
So rce Code:
gi h b.com/ rs/predic ing_ he_ ind
I e ded a die ce: Da a anal s s and da a scien is s ho are in eres ed in learning abo ho da a science
echniq es are applied in he ind energ domain.
A e da
Wha i i d e ce a e me ?
H mea e he i d
P edic i g l g- e m i d eed
P edic i g i d bi e e
Wha is Wind Reso rce Assessmen ?
S nd g d, b ...
H m ch e i in he ind?
Will be able ell he gene a ed elec ici a a ?
A e he ne 25 ears?
Wind resource assessment to the rescue!
P edic l g- e beha i f he i d
P edic e f i d bi e
K if ill ake a !
Wind Resource Assessment =
B i di g de f he h ica d i e ci i g
U ce ai i j af e
M de da a cie ce i ea i e e i he e d
Red ce e i i a d g ba a i g!
P edic i g he Wi d: A Da a Scie ce P b e !
Ge i g i d da a
C ea i g i d da a
A a i g i d da a
B i di g a de f he i d
P edic i g he i d a d f he i d fa
Thi i a ica da a cie ce !

G W d Da a: M Ma
Me ma l k like hi :
Mea e he i d a diffe e heigh
Ha e e f i d eed, i d di ec i , em e a e, h midi , a d eci i a i

A a i g Wi d Da a
Le ' ad da a f a e a a d chec !
I [1]: impor a da as d
da a = d. ad_ a ('./da a/ _ a . a ')
da a. d(2). ad()
O [1]: d_30 d_45 d_58 di

i e
1999-12-31 16:00:00 3.39 3.73 3.84 12.37 243.68
1999-12-31 16:10:00 3.27 3.65 3.81 12.22 250.02
1999-12-31 16:20:00 3.31 3.63 3.80 12.12 252.29
1999-12-31 16:30:00 3.78 4.26 4.38 12.04 249.54
1999-12-31 16:40:00 3.96 4.38 4.52 11.99 254.50
O e a da a h d eed / a 30, 45, a d 58 he gh , e ea e C, a d d d ec

, 10- e e a .
(The e da a a e a a ca a d I ge e a ed he h eb .)
Le ' ge a feel f he i d da a b l i g he !
In [3]: # Pl i d eed ime e ie
impor bright ind as b
anemometers = ['spd_30','spd_45', 'spd_58']

b .plot_timeseries(data[anemometers])
O t[3]:
... , ha l k e e !
Le ' a i g e da ee e de ai !
In [4]: b .plo _ imeseries(da a.loc['2019-03-11',anemome ers])
O [4]:
Ob e a i : Wi d eed a ie a h gh he da . Highe heigh ea highe i d eed.

Which di ec i n d e he ind c me f m? Le ' l af e enc e !
I [5]: . _ a ( a a[' _30'], a a[' '])
O [5]:
Wind Data Analysis Takeaways
Wi d da a = g i e e ie
Wi d da a e
The e a e d ai - eci c c ea de e i d da a (e.g. f e e c e )
F om Mea emen o P edic ion: B ilding a Model
We ant to predict ho much energ a ind farm ill likel produce in 25 ears of operation.
P ble :
Onl 2 ears of met mast ind measurements to predict 25 ears of ind
With that little data e reall don't kno enough about ho the ind ill beha e!
S l i :
Get more, longer-term data from other sources, co ering as much time as possible
Ra i ale:
The more e kno about the past, the better e can predict the future.
Sec ion O e ie
H a d he e ge el g- e da a
B ild a i le del edic i d eed
U ea e ad a ced del f ciki -lea edic i d eed
I e del i h i de e g d ai k ledge
I e iga e h c ea dc a e i d eed del
Ge ing More, Long-Term Da a
P a da a ce : G ba c ae de , ea e e b d a e
C e e Sa db (b e): 4 c ae de d ( a e), 3 a ea e e a ( e)
Le ' ad c i a e de (ERA5) a d ai ai da a!
I [6]: f om a b impo Pa
impo a da a d
_da a = ' a5_0': ' a5_0. a ',

' a5_1': ' a5_1. a ',
'MYF': 'MYF_200001010000_202003070000. a ',
'NKX': 'NKX_200001010000_202003070000. a ',
'SAN': 'SAN_200001010000_202003070000. a '
fo a , in _da a. ():
_da a[ a ] = d. ad_ a (Pa ('./da a/'). a ( ))
_da a[' a5_0']. ad()
O [6]: pd di mp
ime
1999-12-31 16:00:00 4.548827 249.010856 12.370013
1999-12-31 17:00:00 3.981960 251.738388 11.915547
1999-12-31 18:00:00 2.607753 249.791620 11.245783
1999-12-31 19:00:00 1.559933 233.880179 9.782310
1999-12-31 20:00:00 1.359054 231.334928 10.454369
Ci a e de ica d i d eed a g d e e a d ha e 1-h e i .
(Thi eb h h Id aded ERA5 da a a d hi eb h h Id aded ai

ai da a.)
From Short-Term to Long-Term Data
Our challenges:
Airport measurements are taken and climate models are calculated far from our site (the Sandbo ).
The have measurements in greater intervals than our met mast.
How much can we trust them to tell us about the wind characteristics at the Sandbo ?
Le ' plo me ma ind peed da a again he da a from he ERA5 clima e model!
I [7]: l _da a = d.c ca ([da a[' d_58']. e am le('1W').mea (), l _da a['e a5_0'][' d']. e am le('1W').mea ()], a i =1)
l _da a.c l m = ['Mea eme ', 'LT Refe e ce']
b . l _ ime e ie ( l _da a, da e_f m='2017')
O [7]:
Good ne : De pi e being ph icall far a a , here eem o be grea imilari ie be een he clima e model
and he me ma da a. (Thi i no al a he ca e, b i i here o make hi orial f n and ea .)
P blem: Ho do e e ploit the similarities bet een references and met mast data to get an idea of hat
our met mast data ould ha e looked like if e had measured for 25 ears?
S l i n:
1. Build model describing relationship bet een measurement and references

2. Let model predict hat long-term measurement ould look like
Let's build a model!

A Sim le M del: O h g nal Lea S a e
O hogonal lea a e : D a a be - line be een all ime amp-poin of efe ence and me ma ind
peed ha minimi e he o hogonal di ance be een line and ime amp-poin
I [8]: from b d.a a e.c e a import O a Lea S a e
# Re a le dail
da a_1D = da a. e a e('1D'). ea ()
_da a_1D = _da a['e a5_0']. e a e('1D'). ea ()
_ de = O a Lea S a e ( ef_ d= _da a_1D[' d'],

a e _ d=da a_1D[' d_58'],
a e a _ d='1D')
_ de . ()
'N da a ': 761,

' ff e ': 0.48915546304725566,
' 2': 0.8682745042710436,
' e': 0.7728460999482368
I look like e ha e a good amo n of da a (7k+ poin ) and a e pec able of 0.89. Le ' plo o line of
be !
In [9]: l _m del. l ()
O [9]:
There is some sca er b model s he da a q i e ell.

P blem:
To make en e of he model in e m of ho ell i can p edic ind peed , e an o ei o

p edic he ind peed fo he ime pe iod hen e ha e me ma mea emen and hen compa e
he e mea emen o he model' p edic ion .
Fo hi p po e, a e o me ic i inapp op ia e - i ell no hing abo ind peed !
(In e p e a ion of i al o a he ick fo o hogonal lea a e eg e ion in gene al)
S l i n:
U e RMSE ( oo mean a e e o ) of p edic ed ind peed . ac al mea ed ind peed a e o me ic!
I [10]: # Define co ing me ic: RMSE
import as
def ( d c , ac a ):
return . ((( d c -ac a )**2). a ())
a _ d c =
a _ c =
Le ' c e he im le h g nal lea a e m del ing RMSE!
In [11]: edic ion = (ol _model. a am [' lo e']*l _da a_1D[' d']+ol _model. a am ['off e '])
all_ edic ion [' im le'] = edic ion

all_ co e [' im le'] = m e( edic ion, da a_1D[' d_58'])
in ('RMSE of im le model: :.3f '.fo ma (all_ co e [' im le']))

in ('RMSE a % of ind eed mean: :.0f %'.fo ma (all_ co e [' im le']/da a_1D[' d_58'].mean()*100))
RMSE of im le model: 0.315

RMSE a % of ind eed mean: 11%
The RMSE i 11% f he ind eed. Thi i n eall a g d n mbe . If e ld b ild he jec a ming
11% fa e ind han e ld ac all ha e, e ld ha e made a e e en i e mi ake.
S :H can e im e he m del?
Le ' lea n ab Ph ic !
Topography vs. Wind
I ag e a dg e e c e ef a a e ca e.
A g e e ah f ef gh , acce e a e a d he a d dece e a e a d he
b ( e ca e bec e d ag a ).
P fh ea e e a e ce e ec ed d eed .
Ele a ion Map of A ea A o nd he Sandbo
We mmi ed he cale, b i l k a if he e a e me ele a i diffe e ce a d he Sa db .

Ho can e ake ad an age of he opog aphic info ma ion?
Le ' b d g de f d ffe e d d ec !
Ra a e: We a d ha he e e a d eed be ee a a d efe e ce e
d ec ha he .
A Be e M del: Binned O h g nal Lea S a e
O r simple model j st binned in 12 direction sectors.
Q ick remark: We se SciP for b ilding o r model here instead of Bright ind, since I ant to sho o that
o can e en anal e ind data ith more common data science tools.
In [12]: # B da a b d ec
dir_bin = pd.In er alInde .from_break (np.lin pace(0,360,13))
da a_1D = da a.re ample('1D').mean()

l _da a_1D = l _da a['era5_0'].re ample('1D').mean()
l _da a_1D['dir_bin'] = pd.c (l _da a_1D['dir'], dir_bin )

da a_1D['dir_bin'] = pd.c (da a_1D['dir'], dir_bin )
I [13]: # B ild binned or hogonal lea q are model
from c . d import ODR, M de , Da a
def e _da a_ _d _b (d _b ):
_da a_ _b = _da a_1D[' d'][ _da a_1D['d _b '] == d _b ]
da a_ _b = da a_1D[' d_58'][da a_1D['d _b '] == d _b ]
ret rn _da a_ _b , da a_ _b
def de _fc (B, ):

ret rn B[0]* +B[1]
b _ a =
for b _ , d _b in e e a e(d _b ):
b _ a [b _ ] = ' _ a e ': None, 'be a ': None, ' e': . a , ' ed c ': None
_da a_ _b , da a_ _b = e _da a_ _d _b (d _b )
c c e _ = ( e ( _da a_ _b . de ). e ec ( e (da a_ _b . de )))

b _ a [b _ ][' _ a e '] = e (c c e _ )
if not c c e _ :
contin e
e = ODR(Da a( _da a_ _b [c c e _ ]. a e , da a_ _b [c c e _ ]. a e ), M de ( de _fc ),

be a0=[1., 0.5]). ()
b _ a [b _ ][' ed c '] = de _fc ( e .be a, _da a_ _b )

b _ a [b _ ]['be a '] = e .be a
b _ a [b _ ][' e'] = e(b _ a [b _ ][' ed c '], da a_ _b )
a _ ed c ['b ed'] = d.c ca ([b _ a [' ed c '] for b _ a in b _ a . a e ()\

if b _ a [' ed c '] is not None]). _ de ()
a _ c e ['b ed'] = . a ea ([b _ a [' e'] for b _ a in b _ a . a e ()])
('RMSE f b ed de : :.3f '.f a (a _ c e ['b ed']))
RMSE f b ed de : 0.289
Score o far
RMSE of simple model: 0.315

RMSE of binned model: 0.289
Our binned model performs better already!
A More Ad anced Model: RandomFore Regre or
E pec a ion: Random forest regressor captures nuances in relationship between reference and
measurement better than other models
Backgro nd: A random forest generates an estimate from multiple decision trees. Each tree only gets
trained on a subset of samples and features. This means that each tree has some "specialized" knowledge
about the data and can see a particular aspect of the data better than other trees. By combining the
predictions of all trees, a random forest can provide a relatively robust prediction.
Approach:
Engineer features that model can pick up on

Run the model
Fea e E gi ee i g
Let's come up ith some (simple) features that the random forest can feed on.
In [14]: = da a_1D[' d_58']
# Ge c c e ime e
conc_inde = o ed(li ( e ( .inde ).in e ec ion(l _da a_1D.inde )))
X = l _da a_1D.loc[conc_inde ,[' d', 'di ', ' m ']]. o _inde ()
# R lli g mea
def make_ olling(da a, indo _ id h):
olling = da a[' d']. olling( indo _ id h).mean()
olling = olling.fillna( olling.mean())
olling.name = ' d_ olling_ '.fo ma ( indo _ id h)
return d.conca ([da a, olling], a i =1)
X = make_ olling(X, 3)
X = make_ olling(X, 5)
# Da e-ba ed fea e ca e em al a e
X.loc[conc_inde ,'d'] = X.inde .da
X.loc[conc_inde ,'m'] = X.inde .mon h
X.head(3)
O [14]: d di d_ i g_3 d_ i g_5 d

i e
2018-02-01 1.455543 173.896001 15.397280 3.040377 3.040787 1.0 2.0
2018-02-02 1.528880 168.001625 16.170596 3.040377 3.040787 2.0 2.0
2018-02-03 2.025680 251.970139 16.567941 1.670035 3.040787 3.0 2.0
No that e ha e data ith features, let's build 2 models: One model for hich e ill ithhold some
alidation data and, for comparison ith the pre ious models, one model that uses all data.
I [15]: f om k ea .e e b e impo Ra d F e Reg e

f om k ea . de _ e ec i impo ai _ e _ i
X_ ai , X_ a , _ ai , _ a = ai _ e _ i (X, , e _ i e=0.2, a d _ a e=42)

f_ de = Ra d F e Reg e ( _e i a =100, b_ c e=T e, a d _ a e=100, i _ a e _ eaf=10)
f_ de .fi (X_ ai , _ ai )
i ('RMSE (RF de , ai i g e ): :.3f '.f a ( e( f_ de . edic (X_ ai ), _ ai )))

i ('RMSE (RF de , a ida i e ): :.3f '.f a ( e( f_ de . edic (X_ a ), _ a )))
de = Ra d F e Reg e ( _e i a =100, b_ c e=T e, a d _ a e=100, i _ a e _ eaf=10)

de .fi (X, )
edic i = de . edic (X)

a _ edic i ['f e '] = d.Se ie ( edic i ,i de =c c_i de ). _i de ()
a _ c e ['f e '] = e( edic i , )
i ('RMSE (RF de , a da a): :.3f '.f a (a _ c e ['f e ']))
RMSE (RF de , ai i g e ): 0.303

RMSE (RF de , a ida i e ): 0.330
RMSE (RF de , a da a): 0.300
The RMSE is not too high and de nitel ithin the range of the other models. Let's compare the models in
more detail.
Com a ing he 3 Wind S eed Model
I [16]: from a ib impor a
d.Da aF a e(a _ edic i ). c['2018-04':'2018-06',:]. (fig i e=(25,5))

da a_1D[' d_58']. c['2018-04':'2018-06']. (c=' ', =2, abe ='Ta ge Ma ')
. ege d()
. h ()
All 3 models follo he arge mas nicel . The fores model some imes cap res peaks be er han o her
models (J n 18, Ma 6), b also occasionall has bigger misses (Apr 12, Ma 30).
Time for a direct score comparison! Remember: The lo er the RMSE, the better the model.
I [17]: .S (a _ c ). .ba ();
Despite all the fanciness of the random forest model, it does not reach the score of our binned model. This
being said an RMSE of 0.3 is still relativel high hen measured in terms of ind speed.
An In-De h L k a he Binned M del
W e eb eb ed de , e d d e da e ec . Le ' d a . We a
de a d b de . F , e ' d a e be f a e de ed e b .
In [18]: bin_ ise_params = pd.Series([stat['n_samples'] f stat in bin_stats.values()], inde =dir_bins)

bin_ ise_params.plot.bar(figsi e=(15,5))
plt.grid()
6b a e de 25 a e .F ed c e eb , e de bab e e ab e.
I ld be be e e im le m del f he e ca e , ince e kn ha i ha been ained n a l f
da a and e f m ea nabl ell.
I [19]: be a = []
fo be a, alid_bi ed, bi _ in i ([ a ['be a '] fo a in bi _ a . al e ()],

[ a [' _ am le ']>=25 fo a in bi _ a . al e ()],
di _bi ):
if no alid_bi ed:
be a .a e d( .a a a ([ l _m del. a am [' l e'], l _m del. a am [' ff e ']]))
i (' : Sim le m del'.f ma (bi _))
el e:
be a .a e d(be a)
i (' : Bi - i e m del'.f ma (bi _))
(0.0, 30.0]: Sim le m del

(30.0, 60.0]: Sim le m del
(60.0, 90.0]: Sim le m del
(90.0, 120.0]: Sim le m del
(120.0, 150.0]: Sim le m del
(150.0, 180.0]: Bi - i e m del
(180.0, 210.0]: Bi - i e m del
(210.0, 240.0]: Bi - i e m del
(240.0, 270.0]: Bi - i e m del
(270.0, 300.0]: Bi - i e m del
(300.0, 330.0]: Sim le m del
(330.0, 360.0]: Sim le m del
No that e ha e a more rob st model, let's re-predict o r time series and check o r RMSE metric!
I [20]: b _ e = []
b _ ed c = []
f d _b , be a in (d _b , be a ):
_da a_ _b = _da a_1D[' d'][ _da a_1D['d _b '] == d _b ]
da a_ _b = da a_1D[' d_58'][ _da a_1D['d _b '] == d _b ]
b _ ed c = de _ c (be a, _da a_ _b )
b _ ed c .a e d(b _ ed c )
b _ e = e(b _ ed c , da a_ _b )
b _ e .a e d(b _ e)
a _ ed c ['b ed_ _ e'] = d.c ca (b _ ed c ). _ de ()

a _ c e ['b ed_ _ e'] = . a ea (b _ e )
('RMSE b ed - e de : :.3 '. a (a _ c e ['b ed_ _ e']))
RMSE b ed - e de : 0.403
I [21]: d.Se e (a _ c e ). _ a e (). d(3)
O [21]: b ed 0.289
e 0.300
e 0.315
b ed_ _ e 0.403
d e: a 64
It looks as if lling the gaps in the binned model had a er bad impact on o r RMSE, making the ne model
the orst-performing one. What is going on here?
We sho ld look at the RMSE per bin to get a better pict re of ho hich binned model is to blame for the
increase in error.
In [22]: pd.DataFrame( 'rmse': bin_rmses, 'n_samples': [stat['n_samples'] for stat in bin_stats. al es()] ,
inde =dir_bins).plot.bar(s bplots=Tr e,figsi e=(15,5));
We are getting the highest errors in the bins here e inserted the simple model.
B t, if there is no ind in those bins (= directions), the errors in those bins are not important!
What e reall need is a bin- eighted scoring metric!
I ed Sc i g Me ic: Bi -Weigh ed RMSE
Fi , e i ca c a e he eigh e gi e each bi . Thi i i a he be f a e i
each bi .
In [23]: bin_n_samples = [s a ['n_samples'] f s a i bin_s a s. al es()]

bin_ eigh s = pd.Series(bin_n_samples, inde =dir_bins)/np.s m(bin_n_samples)
N e ca b i d c i gf ci !
In [24]: def rmse_binned(predic ion_spd, reference_spd, reference_dir):

sqr_errors = (predic ion_spd-reference_spd)**2
eigh s = bin_ eigh s[reference_dir]
error = np.sqr (np.nanmean(sqr_errors. al es* eigh s. al es))
e error
Wi h he c i g f ci de be , e ' ca c a e he bi - eigh ed RMSE f a f de '
edic i .
I [25]: a _ c e _b ed =
f de _ a e, ed c _ d in a _ ed c . e ():
da a_1D[' d_58'][da a_1D['d _b '] == d _b ]
a _ c e _b ed[ de _ a e] = e_b ed( ed c _ d, _da a_1D[' d'], _da a_1D['d '])
d.Da aF a e([a _ c e , a _ c e _b ed]).T. e a e(c = 0: 'U e ed', 1: 'B -We ed' )\

. _ a e ('B -We ed'). e.ba ( =0, a =0.5, c =' b e').f a (' :.4f ')
O [25]: U e g ed B -We g ed
b ed_ _ e 0.4026 0.1328
b ed 0.2888 0.1347
e 0.3152 0.1416
f e 0.2996 0.1807
The bi - eigh ed RMSE h c ea e diffe e ce be ee he de ha he eigh ed c e:
O bi ed de ha i i g he i e de ' i f ai i ga c e be
O i e de e f e ha he bi ed de
O f e de e f
S , af e a , e d e b b ed de ed c g- e d eed a a !
P edic ing he Long-Te m Wind S eed
No e can predict the long-term ind speed at our site. This helps us to get a good idea of ho much ind
energ e can potentiall harvest if e build a ind turbine there.
Remember: We assume the ind in the future ill behave like the ind in the past. To get to the long-term
ind speed, e take all predictions from our bin_fill_ im le model. To ma imi e accurac , e
substitute it ith actual measurements from our mast herever possible.
I [26]: _ _a _ a = a _ ['b _ _ ']

_ _a _ a [ a a_1D[' _58']. ] = a a_1D[' _58']
('R a - a a a a ( :.1% ).'\

. a ( a a_1D. a [0], a a_1D. a [0]/ _ _a _ a . a [0]))
R a 761 - a a a a (10.3%).
Let's summari e our long-term ind speed time series!
I [27]: ('S a : :%Y-% -% , L : :.0 a , A . :.2 / '\

. a ( _ _a _ a . [0], _ _a _ a . a [0]/(365.25), _ _a _ a . a ()))
S a : 1999-12-31, L : 20 a , A . 2.86 /
A fe o d of ca ion abo he long- e m ind eed ime e ie
W d eed a a a e he da . The ef e, d b e d ce e e g
h a f ac f ha 24 h e d.
We d ced a dail e e e ha d e e ec he e d eed cha ge d g he da ,

beca e e a e aged h d eed e ha 24 h e d.
A a e , e ge ea ce e g d c be h h e e e .
Ac a d e ce a e e ch ec ca ed ha h he e.
F m Wind S eed Mea emen P edic i n: Takea a
I i ea b i da i e i d eed edic i de
M e ad a ced de e f a a be e
D ai k edge ca he a ih di g he igh de
...a d i h a e i g de e f a ce!
From Mas Heigh o T rbine Heigh
We have: Long-term wind speed time series at 58 m height (height of the wind speed sensor on the
mast)
We want: Long-term wind speed time series at height of turbine
We need:
Information about turbine height
Information about how wind speed behaves with height
In this section: Ver short and simpli ed version.

T bi e Heigh
We arbi raril choose Ves as V112 as rbine model.
H b heigh : 119 m (h b: "nose" of he rbine aro nd hich blades ro a e)
T rbine heigh : H b heigh + ele a ion of land ha rbine s ands on (ass me 100 m for all rbines)
Mas heigh : Mas heigh + ele a ion of land ha mas s ands on (ass me 80 m)
Shea : Beha io of Wind Speed i h Heigh
We e he ind o le o e la o cale ma ind eed o bine heigh :
Shea e onen :
Le ' nd he hear e ponen b ing brigh ind' S ea .A e age() me hod.
In [28]: anem me e _heigh = [30, 45, 58]

a e age_ hea = b .Shea .A e age(da a[anem me e ], anem me e _heigh )
in ('Shea e nen : :.3 '.f ma (a e age_ hea .al ha))
Shea e nen : 0.19

Appl Shea La o Long-Te m Time Se ie
N ha e kn he hea e nen , e can calc la e he l ng- e m ime e ie a bine heigh .
In [29]: h_ bine 100.0 + 119.0

h_ma 80.0 + 58.0
l _ eed_a _ bine l _ eed_a _ma *(h_ bine/h_ma )**a e age_ hea .al ha
in ('On a e age, he ind i :.1% fa e a he bine heigh han a he mea ed heigh .' \
.fo ma (l _ eed_a _ bine.mean()/l _ eed_a _ma .mean()-1))
On a e age, he ind i 9.1% fa e a he bine heigh han a he mea ed heigh .

Predicting Wind T rbine Po er O tp t
We ha e c me a l g a . N ha e ha e a l g- e m i d eed ime e ie a bi e heigh , e a e
ead edic bi e e .
Po er Curve
Po er C r e: T rbine po er o tp t as f nction of ind speed. Let's plot the V112 po er c r e!
I [30]: e _c e = d. ead_c ('./da a/ e a _ 112_ e _c e.c ', i de _c =0).i c[:,0]

e _c e.i de . a e = 'Wi d S eed [ / ]'
e _c e. ( i e='Ve a V112 P e C e (P e O i kW)', fig i e=(15,5), i =(0,3500));
Obser ations: T rbine onl starts to prod ce po er at abo t 2 m/s ind speed, po er o tp t is stead
bet een ca. 12 and 25 m/s.
Po er C r e s. Predicted Wind Speed
No that e kno ho much po er the V112 turbine produces b ind speed, let's see ho our predicted
long-term ind speed ts into the picture.
I [53]: e _c e. ( e='Ve a V112 P e C e (P e O W) . L -Te W d S eed')

_ eed_a _ b e. . ( ec da _ =Tr e, a a=0.5, b =20 , f e=(15,5), abe =' - e ');
Obser ations: Long-term ind speed is er small in comparison to hat the turbine can handle, the V112
turbine is completel o ersi ed for this site!
Calculating Po er Output
De i e he V112 being e i ed, le ' la a nd i h he ene g d c i n n mbe e ld ge if
e e e b ild hi bine. We an ge a "feel" f he e and i in he c n e f he
c mm ni a nd he Sandb , i e.
I [78]: _ e _ . e ( _ eed_a _ b e, e _c e. de , e _c e. a e , ef 0, 0)
_ e _ d.Se e ( _ e _ , de _ eed_a _ b e. de )
e_ e e _d a _ ea _ e _ . a e[0]/(365.25)
_ e _ ea _ e _ . ()/ e_ e e _d a _ ea
('Mea b e e ea W : :,.0f '.f a ( _ e _ ea ))
Mea b e e ea W : 24,925
Alm 25 MWh! I ha a l ? I ha a li le? Le ' e e hi n mbe in he e m :
I [79]: (' T a ab e a e
da f 1 ea : :.0f '.f a ( _ e _ ea /(3.5/60*1.2)/365.25))
(' F Te a M de S c a e e ea : :.0f '.f a ( _ e _ ea /100))
T a ab e a e da f 1 ea : 975
F Te a M de S c a e e ea : 249
Tha i a l f a and a g d am n f elec ic ca cha ge !

Ho Man Ho seholds Co ld We Po er?
2017: T e ea Sa D e e dc ed 5600 W f e ec c ( ce: SDGE a E
P ec ).
T e Sa db ZIP c de (92121) ad 1677 e d 2010 ( ce: -c de .c ).
I [97]: e d _ e _ b e _ e _ ea /5600
c _ _92121_ e _ b e e d _ e _ b e/1677
('W e bad - aced b e, e c d e :.1 Sa D e e d ( :.1% a a d e Sa db ).'\

. a ( e d _ e _ b e, c _ _92121_ e _ b e))
('W 377 bad - aced b e , e c d e :.0 Sa D e e d ( :.1% a a d e Sa db ).'\
. a ( e d _ e _ b e*377, c _ _92121_ e _ b e*377))
W e bad - aced b e, e c d e 4.5 Sa D e e d (0.3% a a d e Sa db ).

W 377 bad - aced b e , e c d e 1678 Sa D e e d (100.1% a a d e Sa db ).
We , a e a d eadf ce a . B , ad , d ea e e a ea c a .T ea : We
d ' ea a b e d a e e Sa D e e d .
C ea , e ea c a da a, b d a d b ec e e Sa db d e a e e e.
N Ca ac Fac (NCF)
E e mea eh ell a bine a ind eed di ib i n and elec ici g id en i nmen in e m
f ne ca aci fac (NCF).
Thi me ic de c ibe h m ch elec ici he bine ill gene a e f m he ac al ind en i nmen , in

c m a i n h m ch i c ld he e icall gene a e, if he ind ble en gh make he bine
gene a e i ma im m e all he ime.
In [100]: ncf output_per_ ear/(365.25*po er_cur e.ma ())

print('The net capacit factor is :.1% .'.format(ncf))
The net capacit factor is 2.2%.
Thi ne ca aci fac i eall , eall l ! (T ical NCF : 30% - 50%)

We c ld lace hi bine in a be e !
N b d ld b ild a bine cl e he Sandb (gi en a i cal da a)!
How "Valuable" Would our Power be?
Challenge: Rene able energ is not (al a s) produced hen needed
Selling energ in high-demand hours can be more pro table s. in lo -demand hours
Blue: Demand / Orange: Demand minus solar and ind (Sell energ hen this alue is high at slightl
cheaper prices than fossil fuel po er plants to make good pro t.)
Let's plot our diurnal pro le to see if we would produce a good amount of electricty during these pro table
hours.
I [108]: mea ed_da a = da a[' d_58']. e am le('1h').mea ()

mea ed_da a.g b (mea ed_da a.i de .h ).mea ()\
. l (fig i e=(15,5), i le='Di al P e P d c i P file', lim=(0,23));
Unfortunately, it looks as if we produce power right when a lot of solar power is in the grid, pushing
electricity prices down. Not every wind project is like this – sometimes wind speeds are high just as energy
demand peaks.
...b ha o ld be a good po for a ind rbine?
Al h gh Sa Dieg bei g e f he , he e a e eg d f i d bi e i Calif ia.
Refe e ce
( le ed ab e)
Analy ing Wind Data
M : .
Getting Wind Data: Met Masts
"W , M ,S D " L P CC B -SA 2.0

" 3 " CC B -SA 2.0
Getting More, Long-Term Data
ASOS : .
Topography vs. Wind
F : :// . . / ? =-GIT N - 4M
E : :// . . / / .
Power Curve
V V112 P C : . - - . / /7- - 112-

R c (c )
Ca c a P O
Toas er po er cons mp ion: energ secalc la or.com, ass med 1200 W for 3.5 min es
Tesla Model S 100 kWh ba er : Wikipedia
H "Va ab " W P b ?
D ck c r e image from Wikimedia Commons
...b a b a a b ?
California ind map from inde change.energ .go

Predicting The Wind: Data Science in Wind Resource Assessment

Uploaded by

Document Information

Original Title

Copyright

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Predicting The Wind: Data Science in Wind Resource Assessment

Uploaded by

Copyright:

P edic i g he Wi d: Da a Scie ce i Wi d Re ce A e me

2020-03-11, P Da a San Diego Mee p

gi h b.com/ rs/predic ing_ he_ ind

Will be able ell he gene a ed elec ici a a ?

Thi i a ica da a cie ce !

Mea e he i d a diffe e heigh

Ha e e f i d eed, i d di ec i , em e a e, h midi , a d eci i a i

da a = d. ad_ a ('./da a/ _ a . a ')

O [1]: d_30 d_45 d_58 di

O e a da a h d eed / a 30, 45, a d 58 he gh , e ea e C, a d d d ec

In [3]: # Pl i d eed ime e ie

impor bright ind as b

anemometers = ['spd_30','spd_45', 'spd_58']

In [4]: b .plo _ imeseries(da a.loc['2019-03-11',anemome ers])

Ob e a i : Wi d eed a ie a h gh he da . Highe heigh ea highe i d eed.

I [5]: . _ a ( a a[' _30'], a a[' '])

_da a = ' a5_0': ' a5_0. a ',

_da a[' a5_0']. ad()

Ci a e de ica d i d eed a g d e e a d ha e 1-h e i .

(Thi eb h h Id aded ERA5 da a a d hi eb h h Id aded ai

b . l _ ime e ie ( l _da a, da e_f m='2017')

1. Build model describing relationship bet een measurement and references

Let's build a model!

I [8]: from b d.a a e.c e a import O a Lea S a e

_ de = O a Lea S a e ( ef_ d= _da a_1D[' d'],

'N da a ': 761,

There is some sca er b model s he da a q i e ell.

To make en e of he model in e m of ho ell i can p edic ind peed , e an o ei o

U e RMSE ( oo mean a e e o ) of p edic ed ind peed . ac al mea ed ind peed a e o me ic!

I [10]: # Define co ing me ic: RMSE

all_ edic ion [' im le'] = edic ion

in ('RMSE of im le model: :.3f '.fo ma (all_ co e [' im le']))

RMSE of im le model: 0.315

We mmi ed he cale, b i l k a if he e a e me ele a i diffe e ce a d he Sa db .

dir_bin = pd.In er alInde .from_break (np.lin pace(0,360,13))

da a_1D = da a.re ample('1D').mean()

l _da a_1D['dir_bin'] = pd.c (l _da a_1D['dir'], dir_bin )

from c . d import ODR, M de , Da a

def de _fc (B, ):

c c e _ = ( e ( _da a_ _b . de ). e ec ( e (da a_ _b . de )))

e = ODR(Da a( _da a_ _b [c c e _ ]. a e , da a_ _b [c c e _ ]. a e ), M de ( de _fc ),

b _ a [b _ ][' ed c '] = de _fc ( e .be a, _da a_ _b )

a _ ed c ['b ed'] = d.c ca ([b _ a [' ed c '] for b _ a in b _ a . a e ()\

('RMSE f b ed de : :.3f '.f a (a _ c e ['b ed']))

RMSE of simple model: 0.315

Engineer features that model can pick up on

In [14]: = da a_1D[' d_58']

X = l _da a_1D.loc[conc_inde ,[' d', 'di ', ' m ']]. o _inde ()

O [14]: d di d_ i g_3 d_ i g_5 d

I [15]: f om k ea .e e b e impo Ra d F e Reg e

X_ ai , X_ a , _ ai , _ a = ai _ e _ i (X, , e _ i e=0.2, a d _ a e=42)

i ('RMSE (RF de , ai i g e ): :.3f '.f a ( e( f_ de . edic (X_ ai ), _ ai )))

de = Ra d F e Reg e ( _e i a =100, b_ c e=T e, a d _ a e=100, i _ a e _ eaf=10)

edic i = de . edic (X)

i ('RMSE (RF de , a da a): :.3f '.f a (a _ c e ['f e ']))

RMSE (RF de , ai i g e ): 0.303

d.Da aF a e(a _ edic i ). c['2018-04':'2018-06',:]. (fig i e=(25,5))

I [17]: .S (a _ c ). .ba ();

In [18]: bin_ ise_params = pd.Series([stat['n_samples'] f stat in bin_stats.values()], inde =dir_bins)

fo be a, alid_bi ed, bi _ in i ([ a ['be a '] fo a in bi _ a . al e ()],

(0.0, 30.0]: Sim le m del

a _ ed c ['b ed_ _ e'] = d.c ca (b _ ed c ). _ de ()

('RMSE b ed - e de : :.3 '. a (a _ c e ['b ed_ _ e']))

I [21]: d.Se e (a _ c e ). _ a e (). d(3)

In [23]: bin_n_samples = [s a ['n_samples'] f s a i bin_s a s. al es()]

In [24]: def rmse_binned(predic ion_spd, reference_spd, reference_dir):