
Chapter 2

Functional Programming in Python

If you’ve reached this point, you should be familiar with the basics of programming in Python,
in either an interpreter setting or in a .py script which you then run. Within these scripts, you
can write multiple lines of code which set variables, manipulate them, and ultimately print or
otherwise save values of interest.

The next step forward in the abstraction of how your code will be structured is known as
functional programming. This refers to the collection of code into specific, self-defined
functions, with bite-sized tasks that you can verify are working properly (unit testing) and
which allow for easy debugging. Later, we will discuss an object-oriented approach to Python
development through the use of classes, but even within these later frameworks, the
fundamental unit of code is still the function (though it will go by the name method in the
OOP style).

You already have plenty of experience using functions. When we import libraries like numpy or
matplotlib, we are importing functions and classes other people have written that are stored
in those libraries. When I call

import numpy as np

x = np.linspace(0,2*np.pi,100)
my_wave = np.sin(x)

both np.linspace() and np.sin() are functions that someone on the numpy team wrote
down and placed in the library. We know how to use these functions even though we didn’t
write them because we read the documentation, i.e.,

help(np.zeros)
Help on built-in function zeros in module numpy:

zeros(...)
    zeros(shape, dtype=float, order='C')

    Return a new array of given shape and type, filled with zeros.

    Parameters
    ----------
    shape : int or tuple of ints
        Shape of the new array, e.g., ``(2, 3)`` or ``2``.
    dtype : data-type, optional
        The desired data-type for the array, e.g., `numpy.int8`. Default is
        `numpy.float64`.
    order : {'C', 'F'}, optional, default: 'C'
        Whether to store multi-dimensional data in row-major
        (C-style) or column-major (Fortran-style) order in
        memory.

    Returns
    -------
    out : ndarray
        Array of zeros with the given shape, dtype, and order.

The help function above spits out a lot of text, but the most relevant bit is at the top: the
documentation shows us what the function does and outputs, what inputs it needs (and
in which order and what data type these inputs should be), and some examples of its use.

In this section, we will be talking all about functions: how to define them, how to document
them, best practices in implementing them, and more. So let’s dive in!

2.1 Defining Functions

Defining functions in Python is easy. The absolute simplest function I could define would look
like this:

def my_function():
    pass

In the above example, we see that the special word def tells Python we are defining a
function. We give it a name (my_function in this case), followed by parentheses containing
arguments (I have not supplied any). We end with a colon, the same way we would for a loop
or conditional statement. Then, all code associated with my function gets indented (again, just
like a loop or conditional statement).

You’ll notice I’ve put the word pass into my function. This useful word is special in Python
(don’t use it as a variable name) and it tells the interpreter to just keep on walkin’, nothing to
see here. It’s a perfect thing to add when you want to define a function (to remind yourself you
need it) but don’t want to add any code to it yet.

But that’s boring! Let’s add some code now. A common task in astronomical Python is to load
a series of fits image files in a directory on our computer and stack up all the images into
some storage container (like a multidimensional numpy array), as well as “extra” data from the
fits headers like object name, exposure time, etc. Thankfully, the astropy library has a
helpful function for reading in data.

However, the astropy functions we’ll use are designed for general use. For our specific
program, we might want to make it easy to load up a whole directory exactly how we know we
want to, in only a single line of code.

In short, we want what’s called a wrapper: a function that combines several other function
calls in a useful way (for us at least).

I’ll start by defining the base shell of the function:

def load_directory_images(path):
    pass

I’ve given the function a recognizable name that tells us what the function does (loads images
in a directory), and I’ve specified the first argument to the function. Arguments are, much like
in a mathematical function like sine, what gets fed into the function in order to facilitate the
calculation. In this case, that’s a path (string) representing the location of the directory on our
hard drive.

Important note! When we set the name “path”, this is the variable name we’ll use inside our
function. It is only internal to our function, and users can supply any variable they want (of
any name) to our function. Whatever is in slot number 1 of the argument list will be
assigned the temporary variable “path” while it’s inside the function.
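
To make the note above concrete, here is a minimal sketch. The function describe_path and the variable my_favorite_folder are hypothetical names invented for illustration; the only point is that the caller's variable name and the function's internal name need not match.

```python
def describe_path(path):
    # Inside the function, the first argument is always called `path`,
    # no matter what the caller named their own variable.
    return 'Looking in: ' + path

# Hypothetical variable name and location, chosen only for this example.
my_favorite_folder = '/some/data/'
message = describe_path(my_favorite_folder)
print(message)  # Looking in: /some/data/
```

The caller never has to know (or care) that the function calls this value path internally.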

2.1.1 Writing Documentation

When we printed out the documentation above for the numpy function, there was a lot
there. When writing functions for our own code, we generally do not need to be that intense
(documentation depth should scale with how many people will use a given code; numpy is
used by hundreds of thousands of users, so it needs rock-solid documentation). However, we
should always at least somewhat document our code. Trust me. Anyone who has built
research-generating code (i.e., hundreds of functions, thousands of lines of code, built over
the course of months to years-ish) can tell you that you need documentation, so that even
future you can understand what you were trying to do.

To add documentation to your function, we use triple quotes as follows:

def load_directory_images(path):
    '''
    Loads a directory's worth of images into convenient storage units.
    Requires astropy.io.fits.
    '''
    pass

Now, we can see that if I run the help() command on my function, I get this:

help(load_directory_images)

Help on function load_directory_images in module __main__:

load_directory_images(path)
    Loads a directory's worth of images into convenient storage units.
    Requires astropy.io.fits.

Basically, I can now see my function’s documentation and know what it does.

2.1.2 Formatting your Documentation + Best Practices

What I did above technically counts as documentation. But there are a few extra things we
really need to make it useful. Let’s improve our documentation:

def load_directory_images(path):
    '''
    Loads a directory's worth of images into convenient storage units.
    Requires astropy.io.fits.
    Note: All Images in directory must be of same shape.

    Parameters
    ----------
    path: str
        path to the directory you wish to load, as a string.

    Returns
    -------
    image_stack: array_like
        A stack of all images contained in the directory.
        Array of shape (N,X,Y) where N is the number of images,
        and X,Y are the dimensions of each image.
    image_dict: dict
        A dictionary containing headers for each image, the keys
        are the same as the indices of the corresponding
        image in the image_stack
    '''
    pass

Cool, now our function is much better defined. We know what needs to be input, what data
types they need to be, and even have a word of warning (only works on directories where the
fits images are the same dimension). I realized this would be true when writing the
documentation for image_stack, so writing documentation (at least to this level) up front can
sometimes help clarify what you are trying to do with a function in the first place and catch
potential pitfalls (we’ll want to catch that size issue within the actual code too).
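
One way the "same shape" warning could eventually be enforced in code is sketched below. The helper check_same_shape is a hypothetical name, and the SimpleNamespace objects are stand-ins for numpy arrays (anything with a .shape attribute would do); this is an illustration under those assumptions, not part of the function built in this chapter.

```python
from types import SimpleNamespace

def check_same_shape(images):
    """Raise a ValueError unless every image reports the same .shape."""
    # A set keeps only the distinct shapes that appear in the list.
    shapes = {tuple(image.shape) for image in images}
    if len(shapes) > 1:
        raise ValueError(f'Images have differing shapes: {sorted(shapes)}')
    return True

# Stand-ins for arrays: only the .shape attribute matters here.
good = [SimpleNamespace(shape=(2, 3)), SimpleNamespace(shape=(2, 3))]
bad = [SimpleNamespace(shape=(2, 3)), SimpleNamespace(shape=(4, 4))]

print(check_same_shape(good))  # True
try:
    check_same_shape(bad)
except ValueError as err:
    print('Caught:', err)
```

Failing early with a clear message is usually easier to debug than a cryptic error from the stacking step later on.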

You may have noticed that I have a very regular system for showing the parameters and return
values, i.e., name : dtype followed by an indented line with the description. You need not do
exactly this when you’re first starting, and can do anything within the triple quotes in terms of
formatting. However, as we move into higher-level programming, you may write code that you
have to host on GitHub and which needs an actual documentation website (like a
ReadtheDocs.io). You may have used such a site for research code you’ve downloaded and
used. These sites are built using automatic frameworks, e.g., Sphinx, which scrape your entire
codebase and build the site for you automatically. Super convenient! But, these tools require
your documentation be formatted a certain way. So, it’s never too early to get into the habit of
writing documentation recognizable by these tools.

2.1.3 How detailed is too detailed?

Sometimes we’re just doing some quick exploratory data analysis (EDA) and are writing a
quick function to extract and plot some quick data. Writing good documentation takes time,
and there’s a tradeoff in efficiency if you stop to write good documentation for every function.

Personally, I have two lines in the sand that I use to determine the type of documentation to
write. For EDA and other quick script-type things (essentially, playing around, and things that
won’t end up in the actual paper), I don’t bother with documentation or I write a quick
one-liner. For my own research code, which will produce outputs I publish in a paper, I write
documentation that is at minimum a detailed description, and if I plan to publish my code
along with the paper, then I use full documentation as shown above. If you are writing code to
be used by others, then it is absolutely essential to write full-scale, formatted documentation.

2.2 Back to Our Function: Checking the Input

Often, a useful first step in writing a function is confirming that the inputs adhere to the data
types, dimensions, or other rules we’ve established in our documentation. Why? Because if
inputs do not meet these standards, our function may not operate as intended. A common
saying in the industry is “garbage in, garbage out”. Often, we trust that a user (or us) entering
bad data (or wrongly shaped, or typed, data) into our function will cause a catastrophic failure
which causes the code to stop and throw an error. For example, having two images of
different dimensions in our directory will throw an error when we tell numpy to stack those
arrays.

But what is even more insidious, and harder to track down, than the above, is when our code
inside our function runs without error, despite the input being incorrect. If this happens, the
function may output garbage that gets fed into other functions, and tracking down the bug
may become hard.
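
A tiny illustration of this "runs without error but outputs garbage" situation is sketched below. The functions double_exposure and double_exposure_checked are hypothetical examples, not part of the image loader.

```python
def double_exposure(exposure_time):
    # No input checking: this "works" for numbers AND for strings.
    return exposure_time * 2

print(double_exposure(30.0))  # 60.0
print(double_exposure('30'))  # 3030 (string repetition: no error, but garbage)

def double_exposure_checked(exposure_time):
    # Refuse anything that is not a plain number before doing any math.
    if not isinstance(exposure_time, (int, float)):
        raise TypeError('exposure_time must be a number.')
    return exposure_time * 2

try:
    double_exposure_checked('30')
except TypeError as err:
    print('Caught:', err)
```

The unchecked version happily hands '3030' to whatever comes next, which is exactly the kind of bug that is painful to trace.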

As usual, input checking is a trade-off between time spent writing it and further progress, and
once again, I usually implement such steps at the “production code” or “this is going in the
paper” stage.

Our only input to our sample function here is a string path to a location, so there are two
things we can check: first, that it is a string, and second, that the location exists on the
computer of reference. Both of these would end up flagging errors later when we tried to use
astropy’s loading function using path, but let’s (for completeness) do it ourselves:

import os
def load_directory_images(path):
    '''
    Loads a directory's worth of images into convenient
    storage units. Requires astropy.io.fits.
    Note: All Images in directory must be of same shape.

    Parameters
    ----------
    path: str
        path to the directory you wish to load, as a string.

    Returns
    -------
    image_stack: array_like
        A stack of all images contained in the directory.
        Array of shape (N,X,Y) where N is the number of images,
        and X,Y are the dimensions of each image.
    image_dict: dict
        A dictionary containing headers for each image, the keys
        are the same as the indices of the corresponding
        image in the image_stack
    '''
    if not isinstance(path,str):
        raise AssertionError('Path must be a string.')
    if os.path.isdir(path) == False:
        raise OSError('Path does not point to a valid location.')
    return

My two checks have now been implemented. First, we check that path is an instance of type
string; if it isn’t, we raise an error. I’ll talk more about defining errors, raising them, etc., later,
but AssertionError and OSError are two that are built into Python which mean “I am
asserting a variable be a certain way and it isn’t” and “You’ve messed up on something related
to input and output locations on your computer”. The difference is superficial; it just further
informs the user (or us) of what type of error occurred in our code, which our error message
we’ve added also does.
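
One practical consequence of choosing different error types is that calling code can react to each one separately. The sketch below assumes two hypothetical helpers, check_path and safe_check, which mirror the two checks described above; they are illustrations rather than part of the chapter's function.

```python
import os

def check_path(path):
    # The same two checks as in the text, pulled into a small helper.
    if not isinstance(path, str):
        raise AssertionError('Path must be a string.')
    if not os.path.isdir(path):
        raise OSError('Path does not point to a valid location.')
    return True

def safe_check(path):
    # Each except block only catches its own kind of error.
    try:
        check_path(path)
    except AssertionError as err:
        return 'wrong type: ' + str(err)
    except OSError as err:
        return 'bad location: ' + str(err)
    return 'ok'

print(safe_check(2))            # wrong type: Path must be a string.
print(safe_check(os.getcwd()))  # ok
```

Many codebases use the built-in TypeError for wrong-type inputs instead of AssertionError; either way the idea is the same.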

You’ll also notice I’ve changed the pass into a return. Returning is what we do at the end of
a “finished” function, where we take values calculated in the function and “return” them to the
overall code (more on this in a sec).

Let’s test my input checking:

custom_path = 2
load_directory_images(custom_path)

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-27-0d9b0ed7f57c> in <module>
      1 custom_path = 2
----> 2 load_directory_images(custom_path)

<ipython-input-26-f6f134fb9056> in load_directory_images(path)
     20     '''
     21     if not isinstance(path,str):
---> 22         raise AssertionError('Path must be a string.')
     23     if os.path.isdir(path) == False:
     24         raise OSError('Path does not point to a valid location.')

AssertionError: Path must be a string.

Great! We’ve shown that my AssertionError correctly triggered when custom_path was
not given as a string.

custom_path = '~/FolderThatDoesntExist/other_folder/'
load_directory_images(custom_path)

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-28-9f0c7a13997b> in <module>
      1 custom_path = '~/FolderThatDoesntExist/other_folder/'
----> 2 load_directory_images(custom_path)

<ipython-input-26-f6f134fb9056> in load_directory_images(path)
     22         raise AssertionError('Path must be a string.')
     23     if os.path.isdir(path) == False:
---> 24         raise OSError('Path does not point to a valid location.')
     25     return

OSError: Path does not point to a valid location.

Great! This time, I made up something that is indeed a string, but that isn’t a location on my
computer, and my check, that os.path.isdir() is True, threw an error.

As a final check, let’s put in a string that should work (a real location):

real_path = '/Users/'
load_directory_images(real_path)

And, as expected, we see that our real path throws no errors.

We’re now ready to actually write the function itself! I know that seemed like a lot of up-front
effort, but notice that the number of lines isn’t that large (especially if we had a one-liner
documentation), and over time, you’ll be able to add input checking quickly and efficiently. It’s
also always good to remember you can add documentation and input checking after the fact
(but not too long after)!

2.3 Local Scope and Global Scope

Before we go through the actual details of this particular example function, I want to talk
about the concept of scope within our Python programs. So far, when working with scripts in
which every line is a declaration or calculation (or loop or conditional), everything exists within
what is known as the global scope of the code. That simply means that if I were to run my
script in the interpreter, all the variables (at least, their final states) would be accessible to me in
the interpreter, and I can use any previously defined variable anywhere I want in my code.

One caveat to this is iterators, which are created when you set up, e.g., a for-loop. In this
case it’s even more confusing: the final iterator will still be around after the loop, e.g.,

for i in range(10):
    continue

We see that I wrote a for-loop which did nothing but iterate over the values
[0,1,2,3,4,5,6,7,8,9]. But later, I called the variable “i” and it was still 9, its last value from
the loop.

That seems a little sketchy, and it kind of is. It is part of the reason we use standard iterator
variables like i, j, k in our loops… because they’re less likely to end up overwriting an
important variable we want to use later. Of course, if I set up a new loop using i, it will be
properly overwritten at the start of the loop.

So, as I’ve described it, all our variables are all swimming together in the big pool that is global
scope, and any variable can be accessed anywhere.

That’s bad.

Let me reiterate. While that way of being, which we get used to in basic scripting, is
extremely convenient, it is also dangerous, and it makes tracking down bugs (in which one
variable gets set or calculated wrong and this issue propagates through the code into our final
answer) extremely difficult.

If you’ve ever tried to type an absurdly complicated expression into Mathematica (or Wolfram
Alpha), you’ve seen this effect. Garbage answer comes out, but the only way to figure out why
is to start breaking down the terms of the expression into small pieces evaluated separately.
This is exactly what we want to do with our code, and functions give us the ability to do this.

Functions have what is called local scope. This means that any variable defined within a
function stays within that function. It can’t be accessed from outside the function, it can’t be
messed with or overwritten by any code outside the function, it’s completely walled off and
isolated. Once we take a string and input it to our sample function here as path, for the
purposes of the inside of the function, path is totally isolated.

A Huge Caveat. Local scope is not two-directional. Anything accessible in the global scope is
also accessible in the local scope of a function. It is the reverse that isn’t true. Observe:

a = 3
b = 5
def func(c,d):
    return a+b+c+d

func(1,2)

11

Welp. My function only takes two arguments, c and d. But inside the function, I wantonly
disregard local scope and utilize a and b as well. As suggested by the name, global scope is
truly global, even in functions.

This seems to defy our desire to isolate small units of code (single chunks of calculations) into
different, separated functions. So what’s the solution?

Solution number 1: Simply never call variables inside functions that aren’t either inputs to the
function or created within the function. For example:

a,b = 3,5
def func(a,b,c,d):
    return a+b+c+d

func(1,2)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-43-00c7caa4d412> in <module>
      3     return a+b+c+d
      4
----> 5 func(1,2)

TypeError: func() missing 2 required positional arguments: 'c' and 'd'

By specifying that the variables we call “a” and “b” in our function are positional arguments,
we have overridden the global scope and told our function that the “a” and “b” it needs to use
are ones supplied by the user. Now,

func(a,b,1,2)

11

returns the same value, but I knew exactly what was going on along the way.

Another solution, of course, would be to have no variables in the global scope at all, that is,
have everything isolated into functions. But this is typically impractical; most general-use
scripts we write will have at least some code hanging in the global namespace. So solution
number 1 is the most solid way to ensure you aren’t letting bugs “leak” into your functions.
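
A short sketch of why passing everything in as arguments is safe: names in the argument list belong to the function, so reassigning them inside does not touch the global variables of the same name. The multiplication by 10 is an arbitrary operation added only to make the effect visible.

```python
a, b = 3, 5

def func(a, b, c, d):
    # This rebinds only the local name `a`; the global `a` is untouched.
    a = a * 10
    return a + b + c + d

result = func(a, b, 1, 2)
print(result)  # 38  (30 + 5 + 1 + 2)
print(a)       # 3   (the global value is unchanged)
```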

2.4 But Wait, Didn’t Debugging Just Get Harder?

If you’ve spent any time writing functions, you may have run into the following issue: You
write a simple, but maybe multi-line, function to do some task. You run it, and there’s a bug (not
an error raised, but the output is weird). But unlike in your script, you can’t just look at the
intermediate variables in the calculation anymore, because they were in the function!

def my_func():
    var = 1
    var2 = 3
    return var + var2

print(var)

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-46-2ea387ab95ff> in <module>
----> 1 print(var)

NameError: name 'var' is not defined

When we run commands in the ipython interpreter, or jupyter notebook, or wherever, we are
in the global namespace, so we can’t get to the variables created inside the function. This often
leads to the insertion of a multitude of print statements into our functions to check
intermediate steps, but even this isn’t ideal; sometimes we need to mess with those variables,
interrogate their shape, or length, or other properties.

There are two ways to go with this. When you’re starting out, I recommend play-testing your
code outside of functions in the global namespace, tweaking and bugfixing until things work.
Then, when you’re satisfied, copy that code into a function. Once you get more comfortable
with high-level programming, there are actually industry solutions, e.g., software that lets you
actually “jump into” the namespace of a function and muck around. This is awesome, but not
necessary when you have the time and space to just test the code going into functions in a
script environment or jupyter notebook cell first.

Enough jabbering! Let’s get back to our example function. As I noted earlier, astropy has a
module that lets us load images in the fits format easily. If you’re interested in learning more
about the ins and outs of these methods, check out the section on astropy as well as their
own website, which has solid documentation. For now, I’ll just use their tool:

import os
from glob import glob

import numpy as np
from astropy.io import fits

def load_directory_images(path):
    '''
    Loads a directory's worth of images into convenient storage units.
    Requires astropy.io.fits, glob.
    Note: All Images in directory must be of same shape.

    Parameters
    ----------
    path: str
        path to the directory you wish to load, as a string.

    Returns
    -------
    image_stack: array_like
        A stack of all images contained in the directory.
        Array of shape (N,X,Y) where N is the number of images,
        and X,Y are the dimensions of each image.
    image_dict: dict
        A dictionary containing headers for each image, the keys
        are the same as the indices of the corresponding
        image in the image_stack
    '''
    if not isinstance(path,str):
        raise AssertionError('Path must be a string.')
    if os.path.isdir(path) == False:
        raise OSError('Path does not point to a valid location.')

    # glob(path) alone would only return the directory itself, so we ask for the
    # files inside it. The '*.fits' pattern is an assumption about the file names.
    files_in_dir = sorted(glob(os.path.join(path,'*.fits')))
    image_stack = []
    header_stack = {}
    for i,f in enumerate(files_in_dir):
        # Open each file and keep the image and header of its 0th extension
        with fits.open(f) as HDU:
            image_stack.append(HDU[0].data)
            header_stack[i] = HDU[0].header
    image_stack = np.array(image_stack)
    return image_stack, header_stack

The exact details of the above code aren’t super important, as long as you see and understand
how this is now all wrapped into the function and the two quantities of interest are output.
You may notice some assumptions built into the code, such as that the image and header of
the fits file are stored in the 0th extension of the HDU (don’t worry if that means nothing to
you right now). For astronomical data from telescopes, this is almost always the case, but this
would be an example of personal code in which we knew the format of the fits images we
were trying to load and thus which extension to choose. It’s a useful aside, however, to
consider that if we were writing general-use code for a pipeline that would see many different
fits files of different internal storage systems, we’d need more robust code for dynamically
loading them this way.
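
As a rough idea of what "more robust" could mean, the sketch below searches every extension for the first one that actually holds data instead of assuming the 0th. The helper first_image_extension is a hypothetical name, and the SimpleNamespace objects are stand-ins for the extensions of an opened fits file (assumed here to expose .data and .header); a real pipeline would need more care, for instance telling images apart from tables.

```python
from types import SimpleNamespace

def first_image_extension(hdu_list):
    """Return (index, hdu) for the first extension whose .data is not None."""
    for index, hdu in enumerate(hdu_list):
        if getattr(hdu, 'data', None) is not None:
            return index, hdu
    raise ValueError('No extension containing data was found.')

# Stand-in for an opened file: an empty primary extension followed by
# an extension that holds a tiny 2x2 "image".
fake_file = [
    SimpleNamespace(data=None, header={'OBJECT': 'M31'}),
    SimpleNamespace(data=[[1, 2], [3, 4]], header={'EXTNAME': 'SCI'}),
]

index, hdu = first_image_extension(fake_file)
print(index)  # 1
```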

2.5 Chaining Functions Together

Once you start writing your code into functions, you’ll find that the output of function one
tends to become the input of function two. For example, I could write a new function:

def median_image(image_stack):
    '''
    Takes a stack of images and returns the median image.
    Parameters
    ----------
    image_stack: array_like
        stack of images, first dimension being image index.
    Returns
    -------
    median_image: array_like
        single image of the median of the input images
    '''
    median_image = np.median(image_stack,axis=0)
    return median_image

Now, if I wanted to median the first three images in my full stack, I could feed the following:

image_stack, header_stack = load_directory_images(image_path)
first_3 = median_image(image_stack[0:3])

I again want to emphasize that we can call our variables whatever we want outside the
functions and then feed them in. Often though, the names end up being similar or the same.

You might be wondering why you would write a function that had a single line of code as its
calculation. The short answer is, you wouldn’t; my median_image() function adds so little
beyond your general use of np.median() that it isn’t worth writing. But usually in
astronomy… we don’t just want a median. We want… say… a sigma-clipped mean. Now that
would take a few lines to accomplish, and is probably worth writing a function for.
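
To give a feel for what those few lines might look like, here is a minimal sketch of a sigma-clipped mean on a plain list of numbers, using only the standard library. The function name, the default of 2 sigma, and the iteration cap are all assumptions made for illustration; for real image stacks one would more likely work with numpy arrays or the sigma-clipping tools in astropy.stats.

```python
import statistics

def sigma_clipped_mean(values, n_sigma=2.0, max_iterations=10):
    """Mean of `values` after repeatedly rejecting points far from the mean."""
    kept = list(values)
    for _ in range(max_iterations):
        center = statistics.mean(kept)
        spread = statistics.pstdev(kept)
        if spread == 0:
            break  # all remaining values are identical
        survivors = [v for v in kept if abs(v - center) <= n_sigma * spread]
        if not survivors or len(survivors) == len(kept):
            break  # nothing (more) to reject
        kept = survivors
    return statistics.mean(kept)

# Six well-behaved pixel values and one outlier (say, a cosmic ray hit).
pixel_values = [10, 11, 9, 10, 12, 10, 100]
print(sigma_clipped_mean(pixel_values))  # about 10.33; the 100 is rejected
```

Wrapping this in a function means the clipping logic is written (and debugged) once, then reused with a single line wherever it is needed.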

As a general guideline, I tend to put something in a function if it

Does a single “task” or “unit” of my program, and
Has more than a few lines, OR
Is only a few lines but is used SO DANG OFTEN in my code that writing one line instead
of several every time saves me work.

2.6 The Concept of Main( )

So far, we’ve discussed the way one formats functions, and how to take what’s output from a
function (i.e., listed in the return statement) and save it to a new variable (see the above
example), which can then be put into other functions, etc. How does this actually flow in a
more major script’s workflow?

One of the simplest ways is through a main() function. Let’s say I’ve written four functions
which do the following:

Load the images from a directory into a stack
Clean each image somehow (maybe removing cosmic rays or bad pixels)
Align the images (which were, say, dithered)
Create “coadds” of the images by stacking them in various ways (mean, clipped mean,
median, weighted mean).

Each function assumes general input and has general output. To make it specific, I could write
a function, which we often simply call main(), like this:

def main(image_dir,cleaning_keyword,alignment_keyword,coadd_keyword):
    image_stack, header_stack = load_directory_images(image_dir)
    cleaned_images = clean_images(image_stack,cleaning_keyword)
    aligned_images = align_images(cleaned_images,alignment_keyword)
    coadded_image = coadd_images(aligned_images,coadd_keyword)
    return coadded_image

This would usually be the last function defined in our code, and we can see that we here
indicate that main takes in all the info needed to run all the functions properly (more on this in
a second). Assuming that all works, we could then open a terminal, run our Python script, and
then simply run something like final_output = main(inputs) to run everything in
sequence and get the final output.

But wait, it gets even easier than that! At the bottom of our Python script, below the main
and other functions, we can add the following:

if __name__ == '__main__':
    main(image_dir,cleaning_keyword,alignment_keyword,coadd_keyword)

What is this? The above is a conditional statement that checks whether our current Python
file has been run directly. Essentially, when I open an ipython interpreter and type
run myscript.py, Python automatically sets a “secret” variable called __name__ to
'__main__', because the script is being run. You can put whatever you want inside this block,
which only executes if you run the script directly. In this case I’ve chosen to put a function call
to my own main() function inside. Now, if I open the interpreter and type

run myscript.py

It will execute my main() call automatically, without me having to type in main(blah,blah)
into the terminal myself.

You may be wondering why you wouldn’t simply have a call of your main function at the
bottom of your script, without this weird conditional. And you’re right: If you did that, the
same thing would happen, and running the script would then run your main call, hence
running all your functions. But something we haven’t talked about yet, but will talk about in
detail soon, is the idea of importing your own functions between Python files. When you
begin doing this, it becomes considerably more important to have actual executions tucked
away inside these conditionals that only run if our target file is run directly in the interpreter,
rather than imported into another script.
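
A stripped-down sketch of the pattern is shown below. Imagine it saved as a file with the hypothetical name pipeline.py; the body of main() is a placeholder that just builds a message, standing in for real work.

```python
def main(n_images):
    # Placeholder for the real work: here we just build a status message.
    return 'Processed ' + str(n_images) + ' images'

# This block runs when the file is executed directly (e.g. run pipeline.py),
# because Python then sets __name__ to '__main__'.
# It does NOT run when another script does `import pipeline`,
# because __name__ is then the module name, 'pipeline'.
if __name__ == '__main__':
    print(main(3))
```

Another script can therefore import this file and reuse main() without accidentally kicking off a full run.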

2.7 Flexible Functions: non-positional arguments

Thus far, our discussion of the definition of functions has only included what are known as
positional arguments. When I define a simple function like the following:

def func(a,b,c,d):
    return (a+b-c)*d

You can see clearly that the position of the four variables in the argument list matters.
Whatever the first number I supply is will be deemed a, the second number I feed will be b,
and so on. And this affects the output now, as an order of operations has been established
(rather than a simple sum). If I flip around the order of the numbers I feed in, I’ll clearly get a
different answer.
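
Plugging in some numbers makes this concrete. The sketch reuses the function above; the specific inputs are arbitrary.

```python
def func(a, b, c, d):
    return (a + b - c) * d

print(func(1, 2, 3, 4))  # (1 + 2 - 3) * 4 = 0
print(func(4, 3, 2, 1))  # (4 + 3 - 2) * 1 = 5
```

Same four numbers, different order, different answer.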

There are several other forms of argument, beyond positional. The first is an optional, default,
or keyword argument (the three are used interchangeably). This is extremely useful when we
want to obey the golden scope rule above about not using any variables not asked for as
arguments, but we do know that this value often takes a single value.

To give a concrete example, let’s say I want to calculate the sine of some values in my code,
and usually the angles I’m working with are in radians (which is what np.sin() requires) but
sometimes they’re in degrees. I can write a quick wrapper for my sine function as follows:

def my_sin(x,units='radians'):
    if units=='radians':
        return np.sin(x)
    elif units=='deg':
        new_x = x * np.pi / 180.0
        return np.sin(new_x)

The way this works is that my function assumes units to be “radians” unless otherwise
specified:

my_sin(np.pi)

1.2246467991473532e-16

We see this returns 0 to computer precision as expected. However, if I specify different units:

my_sin(90,units='deg')

1.0

The code knew to convert my degrees into radians and then return the np.sin() value.
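
One thing worth noticing about a wrapper like this: if someone passes a units string that matches neither branch (say 'degrees'), no branch runs and the function quietly returns None. The sketch below, with the hypothetical name my_sin_strict, adds a final else that raises instead. It uses the standard-library math module so it is self-contained, which means it handles single numbers rather than arrays.

```python
import math

def my_sin_strict(x, units='radians'):
    if units == 'radians':
        return math.sin(x)
    elif units == 'deg':
        return math.sin(math.radians(x))
    else:
        # Fail loudly instead of silently returning None.
        raise ValueError("units must be 'radians' or 'deg', got " + repr(units))

print(my_sin_strict(90, units='deg'))  # 1.0
try:
    my_sin_strict(90, units='degrees')
except ValueError as err:
    print('Caught:', err)
```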

A cool thing about these types of arguments is that because they are linked to keywords (e.g.,
‘deg’ was linked to the variable units), they are not positional. A great example of this comes
from the matplotlib library. Plotting functions in this library tend to have a bunch of
optional arguments with defaults set, but which you can change. If you use the keyword for
those parameters, their order doesn’t matter:

import matplotlib.pyplot as plt
x = np.arange(5)
y = x**2

plt.plot(x,y,ls='-',color='red',ms=5,label='points',alpha=0.9)
plt.legend()

<matplotlib.legend.Legend at 0x7f9981023610>

Now see what happens if I change around the order of the non-positional arguments (for
plt.plot(), the only positional arguments are x and y, so these must always be supplied as
the first two arguments):

plt.plot(x,y,alpha=0.9,color='red',ms=5,ls='-',label='points')
plt.legend()

<matplotlib.legend.Legend at 0x7f9981f00ca0>

We get exactly the same plot.

One important note about keyword arguments is that, of course, they all must be supplied
after the positional arguments. For example, if I ran

plt.plot(x,alpha=0.9,y,color='red',ms=5,ls='-',label='points')

  File "<ipython-input-58-1fd2328cc66e>", line 1
    plt.plot(x,alpha=0.9,y,color='red',ms=5,ls='-',label='points')
                        ^
SyntaxError: positional argument follows keyword argument

Python helpfully tells me that a positional argument (one not assigned to a keyword) came
after a keyword argument, and this is a no-no. In short, if you define a function with three
positional arguments and four keyword arguments, it might look like:

def func_args(x,y,z,a=1,b=25,c=None,d=False):
    if d:
        return x+y+z
    elif c is not None:
        print('wow')
    else:
        return x+b

The above function is of course nonsensical, but make sure you understand the code flow that
occurs (try it out yourself!). Run the function while only supplying x, y, z values, then while
messing with changing d from False to True or c to anything.

2.8 Even more flexibility! *args and **kwargs

What if a situation arises where we want our function to accept an unlimited number of
arguments? To give a simple example: What if I want to write a sum() function that sums up
as many numbers as you put into it? Of course, we could write a function that takes a single
argument as a list or array, but for the sake of this example, how would we allow the user to
enter as many numbers as they like?

The answer is with the beauty of *args. Let’s take our sum example:

def mysum(a,b,*args):
    running_sum = a+b
    for i in args:
        running_sum+=i
    return running_sum

mysum(1,2)

3

mysum(1,2,3,4,5,6)

21

Cool, right?

The concept is actually rather simple. When we add *args to the end of our list of arguments
for our function, it tells Python to take any additional supplied arguments and store them in a
tuple (an immutable, list-like container) which inside our function will be known as args. I can
then do things with that tuple (in the example, I iterated through them and added them to the
initial sum of the positional arguments). Of course, since they’re already collected together, a
faster method would be:

def mysum(a,b,*args):
    return a+b+np.sum(args)

where by faster I mean both in lines of code and, for large inputs, computationally (vector sums
over an array are typically faster than a for-loop; more on that in the chapter on optimization).
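
A quick way to see what *args actually collects is to print it. This sketch uses Python's built-in sum() rather than numpy so it runs on its own; the print inside the function is only there for inspection.

```python
def mysum(a, b, *args):
    # Show what Python collected from the extra positional arguments.
    print(type(args).__name__, args)
    return a + b + sum(args)

total = mysum(1, 2, 3, 4, 5, 6)  # prints: tuple (3, 4, 5, 6)
print(total)                     # 21

# The same star can unpack an existing list into separate arguments.
numbers = [1, 2, 3, 4]
print(mysum(*numbers))           # prints: tuple (3, 4)  then 10
```

With only two arguments supplied, args is simply an empty tuple and the sum of nothing is 0.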

But what if the extra arguments we want to accept aren’t in just any order, and we want to
track that somehow? Natura y, the so u on is simi ar, but instead we’ use the signifier
**k a g . This te s our func on to accept any number of addi ona keyword arguments, ike
the ones we’ve been discussing above. For examp e:

def e _ in ( ing,**k a g ):
in ( ing)

My e _ in () func on now requires a string input… but is happy to accept any other
kwargs I throw at it:

pretty_print('Hello, world!', subtext="I've had my morning coffee", energy_level=5)

Hello, world!

What happened to those extra keyword arguments? Like the example for *args, they got
stored, but this time into a dictionary named kwargs, which can be accessed in the function.
Let’s use one:

def pretty_print(string,**kwargs):
    print(string)
    if 'sep' in kwargs.keys():
        print(kwargs['sep'])

pretty_print('Hello, World!', sep='----------------')

Hello, World!
----------------

Our function is still agnostic to any extra keywords supplied. But, IF one of those keyword
args happens to be sep, my function does something special: it yanks the value of that key
from the internal dictionary of kwargs and, in this case, prints it.
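If you want to look at the dictionary itself, here is a small sketch. Returning kwargs at the end is something I've added only so we can inspect it in this demo, and the check uses 'sep' in kwargs, which tests the keys directly and behaves the same as checking against kwargs.keys().

```python
def pretty_print(string, **kwargs):
    """Print string, then a separator line if 'sep' was supplied."""
    print(string)
    # membership tests on a dictionary look at its keys
    if 'sep' in kwargs:
        print(kwargs['sep'])
    # returned only so the packed dictionary can be inspected below
    return kwargs


extras = pretty_print('Hello, World!', sep='-' * 16, energy_level=5)
print(extras)  # {'sep': '----------------', 'energy_level': 5}
```

The unused energy_level keyword is still sitting in the dictionary; the function just never looks at it.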

The above examples provide a base level of use… but may not seem that exciting. Why not
just make sep an optional keyword of my pretty_print() function?

Once again, the simplicity of the example belies the true use: threading extra arguments
through multiple functions. Let’s say I have a main() function in my script, which only takes a
few main parameters that set up my run. And let’s say that buried inside my main() function
is a function call to pretty_print() which tells me my code finished, say, aligning the
images. That would look like this:

def main(image_dir,cleaning_keyword,alignment_keyword,coadd_keyword):
    image_stack, header_stack = load_directory_images(image_dir)
    cleaned_images = clean_images(image_stack,cleaning_keyword)
    aligned_images = align_images(cleaned_images,alignment_keyword)
    pretty_print('Finished Aligning Images, moving on to coadds.')
    coadded_image = coadd_images(aligned_images,coadd_keyword)
    return coadded_image

The above is a pretty common way to track our code’s progress in academic code. But you may
notice an issue, even if I set up a keyword argument in pretty_print() which takes in the
separator, i.e.,

def pretty_print(string,sep=None):
    print(string)
    if sep is not None:
        print(sep)

pretty_print('hello!')

hello!

pretty_print('hello!', sep='------')

hello!
------

The problem is, my main() function doesn’t have an argument, positional or otherwise, that
takes in sep. You can see that if every function inside main() has several optional
arguments, and we wanted the ability to adjust them from a function call of main(), we’d
have to add all of those arguments as optional arguments of main() as well. That’s both
messy and a huge pain. Instead, we can do the following:

def main(image_dir,cleaning_keyword,alignment_keyword,coadd_keyword,**kwargs):
    image_stack, header_stack = load_directory_images(image_dir,**kwargs)
    cleaned_images = clean_images(image_stack,cleaning_keyword,**kwargs)
    aligned_images = align_images(cleaned_images,alignment_keyword,**kwargs)
    pretty_print('Finished Aligning Images, moving on to coadds.',**kwargs)
    coadded_image = coadd_images(aligned_images,coadd_keyword,**kwargs)
    return coadded_image

The above necessitates that each interior function has been defined to allow **kwargs to
be input (the way pretty_print() did earlier); otherwise Python raises a TypeError when a
function receives a keyword it does not recognize. What will happen now is that I can run main()
and feed in any additional keyword arguments for any of the interior functions. Every
function will be fed the full set, but can pick out the ones relevant to it using an if ___ in
kwargs.keys() type mechanism.

You may sense a danger here, which is that multiple interior functions of yours may have some
check like the one above that checks for the same keyword argument. That would be bad if
the input keyword arg was only meant to refer to one of the interior functions.

The trick, then, is to have, for example, pretty_print()’s check look in the kwarg dictionary
for something called pretty_print_sep instead. At the outer layer, you would then supply
pretty_print_sep if you wanted it to get to your pretty_print() function.
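Here is a stripped-down sketch of that idea. The align_images() below is a do-nothing stand-in I've written for illustration (not a real alignment routine), the shortened main() signature and the 'some_method' value are placeholders, and the image list is just a list of numbers.

```python
def pretty_print(string, **kwargs):
    """Print string, plus a separator if pretty_print_sep was supplied."""
    print(string)
    # look only for the key that is namespaced to this function
    if 'pretty_print_sep' in kwargs:
        print(kwargs['pretty_print_sep'])


def align_images(images, alignment_keyword, **kwargs):
    """Hypothetical stand-in: accepts kwargs but ignores any it doesn't use."""
    return images


def main(images, alignment_keyword, **kwargs):
    # every interior function receives the full set of extra keywords
    aligned_images = align_images(images, alignment_keyword, **kwargs)
    pretty_print('Finished aligning images.', **kwargs)
    return aligned_images


result = main([1, 2, 3], 'some_method', pretty_print_sep='------')
```

Because the key is called pretty_print_sep rather than sep, there is no ambiguity about which interior function it is meant for, even though align_images() saw it too.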

One additional note on the formatting: the asterisk, "*" or "**", has two meanings as I’ve
used it throughout the codes above: "pack" and "unpack". When you use the "**" in your
function definition, it is telling your function to take all additional keyword arguments and their
values, and pack them into a dictionary called kwargs accessible within the function. In the
last function above, I’ve done exactly that in the definition line of main(). Hence, a
dictionary called kwargs was created and I could access it as follows:

def main(image_dir,cleaning_keyword,alignment_keyword,coadd_keyword,**kwargs):
    print(kwargs)
    image_stack, header_stack = load_directory_images(image_dir)
    cleaned_images = clean_images(image_stack,cleaning_keyword)
    aligned_images = align_images(cleaned_images,alignment_keyword)
    pretty_print('Finished Aligning Images, moving on to coadds.')
    coadded_image = coadd_images(aligned_images,coadd_keyword)
    return coadded_image

For clarity, I’ve shown that a regular dictionary called kwargs exists within main() due to the
**kwargs in its definition, regardless of what I do with it.

So why the "**" in the calls to the interior functions?

The second use of that symbol is unpacking. When you use "*" or "**" in a function call,
rather than a definition, it assumes that the following word (args or kwargs) refers to a list
(or tuple) or a dictionary, and actually unpacks it into separate inputs to the function, whether
they be just values like in our mysum() example, or whether they be keyword arguments and their
values via a kwarg=value,kwarg2=value2 type system.

This explains why I actually used the "**" both in the definition of main() (to get the extra
kwargs in, and pack them into a dictionary) and in the calls to other previously defined
functions (to unpack that dictionary back into function keyword arguments passed into the
functions).
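A quick sketch showing both directions at once. The describe() function and the contents of options are invented for illustration; mysum() is the built-in-sum version of the example from earlier.

```python
def mysum(a, b, *args):
    """Extra positional arguments are packed into the tuple args."""
    return a + b + sum(args)


def describe(name, **kwargs):
    """Extra keyword arguments are packed into the dictionary kwargs."""
    return f"{name}: {sorted(kwargs)}"


numbers = [1, 2, 3, 4, 5, 6]
# * in a call unpacks the list: same as mysum(1, 2, 3, 4, 5, 6)
total = mysum(*numbers)

options = {'sep': '---', 'energy_level': 5}
# ** in a call unpacks the dictionary: same as
# describe('run', sep='---', energy_level=5)
summary = describe('run', **options)

print(total)    # 21
print(summary)  # run: ['energy_level', 'sep']
```

In the definitions the asterisks pack; in the calls they unpack. Same symbol, opposite directions.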

The use of args and kwargs is definitely intermediate in skill progression; your codes may not
need it right away. But as they grow more complex, it is good to be aware of this highly
flexible way of dealing with function inputs, because at some point you’ll have a code that is
better off for using it.

2.9 Testing Function Outputs: Unit Testing

Earlier, we discussed the testing of inputs to your functions to ensure proper data types or
any other restriction your function needs to produce sensible results. What about the output?

When we write functions, the goal is to take a large process (like reducing a set of data from
raw images to science spectra) and reduce it into small, repeatable, single-task chunks so we
can evaluate that each step is performing properly and independently. During the
development of such a code, and such functions, you likely test the functions’ outputs
yourself, manually, i.e., put in some sample data and make sure the output of the function makes
sense.

The problem is that code lives and breathes. After inserting your code into a larger framework,
you’ll find you have to go back and tweak that function, add an extra input or output, modify
one part of the calculation. A more advanced, but valuable way to ensure your functions still
do what you want them to is by implementing what are known as unit tests.

Unit tests are extra pieces of code that throw sample problems with known outcomes at each
of your production functions and ensure that the functions are operating as expected. For
large-scale collaborations with intense pipelines, the amount of code that exists in the unit
tests may even exceed, or vastly exceed, the amount of production code actually doing the
science! But it is these tests that make the scientists confident in every step of their pipeline,
even as it evolves and changes over time.

While that sounds daunting, implementing unit testing is less challenging than it sounds. There
are several frameworks that handle the unit testing for you. In this example, we’ll be using
pytest.

Pytest is pip installable, and simple to use. Simply create a file that starts with test_ or
ends in _test.py somewhere that you can access the functions of interest (say, in the same
directory as your code; later we’ll talk about how to put them in a separate tests directory).
Assuming you’ve done this, inside your Python file for the test, you’ll want to import pytest
as well as your functions. For example, if we had all the functions discussed in this chapter in
one python file called utility_functions.py, then in the first lines of my
test_utilities.py file I’d have

import pytest
from utility_functions import *

where here I’m simply importing all the functions we would’ve defined.

Next, we want to define some tests. The basic nature of defining a test is to create a function
which runs your production function with some set input and asserts that the output is some
known value. For example:

def test_load_images_from_directory():
    testing_path = '/some_path_I_never_mess_with_to_some_test_fits_files/'
    image_stack, header_dict = load_directory_images(testing_path)
    # I know that there are 7 sample images in that directory,
    # of image dimension 1200 by 2400
    assert image_stack.shape == (7,1200,2400)

In the above example, our function goes over to some testing images I’ve saved somewhere
for this purpose, and tries to load them with my function. I know things about those images,
for example their dimensions, and that there are 7 of them. This means that the expected
shape of the resulting image stack is (7, 1200, 2400). You’re already used to checking
equivalencies using ==; now we assert this equivalency.

Believe it or not, that’s it! At least for setting up a simple test. Now, outside in the regular
terminal, in this directory, simply type

pytest

and the software will

• locate any files that start with test_ or end in _test.py in this directory
• run any of the test functions within (those whose names start with test)
• report on successes (places the assert is true) or failures (places where the assertion
fails).

Now, any time we make changes to our load_directory_images() function, we can simply
run pytest again to make sure we didn’t break anything. Of course, if we change the number
of outputs, we have to adjust our test to reflect that, etc.
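Since the image-loading test depends on files only I have, here is a sketch of a test file you could run right now. It assumes a hypothetical file name, test_mysum.py, and defines the function under test in the same file purely to keep the example self-contained; in practice you would import it from your own module instead.

```python
# contents of a hypothetical file named test_mysum.py

def mysum(a, b, *args):
    """Function under test; normally imported from your own module."""
    return a + b + sum(args)


def test_mysum_two_arguments():
    # known outcome for the simplest call
    assert mysum(1, 2) == 3


def test_mysum_extra_arguments():
    # known outcome when extra positional arguments are supplied
    assert mysum(1, 2, 3, 4, 5, 6) == 21
```

Running pytest in the directory containing this file should collect both test functions and report each as passing; if you later break mysum(), the failing assertion will be reported instead.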

There is a lot more to testing, for example, methods to test many inputs all at once, which
we’ll cover later in the chapter on building packages. But feel free to start setting up some
very simple tests for your research code now!

You might be thinking, wouldn’t it be great if the testing code just ran automatically any time I
changed my research code? Good news, friend! This is the exact purpose of tools which
provide Continuous Integration (CI). Essentially, you can set up something similar to pytest
which actually lives in the cloud and tests your code every time you push a new commit to
GitHub (or your version control service of choice). While there is no need at the undergraduate
level to be trying to both host your personal research code on GitHub AND have it
continuously integrated and tested, it’s always good to be aware it is an available option once
your code gets complex enough to warrant it!

2.10 Wrap Up

Congrats on making it through this chapter! If variables are the atoms of code, functions are
the molecules: a critical, fundamental building block of larger, more complex programs.
Learning how to write them, document them, and test them is a critical step in becoming a
better programmer. Here are the takeaways you should have at the end of this chapter:

• Functions isolate chunks of code in a local namespace which the rest of your code can’t
access, making them silos.
• Functions take arguments: you can specify positional, keyword (optional), and even
an unlimited number of extra (args or kwargs) arguments.
• Inside your functions, you should only use variables made within the function or
supplied as arguments; no dipping into the global namespace!
• Inside your functions, you should always add documentation of some kind to establish
the function’s purpose, its inputs, and its outputs.
• It’s often worth taking a few lines to check that inputs match the requirements of the
function and raise errors if they don’t.
• Functions return things: if you don’t include a return statement, the calculations in the
function go away once the call finishes. We set new variables equal to the function called with
some parameters, and place what we want the function to output in the return
statement line.
• Just like we check the inputs inside a function, we can check the output of a function
using unit testing to make sure it is operating as intended.
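As a compact reminder of several of these points in one place, here is a small sketch. The function mean_of() is invented for this recap; it is documented, checks its input, uses only local variables, and returns its result.

```python
def mean_of(values):
    """Return the arithmetic mean of a non-empty list of numbers.

    values : list of int or float
    returns : float
    """
    # check the input before doing any work
    if len(values) == 0:
        raise ValueError('values must not be empty')
    total = sum(values)  # local variable, invisible outside the function
    return total / len(values)


# the returned value only survives because we assign it to a variable
average = mean_of([2, 4, 6])
print(average)  # 4.0
```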

As always, the best way to get better with these concepts is practice! See the associated
chapter exercises for realistic astronomy examples of functions you may want or need to
write!
