
Python

(version 3.x with bash shell syntax)



Introduction

References
http://docs.python.org/3/tutorial/
Dive into Python 3 (free online; find the website)
Think like a scientist (free online; check sources at the Dive into Python site; probably only a Python 2 version is available)


Why Python?
Public websites offer many bioinformatics tools. Many are quite sophisticated. However, there will
be times when you will either have a ton of data or you will need additional analysis. You will thus
need to customize a programming tool. Once you are comfortable with a programming tool it will be
fairly easy to manipulate your data in the manner best suited for your research.
Python is a high-level programming language. It is fairly intuitive (compared to C, for example).
Python was first released in 1991 by Guido van Rossum (a fan of Monty Python's Flying Circus). Guido worked
for Google for a time and later for Dropbox. He served as the BDFL (Benevolent Dictator For Life) of Python until he stepped down in 2018.



Python is used by many companies and is incorporated into many applications. Users can write
functions and modules to be incorporated into an existing application. It ships with Mac OS X and is
easily downloaded for all Unix platforms and nearly any other operating system available. The code
is free and open source. Python releases are stable and its development continues.
Coding in Python is similar to writing pseudo code so it is easy to learn. The code is readable: it uses no
braces and requires consistent indentation. The python.org website is excellent for the new learner
and the advanced programmer. Python's power lies in its vast number of libraries, which cover
nearly any application.

Starting Python
First, log in to the computer and bring up a terminal window.
In the terminal type:
python3.x [return]


NOTE: If you do not have the executable file python3.x in your path you will need to find the
executable and type its full path name. You may also add the path to the executable to your
PATH environment variable. The best way to do this is within your bash_profile or bashrc file:
export PATH="/opt/local/bin:$PATH"
OR
export PATH="/opt/local/bin:${PATH}"

Note the output on the screen. Type
>>> license()


What do you learn about the software you are using?

To exit the python command interpreter, use either of:
ctrl-D
quit()


Help
The help function, help(), can be used to remind yourself of available functions, usages and
definitions. Type:
>>> help()

# and follow instructions to search keywords

You can also type, for example:

>>> help("finally")


Your first program in Python
Guess what it will be?
Here are three ways to run a program with python.
1. Interactive python session
Type the following at the Python command line prompt:
>>> print("Hello World!") [return]
Hello World!

This runs python in the interactive mode.



2. At the terminal command line using a python (*.py) file.
Open a vi session (vi hello.py) and edit the file to contain
print("Hello World!")

Close the file and type:


python3.x hello.py

The output should be:


Hello World!

3. As a standalone executable

You can also add the path to the interpreter as the first line of the program and run the script directly.
Start a vi session (vi hello.py) and edit the file to contain the following:
#!/usr/bin/python3.x
#hello.py
#class header:
#
print("Hello World!")

Be sure the path to the interpreter in the first line is correct and that the file is executable (for example, chmod +x hello.py).
Also, don't forget to add the class headers to all programs you write.
Close the file and type:
./hello.py

The output should be:


Hello World!

Additional options to launch python:

exec(open('dir_list.py').read())
The exec command can be executed within an interactive python session.
Variables are then available in the interactive session.

python3.x -i script.py 10 100 1000
The -i option puts the user into interactive mode after running script.py.
The arguments 10 100 1000 can be accessed within python from sys.argv:
>>> import sys
>>> sys.argv
>>> sys.argv[1:]
>>> sys.argv[0]
>>> sys.argv[2]
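
As a rough sketch of how a script might use these arguments: every entry in sys.argv is a string, so numbers have to be converted before use. The function name parse_numbers and the list example_argv below are made up for illustration; a real script would pass sys.argv itself.

```python
import sys

def parse_numbers(argv):
    """Convert every argument after the script name to a float."""
    return [float(arg) for arg in argv[1:]]

# Hypothetical argument list, as sys.argv would look after:
#   python3.x -i script.py 10 100 1000
example_argv = ["script.py", "10", "100", "1000"]

numbers = parse_numbers(example_argv)
total = sum(numbers)
print(numbers, total)

# In a real script you would call parse_numbers(sys.argv) instead.
```

Note that argv[0] is the script name, which is why the conversion starts at index 1.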


Syntax errors
Correct syntax produces no errors:
>>> print("Hello World!")
Hello World!

If you type incorrect syntax an error is produced:


>>> print "Hello World!"
File "<stdin>", line 1
print "Hello World!"
^
SyntaxError: invalid syntax

In the above case a print statement used with Python version 2.x syntax produces a syntax error in
python version 3.x. The exact wording of the message depends on the Python 3 release; newer releases
report the missing parentheses explicitly.
Run time errors are also called exceptions.
Semantic errors
Correct syntax can still produce an incorrect answer. The program completes without producing a run time
error:
>>> print("Hello Word!")
Hello Word!

Here the phrase is merely misspelled.


Values
A value can be text (a string) or a number.
>>> print("Hello World!")
Hello World!
>>> print(4)
4

Type
Tells whether the value is an integer, a float or a string. Try the following:
Type str, string:
>>> type("Hello World!")
<class 'str'>

Type int, integer:
>>> type(4)
<class 'int'>

Type float, floating point number:
>>> type(3.14)
<class 'float'>

Syntax of numbers. Make a variable, m1, to be an integer of 1 million (1,000,000). Try the following:
>>> m1=1,000,000
>>> m2=1000000

Try printing the variables m1 and m2. Which one worked?


The one with commas made the variable m1 a variable of type tuple.
Find the type of each variable.
A tuple is a list-like sequence; we will discuss this class type shortly.
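
The following short sketch shows what the two assignments actually store. The third variable, m3, is an addition for illustration: it uses underscores as digit separators, which is accepted from Python 3.6 onward.

```python
# The commas make the right-hand side a tuple of three integers: 1, 000 and 000.
m1 = 1,000,000
m2 = 1000000
m3 = 1_000_000   # underscore digit separators (Python 3.6 and later)

print(type(m1), m1)
print(type(m2), m2)
print(m2 == m3)
```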

Quotes
Quotes mark strings. You can use either double quotes or single quotes. If you wish to use a double quote in your string
and you are defining the string with double quotes then you must escape the double quote with a backslash (\").
Similarly for single quotes.
Text after the # character is a comment, similar to the use of the # sign in bash. Text between triple
quotes can also serve as a comment and may span more than one line.
>>> #You can place COMMENTS after a pound (#) sign
>>> """Or you can place COMMENTS between triple quotes if you have more
... than one line of text."""
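
A small sketch of the quoting rules above. The sample sentences are made up; the point is that escaping a quote and switching to the other kind of quote give the same string.

```python
# Escape a double quote inside a double-quoted string ...
a = "She said \"hello\""
# ... or define the string with single quotes and skip the escape.
b = 'She said "hello"'

# The same idea for single quotes.
c = 'It\'s here'
d = "It's here"

print(a)
print(c)
```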


Variables
Names of variables must follow these rules:
Any length
Letters, numbers and underscores only
First character must be a letter (or an underscore)
In general only lowercase (variables are case sensitive)
Cannot use python reserved names (keywords for example)

OK                      Error
x=26                    123abc="First three"
abc123="First three"    num_#=26
                        finally
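
One way to check a candidate name without triggering an error is sketched below. It relies on the string method isidentifier() and on the standard keyword module; the helper name is_valid_name is an assumption made for this example.

```python
import keyword

def is_valid_name(name):
    """Return True if name is a legal variable name and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

# The names from the OK / Error table above.
for candidate in ["x", "abc123", "123abc", "num_#", "finally"]:
    print(candidate, is_valid_name(candidate))
```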

To list the current variables in a session type:


>>> dir()
['__builtins__', '__doc__', '__name__', '__package__']
>>> x=2
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__', 'x']

This command returns a list of strings naming the objects (variables) in the current session. The
list is in alphabetical order.

Statements
Statements are executed and may or may not return a result.
The print function writes the given value of a variable to stdout.
>>> print(x)
26

An assignment statement (assigning a value to a variable) is executed but the result is not printed.
>>> x=4

Expressions
Expressions are similar to those you use in mathematics.
>>> 20-4
>>> 3
>>> x
>>> x-3
>>> greeting="Hello World!"


Operators (Operands)
Operators and order of increasing precedence:
+               addition
-               subtraction
*               multiplication
/               division
**              power
()              parentheses
Left -> Right   flow

>>> 3*2**3
24
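
A few more expressions, sketched here to show how precedence and parentheses change a result. The variable names are arbitrary.

```python
a = 3 * 2 ** 3      # power is applied first: 3 * 8
b = (3 * 2) ** 3    # parentheses force the multiplication first: 6 ** 3
c = 2 + 3 * 4       # multiplication before addition: 2 + 12
d = (2 + 3) * 4     # parentheses first: 5 * 4
e = 20 - 4 - 3      # equal precedence, evaluated left to right: (20 - 4) - 3

print(a, b, c, d, e)
```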

Additional operators
/=        addition of an = sign to the operator resets the variable to the result
//        floor divide
abs(x)    absolute value

>>> x=23
>>> x/=2          #variable reset
>>> x
11.5
>>> x//2          #floor divide
5.0
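
The sketch below repeats the session above as a script and adds two cases that are not in the original: floor division of integers and of a negative number. Floor division rounds toward negative infinity, so the negative case may be surprising.

```python
x = 23
x /= 2                  # same as x = x / 2; division with / gives a float
half = x

floor_float = x // 2    # floor divide of a float gives a float
floor_int = 23 // 2     # floor divide of two integers gives an integer
floor_neg = -23 // 2    # rounds down, away from zero for negative results
magnitude = abs(-11.5)  # absolute value

print(half, floor_float, floor_int, floor_neg, magnitude)
```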

String as operands
Strings can also be used as operands.

>>> s="abc"
>>> s+s          # concatenation
'abcabc'
>>> 3*s          # repetition
'abcabcabc'


Print function
You will find the print function is used often. You should have the variable greeting already defined
in your session. If not type it again:
>>> greeting="Hello World!"
>>> greeting
'Hello World!'

Notice the result of typing greeting is given in single quotes. Single quotes are used in the same
manner as double quotes. They are printed when typing greeting because the variable is a string.
When you type:
>>> print(greeting)
Hello World!

The quotes are stripped out as part of the function print.


What happens when you type:
>>> print("greeting")

What happened?
Multiple variables and text can be printed using comma separation.
>>> x=2
>>> y=5
>>> print("values: ",x," ",y," ",x/y)

String formatting
Strings can be formatted and used in print statements.

>>> pi = 3.14159265358979323
>>> print("pi is %f "% pi ,"gives: ", "pi is %.2f" % pi)
pi is 3.141593 gives: pi is 3.14
>>> print("pi is %e" % pi)
pi is 3.141593e+00
>>> print("{0} is a {1}".format('this', 'test'))
this is a test
>>> print("{pos1} is a {pos2}".format(pos1 = 'this', pos2 = 'another test'))
this is a another test
>>> print("{pos1} is a {pos2}i of pi: {0}".format(pi,pos1 = 'this', pos2 = 'another test'))
this is a another testi of pi: 3.141592653589793

In the above examples f gives fixed point notation and e exponential notation. The values to
print are substituted where a % sign or a pair of braces appears in the string.
>>> x,y=12,4.2
>>> ("%.2f" % (x/y))
'2.86'
>>> ('{0:0.2f}'.format(x/y))
'2.86'


For additional formatting options see:
http://docs.python.org/3/library/string.html
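
The two formatting styles can be compared side by side. This is only a sketch using the x and y values from above; the width specifier in the last line (>8.3f, right-aligned in 8 characters) is an extra illustration that does not appear in the examples above.

```python
x, y = 12, 4.2
ratio = x / y

old_style = "%.2f" % ratio              # % operator, two decimal places
new_style = "{0:0.2f}".format(ratio)    # format method, same result
sci = "%e" % ratio                      # exponential notation
padded = "{0:>8.3f}".format(ratio)      # right-aligned in a field 8 characters wide

print(old_style, new_style, sci)
print("[" + padded + "]")
```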

Input function
The input function can be used for making a script interactive with the user.
>>> input("Input data here: ")
Input data here:           #this is the user prompt for data entry
#if you type 100 here the result will be returned as:
'100'

The result can be passed to a variable as well.

>>> xx=input("Input data here: ")

Note the prompt given to input is a string, which can be formatted as discussed above.



Type conversion
What is the type of the variable xx above? In Python 3 all results from the input function are type str.
Since the input you are asking for is sometimes a number you must convert the string to a number
using:

>>> x1=int(input("Input data here: "))

>>> str1=str(input("Input string here: "))
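
When you do not know in advance whether the user will type an integer or a decimal number, one possible approach is sketched below. The helper name to_number is hypothetical, and it uses try/except, which is not covered in this handout: int() raises a ValueError for text such as "3.14", and the function then falls back to float().

```python
def to_number(text):
    """Convert text to an int if possible, otherwise to a float."""
    try:
        return int(text)
    except ValueError:
        return float(text)

# Strings as they would come back from input()
print(to_number("100"), type(to_number("100")))
print(to_number("3.14"), type(to_number("3.14")))
```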

Functions
Functions can be declared to isolate steps and simplify the main code. The following is a sample
program showing the general format for a python script. You should insert the text into a file and
test it.
#!/opt/local/python3.x
#
#fitch:20120305:test1.py: python example fnc code
#Usage: ./test1.py
def square(inp1):
    """ what the function does """
    out1=inp1**2
    return(out1)
def cube(inp2):
    """ what the function does """
    out2=inp2**3
    return(out2)
#Input data
str1=input("Please input a number: ")
num_str=float(str1)
#calculation
ans1=square(num_str)
ans2=cube(num_str)
print("The value squared is: ",ans1)
print("The value cubed is: ",ans2)

Import a module
One of the powers of python is the simplicity of adding modules and functions. A module consists of
a set of related functions defined in a *.py file. For example a math module you can imagine would
consist of several basic math functions (log, cosine, sine, square, exponential, etc). Another module
might consist of statistics (number of points, maximum, minimum, average, standard deviation, etc).
Modules have the feature they can be added to the available built in functions as needed. The way to
do this is through import.
>>> x=2
>>> log(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'log' is not defined

What happened? The function log is not a built in command. It is part of the math module. You must
import the math module before you can use its functions.
>>> import math
>>> log(x)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'log' is not defined

What happened? Importing the math module makes the module available, but log is still not a built in
command. To use it you must specify the module the function belongs to.
>>> math.log(x)
0.6931471805599453

math.log(x) returns the logarithm of base e. Base 10 logarithms can be calculated in two ways.
Try to find the two different ways using the help function.
help() then type math then search on log
help("math")
help("math.log")
There are two other ways you can import functions from a module.
from math import *

This syntax allows all functions within the math module to be accessible via the function name only.
To find a logarithm you do not need to type math.log you can now just type:
>>> log(x)
0.6931471805599453

A single function from the math module can also be imported. In this case type:
from math import log

This statement will only make the log function available from the math module.
NOTE: If functions in different modules have duplicate names but different algorithms you should tread cautiously.
For example, the power function, pow, is a built-in as well as having a counterpart defined in
the math module. The built-in version will work on integers without conversion to floats. The
math module first converts integers to floats then calculates the power expression using a
different algorithm. The conversion adds time to the function so if you are just calculating
powers of integers the built-in version will be faster.
If you type import math you will have both pow and math.pow available. If you type
from math import * you will overwrite the built-in function pow with the math library version,
something you may or may not wish to do.
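
The difference between the two versions can be seen in the type and precision of the result. This is a small sketch; the variable names are arbitrary.

```python
import math

builtin_result = pow(2, 10)      # built-in: integers in, integer out
math_result = math.pow(2, 10)    # math module: converts to float first

big_exact = pow(3, 40)           # exact, integers have arbitrary precision
big_float = math.pow(3, 40)      # float approximation of the same value

print(builtin_result, math_result)
print(big_exact, big_float)
```

For large integer powers the float result can no longer represent every digit, which is another reason to keep the built-in pow reachable.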

Additional modules
Two other modules that you will find useful are the sys and os modules. The sys module gives access
to interpreter settings (the module search path sys.path, for instance) and other interpreter data;
environment variables such as PATH are available through os.environ. The os
module gives you quick and easy file manipulation ability within python.
>>> import os
>>> print(os.getcwd())
/Users/fitch/CODE/COMP_LAB_PYTHON

Other functions can be found using the help() function.


How might you list the contents of a directory?
Search help(os) for "directory".

Try:
>>> seq=['G', 'G', 'C', 'C', 'T', 'T', 'C', 'T', 'C', 'G', 'A', 'A', 'T', 'G', 'A', 'A', 'T', 'C']
>>> sep=''
>>> sep.join(seq)
'GGCCTTCTCGAATGAATC'

For loop
For loops are used in the same manner as in any other language. In python a for loop is implemented
with the syntax
>>> for i in list:

In the os module a function listdir will return a list consisting of the filenames in the directory
argument. Type the following at the python command line:
>>> f=os.listdir(os.getcwd())
>>> print(f)
['.dir_list.py.swp', '__pycache__', 'dir_list.py', 'fnc.py',
'humansize.py', 'humansize_inp.py', 'quad.py', 't.py', 'test.py',
'test2.py']
>>> for f in os.listdir(os.getcwd()):
...     print(f)          # be certain to indent here !
...
.dir_list.py.swp          # current vi session
__pycache__               # python storage of compiled python scripts (binary) for cross platform use
dir_list.py
fnc.py
humansize.py
humansize_inp.py
quad.py


In this case os.listdir returns a standard list, and the loop passes one entry at a time to print.
Thus, only the entries are printed, without the brackets and quotes.
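
The same loop pattern works on any list. As a sketch, the loop below walks through a short nucleotide sequence and counts the G and C entries. The if test inside the loop and the names gc and gc_fraction are additions for illustration.

```python
seq = ['G', 'G', 'C', 'C', 'T', 'T', 'C', 'T', 'C',
       'G', 'A', 'A', 'T', 'G', 'A', 'A', 'T', 'C']

gc = 0
for base in seq:
    # the loop body must be indented; the if body is indented once more
    if base == 'G' or base == 'C':
        gc += 1

gc_fraction = gc / len(seq)
print(gc, len(seq), gc_fraction)
```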

TRY IT
Using a for loop print the individual variables within your current python session.

for i in dir():
    print(i)
EXTEND LATER: to not list attributes (i.e. if the first two characters are __ don't print)

Recursion
A recursive algorithm calls itself.
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)

A recursive algorithm must have a termination condition


n == 0

And a reduction step where the function calls itself


factorial(n - 1)


Another example:
def recursive(string, num):
    print("#%s - %s" % (string, num))
    recursive(string, num+1)

Also worth noting, python by default has a limit to the depth of recursion available, to avoid
absorbing all of the computer's memory. On my computer this is 1000. I don't know if this changes
depending on hardware, etc. To see yours :
import sys
sys.getrecursionlimit()

and to set it:

import sys #(if you haven't already)
sys.setrecursionlimit(n) #n is the new limit (an integer)
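
To see both ideas together, the sketch below defines one function with a termination condition and one without. The function names countdown and runaway are made up for this example, and RecursionError is the exception Python 3.5 and later raise when the limit is reached.

```python
import sys

def countdown(num):
    """Return the list [num, num-1, ..., 0], built recursively."""
    if num == 0:                        # termination condition
        return [0]
    return [num] + countdown(num - 1)   # reduction step

def runaway(num):
    """No termination condition: recursion goes on until Python stops it."""
    return runaway(num + 1)

limit = sys.getrecursionlimit()
print(countdown(3), limit)

try:
    runaway(0)
    stopped = False
except RecursionError:
    stopped = True

print("stopped by the recursion limit:", stopped)
```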
