Perl and Bioperl

CONTROL STRUCTURES
 “if” statement - first style

 if ($porridge_temp < 40) {
    print "too cold.\n";
}
elsif ($porridge_temp > 150) {
    print "too hot.\n";
}
else {
    print "just right\n";
}
CONTROL STRUCTURES
 “if” statement - second style

 statement if condition;
 print "\$index is $index" if $DEBUG;
 Single statements only
 Simple expressions only
 “unless” is a reverse “if”
 statement unless condition;
 print "millennium is here!" unless $year < 2000;
CONTROL STRUCTURES
 “for” loop - first style

 for (initial; condition; increment) { code }
 for ($i=0; $i<10; $i++) {
    print "hello\n";
}
 “for” loop - second style
 for [variable] (range) { code }
 for $name (@employees) {
    print "$name is an employee.\n";
}
THE FOR STATEMENT
 Syntax
for (START; STOP; ACTION) { BODY }
 Initially execute START statements once.
 Repeatedly execute BODY until STOP is false.
 Execute ACTION after each iteration.
 Example
for ($i=0; $i<10; $i++) {
    print("Iteration: $i\n");
}
THE FOREACH STATEMENT
 Syntax
foreach SCALAR ( ARRAY ) { BODY }
 Assign ARRAY element to SCALAR.
 Execute BODY.
 Repeat for each element in ARRAY.
 Example
@asTmp = qw(One Two Three);
foreach $s (@asTmp){$s .= "sy ";}
print(@asTmp); # Onesy Twosy Threesy
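The example above changes the array because the foreach loop variable is an alias for each element, not a copy. A minimal sketch of that behaviour follows; the array names are illustrative only.

```perl
use strict;
use warnings;

# The loop variable aliases each element, so changes write through.
my @words = qw(One Two Three);
foreach my $s (@words) {
    $s .= "sy";
}

# Copying the element first leaves the array untouched.
my @untouched = qw(One Two Three);
foreach my $s (@untouched) {
    my $copy = $s;
    $copy .= "sy";
}

print "@words\n";       # Onesy Twosy Threesy
print "@untouched\n";   # One Two Three
```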
CONTROL STRUCTURES
 “while” loop
 while (condition) { code }
 $cars = 7;
while ($cars > 0) {
    print "cars left: ", $cars--, "\n";
}
 while ($game_not_over) {…}
CONTROL STRUCTURES
 “until” loop is opposite of “while”

 until (condition) { code }
 $cars = 7;
until ($cars <= 0) {
    print "cars left: ", $cars--, "\n";
}
 until ($game_over) {…}
CONTROL STRUCTURES
 Bottom-check Loops
 do { code } while (condition);
 do { code } until (condition);
 $value = 0;
do {
    print "Enter Value: ";
    $value = <STDIN>;
} until ($value > 0);
SUBROUTINES (FUNCTIONS)
 Defining a Subroutine
 sub name { code }
 Arguments passed in via “@_” list
 sub multiply {
my ($a, $b) = @_;
return $a * $b;
}
 Last value processed is the return value
(could have left out word “return”, above)
 Calling a Subroutine
 subname; # no args, no return value
 subname (args);
 $retval = &subname (args);
 The “&” is optional so long as…
 subname is not a reserved word
 subroutine was defined before being called
 Passing Arguments
 Passes the value
 Lists are expanded
 @a = (5,10,15);
@b = (20,25);
&mysub(@a,@b);
 this passes five arguments: 5,10,15,20,25
 mysub can receive them as 5 scalars, or one array
 Examples
 sub good1 {
my($a,$b,$c) = @_;
}
&good1 (@triplet);
 sub good2 {
my(@a) = @_;
}
&good2 ($one, $two, $three);
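The list expansion described above can be checked directly. This is a small sketch only: count_args and first_three are hypothetical helpers invented for illustration, and the values reuse the @a and @b lists from the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments arrived in @_.
sub count_args {
    return scalar @_;
}

# Hypothetical helper: receives the flattened list as individual scalars.
sub first_three {
    my ($x, $y, $z) = @_;
    return $x + $y + $z;
}

my @a = (5, 10, 15);
my @b = (20, 25);

print count_args(@a, @b), "\n";   # 5: both arrays were expanded into one list
print first_three(@a, @b), "\n";  # 30: the 4th and 5th arguments are ignored
```

Because the two arrays arrive as one flat list, the subroutine cannot tell where @a ended and @b began.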
DEALING WITH HASHES
 keys( ) - get an array of all keys

 foreach (keys (%hash)) { … }
 values( ) - get an array of all values
 @array = values (%hash);
 each( ) - get key/value pairs
 while (@pair = each(%hash)) {
    print "element $pair[0] has $pair[1]\n";
}
DEALING WITH HASHES
 exists( ) - check if element exists

 if (exists $hash{$key}) { … }
 delete( ) - delete one element
 delete $hash{$key};
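The hash functions from the last two slides can be combined in one short script. The codon table here is made-up illustration data, not something from the slides.

```perl
use strict;
use warnings;

# Illustrative data: codon => amino acid
my %codon = (ATG => 'Met', TGG => 'Trp', TAA => 'Stop');

# keys() returns the keys in no defined order, so sort for stable output.
foreach my $key (sort keys %codon) {
    print "$key => $codon{$key}\n";
}

my $had_stop = exists $codon{TAA} ? 1 : 0;   # 1
delete $codon{TAA};
my $has_stop = exists $codon{TAA} ? 1 : 0;   # 0
my $count    = scalar keys %codon;           # 2

print "before: $had_stop, after: $has_stop, remaining: $count\n";
```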
OTHER USEFUL FUNCTIONS
 push( ), pop( )- stack operations on lists

 shift( ),unshift( ) - bottom-based ops
 split( ) - split a string by separator

 @parts = split(/:/,$passwd_line);
 while (split) … # like: split (/\s+/, $_)
 splice( ) - remove/replace elements
 substr( ) - substrings of a string
STRING MANIPULATION
 chop
 chop(VARIABLE)
 chop(LIST)
 index(STR, SUBSTR, POSITION)
 index(STR, SUBSTR)
 length(EXPR)
STRING MANIPULATION (CONT.)
 substr(EXPR, OFFSET, LENGTH)

 substr(EXPR, OFFSET)
 Example: string.pl
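The file string.pl is not reproduced here. As a stand-in, the following sketch exercises the functions listed above on a made-up DNA string; the variable names are assumptions.

```perl
use strict;
use warnings;

my $dna = "ATGGGTACC";

my $len  = length($dna);         # 9
my $pos  = index($dna, "GGT");   # 3 (offsets start at 0)
my $none = index($dna, "TTT");   # -1 when the substring is not found
my $head = substr($dna, 0, 3);   # "ATG"
my $tail = substr($dna, 6);      # "ACC" (no LENGTH: runs to end of string)

# chop removes the last character, whatever it is.
my $line = "ACGT\n";
chop($line);                     # "ACGT"

print "$len $pos $none $head $tail $line\n";
```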
PATTERN MATCHING
 See if strings match a certain pattern

 syntax: string =~ pattern
 Returns true if it matches, false if not.
 Example: match “abc” anywhere in string:
 if ($str =~ /abc/) { … }
 But what about complex concepts like:
 between 3 and 5 numeric digits
 optional whitespace at beginning of line
PATTERN MATCHING
 Regular Expressions are a way to describe character patterns in a string
 Example: match “john” or “jon”
 /joh?n/
 Example: match money values
 /\$\d+\.\d\d/
 Complex Example: match times of the day
 /\d?\d:\d\d(:\d\d)? (AM|PM)?/i
PATTERN MATCHING
 Symbols with Special Meanings

 period . - any single character
 char set [0-9a-f] - one char matching these
 Abbreviations
 \d - a numeric digit [0-9]

 \w - a word character [A-Za-z0-9_]
 \s - whitespace char [ \t\n\r\f]
 \D, \W, \S - any character but \d, \w, \s
 \n, \r, \t - newline, carriage-return, tab
 \f, \e - formfeed, escape
 \b - word break
PATTERN MATCHING
 Symbols with Special Meanings

 asterisk * - zero or more occurrences
 plus sign + - one or more occurrences
 question mark ? - zero or one occurrences
 caret ^ - anchor to beginning of line
 dollar sign $ - anchor to end of line
 quantity {n,m} - between n and m occurrences (inclusively)
 [A-Z]{2,4} means “2, 3, or 4 uppercase letters”.
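With these symbols, the two "complex concepts" raised earlier (between 3 and 5 numeric digits, optional whitespace at the beginning of a line) can be written down. This is one possible sketch; the helper names are hypothetical, and anchoring the digit pattern to the whole string is an assumption about what is wanted.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the whole string is 3 to 5 digits.
sub is_3_to_5_digits {
    my ($str) = @_;
    return $str =~ /^\d{3,5}$/ ? 1 : 0;
}

# Hypothetical helper: returns the first word on a line,
# allowing optional whitespace at the beginning of the line.
sub first_word {
    my ($line) = @_;
    return $line =~ /^\s*(\w+)/ ? $1 : undef;
}

print is_3_to_5_digits("1234"), "\n";      # 1
print is_3_to_5_digits("12"), "\n";        # 0
print first_word("   gene42 start"), "\n"; # gene42
```

Without the ^ and $ anchors, /\d{3,5}/ would also match inside a longer run of digits such as "123456".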
PATTERN MATCHING
 Ways of Using Patterns

 Matching
 if ($line =~ /pattern/) { … }
 also written: m/pattern/
 Substitution
 $name =~ s/ASU/Arizona State University/;
 Translation
 $command =~ tr/A-Z/a-z/; # lowercase it
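A short sketch of the three uses side by side, with made-up input strings. The last line relies on tr returning the number of characters it matched, which makes it a handy counter.

```perl
use strict;
use warnings;

# Matching: true or false
my $line = "status: ok";
my $matched = ($line =~ /ok/) ? 1 : 0;            # 1

# Substitution: copy first, then modify the copy
my $name = "ASU";
(my $full = $name) =~ s/ASU/Arizona State University/;

# Translation: character-by-character mapping
my $command = "LS -L";
(my $lower = $command) =~ tr/A-Z/a-z/;            # "ls -l"

# tr with an empty replacement list counts without changing the string
my $seq = "AACGTTA";
my $a_count = ($seq =~ tr/A//);                   # 3

print "$matched | $full | $lower | $a_count\n";
```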
COMMAND LINE ARGS
 $0 = program name
 @ARGV array of arguments to program
 zero-based index (default for all arrays)
 Example
 yourprog -a somefile
 $0 is “yourprog”
 $ARGV[0] is “-a”
 $ARGV[1] is “somefile”
BASIC FILE I/O
 Reading a File
 open (FILEHANDLE, "$filename") || die "open of $filename failed: $!";
while (<FILEHANDLE>) {
    chomp $_; # or just: chomp; (unlike chop, removes only a trailing newline)
    print "$_\n";
}
close FILEHANDLE;
BASIC FILE I/O
 Writing a File
 open (FILEHANDLE, ">$filename") || die "open of $filename failed: $!";
foreach (@data) {
    print FILEHANDLE "$_\n";
    # note, no comma!
}
close FILEHANDLE;
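The same write-then-read cycle can also be written with lexical filehandles and the three-argument form of open, a style common in newer Perl code. This is a sketch under assumptions: the scratch file from File::Temp and the sample @data are illustrative and not part of the slides.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my @data = ('first line', 'second line');

# tempfile() gives a scratch file so the sketch does not touch real data.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close $tmp_fh;

# Writing: the mode '>' is a separate argument from the file name.
open(my $out, '>', $filename) or die "open of $filename failed: $!";
print $out "$_\n" foreach @data;    # still no comma after the filehandle
close $out;

# Reading the file back, one line at a time.
open(my $in, '<', $filename) or die "open of $filename failed: $!";
my @lines;
while (my $line = <$in>) {
    chomp $line;
    push @lines, $line;
}
close $in;

print "$_\n" foreach @lines;
```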
BASIC FILE I/O
 Predefined File Handles

 <STDIN> input
 <STDOUT> output
 <STDERR> output
 print STDERR "big bad error occurred\n";
 <> ARGV or STDIN
READING WITH <>
 Reading from File

 $input = <MYFILE> ;
 Reading from Command Line
 $input = <> ;
 Reading from Standard Input
 $input = <> ;
 $input = <STDIN> ;
READING WITH <> (CONT.)
 Reading into Array Variable

 @an_array = <MYFILE> ;
 @an_array = <STDIN> ;
 @an_array = <> ;
PACKAGES
 Collect data & functions in a separate (“private”) namespace
 Reusable code
PACKAGES
 Access packages by file name or path:

 require "getopts.pl";
 require "/usr/local/lib/perl/getopts.pl";
 require "../lib/mypkg.pl";
PACKAGES
 Command: package pkgname;

 Stays in effect until next “package” or end of block { … } or
end of file.
 Default package is “main”
PACKAGES
 Package name in variables

 $pkg::counter = 0;
 Package name in subroutines
 sub pkg::mysub ( ) { … }
 &pkg::mysub($stuff);
 Old syntax in Perl 4
 sub pkg'mysub ( ) { … }
PACKAGES
#
# Get Day Of Month Package
#
package getDay;
sub main::getDayOfMonth {
    my ($sec, $min, $hour, $mday) = localtime;
return $mday;
}
1; # otherwise “require” or “use” would fail
PACKAGES
 Calling the package

 require "/path/to/getDay.pl";
$day = &getDayOfMonth;
 In Perl 5, you can leave off “&” for previously defined
functions:
 $day = getDayOfMonth;
WHAT ARE PERL MODULES?
 Modules are collections of subroutines
 Encapsulate code for a related set of processes
 End in .pm so Foo.pm would be used as Foo
 Can form the basis for Objects in Object Oriented programming
USING A SIMPLE MODULE
 List::Util is a set of list utility functions
 Read the perldoc to see what you can do
 Follow the synopsis or individual function examples

LIST::UTIL
# First script: call the function by its full package name
use List::Util;
my @list = 10..20;
my $sum = List::Util::sum(@list);
print "sum (@list) is $sum\n";

# Second script: import the functions by name
use List::Util qw(shuffle sum);
my @list = (10,10,12,11,17,89);
my $sum = sum(@list);
print "sum (@list) is $sum\n";
my @shuff = shuffle(@list);
print "shuff is @shuff\n";
MODULE NAMING
 Module naming helps identify the purpose of the module
 The symbol :: separates the parts of a module name; the parts map directly to a directory structure
 List::Util is therefore a module called Util.pm located in a directory called ‘List’
(MORE) MODULE NAMING
 Does not require inheritance or specific relationship
between modules that all start with the same directory
name
 Case MaTTerS! List::util will not work
 Read more about a module by doing “perldoc Modulename”
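The name-to-path mapping can be sketched in a few lines. module_to_path is a hypothetical helper written for this illustration; the %INC hash it consults is Perl's own record of which file each loaded module came from.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: turn a module name into the relative path
# that Perl looks for under each include directory.
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

my $rel = module_to_path('List::Util');
print "$rel\n";                        # List/Util.pm

# %INC maps the relative path of each loaded module to the file used.
print "loaded from: $INC{$rel}\n";
```

The path printed by the last line depends on the local Perl installation.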
MODULES AS OBJECTS
 Modules are collections of subroutines
 Can also manage data (aka state)
 Multiple instances can be created (instantiated)
 Can access module routines directly on object

OBJECT CREATION
 To instantiate a module call ‘new’
 Sometimes there are initialization values
 Objects are registered for cleanup when they are set to undefined (or when they go out of scope)
 Methods are called using -> because we are dereferencing the object.
SIMPLE MODULE AS OBJECT EXAMPLE
#!/usr/bin/perl -w
use strict;
use MyAdder;
my $adder = new MyAdder;

$adder->add(10);
print $adder->value, "\n";
$adder->add(10);
print $adder->value, "\n";
my $adder2 = new MyAdder(12);

$adder2->add(17);
print $adder2->value, "\n";
my $adder3 = MyAdder->new(75);
$adder3->add(7);
print $adder3->value, "\n";
WRITING A MODULE: INSTANTIATION
 Starts with package to define the module name
 multiple packages can be defined in a single module file -
but this is not recommended at this stage
 The method name new is usually used for instantiation
 bless is used to associate a data structure with a package, which turns it into an object
WRITING A MODULE: SUBROUTINES
 The first argument to a subroutine called as a method is always a reference to the object - we usually call it ‘$self’ in the code.
 This is an implicit aspect of Object-Oriented Perl
 Write subroutines just like normal, but data associated with the object can be accessed through the $self reference.
WRITING A MODULE
package MyAdder;
use strict;
sub new {
    my ($package, $val) = @_;
    $val ||= 0;
    my $obj = bless { 'value' => $val}, $package;
    return $obj;
}
sub add {
    my ($self,$val) = @_;
    $self->{'value'} += $val;
}
sub value {
    my $self = shift;
    return $self->{'value'};
}
1; # a module file must return a true value
WRITING A MODULE II (ARRAY)
package MyAdder;
use strict;
sub new {
my ($package, $val) = @_;
$val ||= 0;
my $obj = bless [$val], $package;
return $obj;
}
sub add {
my ($self,$val) = @_;
$self->[0] += $val;
}
sub value {
my $self = shift;
return $self->[0];
}
1; # a module file must return a true value
USING THE MODULE
 Perl has to know where to find the module

 Uses a set of include paths
 type perl -V and look at the @INC variable
 Can also add to this path with the PERL5LIB
environment variable
 Can also specify an additional library path in the script:
 use lib '/path/to/lib';
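A quick way to see the effect of use lib is to print @INC from inside a script. In this sketch '/path/to/lib' is the placeholder path from the slide, not a real directory; the remaining entries depend on the local installation.

```perl
use strict;
use warnings;
use lib '/path/to/lib';   # placeholder path; goes to the front of @INC

# Perl searches these directories, in order, when it meets "use" or "require".
print "$_\n" for @INC;
```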
USING A MODULE AS AN OBJECT
 LWP is a perl library for WWW processing
 Will initialize an ‘agent’ to go out and retrieve web pages
for you
 Can be used to process the content that it downloads
LWP::USERAGENT
#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
my $url = 'http://us.expasy.org/uniprot/P42003.txt';
my $ua = LWP::UserAgent->new(); # initialize an object
$ua->timeout(10); # set the timeout value
my $response = $ua->get($url);
if ($response->is_success) {
# print $response->content; # or whatever
if( $response->content =~ /DE\s+(.+)\n/ ) {
print "description is '$1'\n";
}
if( $response->content =~ /OS\s+(.+)\n/ ) {
print "species is '$1'\n";
}
}
else {
die $response->status_line;
}
OVERVIEW OF BIOPERL TOOLKIT
 Bioperl is...
 A set of Perl modules for manipulating genomic and other biological data
 An Open Source Toolkit with many contributors
 A flexible and extensible system for doing bioinformatics
data manipulation
SOME THINGS YOU CAN DO
 Read in sequence data from a file in standard formats
(FASTA, GenBank, EMBL, SwissProt,...)
 Manipulate sequences, reverse complement, translate
coding DNA sequence to protein.
 Parse a BLAST report, get access to every bit of data in
the report
 Dr. Mikler will post some detailed tutorials
MAJOR DOMAINS COVERED
 Sequences, Features, Annotations,

 Pairwise alignment reports
 Multiple Sequence Alignments
 Bibliographic data
 Graphical Rendering of sequence tracks
 Database for features and sequences

ADDITIONAL DOMAINS
 Gene prediction parsers

 Trees, Parsing Phylogenetic and Molecular Evolution
software output
 Population Genetic data and summary statistics
 Taxonomy
 Protein Structure
SEQUENCE FILE FORMATS
 Simple formats - without features

 FASTA (Pearson), Raw, GCG
 Rich Formats - with features and annotations
 GenBank, EMBL
 Swissprot, GenPept
 XML - BSML, GAME, AGAVE, TIGRXML, CHADO
PARSING SEQUENCES
 Bio::SeqIO
 multiple drivers: genbank, embl, fasta,...
 Sequence objects
 Bio::PrimarySeq
 Bio::Seq
 Bio::Seq::RichSeq
LOOK AT THE SEQUENCE OBJECT
 Common (Bio::PrimarySeq) methods

 seq() - get the sequence as a string
 length() - get the sequence length
 subseq($s,$e) - get a subsequence
 translate(...) - translate to protein [DNA]
 revcom() - reverse complement [DNA]
 display_id() - identifier string
 description() - description string
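To make clear what two of these methods compute, here is a core-Perl sketch on plain strings. It is not BioPerl's implementation: revcom_str and subseq_str are hypothetical stand-ins, and the sketch handles only unambiguous A/C/G/T bases.

```perl
use strict;
use warnings;

# Hypothetical stand-in for revcom(): reverse the string, then complement it.
sub revcom_str {
    my ($dna) = @_;
    my $rc = reverse $dna;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Hypothetical stand-in for subseq(): coordinates are 1-based and inclusive,
# while substr() is 0-based and takes a length.
sub subseq_str {
    my ($dna, $start, $end) = @_;
    return substr($dna, $start - 1, $end - $start + 1);
}

my $dna = 'ATGGGTA';
print revcom_str($dna), "\n";        # TACCCAT
print subseq_str($dna, 4, 5), "\n";  # GG
```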
DETAILED LOOK AT SEQS WITH
ANNOTATIONS
 Bio::Seq objects have the methods

 add_SeqFeature($feature) - attach feature(s)
 get_SeqFeatures() - get all the attached features.
 species() - a Bio::Species object
 annotation() - Bio::Annotation::Collection
FEATURES
 Bio::SeqFeatureI - interface
 Bio::SeqFeature::Generic - basic implementation
 SeqFeature::Similarity - some score info
 SeqFeature::FeaturePair - pair of features

SEQUENCE FEATURES
 Bio::SeqFeatureI - interface - GFF derived
 start(), end(), strand() for location information
 location() - Bio::LocationI object (to represent complex
locations)
 score,frame,primary_tag, source_tag - feature information
 spliced_seq() - for attached sequence, get the sequence
spliced.
SEQUENCE FEATURE (CONT.)
 Bio::SeqFeature::Generic
 add_tag_value($tag,$value) - add a tag/value pair
 get_tag_value($tag) - get all the values for this tag
 has_tag($tag) - test if a tag exists
 get_all_tags() - get all the tags
ANNOTATIONS
 Each Bio::Seq has a Bio::Annotation::Collection via

$seq->annotation()
 Annotations are stored with keys like ‘comment’ and ‘reference’
 @com=$annotation->get_Annotations('comment')
 $annotation->add_Annotation('comment',$an)
ANNOTATIONS
 Annotation::Comment
 comment field
 Annotation::Reference
 author,journal,title, etc
 Annotation::DBLink
 database,primary_id,optional_id,comment
 Annotation::SimpleValue
CREATE A SEQUENCE OUT OF THIN AIR
use Bio::Seq;
my $seq = Bio::Seq->new(-seq => 'ATGGGTA',
                        -display_id => 'MySeq',
                        -description => 'a description');
print "bases 4 to 5 are ", $seq->subseq(4,5), "\n";
print "my whole sequence is ",$seq->seq(), "\n";
print "reverse complement is ",
    $seq->revcom->seq(), "\n";
READING IN A SEQUENCE
use Bio::SeqIO;
my $in = Bio::SeqIO->new(-format => 'genbank',
                         -file => 'file.gb');
while( my $seq = $in->next_seq ) {
    print "sequence name is ", $seq->display_id,
        " length is ",$seq->length,"\n";
    print "there are ",(scalar $seq->get_SeqFeatures),
        " features attached to this sequence and ",
        scalar $seq->annotation->get_Annotations('reference'),
        " reference annotations\n";
}
WRITING A SEQUENCE
use Bio::SeqIO;
# Let's convert swissprot to fasta format
my $in = Bio::SeqIO->new(-format => 'swiss',
                         -file => 'file.sp');
my $out = Bio::SeqIO->new(-format => 'fasta',
                          -file => '>file.fa');
while( my $seq = $in->next_seq ) {
    $out->write_seq($seq);
}
A DETAILED LOOK AT BLAST PARSING
 3 Components
 Result: Bio::Search::Result::ResultI
 Hit: Bio::Search::Hit::HitI
 HSP: Bio::Search::HSP::HSPI
BLAST PARSING SCRIPT
use Bio::SearchIO;
my $cutoff = '0.001';
my $file = 'BOSS_Ce.BLASTP';
my $in = new Bio::SearchIO(-format => 'blast',
                           -file => $file);
while( my $r = $in->next_result ) {
    print "Query is: ", $r->query_name, " ",
        $r->query_description," ",$r->query_length," aa\n";
    print " Matrix was ", $r->get_parameter('matrix'), "\n";
    while( my $h = $r->next_hit ) {
        last if $h->significance > $cutoff;
        print "Hit is ", $h->name, "\n";
        while( my $hsp = $h->next_hsp ) {
            print " HSP Len is ", $hsp->length('total'), " ",
                " E-value is ", $hsp->evalue, " Bit score ",
                $hsp->score, " \n",
                " Query loc: ",$hsp->query->start, " ",
                $hsp->query->end," ",
                " Subject loc: ",$hsp->hit->start, " ",
                $hsp->hit->end,"\n";
        }
    }
}
BLAST Report
Copyright (C) 1996-2000 Washington University, Saint Louis, Missouri USA.
All Rights Reserved.
Reference: Gish, W. (1996-2000) http://blast.wustl.edu
Query= BOSS_DROME Bride of sevenless protein precursor.

(896 letters)
Database: wormpep87
20,881 sequences; 9,238,759 total letters.
Searching....10....20....30....40....50....60....70....80....90....100% done
                                                                    Smallest
                                                                      Sum
                                                             High  Probability
Sequences producing High-scoring Segment Pairs:             Score  P(N)      N
F35H10.10 CE24945 status:Partially_confirmed TR:Q20073...     182  4.9e-11   1
M02H5.2 CE25951 status:Predicted TR:Q966H5 protein_id:...      86  0.15      1
ZC506.4 CE01682 locus:mgl-1 metatrophic glutamate recept...    91  0.18      1
……
USING THE SEARCH::RESULT OBJECT
use Bio::SearchIO;
use strict;
my $parser = new Bio::SearchIO(-format => 'blast', -file => 'file.bls');
while( my $result = $parser->next_result ){
    print "query name=", $result->query_name, " desc=",
        $result->query_description, ", len=",$result->query_length,"\n";
    print "algorithm=", $result->algorithm, "\n";
    print "db name=", $result->database_name, " #lets=",
        $result->database_letters, " #seqs=",$result->database_entries, "\n";
    print "available params ", join(',',
        $result->available_parameters),"\n";
    print "available stats ", join(',',
        $result->available_statistics), "\n";
    print "num of hits ", $result->num_hits, "\n";
}
USING THE SEARCH::HIT OBJECT
use Bio::SearchIO;
use strict;
my $parser = new Bio::SearchIO(-format => 'blast', -file => 'file.bls');
while( my $result = $parser->next_result ){
    while( my $hit = $result->next_hit ) {
        print "hit name=",$hit->name, " desc=", $hit->description,
            "\n len=", $hit->length, " acc=", $hit->accession, "\n";
        print "raw score ", $hit->raw_score, " bits ", $hit->bits,
            " significance/evalue=", $hit->evalue, "\n";
    }
}
TURNING BLAST INTO HTML
use Bio::SearchIO;
use Bio::SearchIO::Writer::HTMLResultWriter;
my $in = new Bio::SearchIO(-format => 'blast',
                           -file => shift @ARGV);
my $writer = new Bio::SearchIO::Writer::HTMLResultWriter();
my $out = new Bio::SearchIO(-writer => $writer,
                            -file => ">file.html");
$out->write_result($in->next_result);
TURNING BLAST INTO HTML
# to filter your output
my $MinLength = 100; # need a variable with scope outside the method
sub hsp_filter {
    my $hsp = shift;
    return 1 if $hsp->length('total') > $MinLength;
}
sub result_filter {
    my $result = shift;
    return $result->num_hits > 0;
}
my $writer = new Bio::SearchIO::Writer::HTMLResultWriter
    (-filters => { 'HSP' => \&hsp_filter} );
my $out = new Bio::SearchIO(-writer => $writer);
$out->write_result($in->next_result);
# can also set the filter via the writer object
$writer->filter('RESULT', \&result_filter);
CUSTOM URL LINKS
@args = ( -nucleotide_url => $gbrowsedblink,
-protein_url => $gbrowsedblink
);
my $processor = new Bio::SearchIO::Writer::HTMLResultWriter(@args);
$processor->introduction(\&intro_with_overview);
$processor->hit_link_desc(\&gbrowse_link_desc);
$processor->hit_link_align(\&gbrowse_link_desc);
sub intro_with_overview {
my ($result) = @_;
my $f = &generate_overview($result,$result->{"_FILEBASE"});
$result->rewind();
return sprintf(
qq{
<center>
Hit Overview 
Score: Red= (>=200), Purple 200-80, Green 80-50, Blue 50-40, Black <40
MULTIPLE SEQUENCE ALIGNMENTS
 Bio::AlignIO to read alignment files
 Produces Bio::SimpleAlign objects
 Interface and objects designed for round-tripping and some functional work
 Could really use an overhaul or a parallel MSA
representation
GETTING SEQUENCES FROM GENBANK
 Through Web Interface Bio::DB::GenBank (don’t
abuse!!)
 Alternative is to download all of genbank, index with
Bio::DB::Flat (will be much faster in long run)
SIMPLE SEQUENCE RETRIEVAL
use Bio::Perl;
my $seq = get_sequence('genbank',$acc);
print "I got a sequence $seq for $acc\n";


SEQUENCE RETRIEVAL SCRIPT
#!/usr/bin/perl -w
use strict;
use Bio::DB::GenPept;
use Bio::DB::GenBank;
use Bio::SeqIO;
my $db = new Bio::DB::GenPept();

# my $db = new Bio::DB::GenBank(); # if you want NT seqs
# use STDOUT to write sequences
my $out = new Bio::SeqIO(-format => 'fasta');
my $acc = 'AB077698';
my $seq = $db->get_Seq_by_acc($acc);
if( $seq ) {
$out->write_seq($seq);
} else {
print STDERR "cannot find seq for acc $acc\n";
}
$out->close();
SEQUENCE RETRIEVAL FROM LOCAL
DATABASE
use Bio::DB::Flat;
my $db = new Bio::DB::Flat(-directory => '/tmp/idx',
                           -dbname => 'swissprot',
                           -write_flag => 1,
                           -format => 'fasta',
                           -index => 'binarysearch');
$db->make_index('/data/protein/swissprot');
my $seq = $db->get_Seq_by_acc('BOSS_DROME');
