BSC 450 CH 3 Notes

BSC 450: Chapter 3

Amino Acids, Peptides, and Proteins

3.1, Amino Acids

Proteins: polymers of amino acids, each amino acid residue joined to its neighbor
by a specific type of covalent bond

Amino Acids Share Common Structural Features


 All 20 of the common amino acids are alpha-amino acids
o Carboxyl group and amino group bonded to the same α-carbon atom
o Differ from each other in their side chains, R groups
 Vary in structure, size, electric charge
 Influence solubility of amino acids in water
o All of the common amino acids except glycine have the α-carbon bound
to four different groups (carboxyl group, amino group, R group, and a
hydrogen atom)
 Glycine: R group is another hydrogen atom
 The alpha carbon is a chiral center
 Tetrahedral arrangement allows the four groups to occupy two
spatial arrangements
 Amino acids have two stereoisomers – enantiomers
 Molecules with a chiral center are optically active – they rotate
the plane of plane-polarized light
 Nonpolar, Aliphatic R Groups
o Glycine (Gly, G)
o Alanine (Ala, A)
o Proline (Pro, P)
o Valine (Val, V)
o Leucine (Leu, L)
o Isoleucine (Ile, I)
o Methionine (Met, M)
 Aromatic R Groups
o Phenylalanine (Phe, F)
o Tyrosine (Tyr, Y)
o Tryptophan (Trp, W)
 Polar, Uncharged R Groups
o Serine (Ser, S)
o Threonine (Thr, T)
o Cysteine (Cys, C)
o Asparagine (Asn, N)
o Glutamine (Gln, Q)
 Positively Charged R Groups
o Lysine (Lys, K)
o Histidine (His, H)
o Arginine (Arg, R)
 Negatively Charged R Groups
o Aspartate (Asp, D)
o Glutamate (Glu, E)
 Additional carbons in an R group are designated beta, gamma, delta, epsilon
proceeding out from the alpha carbon
o As opposed to the number convention in most organic compounds
o In heterocyclic R groups such as histidine, the lettering system is
ambiguous, so the numbering system is used
o For branched amino acid side chains, equivalent carbons are given
numbers after their Greek letters
 Absolute configuration: special nomenclature
o Specified by the D, L system
o L and D refer only to the absolute configuration of the four
constituents around the chiral carbon, not the optical properties
o In a Fischer projection, L-amino acids have the α-amino group on the left
o In a Fischer projection, D-amino acids have the α-amino group on the right
 The RS system can also specify configuration around a chiral center

The Amino Acid Residues in Proteins are L Stereoisomers


 Nearly all biological compounds with a chiral center occur naturally in only
one stereoisomeric form, either L or D, not both
 Amino acid residues in proteins are exclusively L stereoisomers
o Very few D-amino acid residues have been found
 The formation of stable, repeating substructures in proteins requires their
constituent amino acids be of one stereochemical series
 Cells can synthesize the L isomer because the active sites of enzymes are
asymmetric
o The reactions they catalyze are stereospecific
Amino Acids Can Be Classified by R Group
 Five main classes based on R group properties
o Polarity or tendency to interact with water at biological pH (7)
o Varies from nonpolar and hydrophobic (water-insoluble) to polar and
hydrophilic (water-soluble)

Nonpolar, Aliphatic R Groups


 Nonpolar and hydrophobic
 Alanine, valine, leucine, and isoleucine
o Cluster together in proteins, stabilizing structure through the
hydrophobic effect
 Glycine
o Simplest structure
o Often grouped with nonpolar amino acids, but its very small side
chain makes no real contribution to the hydrophobic effect
 Methionine
o One of the two sulfur-containing amino acids
o Slightly nonpolar thioether group on its side chain
 Proline
o Aliphatic side chain with cyclic structure
o The secondary amino (imino) group of proline residues is held in a rigid
conformation that reduces the structural flexibility of polypeptide regions
containing proline

Aromatic R Groups
 Phenylalanine, Tyrosine, Tryptophan
o Aromatic side chains make them relatively nonpolar (hydrophobic)
o Can contribute to the hydrophobic effect
o Tyrosine’s hydroxyl group can form hydrogen bonds
o Tyrosine and tryptophan are more polar than phenylalanine because of the
hydroxyl group and the nitrogen of the indole ring, respectively
o Tryptophan and tyrosine (and, to a much lesser extent, phenylalanine)
absorb UV light
 Characteristic strong absorbance of light by most proteins at a
wavelength of 280 nm

Polar, Uncharged R Groups


 More soluble in water, more hydrophilic
 Contain functional groups that hydrogen bond with water
 Serine, Threonine, Cysteine, Asparagine, and Glutamine
 Polarity of serine and threonine contributed by OH groups
 Polarity of asparagine and glutamine contributed by amide groups
 Cysteine’s polarity contributed by sulfhydryl group is modest
o A weak acid that can form weak hydrogen bonds with O or N
 Asparagine and glutamine are the amides of two other amino acids
o Aspartate and glutamate
o They are easily hydrolyzed by acid or base to aspartate and glutamate
 Cysteine is readily oxidized to form a covalently linked dimeric amino acid
called cystine (two cysteine residues/molecules joined by disulfide bond)
o The residues become strongly hydrophobic (nonpolar)
o Disulfide bonds are important in proteins by forming covalent links
between parts of polypeptide molecules or between chains
Positively Charged (Basic) R Groups
 Very hydrophilic (most hydrophilic are those with + or – charge)
 R groups have significant positive charge at pH of 7
 Lysine
o Second primary amino group at the epsilon position on its aliphatic
chain
 Arginine
o Positively charged guanidinium group
 Histidine
o Aromatic imidazole group
o The only common amino acid with an ionizable side chain whose pKa is
near neutrality; histidine may be positively charged (protonated form) or
uncharged at pH 7.0
o Residues facilitate many enzyme-catalyzed reactions by serving as
proton donors/acceptors

Negatively Charged (Acidic) R Groups


 Aspartate and Glutamate
o R groups have a net negative charge at pH 7.0
o Both have second carboxyl groups

Uncommon Amino Acids Also Have Important Functions


 Residues can be created by modification of common residues
o Post synthetic modification
 4-hydroxyproline
o Derivative of proline
o Found in plant cell wall proteins
o Found in collagen, a fibrous protein of connective tissues
 5-hydroxylysine
o Derivative of lysine
o Found in collagen, a fibrous protein of connective tissues
 6-N-Methyllysine
o Constituent of myosin, a contractile protein of muscle
 Gamma-carboxyglutamate
o Found in blood-clotting protein prothrombin and in certain proteins
that bind calcium ions as part of their biological function
 Desmosine
o A derivative of the Lys residue
o Found in elastin, a fibrous protein
 Selenocysteine and Pyrrolysine
o Special case: not created through post synthetic modification
o Introduced during protein synthesis through an unusual adaptation of
the genetic code
o Selenocysteine
 Contains selenium rather than the sulfur of cysteine
 Derived from serine
 Constituent of only a few known proteins
o Pyrrolysine
 Found in a few proteins in methanogenic archaea and one
known bacterium
 Plays a role in methane biosynthesis
 Some amino acids may be altered transiently to alter the protein’s function
o Addition of phosphoryl, methyl, acetyl, adenylyl, ADP-ribosyl, and
other groups to amino acid residues can increase or decrease protein
activity
o Phosphorylation is a common protein regulatory modification
 Ornithine and citrulline are key intermediates (metabolites) in the
biosynthesis of arginine and in the urea cycle

Amino Acids Can Act as Acids and Bases


 Amino, carboxyl, and ionizable R groups function as weak acids or bases
 Zwitterion: the dipolar ion formed when an amino acid lacking an ionizable
R group is dissolved in water at a neutral pH
o Can act as an acid or a base
o Predominates at neutral pH
 Amphoteric: substances with this dual nature (often called ampholytes)
 Nonionic form: uncharged amino (-NH2) and carboxyl (-COOH) groups
 Zwitterionic form: protonated and positive amino group, deprotonated and
negative carboxyl group

Amino Acids Have Characteristic Titration Curves


 Acid-base titration involves the gradual addition or removal of protons
 Plots have distinct stages corresponding to the deprotonation of two different
groups (carboxyl and amino group) – 2 stages for non-ionizable R groups, 3
stages for ionizable R groups
o Low pH: predominant ionic species is in its fully protonated form
o First stage: carboxyl group loses its proton
 Midpoint: equimolar concentrations of proton donor and
acceptor
 Point of inflection: pH = pKa of protonated group being
titrated
 Second point of inflection: removal of first proton is
complete, removal of second proton has just begun
o Amino acid present largely as dipolar ion
(zwitterion)
o Second stage: removal of proton from amino group
 Midpoint (point of inflection): pH = pKa of the amino group
o Completion: predominant form is deprotonated amino and carboxyl
groups
 Titration curve gives information
o quantitative measure of the pKa of each of the two ionizing groups (-
COOH and -NH3+)
o shows two regions of buffering power
 one is centered around the relatively flat portion of the curve,
extending 1 pH unit on either side of the first pKa
 the other is centered around the point of inflection in the second
stage
o The Henderson-Hasselbalch equation can be used to calculate the proportions of
proton-donor and proton-acceptor species required to make a buffer at
a given pH
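
The Henderson-Hasselbalch relationship mentioned above, pH = pKa + log([proton acceptor]/[proton donor]), can be sketched in a few lines. This is only an illustration; the function names are made up for these notes, and the pKa of 2.34 is the glycine carboxyl value used in the next section.

```python
def acceptor_to_donor_ratio(pH: float, pKa: float) -> float:
    """Ratio [proton acceptor]/[proton donor] from pH = pKa + log10(ratio)."""
    return 10 ** (pH - pKa)


def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of the ionizing group present as the proton acceptor."""
    ratio = acceptor_to_donor_ratio(pH, pKa)
    return ratio / (1 + ratio)


if __name__ == "__main__":
    # At pH = pKa the donor and acceptor are equimolar (midpoint of the stage).
    print(acceptor_to_donor_ratio(2.34, 2.34))
    # One pH unit above the pKa the acceptor outnumbers the donor about 10:1,
    # which marks the edge of the useful buffering region.
    print(acceptor_to_donor_ratio(3.34, 2.34))
    print(fraction_deprotonated(3.34, 2.34))
```

The ratio only tells you the proportions of the two species; the total buffer concentration still has to be chosen separately.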

Titration Curves Predict the Electric Charge of Amino Acids


 Important relationship between the amino acid’s net charge and the pH of
the solution
 At the point of inflection between the two stages, the amino acid is present
predominately in its dipolar form
o Fully ionized but with no net electric charge
 Isoelectric point/isoelectric pH (pI): the characteristic pH at which the net
charge is zero
o The arithmetic mean of the two pKa values
o Net negative charge at any pH above its pI
 Will move toward the positive electrode (anode)
o Net positive charge at any pH below its pI
 Will move toward the negative electrode (cathode)
o The further the pH is from the isoelectric point, the greater the net
electric charge
o Ex: at pH 1.0 glycine is almost entirely in its fully protonated form, with a
net positive charge of 1.0; at pH 2.34, where the protonated and
deprotonated forms of the carboxyl group are present in equal
concentrations, the average net positive charge is 0.5
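
A rough numerical check of the glycine example, assuming each group ionizes independently according to Henderson-Hasselbalch. The carboxyl pKa of 2.34 comes from these notes; the amino pKa of 9.60 is the usual textbook value for glycine and is an assumption here, as are the function names.

```python
GLY_PKA_CARBOXYL = 2.34  # from the notes
GLY_PKA_AMINO = 9.60     # assumed textbook value, not stated in the notes


def isoelectric_point(pka_carboxyl: float, pka_amino: float) -> float:
    """pI for an amino acid with no ionizable R group: mean of the two pKa values."""
    return (pka_carboxyl + pka_amino) / 2


def net_charge(pH: float, pka_carboxyl: float, pka_amino: float) -> float:
    """Average net charge: protonated amino fraction minus deprotonated carboxyl fraction."""
    positive = 1 / (1 + 10 ** (pH - pka_amino))      # fraction of -NH3+
    negative = 1 / (1 + 10 ** (pka_carboxyl - pH))   # fraction of -COO-
    return positive - negative


if __name__ == "__main__":
    pI = isoelectric_point(GLY_PKA_CARBOXYL, GLY_PKA_AMINO)
    print("pI:", pI)
    for pH in (1.0, 2.34, pI, 11.0):
        print(pH, net_charge(pH, GLY_PKA_CARBOXYL, GLY_PKA_AMINO))
```

The sign of the result shows which electrode the amino acid moves toward: positive below the pI (cathode), negative above it (anode).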

Amino Acids Differ in their Acid-Base Properties


 Amino acids with a single α-amino group, a single α-carboxyl group, and an
R group that does not ionize have a titration curve resembling that of
glycine (the example above)
o Similar but not identical pKa values; the differences are due to the differing
chemical environments imposed by their unique R groups
 -COOH group 1.8-2.4
 -NH3+ group 8.8-11.0
 Amino acids with an ionizable R group have more complex titration curves
with three stages corresponding to three possible ionization steps
o Three pKa values
o Additional stage for the titration of the ionizable R group merges to some
extent with the others
o The isoelectric point reflects the nature of the ionizable R groups
present
 Only histidine has an R group with significant buffering power near the
neutral pH usually found in the intracellular and extracellular fluids of most
animals and bacteria

3.2, Peptides and Proteins

Peptides are Chains of Amino Acids


 Peptide bond: covalently joins two amino acid molecules through a
substituted amide linkage, yielding a dipeptide
o Removal of the elements of water (dehydration) from the α-carboxyl
group of one amino acid and the α-amino group of another
o Condensation reaction
o Under standard conditions, the equilibrium favors the free amino acids
over the dipeptide
 The carboxyl group must be chemically modified or activated
so that the hydroxyl group can be more readily eliminated
(making the reaction more thermodynamically favorable)
 Oligopeptide: a few amino acids joined through peptide bonds
 Polypeptide: many amino acids joined together
o Polypeptides generally have MW below 10,000
o Proteins generally have MW above 10,000
 Residue: an amino acid unit in a peptide (the part left over after losing the
elements of water, H from amino group and hydroxyl moiety from carboxyl
group)
 Amino-terminal (N-terminal): end with free α-amino group (left)
 Carboxyl-terminal (C-terminal): end with free carboxyl group (right)
 Hydrolysis of peptide bond is exergonic, but occurs slowly due to high
activation energy
o Peptide bonds are therefore quite stable, with an avg. half-life of about 7 years

Peptides Can Be Distinguished by Their Ionization Behavior


 Peptides contain only one free α-amino and one free α-carboxyl group, at
opposite ends, that ionize as they do in free amino acids
o The α-amino and α-carboxyl groups of nonterminal amino acids are
covalently joined in the peptide bonds and do not ionize or contribute
to acid-base behavior of the peptides
o R groups that can ionize will contribute to the overall acid-base
properties of the molecule
 The acid-base behavior of a peptide can be predicted from its free α-amino
and α-carboxyl groups combined with the nature and number of ionizable R
groups
 Peptides have characteristic titration curves and isoelectric pH (pI)
o Exploit these properties to separate peptides and proteins
 pKa of an ionizable R group can change when an amino acid becomes a
residue linked in a peptide
o pKa affected by loss of charge in α-carboxyl and α-amino groups, the
interaction with other R groups, and environmental factors

Biologically Active Peptides and Polypeptides Occur in a Vast Range of Sizes and
Compositions
 No generalizations about MW of peptides and proteins in relation to
functions
o Range in length from two to many thousands of amino acid residues
o Small peptides can have biologically important effects
 Many small peptides exert their effects at very low concentrations
o Ex: vertebrate hormones (oxytocin, thyrotropin-releasing factor)
 Polypeptide chains vary considerably in length
o Cytochrome c has 104 amino acid residues in a single chain
o Titin has nearly 27,000 amino acid residues linked together
 Some proteins consist of a single polypeptide chain
 Multisubunit: proteins with two or more polypeptides associated
noncovalently
o Individual chains may be identical or different
o If at least two are identical it is oligomeric, and the identical chains
are protomers
o Ex: hemoglobin has 2 alpha subunits and 2 beta subunits, it can be
considered either a tetramer of four polypeptide subunits or a dimer of
alpha-beta protomers
 A few proteins contain two or more polypeptide chains linked together
covalently
o Ex: two chains of insulin are linked by disulfide bonds
o The individual polypeptides are referred to as chains not subunits
 Composition is highly variable
o The 20 common amino acids almost never occur in equal amounts
 Some may occur once, in large amounts, or not at all
o Dividing molecular weight of a protein by 110 will calculate the
approximate number of amino acid residues
 The average MW of the common amino acids is 138
 Since smaller amino acids predominate, and if we take into
account the proportions in an average protein, the average MW
is closer to 128
 Since water is removed to create each peptide bond, the average
MW is about 110 (128-18)
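
The divide-by-110 rule above as a quick sketch. The function names are made up for these notes, and the result is only an estimate because real proteins differ in amino acid composition.

```python
AVERAGE_AMINO_ACID_MW = 128   # weighted average for an average protein (from the notes)
WATER_MW = 18                 # lost for each peptide bond formed
AVERAGE_RESIDUE_MW = AVERAGE_AMINO_ACID_MW - WATER_MW  # about 110


def estimate_residue_count(protein_mw: float) -> int:
    """Approximate number of residues in a protein of the given molecular weight."""
    return round(protein_mw / AVERAGE_RESIDUE_MW)


def estimate_protein_mw(residue_count: int) -> int:
    """Approximate molecular weight of a chain with the given number of residues."""
    return residue_count * AVERAGE_RESIDUE_MW


if __name__ == "__main__":
    print(estimate_residue_count(11000))  # a protein of MW 11,000
    print(estimate_protein_mw(104))       # a 104-residue chain such as cytochrome c
```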

Some Proteins Contain Chemical Groups Other Than Amino Acids


 Simple proteins: contain no other chemical constituents than amino acids
o Ribonuclease A and chymotrypsin
 Conjugated proteins: contain permanently associated chemical components
in addition to amino acids
o Prosthetic group: the non-amino acid part of a conjugated protein
o Classified on the basis of the chemical nature of their prosthetic groups
 Lipoproteins: contain lipids
 Glycoproteins: contain carbohydrates
 Phosphoproteins: contain phosphate groups
 Hemoproteins: contain heme (iron porphyrin)
 Flavoproteins: contain flavin nucleotides
 Metalloproteins: contain a specific metal
o Some proteins contain more than one prosthetic group
o The prosthetic group usually plays an important role in protein function
3.3, Working with Proteins

Proteins Can Be Separated and Purified


 Pure preparation is essential to determine properties and activities
 Classic methods take advantage of properties that vary from one protein to
the next
o Size, charge, binding properties
 Newer methods modify the protein being purified; convenience comes at the
cost of potentially altering the protein’s activity
 Source of protein is generally tissue or microbial cells
o The first step is to break open cells, releasing proteins into a solution
called a crude extract (if necessary differential centrifugation can be
used to prepare subcellular fraction or to isolate specific organelles)
o Commonly, the extract is then subjected to treatments that separate
the proteins into different fractions based on a property such as size or
charge (fractionation)
 Utilize differences in protein solubility, a complex function of
pH, temperature, salt concentration, etc.
 “salting out” is a process that lowers solubility of proteins with
the addition of certain salts
 Addition of certain salts in the right amount can
selectively precipitate some proteins, while others remain in
solution
 Precipitated protein is removed by centrifugation
o Solution usually needs to be further altered before next purification
steps
 Dialysis separates proteins from small solutes by taking
advantage of the size of larger proteins
 Partially purified extract is placed in bag/tube with
semipermeable membrane
 Bag is suspended into larger volume of buffered solution
of appropriate ionic strength
 Membrane allows the exchange of salt and buffer but not
proteins
 Dialysis retains large proteins within the bag/tube while
allowing concentration of other solutes to change until
they come into equilibrium
 The most powerful methods for fractionating proteins make use of column
chromatography
o This process takes advantage of differences in protein charge, size,
binding affinity and other properties
 Porous solid material with appropriate chemical properties
(stationary phase) is held in a column
 Buffered solution (mobile phase) migrates through it
 Protein dissolved in the same buffered solution is layered on
the top of the column
 Protein percolates through the matrix as an ever-expanding band
within the larger mobile phase
 Individual proteins migrate faster or more slowly through the
column
 Separation improves as the length of the column increases, but
each individual protein band also broadens over time
 Ion-exchange chromatography exploits differences in the sign and
magnitude of the net electric charge of proteins at a given pH
o Column matrix is a synthetic polymer containing bound charged
groups
 Cation exchangers: bound anionic groups
 Anion exchangers: bound cationic groups
o Affinity of each protein for the charged groups is affected by the pH
(determines the ionization state of the molecule) & concentration of
competing free salt ions in the surrounding solutions
o Optimize separation by gradually changing the pH or salt
concentration of the mobile phase, to create a pH or salt gradient
o Cation-exchange chromatography
 The solid matrix has negatively charged groups
 In the mobile phase, proteins with a net positive charge migrate
through the matrix more slowly than those with a net negative
charge
 Migration for the former is retarded more by interaction
with the stationary phase
o Expansion of the band in the mobile phase is caused by separation of
proteins with different properties and diffusional spreading
o As length of the column increases, resolution improves
o As length of time spent in column increases, resolution can decline as
a result of diffusional spreading
o As the protein solution leaves the column, fractions of the effluent are
collected in test tubes
 Each fraction can be tested for the presence of the protein or for
properties such as ionic strength or total protein concentration
 Fractions positive for the protein of interest can be combined as
the product of the chromatographic step of protein purification
 Size-exclusion chromatography (gel filtration) separates proteins according
to size
o Large proteins emerge from the column sooner than small ones
o Solid phase: cross-linked polymer beads with pores of particular size
 Large proteins cannot enter so they take a shorter path through
the column, around the beads
 Small proteins enter the cavities and are slowed by the path
o Used to approximate the size of a protein being purified
 Affinity chromatography is based on binding affinity
o Beads in the column have a covalently attached ligand – a group or
molecule that binds to a macromolecule such as a protein
o Any protein with affinity for the ligand binds to the beads
o Proteins that do not bind are washed through the column
o The bound protein is eluted by a solution containing either a high
concentration of salt or free ligand that competes with the ligand attached
to the beads
 HPLC, high-performance liquid chromatography enhances chromatographic
methods
o Makes use of high-pressure pumps that speed up the movement of the
protein molecules down the column
o Higher quality chromatographic materials can withstand the crushing
force of the pressurized flow
o Reduces time in column, limiting diffusional spread of protein bands
and improving resolution greatly
 In most cases, several different methods must be used sequentially to purify
a protein completely, each method separating on the basis of a different
property
 Chromatographic methods are often impractical at early stages because
samples are so large, but as each purification step is completed the sample
becomes smaller, making it more feasible to use more sophisticated
chromatographic procedures at later stages

Proteins Can Be Separated and Characterized by Electrophoresis


 Electrophoresis: the migration of charged proteins in an electric field
o Not generally used to purify proteins because simpler methods exist
o Often affects structure and function of proteins
o Proteins can be visualized as well as separated
 Allows for estimation of number of proteins and degree of
purity
 Can be used to determine isoelectric point and molecular
weight
o Occurs in gels made of cross-linked polymer polyacrylamide
 Slows migration of proteins in proportion to their charge to
mass ratio
 Migration may be affected by protein shape
o Electric potential (E): the force moving the macromolecule
o Electrophoretic mobility (μ): the ratio of its velocity, V, to the electric
potential, E
 Also equal to the net charge, Z, divided by the frictional coefficient,
f (which reflects protein shape)
 μ = V/E = Z/f
o Migration of a protein in a gel during electrophoresis is a function of
its size and shape
 Sodium dodecyl sulfate (SDS) is used as a detergent in the estimation of
purity and molecular weight
o Protein will bind about 1.4 times its weight of SDS
o Nearly one molecule of SDS for each amino acid residue
o SDS contributes large net negative charge
o Partially unfolds proteins, so that most assume a similar rod-like shape
o Electrophoresis will separate almost exclusively based on mass
 Smaller polypeptides migrate more rapidly
 Dye such as Coomassie blue will bind to proteins but not gel
o Can provide a good approximation of its molecular weight
o Subunits will be separated by SDS treatment, separate band for each
 Isoelectric focusing: procedure used to determine the isoelectric point (pI) of
a protein
o pH gradient established by allowing a mixture of low MW acids and
bases to distribute themselves in an electric field across the gel
o When protein mixture is applied, each protein migrates until it reaches
the pH that matches its pI
 Two-dimensional electrophoresis: combining isoelectric focusing and SDS
electrophoresis sequentially permits the resolution of complex mixtures of
proteins
o Much more sensitive
o Separates proteins of identical MW that differ in pI or those with
similar pI values that differ in MW

Unseparated Proteins Can Be Quantified


 For proteins that are enzymes, the amount in a given solution or tissue
extract can be measured, or assayed, in terms of the catalytic effect it produces
o The increase in the rate at which the substrate is converted to the
reaction products
o Must know:
 1. The overall equation of the reaction catalyzed
 2. An analytical procedure for determining the disappearance of
substrate/appearance of product
 3. Whether cofactors are required for the enzyme
 4. The dependence of the enzyme on substrate concentration
 5. The optimum pH
 6. A temperature zone for stability and high activity
o By international agreement, 1.0 unit of enzyme activity is defined as
the amount of enzyme causing transformation of 1.0 μmol of substrate
to product per minute at 25 °C under optimal conditions of
measurement
o Activity: the total units of enzyme in a solution
o Specific activity: the number of enzyme units per milligram of total
protein, measure of enzyme purity
 Increases during purification of an enzyme and becomes
maximal and constant when the enzyme is pure
 After each purification step, the activity of the preparation is assayed and the
total amount of protein is determined independently
o The ratio of the two gives the specific activity
o Activity and total protein generally decrease with each step
o Specific activity increases even as total activity falls
 A protein is generally considered pure when further purification steps fail to
increase specific activity and when only a single protein species can be
detected
 For proteins other than enzymes, other quantification methods are required
o Transport proteins can be assayed by their binding to the molecule
they transport
o Hormones and toxins by the biological effect they produce
o Some structural proteins represent such a large fraction of a tissue
mass that they can be readily extracted and purified without a functional assay
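
To make the unit and specific-activity definitions from the enzyme part of this section concrete, here is a small sketch. The function names and all the numbers in the purification table are hypothetical, chosen only to show the trend (total activity falls, specific activity rises).

```python
def enzyme_units(umol_substrate_converted: float, minutes: float) -> float:
    """Activity in units: 1.0 unit converts 1.0 umol of substrate per minute."""
    return umol_substrate_converted / minutes


def specific_activity(total_units: float, total_protein_mg: float) -> float:
    """Units of enzyme per milligram of total protein (a measure of purity)."""
    return total_units / total_protein_mg


if __name__ == "__main__":
    # Hypothetical purification: (step name, total activity in units, total protein in mg)
    steps = [
        ("crude extract", 100_000, 10_000),
        ("after salting out", 96_000, 3_000),
        ("after ion-exchange chromatography", 80_000, 400),
    ]
    for name, units, protein_mg in steps:
        print(name, specific_activity(units, protein_mg), "units/mg")
```

When a further step no longer raises the units/mg figure, that is one of the two signs of purity listed above.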

3.4, The Structure of Proteins: Primary Structure

Primary structure: covalent bonds linking amino acid residues in polypeptide
chain; the sequence of amino acids (differences can be informative)

Secondary structure: the particularly stable arrangements of amino acid residues
giving rise to recurring structural patterns

Tertiary structure: all aspects of three-dimensional folding of a polypeptide

Quaternary structure: the spatial arrangement of the subunits in a protein with
two or more polypeptide subunits

The Function of a Protein Depends on Its Amino Acid Sequence


 Proteins with different functions always have different amino acid sequences
 Genetic disease can be caused by defective protein, ranging from a single
change in amino acid to a deletion of a large segment of the chain
 Functionally similar proteins from different species often have similar amino
acid sequences
 Some flexibility is possible
o 20-30% of human proteins are polymorphic, having amino acid sequence variants
o Many have little or no effect on the function of the protein
 Even proteins with similar functions can have very different sequences
The Amino Acid Sequences of Millions of Proteins Have Been Determined
 1953:
o Watson and Crick deduced the structure of DNA and its basis for
replication
o Frederick Sanger worked out the sequence of amino acid residues
in the polypeptide chains of insulin
 A decade after these discoveries, the genetic code relating the nucleotide
sequence of DNA to the amino acid sequence of protein molecules was
elucidated
o The amino acids sequences of proteins are now most often derived
indirectly from DNA sequences in genome databases

Protein Chemistry Is Enriched by Methods Derived from Classical Polypeptide
Sequencing
 Few proteins are still sequenced with the Sanger method because protein
sequence can usually be predicted from the gene encoding it
o Determine amino terminus
o Determine amino acid content (by acid hydrolysis), select cleavage
reagents based on presence of target amino acids in protein
o Cleave into smaller polypeptides
o Sequence each polypeptide
o Determine order of polypeptides in protein, which ones have amino
and carboxyl terminus in them
o Order others by overlaps with sequences of peptides obtained by
cleaving the protein with a different reagent
o In the traditional method, the amino-terminal amino acid was first
labeled and its identity determined
 Edman degradation: a procedure that labels and removes only the
amino-terminal residue from a peptide, leaving all other peptide bonds intact
o The peptide is reacted with phenylisothiocyanate under mildly
alkaline conditions, which converts the amino-terminal amino acid to a
phenylthiocarbamoyl (PTC) adduct
o Peptide bond next to the PTC adduct is then cleaved with removal of
the amino-terminal amino acid
o The derived amino acid is extracted with organic solvents, converted
to a more stable derivative by treatment with aqueous acid, and then
identified
o The use of sequential reactions first carried out in basic and then
acidic conditions provides a means of controlling the entire process
o Each reaction can go to completion without affecting any of the other
peptide bonds
o The process is repeated until as many as 40 amino acid residues have
been identified.
 Determining the sequence of large proteins
o Methods to eliminate disulfide bonds and to cleave proteins precisely
 Proteases catalyze the hydrolytic cleavage of peptide bonds
 Some cleave only the peptide bond next to certain amino acid
residues, fragmenting a chain in a predictable and
reproducible way
 Trypsin, a digestive enzyme, catalyzes the hydrolysis of
peptide bonds in which the carbonyl group is contributed by
either a Lys or an Arg residue
o Polypeptide with three Lys or Arg residues will
yield four smaller peptides upon cleavage
o Three of them will have a carboxyl-terminal Lys or
Arg
 Some chemical reagents also cleave the peptide bonds adjacent
to certain residues
 The choice of a reagent to cleave the protein with can be aided
by first determining the amino acid content
 Using acid to reduce the protein to its constituent amino
acids
o In classical sequencing, a large protein would be cleaved into
fragments twice, using a different protease or cleavage reagent each
time so that fragment endpoints differed
 Both sets of fragments would be purified and sequenced
 Order of fragments could be determined by overlapping
sequences
 Traditional sequencing methods are still valuable in lab
o Sequencing of some amino acids from the amino-terminus using
Edman chemistry is often sufficient to confirm identity
o Techniques employed in individual steps of the traditional sequencing
method are also useful for other purposes
 Ex: methods used to break disulfide bonds can also be used to
denature proteins when needed
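
The trypsin rule described in this section (cleavage after each Lys or Arg, so three Lys/Arg residues give four peptides) can be sketched as a string split on one-letter codes. This is a simplified illustration: the function name and the example sequence are made up, and any exceptions to the cleavage rule that real trypsin shows are ignored.

```python
from typing import List


def trypsin_digest(sequence: str) -> List[str]:
    """Split a one-letter-code sequence after every Lys (K) or Arg (R)."""
    fragments = []
    current = ""
    for residue in sequence:
        current += residue
        if residue in "KR":        # cleavage on the carboxyl side of Lys/Arg
            fragments.append(current)
            current = ""
    if current:                    # carboxyl-terminal fragment without Lys/Arg
        fragments.append(current)
    return fragments


if __name__ == "__main__":
    peptide = "GAKLMRFYKWV"        # hypothetical peptide with three Lys/Arg residues
    print(trypsin_digest(peptide))
```

Running a second digest with a different cleavage rule and looking for overlapping sequences is what lets the fragments be put back in order.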
Mass Spectrometry Offers an Alternative Method to Determine Amino Acid
Sequences
 Can provide a highly accurate measure of the molecular weight of a protein
o Some can sequence short segments rapidly
 Analytes are first ionized in a vacuum
o The newly charged molecules are introduced into an electric or
magnetic field, their paths are a function of their mass-to-charge ratio
o Ability to deduce mass with high precision
 For many years this could not be applied to proteins and nucleic acids
o The m/z measurement is made on molecules in the gas phase, and moving
macromolecules into the gas phase would have caused decomposition
 MALDI MS, matrix-assisted laser desorption/ionization mass spectrometry
o Proteins are placed in a light-absorbing matrix
o Short pulse of laser light ionizes proteins and desorbs them from the
matrix into the vacuum system
o Used to measure the mass of a wide range of macromolecules
 ESI MS, electrospray ionization mass spectrometry
o Macromolecules in solution are forced directly from the liquid to the
gas phase
o Solution of analytes is passed through a charged needle at high
electric potential, dispersing the solution in a fine mist
o The solvent surrounding the macromolecules evaporates, leaving
charged macromolecular ions in the gas phase
o Protons added during passage through the needle give additional
charge to the macromolecule
o The m/z of the molecule can be analyzed in the vacuum chamber
 Techniques require minuscule amounts of sample
o Readily applied to small amounts of protein that can be extracted from
a two-dimensional electrophoretic gel
o Once the mass of a protein is known, MS can be used to detect
changes in mass due to the presence of bound cofactors, metal ions,
covalent modifications, etc.
 Process for determining molecular mass of a protein with ESI MS
o Protein is injected into the gas phase
o Protein acquires a variable number of protons, and thus positive
charges, from the solvent
o Creates a spectrum of species with different mass-to-charge ratios
o Each successive peak corresponds to a species that differs from that of
its neighboring peak by a charge difference of 1 and a mass difference
of 1 (one proton)
 The mass of a protein can be determined from any two
neighboring peaks
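A minimal sketch of the two-peak calculation, assuming each peak is the protein plus n protons, so m/z = (M + nX)/n with X the mass of one proton. For a peak with charge n2 and its neighbor with charge n2 + 1, this gives n2 = ((m/z)1 - X) / ((m/z)2 - (m/z)1) and M = n2((m/z)2 - X). The function names and the 16,950 Da test mass are hypothetical, used only for illustration.

```python
PROTON_MASS = 1.00728  # mass of one added proton in daltons (X in the formula)


def peak_mz(mass, charge, proton=PROTON_MASS):
    """m/z of a protein of the given mass carrying `charge` added protons."""
    return (mass + charge * proton) / charge


def charge_and_mass(mz_high, mz_low, proton=PROTON_MASS):
    """Estimate charge and molecular mass from two neighboring ESI peaks.

    mz_high: the peak at higher m/z (charge n2, fewer protons)
    mz_low:  its neighbor at lower m/z (charge n2 + 1)
    """
    n2 = (mz_low - proton) / (mz_high - mz_low)
    n2 = round(n2)  # the charge must be a whole number of protons
    mass = n2 * (mz_high - proton)
    return n2, mass


# Hypothetical protein of 16,950 Da observed with 10 and 11 added protons.
high = peak_mz(16950.0, 10)
low = peak_mz(16950.0, 11)
print(charge_and_mass(high, low))
```

In practice the estimate is usually averaged over several neighboring peak pairs, which this sketch does not do.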
 MS can also sequence short stretches of polypeptides
o Tandem MS or MS/MS
 Protein solution is treated with a protease or chemical reagent to
hydrolyze the protein into shorter peptides
 The mixture is then injected into a device that is two mass
spectrometers in tandem
 In the first, the peptide mixture is sorted so that only one of the
several types of peptides produced by cleavage emerges at the
other end
 The sample of the selected peptide travels through the vacuum
chamber between the two mass spectrometers
 In the collision cell, the peptide is further fragmented by high-
energy impact with a “collision gas”
 Each peptide is broken in only one place, on average, and most
breaks occur at peptide bonds
 The second MS measures the m/z ratios of all the charged
fragments, generating one or more sets of peaks
 The difference in mass from peak to peak identifies the amino
acid that was lost in each case, revealing the sequence of the
amino acids
 Ambiguity: leucine and isoleucine have the same mass
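The peak-to-peak reading described above can be sketched as a lookup of mass differences. This assumes singly charged fragments from a single ion series, so that differences in m/z equal differences in mass. The residue masses are standard monoisotopic values that are not given in these notes, and the function name, tolerance, and 200.0 starting peak are hypothetical. Leucine and isoleucine share one entry because they cannot be told apart by mass.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L/I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_from_ladder(peaks, tolerance=0.01):
    """Name the residue for each mass step between successive peaks.

    Returns "?" for a step that matches no residue, or more than one,
    within the tolerance.
    """
    ordered = sorted(peaks)
    calls = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        step = heavier - lighter
        matches = [name for name, mass in RESIDUE_MASS.items()
                   if abs(mass - step) <= tolerance]
        calls.append(matches[0] if len(matches) == 1 else "?")
    return calls


# Hypothetical ladder: a starting fragment plus G, then A, then Leu or Ile.
start = 200.0
ladder = [start,
          start + RESIDUE_MASS["G"],
          start + RESIDUE_MASS["G"] + RESIDUE_MASS["A"],
          start + RESIDUE_MASS["G"] + RESIDUE_MASS["A"] + RESIDUE_MASS["L/I"]]
print(residues_from_ladder(ladder))
```

Whether the residues read from the amino or the carboxyl end depends on which fragment series the peaks belong to, which this sketch does not try to decide.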

Small Peptides and Proteins Can Be Chemically Synthesized


 Ways to obtain a peptide
o 1. Purification from tissue
o 2. Genetic engineering
o 3. Direct chemical synthesis
 Synthetic peptides are useful for commercial applications and for the
study of protein structure and function
 Traditional synthetic approaches of organic chemistry are impractical for
peptides with more than four or five amino acid residues
o Due to difficulty of purifying after each step
 1962: R. Bruce Merrifield provided a major breakthrough in this technology
o Synthesized a peptide while keeping it attached at one end to a solid
support
 The peptide is built up on the support, one amino acid at a time
 Through a standard set of reactions in a repeating cycle
 At each step in the cycle, protective chemical groups block
unwanted reactions
 This technology is now automated
o Limitation is the efficiency of each chemical cycle
o Incomplete reaction at one stage can lead to formation of impurity in
the next
o Optimized to produce proteins of 100 amino acid residues in a few
days in reasonable yield
o Similar approach is used to synthesize nucleic acids
 Still pales in comparison with biological processes
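Why the efficiency of each cycle is the limitation can be shown with a short calculation. This is a sketch that assumes one coupling cycle per residue added after the first, each with the same efficiency, so the overall yield is the per-step efficiency raised to the number of couplings. The 96% and 99.8% values are illustrative assumptions, not figures from these notes.

```python
def overall_yield(residues, step_efficiency):
    """Fraction of chains with the full, correct sequence.

    Assumes the first residue is already attached to the solid support,
    so a peptide of `residues` residues needs `residues - 1` couplings.
    """
    couplings = residues - 1
    return step_efficiency ** couplings


# Compare two assumed per-step efficiencies across several peptide lengths.
for n in (11, 21, 51, 100):
    low = overall_yield(n, 0.96)
    high = overall_yield(n, 0.998)
    print(f"{n:>3} residues: {low:.1%} at 96% per step, {high:.1%} at 99.8% per step")
```

Small losses per cycle compound quickly, which is why a modest drop in coupling efficiency makes long peptides impractical to synthesize.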

Amino Acid Sequences Provide Important Biochemical Information


 Knowing the sequence of a protein provides information on its structure and
function
o Most of this information is derived by searching for similarities
between the protein of interest and previously studied proteins in
sequence databases
 Exactly how the sequence determines the 3-dimensional structure is not
understood in detail
 Protein families that have some shared structural or functional features can
be readily identified on the basis of amino acid sequence similarities
o Families are based on the degree of similarity in amino acid sequence
o Members of a family are usually identical across 25% or more of their
sequences, and share at least some of their functional/structural
characteristics
o Some families are identified by only a few amino acid residues critical
to a certain function
 A number of similar substructures, or “domains,” occur in many
functionally unrelated proteins
 Evolutionary relationships can also be inferred from the structural and
functional similarities within protein families
 Certain amino acid sequences serve as signals that determine the cellular
location, chemical modification, and half-life of a protein
 Special signal sequences, usually at the amino terminus, are used to target
certain proteins for export from the cell
o Other proteins are targeted for distribution to the nucleus, the cell
surface, the cytosol, or other cellular locations
o Other sequences act as attachment sites for prosthetic groups, such as
sugar groups in glycoproteins and lipids in lipoproteins
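The 25% identity guideline for protein families can be checked with a simple count over two aligned sequences. This is a minimal sketch, assuming the sequences have already been aligned to equal length with "-" marking gaps, and counting only positions where neither sequence has a gap. The function name and the example sequences are hypothetical, and other definitions of percent identity use a different denominator.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of gap-free aligned positions with identical residues."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned fragments in one-letter code.
identity = percent_identity("MKTAYIAK", "MKSAYLAK")
print(f"{identity:.1f}% identical; meets the 25% guideline: {identity >= 25.0}")
```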
Protein Sequences Help Elucidate the History of Life on Earth
 As more protein sequences become available, the development of more
powerful methods for extracting information from them has become a major
biochemical enterprise
 Bioinformatics: analysis of the information available in the ever-expanding
biological databases
o Make it possible to identify functional segments in new proteins and
help establish both their sequence and their structural relationships to
proteins already in databases
 Protein sequences are beginning to tell us how the proteins evolved and
how life on Earth evolved
 The field of molecular evolution
o Often traced to Emile Zuckerkandl and Linus Pauling, who advanced the
use of nucleotide and protein sequences to explore evolution
o If two organisms are closely related, the sequences of their genes and
proteins should be similar; the sequences diverge increasingly as the
evolutionary distance between the organisms grows
 Different proteins evolve at different rates
 Horizontal gene transfer: the rare transfer of a gene or group of genes from
one organism to another
 Homologous proteins: the members of protein families
o Paralogs: two homologs in a family present in the same species
o Orthologs: two homologs from different species
 By considering the sequence of a protein, scientists can now construct more
elaborate evolutionary trees with many species in each taxonomic group
 External nodes: the free end points of an evolutionary tree
 Internal nodes: the points where two lines come together
 In most representations, the lengths of the lines are proportional to the
number of amino acid substitutions separating one species from another
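The substitution counts that set branch lengths can be sketched as a pairwise comparison of aligned homologs. This assumes gap-free sequences of equal length and counts raw differences only. Real tree-building methods do considerably more than this, the function names are illustrative, and the species names and sequences are hypothetical.

```python
from itertools import combinations


def substitutions(seq_a, seq_b):
    """Number of aligned positions where the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def pairwise_substitutions(aligned):
    """Substitution count for every pair of named sequences."""
    return {(x, y): substitutions(aligned[x], aligned[y])
            for x, y in combinations(sorted(aligned), 2)}


# Hypothetical orthologs from three species.
orthologs = {
    "speciesA": "MKTAYIAK",
    "speciesB": "MKSAYIAK",
    "speciesC": "MRSAYLAK",
}
for pair, count in pairwise_substitutions(orthologs).items():
    print(pair, count)
```

Pairs with fewer substitutions would sit closer together on the tree, joined by shorter lines.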
