Proteomics


Proteome
• The proteome is the
entire set of proteins
expressed by a genome,
cell, tissue or organism
at a certain time.
• More specifically, it is
the set of expressed
proteins in a given type
of cell or organism, at a
given time, under
defined conditions.
Complexity of the Proteome
• Within an individual organism, the
genome is constant, but the proteome
varies and is dynamic.
• Every cell in an individual organism has
the same set of genes, but the sets of
proteins produced in different tissues
differ from one another and depend on
gene expression.
• One gene may give rise to multiple proteins
with multiple biological functions.
• Proteins may also exist in different forms,
called isoforms.
Complexity of the Proteome
Post-translational modifications:
• Many proteins are modified after translation; this is termed
post-translational modification (PTM).
• Modifications occur via processes such as phosphorylation, glycosylation,
ubiquitination and proteolytic cleavage.
• Ubiquitination is one of the most prevalent PTMs and controls almost
all types of cellular events in eukaryotes.
• Phosphorylation (addition of phosphates) or dephosphorylation
(removal of phosphates) is prominent in biochemical pathways.
Complexity of the Proteome
Post-translational modifications:
• Glycosylation is the attachment of sugar molecules (oligosaccharides)
to proteins. Glycosylation affects protein folding, stability, solubility,
and cellular recognition processes. It plays a critical role in cell-cell
interactions, immune responses, and protein trafficking.
• Proteolytic cleavage is a crucial post-translational modification that
can regulate protein activity, localization, stability, and function. It is
carried out by enzymes called proteases or peptidases.
Importance of understanding the Proteome
• Uncover how proteins interact with each other, participate in
signaling pathways and regulate gene expression.
• Discovery of new therapeutic targets and development of diagnostic
biomarkers.
• Design drugs to modulate the activity of target protein.
• Enable personalized medicine approaches, where treatment
strategies are tailored to an individual's specific proteomic profile.
Proteomics
• The study of the proteome is
called proteomics, and it
involves understanding how
proteins function and interact
with one another.
• The word proteome, a
combination of protein and
genome, was coined by Marc
Wilkins in 1994.
• The proteome can be studied
using a variety of techniques.
Proteomics
Importance of Proteomics in Protein analysis:
• DNA sequences alone do not show how proteins function or how biological
processes occur.
• Proteomics facilitates the study of how splice variants, post-translational
modifications (PTMs, e.g., through phosphoproteomics) and 3D structures affect
protein function.
Protein Analysis Techniques
• Solubility methods
• Filtration
• Centrifugation
• Chromatographic methods
• Electrophoretic methods
• Western Blotting
• X-ray Crystallography
• NMR analysis
• Mass Spectrometry
Steps of Integrated Protein Analysis
• Sample purification
• Poly Acrylamide Gel Electrophoresis (PAGE)
• High Performance Liquid Chromatography (HPLC)
• Mass Spectrometry (MS)
• Identification, characterization and quantification of proteins
Poly Acrylamide Gel Electrophoresis (PAGE)
1D-SDS-PAGE:
• Proteins are separated based on molecular mass over a range of 10 -
300 kDa.
• Samples are weighed and dissolved in Sodium Dodecyl Sulfate (SDS).
• SDS is a negatively charged detergent that binds to proteins with its
hydrophobic region and binds water with its hydrophilic region.
• This SDS-protein-water interaction allows water-insoluble proteins to
dissolve in water.
• The proteins are completely denatured by the SDS (with heating).
Poly Acrylamide Gel Electrophoresis (PAGE)
1D-SDS-PAGE:
• When an electric field is applied, the negative charge of the SDS
causes the proteins to move through a clear acrylamide matrix
toward the positive electrode.
• This matrix has holes that sieve out the proteins by molecular weight.
• Large proteins move more slowly through the matrix than the smaller
proteins thereby separating proteins by molecular weight.
• Molecular weight standards are included on each gel to allow for
determination of protein size.
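Size determination from the standards is commonly done by fitting log10(molecular weight) against the relative migration distance (Rf) of each standard band. The following is a minimal sketch of that idea, assuming the relationship is approximately linear over the working range of the gel. The ladder values and the function names are made up for illustration and are not data from these slides.

```python
import math

def fit_log_mw_calibration(standards):
    """Least-squares fit of log10(MW in kDa) = slope * Rf + intercept.

    `standards` is a list of (Rf, MW in kDa) pairs read off the ladder lane.
    """
    xs = [rf for rf, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    n = len(standards)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_mw(rf, slope, intercept):
    """Estimate the molecular weight (kDa) of a band from its Rf."""
    return 10 ** (slope * rf + intercept)

# Hypothetical ladder: (relative migration, molecular weight in kDa).
ladder = [(0.10, 250.0), (0.25, 130.0), (0.45, 70.0), (0.65, 35.0), (0.85, 15.0)]
slope, intercept = fit_log_mw_calibration(ladder)
unknown_kda = estimate_mw(0.55, slope, intercept)
print(f"Estimated size of the band at Rf 0.55: {unknown_kda:.1f} kDa")
```

The slope is negative because larger proteins migrate a shorter distance. The estimate is only as good as the linear assumption, so bands outside the range covered by the standards should be treated with caution.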
Poly Acrylamide Gel Electrophoresis (PAGE)
(Figure: 1D-SDS-PAGE and 2D-SDS-PAGE gels.)
Poly Acrylamide Gel Electrophoresis (PAGE)
2D-SDS-PAGE:
• The first-dimension separation is by isoelectric focusing.
• The gel matrix has an immobilized pH gradient.
• An electric current causes charged proteins to migrate until each reaches
its isoelectric point (pI), where its net charge is zero.
• The second-dimension separation is by mass.
• The pH gel strip is loaded onto an SDS gel.
• 2D-SDS-PAGE has more resolving power than 1D-SDS-PAGE.
Poly Acrylamide Gel Electrophoresis (PAGE)
• Once protein bands have been separated, they can be visualized using
different methods of in-gel detection.
• The most widely used staining dye for proteins is Coomassie Brilliant Blue (CBB).
• The major reason is that CBB staining is simple, fast, and sensitive.
• Silver staining is also used to detect proteins after electrophoretic
separation on polyacrylamide gels.
• It offers excellent sensitivity, in the low nanogram range.
• Depending on the chemistry of the stain, various steps are necessary
to fix the color to the proteins in the gel matrix .
Poly Acrylamide Gel Electrophoresis (PAGE)
1D-SDS-PAGE Gel Staining:
• Proteins appear as
bands. Each band may
contain 10-20 proteins
depending on the
material.
Poly Acrylamide Gel Electrophoresis (PAGE)
2D-SDS-PAGE Gel Staining:
• Proteins appear as spots.
• Each spot usually contains
one protein.
Proteolytic Cleavage by Trypsin
• Trypsin is a serine protease that specifically cleaves peptide bonds on the
C-terminal side of basic amino acids, such as lysine (K) and arginine (R),
except when followed by proline (P).
• This involves incubating the protein with trypsin under controlled
conditions, typically at a specific temperature and pH.
• It specifically cleaves large proteins into small peptides whose masses fall
in the range well suited to PMF (500 Da – 3500 Da).
• The resulting peptides will have a basic residue (K or R) at their C-terminus.
• Trypsin is widely used due to its specificity, reproducibility, and well-
characterized digestion properties.
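The cleavage rule above (cut after K or R, but not when the next residue is P) can be sketched as an in-silico digest. This is a minimal illustration that assumes complete digestion with no missed cleavages. The function name and the example sequence are hypothetical.

```python
def trypsin_digest(sequence):
    """Split a protein sequence into tryptic peptides.

    Cleaves on the C-terminal side of K or R, except when the next
    residue is P. Assumes complete digestion (no missed cleavages).
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

# Made-up sequence; note that the R followed by P is not cleaved.
example = "MKWVTFISLLRPSAYSRGVFK"
print(trypsin_digest(example))
```

Every peptide except possibly the last one ends in K or R, which matches the statement that tryptic peptides carry a basic residue at their C-terminus.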
High Performance Liquid Chromatography
Reversed Phase High Performance Liquid Chromatography (RP-HPLC):
• The type of HPLC most used in proteomics research is RP-HPLC.
• In RP-HPLC, peptides (after digestion of whole proteins with trypsin) are
separated based on their hydrophobicity.
• Molecules that possess some degree of hydrophobic character, such
as proteins, peptides and nucleic acids, can be separated by
reversed-phase chromatography with excellent recovery and resolution.
• The hydrophilic peptides will be eluted off the column early in the
gradient, and hydrophobic peptides will be eluted toward the end of
the gradient.
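The elution order described above can be roughly anticipated from peptide hydrophobicity. The sketch below ranks peptides by their mean Kyte-Doolittle hydropathy (often called the GRAVY score). This is only a crude proxy chosen for illustration: actual retention also depends on peptide length, the column, the gradient and the ion-pairing agent. The peptide sequences are made up.

```python
# Kyte-Doolittle hydropathy values; higher means more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(peptide):
    """Mean hydropathy of a peptide."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)

def predicted_elution_order(peptides):
    """Sort peptides from most hydrophilic (early) to most hydrophobic (late)."""
    return sorted(peptides, key=gravy)

# Hypothetical tryptic peptides.
peptides = ["LLVIFK", "DEEK", "GASK"]
for p in predicted_elution_order(peptides):
    print(f"{p}: {gravy(p):.2f}")
```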
High Performance Liquid Chromatography
Reversed Phase High Performance
Liquid Chromatography (RP-HPLC):
• Stationary phase column is packed
with silica particles that are
derivatized with hydrophobic alkyl
chains, usually C18, for peptide
analysis.
• Water-miscible organic solvents such
as methanol or acetonitrile are used in
the mobile phase.
High Performance Liquid Chromatography
Reversed Phase High Performance Liquid Chromatography (RP-HPLC):
• The retention time of a solute is taken as the elapsed time between
the time of injection of a solute and the time of elution of the peak
maximum of that solute.
• Conventionally, HPLC has used different types of detectors such as
Fluorescence detectors, UV spectrophotometers and electrochemical
detectors.
• Quite recently, HPLC has been coupled with MS in order to identify
the components in the mixture.
Mass Spectrometry (MS)
• A mass spectrometer is an instrument that weighs molecules.
• MS can also be used to quantify known materials, to identify
unknown compounds within a sample, and to elucidate the structure and
chemical properties of different molecules.
• It consists of three major parts: an ion source, a mass analyzer and a
detector.
Mass Spectrometry (MS)
Principle of mass spectrometry:
• The complete process involves the conversion of the sample (in solid
or liquid phase) into gaseous ions, with or without fragmentation,
which are then characterized by their mass to charge ratios (m/z) and
relative abundances.
Ionization Techniques
Electron Impact (EI - Hard method):
• EI is done by volatilizing a sample directly in the source that is
contained in a vacuum system directly attached to the analyzer.
• Small molecules 1-1000 Daltons.
Fast Atom Bombardment (FAB – Semi-hard):
• FAB is a technique that was popular in the 80's to early 90's because
it was the first technique.
• A high-energy beam of atoms or ions (Xe or Cs) vaporizes, fragments
and ionizes peptides and sugars up to 6000 Daltons.
Ionization Techniques
Soft ionization techniques:
• Electrospray Ionization (ESI) – peptides,
proteins up to 200 kDa
• Matrix Assisted Laser Desorption (MALDI)
– peptides, proteins, DNA up to 500 kDa
• Soft ionization keeps the molecule of
interest fully intact. No fragmentation
occurs.
• Soft ionization methods use smaller
amounts of energy to ionize the sample.
Electrospray Ionization (ESI)
• ESI is the most popular ionization technique.
• The electrospray is created by putting a high voltage on a flow of liquid at
atmospheric pressure, sometimes assisted by a concurrent flow of gas.
• The spray is directed to an opening in the vacuum system of the ESI source,
where the droplets are desolvated by a combination of heat, vacuum and
acceleration into gas by voltages.
• Eventually the ions are ejected from the droplets and accelerated into the
mass analyzer by voltages.
• For larger molecules, the ions may carry multiple charges, allowing the
detection of very large molecules on analyzers that have a limited mass to
charge (m/z) range.
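The effect of multiple charging on the observed m/z can be worked through numerically. A minimal sketch, assuming positive-mode ESI in which each charge comes from one added proton; the 50 kDa protein mass and the charge states are hypothetical.

```python
PROTON_MASS = 1.007276  # Da

def mz_for_charge(neutral_mass, charge):
    """m/z of an ion that carries `charge` added protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge

def charge_envelope(neutral_mass, charges):
    """Map each charge state to its expected m/z."""
    return {z: mz_for_charge(neutral_mass, z) for z in charges}

# Hypothetical 50 kDa protein observed at charge states 20+ to 40+.
envelope = charge_envelope(50000.0, range(20, 41, 5))
for z, mz in envelope.items():
    print(f"{z}+ : m/z {mz:.2f}")
```

Each charge state lands well below m/z 3000 even though the neutral mass is 50,000 Da, which is why an analyzer with a limited m/z range can still detect the protein.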
Matrix Assisted Laser Desorption (MALDI)
Sample Preparation:
• The analyte is in an involatile deposit.
• It is obtained by various preparation
methods which frequently involve the
introduction of a matrix e.g., 2,5-
Dihydroxybenzoic acid (DHB) that can
be either a solid or a viscous fluid.
• The matrix absorbs and disperses the
large amount of energy from the laser and
minimizes fragmentation of the
molecule.
Matrix Assisted Laser Desorption (MALDI)
• This deposit is then irradiated by energetic particles or photons that
desorb ions near the solid surface of the deposit.
• A pulsed laser is used. It provides selective ionization by choosing an
appropriate wavelength, e.g., IR laser, CO2 laser, UV laser, YAG laser etc.
• It is important that the ions produced in the ionization chamber have
a free run through the machine without colliding with air molecules.
• MALDI generates spectra that contain predominantly singly charged ions.
• These ions can be extracted by an electric field and focused towards
the analyzer.
Mass Analyzers
• Time of Flight Mass Analyzer (TOF)
• Quadrupole Mass Analyzer
• Magnetic Sector Mass Analyzer
• Electrostatic Sector Mass Analyzer
• Quadrupole Ion Trap Mass Analyzers
• Ion Cyclotron Resonance
Time of Flight Mass Analyzer (TOF)
• Pulses of ions are accelerated into the evacuated drift tube (free of field or
external force).
• Ions are separated in the drift tube according to their velocities.
• Velocity of an ion depends on m/z.
• Lighter ions move faster along the tube than heavier ions.
• If the particles all have the same charge, the kinetic energies will be
identical, and their velocities will depend only on their masses.
• TOF is the measured time they take to reach the detector.
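Time of Flight Mass Analyzer (TOF)
• An ion accelerated through voltage V gains kinetic energy $zV = \tfrac{1}{2}mv^2$, so $v = \sqrt{2zV/m}$.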
• m = mass of ion
• v = velocity of ion
• z = charge of ion
• V = accelerating voltage
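With these symbols, the kinetic energy gained on acceleration equals the charge times the voltage, which gives v = sqrt(2zV/m) and a flight time of tube length divided by v. The sketch below works this through in SI units. It writes the ion charge as z multiplied by the elementary charge, and the 20 kV voltage and 1 m drift tube are assumed values for illustration only.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms

def ion_velocity(mass_da, charge, voltage):
    """Velocity (m/s) after acceleration: z*e*V = (1/2) m v^2."""
    energy = charge * ELEMENTARY_CHARGE * voltage
    return math.sqrt(2 * energy / (mass_da * DALTON))

def flight_time(mass_da, charge, voltage, tube_length_m):
    """Time (s) to cross a field-free drift tube of the given length."""
    return tube_length_m / ion_velocity(mass_da, charge, voltage)

# Assumed instrument settings: 20 kV acceleration, 1 m drift tube.
for mass in (1000.0, 2000.0, 4000.0):
    t = flight_time(mass, 1, 20000.0, 1.0)
    print(f"{mass:.0f} Da, 1+ : {t * 1e6:.2f} microseconds")
```

For ions of equal charge, the flight time scales with the square root of the mass, so an ion four times heavier takes twice as long to reach the detector.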
Mass Detectors
• Mass detectors measure the abundance of ions after they have been
separated by the mass analyzer.
• They convert ions into electrical signals that can be quantified.
• The detection process is typically based on the principle of ion-to-electron
conversion, where ions interact with a detector surface and release
electrons, generating an electric current.
• The output of the mass detector is a mass spectrum, which represents the
relative abundance of ions at different m/z values.
• Common types of mass detectors include electron multipliers,
photomultiplier tubes, microchannel plates, and semiconductor-based
detectors.
Mass Spectrum
Mass spectrum of single charged ions:
Mass Spectrum
Mass spectrum of multiple charged ions:
Mass Spectrum
• Monoisotopic mass
is the mass
determined using the
masses of the most
abundant isotopes.
• Average mass is the
abundance weighted
mass of all isotopic
components.
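The difference between the two definitions can be seen by computing both for one elemental composition. A minimal sketch using rounded atomic masses; glycine (C2H5NO2) is chosen here only as a small, easily checked example.

```python
# Mass of the most abundant isotope of each element (Da), rounded.
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "S": 31.972071}
# Abundance-weighted average atomic masses (Da), rounded.
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

def formula_mass(composition, table):
    """Sum atomic masses for a composition such as {"C": 2, "H": 5}."""
    return sum(table[element] * count for element, count in composition.items())

glycine = {"C": 2, "H": 5, "N": 1, "O": 2}
mono = formula_mass(glycine, MONOISOTOPIC)
avg = formula_mass(glycine, AVERAGE)
print(f"monoisotopic: {mono:.4f} Da, average: {avg:.3f} Da")
```

The gap between the two values grows with molecular size, because larger molecules are increasingly likely to contain at least one heavy isotope.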
Mass Spectrum
Mass Resolution:
• Ability of a mass spectrometer
to distinguish between ions of
different m/z ratios.
• Width of peak indicates the
resolution of the MS
instrument.
• The better the resolution or
resolving power, the better the
instrument and the better the
mass accuracy.
Mass Spectrum
Mass Resolution:
• Resolving power is defined as R = m/ Δm
• m is the mass number of the observed mass.
• Δm is the difference between two masses that
can be separated.
• For a single peak, Δm is taken as the width of the peak at half of its
maximum height (FWHM).
• Mass accuracy is the difference between
measured and actual mass.
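Both quantities are simple ratios. The sketch below computes the resolving power from R = m/Δm and expresses mass accuracy as a relative error in parts per million (ppm), a common convention that is assumed here rather than taken from the slides. The numbers are illustrative.

```python
def resolving_power(mass, delta_m):
    """R = m / delta_m, with delta_m the peak width at half maximum."""
    return mass / delta_m

def mass_error_ppm(measured, actual):
    """Signed difference between measured and actual mass, in ppm."""
    return (measured - actual) / actual * 1e6

# Illustrative peak at m/z 1000 that is 0.1 wide at half height.
print(resolving_power(1000.0, 0.1))
# Illustrative measurement 0.005 above the true value.
print(mass_error_ppm(1000.005, 1000.0))
```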
Bioinformatics
• Analyze and assemble the
MS data
• Study protein-protein
interactions
• Design interactive
pathways (interactome)
• Predict 3-D structures of
proteins
Proteins as Drug Targets
• Most drug targets are proteins that are encoded by genes expressed
within tissues affected by a disease.
• It is estimated that the human genome encodes approximately:
o 10,000 different enzymes
o 2,000 different G-protein-coupled receptors
o 200 different ion channels
o 100 different nuclear hormone receptors
• These proteins are key components of malignant transformation
pathways and, therefore, are likely to be a good source of new drug
targets.
Proteins as Drug Targets
• At present, proteins represent more than 90% of druggable targets
and more novel targets are being identified.
• The main challenge of the drug discovery process is to develop the
right drug for the right target.
• The pattern of protein changes after drug application will give
information about the mechanism of action, either for therapeutic or
toxicological effects.
• Various drugs might be compared and grouped according to their
actions on signaling cascades or metabolic pathways.
• Side effects can also be described if additional proteins are involved.
Proteins as Drug Targets

Expression Proteomics
• Analysis of changes in protein expression levels in different cellular
situations or subcellular localizations (normal vs. diseased).
• Techniques: PAGE, HPLC, MS techniques.
Interaction Proteomics
• Determination of protein-protein interactions and the description of
protein networks.
• Techniques: interaction mapping, affinity capture combined with mass
spectrometry.
Structural Proteomics
• Determine or predict the three-dimensional structure of proteins.
• Techniques: NMR, X-ray crystallography, homology and molecular modeling.
Interaction Proteomics
Protein-protein interactions (PPIs) are involved in many biological
processes:
• Signal transduction
• Protein complexes or molecular machinery
• Protein carrier
• Protein modifications
PPIs help to interpret the molecular mechanisms underlying the
biological functions and enhance the approaches for drug discovery.
Interaction Proteomics
PPIs Databases:
• DIP – Database of Interacting Proteins
• MIPS – Munich Information Center for
Protein Sequences
• STRING – Search Tool for the Retrieval of
Interacting Genes/ Proteins
• PRISM – Protein Interactions By Structural
Matching
Structural Proteomics
• Knowledge of the 3-D structure of proteins is essential for understanding
their function.
• If protein structures are known but their functions are unknown, the
functions can be predicted by searching for similar structures whose
functions are known, as structurally similar proteins tend to have similar
functions.
• These structures (often shown as ribbon cartoons) are stored in the Protein
Data Bank (PDB) and are referred to as PDB structures.
• An entry may be a whole protein structure or, most of the time, a part of a
particular protein.
• The locations of protein domains can be identified by inspecting the
structures manually in PyMOL or RasMol, as well as by using sequence
alignments.
Materials used in Proteomics
• Human materials
• Animal materials
• Plants
• Bacteria
• Viruses
• Food
Biomarkers
• Biomarker, or biological marker, is in general a substance used as an
indicator of a biological state.
• It can be a substance whose detection indicates a particular disease
state, for example, the presence of an antibody may indicate an
infection.
• Proteins that are produced in increased amounts in disease states can
serve as biomarkers to detect specific diseases.
• They are termed protein biomarkers.
Protein Biomarkers
• Accurate measurement of panels of protein biomarkers in body fluids has
the potential to enable early detection of disease and to help direct
individualized therapy.
• It is a tool for target discovery and evaluation of a drug’s mechanism of
action, therefore, useful in drug discovery and development.
Types of Protein Biomarkers:
• Hormones
• Enzymes
• Glycoproteins
• Receptors
Identification of Protein Biomarkers
Pathway to Develop Biomarkers
Phase 1:
• Quantification of proteins by employing MS based methods.
Phase 2:
• Characterization of differentially abundant proteins by bioinformatics
software and resources, e.g., GO annotation, UniProt, BLAST, PyMOL etc.
Phase 3:
• Experimental validation of the findings, e.g., by Selected Reaction
Monitoring (SRM).
Gene Ontology (GO) annotations
• GO annotation is a statement about the function of a particular gene.
• These concepts describe what a protein does (Molecular function),
how it does it (Biological process) and where it carries out this task in
a generic cell (Cellular component).
• GO analysis can be used to integrate proteomic data from
different species, classify differentially abundant proteins, predict the
functions of specific protein domains and identify genes involved in
certain diseases.
Membrane Proteins as Drug Targets
• Membrane proteins are responsible for membrane functions such as
transport, signaling, cell-cell recognition, intercellular joining, and
attachment.
• There are two major classes of membrane proteins: integral and
peripheral.
• Integral membrane proteins play essential roles in numerous
physiological functions, such as molecular recognition, energy
transduction and ion regulation.
• Analysis of membrane proteins has many experimental challenges.
Membrane Proteins as Drug Targets
• Even so, studying these proteins is critical since they represent more
than 60% of drug targets.
• G-protein-coupled receptors (GPCRs) are a popular class of
membrane proteins.
• Malfunction of GPCRs results in serious disorders such as
hypertension, congestive heart failure, stroke and cancer.
• Therefore, ongoing technological advances are exploited to study
membrane proteins, in order to improve or develop novel
pharmacological drug targets.
Membrane Proteins as Drug Targets
Challenges in membrane protein analysis:
• They are hydrophobic (difficult to dissolve)
• The majority of membrane proteins are glycoproteins. Glycans are
attached to the outer part of the proteins, making them difficult to analyse.
• High-abundance proteins mask low-abundance proteins, which then go
undetected by the usual protein analysis methods.
• Protein modifications (PTMs) cause structural & functional changes.
• Different isoforms of a protein may be produced from related genes
or may arise from the same gene by alternative splicing.
Peptide Mass Fingerprinting (PMF)
• Proteolytic digestion
• Mass spectrometry
• Peptide mass determination: The measured masses are compared with
theoretical peptide masses derived from protein sequence databases. The most
common approach is to search a database such as UniProt or NCBI using software
tools like Mascot, SEQUEST, or ProteinPilot.
• Database matching: The experimental peptide masses are matched with the
theoretical masses using algorithms and statistical methods. The goal is to
identify the protein(s) that best match the observed peptide masses.
• Protein identification: Based on the peptide masses and their matching to
database entries, the protein(s) present in the original sample are identified. The
identification is typically reported as the protein(s) with the highest matching
score or lowest expectation value.
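The steps above can be sketched end to end on a toy database. This is a simplified illustration that scores each protein by a plain count of matched peptide masses within a tolerance. It is not the scoring algorithm of Mascot, SEQUEST or any other tool. The two protein sequences are made up, and the "observed" masses are simulated from one of them with a small offset.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the terminal H and OH

def digest(sequence):
    """Tryptic digest: cleave after K or R unless followed by P."""
    peptides, start = [], 0
    for i in range(len(sequence) - 1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER

def count_matches(observed, sequence, tolerance):
    """Number of observed masses explained by the protein's peptides."""
    theoretical = [peptide_mass(p) for p in digest(sequence)]
    return sum(
        1 for mass in observed
        if any(abs(mass - t) <= tolerance for t in theoretical)
    )

def best_match(observed, database, tolerance=0.2):
    """Return the top-scoring protein name and all scores."""
    scores = {
        name: count_matches(observed, seq, tolerance)
        for name, seq in database.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical two-entry database.
DATABASE = {
    "protein_A": "AGLKSSVEERTTWPK",
    "protein_B": "MMQKDDFHRGGYK",
}
# Simulated measurement: protein_A peptide masses with a 0.05 Da offset.
observed = [peptide_mass(p) + 0.05 for p in digest(DATABASE["protein_A"])]
best, scores = best_match(observed, DATABASE)
print(best, scores)
```

A plain match count cannot express how likely a match is to occur by chance, which is why real search engines report a probability-based score or expectation value instead.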
Peptide Mass Fingerprinting (PMF)
• PMF has been widely used in protein identification and characterization,
particularly for samples where the protein of interest is relatively abundant
or has known sequence information.
• It is commonly used in conjunction with database searching and statistical
analysis to assign identities to the proteins based on peptide mass
information.
• It's important to note that PMF has limitations, especially for complex
mixtures or when dealing with proteins that have closely related or highly
homologous sequences.
• In such cases, additional techniques such as tandem mass spectrometry
(MS/MS) or multi-dimensional chromatography may be employed to
improve the accuracy and reliability of protein identification.
Peptide Mass Fingerprinting (PMF)
• Also, in a mass spectrometer, the most abundant peptides give
high-intensity signals, while less abundant peptides give
lower-intensity signals. The high-intensity signals mask the
low-intensity ones, so those masses are excluded from
the search.
• Therefore, PMF usually identifies only a small number of proteins,
confined to the highly abundant ones. It fails to identify proteins
present in very minute quantities. Contaminants, e.g., keratin, are also
identified along with the desired proteins.
• In such cases also, tandem mass spectrometry (MS/MS) or multi-
dimensional chromatography may be employed.
Tandem mass spectrometry (MS/MS)
• MS/MS offers several advantages over traditional peptide mass
fingerprinting (PMF) techniques.
• It provides additional information about the sequence and structure
of the molecule by analyzing the fragmentation pattern.
• It enables the identification of post-translational modifications,
determination of protein isoforms, and characterization of small
molecule metabolites.
• MS/MS can also be used for quantification purposes using techniques
such as selected reaction monitoring (SRM) or parallel reaction
monitoring (PRM).
Tandem mass spectrometry (MS/MS)
• It involves two successive stages of mass spectrometry: the precursor
ion selection and fragmentation, followed by the analysis of the
resulting fragment ions.
• Precursor ion selection: The first stage of MS/MS involves selecting a
specific ion, known as the precursor ion, for fragmentation. This ion is
typically chosen based on its abundance, intensity, or specific
selection methods.
• Fragmentation: Once the precursor ion is selected, it is fragmented
by collision with an inert gas such as argon or neon to generate smaller
fragment ions.
Tandem mass spectrometry (MS/MS)
• Fragment ion analysis: The resulting fragment ions are separated based on
their m/z ratio and detected in the second stage of the mass spectrometer.
• Mass spectrum interpretation: The acquired fragment ion mass spectrum is
then interpreted to determine the sequence or structure of the original
molecule. This is achieved by comparing the observed fragmentation
pattern with theoretical fragmentation patterns derived from known
peptide or compound sequences.
• Database searching and analysis: The obtained fragment ion data are
typically searched against protein or compound databases using specialized
software tools, such as Mascot, SEQUEST, or MaxQuant. The database
search helps in identifying the precursor molecule by matching the
experimental fragment ion data with theoretical fragmentation patterns.
Tandem mass spectrometry (MS/MS)
• b ions and y ions are commonly observed fragment ions generated
during the fragmentation of peptides.
• The fragmentation patterns of these ions provide sequence-specific
information, allowing for the reconstruction of the peptide sequence.
• By comparing the observed fragment ions with theoretical
fragmentation patterns generated from known peptide sequences,
researchers can match the experimental data to a peptide sequence
in a database and confidently identify the peptide.
• Other types of fragment ions, such as a ions, c ions, and internal ions,
may also be observed depending on the fragmentation technique and
the nature of the peptide being analyzed.
Tandem mass spectrometry (MS/MS)
• b ions, also known as N-terminal ions, are formed when a peptide bond is
cleaved and the charge is retained on the N-terminal fragment. A b ion
therefore contains the N-terminal amino acid and extends towards the
C-terminal end as far as the cleavage site. b1 represents
the N-terminal fragment containing the first amino acid of the peptide, b2
represents the fragment containing the first two amino acids, and so on.
• y ions, also known as C-terminal ions, are formed when a peptide bond is
cleaved and the charge is retained on the C-terminal fragment. A y ion
therefore contains the C-terminal amino acid and extends towards the
N-terminal end as far as the cleavage site. y1 represents
the C-terminal fragment containing the last amino acid of the peptide, y2
represents the fragment containing the last two amino acids, and so on.
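The b and y series for a peptide can be computed directly from residue masses. A minimal sketch for singly charged fragments, assuming the usual conventions that a b ion is the sum of its residues plus one proton and a y ion is the sum of its residues plus water plus one proton. The peptide AGLK is a made-up example.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i - 1] covers the first i residues; y_ions[i - 1] covers the
    last i residues. The full-length fragments are omitted.
    """
    residues = [MONO[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(residues[:i]) + PROTON)
        y_ions.append(sum(residues[-i:]) + WATER + PROTON)
    return b_ions, y_ions

b_ions, y_ions = fragment_ions("AGLK")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} = {b:.4f}   y{i} = {y:.4f}")
```

The mass difference between consecutive ions in one series equals a residue mass, which is what allows the sequence to be read off the spectrum.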
Importance of identifying low abundance proteins
• Low-abundance proteins may be involved in mechanisms that cannot
be predicted.
• Therefore, identifying them, especially in the disease state,
makes it possible to study their functions.
• These proteins may become biomarkers for diseases.
• They may become drug targets.
LC-ESI-MS/MS
• Fast, sensitive and accurate.
• Nanogram quantities of protein can be analyzed.
• Low-abundance proteins can also be detected.
• The molecular weight, pI value and amino acid sequence of the desired
proteins can be determined.
• Thousands of proteins in a mixture could be identified at once.
• Modification of the proteins can also be detected.
Example
Identifying the different proteins on lymphoblasts present in the bone marrow of B-ALL patient:
