Final Notes INT3007


Week 1: Network biology

Lecture: Network biology


1. General introduction
a. Context
i. Complex networks are formed by metabolites, genes, mRNA, and proteins
1. Interaction in metabolic pathways differ less across species from larger
scale organization
ii. Quantitative measurements
1. Isolated values
2. No interaction, so we can use comparative statistics, clustering, and
forming gene sets
iii. How do these elements affect each other?
1. Functional organization – pathways
a. Simplest form of networks to describe a cellular process
2. System organization - networks
b. Dataset (gene expression)
i. Gene expression data
1. Ex. Samples of primary tumor tissue can be compared with healthy
tissue by using RNA-sequencing to measure how much of a gene is
expressed/time
2. Raw counts of gene expression
3. Data can be pre-processed and statistically analyzed to compare
samples (comparison between groups)
ii. Reading data
1. GeneID: identifier in online database
2. GeneName: symbol
3. Log2FC: log2 of fold change
a. Fold change: ratio of expression b/w cancer and healthy
tissues
b. Is a gene more or less expressed?
c. +ve = upregulated in cancer sample
d. -ve = downregulated in cancer sample
4. P.Value: significance level of comparison
5. Adj.P.Value: corrected p value for multiple testing
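The Log2FC column described above can be illustrated with a small calculation. This is only a sketch: the function name, the example counts, and the pseudocount (added so that a zero count does not cause a division by zero) are assumptions, not part of the lecture.

```python
import math

def log2_fold_change(cancer_expr, healthy_expr, pseudocount=1.0):
    """Log2 of the ratio of mean expression in cancer vs. healthy samples.

    The pseudocount is an assumption made for this sketch; it keeps the
    ratio defined when a gene has zero counts in one group.
    """
    mean_cancer = sum(cancer_expr) / len(cancer_expr)
    mean_healthy = sum(healthy_expr) / len(healthy_expr)
    return math.log2((mean_cancer + pseudocount) / (mean_healthy + pseudocount))

# Hypothetical counts for one gene in two cancer and two healthy samples
up = log2_fold_change([31, 31], [7, 7])    # ratio 32/8 = 4, positive: upregulated in cancer
down = log2_fold_change([7, 7], [31, 31])  # ratio 8/32 = 0.25, negative: downregulated in cancer
```

A positive value means the gene is more expressed in the cancer samples, a negative value means it is less expressed.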
2. Pathway analysis
a. Biological pathway models
i. Signaling: signal results in response by system
1. Ex. Cell shape change or differentiation
2. Start of all pathways
ii. Metabolic: energy storage/production changes
iii. Gene regulation: transcription factors are activated to produce new proteins
iv. Overlap b/w pathways
v. Why
1. Data put into biological context (functional level)
2. Visual representation
3. Reduces complexity
b. Databases
i. KEGG
ii. Reactome
iii. WikiPathways
1. High output of data, so hard to keep accessible and up-to-date
2. Relies on crowd sourcing to synthesize pathways
3. Percentage of covered information is still not high (ex. 50% for protein
coding genes)
c. Software: Pathvisio
i. Uses
1. Designs pathways
a. Data nodes
b. Interactions as edges
c. Can add graphical annotations (ex. cellular compartments,
identifiers, lit references)
2. Visualizing experimental data
3. Pathway statistics
ii. Tutorial
1. Genes/proteins – black, can be grouped in complexes
2. Metabolites – blue
3. Interactions – arrowhead defines type
4. Graphical annotations (ex. cell shapes, color, etc)
5. Links to other pathways – green
iii. Data visualization
1. Color gradients and color rules
a. Ex. Genes color coded for P values and logFC
2. Multi-omics visualization
3. Time-series visualization
4. Annotation
a. Many databases, therefore many identifiers for same biological
elements
b. Identifier mapping databases (database/identifier)
c. BridgeDB knows how to map into different databases
d. Pathway analysis methods
i. Input data set and pathways database
ii. Over-representation analysis (ORA)
1. Define what cutoff is for expression
2. Assess each pathway for number of differentially expressed genes
3. Assess pathway significance
iii. Functional class scoring (FCS)
iv. Pathway topology based (PT)
v. Example ORA
1. Input: Significantly up/down-regulated genes
2. Background list: All measured genes
3. Statistical test: Fisher’s exact test
4. Output: list of up/down-regulated pathways with Z-score and permuted p-value
a. Perm. p-value: take random pathways of the same size and check whether they
reach the same significance as the initial pathway
vi. Z-score calculated for each pathway to rank them (higher score = higher rank)
1. Capitals: pathway independent, lowercase: pathway dependent
a. N – background list (total number of measured genes in
experiment)
b. R – input list (how many were sig. up/down-regulated)
c. n – number of genes in pathway
d. r – number of changed genes in pathways
2. Z > 1.96 – Significantly more genes changed than expected
a. Altered pathways
3. Z = 0 – Distribution of genes in data set is same as expected
4. Z < -1.96 – Significantly fewer genes changed than expected
a. Stable pathway
vii. Limitations of ORA and FCS
1. Do not take pathway topology/network structure into account
2. Cannot know where changes are coming from, or what the role of the
changed genes is
3. Diagrams needed to make right conclusion
4. High resolution and low coverage
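The Z-score used to rank pathways in the ORA example can be sketched with the four quantities N, R, n, and r defined above. The notes do not give the formula; the sketch assumes the hypergeometric-based form commonly used by pathway tools such as PathVisio, and the example numbers are hypothetical.

```python
import math

def pathway_z_score(N, R, n, r):
    """Z-score for one pathway (assumed hypergeometric-based formula).

    N: total number of measured genes (background list)
    R: number of significantly changed genes (input list)
    n: number of measured genes in the pathway
    r: number of changed genes in the pathway
    """
    expected = n * R / N
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)

# Hypothetical experiment: 10,000 measured genes, 1,000 of them changed,
# and a pathway with 100 measured genes (10 changed genes expected by chance).
z_expected = pathway_z_score(10000, 1000, 100, 10)  # as many changed genes as expected
z_altered = pathway_z_score(10000, 1000, 100, 25)   # more changed genes than expected
z_stable = pathway_z_score(10000, 1000, 100, 2)     # fewer changed genes than expected
```

With these numbers the first pathway scores 0, the second lies above 1.96 (altered), and the third lies below -1.96 (stable).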
3. Network analysis
a. Biological networks: What is the context of the network that we are working with?
i. Why?
1. Good for complex networks
2. Efficient
3. Data integration
4. Visualization
ii. Types
1. Molecular
a. Gene-protein: genes encode proteins
b. Protein-protein: proteins interact to form complexes etc.
c. Protein-gene: proteins can regulate gene expression
d. Metabolism: proteins needed to catalyze biochemical reactions
2. Cell-cell
3. Nervous system
4. Human disease
5. Social networks
b. Terminology
i. Nodes: set of objects where some pairs are connected by edges
ii. Edges: interactions in a network
iii. Neighbor: nodes linked by a direct edge
iv. Path: sequence of edges connecting a sequence of nodes
1. Shortest path: min number of edges to get from one node to another
v. Adjacency matrix
1. Aij = 1, an edge between nodes i and j
2. Aij = 0, no edge between nodes i and j
3. Will depend on directionality of edges
vi. Directed vs. undirected networks
1. Un: Symmetrical matrix, ex. two proteins binding to each other
2. Directed: Arcs
vii. Weighted networks
1. Can represent flux, flow parameter, or strength for ex.
2. Will affect shortest path calculation (ex. traffic)
viii. Centrality measures
1. Indicators to identify the most important nodes and/or edges
2. Degree centrality
a. Undirected
i. Node degree: number of edges connected to a node
b. Directed
i. In-degree: no. of edges pointing t/w a node
(regulators)
ii. Out-degree: no. of edges leaving a node (targets)
c. Biological interpretation
i. High degree nodes/hub nodes = successful
3. Betweenness centrality
a. Number of shortest paths going through a node
b. =0 – no shortest paths
c. =1 – all shortest paths
d. Biological interpretations
i. Info. load on a node
ii. Control of node over connectivity of a network
4. Clustering coefficient
a. Connectivity of neighborhood (local edge density)
b. How many of the node’s neighbors are connected to each other
c. Ci = 1 – all neighbors connected
d. Ci = 0 – none of the neighbors connected
e. Biological reference
i. Protein clusters
c. Finding the networks
i. Network sources
1. Depends on biological question and analysis plan
a. Start with gene list
b. Construct a network or find pathways of interest
2. Broad coverage but low resolution
a. Interactions may not be able to explain tissue-level interactions
for ex.
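The terminology from this lecture (adjacency matrix, node degree, clustering coefficient, shortest path) can be tried out on a tiny network. This is a sketch on a made-up undirected four-node network. The clustering coefficient uses the usual definition Ci = 2 * (links between neighbors) / (k * (k - 1)), which is an assumption here because the notes do not state the formula.

```python
from collections import deque

# Hypothetical undirected network with nodes A, B, C, D (indices 0..3).
# adjacency[i][j] = 1 means there is an edge between nodes i and j.
adjacency = [
    [0, 1, 1, 1],  # A is linked to B, C, D
    [1, 0, 1, 0],  # B is linked to A, C
    [1, 1, 0, 0],  # C is linked to A, B
    [1, 0, 0, 0],  # D is linked to A
]

def neighbors(adj, i):
    """Nodes linked to node i by a direct edge."""
    return [j for j, a in enumerate(adj[i]) if a]

def degree(adj, i):
    """Number of edges connected to node i (undirected network)."""
    return sum(adj[i])

def clustering_coefficient(adj, i):
    """Fraction of possible links between the neighbors of i that exist."""
    nb = neighbors(adj, i)
    k = len(nb)
    if k < 2:
        return 0.0
    links = sum(adj[u][v] for idx, u in enumerate(nb) for v in nb[idx + 1:])
    return 2 * links / (k * (k - 1))

def shortest_path_length(adj, start, goal):
    """Minimum number of edges between two nodes (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            return dist[u]
        for v in neighbors(adj, u):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None  # no path exists

hub_degree = degree(adjacency, 0)                  # A has the highest degree
c_of_a = clustering_coefficient(adjacency, 0)      # only B-C of 3 possible links
c_of_b = clustering_coefficient(adjacency, 1)      # neighbors A and C are linked
d_to_c = shortest_path_length(adjacency, 3, 2)     # D -> A -> C
```

The matrix is symmetric because the network is undirected; for a directed network the row sums and column sums would give out-degree and in-degree separately.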

Case: Why do we need hub nodes in biological networks?

1. Differences b/w random and real networks?

a. Real/scale-free networks – Degree distribution (following power law)


i. Arise from continuous growth in the number of nodes, w/ new nodes attaching preferentially to well-connected nodes
ii. Many nodes – few links
iii. Small number of hubs – many links (smaller degree is more common)
iv. Ex. Network of airports covering many distances (well and poorly connected
airports)
b. Random networks – No preferential attachment (degree distribution peaks around the average degree)
i. Fixed number of nodes/links
ii. No highly connected nodes
iii. Many nodes w/ same number of links
iv. Ex. Network of highways (edges) in the US you can reach from cities (nodes)
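The difference in degree distribution can be illustrated with a small simulation. The sketch assumes two standard generators that the notes do not name: a random network in which every pair of nodes is equally likely to be linked, and a growing network in which each new node attaches preferentially to nodes that already have many links. Network size, number of links per new node, and the random seed are arbitrary choices.

```python
import random

def random_network(n_nodes, n_edges, rng):
    """Random network: every pair of nodes is equally likely to be linked."""
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(range(n_nodes), 2)
        edges.add((min(u, v), max(u, v)))
    return edges

def preferential_attachment_network(n_nodes, m, rng):
    """Growing network: each new node links to m existing nodes, chosen with
    probability proportional to their current degree."""
    edges = set()
    # Start from a small fully connected core of m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.add((u, v))
    # Every node appears in this list once per link it has, so a uniform
    # draw from the list is a degree-proportional draw.
    ends = [node for edge in edges for node in edge]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in targets:
            edges.add((t, new))
            ends.extend([t, new])
    return edges

def degrees(edges, n_nodes):
    """Degree of every node, given a set of undirected edges."""
    deg = [0] * n_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

rng = random.Random(0)
n = 2000
scale_free = preferential_attachment_network(n, 2, rng)
rand = random_network(n, len(scale_free), rng)  # same number of nodes and links

max_degree_scale_free = max(degrees(scale_free, n))
max_degree_random = max(degrees(rand, n))
```

Both networks have the same number of nodes and links, but the growing network ends up with a few hubs whose degree is far above the average, while the degrees in the random network stay close to the average.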
2. What defines a hub node? Can be hallmarks of cancer (in a network analysis)
a. Highly connected and essential
b. Crucial in connecting nodes via the shortest/most efficient path
c. Ex. In PPI, removal of hub node is more lethal than non-hub node
d. Centrality-lethality rule
i. Hubs – organize network b/c so critical in PPI
ii. But alternatively, hubs contain larger number of PPIs, w/ a higher chance of
engaging in essential PPIs
3. Do they exist in undirected networks?
a. Uncommon (but not impossible)
b. Probability of finding a hub node in a random network is low (most nodes have a similar degree)
4. What does it mean for nodes to be well connected?
a. Maximum connectivity: every node is connected to each other
b. Minimum connectivity: very little connectivity between nodes
c. Goymer (2008): Yeast interaction networks showed that removing essential hubs/nodes was
no more disruptive than removing non-essential hubs
i. Essentiality should refer to the node’s local neighbourhood rather than global
connectivity
1. Essential proteins do cluster in hub-rich networks
ii. Weakly connected and strongly connected graphs – not necessarily the same as
degree vs essentiality
iii. Ex. Servers on the internet or on social media (Google connects us to all other
websites)
d. A real network (ex. PPI and yeast)
i. Essentiality: has lethal consequences if removed for the organism
ii. Does not necessarily need to have a high degree to be essential, because betweenness
also plays a factor
5. What are the hallmarks of cancer?
a. 6 hallmarks of cancer (maybe a 7th-10th in some cases)
i. Resisting cell death, sustaining proliferative signaling, evading growth suppressors,
activating invasion and metastasis, enabling replicative immortality, inducing
angiogenesis (development of new blood vessels)
ii. Chronic proliferation: affects progression of cell growth and division cycle
b. Hallmarks could be characterized by dense clusters in gene expression networks
c. Look at knocking out certain genes
i. Ketogenic diet – proposed as a way to slow tumor growth because it stops ‘feeding’ tumors
6. Example of random network in biological context
7. Why do hubs tend to be essential in protein networks?

Week 2: Metabolic modeling

Lecture: Introduction to metabolic modeling


1. Metabolism: life-sustaining chemical transformations within the cells of organisms
a. Purpose
i. Extraction of energy
ii. Storage of fuels
iii. Synthesis of proteins, lipids, nucleic acids, and carbs
iv. Elimination of waste
b. Connected chemical reactions (conversion to products which are the substrates for the next
reactions)
i. Catalyzed by enzymes
ii. Activity is tuned according to immediate needs or changes in the environment
iii. Network
1. Nodes: metabolites
2. Edges: reactions
c. ATP as energy currency
i. Adenosine triphosphate
ii. Electrostatic repulsion b/w the negatively charged phosphate groups stores energy
iii. Energy released by hydrolysis of phosphate bond
iv. Energy from catabolism → ATP + H2O → Energy for cellular work → ADP + Pi
d. Role of metabolism in health and disease
i. Wild type, Mendelian disorder (single enzyme defects), and complex disease
ii. Major or minor defects across networks
iii. Medical relevance
2. Studying metabolic networks
a. Network connectivity
i. Node connectivity of metabolic networks (ex. degree distribution)
1. Some metabolites which participate in many reactions, and many
metabolites which only participate in a few reactions
2. Occurs for all domains
b. Dynamic modeling : Reaction rates (fluxes)
i. Enzyme reaction rates governed by rate laws
1. Michaelis-Menten kinetics: Substrate conc vs reaction rate
ii. What kinds of concentrations and through-flux will we get for a certain metabolite
1. Could change initial conditions to model metabolic activity
iii. Genome-scale network?
1. Requires extensive kinetic data
2. What are the existing rate laws
3. Too large to be feasible for entire metabolic networks
c. Modeling metabolic flux w/out kinetic information
i. Look at steady state concentrations of the metabolites
ii. Steady state – situation in which all state variables are constant despite processes
which influence them
1. Constant concentration of all metabolites (d[metabolite]/dt = 0) = mass
balance
iii. Requires stoichiometry
1. Need network to be flexible
2. Have subset of possible pathways through network which yield a steady
state (including flux partitioning)
iv. What can we learn from them? Can study possible phenotypes
3. Constraint-based modeling
a. Modeling network steady states based on mass balance constraints
b. Can change certain constraints
i. Ex. What is the max. biomass production rate (growth) we could have?
ii. Anaerobic vs. aerobic conditions (flux changes from Krebs cycle to fermentation
flux)
c. Toy system – Mass balance
i. A → X → Y
1. v1, v2, v3, v4 – reaction rates/fluxes
a. Conversion rates of reactants w/ unit stoichiometry
2. Mass balance of X = d[X]/dt = v1 – v2 = 0
3. v1 = v2
ii. Possible solutions
1. Plot v1 versus v2
2. Only points along main diagonal will correspond to steady state conditions
3. Solution: b/c v1 = v2, the flux vector is v = c * [1,1], where the
constant c is not determined by stoichiometry alone
iii. A → X → Y → B/C
1. d[X]/dt = v1 – v2 = 0
2. d[Y]/dt = v2 – v3 – v4 = 0
3. Write the two mass balances together and we get a matrix w/ one row per
metabolite and one column per reaction – stoichiometric matrix S
4. dc/dt = S * v = 0 (c = metabolite concentrations), where the steady-state flux vectors v form the null space of S
iv. Constrained solution space
1. Solution can produce 2D plane with all possible solution vectors
2. Can never pick a single steady state, but we can map possible space
3. Can optimize over the space to find the ‘best possible value’ of an objective and the
corresponding flux vector
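The second toy system (X produced by v1 and consumed by v2, Y produced by v2 and consumed by v3 and v4) can be written out to show what "flux vectors in the null space of S" means. This is a sketch: the two basis vectors were derived by hand from v1 = v2 and v2 = v3 + v4, and the helper names are made up.

```python
# Stoichiometric matrix: rows = metabolites, columns = reactions v1..v4
S = [
    [1, -1,  0,  0],  # X: produced by v1, consumed by v2
    [0,  1, -1, -1],  # Y: produced by v2, consumed by v3 and v4
]

def mat_vec(matrix, vector):
    """Matrix-vector product, giving d[metabolite]/dt for each row."""
    return [sum(s * x for s, x in zip(row, vector)) for row in matrix]

# Two independent steady-state flux patterns (a basis of the null space of S)
basis_via_v3 = [1, 1, 1, 0]  # all flux leaves Y through v3
basis_via_v4 = [1, 1, 0, 1]  # all flux leaves Y through v4

def steady_state_flux(a, b):
    """Any combination a * basis_via_v3 + b * basis_via_v4 is a steady state."""
    return [a * x + b * y for x, y in zip(basis_via_v3, basis_via_v4)]

v = steady_state_flux(3, 2)      # 3 units through v3, 2 units through v4
balance = mat_vec(S, v)          # all entries are zero at steady state
not_balanced = mat_vec(S, [1, 0, 0, 0])  # only v1 active: X would accumulate
```

The two free coefficients a and b span the 2D plane of possible solutions; stoichiometry alone does not pick one point on it.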
4. Constructing genome-scale constraint-based model
a. Scales to genome size because no kinetic info is required
b. In humans
i. Networks cover all metabolic functions, but only a subset will be activated in a
specific cell type or condition
ii. Gene expression data can be mapped onto network to show reaction network for a
specific condition
c. Practical: Warburg effect
i. Gene expression is continuous
ii. Ex. distribution of RNA expression is very continuous, so how can we get a discrete
network from mapping the data onto a model?
iii. iMAT
1. Grouping genes into high/medium/low expression
2. Find network structure and flux state that maximizes agreement w data
iv. Warburg effect
1. Even in aerobic conditions, cancer cells prefer metabolism via glycolysis
instead of oxidative phosphorylation
2. Normal cells: ATP from oxidative phosphorylation
3. Cancer cells: Aerobic glycolysis
5. Biomedical application of human condition-specific networks
a. Constructed networks that would describe cancer cells
b. Simulate knock outs (remove one gene at a time) to determine drug targets
c. Flux space sampling
i. What range of possible fluxes could you get for reactions of interest?
ii. Compare distribution b/w networks to identify reaction activity difference

Case: Biological networks and FBA

Creating a stoichiometric matrix:


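Metabolite/reaction   v1   v2   v3   v4   v5   v6   v7   v8   v9   v10
A                      1   -1    0    0   -1    0    0    0    0    0
B                      0    1   -1    0    0    0    0    0    0    0
C                      0    0    1   -1    0    0    0    0    0    0
D                      0    0    0    0    0    1   -1    0    0    0
E                      0    0    0    0    1   -1    0    0    0    0
F                      0    0    0    0    0    0    1   -1    0    0
G                      0    0    0    1    0    0    0    1   -1    0
ADP                    0    0   -2    0    2    0   -3    0    0    1
ATP                    0    0    2    0   -2    0    3    0    0   -1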

Identifying coupled reactions:


Mass balance – each metabolite must be produced at the same rate as it is consumed. For a metabolite that
takes part in only two reactions, the incoming and outgoing reaction must carry the same flux (coupled
reactions). Which coupled reactions can be identified?

Metabolites participating in two reactions – B, C, D, E, F


B. v2 = v3
C. v3 = v4 ∴ v2 = v3 = v4
D. v6 = v7
E. v5 = v6
F. v7=v8 ∴ v5 = v6 = v7 = v8

Metabolites participating in more – A, G


A. v1 = v2 + v5; since v2 = v3 = v4 and v5 = v6 = v7 = v8, v1 = v4 + v8
G. v9 = v8 + v4
∴ v1 = v9
Mass balance for ATP
0 = 2(v3) - 2(v5)+ 3(v7) - (v10)
2(v3) + 3(v7) = 2(v5) + v10 (v5 = v7)
2(v3) + v7 = v10 ∴ Because v3 and v7 are independent, there is a degree of freedom (∴ two
branches possible)

Deriving the net stoichiometry of the upper and lower branches:

Upper branch – v2, v3, v4


Metabolite/reaction   v2   v3   v4   net
A                     -1    0    0   -1
B                      1   -1    0    0
C                      0    1   -1    0
D                      0    0    0    0
E                      0    0    0    0
F                      0    0    0    0
G                      0    0    1    1
ADP                    0   -2    0   -2
ATP                    0    2    0    2

Net: A + 2ADP → G + 2ATP

Lower branch – v5, v6, v7, v8


Metabolite/reaction   v5   v6   v7   v8   net
A                     -1    0    0    0   -1
B                      0    0    0    0    0
C                      0    0    0    0    0
D                      0    1   -1    0    0
E                      1   -1    0    0    0
F                      0    0    1   -1    0
G                      0    0    0    1    1
ADP                    2    0   -3    0   -1
ATP                   -2    0    3    0    1

Net: A + ADP → G + ATP

Conclusion
The upper branch is more energy efficient: it yields a net 2 ATP per A. The lower branch yields only 1 ATP per A,
because it first uses 2 ATP (v5, converting A to E) before producing 3 ATP (v7).

Checking flux balance


Assume that the cell can take up at most 10 mmol/l/min of A through reaction v1: v1 ≤ 10 mmol/l/min. Write down a
vector vopt of reaction fluxes that maximizes ATP production (objective function) in the network, fulfilling both this uptake
limit and all mass balance constraints; and determine the maximal amount of ATP the cell can produce per time.

Note: Under steady-state conditions, all concentrations must remain constant and there can be no overall production of ATP
in the network. Instead, the flux through reaction v10 is used as a measure of the ATP produced, since it consumes the (net)
ATP production from the rest of the network.

How can we verify that this flux vector indeed describes flux balance (steady-state optimum)?
Calculate the product S * vopt (stoichiometric matrix * optimal flux vector) and confirm that it yields the zero vector.

What do we know?
Any influx into A goes via the upper or lower branch (with the only restriction that v1 = v2 + v5, where v5 = v6)
Upper branch = 2 ATPs per A
Lower branch = 1 ATP per A
Therefore, we want to route all flux through the upper branch
v2 = v3 = v4 = 10 mmol/l/min and v5 = v6 = v7 = v8 = 0 mmol/l/min
v9 = 10
Therefore ATP production flux = 20 mmol/l/min = flux through v10

Vopt = (10, 10, 10, 10, 0, 0, 0, 0, 10, 20)

Verifying flux vector describes flux balance


S * vopt = 0

Assume that reaction v4 could not carry any flux (e.g. due to a gene defect or knockout). Determine the ATP-maximizing
flux vector under these conditions and determine the maximal ATP production. Assume the same maximal uptake of A of 10
mmol/l/min.

If v4 could not carry flux, all flux would have to go through the lower branch
v2 = v3 = v4 = 0 mmol/l/min and v5 = v6 = v7 = v8 = 10 mmol/l/min
v1 = v2 + v5 = v5
v1 = v5 = v6 = v7 = v8 = v9
Maximal ATP production is 10 mmol/l/min (v10 = 2(v3) + v7 = 10)
Vopt = [v1, v2, v3, v4, v5, v6, v7, v8, v9, v10]
Vopt = [10, 0, 0, 0, 10, 10, 10, 10, 10, 10]
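The two optimal flux vectors of this case can be checked with a few lines of code. This is a sketch under stated assumptions: the column positions in S are reconstructed from the mass balances in this case (v5 consumes A and 2 ATP, v7 produces 3 ATP), and the optimum is found by simply trying every whole-number split of the uptake between the two branches, which is a brute-force stand-in for the linear programming a real FBA tool would use. Function names are made up.

```python
# Stoichiometric matrix of the case network.
# Rows: A, B, C, D, E, F, G, ADP, ATP. Columns: v1..v10.
S = [
    [1, -1,  0,  0, -1,  0,  0,  0,  0,  0],  # A
    [0,  1, -1,  0,  0,  0,  0,  0,  0,  0],  # B
    [0,  0,  1, -1,  0,  0,  0,  0,  0,  0],  # C
    [0,  0,  0,  0,  0,  1, -1,  0,  0,  0],  # D
    [0,  0,  0,  0,  1, -1,  0,  0,  0,  0],  # E
    [0,  0,  0,  0,  0,  0,  1, -1,  0,  0],  # F
    [0,  0,  0,  1,  0,  0,  0,  1, -1,  0],  # G
    [0,  0, -2,  0,  2,  0, -3,  0,  0,  1],  # ADP
    [0,  0,  2,  0, -2,  0,  3,  0,  0, -1],  # ATP
]

def mat_vec(matrix, vector):
    """S * v: net production rate of every metabolite."""
    return [sum(s * x for s, x in zip(row, vector)) for row in matrix]

def flux_vector(upper, lower):
    """Steady-state flux vector for a given flux through each branch,
    using the coupled reactions v2=v3=v4, v5=v6=v7=v8, v1=v9."""
    v1 = upper + lower
    v10 = 2 * upper + lower  # ATP mass balance: v10 = 2*v3 + v7
    return [v1, upper, upper, upper, lower, lower, lower, lower, v1, v10]

def maximize_atp(uptake_limit=10, v4_blocked=False):
    """Try all whole-number branch fluxes with v1 <= uptake_limit and
    keep the vector with the highest ATP flux v10."""
    best = None
    for upper in range(uptake_limit + 1):
        if v4_blocked and upper > 0:
            continue  # knockout: the upper branch cannot carry flux
        for lower in range(uptake_limit + 1 - upper):
            v = flux_vector(upper, lower)
            if best is None or v[9] > best[9]:
                best = v
    return best

v_opt = maximize_atp()                      # wild type
v_knockout = maximize_atp(v4_blocked=True)  # v4 cannot carry flux

balance_opt = mat_vec(S, v_opt)             # zero vector at steady state
balance_knockout = mat_vec(S, v_knockout)
```

Both products S * v come out as the zero vector, and the ATP flux v10 drops from 20 to 10 mmol/l/min when v4 is knocked out.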

Practical: Metabolic modeling

Week 3

Lecture: Dynamic modeling of glucose homeostasis


1. Modelling in systems biology
a. Representation of real system but will always lack certain features
b. Lower complexity
c. Good model represents all features which are relevant
d. Models in systems biology
i. Genome-scale models
1. Computationally describes gene protein reaction associations for an entire
gene network
2. Simulated to predict metabolic fluxes
3. Adv. Does not need a lot of information, can be applied to large networks
4. Disadv. Only assumes steady state conditions (no unique solutions)
ii. Machine learning/deep learning
1. Programmes which make decisions based on sample data without explicitly
being told to do so
2. Applications in bio and control systems
3. Adv. No human intervention needed, easily identifies patterns
4. Disadv. Large data sets needed to train, lacks interpretation
iii. Dynamic models
1. Simplified representations of real world entities, described by a set of
mathematical equations
2. Describes how system properties change over time
3. Adv. Quantitatively estimate fluxes
4. Disadv. Requires assumptions bc kinetic parameters are required
2. Metabolic modelling
a. Important for human function
i. Relevant to medical treatments: individual approach to medicine (tailored approach
to move t/w personalized treatments)
ii. Nutrition: individual variability when digesting foods
3. Dynamic modelling
a. Mechanistic model
i. Based on mathematical description of bio phenomenon
1. Ex. Glucose-insulin system
ii. Application: investigate short term dynamics (ex. after eating a meal)
1. Can help improve metabolic parameters
iii. Uses ODE
1. Provides quantitative info on the interactions, dynamics, and regulation of
system’s components
2. Equations directly reflect knowledge of glucose metabolism
3. Handled by parameters
iv. Requirements
1. Too simple: doesn’t describe response accurately
2. Too complex: requires info. unavailable from human studies
4. Glucose regulatory system

a. (Diagram of the glucose regulatory system)
b. Molecules
i. Glucose – sugar monomer
ii. Insulin – hormone which promotes glucose uptake and inhibits glucagon
iii. Glucagon – hormone which helps produce glucose from glycogen
iv. Glycogen – complex sugar
c. Mechanism
i. Gut: food enters, digested meal
ii. Plasma: glucose enters plasma through gut
iii. Pancreas: responds to increase in glucose by secreting insulin
1. Insulin ensures uptake of glucose by muscles and fat tissue
2. Ensures this through interstitial fluid
iv. Liver: Insulin inhibits production of glucose in the liver (inhibits glucagon
production)
1. B/c of increased glucose concentrations from meal, production of glucose
decreases
v. Brain/rbc’s: constant uptake of glucose
vi. Fat cells: take up glucose in presence of insulin
d. Healthy individual
i. Food digested → glucose increases → insulin increases → glucose decreases
ii. Insulin increases → glucagon decreases → glucose (from liver) decreases → insulin
decreases
e. T2DM – body becomes resistant to insulin, loss of beta-cell function
i. Organs less responsive to insulin
ii. Increased level of glucose in blood (not taken up)
iii. Beta cells unable to produce enough insulin, damaged by overwork
iv. Complications
1. Accelerated atherosclerosis
2. Increased chance of a stroke
f. Minimal models
i. Parsimonious descriptions of key components of system functionality
5. Glucose minimal model
a. Main concepts
i. Predicts plasma glucose concentration given measured insulin concentrations
(following oral glucose dose)
ii. Includes parameters that govern insulin sensitivity (how much insulin is needed to
dispose of a certain amount of glucose)
iii. Rate of change in glucose mass = input – output
iv. Input
1. Rate of appearance/Ra - glucose appearing in plasma via gut
2. Net hepatic glucose balance/NHGB - glucose production by liver
v. Output
1. Rate of disappearance/Rd – glucose leaving plasma through uptake by
periphery
2. NHGB – glucose uptake by liver
6. Glucose minimal model
a. dQ/dt = Ra(t) + NHGB(t) – Rd(t)
i. NHGB and Rd mediated by insulin
1. Enhances uptake of glucose to periphery
2. Inhibits glucose production from liver
ii. Sg – glucose effectiveness: glucose itself promotes glucose disposal and inhibits hepatic glucose
production (independent of insulin)
1. How much glucose is disposed of, or taken up
iii. dQ/dt
1. Glucose appearing in plasma (through gut) = Ra(alpha,t)
2. Glucose production by liver (basal production) = SgQb
3. Uptake to periphery and liver mediated by insulin = -X(t)Q(t)
4. Uptake by periphery and liver based on glucose effectiveness = - SgQ(t)
iv. dQ/dt = Ra(alpha,t) + SgQb - SgQ(t) - X(t)Q(t)
v. dQ/dt = - (Sg+X(t))*Q(t) + Ra(alpha,t) + SgQb
b. dX/dt – rate of change of insulin action (in remote compartment)
i. Insulin leaving = -p2X(t)
ii. Insulin inflow (dependent on insulin levels in plasma) = p3(I(t)-Ib)
iii. dX/dt = -p2X(t) + p3(I(t)-Ib)
c. NB
i. To go from Q (plasma glucose mass) to G (plasma glucose conc.) you divide by
distribution volume (V)
1. G(t) = Q(t)/V
ii. Sg, p2, and p3 are parameters
d. Applications of this model
i. Provide plasma insulin measurements as input, model predicts corresponding glucose
concentrations
ii. Use: estimating insulin sensitivity using parameters
iii. Si = (p3/p2)*V
1. p2 – regulates insulin outflow (remote compartment)
2. p3 – regulates insulin inflow (remote compartment)
iv. Parameter estimation
1. Using exp. data and model
2. Change the parameters in a certain way so that difference b/w experimental
and simulation results is reduced
a. Produces optimal model
b. Can tell us about the physiology of the individual
3. Residual – difference between experimental and predicted data points
a. Squared and summed up, and we can find the parameter values
which minimize this error
e. Why do we need a model
i. Saves time and money
ii. Clamp studies
1. After overnight fast, insulin is infused to create a new steady-state insulin level
above “fasting”
2. Glucose disposal increases and hepatic glucose production decreases
3. Glucose is also infused to hold glucose levels in a normal range (clamp)
4. After several hours, steady states are achieved for plasma insulin, plasma glucose,
and the glucose infusion rate
5. Glucose infusion rate = glucose utilization
6. Insulin sensitivity = glucose disposal rate / (steady state glucose conc * diff.
b/w steady state and fasting insulin conc.)
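The two differential equations of the glucose minimal model, dQ/dt = -(Sg + X(t))*Q(t) + Ra(t) + Sg*Qb and dX/dt = -p2*X(t) + p3*(I(t) - Ib), can be simulated with a simple forward-Euler loop. This is only a sketch: the function name, the step size, and every parameter value below are hypothetical and chosen for illustration, not fitted to data, and a real analysis would use an ODE solver and measured insulin curves.

```python
def simulate_minimal_model(insulin, ra, Sg, p2, p3, Qb, Ib, dt=1.0):
    """Forward-Euler integration of the glucose minimal model.

    insulin: plasma insulin I(t), one value per time step
    ra:      rate of glucose appearance Ra(t), one value per time step
    Returns the plasma glucose mass Q at every step, starting at basal Qb.
    """
    Q, X = Qb, 0.0  # start at basal glucose mass, no extra insulin action
    trace = [Q]
    for I_t, ra_t in zip(insulin, ra):
        dQ = -(Sg + X) * Q + ra_t + Sg * Qb
        dX = -p2 * X + p3 * (I_t - Ib)
        Q += dt * dQ
        X += dt * dX
        trace.append(Q)
    return trace

# Hypothetical parameter values (illustration only)
params = dict(Sg=0.03, p2=0.02, p3=1e-5, Qb=100.0, Ib=10.0)
V = 1.7  # hypothetical distribution volume

# Fasting: insulin stays at basal level and no glucose appears from the gut
fasting = simulate_minimal_model([10.0] * 60, [0.0] * 60, **params)

# Meal: glucose appears for 30 min while insulin is raised above basal
meal = simulate_minimal_model([50.0] * 120, [5.0] * 30 + [0.0] * 90, **params)

glucose_conc = [q / V for q in meal]            # G(t) = Q(t) / V
Si = (params["p3"] / params["p2"]) * V          # insulin sensitivity index
```

In the fasting run Q stays at its basal value Qb, which is the steady state built into the equations; in the meal run Q first rises above Qb. Parameter estimation would repeat such simulations while adjusting Sg, p2, and p3 until the sum of squared residuals against measured glucose is minimal.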
Case: Oral glucose minimal model

Model predicts glucose concentration in plasma based on glucose concentration of oral dose
https://diabetes.diabetesjournals.org/content/63/4/1203

Q – plasma glucose mass


Rd – rate of plasma glucose disappearance
Ra – rate of glucose appearance in plasma from oral input
NHGB – net hepatic glucose balance
G – plasma glucose concentration
V – glucose distribution volume
Qb – basal plasma glucose mass

By assuming Rd and NHGB are linearly dependent on Q (modulated by insulin in remote compartment – not
plasma)
Sg – fractional glucose effectiveness (glucose ability to promote glucose disposal and inhibit NHGB)
I – insulin concentration
X(t) – insulin action on glucose disposal/production; its rate of change (dX/dt) depends on insulin leaving and
entering the remote compartment
p2 – rate constant for the loss of insulin action (term -p2X(t) is negative b/c insulin is leaving the remote compartment)
p3 – rate constant scaling how insulin above basal level (I(t) – Ib) adds to insulin action
Higher I(t) – Ib = more insulin in the plasma

Ra(a, t) – piecewise linear function (amplitude a, and break point t)

JC: Integrative omics for health and disease


1. The six different omics types
Central dogma in biology: DNA → RNA → proteins
DNA – contains all information needed to make our proteins
RNA – messenger which carries information to our ribosomes
Genomics – genome of an organism (essentially constant over organism’s lifetime)
Causality is clear
Can detect insertions and deletions, assaying complete/partial DNA sequence
Analysis can be used for precision medicine (prevention)
Epigenomics – changes in gene expression w/out changes in DNA
Tissue-specific in response to environmental factors/development of a disease
Ex. Methylation, histone modification, or non-coding RNA
Transcriptomics – complete set of RNA transcripts
Looks at genes w/ aberrant expression, splicing, or allele-specific expression
Mostly mRNA, but can take all types of RNA into account
Proteomics – complete set of expressed proteins
Complex system (including posttranslational modification)
Large scale study of proteins (structure and functions)
Metabolomics – metabolome = complete set of small molecule metabolites
Study of chemical processes which involve metabolites
Dynamic and varies w/in and among organisms
Can be used to study lipidome – lipids and their functions
Glycomics – protein and lipid glycosylation (in immune system)
Important for blood transfusions
Microbiome – genetic material of all microbes in the body
Consists of helpful and harmful microbes
Strong influence on immune function

2. What are Mendelian diseases and which omics data are most commonly used for diagnosis?
Caused by a single locus or gene which follow dominant/recessive patterns of inheritance
- Ex. Huntington’s disease, muscular dystrophy, autism
- Uses exome and genome sequencing to find causative mutations
- Uses genomics sequencing, proteomics, RNA sequencing

3. How are common diseases different from Mendelian diseases? What kind of approaches are used to
better understand them (link to the different practical)?
Common diseases are caused by a combination of genetic and environmental factors
Analysis requires multiple omics data sets
Network analyses – used to identify causal mechanisms of the disease
Can be used w/ genome-scale data or gene expression data
Used to prioritize and identify disease genes and pathways
Enrichment analyses
Finding overrepresented pathways in experimental data set
Understand the global mechanisms of information flow from DNA to physiology
GWAS (genome wide association studies) – finding loci statistically associated with disease
Rarely identify the causal variant or mechanism directly

4. Discuss the approach described in Figure 2 (From genome-wide association studies to mechanism) for
obesity

Establish a comprehensive approach to identify a causal mechanism of obesity.

Step 1: Find relevant tissue or cell type and find the downstream target genes (regulatory genomics approach)
Established the variant as an expression quantitative trait locus (eQTL) for IRX3 and IRX5
(developmental genes)

Step 2: Target genes


Risk allele showed increased expression of genes near IRX3 and IRX5 (but none other)
Showed that expression of IRX’s was anti correlated (with genes involved in mitochondrial function)
and correlated (with adipocyte size)

Step 3: Establish causal nucleotide variants. 


Step 4: CRISPR to find the molecular effect.
If you see an increase in upstream regulation you should see an increase in transcripts
Step 5: They modified the gene and looked at the effect in the phenotype (regulation of energy balance)
Step 6: Establish causality of variant on organism level (using mouse model)
Ex. used AKT interacting protein

- Find some genes (figure b) and figure out which other variants (SNPs) are in linkage disequilibrium with the
significant variants (a way to gather more info about the genotype or haplotype makeup of a certain disorder).
- What is an eQTL? Any variant in the genome that affects the expression of a certain gene. It can either increase
or decrease the expression of the gene. Ex. a SNP that influences the expression of a gene.

Figure 3 - blood is easier to extract but not really representative of brain and lung, so only using the genome doesn't
always represent affected areas
Stratified medicine is more realistic!

5. Summarize the five main challenges mentioned.


1. Analytical challenges: hard to analyze multiple data sets, each w/ its own variance and biases - need
established infrastructure
Current analysis methods are effective for learning about the disease, but not for individual
application in a clinical setting

2. Accuracy and validation: hard to detect structural variants; inherent error rates are a problem in clinical
settings

3. Interpretation: rare and novel molecular events hard to predict, no way to treat variants of uncertain
significance, genome to molecular

4. Finding the relevant tissue: clinical studies require tissue analysis relevant for the application bc
expression varies across tissues; single-cell resolution
Needed for maintaining consistency between samples
5. Actionability - data that informs an intervention, precision medicine

Practical: Glucose modeling

Week 5: Computational neuroscience I

Lecture: Computational neuroscience I

1. Brain: organ to model


a. Why
i. Emulate (ex. speech recognition)
ii. Heal errors in brain processing (ex. deep brain stimulation)
iii. To understand (ex. learning)
b. Challenges
i. Tissue contains around 1 billion connections b/w neurons
1. Non-uniformly distributed
ii. Complex research subject
iii. Traditions in existing field of comp. neuroscience
iv. Lack of (human) data – has recently caught up
1. Compared to a blood sample or muscle biopsy
c. Current approaches
i. Scaled
1. Macro
2. Meso
3. Micro
ii. Allen Brain Atlas
1. Combine genomics w/ neuroanatomy
2. Gene expression maps for brain
3. Measure RNA within 24 hours of death
4. Start by using MRI (satellite map)
5. Frozen samples undergo microscope analysis, and fragment brain into rough
parts to make anatomical analysis
6. Tagged material onto microarray – piece of human genome (over 1000
taken for a single brain)
2. Single neuron models
a. Morphology
i. Generate electrical signals in response to input
ii. Transmit signal to other cells
iii. Dendrites allow neuron to receive input
iv. Axon carries output to other cells
1. Can traverse large parts of brain
v. Many different types
1. Different formations or dendrite densities
b. Action potentials
i. Used squid giant axons – large axons
1. Uncovered mechanisms underlying action potentials by measuring current
2. Voltage-clamp experiments
ii. Start of computational neuroscience: initiation and propagation
iii. Can be used to model ion channels and different synapse types
iv. Action potential – rapidly rises and falls (consistent trajectory)
1. Resting potential: inside compared to extracellular is around -70mV
a. High concentration K+ inside neuron
b. High concentration Na+ outside neuron
c. Open K+ channels – some K+ outside cell
d. Inside is negative relative to outside b/c more positive ions leave than
enter (b/c no Na+ entering). Energy is used to pump ions against the
concentration gradient to generate the negative charge
2. Ion channels – selectively control ion passing
3. Depolarization – initiated by signal arriving at dendrite
a. Voltage increases (closer to 0)
b. Na+ ions move into the cell
i. Neurotransmitters (from initiating signal) bind to ligand-
gated ion-channels
1. EPSP – more likely to cause action potential (Na+
channels: when open, positive ions enter →
increases voltage)
2. IPSP – less likely to cause action potential
(K+/Cl- channels: positive ions leave or negative ions enter →
decreasing voltage)
3. All connections between axon and dendrites
cause IPSPs and EPSPs
ii. IPSPs and EPSPs are mediated by different receptors and
neurotransmitters
iii. PSPs – graded potentials
1. Decline in strength as they move towards hillock
iv. Passive conductance of PSPs from dendrites through soma
to axon hillock
v. Hillock (right after soma, where action potential occurs):
(EPSP - IPSP) > threshold = action potential
vi. Temporal and spatial summation
4. Positive polarization
a. Voltage-gated ion-channels open up
i. Rapid influx of Na+ ions
ii. Voltage becomes positive (more positive charge inside than outside the
membrane)
b. Action potential
5. Repolarization
a. Na+ channels close automatically
b. Gated K+ open
c. Decreases voltage
d. Can even hyperpolarize cell (afterpotential), makes sure that cell
really cannot conduct action potential again
6. Return to resting potential
a. All gated channels close
b. Na+/K+ pump restores the gradients (3 Na+ out for 2 K+ in, regenerating
negative charge)
7. Refractory period – no action potential can be generated
a. Absolute – Na+ channels already open or inactivated
b. Relative – Na+ channels could open, but harder b/c of hyperpolarization
v. Conducting action potentials
1. Can propagate over large distances
2. In brain, axons are myelinated
a. Support cells (oligodendrocytes in CNS; Schwann cells in peripheral nerves)
b. Facilitates saltatory conduction (no action potential under the myelin; signal jumps between gaps)
c. Speeds up propagation speed
vi. Termination
1. Occurs at synapse
2. Ca2+ influx – leads to neurotransmitter release
3. Neurotransmitters open ion-channels on postsynapse
a. Causes EPSP or IPSP
c. Hodgkin-Huxley model – conductance based
i. Four differential equations – describe ionic bases of action potential
1. Ionic current consists of I_Na, I_K, and I_L (leak; conductance of I_L is voltage-independent)
a. Looks at voltage needed for a reversal
2.

a. K and Na conductances can change, so voltage dependent


b. Ion concentration difference acts as battery
i. Nernst potential – creating voltage gradient competes w/
concentration gradient (voltage needed to attain certain
balance is the equilibrium/Nernst potential)
1. Value dependent on type of ions
2. For Na+ the Nernst potential is higher than for K+;
the resting potential lies far from the reversal potential of Na+,
which explains the massive influx of
sodium once Na+ channels open
3. Related mechanism – after a completed action potential, the Na/K pump
restores the original gradient (3 Na+ out for 2 K+ in, regenerating negative
charge; spring loading the cell again). Requires
energy (explains why glucose usage in brain is so high)
4. Gating variables – probability a channel is open
a. m, h, and n
b. Evolve according to individual diff. equations (depend on voltage)
c. Empirical functions fit to squid axon data
5. Parameters (E_K, E_Na, E_L)
a. Voltage and non-voltage gated channels
ii. Properties
1. Calc currents, conductances, and voltages through nerve cells
2. Can test HH model implementation
a. Weak stimulation – no action potential generated
b. Strong – generated action potential
iii. Adaptation of HH by Wilson
1. Simplified model with added calcium currents
2. Captures more complex behavior
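A minimal sketch of the Hodgkin-Huxley model, to make the weak vs. strong stimulation test above concrete. The parameter values, rate functions, pulse timing, and the names vtrap, rates, and simulate_hh are assumptions taken from the standard textbook squid-axon formulation, not from the lecture; integration is plain forward Euler.

```python
import math

# Standard textbook values (assumed, not from the notes)
C_M = 1.0                               # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3       # maximal conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387   # reversal potentials, mV


def vtrap(x, y):
    """x / (1 - exp(-x / y)), with the removable singularity at x = 0 handled."""
    if abs(x / y) < 1e-6:
        return y * (1.0 + x / (2.0 * y))
    return x / (1.0 - math.exp(-x / y))


def rates(v):
    """Voltage-dependent opening (alpha) and closing (beta) rates for m, h, n."""
    a_m = 0.1 * vtrap(v + 40.0, 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * vtrap(v + 55.0, 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return (a_m, b_m), (a_h, b_h), (a_n, b_n)


def simulate_hh(i_amp, t_on=5.0, t_off=6.0, t_end=30.0, dt=0.01):
    """Return the membrane voltage trace (mV) for a brief current pulse (uA/cm^2)."""
    v = -65.0
    # Start each gating variable at its steady state for the resting voltage.
    m, h, n = (a / (a + b) for a, b in rates(v))
    trace = []
    for step in range(int(t_end / dt)):
        t = step * dt
        i_ext = i_amp if t_on <= t < t_off else 0.0
        i_na = G_NA * m ** 3 * h * (v - E_NA)   # sodium current
        i_k = G_K * n ** 4 * (v - E_K)          # potassium current
        i_l = G_L * (v - E_L)                   # leak current
        (a_m, b_m), (a_h, b_h), (a_n, b_n) = rates(v)
        v += dt * (i_ext - i_na - i_k - i_l) / C_M
        m += dt * (a_m * (1.0 - m) - b_m * m)
        h += dt * (a_h * (1.0 - h) - b_h * h)
        n += dt * (a_n * (1.0 - n) - b_n * n)
        trace.append(v)
    return trace


weak_peak = max(simulate_hh(2.0))     # weak pulse: stays near rest
strong_peak = max(simulate_hh(20.0))  # strong pulse: full action potential
```

Under these assumptions the weak pulse only nudges the voltage a few mV above rest, while the strong pulse triggers a spike that overshoots 0 mV.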
d. Integrate-and-fire model – Threshold based
i. Apply current, membrane voltage increases until spike
1. Do not have ion channels factored in
ii. Either fire or don’t (doesn’t look at voltage gating)
iii. Describes the membrane potential in terms of synaptic inputs and injected current
that neuron receives
1. Action potential generated when membrane potential reaches threshold
2. Synaptic input varies periodically
3. Neurons either firing or not (does not take voltages into account)
4. Poorer biological plausibility, but higher computational efficiency
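A minimal sketch of a leaky integrate-and-fire neuron, to show the threshold idea described above. The time constant, resistance, threshold, reset value, and the name count_spikes are hypothetical illustration values; only the resting potential of about -70 mV comes from the notes.

```python
V_REST = -70.0    # resting potential, mV
V_THRESH = -55.0  # spike threshold, mV (assumed)
V_RESET = -75.0   # reset after a spike, mV (assumed)
TAU_M = 10.0      # membrane time constant, ms (assumed)
R_M = 10.0        # membrane resistance, MOhm (assumed)


def count_spikes(i_inj, t_end=200.0, dt=0.1):
    """Count spikes for a constant injected current (nA) using forward Euler."""
    v = V_REST
    spikes = 0
    for _ in range(int(t_end / dt)):
        # The voltage leaks back to rest and is driven up by the injected current.
        v += dt * (-(v - V_REST) + R_M * i_inj) / TAU_M
        if v >= V_THRESH:
            spikes += 1      # no ion channels: the spike is just an event
            v = V_RESET
    return spikes


no_spikes = count_spikes(1.0)    # settles at -60 mV, below threshold
some_spikes = count_spikes(2.0)  # would settle at -50 mV, so it fires repeatedly
more_spikes = count_spikes(3.0)
```

The model only records whether and when the threshold is crossed, which is why it is cheap to simulate but says nothing about the shape of the action potential.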
e. Compartmental modeling
i. Takes shape of neuron into account
ii. Detailed simulation of 1 or 2 neurons
iii. Divides neuron into compartments (ex. density of ion channels varies per
compartment)
iv. Fine study of morphology, pharmacology, and electrical effects
v. Used in Alzheimer’s disease
1. Protein (amyloid beta, Aβ) blocks K+ channel
2. Modelled hippocampal CA1 pyramidal neuron
3. Explores what part of dendritic tree is affected
4. Higher max. excitability makes it easier to reach threshold → harder to
prevent misfiring
3. Population level models
a. Reporting from the brain: electrophysiology, fMRI (blood supply affects signal)
b. Neural encoding – signal over time from data
i. Tuning curves – respond best to certain conditions
ii. Neural spikes are noisy
iii. Brain cannot average over many trials, so we use population coding
1. Temporal average of single neurons approximates relevant average
population activity of neurons
2. Subpopulations of same type should have similar response properties
iv. Auditory system – sound frequency measured by fMRI
1. Brain responds to sound frequency
2. Group neurons by frequency preference
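A minimal sketch of why population coding helps with noisy spikes. It assumes a Gaussian tuning curve on a log-frequency axis and Poisson spike counts; the tuning width, maximum rate, population size, and function names are hypothetical choices for illustration.

```python
import math
import random
import statistics


def tuning_curve(freq_hz, preferred_hz, width_octaves=0.5, max_rate=40.0):
    """Mean firing rate (spikes/s): Gaussian in log-frequency around the preferred frequency."""
    distance = math.log2(freq_hz / preferred_hz)
    return max_rate * math.exp(-0.5 * (distance / width_octaves) ** 2)


def poisson_count(rate, rng):
    """Noisy spike count in a 1 s window (Knuth's Poisson sampler)."""
    limit, count, product = math.exp(-rate), 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rate = tuning_curve(1000.0, preferred_hz=1000.0)

# One neuron, one trial at a time: a very noisy estimate of the rate.
single_neuron = [poisson_count(rate, rng) for _ in range(200)]

# 50 neurons with the same frequency preference, averaged within each trial.
population = [
    statistics.mean(poisson_count(rate, rng) for _ in range(50))
    for _ in range(200)
]

single_var = statistics.pvariance(single_neuron)
population_var = statistics.pvariance(population)
```

Averaging across a subpopulation with similar response properties gives a far less variable readout in a single trial than any one neuron does.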
c. Wilson-Cowan Cortical Model (WCCM) of auditory cortex
i. For each compartment of the model, we get feedback loops
1. Model which can be used at multiple scales (ex. E and I can represent
neuron or full-on population)
2. Each bubble represents subpopulation
ii. Firing rate over time
iii. Models excitatory and inhibitory firing rate as differential equations
iv. Populations are coupled so that firing rate is slightly lower
v. Certain populations will favor higher and lower frequencies

vi. Use: explore research questions (ex. auditory attention or mechanism of tinnitus)
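A minimal sketch of one excitatory/inhibitory pair in a Wilson-Cowan style model. It uses a common simplified form (sigmoid response, no refractory term); the weights, inputs, time constant, and the name simulate_wc are hypothetical and are not the values used in the auditory cortex model from the lecture.

```python
import math


def sigmoid(x):
    """Population response function mapping net input to a rate between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_wc(w_ee=2.0, w_ei=2.0, w_ie=2.0, w_ii=0.0,
                p_ext=0.5, q_ext=0.0, tau=10.0, t_end=500.0, dt=0.1):
    """Integrate the coupled E and I firing rates with forward Euler; return final (E, I)."""
    e, i = 0.1, 0.1
    for _ in range(int(t_end / dt)):
        # tau * dE/dt = -E + S(w_ee*E - w_ei*I + P)
        de = (-e + sigmoid(w_ee * e - w_ei * i + p_ext)) / tau
        # tau * dI/dt = -I + S(w_ie*E - w_ii*I + Q)
        di = (-i + sigmoid(w_ie * e - w_ii * i + q_ext)) / tau
        e += dt * de
        i += dt * di
    return e, i


e_coupled, i_coupled = simulate_wc()       # inhibition feeds back onto E
e_uncoupled, _ = simulate_wc(w_ei=0.0)     # same network without that feedback
```

With these illustrative weights the excitatory rate settles at a lower value when the inhibitory population feeds back onto it, which is the effect of the coupling noted above.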
4. Multiscale modeling
a. Brain operates at many different scales
b. Multiscale if
i. Modeled object spans multiple time/space scales
ii. Parts of model run with different scales
iii. Model parts influence each other
c. Extremely hard to address

Case: Network control of your actions

1. How does the brain predict actions? (Internal forward model)


a. Internal forward model – CNS contains knowledge about properties of body and external
world
i. Models the behaviour of the body and captures the forward/causal relationships
between actions (promoted by stimuli) and their consequences
ii. Computational studies have proposed that CNS internally simulates the dynamic
behavior of the motor system in planning, control, and learning
1. Internally able to estimate dynamics of the system
b. Forward model – representation of motor apparatus
i. Mimics/represents normal motor behavior in response to outgoing commands
ii. Take input of motor command to predict an output
iii. Input may be an efference copy (duplicate of control signal)
iv. Actual output of system may be different than predicted output
1. Internal imperfections or unpredictable external sources
c. Cognitive model – knowledge of the physical properties of environment
d. Inverse models – causal flow of motor system
i. Knowledge of behavior of motor systems
ii. What causal events (input) resulted in what state/state transitions
iii. Generates the command you need to bring about a desired state
e. Brain can for example correct blurriness due to small twitches in eyes bc it can predict the
effect of a twitch
f. In some cases of schizophrenia: signals are not perceived as their own (associative disorder),
but rather as external
i. Ex. tickling themselves works because their prediction model isn’t accurate
2. What are the characteristics of predictive systems?
a. Perception of actions
i. Multiple forward models for multiple forward predictions
ii. Correspondence between predictions and observed behavior → helps to infer which
controllers are used to generate which observed actions
b. Prediction vs. efference copy (created w/ our own actions)
i. Self-learning
ii. In internal forward model – motor plan containing copy of output
(sensation/movement) made as a result of inputs
1. Internal copy
2. Smashing a bottle, motor neurons send copy to tell other parts of the body
what to do
iii. Part of prediction in internal forward model
iv. Prediction: how much force you need to hold a bottle; you can predict exactly
how much
3. Mosaic (modular) – brain runs multiple forward models; each forward model generates a prediction to
match sensory feedback. The “correct” prediction will generate the correct response
a. Ex. Picking up a milk bottle w/ too much force: the wrong prediction was picked
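A minimal sketch of the modular idea: several forward models each predict the sensory feedback for the same motor command, and the one with the smallest prediction error is selected. The masses, the simple load-force formula, and the function names are hypothetical; the full MOSAIC proposal weights modules softly and pairs them with controllers, which is not shown here.

```python
G = 9.81  # gravitational acceleration, m/s^2


def forward_model(assumed_mass_kg, lift_acceleration):
    """Predicted load force (N) when lifting an object of the assumed mass."""
    return assumed_mass_kg * (G + lift_acceleration)


def select_model(models, lift_acceleration, sensed_load):
    """Pick the forward model whose prediction best matches the sensory feedback."""
    errors = {
        name: abs(forward_model(mass, lift_acceleration) - sensed_load)
        for name, mass in models.items()
    }
    best = min(errors, key=errors.get)
    return best, errors


# Two hypothetical contexts for the same bottle.
models = {"full bottle": 1.0, "empty bottle": 0.1}

# Feedback from a lift where the bottle was in fact empty.
sensed = forward_model(0.1, lift_acceleration=1.0)
best, errors = select_model(models, 1.0, sensed)
```

If the "full bottle" model had been used to plan the lift, its large prediction error is the signal that the wrong prediction was picked.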
4. Which sensory modalities are needed for the given examples?
a. Ketchup
i. Motor command is affected by efference copy → predictor → predicted load
ii. Self-generation of force: grip force anticipates the upcoming load force and parallels it
with no delay
iii. External-generation of force: cannot be accurately predicted → grip force lags behind
load force, baseline grip force increases to compensate
1. When someone else does it, there is a delay in the efference copy
b. Tickle
i. Predictive mechanisms underlie observation
ii. Felt less intensely when self-applied (vs. external force)
iii. More time delay = more ticklish because reduces ability for motor commands to
follow predictive mechanisms in observation
c. Force escalation
i. Self-generated forces are perceived as weaker than externally generated forces of the
same magnitude
ii. Arises from predictive process → sensory consequences of movement are predicted
and this influences perception of the force itself
iii. B. Upper: Body creates efference copy of applied force, so you’re going to have
force escalation (correction of own force)
iv. B. Lower: You don’t know exactly how the lever translates to a push
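A minimal sketch of how sensory attenuation could lead to force escalation when two people try to return the force they just felt. The attenuation factor and the name tit_for_tat are hypothetical; the sketch only assumes that a self-generated force feels weaker than an external force of the same magnitude.

```python
def tit_for_tat(initial_force, turns, attenuation=0.7):
    """Forces applied on successive turns when each person matches the force they felt.

    attenuation: fraction of a self-generated force that is actually perceived
    (1.0 means no attenuation). The value 0.7 is an illustrative assumption.
    """
    forces = [initial_force]
    for _ in range(turns):
        felt = forces[-1]              # external force: perceived at full strength
        produced = felt / attenuation  # own force feels weaker, so more is applied
        forces.append(produced)
    return forces


escalating = tit_for_tat(1.0, turns=4)
stable = tit_for_tat(1.0, turns=3, attenuation=1.0)
```

Without attenuation the exchanged force stays constant; with attenuation every turn overshoots the previous one even though both people believe they are matching it.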

JC: Computational neuroscience vs. systems biology

1. According to De Schutter
a. What is computational neuroscience?
i. Often refers to theoretical approaches in neuroscience
1. Looks at how the brain computes information
2. Systems neuroscience – neural circuit function
ii. Use of computational approaches to investigate properties of nervous system at
various scales (ex. single neuron has detailed diff equations, vs larger scale, which
requires more of a black box model)
1. Implies simulation of numerical models, but analytical models also covered
2. Use computation models yourself
b. What is systems biology?
i. How interactions in biological system give rise to functions/behaviors
ii. Using theory and computational modeling in close interaction w/ experimental
verification to understand the dynamics of biological systems
2. Goal is to model neural activity
a. Anatomy of a neuron
i. Transmit chemical and electrical signals in the brain
ii. Dendrite – receive messages from other neurons (branch-like structure)
iii. Cell body – contains organelles of the cell
iv. Axon – structure which carries impulse from cell body to axon terminals
v. Synapse – chemical junction b/w axon terminals of one neuron and dendrites of the
next
vi. Myelin sheath – fatty material around parts of the axon, increases speed of
conduction
vii. Glial cells
viii. Different types of neurons
1. Number of neurons/types of dendrites
b. How do neurons communicate?
i. Electrical: direct contact and signal transduction b/w cells
1. Synaptic transmission faster than chemical b/c of gap junction
ii. Chemical: gaps b/w cells and signal transduction through neurotransmitters
iii. Action potentials propagate signals
iv. Use IPSPs and EPSPs
3. Approaches
a. Integrate-and-fire neuron – describes the membrane potential in terms of synaptic inputs and
injected current that neuron receives
i. Widely used
ii. Action potential generated when membrane potential reaches threshold
iii. Synaptic input varies periodically
iv. Neurons either firing or not (does not take voltages into account)
v. Poorer biological plausibility, but higher computational efficiency
b. Hodgkin and Huxley model – looks at voltage-gated channels to model neural activity
i. Started by looking at giant neurons of the squid
ii. But for other types of model we would need to calibrate the parameters
c. Compartmental modeling
4. Data validity differences
a. Comp. neuroscience – incomplete data (guesswork needed)
i. Simulates randomly connected networks to investigate dynamics
ii. Ex. Allen institute – harder to find proper brain samples
iii. Brain activity can be found by glucose/water prevalence using MRI
iv. Information framework – Standards b/w teams were very different, so research
sharing was much harder
b. Systems biology – operates in data rich environment (isolate important from non-important)
i. Application of graph theory to analyze genetic/molecular networks to investigate
dynamics
ii. Ex. Muscle biopsy is easier to get data from
iii. Mark-up language – more sophisticated way for data structuring, allowed for better
collaboration b/w teams researching diff groups
5. What can comp. neuroscience offer to systems biology
a. Older field – extensive experience
i. Accumulated simulator software development (multiscale modeling)
1. Can apply simulator itself
2. And use technical software expertise
b. Theoretical models
i. Require extensive manipulation of inputs
6. Efforts
a. Blue brain project
i. Digital reconstructions and simulations of the mouse brain
ii. Exploits interdependencies of data to obtain dense data maps of the brain
b. Human brain project
i. Research infrastructure
ii. Six ICT research platforms and also undertakes targeted research and theoretical
studies
iii. Explores brain structure and function in humans, rodents, and other species
iv. Also looks at ethical and societal implications of HBP’s work
c. Human connectome project
i. Under the NIH
ii. Aims to provide compilation of neural data, which can be navigated and analyzed
iii. As much genetic and imaging data as possible (in twins)
d. Allen brain atlas
i. Unique approach of combining genomics and neuroanatomy
ii. Create gene expression maps for mouse and human brain
iii. In a resting period of time
e. BrainSpan
i. Foundational resource for studying transcriptional mechanisms in human brain
development
ii. Brain atlases
iii. Looks at gene expression over the life time
f. ENIGMA
i. Largest brain mapping project
ii. Network to push imaging genetics forward
iii. Combining imaging data to study brain structure, and look at it in terms of function
and changes due to diseases (sometimes incorporate genetic data as well)

Practical:

Week 6: Computational neuroscience II


Lecture:

Part 1: Imaging the brain at work


1. Neuroimaging methods classified on
a. Capacity to establish correlational/causal relations b/w brain activity and behavior
i. Method decides at what level
1. Monitoring brain activity – correlational
a. High spatial resolution – MRI, fMRI, PET, etc.
b. High temporal resolution – EEG, MEG (low spatial b/c measure
neuron activity)
2. Interfering/modulating brain activity - directly influencing brain activity
(causal)
a. TES/TMS
b. Microstimulation (great spatial and temporal resolution, but requires
opening skull)
b. Spatial and temporal resolution
i. Spatial – what is being measured (neuron-level vs brain level)
ii. Temporal – timeframe (millisecond vs lifetime)
2. Correlation methods - high spatial resolution (MRI and fMRI)
a. MRI – anatomical imaging technique
i. Good spatial resolution
ii. Large magnetic field generated
1. Visualize different body tissue types → transversal slices of the brain
2. Tissue properties affect imaging
iii. Sequential slices can be used for looking at the organ in motion (ex. heart)
iv. Changing parameters of perturbation technique allows you to make different
information visible (ex. vasculature, iron accumulation; DWI – visualize water
diffusion)
b. Network neuroscience
i. Visualization → graph made of nodes and edges, can be rewritten as adjacency matrix
ii. Brain areas (nodes) and axonal tracts (edges), can create weighted (using strength
scale) adjacency matrix to represent brain connection strength
iii. Mapping anatomical connectivity
1. Reduce data space, and cluster brain areas (nodes) in undirected network
2. Use DWI (diffusion data) to visualize axonal tracts
3. Creates anatomical network model (weight = fiber tracts)
iv. fMRI
1. Looks at how the deoxygenated/oxygenated blood ratio affects signal intensities
2. Used for task vs. rest comparison
3. Produces activation map, can be included in network model of brain
v. Mapping functional connectivity
1. Node activity, look at exchange of information between node/pairs
2. Strength of connection – correlation strength by coactivation of node pair
over time
3. Can create maps for phases of task → dynamic model
4. Creating directed network
a. Need functional time network → connection active at what
moment?
b. Insert energy into network (stimulating certain node and seeing
how it is spread throughout network)
c. Activation spread → directed functional network model
d. Can be used to create directed/effective adjacency matrix, which is
asymmetric b/c of directed graph
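A minimal sketch of turning node activity into a functional connectivity matrix: connection strength is the correlation of two nodes' time series, and thresholding gives an undirected adjacency matrix. The three time series, the cutoff, the example directed matrix, and the function names are all made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long time series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    covariance = sum(a * b for a, b in zip(dx, dy))
    return covariance / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def functional_connectivity(series):
    """Weighted adjacency matrix: correlation for each node pair, 0 on the diagonal."""
    n = len(series)
    return [[pearson(series[i], series[j]) if i != j else 0.0 for j in range(n)]
            for i in range(n)]


def threshold(weights, cutoff):
    """Binary adjacency matrix keeping only connections stronger than the cutoff."""
    return [[1 if w > cutoff else 0 for w in row] for row in weights]


def is_symmetric(matrix):
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


# Hypothetical activity of three brain areas; A and B co-activate, C does not.
area_a = [1, 2, 3, 4, 5, 4, 3, 2]
area_b = [2, 3, 4, 5, 6, 5, 4, 3]
area_c = [5, 1, 4, 2, 5, 1, 4, 2]

weights = functional_connectivity([area_a, area_b, area_c])
adjacency = threshold(weights, cutoff=0.5)

# A directed (effective) network, e.g. activation spreading A -> B -> C after
# stimulating A, is stored with rows as sources and columns as targets.
effective = [[0, 1, 0],
             [0, 0, 1],
             [0, 0, 0]]
```

The correlation-based matrix is symmetric because co-activation has no direction, whereas the directed matrix is not, since an edge from A to B does not imply one from B to A.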
c. Characteristics of brain’s network structure
i. Communities and hubs → configured to give optimal balance b/w:
1. Functional segregation – clusters of functional modules (ex. visual and
auditory)
a. Communities in brain perform functional modules
b. Don’t want unnecessary functional noise b/w modules
2. Functional integration – efficient communication b/w modules
a. Allow interaction b/w modules
b. Efficient interaction b/c short path length
ii. MRI - “Rich-club organization”
1. Brain hubs – strongly interconnected between functional modules
2. Each brain region connected by thicker nodes and edges (rich club)
3. Useful
a. Promotes function organization and efficiency (fast general
response from all modules)
b. Short cut b/w all starting/ending nodes
c. If connections deteriorate, other cognitive tasks affected
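A minimal sketch of how interconnected hubs shorten paths between modules. The toy network (three fully connected modules of four nodes, chained through peripheral nodes, with one hub per module) and the function names are hypothetical; real rich-club analyses work on measured connectomes and use a normalized rich-club coefficient, which is not computed here.

```python
from collections import deque
from itertools import combinations


def build_network(rich_club):
    """Three modules (segregation); optionally link their hubs 0, 5 and 10 (integration)."""
    edges = set()
    for module in (range(0, 4), range(4, 8), range(8, 12)):
        edges.update(combinations(module, 2))      # dense connections inside a module
    edges.update([(3, 4), (7, 8)])                 # sparse links between neighbouring modules
    if rich_club:
        edges.update(combinations((0, 5, 10), 2))  # hubs strongly interconnected
    graph = {node: set() for node in range(12)}
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def path_lengths_from(graph, start):
    """Breadth-first search: number of edges on the shortest path to every node."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def mean_path_length(graph):
    """Average shortest path length over all ordered node pairs."""
    total = pairs = 0
    for start in graph:
        for node, d in path_lengths_from(graph, start).items():
            if node != start:
                total += d
                pairs += 1
    return total / pairs


without_club = build_network(rich_club=False)
with_club = build_network(rich_club=True)
top_degree = max(len(neighbours) for neighbours in with_club.values())
hubs = {node for node, neighbours in with_club.items() if len(neighbours) == top_degree}
```

Removing the hub-to-hub edges leaves every module intact but lengthens the routes between modules, which mirrors the point that deteriorating rich-club connections affect many tasks at once.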
3. Correlation methods - high temporal resolution (EEG and MEG)
a. High temporal resolution
i. Can directly pick-up neural activity
ii. Electrical field – generated by changes in membrane potential
iii. In cortex, these can align to give strong signal which can be picked up by EEG
electrodes
b. EEG purpose: sleep research
i. Reflected by different signals in EEG
ii. After first cycle (entering second), you can enter REM sleep
1. Characterized similar to awake state
iii. Deep sleep stages are most important for reset
4. Inferential measures: Neurostimulation
a. Transcranial stimulation (temporarily alter activity without having to open skull)
b. tES – transcranial electric stimulation
i. Weak continuous current (from positive to negative electrodes)
ii. Current induces changes in neuronal excitability (more or less likely to fire)
iii. AC/DC and current intensity influence modulation effects
c. TMS – transcranial magnetic stimulation
i. Strong brief magnetic field
1. Easier to control location and timing
ii. Electric current generates action potentials (generating electric current in brain)
1. Overstimulate – many action potentials (generating virtual lesions)
iii. Amount and intensity of pulses induces modulation effects
iv. Mapping information
1. Directed/effective functional connectivity
2. Energy inserted by TMS pulse can be analyzed

d. Microstimulation
i. Electric currents via electrodes implanted in brain
ii. DBS (deep brain stimulation) – treatment for Parkinson’s, depression, OCD,
epilepsy, addiction (experimental)
1. Improving brain functions for motor problems (irregular firing patterns in
subthalamic nucleus of brain)
2. Chaotic and frequent firing
3. Stimulate nucleus w/ electrode, send electrical currents by pacemaker (diff.
rhythm than irregular firing)

Part 2: Computational neuroscience applications


1. Brain-machine interfaces (neuroprosthetics – interface neuroscience and biomed engineering)
a. Device to enhance/replace input or output of neural system
b. Types
i. Sensory (input)
1. Ex. Cochlear implants or retinal implant
ii. Motor (output)
1. Peripheral nervous system and spinal cord
2. Brain (brain-machine interfaces)
iii. Hybrid sensory-motor prosthesis (in and output)
c. Brain-controlled motor prosthesis
i. Motor cortex generating artificial neural network
ii. Output - motor functions
d. Sensory input and motor output (improvements in bionic arm)
i. Stimulating somatosensory cortex
ii. Send to neural network model
iii. Translates information into sensory perception
iv. Fine grained touch perception → used for automatic sensory experience (in force
application for example)
e. Closed loop system desired (brain-machine-brain interface)
i. Requires a lot of optimization
1. Problem is that we need an artificial neural network that learns from your (biological) neural
network to translate it into a motor function
2. Sensory stimulation needs to be translated (requires thorough search)
3. Occurring in real time (closing the loop)
ii. Brain-inspired improvements of AI
2. AI neural networks
a. Good for classification, regression, and clustering of very large data sources
b. Perceptron network – mathematical representation of neuron input and output
i. Input - dendrites with weighted potentials
1. Can be positive/negative (EPSPs and IPSPs)
2. Weight represents distance covered from dendrite to axon hillock
ii. Cell body – axon hillock
iii. Output – action potential
1. Generated when passes activation threshold
iv. a = f(Wx + b), where f is the activation function; in the practical, hardlims() is used to
show that action potential is either generated or not generated
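A minimal sketch of a single perceptron with a hard-limit activation. The function hardlims below is written to mirror the symmetric hard limit of the same name (+1 or -1); the weights, bias, and inputs are hypothetical values chosen so that positive weights play the role of EPSPs and the negative weight that of an IPSP.

```python
def hardlims(n):
    """Symmetric hard limit: +1 if the net input is >= 0, otherwise -1."""
    return 1 if n >= 0 else -1


def perceptron(x, w, b):
    """a = f(Wx + b) for one neuron: weighted sum of inputs plus bias, then hard limit."""
    net_input = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return hardlims(net_input)


weights = [1.0, 1.0, -2.0]  # two excitatory inputs, one inhibitory input
bias = -1.5                 # acts as the firing threshold

fires = perceptron([1, 1, 0], weights, bias)      # both excitatory inputs active
silent = perceptron([1, 0, 0], weights, bias)     # one excitatory input is not enough
inhibited = perceptron([1, 1, 1], weights, bias)  # inhibition cancels the excitation
```

The output is all-or-none, like an action potential: the neuron either reaches the activation threshold or it does not.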
c. Feedforward network – perceptron network w hidden layers of neurons
i. Input – taken together in one vector, weighted values
ii. Have activation function to provide output
1. Don’t use hard limits
iii. Object classification (ex. benign or malignant tumors)
1. Trial/error idea
a. Idea in input layer
b. Hidden layer – activation patterns
c. Output layer (benign or malignant)
d. Use error in network to optimize connections → propagate
information back to change weighting to improve classification
2. Example (salmon vs seabass sorting)
a. Looking at length in histogram (from data set), in general sea bass
larger than salmon
b. Find optimal discrimination value at length (larger than is seabass,
smaller than is salmon)
i. Many misclassified objects, so opt for other features
c. Lightness of scales, find optimal discrimination at x → better
option (but still have some errors)
3. Combination of classifications → length vs lightness to obtain decision
boundary
a. Be careful of overfitting to data set (not generalized for use in new
data sets)
4. Object variance (cars in traffic vs. fish)
a. Harder to define right features
5. Neural networks → importance already weighted
a. Useful because can self-recognize important features
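A minimal sketch of the single-feature step in the salmon vs. sea bass example: try every candidate cut-off on length and keep the one with the fewest misclassified fish. The lengths are invented for illustration and the function names are hypothetical.

```python
def misclassified(threshold, salmon, seabass):
    """Errors for the rule 'longer than threshold is sea bass, otherwise salmon'."""
    salmon_errors = sum(length > threshold for length in salmon)
    seabass_errors = sum(length <= threshold for length in seabass)
    return salmon_errors + seabass_errors


def best_threshold(salmon, seabass):
    """Search the midpoints between neighbouring observed lengths for the fewest errors."""
    values = sorted(set(salmon) | set(seabass))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return min(candidates, key=lambda t: misclassified(t, salmon, seabass))


# Made-up lengths in cm; sea bass tend to be longer but the groups overlap.
salmon_lengths = [30, 35, 38, 42, 45, 50]
seabass_lengths = [40, 48, 52, 55, 60, 65]

cutoff = best_threshold(salmon_lengths, seabass_lengths)
errors = misclassified(cutoff, salmon_lengths, seabass_lengths)
```

Because the two length distributions overlap, even the best cut-off leaves some fish misclassified, which is the motivation for adding a second feature such as lightness or letting a network weight the features itself.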
iv. Deep neural networks
1. Deep learning – feature extraction and classification by network (instead of
a person)
a. Becomes increasingly complex
b. Stacked together as a network
2. Brain recognition pathway – works in similar way, with output neurons
recognizing specific categories
3. Application in generative networks (including deepfakes)
v. Deep learning applications
1. Analysis of images in scene analysis → annotate objects (ex. in traffic for
self-driving cars)
2. Analyzing heterogeneous ‘Big Data’
a. Web traffic analysis
b. Consumer preferences
c. Biomedical diagnosis
i. Clinical diagnosis – ex. classifying brain pathology
ii. Systems biology research

Practical:
