
Introduction to Chemoinformatics

Alexandre Varnek
CHEMOINFORMATICS

Definition (wikipedia)
Cheminformatics (also known as chemoinformatics and chemical
informatics) is the use of computer and informational techniques,
applied to a range of problems in the field of chemistry.

These in silico techniques are used in pharmaceutical companies
in the process of drug discovery.

In the U.S., recent NIH emphasis has been placed on developing
public domain Cheminformatics research by creating six
Exploratory Centers for Cheminformatics Research (ECCRs) as
part of the NIH Molecular Libraries Initiative.
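Virtual Screening

[Figure: virtual screening workflow. Computational stage: large libraries of molecules are reduced by filtering, QSAR and docking against the target protein to a small library of selected hits. Experimental stage: high-throughput screening leads to a hit.]
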
Major applications of Chemoinformatics

- Storage and search of chemical information
- Structure-property modeling
- In silico design

Chemoinformatics Why?
Amount of information:
- many millions of compounds and reactions
- many millions of publications

Storage, organization and search of experimental data

Chemical Databases
Problem: Flood of Information

> 47 million compounds
5-7 million new compounds / year
800,000 publications / year

[Figure: number of structures (0 to 30,000,000) versus year (1965 to 2000)]

=> can anyone read 4,000 publications / day?


Problem: Not Enough Information

> 47,000,000 chemical compounds

~500,000 3D structures in the
Cambridge Crystallographic File

=> we have 3D structures for only about 1 % of all compounds


Chemoinformatics Why?
complex relationships
structure - biological activity
chemical reactivity

Prediction of physical, chemical and biological properties

In silico design of new compounds


"The most fundamental and lasting
objective of synthesis is not production
of new compounds but production of
properties"

George S. Hammond
Norris Award Lecture, 1968
Chemoinformatics How?

- Storage, organization and search of experimental data
- Prediction of physical, chemical and biological properties

Encoding molecular structures by descriptors
Example 1: Hansch Analysis
Biological Activity = f(Descriptors) + constant

$$\log(1/C) = a(\log P)^2 + b\log P + \rho\sigma + \delta E_s + \text{constant}$$

Hansch's descriptors can be broadly classified into
three general types:

- Electronic (σ)
- Steric (Es)
- Hydrophobic (logP)
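
A Hansch-type model of this kind is linear in its coefficients, so it can be fitted by ordinary least squares. The sketch below is only an illustration under stated assumptions: it uses a quadratic logP term plus linear electronic and steric terms, the coefficient names (`a`, `b`, `rho`, `delta`, `const`) and the helper functions are hypothetical, and the data are synthetic, noise-free values generated from known coefficients rather than measurements.

```python
import random

# Coefficient names are illustrative assumptions, not taken from the lecture.
NAMES = ["a", "b", "rho", "delta", "const"]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_hansch(log_p, sigma, e_s, activity):
    """Least-squares fit of
    activity = a*logP^2 + b*logP + rho*sigma + delta*Es + const
    via the normal equations (X^T X) w = X^T y."""
    rows = [[lp * lp, lp, s, e, 1.0] for lp, s, e in zip(log_p, sigma, e_s)]
    k = len(NAMES)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, activity)) for i in range(k)]
    return dict(zip(NAMES, solve_linear(xtx, xty)))


# Synthetic, noise-free data built from known coefficients (NOT real measurements).
TRUE = {"a": -0.2, "b": 1.1, "rho": 0.7, "delta": 0.3, "const": 2.0}
rng = random.Random(0)
log_p = [rng.uniform(-1.0, 5.0) for _ in range(20)]
sigma = [rng.uniform(-0.5, 0.8) for _ in range(20)]
e_s = [rng.uniform(-2.0, 0.0) for _ in range(20)]
activity = [
    TRUE["a"] * lp * lp + TRUE["b"] * lp + TRUE["rho"] * s + TRUE["delta"] * e + TRUE["const"]
    for lp, s, e in zip(log_p, sigma, e_s)
]

fitted = fit_hansch(log_p, sigma, e_s, activity)
for name in NAMES:
    print(f"{name}: fitted {fitted[name]:+.3f}, true {TRUE[name]:+.3f}")
```

With real data the activities are noisy and the descriptors are often correlated, so a fit like this would normally be accompanied by statistical validation; that step is left out of the sketch.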
Example 2: Lipinski rule of five

Poor absorption or permeation is more likely when:

- There are more than 5 H-bond donors.
- The molecular weight is over 500.
- The logP is over 5.
- There are more than 10 H-bond acceptors.

The molecule is represented by 4 parameters:

- the number of H-bond donor groups;
- the number of H-bond acceptor groups;
- molecular weight;
- logP
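
The four criteria above can be checked directly once the four parameters are known. The following is a minimal sketch: the `LipinskiProfile` class and `lipinski_violations` function are hypothetical names, and the two example profiles are made-up values chosen for illustration, not data for real compounds.

```python
from dataclasses import dataclass


@dataclass
class LipinskiProfile:
    """The four parameters that represent a molecule in the rule of five."""
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_weight: float
    log_p: float


def lipinski_violations(m: LipinskiProfile) -> list:
    """Return the list of rule-of-five criteria that the molecule violates."""
    violations = []
    if m.h_bond_donors > 5:
        violations.append("more than 5 H-bond donors")
    if m.molecular_weight > 500:
        violations.append("molecular weight over 500")
    if m.log_p > 5:
        violations.append("logP over 5")
    if m.h_bond_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations


# Made-up parameter sets for illustration only.
small_molecule = LipinskiProfile(h_bond_donors=2, h_bond_acceptors=4,
                                 molecular_weight=320.0, log_p=2.5)
large_molecule = LipinskiProfile(h_bond_donors=6, h_bond_acceptors=12,
                                 molecular_weight=650.0, log_p=5.8)

for name, mol in [("small", small_molecule), ("large", large_molecule)]:
    found = lipinski_violations(mol)
    print(name, "->", found if found else "no violations")
```

In practice the four parameters would be computed from the structure by a cheminformatics toolkit; here they are supplied by hand so that the example stays self-contained.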
Chemoinformatics definition

Chemoinformatics is a field
dealing with molecular objects
(graphs, vectors) in
multidimensional chemical space
Theoretical chemistry

Quantum Chemistry

Force Field
Molecular Modelling

Chemoinformatics
Theoretical chemistry

Quantum Chemistry
Force Field Molecular Modelling
Chemoinformatics

- Molecular model
- Basic concepts
- Major applications
- Learning approaches
Molecular Model

Quantum Chemistry: electrons and nuclei
Force Field Molecular Modelling: atoms and bonds
Chemoinformatics: objects in chemical space (graphs, vectors)
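
To make the "graphs, vectors" view concrete, the sketch below stores small molecules as heavy-atom graphs (atoms as nodes, bonds as edges) and maps each graph to a simple descriptor vector. The descriptor chosen here (counts of C, N and O atoms plus the number of bonds) and the helper names are assumptions made for illustration; real descriptor sets are far richer.

```python
import math
from collections import Counter


def adjacency(atoms, bonds):
    """Graph view: map each atom index to the indices of its bonded neighbours."""
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)
    return neighbours


def descriptor_vector(atoms, bonds, elements=("C", "N", "O")):
    """Vector view: element counts followed by the number of bonds."""
    counts = Counter(atoms)
    return [counts[e] for e in elements] + [len(bonds)]


# Heavy-atom graphs (hydrogens omitted): (atom symbols, bonds as index pairs).
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])         # C-C-O
dimethyl_ether = (["C", "O", "C"], [(0, 1), (1, 2)])  # C-O-C
methylamine = (["C", "N"], [(0, 1)])                  # C-N

v_ethanol = descriptor_vector(*ethanol)
v_ether = descriptor_vector(*dimethyl_ether)
v_amine = descriptor_vector(*methylamine)

# A distance between vectors gives a position in a (here 4-dimensional) chemical space.
print("ethanol vs methylamine:", math.dist(v_ethanol, v_amine))
print("ethanol vs dimethyl ether:", math.dist(v_ethanol, v_ether))

# The two isomers share a vector but differ as graphs: oxygen has a different degree.
o_degree_ethanol = len(adjacency(*ethanol)[2])
o_degree_ether = len(adjacency(*dimethyl_ether)[1])
print("oxygen degree:", o_degree_ethanol, "vs", o_degree_ether)
```

The isomer pair shows why the choice of descriptors matters: a vector that is too coarse places distinct graphs at the same point in chemical space.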
Learning approach

Quantum Chemistry: deductive >> inductive
Force Field Molecular Modelling: deductive ≈ inductive
Chemoinformatics: deductive << inductive



Chemoinformatics: From Data to Knowledge

[Figure: hierarchy from data (obtained by measurement or calculation) through information (context) to knowledge (generalization), with arrows labeled "deductive learning" and "inductive learning"]
Which approach is more useful for a theoretical design of
compounds possessing desired properties?

- Quantum Chemistry
- Force Field Modeling
- Chemoinformatics

They are complementary, but Chemoinformatics is the most suitable
one for quantitative predictions of properties
Chemoinformatics definition
"Chemoinformatics is a generic term that encompasses the design,
creation, organization, management, retrieval, analysis, dissemination,
visualization, and use of chemical information"
G. Paris, 1998

"Chemoinformatics is the mixing of those information resources to
transform data into information and information into knowledge for the
intended purpose of making better decisions faster in the area of drug lead
identification and optimization"
F. K. Brown, 1998

"Chemoinformatics is the application of informatics methods to solve
chemical problems"
J. Gasteiger, 2004

"Chemoinformatics is a field dealing with molecular objects (graphs,
vectors) in multidimensional chemical space"
A. Varnek, 2007
Recommended reading

- Chemoinformatics - A Textbook, Johann Gasteiger and Thomas Engel, Wiley-VCH 2003.
- Handbook of Chemoinformatics, Johann Gasteiger, Wiley-VCH 2003.
- An Introduction to Chemoinformatics, Andrew R. Leach, Valerie J. Gillet, Springer 2007.
Short courses in chemoinformatics, 1-5 June 2009

Day 1
Morning: Computer representation of chemical structures (A. Varnek)
Afternoon: Creation and management of chemical databases (G. Marcou, A. Varnek)
Tutorials with the ChemAxon software

Day 2
Morning: Molecular descriptors (A. Varnek)
Afternoon: Force field approach. Conformational sampling (D. Horvath, A. Varnek)
Tutorials with MOE, CODESSA Pro

Day 3
Morning: Pharmacophores (T. Langer, D. Horvath)
Afternoon: Chemical space, similarity/diversity and chemical library design (J. Bajorath)
Tutorials with MOE

Day 4
Morning: Structure-property modeling (G. Marcou, A. Varnek)
Afternoon: Tutorials with ISIDA, CODESSA Pro and WEKA

Day 5
Morning: Docking (E. Kellenberger)
Afternoon: Virtual screening (G. Marcou, D. Horvath, A. Varnek)
Tutorials with MOE
