
MODEL SELECTION CRITERIA

\[
\frac{\partial C}{\partial t} + \frac{\partial (u_i C)}{\partial x_i} = K_i \frac{\partial^2 C}{\partial x_i^2}
\]

Dr. S.M. Shiva Nagendra


Department of Civil Engineering
INDIAN INSTITUTE OF TECHNOLOGY MADRAS
Chennai -600 036
http://www.iitm.ac.in
Model Selection Criteria

[Figure: iterative relationship between the model-building steps.]


TEN STEPS FOR MODEL SELECTION

1. Definition of the purposes for modelling

(i) gaining a better qualitative understanding of the system (by means
including social learning by interest groups);
(ii) knowledge elicitation and review;
(iii) data assessment, discovering coverage, limitations, inconsistencies and
gaps;
(iv) concise summarizing of data: data reduction;
(v) providing a focus for discussion of a problem;
(vi) hypothesis generation and testing;
(vii) prediction, both extrapolation from the past and ‘‘what if’’ exploration;
(viii) control-system design: monitoring, diagnosis, decision-making and
action-taking (in an environmental context, adaptive management);
(ix) short-term forecasting (worth distinguishing from longer-term prediction,
as it usually has a much narrower focus);
(x) interpolation: estimating variables which cannot be measured directly
(state estimation), filling gaps in data;
(xi) providing guidance for management and decision-making.
2. Specification of the modelling context: scope and resources

• This step identifies the specific questions and issues that the model is
to address; the interest groups, including the clients or end-users of the
model; and the outputs required and the forcing variables (drivers).

• The accuracy expected or hoped for; the temporal and spatial scope, scale
and resolution; and the time frame to complete the model, fixed, for
example, by when it must be ready to help a decision.

• The effort and resources available for modelling and operating the
model; and flexibility: for example, can the model be quickly
reconfigured to explore a new scenario proposed by a management
group?
3. Conceptualization of the system, specification of data and other
prior knowledge

• Conceptualization refers to basic premises about the working of the
system being modeled.

• A conceptual model, block diagram or bond graph shows how model drivers are
linked to internal (state) variables and outputs (observed responses).

• It defines the data, prior knowledge and assumptions about processes.

• The procedure is mainly qualitative to start with, asking what is known of
the processes, what records, instrumentation and monitoring are
available, and how far they are compatible with the physical and
temporal scope dictated by the purposes and objectives.
4. Selection of model features and families

Any modelling approach requires selection of model features, which must
conform with the system and data specification.

Major features such as the types of variables covered and the nature of their
treatment (e.g. white/black/grey box, lumped/ distributed, linear/non-linear,
stochastic/deterministic) place the model in a particular family or families.

The selection of model family should also depend on the level (quantity and
quality) of prior information specified during model conceptualization.
5. Choice of how model structure and parameter values are to
be found

• Shortage of records from a system may prevent empirical modelling from
scratch and force reliance on scientific knowledge of the underlying
processes.

• Choice of structure is made easier by such knowledge, and it is reassuring
to feel that the model incorporates what is known scientifically about the
parts of the system.
6. Choice of estimation performance criteria and technique

• The parameter estimation criteria (hardly ever a single criterion) reflect the
desired properties of the estimates.

• The parameter estimation technique should be computationally as simple as
possible, to minimize the chance of coding error;

• Robust in the face of outliers and deviations from assumptions;

• As statistically efficient as feasible; numerically well-conditioned and reliable in
finding the optimum;

• Able to quantify uncertainty in the results (not at all easy, as the underlying
theory is likely to be dubious when the uncertainty is large) and accompanied
by a test for over-parameterization.
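These desiderata pull against each other; as a minimal sketch (with hypothetical data, and Huber weights as one common robustness choice not prescribed by the text), the comparison below contrasts ordinary least squares, which is statistically efficient under Gaussian errors but sensitive to outliers, with iteratively reweighted least squares, which is robust to a gross outlier:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear system y = a*x + b, observed with Gaussian noise
# plus one gross outlier (e.g. a faulty sensor reading).
a_true, b_true = 2.0, 1.0
x = np.linspace(0.0, 10.0, 50)
y = a_true * x + b_true + rng.normal(0.0, 0.2, x.size)
y[25] += 30.0  # the outlier

X = np.column_stack([x, np.ones_like(x)])

# Ordinary least squares: efficient and well-conditioned under Gaussian
# errors, but the single outlier pulls the estimates away from the truth.
theta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

def irls_huber(X, y, k=1.345, iters=25):
    """Iteratively reweighted least squares with Huber weights:
    robust in the face of outliers, at a small cost in efficiency."""
    theta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ theta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12      # robust scale
        w = np.minimum(1.0, k * scale / (np.abs(r) + 1e-12))  # Huber weights
        sw = np.sqrt(w)
        theta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return theta

theta_rob = irls_huber(X, y)
```

The robust fit downweights the outlier almost to zero and recovers slope and intercept close to the true values, while the least-squares intercept is visibly biased by the single bad point.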
7. Identification of model structure and parameters

• This step addresses the iterative process of finding a suitable model structure
and parameter values.

• The step ideally involves hypothesis testing of alternative model structures.

• The complexity of interactions proposed for the model may be increased or
reduced, according to the results of model testing.

• In many cases this process just consists of seeing whether particular
parameters can be dropped or have to be added.

Multivariate analysis
– Data reduction or structural simplification
• The phenomenon being studied is represented as simply as possible without
sacrificing valuable information.

– Sorting and grouping


• Groups of similar objects or variables are created, based upon measured
characteristics.

– Investigation of the dependence among the variables


• The nature of the relationships among the variables is of interest.

– Prediction
• Relationships between variables must be determined for the purpose of
predicting the values of one or more variables on the basis of observations on
the other variables.

– Hypothesis construction and testing


• Specific statistical hypotheses, formulated in terms of the parameters of
multivariate populations, are tested.
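The data-reduction objective can be sketched with principal component analysis; the three correlated variables below are hypothetical, invented only to illustrate how a few components can summarize a multivariate record:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical multivariate record: 200 samples of three correlated
# variables (say CO, NOx and traffic count), driven by one latent factor.
latent = rng.normal(size=(200, 1))
loadings = np.array([[1.0, 0.8, 0.6]])
data = latent @ loadings + rng.normal(0.0, 0.1, (200, 3))

# PCA via the SVD of the centred data matrix: data reduction keeps the
# few components that carry most of the variance.
centred = data - data.mean(axis=0)
U, s, Vt = np.linalg.svd(centred, full_matrices=False)
explained = s**2 / np.sum(s**2)   # fraction of variance per component

scores = centred @ Vt[0]          # one-dimensional summary of the data
```

Because one latent factor drives all three variables, the first component captures nearly all of the variance, so the 200 x 3 record reduces to a single score series without sacrificing much information.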
8. Conditional verification including diagnostic checking

• The identified model must be ‘conditionally’ verified and tested to ensure it is
sufficiently robust to assumptions such as a Gaussian distribution of
measurement errors or linearity of a relation within the model.

• It is also necessary to verify that the interactions and outcomes of the model
are feasible and defensible, given the objectives and the prior knowledge.

• Quantitative verification criteria may include goodness of fit (comparison of
means and variances of observed versus modeled outputs), tests on
residuals or errors (cross-correlation/autocorrelation) and, particularly for
relatively simple empirical models, the speed and certainty with which the
parameter estimates converge as more input-output observations are
processed.
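One such residual test can be sketched as follows (synthetic residual series, invented for illustration): residuals of an adequate model should show near-zero lag-1 autocorrelation, while a mis-specified model leaves serial structure behind:

```python
import numpy as np

rng = np.random.default_rng(2)

def lag1_autocorr(r):
    """Lag-1 autocorrelation of a residual series."""
    r = r - r.mean()
    return float(np.sum(r[:-1] * r[1:]) / np.sum(r * r))

n = 2000
# Residuals of an adequate model should resemble white noise...
white = rng.normal(size=n)

# ...while a mis-specified model can leave AR(1)-like structure behind.
ar = np.empty(n)
ar[0] = 0.0
for t in range(1, n):
    ar[t] = 0.8 * ar[t - 1] + rng.normal()

rho_white = lag1_autocorr(white)  # expected near 0 (within ~2/sqrt(n))
rho_ar = lag1_autocorr(ar)        # expected near the AR coefficient, 0.8
```

A common rule of thumb treats |rho| below about 2/sqrt(n) as consistent with white residuals; the AR(1) series fails that check decisively.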
9. Quantification of uncertainty

• Uncertainty must be considered in developing any model, but is particularly
important, and usually difficult to deal with, in large, integrated models.

• Uncertainty in models stems from incomplete system understanding (which
processes to include, which processes interact); from imprecise, finite and
often sparse data and measurements; and from uncertainty in the baseline
inputs and conditions for model runs, including predicted inputs.

• A diagnostic diagram can be used to synthesize the results of quantitative
parameter sensitivity analysis and a qualitative review of parameter strength.
This is a reflective approach in which process is as important as technical
assessments.
Sources of Uncertainty

1. The model structure:
The mathematical model describing the true system in a real-life
situation may be known only approximately.

2. Model development:
The mathematical model may be represented or solved only approximately.

3. Input model parameters:
Model inputs are approximate and vary under different conditions.

Categories of Uncertainties

• Statistical uncertainties: results differ each time we repeat the same
experiment.

• Systematic uncertainties: in principle known, but not in practice; for
example, errors due to poor calibration.
• Uncertainty quantification

– Quantifying Type 1 (statistical) uncertainty is relatively straightforward;
techniques such as Monte Carlo methods are frequently used.

– Quantifying Type 2 (systematic) uncertainties is not straightforward;
efforts are made to gain better knowledge of the system. Methods such as
fuzzy logic or evidence theory are used to quantify Type 2 uncertainty.
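A Monte Carlo sketch of Type 1 uncertainty propagation, using a toy steady-state box model (the model form, input values and distributions are all hypothetical, chosen only to illustrate the technique):

```python
import numpy as np

rng = np.random.default_rng(3)

def box_model(Q, u, H, W=1000.0):
    """Toy steady-state box model: concentration = emissions / ventilation.
    Q in g/s, wind speed u in m/s, mixing height H in m, box width W in m."""
    return Q / (u * H * W)  # g/m^3

# Sample the uncertain inputs from assumed distributions and propagate
# each sample through the model.
n = 100_000
Q = rng.normal(100.0, 10.0, n)            # uncertain emission rate
u = rng.lognormal(np.log(3.0), 0.2, n)    # uncertain wind speed
H = rng.normal(500.0, 50.0, n)            # uncertain mixing height

C = box_model(Q, u, H)
mean = C.mean()
lo, hi = np.percentile(C, [2.5, 97.5])    # 95% uncertainty interval
```

The output is not a single concentration but a distribution, so the prediction can be reported with an uncertainty interval rather than a point value.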
10. Model evaluation or testing (other models, algorithms,
comparisons with alternatives)
Model confirmation is considered to be demonstrated by evaluating model
performance against data not used to construct the model.

This level of confirmation is rarely possible for large, integrated models,
especially when they have to extrapolate beyond the situation for which they
were calibrated.

The criteria have to be fitness for purpose and transparency of the process
by which the model is produced, rather than consistency with all available
knowledge.

Fitness for purpose should also include ‘softer’ criteria like ability to
accommodate unexpected scenarios and to report predictions under diverse
categories (by interest group, by location, by time, etc), and speed of
responding to requests for modified predictions.

Model accuracy (the traditional modeller’s criterion) is only one of the criteria
important in real applications.
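A minimal hold-out evaluation sketch (synthetic calibration and evaluation data, invented for illustration): the model is fitted on one period and scored only on observations not used to construct it:

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic record: a linear process observed with noise.
x = np.linspace(0.0, 20.0, 200)
y = 1.5 * x + 4.0 + rng.normal(0.0, 1.0, x.size)

# Calibrate on the first 150 observations, evaluate on the held-out 50.
train, test = slice(0, 150), slice(150, 200)
X = np.column_stack([x, np.ones_like(x)])
theta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)

pred = X[test] @ theta
rmse = np.sqrt(np.mean((y[test] - pred) ** 2))          # accuracy on unseen data
skill = 1.0 - np.sum((y[test] - pred) ** 2) / np.sum(
    (y[test] - y[test].mean()) ** 2)                    # skill vs. test-period mean
```

Note that the held-out period here lies beyond the calibration range, so this also probes (mildly) the extrapolation concern raised above; accuracy measures such as RMSE remain only one of the criteria that matter in real applications.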
References
Box, G.E.P. and Jenkins, G.M., 1976. Time Series Analysis: Forecasting and Control.
2nd Edition, Holden-Day, San Francisco.

Jakeman, A.J., Letcher, R.A. and Norton, J.P., 2006. Ten iterative steps in
development and evaluation of environmental models. Environmental Modelling
& Software, 21, 602-614.

Khare, M. and Sharma, P., 2002. Modelling Urban Vehicle Emissions. WIT Press,
Southampton, UK.

Shiva Nagendra, S.M. and Khare, M., 2009. Univariate stochastic model for
predicting carbon monoxide on an urban roadway. International Journal of
Environmental Engineering, 1(3), 223-237.

Shiva Nagendra, S.M. and Khare, M., 2004. Artificial neural network based line
source models for vehicular exhaust emission predictions of an urban roadway.
Transportation Research Part D: Transport and Environment, 9(3), 199-208.

Shiva Nagendra, S.M. and Khare, M., 2002. Line source emission modelling: a
review. Atmospheric Environment, 36(13), 2083-2098.

S.M. SHIVA NAGENDRA, Ph.D.
DEPARTMENT OF CIVIL ENGINEERING
IIT MADRAS, CHENNAI-600 036
Email: snagendra@iitm.ac.in
Phone:+91-44-22574290 ; Fax: +91-44-22574252
