Some Science-Related Strategic Challenges - Model Risk Excerpt


Some science-related strategic challenges

A report for the UK Government Office for Science


Ariel Research Services February 2013
Written by Michael Reilly, +44 (0)7986599791, michael@arielresearchservices.com, www.arielresearchservices.com
Prospero and Ariel by Steering for North, 2012. All rights reserved.

This report has been commissioned by the UK Government Office for Science. The views expressed in this report are not those of the UK Government and do not represent its policies.

1. Model risk: what are the limits to simulation?

Summary

There are perils in using mathematical models to simulate what scientists describe as the messiness of the real. Yet models of reality have been, and will continue to be, important to major policy decisions. Scientists employ models as part of their fundamental aim to understand the world better, and although they are usually aware of the limits of simulation, there is a growing anxiety that models may offer reliability without truth. Decision-makers want political quality from model-based analysis, but they do not receive risk assessments with rationality. Studies of how decision-makers interact with models are surprisingly rare, but they suggest that human-computer interaction can produce successful outcomes. When the boundaries between academia and practice become too porous, models can become engines, not cameras, with alarming results. Scientists and decision-makers use models differently, and there may be a need for honest brokers to mediate shared aims. The way forward is a cultural shift towards decision-makers being able to observe and experiment and, as a consequence, to experience that crucial component of science: adaptive learning.

The messiness of the real

The perils of using mathematical models to simulate what some scientists describe as the messiness of the real were exemplified by the collapse of the hedge fund Long Term Capital Management in 1998 and of the investment bank Lehman Brothers in 2008. Both episodes were marked by a failure of financial risk models, but the global financial crisis and the Great Recession that followed the demise of Lehman were catastrophic. In both cases, models underestimated the likelihood and impact of extreme events and their systemic consequences (Danielsson, 2008). In the aftermath of this most recent financial crisis, criminal prosecutions have been rare, complicated by differing explanations of cause and effect, so much so that Andrew Lo, a prominent professor of finance, has drawn an analogy with Akira Kurosawa's classic film Rashomon, with its multiple yet conflicting viewpoints of the same incident (Lo, 2012). By stark contrast, in L'Aquila in Italy, six scientists at the National Commission for the Forecast and Prevention of Major Risks were convicted of multiple manslaughters for providing "inaccurate, incomplete and contradictory" information in advance of the devastating earthquake of 2009 (Johnston, 2012). In this instance, more than 5,000 scientists were unified in the view that seismology has limited predictive power and that the case represents a far-reaching miscarriage of justice (Leshner, 2010).

Decisions, decisions, decisions

Models of reality can be important to major policy decisions. The UK Department for Transport was rocked in 2012 when, under legal challenge by Virgin Trains, mistakes in the evaluation of bids for the West Coast Mainline rail franchise were uncovered. The model used to calculate the capital required of bidders to mitigate insolvency risk was weak, its underlying assumptions were miscommunicated to bidders and, in any event, the calculation was subsequently delegated to a committee of high-level officials in breach of departmental protocol (Laidlaw, 2012). That bidders, facing uncertain fuel prices and the vagaries of future demand, were also expected to project revenue growth accurately over 15 years was met with derision (The Economist, 2012). Estimates of the global infrastructure investment needed over the next 20 years range from an incredible $40 trillion to $50 trillion (Doshi, et al., 2007). If such estimates can be trusted, the magnitude of model risk could increase substantially.
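To see why a 15-year point forecast of revenue growth drew derision, the following is a minimal, hypothetical sketch of how a modest uncertainty in an assumed annual growth rate compounds over a franchise term. The growth rate and spread used are illustrative assumptions, not figures from the West Coast bids.

```python
# Illustrative sketch: a small uncertainty in assumed annual revenue growth
# compounds into a very wide range of outcomes after 15 years.
# All figures are hypothetical and for illustration only.
import numpy as np

rng = np.random.default_rng(seed=0)

years = 15
assumed_growth = 0.05          # central assumption: 5% revenue growth per year
growth_uncertainty = 0.03      # plausible spread around that assumption

# Simulate many possible futures, each with its own realised growth path.
annual_growth = rng.normal(assumed_growth, growth_uncertainty, size=(100_000, years))
revenue_multiple = np.prod(1 + annual_growth, axis=1)

low, mid, high = np.percentile(revenue_multiple, [5, 50, 95])
print(f"Revenue after {years} years (multiple of today's):")
print(f"  5th percentile : {low:.2f}x")
print(f"  median         : {mid:.2f}x")
print(f"  95th percentile: {high:.2f}x")
# A single point forecast hides how wide the range of plausible outcomes is,
# even when the central growth assumption is exactly right.
```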

How do scientists use models?

Scientists employ models as part of their fundamental aim to understand the world better. It is easy to forget, in an age of supercomputing, that early models were physical rather than abstract: orreries, for example, were constructed to simulate the orbits of the planets. Nowadays, mathematical models continue to be used to synthesise data, stimulate observation and experimentation, and simulate the future (Oreskes, 2007). Many scientific experts, however, are aware of the limitations of simulation, and predicting the future can actually be damaging to the fundamental aim of science. In the rare circumstances where a system is described by a small number of measurable parameters, or is highly repetitive and therefore conducive to adaptive learning, prediction may be possible (Oreskes, 2007). Even in these instances, an erroneous conceptualisation of the system can produce a false positive. The philosopher of science Eric Winsberg, who argues that modelling possesses a different epistemology from science, calls this reliability without truth (Winsberg, 2006).

Reliability without truth is a growing anxiety in science. Professor Sherry Turkle, a psychologist, observed the impact of computing on science at MIT. In a study spanning the 1980s and the 2000s, she documents evidence of deterioration in the scientific identity of young scientists who are drunk with code and indifferent to the inherent limitations of the computer as a scientific instrument (Turkle, 2009). But she also finds that simulation has freed students and researchers from repetitive processes, stimulating observation and experimentation and supporting adaptive learning. The price to be paid for this immersion is diminishing scepticism. As one MIT physicist put it:

"My students know more and more about computer reality, but less and less about the real world. And they no longer even really know about computer reality, because the simulations have become so complex that people don't build them any more. They just buy them and can't get beneath the surface. If the assumptions behind some simulation were flawed, my students wouldn't even know where or how to look for the problem." (Turkle, 2009)

Andrew Lo, then, did not err in referencing Rashomon. But the ubiquity of computer simulation and visualisation may be inevitable. Science is increasingly grappling with complex phenomena for which explanatory models are considerably more advanced than rigorous theories (Jogalekar, 2012). Professor Jerry Ravetz at Oxford University calls this post-normal science, and it has serious implications for the legitimacy of decision-making (Funtowicz & Ravetz, 1994).

How do decision-makers use models?

Professor Colin Thirtle, an agricultural scientist at Imperial College London, is fond of quoting an anecdote about President Lyndon Johnson. When confronted with a range of outcomes, Johnson remarked: "give me a number, ranges are for cattle". Evidence suggests that societies may be predisposed to leaders who exhibit over-confident determinism (Menand, 2005). It would be naïve to underestimate the political quality required of model-based analysis or, correspondingly, to lower one's guard against the dismal record of policy-driven prediction.

The tragic aftermath of the earthquake in L'Aquila is one of many examples illustrating the pitfalls of simulation for decision-making. In the spring of 1997, the town of Grand Forks in North Dakota received two flood outlooks from the National Weather Service for the peak of the Red River. These scenarios, based on different assumptions, were of 47.5 and 49 feet. Although the outlooks were qualified for uncertainty, when the town was inundated at river levels of 54 feet, causing $1-2 billion in property damage, the National Weather Service was blamed by many for the disaster. The mayor of Grand Forks claimed "they blew it big" (Pielke, 1999).

A further question, then, is how decision-makers consume model-based analysis. Behavioural psychologists are discovering that risk assessments are not received with rationality (Kahneman, 2011). Studies of how decision-makers interact with models are surprisingly rare. Professor Earl Hunt at the University of Washington observed naval weather forecasters working with a visual interface that presented multiple simulations in order to mitigate model risk (Hunt, 2003). He found that forecasters tend to rely on a favourite model, regardless of its history, and then adjust its outputs based on observations and satellite patterns. The results of this adaptive human-computer interaction were generally good, and superior to those of the model alone. The outcome reflects a lesson from a famous exchange between Gary Klein and Daniel Kahneman in 2009 (Kahneman & Klein, 2009). Although previously at loggerheads over the merits or otherwise of intuitive expert judgement, they concluded that expert judgement is valid in an environment that is (1) sufficiently regular to be predictable and (2) one in which there is an opportunity to learn those regularities through prolonged practice. In contrast, in low-validity environments characterised by significant uncertainty and unpredictability, simple algorithms may be superior to human judgement. The committee of high-level officials that breached Department for Transport protocol by over-ruling the admittedly flawed capital calculation process for the West Coast Mainline rail franchise was also dismissing the opinion of arguably two of the world's best psychologists.

The merit of a simple algorithm for decision-making in low-validity environments is not merely the whim of a psychologist. There is a trade-off in model construction between detail and error (Oreskes, 2007). While scientists are eager to observe the interplay of important variables, too much detail will result, through the accumulation of errors, in a loss of statistical power. The judgement on how open a model is, and how to close it using assumptions, is difficult and depends on the aims of the exercise. Separating out the behaviour of important variables in a controlled environment is what makes randomised controlled trials so alluring, and so elusive, to social scientists.
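The trade-off just described can be made concrete with a minimal, hypothetical sketch: each additional "detailed" input carries its own measurement error, and as the inputs multiply, the spread of the model's output widens. The input counts and error sizes below are illustrative assumptions rather than figures from any real model.

```python
# Illustrative sketch: adding more uncertain inputs ("detail") to a model
# widens the spread of its output, even though each input is only slightly
# mis-measured. All numbers are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(seed=2)
n_runs = 20_000

for n_inputs in (2, 10, 50):
    # Each input has a true value of 1.0 but is observed with 5% error.
    observed = 1.0 + rng.normal(0.0, 0.05, size=(n_runs, n_inputs))
    # The model multiplies its inputs together, so the errors compound.
    output = observed.prod(axis=1)
    print(f"{n_inputs:>3} inputs: output spread (std) = {output.std():.3f}")

# Typical result: the spread grows from roughly 0.07 with 2 inputs to roughly
# 0.37 with 50, so the more detailed model is also the noisier one.
```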

Engines, not cameras

Professor Donald MacKenzie of Edinburgh University, an observer of the sociology of finance, published in 2006 a seminal analysis of the interweaving of innovation in financial economics with developments in financial markets, which in many ways anticipated the financial crisis that occurred soon afterwards (MacKenzie, 2006). He uncovered an epistemic culture in academia in which models were considered a source of knowledge, but with deep ambivalence towards their realism. Nevertheless, the boundary between academia and practice was highly porous. Again, scepticism was lost in immersion. MacKenzie states that finance theory became thoroughly embedded in financial markets in technical, linguistic and legitimatory ways. In a powerful metaphor, he concludes that financial models were engines, not cameras. It is not a huge leap to suppose that models used to determine important policy decisions, such as the West Coast rail franchise, have become engines too. In neuroscience, knowledge-poor fMRI visualisations allegedly bias opinion more favourably towards research articles (McCabe & Castel, 2008). And the stakes may just be getting higher. If, as Tim Berners-Lee contends, data is the new raw material of the 21st century, then this performativity of models could become even more significant. Smart cities, for example, could prove to be remarkably dumb (Rooney, 2012).

Is there a way forward?

For some experts worried about too much immersion and not enough scepticism, the allegory of Plato's Cave is instructive. In the cave, a group of prisoners can only observe the outside world through shadows cast on a blank wall. The shadows become their reality. For Plato, it was the role of philosophers to free the world from this illusion, and those discontented by simulation feel a similar responsibility (Turkle, 2009). Scientists and decision-makers use models in different ways. They also want different things from models: predominantly analytical and methodological quality for the former, and political quality for the latter. These differences are not irreconcilable, but there is an obvious need for an honest broker to help free stakeholders from misconceptions and mediate shared aims. An interesting case study of human-computer interaction might be the Threshold 21 model of the Millennium Institute in Washington DC (Millennium Institute, 2012). This model was purposely designed for human-computer interaction in the service of long-term national development planning. Model detail is eschewed in favour of a transparent process whereby, with the aid of mediation, the decision-maker can actually understand how inputs and assumptions affect the interplay of important variables. The black-box environment is unlocked (a minimal sketch of this kind of assumption-by-assumption exercise follows at the end of this section).

What is clear is that model-based analysis should not be the sole evidence underpinning a decision. The validity of the environment in which the decision is being made may call for the mental models of intuitive expert judgement (Kahneman & Klein, 2009). For shorter-term decisions, improved monitoring of the policy environment may be a better investment than an improvement in model detail (Oreskes, 2007). For longer-term decisions, such as infrastructure investment, the decision process may have to be designed to fit an uncertain and unpredictable policy environment (Harford, 2012). Above all else, decision-makers should experience models in ways that are based on good science. The way forward is a cultural shift towards decision-makers being able to observe and experiment and, as a consequence, to experience that crucial component of science: adaptive learning.
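To give a flavour of the transparent, assumption-by-assumption exercise described above, the following is a minimal sketch of one-at-a-time sensitivity testing on a deliberately tiny planning model. The model, its input names and its numbers are hypothetical illustrations and are not drawn from Threshold 21.

```python
# Illustrative sketch: a deliberately simple, transparent model whose
# assumptions can be varied one at a time so a decision-maker can see
# which ones matter. The model and all numbers are hypothetical.

def projected_demand(population_growth, income_growth, price_change, years=15):
    """Very rough demand index after `years` years, given annual assumptions."""
    annual = (1 + population_growth) * (1 + 0.8 * income_growth) * (1 - 0.4 * price_change)
    return annual ** years

baseline = {"population_growth": 0.01, "income_growth": 0.02, "price_change": 0.01}
base_value = projected_demand(**baseline)
print(f"Baseline demand index: {base_value:.2f}")

# One-at-a-time sensitivity: nudge each assumption by one percentage point
# and report how much the projected demand index moves.
for name, value in baseline.items():
    tweaked = dict(baseline, **{name: value + 0.01})
    change = projected_demand(**tweaked) / base_value - 1
    print(f"+1pp in {name:<18}: demand index changes by {change:+.1%}")
```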


References

Danielsson, J., 2008. Blame the models. Journal of Financial Stability, Volume 4, pp. 321-328.
Doshi, V., Schulman, G. & Gabaldon, D., 2007. Lights! Water! Motion!. Booz & Co strategy + business, Volume 46, pp. 1-16.
Funtowicz, S. & Ravetz, J., 1994. Uncertainty, complexity and post-normal science. Environmental Toxicology and Chemistry, Volume 13, pp. 1881-1885.
Harford, T., 2012. Adapt: why success always starts with failure. London: Little, Brown.
Hunt, E., 2003. Human-computer decision making: the view from psychology. Washington: University of Washington.
Jogalekar, A., 2012. Theories, models and the future of science. [Online] Available at: http://blogs.scientificamerican.com/the-curiouswavefunction/2012/09/05/theories-models-and-the-future-of-science/
Johnston, A., 2012. L'Aquila quake: Italian scientists guilty of manslaughter. [Online] Available at: http://www.bbc.co.uk/news/world-europe-20025626
Kahneman, D., 2011. Thinking, Fast and Slow. New York: Farrar, Straus and Giroux.
Kahneman, D. & Klein, G., 2009. Conditions for intuitive expertise: a failure to disagree. American Psychologist, Volume 64, pp. 515-526.
Laidlaw, S., 2012. Report of the Laidlaw Inquiry: Inquiry into the lessons learned for the Department for Transport from the InterCity West Coast Competition. HMSO.
Leshner, A., 2010. AAAS protests charges against scientists who failed to predict earthquake. [Online] Available at: http://www.aaas.org/news/releases/2010/0630italy_letter.shtml
Lo, A., 2012. Reading about the financial crisis. Journal of Economic Literature, Volume 50, pp. 151-178.
MacKenzie, D., 2006. An engine, not a camera: how financial models shape markets. London: The MIT Press.
McCabe, D. & Castel, A., 2008. Seeing is believing: the effect of brain images on judgements of scientific reasoning. Cognition, Volume 107, pp. 343-352.
Menand, L., 2005. Everybody's an expert. [Online] Available at: http://www.newyorker.com/archive/2005/12/05/051205crbo_books1?pri
Millennium Institute, 2012. Threshold 21 Model. [Online] Available at: http://www.millennium-institute.org/integrated_planning/tools/T21/
Oreskes, N., 2007. The role of quantitative models in science. In: s.l.: s.n.
Pielke, R., 1999. Who decides? Forecasts and responsibilities in the 1997 Red River Flood. Applied Behavioural Science Review, Volume 7, pp. 83-101.
Rooney, B., 2012. 'Smart city' planning needs the right balance. [Online] Available at: http://online.wsj.com/article/SB10000872396390443916104578020411910063242.html
The Economist, 2012. Train franchises: wrong track. [Online] Available at: http://www.economist.com/node/21564272
Turkle, S., 2009. Simulation and its discontents. London: The MIT Press.
Winsberg, E., 2006. Models of success versus the success of models: reliability without truth. Synthese, Volume 152, pp. 1-19.
