
Lecture 8: Data analysis and simulation

Data Analysis
Four parts:
Data Collection
Data Preparation
• Step 1: Data Validation (removing duplicates and false records)
• Step 2: Data Editing (correcting errors)
• Step 3: Data Coding (making it accessible for quantitative analysis)
Quantitative Data Analysis Methods
• Descriptive Statistics
Data interpretation

Data collection
Around 30 data points are needed to determine a proper mean and confidence interval, and also to
distinguish multiple classes, each containing at least 30 elements. A homogeneous patient group
should therefore contain at least 30 patients.
With more data you get stable means and variances -> suitable for a stable situation.
For non-stationary situations and cyclical patterns you need more data: a minimum of 3
years, i.e. 3 similar periods, to detect the seasonality.
It thus depends on what you want to look at.
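As a small illustration of the first point, a minimal sketch (in Python, with made-up operating times) of computing a mean and a 95% confidence interval from roughly 30 data points:

```python
import numpy as np
from scipy import stats

def mean_ci(data, confidence=0.95):
    """Mean and confidence interval based on the t-distribution."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    mean = data.mean()
    sem = data.std(ddof=1) / np.sqrt(n)                      # standard error of the mean
    half_width = stats.t.ppf((1 + confidence) / 2, df=n - 1) * sem
    return mean, (mean - half_width, mean + half_width)

# Example: 30 hypothetical operating times (minutes)
rng = np.random.default_rng(42)
times = rng.normal(loc=120, scale=25, size=30)
print(mean_ci(times))
```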

Data preparation
Removing missing values/wrong values, for example negative operating times
Sometimes you find the obvious mistake
Sometimes you can repair the data, but when you have enough points you can simply skip
the faulty records
When doing an interview, some coding is performed on the interview as part of the data
preparation
1) Validation -> removing duplicates, for example when combining multiple data frames
2) Data editing
3) Data coding -> small calculations

Derive a variable from the existing ones.
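A minimal sketch of these three preparation steps in Python/pandas; the file name and column names are hypothetical:

```python
import pandas as pd

# Hypothetical data set; the real file and columns will differ.
df = pd.read_csv("surgeries.csv")

# 1) Validation: drop duplicate records (e.g. after combining data frames)
df = df.drop_duplicates()

# 2) Data editing: remove obviously wrong values such as negative operating times
df = df[df["operating_time_min"] > 0]

# 3) Data coding: derive a new variable from the existing ones
df["total_or_time"] = df["operating_time_min"] + df["changeover_time_min"]
```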


Quantitative data analysis methods
Calculate mean, variance and other simple statistics; also look at seasonality.
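A minimal sketch of such simple statistics and a quick seasonality check (arrivals per weekday); file and column names are again hypothetical:

```python
import pandas as pd

df = pd.read_csv("admissions.csv", parse_dates=["admission_date"])

# Simple descriptive statistics
print(df["length_of_stay"].describe())        # count, mean, std, quartiles
print("variance:", df["length_of_stay"].var())

# A first look at seasonality: number of arrivals per weekday
arrivals_per_weekday = df.groupby(df["admission_date"].dt.day_name()).size()
print(arrivals_per_weekday)
```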

Data interpretation -> important when the data has to tell you whether there is a problem, e.g. a
service level that is too high.

Two types of data:

- Data to support that there is a problem
- Data used to calculate LOS, operating times, or …; this collected data is then used
  as input for a simulation model, after which the output data is analysed.

Data Analysis
Read 16 – Rik Mols, Chapter 5, p 17-23
Discuss his work with regard to the 4 parts:
• Data collection -> explains how he has collected the data; he collects 2 data sets. The
second data set contains the number of surgeries executed per day in 2012 and the
number of IC indications scheduled and executed, and (registered IC admissions
data, registered IC refusals and registered late cancellations). The late
cancellations were handwritten and the others come from the database.
Everything was from 2012 only. That is not a lot, especially for the things he is
looking at, in particular the late cancellations and the refusals, as these numbers are small.
• Data preparation -> it looks like no values are missing; he mentions that the length of stay
is sometimes negative and that these records were deleted; otherwise the data preparation is
limited. The refusals were times 2 or 3.
• Quantitative Analysis -> (registered IC admissions data, registered IC refusals and
registered late cancellations); shows general analysis such as the number of patients
per weekday by admission type, the number of patients per medical discipline by
admission type, and other figures and tables; the validation is explained, but the
descriptive tables are not very extensive. Also length of stay per admission type, and
refusals and late cancellations per patient group, probably. The 'dead before' cases are the organ donors.
• Data Interpretation -> interprets it at the end.
He does not say whether conclusions can be drawn from his data. This is important
because his data set is small and not entirely correct (times 2, for example). If you assume
that 1/16 goes to the IC, and only 50 people arrive, only a few go to the IC, so this
process is very volatile. The weakest point is that for the things he wants to look at there is
only little data, and he does not provide a proper conclusion part, so the
interpretation part is weak.
Data Analysis
Read 29 – Carlijn Goedhart, 5.3.1-5.3.3, p 20-23
Discuss her work with regard to the 4 parts:
• Data collection -> explained; two different hospital databases, Oksimed and Okapi,
from the first week of January 2010 till the last week of August 2013 (January 1st 2010 till
September 24th 2013). Data on operated patients and registered cases (indicating some type
of arrival), see table 4.
• Data preparation -> pre-operative screening; the EOP (earliest operating period) and the
surgery date are determined by the planner. Cases that could not be coupled because they
were not in both sheets were deleted. Combining the two databases was hard: for some
time periods there was overlap between the databases. As she has a lot of data, it is easy to
skip some of the data. You always have to be careful about whether a point is a real outlier or not,
and really motivate deleting outliers. Always ask the company people whether it is possible that
such a point occurs; when it is not possible, skip those points.
• Quantitative Analysis -> more statistics about the data are missing, but she has some
figures and tables about other things.
• Data Interpretation -> she does not do this, as she uses her data as input for the simulation.
Since the operating date is given by a planner, she could have added a good overview of the
time between the admission of the patient and the operating date.
Essay: compare the analysis of Carlijn and Rik and one other thesis
Simulation
When do we use simulation?
• When (stochastic) analysis is too difficult (still try to estimate performance or to give
bounds for performance analytically). Try to make a simple estimation first; that
helps you later on in the simulation as well. Often it is too difficult to do anything else,
so a simulation is made, most often with a lot of stochastic ....
LPs give you more advanced rules. Simulation works well for simple rules; for
complicated rules it is not very practical.
• When illustration is needed due to complexity
• Note: usually in simulations only simple decision making rules!
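As a small illustration of "estimate analytically first, then simulate", a minimal sketch (not from the lecture) that compares the analytical M/M/1 waiting time with a quick single-server simulation; the arrival and service rates are made up:

```python
import random

lam, mu = 4.0, 5.0                       # assumed arrival and service rates (per hour)
rho = lam / mu
wq_analytical = rho / (mu - lam)         # mean waiting time in queue for M/M/1

def simulate_mm1(n_customers=100_000, seed=1):
    """Quick FIFO single-server simulation of the same system."""
    random.seed(seed)
    t_arrival = 0.0
    t_service_end = 0.0
    total_wait = 0.0
    for _ in range(n_customers):
        t_arrival += random.expovariate(lam)
        start = max(t_arrival, t_service_end)   # wait if the server is still busy
        total_wait += start - t_arrival
        t_service_end = start + random.expovariate(mu)
    return total_wait / n_customers

print(f"analytical Wq = {wq_analytical:.3f} h, simulated Wq = {simulate_mm1():.3f} h")
```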

Simulation
Steps in simulation?
- Build model (Excel, Arena, programming language)
- Validation (compare to real world): the model has to be close to reality. This can be
done by looking at the data, checking the outcomes and checking whether they are close to reality.
Finally, change rules such that the outcome becomes better.
- Verification (compare to theoretical model expectations): ask supervisors, and check for
example whether the queue length increases when the utilization approaches 1.
- Warm-up period, duration, sub runs, runs -> in many situations the system is not
empty, therefore a warm-up period is needed. You can do one long simulation and
divide it into sub runs. The sub runs should be chosen such that the correlation between
two sub runs is less than 0.3; then they can be considered more or less
independent (see the sketch after this list).
- Comparison, common random numbers, sensitivity. Compare methods with the same
common random numbers. If you change a parameter a bit, do you get big
changes in your outcome? -> sensitivity
- Report averages and confidence intervals
- Present results in clear way and draw conclusions!
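A minimal sketch of the sub-run (batch means) idea mentioned above: discard the warm-up, split one long run into sub runs, check that consecutive sub-run means are roughly uncorrelated (below the 0.3 mentioned above), and report an average with a confidence interval. The output data here is made up.

```python
import numpy as np
from scipy import stats

def batch_means(output, warmup, n_batches=20, confidence=0.95):
    """Average and confidence interval from one long run split into sub runs."""
    data = np.asarray(output[warmup:], dtype=float)   # discard warm-up period
    batches = np.array_split(data, n_batches)
    means = np.array([b.mean() for b in batches])

    # lag-1 correlation between consecutive sub-run means; should be below ~0.3
    lag1 = np.corrcoef(means[:-1], means[1:])[0, 1]

    grand_mean = means.mean()
    sem = means.std(ddof=1) / np.sqrt(n_batches)
    half = stats.t.ppf((1 + confidence) / 2, df=n_batches - 1) * sem
    return grand_mean, (grand_mean - half, grand_mean + half), lag1

# Usage with hypothetical simulation output (e.g. waiting times per patient)
rng = np.random.default_rng(0)
waits = rng.exponential(scale=2.0, size=50_000)
print(batch_means(waits, warmup=1_000))
```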

Simulation
Read 29 – Carlijn Goedhart, Section 6, p 25-32. How does she
- Build model (Excel, Arena, programming language) -> Excel, VBA
- Validation (compare to real world)
- Verification (compare to theoretical model expectations) -> asks a consultant
- Warm-up period, duration, sub runs, runs -> warm-up period of one month, 5 runs and
a ...
- Comparison, common random numbers, sensitivity
- Report averages and confidence intervals
- Present results in a clear way and draw conclusions?

Simulation
Essay: do the same for one other thesis that uses simulation
Questions/suggestions?
