Metamodeling by Using Multiple Regression Integrated K-Means Clustering Algorithm

Conference Paper · April 2013

Metamodeling by using Multiple Regression Integrated
K-Means Clustering Algorithm
Emre Irfanoglu, Ilker Akgun, Murat M. Gunal

Institute of Naval Science and Engineering
Turkish Naval Academy
Tuzla, Istanbul, TURKEY
(eirfanoglu@dho.edu.tr, iakgun@dho.edu.tr, mgunal@dho.edu.tr)
Keywords: simulation optimization, K-means clustering, metamodel, multiple regression

Abstract

A metamodel in simulation modeling, also known as a response surface, emulator, or auxiliary model, relates a simulation model's outputs to its inputs without the need for further experimentation. A metamodel is essentially a regression model and is often described as "the model of a simulation model". A metamodel may be used for validation and verification, sensitivity or what-if analysis, and optimization of a simulation model. In this study, we propose a new metamodeling approach that integrates the K-means clustering algorithm with multiple regression, intended especially for simulation optimization. Our aim is to evaluate the feasibility of creating multiple metamodels by clustering the input-output variables of a simulation model according to their similarities. In this approach, first, we run the simulation model of a system; second, we cluster the data with the K-means algorithm and create a metamodel for each cluster; and third, we seek the minima (or maxima) of each metamodel. We tested our approach on a fictitious call center. We observed that this approach increases the accuracy of a metamodel and decreases the sum of squared errors. These observations give some insight into the usefulness of clustering in metamodeling for simulation optimization.

1. INTRODUCTION

Coupling the speed of optimization techniques with the flexibility of simulation has given rise to a new research area called Simulation Optimization (SimOpt), which has also affected practice [1-3]. In the history of Operational Research, SimOpt methods started to appear in the 1990s, with the basic idea of merging the advantages of simulation modeling with those of optimization. Simulation methods are known for their flexibility in tackling the complexity of systems. Although simulation models require an extensive amount of data, they help decision makers make better decisions. Optimization methods, on the other hand, are not as flexible in modeling complexity, but once they are built, they generate more accurate and faster results than simulation does. Many methods for SimOpt have been developed, mainly in four categories: gradient-based and random search algorithms, evolutionary algorithms and metaheuristics, mathematical programming-based approaches, and statistical search techniques.

In this study, we suggest a four-phase approach to improve the metamodeling process for SimOpt. Our approach includes simulation experimentation, clustering, metamodeling, and optimization. In the first phase, conventional simulation experimentation techniques are used. Note that we assume we have a simulation model of a typical call center system, and we aim to optimize some objective function. In the second phase, we apply a clustering algorithm (K-means) to the simulation inputs. In the third phase, a metamodel is developed for each cluster. Finally, we apply optimization techniques to each metamodel. Unlike classical metamodeling in SimOpt, we integrate clustering before the "multiple regression" metamodel and generate one metamodel for each cluster, instead of one metamodel for all data.

First, we review some of the SimOpt methods in the literature in Section 2. In Sections 3 and 4, we give brief information about metamodels and clustering. In Section 5, we present our proposed approach and compare it with the classical approach. To show an application of the proposed approach, we experimented with a call center simulation model and showed that clustered metamodels outperform the classical approach.

2. REVIEW OF SIMULATION OPTIMIZATION TECHNIQUES

In SimOpt, a simulation model is used to estimate the performance of a system, and based on this estimate, an optimization algorithm is run to find new input values that will maximize or minimize the estimated system performance. As in conventional optimization models, the input values, or decision variables, are constrained. The iterative nature of this approach generally
makes the simulation model a bottleneck, and therefore the performance of the model is significant.

We review some of the well-known simulation optimization techniques as follows:

Gradient-based and random search algorithms (e.g. stochastic approximation): Gradient-based search methods are a type of optimization technique that uses the gradient of the objective function to find an optimal solution [4]. In each iteration of the algorithm, the values of the decision variables are adjusted so that the simulation produces a lower objective function value. Gradient-based methods work well in high-dimensional spaces provided that these spaces do not have local minima. The drawback is that global minima are likely to remain unfound.

Evolutionary Algorithms and Metaheuristics (e.g. Genetic Algorithms, Tabu Search and Simulated Annealing): Heuristic-based methods strike a balance between exploration and exploitation. This balance permits the identification of local minima, but encourages the discovery of a globally optimal solution [5]. Heuristic techniques generate good candidate solutions when the search space is large and nonlinear.

Mathematical Programming-Based Approaches (e.g. the Sample Path Method): Sample path optimization (also known as stochastic counterpart or sample average approximation; see [6]) takes many simulations first, and then tries to optimize the resulting estimates by using conventional mathematical programming solution algorithms.

Statistical Search Techniques (e.g. Sequential Response Surface Methodology): Response surface methodology (RSM) is a statistical method for fitting a series of regression models to the output of a simulation model [5]. The goal of RSM is to construct a functional relationship between the decision variables and the output to demonstrate how changes in the values of the decision variables affect the output. Relationships constructed from RSM are often called meta-models [7]. RSM usually consists of a screening phase that eliminates unimportant variables in the simulation [8]. After the screening phase, linear models are used to build a surface and find the region of optimality. Then, second or higher order models are run to find the optimal values for the decision variables.

The eventual objective of RSM is to determine the optimum operating conditions for the system or to determine a region of the factor space in which operating requirements are satisfied [9]. In the formal application of RSM for optimization and for design of experiments in general, one of the most important steps is factor screening, the initial identification of the "important" parameters, those factors that have the greatest influence on the response. However, in our discussion of optimization of discrete-event simulation models, we assume that this has already been determined. In most discrete-event system applications, this is usually the case, since there are underlying analytic models which can give a rough idea as to the influence of various parameters. For example, in manufacturing systems and telecommunications networks, the analyst knows from queuing network models which routing probabilities and service times have an effect on the performance measures of interest. RSM procedures usually presuppose a more "black box" approach to the problem as stated above, so it is unclear a priori which factors are of importance at all [10]. Additionally, Fu [10] classifies the application of RSM into two main categories: metamodels and sequential procedures.

Metamodels are special cases of the RSM representation, and therefore the remainder of this paper uses the term "metamodel" rather than RSM.

3. METAMODEL

A metamodel is a polynomial model that relates the input-output behavior of a simulation model. A metamodel is often a least squares regression model that has the form given in Eq. (1):

$$E(y) = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \dots + \sum_{i=1}^{k} \sum_{j=1}^{k} \beta_{ij} x_i x_j \tag{1}$$

where β_i, β_ii, and β_ij represent regression coefficients, x_i (i = 1, …, k) are design variables, and y is the response. The simple form of a metamodel can reveal the general characteristics of behavior in complex simulation models. The objective of a metamodel is to "effectively" relate the output data of a simulation model to the model's input to aid in the purpose for which the simulation model was developed [11].

Since our aim in this study is to form a metamodel by using clustering algorithms, we review the related literature in the following section. Note that we aim at classifying the input variables according to their similarities; after clustering the data, there will be n grouped (clustered) data sets and hence n metamodels. We discuss the details of this approach after introducing the clustering algorithms.

4. CLUSTERING

Clustering is a way to examine similarities and dissimilarities of observations or objects. Data often fall naturally into groups, or clusters, of observations, where the
characteristics of objects in the same cluster are similar and the characteristics of objects in different clusters are dissimilar. Both the similarity and the dissimilarity should be examinable in a clear and meaningful way. Measures of similarity depend on the application.

Clustering is widespread, and a wealth of clustering algorithms has been developed to solve different problems in specific fields. However, there is no clustering algorithm that can be universally used to solve all problems [12]. Clustering has been applied in a wide range of areas, ranging from engineering (machine learning, artificial intelligence, pattern recognition, mechanical engineering, electrical engineering), computer sciences (web mining, spatial database analysis, textual document collection, image segmentation), life and medical sciences (genetics, biology, microbiology, paleontology, psychiatry, clinic, pathology), to earth sciences (geography, geology, remote sensing), social sciences (sociology, psychology, archeology, education), and economics (marketing, business) [13-14].

There are two common clustering techniques based on the properties of the clusters generated [13-15]: hierarchical clustering and partitioned clustering. Hierarchical clustering groups the data over a variety of scales by creating a cluster tree. The tree is not a single set of clusters, but rather a multilevel hierarchy, where clusters at one level are joined as clusters at the next level. This allows you to decide the level or scale of clustering that is most appropriate for your application.

In partitioned clustering, the data objects are divided into some specified number of clusters. The K-means clustering algorithm is one of the well-known methods in this category. K-means partitions data into k mutually exclusive clusters, and returns the index of the cluster to which each observation has been assigned. We used this technique in our methodology to cluster the simulation inputs.

The K-means clustering algorithm treats each observation in the data as an object having a location in space. It finds a partition in which objects within each cluster are as close to each other as possible, and as far from objects in other clusters as possible. There are several different distance measures, depending on the kind of data you are clustering.

Each cluster in the partition is defined by its member objects and by its centroid, or center point. The centroid for each cluster is the point to which the sum of distances from all objects in that cluster is minimized. K-means computes the cluster centroids differently for each distance measure, to minimize the sum with respect to the specified measure.

K-means uses an iterative algorithm that minimizes the sum of distances from each object to its cluster centroid, over all clusters. The algorithm moves objects between clusters until the sum cannot be decreased further. The result is a set of clusters that are as compact and well-separated as possible. An example of clustered data points is shown in Figure-1.

Figure-1. An example of clustered data points (taken from a Matlab example)

5. PROPOSED APPROACH

In this study, we propose a new metamodeling approach that integrates the K-means clustering algorithm with multiple regression, intended especially for simulation optimization. Our approach works in four phases: Experimentation, Clustering, Metamodeling, and Optimization. There are ten steps in total, as presented in Figure 2. We assume that we have a simulation model built for the system for which we want to find the optimum values of some decision variables. Note that in this case, the decision variables are simulation model inputs. In the experimentation phase, the modeler designs the experiments according to the size of the search space. For example, if there are n input variables and we decide to run low and high values of each variable, we end up with 2^n factorial experiments. In some cases, that many experiments may not be enough and more experimentation might be required.
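
As a rough illustration of the experimentation phase, a two-level full factorial design over n inputs enumerates every low/high combination, which gives 2^n design points. The sketch below uses the lower and upper bounds of Table 1 as the two levels. The function name is hypothetical, and the application in this paper took its experiments from OptQuest rather than from such a design.

```python
from itertools import product


def two_level_full_factorial(bounds):
    """Return every low/high combination of the inputs (2**n design points)."""
    names = list(bounds)
    levels = [bounds[name] for name in names]  # one (low, high) pair per input
    return [dict(zip(names, combo)) for combo in product(*levels)]


# Low and high levels taken from the bounds listed in Table 1.
bounds = {
    "X1": (0, 15), "X2": (0, 15), "X3": (0, 15),
    "X4": (0, 15), "X5": (0, 15), "X6": (26, 50),
}

design = two_level_full_factorial(bounds)
print(len(design))   # number of experiments for six inputs at two levels
print(design[0])     # first design point: every input at its low level
```

With six inputs this already means 64 simulation experiments before any replications, which is why the size of the search space drives the design.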

In the second phase, we cluster the simulation inputs. This is an iterative process: in each iteration we check a performance criterion for the clustering, and if the criterion is below the acceptable level, we increase the number of clusters. For example, in the K-means clustering method, the performance criterion is the silhouette value.
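
The loop described in this phase can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions: the data are made-up 2-D points, the `threshold` standing in for the "acceptable level" is a hypothetical parameter, the centroids are initialised deterministically from evenly spaced data points, and Euclidean distance is used. It is not the Matlab routine used in the study.

```python
import math


def kmeans(points, k, iters=100):
    """Plain Lloyd iterations; centroids start at evenly spaced data points."""
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new.append(tuple(sum(v) / len(members) for v in zip(*members))
                       if members else centroids[c])
        if new == centroids:
            break
        centroids = new
    return labels


def mean_silhouette(points, labels):
    """Average silhouette value over all points."""
    total = 0.0
    for i, p in enumerate(points):
        dists = {}
        for j, q in enumerate(points):
            if i != j:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            continue  # singleton cluster or a single cluster: contributes 0
        a = sum(own) / len(own)                           # within-cluster distance
        b = min(sum(d) / len(d) for d in dists.values())  # nearest other cluster
        total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, threshold, k_max):
    """Increase k (k_max >= 2) until the mean silhouette reaches the threshold."""
    for k in range(2, k_max + 1):
        labels = kmeans(points, k)
        score = mean_silhouette(points, labels)
        if score >= threshold:
            break
    return k, labels, score


# Made-up data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (20, 0), (20, 1), (21, 0), (21, 1)]
k, labels, score = choose_k(points, threshold=0.8, k_max=8)
print(k, round(score, 2))
```

On this toy data two clusters merge two of the groups and fall below the threshold, so the loop moves on to three clusters.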

In the third phase, we develop a metamodel for each cluster. As in the previous phase, we check quality measures of the metamodels, for example the R-square values. The purpose of a metamodel is to estimate output values without further simulation experimentation. Therefore, after this phase, we can estimate simulation outputs without running the model. However, since we have multiple metamodels, we need to determine rules for choosing which metamodel to use. These rules might be based on the limits of the simulation inputs.
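
To make the idea of one metamodel per cluster concrete, the sketch below fits a quadratic regression to each of two clusters and compares the sum of squared errors with that of a single metamodel fitted to all data. Everything here is an assumption made for illustration: one input variable, noise-free synthetic responses, hypothetical function names, and a selection rule based on an input limit (x < 5). The study itself fitted its metamodels in Minitab.

```python
def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    rows = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)


def predict(beta, x):
    return beta[0] + beta[1] * x + beta[2] * x * x


def sse(beta, xs, ys):
    return sum((y - predict(beta, x)) ** 2 for x, y in zip(xs, ys))


# Synthetic data: two clusters of inputs with different response behaviour.
xs_a = [0, 1, 2, 3, 4]
ys_a = [(x - 2) ** 2 + 5 for x in xs_a]
xs_b = [6, 7, 8, 9, 10]
ys_b = [40 - 3 * x for x in xs_b]

beta_a = fit_quadratic(xs_a, ys_a)                # metamodel of cluster A
beta_b = fit_quadratic(xs_b, ys_b)                # metamodel of cluster B
beta_all = fit_quadratic(xs_a + xs_b, ys_a + ys_b)  # single "classic" metamodel

sse_clustered = sse(beta_a, xs_a, ys_a) + sse(beta_b, xs_b, ys_b)
sse_single = sse(beta_all, xs_a + xs_b, ys_a + ys_b)


def metamodel(x):
    """Selection rule based on an input limit: which cluster does x fall in?"""
    return predict(beta_a, x) if x < 5 else predict(beta_b, x)


print(sse_single > sse_clustered)
```

On real simulation output the per-cluster fits will not be exact, so the comparison has to be made with the quality measures mentioned above.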

The final phase is the optimization phase. Based on the objective function, we seek the minima or maxima of each metamodel. This requires differentiating the regression model, setting the derivatives to zero, and solving the resulting equations. Then, we choose the minimum or maximum among the clusters' optimum values.
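
For a metamodel without interaction terms, the derivative condition can be applied to each variable separately: the stationary point of b*x + c*x^2 is x = -b/(2c), and it has to be compared with the bounds of the variable. The sketch below does this for two hypothetical cluster metamodels (all coefficients are invented for illustration) and then picks the best cluster. Metamodels with cross-product terms need a joint solution of the gradient equations or a numerical optimizer instead.

```python
def minimize_separable_quadratic(intercept, linear, quadratic, bounds):
    """Minimize intercept + sum(b_i*x_i + c_i*x_i**2) with lo_i <= x_i <= hi_i."""
    x_opt, value = [], intercept
    for b, c, (lo, hi) in zip(linear, quadratic, bounds):
        candidates = [lo, hi]
        if c != 0:
            stationary = -b / (2 * c)  # root of the derivative b + 2*c*x
            if lo <= stationary <= hi:
                candidates.append(stationary)
        best = min(candidates, key=lambda x: b * x + c * x * x)
        x_opt.append(best)
        value += b * best + c * best * best
    return x_opt, value


# Hypothetical cluster metamodels: (intercept, linear terms, quadratic terms).
clusters = {
    "A": (100.0, [-8.0, 4.0], [2.0, 1.0]),
    "B": (90.0, [-6.0, -2.0], [1.0, 0.5]),
}
bounds = [(0, 15), (0, 15)]

results = {name: minimize_separable_quadratic(b0, lin, quad, bounds)
           for name, (b0, lin, quad) in clusters.items()}

# Final step of the phase: choose the minimum among the cluster optima.
best_name = min(results, key=lambda name: results[name][1])
best_x, best_value = results[best_name]
print(best_name, best_x, best_value)
```

Checking the bounds as well as the stationary point matters because a quadratic term with a negative coefficient, or a stationary point outside the feasible range, puts the minimum on a bound.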

6. APPLICATION

6.1. Problem Definition

To test our approach, we used an example call center simulation model created with the ARENA program [16]. We chose this model to benchmark our methodology. Therefore, the model structure and its parameter values are taken from the original problem definition as written in [16]. The call center provides technical support, sales information, and order processing for a company. Calls arrive at the call center with interarrival times exponentially distributed with a mean of 0.857 minutes. The call center has 26 trunk lines, which means that at most 26 calls can be in progress concurrently. If all lines are busy, the next arriving call is rejected. An incoming call is diverted to one of three options: technical support, sales information, or order status inquiry. Their percentages are 76%, 16%, and 8%, respectively. The estimated time for this activity is UNIF(0.1, 0.6); all times are in minutes.
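
For orientation, the mean interarrival time and the call-type percentages imply the following offered call rates. This is a back-of-the-envelope calculation that ignores rejected calls, and the symbol λ is introduced here only for illustration.

```latex
\lambda = \frac{1}{0.857\ \text{min}} \approx 1.167\ \text{calls/min} \approx 70\ \text{calls/hour}

0.76\,\lambda \approx 53.2\ \text{(technical support)}, \qquad
0.16\,\lambda \approx 11.2\ \text{(sales)}, \qquad
0.08\,\lambda \approx 5.6\ \text{(order status)} \quad \text{calls/hour}
```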

Figure-2. Flowchart of the proposed methodology

In the case of technical support calls, first, a recorded welcome message is played, which takes UNIF(0.1, 0.5) minutes. In this message, the caller is asked to choose one of three product types. The percentages of product types 1, 2 and 3 are 25%, 34% and 41%, respectively. If a qualified technical support person is available for the selected product type, the call is automatically routed to that person. Otherwise, the customer is placed in an electronic queue
until a support person is available. All technical support call durations are triangularly distributed with parameters 3, 6, and 18 minutes. After a caller has been served, he exits the system.

The second type of call is sales. These calls are routed to the sales staff. The duration of a sales call is triangularly distributed with parameters 4, 15, and 45 minutes. As in technical support, the caller leaves the system after completion of the call. The third type of call, order status, is handled by computers. However, some customers may ask to talk to a real operator. This happens in 15% of this type of call. Order status calls are also triangularly distributed, with parameters 2, 3, and 4 minutes. Note that when these calls are placed in a queue for a real operator, they have lower priority than sales calls. An operator handles these calls with triangularly distributed times (3, 5, 10 minutes). These callers then exit the system.

In our base experimentation, there are 11 technical support employees to answer the technical support calls. Two are only qualified to handle calls for product Type 1, three are only qualified to handle calls for product Type 2, three are only qualified to handle calls for product Type 3, two are only qualified to handle calls for product Types 1 and 3, and one is qualified to handle calls for all three product types. There are four employees to answer the sales calls and those order-status calls that want to speak to a real person.

Our main output variable is the total cost, which includes three types of cost: (1) staffing and resource costs, (2) costs due to poor customer service, and (3) costs of rejected calls. A sales staff member costs $20/hour and a tech-support staff member costs $18-$20/hour, depending on their level of training and flexibility. The second type of cost is the cost incurred by making customers wait on hold. When dealing with a call center, at some point, people will start getting mad and the system will start incurring a cost. Although it is difficult to measure this cost, we assumed that for tech calls this point is 3 minutes, for sales calls it is 1 minute, and for order status calls it is 2 minutes. Beyond this tolerance point for each call type, the system incurs a cost of 36.8 cents/minute for tech calls, 81.8 cents/minute for sales calls, and 34.6 cents/minute for order status calls. For rejected calls, it is assumed that no more than 5% of incoming calls get a busy signal; any model configuration not meeting this requirement is regarded as unacceptable. To reduce rejected calls, the number of trunk lines can be changed; each trunk line incurs a cost of $98/week.

In the optimization part, we used this call center simulation model to find the minimum total cost while holding the percentage of rejected calls at 5% or less. The decision variables and their lower and upper bounds are shown in Table 1.

There are two constraints in the problem definition: first, the number of trunk lines must be between 26 and 50. Second, the call center can accommodate 15 operators at most.

6.2. Steps of the Methodology

Step-1 Specify the decision variables: We chose the six decision variables shown in Table-1, which affect our performance criterion (the total cost).

Table 1. Decision variables and their lower and upper bounds

Decision Variables     Lower Bound    Upper Bound
New Sales (X1)         0              15
New Tech 1 (X2)        0              15
New Tech 2 (X3)        0              15
New Tech 3 (X4)        0              15
New Tech All (X5)      0              15
Trunk Line (X6)        26             50

Step-2 Simulation Experimentation: For this stage, instead of designing our own experiments, we chose the experiments that are already specified by Arena's OptQuest. To ease the process, we first ran OptQuest for 500 experiments to find the optimum. As a result, OptQuest found the values in Table 2 with an objective function value of $21,017. The run length for the model is 1000 hours and we made 10 replications in each experiment.

Table 2. Minimum total cost and values of decision variables via OptQuest

Obj.Func.    X1    X2    X3    X4    X5    X6
$21017       3     0     0     0     3     29

Step-3 Evaluate the Simulation Output: 16 of the 500 experimental results were removed since they were in the infeasible region.

Step-4 Determine the Number of Clusters: In this step, we cluster the inputs of the simulation model by examining the silhouette values. The silhouette plot displays a measure of how close each data point in one cluster is to the points in the neighboring clusters. The silhouette value ranges from +1 to -1. "+1" indicates points that are very distant from the neighboring clusters. "0" indicates points that are not distinctly in one cluster or another. "-1" indicates points that are probably assigned to the wrong cluster. The value is defined as

$$S(i) = \frac{\min_k b(i,k) - a(i)}{\max\left(a(i),\ \min_k b(i,k)\right)}$$

where a(i) is the average distance from the ith point to the other points in its cluster, and b(i,k) is the average distance from the ith point to points in another cluster k.
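
As a small worked example of the silhouette definition in Step-4 (the distances are made-up values, not taken from the experiments), suppose point i has an average within-cluster distance a(i) = 1 and average distances b(i,1) = 3 and b(i,2) = 4 to two other clusters:

```latex
S(i) = \frac{\min(3, 4) - 1}{\max\left(1,\ \min(3, 4)\right)} = \frac{2}{3} \approx 0.67
```

A value this close to +1 suggests that the point sits well inside its own cluster. If instead a(i) were 3 and the nearest other cluster were at an average distance of 1, the same formula would give (1 - 3)/3 ≈ -0.67, a sign that the point is probably in the wrong cluster.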
Step-5 Cluster Simulation Inputs: We clustered the simulation inputs using the Euclidean distance between the inputs. Here, we clustered the inputs into up to 8 clusters to compare the silhouette plots.

Step-6 Cluster Validation: To validate the clusters, we analyzed the silhouette plots and means. Here, the best plot belongs to the 5-cluster solution (mean 0.55), as shown in Figure-3. Therefore, we end up with 5 metamodels.

Figure 3. Silhouette plot for the experiments (x-axis: Silhouette Value, 0 to 1; y-axis: Cluster, 1 to 5).

Step-7 Create Metamodel of Every Cluster: We created 5 metamodels by using Minitab [17], according to the number of clusters in Step-6. Equations 2 to 6 show the metamodel of each cluster. In Equations 2 and 3, ± marks an operator whose sign is not legible in the source.

$$
\begin{aligned}
f_1 ={}& 36500 \pm 4657.06 X_1 \pm 382.69 X_2 \pm 774.67 X_3 \pm 779.2 X_4 \pm 166.54 X_5 \\
&\pm 618.5 X_1^2 \pm 24.43 X_2^2 \pm 62.75 X_3^2 \pm 96.67 X_4^2 \pm 57.97 X_5^2
\end{aligned} \tag{2}
$$

$$
\begin{aligned}
f_2 ={}& 33072 \pm 4200.84 X_1 \pm 185 X_2 \pm 225.52 X_3 \pm 57.46 X_4 \pm 229.69 X_5 \\
&\pm 617.75 X_1^2 \pm 102.25 X_2^2 \pm 77.57 X_3^2 \pm 14.71 X_4^2 \pm 16.81 X_5^2
\end{aligned} \tag{3}
$$

$$
\begin{aligned}
f_3 ={}& 47090.5 - 3869 X_1 + 15649 X_2 - 3294.5 X_3 - 3404.3 X_4 - 2199.4 X_5 \\
&+ 474.6 X_1^2 + 90 X_1 X_4 - 1367.5 X_2^2 - 1238.2 X_2 X_3 - 821.7 X_2 X_4 \\
&- 1084.9 X_2 X_5 + 193.8 X_3^2 + 295.9 X_3 X_4 + 250.5 X_3 X_5 + 170.7 X_4^2 \\
&+ 251 X_4 X_5 + 104.9 X_5^2
\end{aligned} \tag{4}
$$

$$
\begin{aligned}
f_4 ={}& 34439.5 - 1970.06 X_1 - 450.6 X_2 - 1767.6 X_3 - 1197.2 X_4 - 2515 X_5 \\
&+ 130.49 X_1^2 + 81.6 X_1 X_2 + 156.9 X_1 X_5 + 23.7 X_2^2 + 75.6 X_2 X_3 \\
&+ 81.7 X_2 X_4 + 67.87 X_2 X_5 + 171.8 X_3^2 + 146.3 X_3 X_4 + 353.6 X_3 X_5 \\
&+ 118 X_4^2 + 190.4 X_4 X_5 + 211.9 X_5^2
\end{aligned} \tag{5}
$$

$$
\begin{aligned}
f_5 ={}& -4170.44 + 2002.79 X_1 - 1395.23 X_3 + 6725.7 X_4 + 323.3 X_5 + 1151.2 X_6 \\
&+ 567.3 X_1^2 + 111.7 X_1 X_3 - 838.7 X_1 X_5 - 86.4 X_1 X_6 + 19.2 X_2^2 \\
&- 156.9 X_2 X_4 + 102.7 X_2 X_5 + 80.9 X_3^2 + 54.8 X_3 X_4 + 315.4 X_3 X_5 \\
&- 144.5 X_4^2 - 193.1 X_4 X_6 + 269 X_5^2 - 8.9 X_6^2
\end{aligned} \tag{6}
$$

Step-8 Evaluate the Results: In this step, we evaluate the metamodels of Step-7 by conducting statistical tests (ANOVA, R-square, residual sum of squares). The metamodels' corresponding R-Square values are 79.83%, 63.82%, 65.27%, 67.70% and 96.65%, respectively. Additionally, the square roots of the mean square errors (MSE) are given in Table-3. When we compare the R-Square and MSE values with those of a single, classic metamodel (no clustering), we see that the single metamodel's R-Square value is 81.51% and its MSE is 1782.76.

Table 3. Statistical results of proposed approach and classic metamodel

Method               Cluster      R-square    MSE        F          p Value
Proposed Approach    Cluster-1    79.83%      1703.29    25.7246    0.0000
Proposed Approach    Cluster-2    63.82%      1764.56    10.2316    0.0000
Proposed Approach    Cluster-3    76.75%      1631.55    7.76564    0.0000
Proposed Approach    Cluster-4    81.63%      793        47.89      0.0000
Proposed Approach    Cluster-5    99.38%      138.45     109.708    0.0000
Classic Metamodel    -            81.51%      1782.76    173.039    0.0000

Step-9 Find the Optimum of Each Metamodel: To optimize the objective functions of the five metamodels, we used Matlab's [19] Optimization Tool. Table-4 shows the minimum total costs and the values of the decision variables.
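
As a quick consistency check, the Cluster-5 metamodel of Equation (6) can be evaluated at the Cluster-5 decision values listed in Table 4. The sketch below is only an illustration: the function name is hypothetical and the coefficients are transcribed from Equation (6).

```python
def f5(x1, x2, x3, x4, x5, x6):
    """Cluster-5 metamodel, coefficients transcribed from Equation (6)."""
    return (-4170.44 + 2002.79 * x1 - 1395.23 * x3 + 6725.7 * x4 + 323.3 * x5
            + 1151.2 * x6 + 567.3 * x1**2 + 111.7 * x1 * x3 - 838.7 * x1 * x5
            - 86.4 * x1 * x6 + 19.2 * x2**2 - 156.9 * x2 * x4 + 102.7 * x2 * x5
            + 80.9 * x3**2 + 54.8 * x3 * x4 + 315.4 * x3 * x5 - 144.5 * x4**2
            - 193.1 * x4 * x6 + 269 * x5**2 - 8.9 * x6**2)


# Decision values [X1;X2;X3;X4;X5;X6] from the Cluster-5 row of Table 4.
reported_optimum = (4, 0, 0, 0, 5, 29)
predicted_cost = f5(*reported_optimum)
print(round(predicted_cost, 2))
```

The metamodel predicts a cost of roughly $20,362 at this point, close to the $20,345 listed for Cluster-5 in Table 4. The small gap is presumably due to rounding of the reported decision values.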

Table 4. Objective functions and decision variables' values

Method       Obj. Func. Value    Decision Variables [X1;X2;X3;X4;X5;X6]    Tested Obj. Func.
OptQuest     $21017              [3;0;0;0;3;29]                            -
Cluster-1    $21394              [3.76;0;6.17;4;8;50]                      $28570
Cluster-2    $24842              [3.4;0.9;1.4;1.95;6.83;50]                $26343
Cluster-3    $23994              [3.5;0;3.9;6;0;50]                        $26246
Cluster-4    $21888              [7.5;0;4;2.6;0;41]                        $25171
Cluster-5    $20345              [4;0;0;0;5;29]                            $21986

Step-10 Test the Optimum by Using Simulation Model: We tested the optimum of each cluster obtained in Step-9 by using the Arena simulation model. Note that the minimum total cost belongs to the Cluster-5 metamodel, as shown in Table-4. After running those decision variable values in our call center simulation model, the result is $21646 ("Tested Objective Function" column), which is close to the minimum total cost that OptQuest finds, $21017.

7. CONCLUSION

Simulation optimization techniques have developed significantly in the last two decades. In this study, we aim to contribute to the literature by proposing a new approach in which the K-means clustering algorithm is integrated into metamodeling. We tested the proposed approach by using a call center simulation model. In this example we used 500 scenarios created by the Arena OptQuest optimization tool, and then clustered the inputs into five groups. The clusters helped to create plausible metamodels with satisfactory and near-optimal R-Square and MSE values. This gives us an indication of the advantage of the proposed approach.

When the solution space is large and searching is costly, the proposed approach can be used as an alternative to heuristic search algorithms. However, to generalize the usefulness of this approach, we aim to study more cases in the future.

8. ACKNOWLEDGMENTS

The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of any affiliated organization or government.

9. REFERENCES

[1] Tekin, E. and Sabuncuoglu, I., 2004. "Simulation Optimization: A Comprehensive Review on Theory and Applications". IIE Transactions, 36:1067-1081.
[2] Law, M. and Kelton, W. D., 2001. Simulation Modeling and Analysis, McGrawHill, Second Edition, United States.
[3] Fu, M., 2002. "Optimization for Simulation: Theory vs. Practice", INFORMS Journal on Computing 14(3):192-215.
[4] Waziruddin, S., Brogan, D. C., Reynolds, P. F.: "Coercion through Optimization: A Classification of Optimization Techniques" Proceedings of the 2004 Fall Simulation Interoperability Workshop, Orlando, FL, September 2004.
[5] Carson, Y. and A. Maria: "Simulation Optimization: Methods and Applications" Proceedings of the 1997 Winter Simulation Conference, 1997.
[6] Rubinstein, R. Y. and A. Shapiro. 1993. Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization by the Score Function Method. New York: John Wiley & Sons.
[7] Fu, M.: "Simulation Optimization" Proceedings of the 2001 Winter Simulation Conference, 2001.
[8] R. H. Myers and D. C. Montgomery: Response Surface Methodology: Process and Product Optimization Using Designed Experiments, Wiley-Interscience, 2002.
[9] Montgomery, D.C. (1991) Design and Analysis of Experiments, John Wiley & Sons, New York, NY.
[10] Fu, M.C. (1994) Optimization via simulation: A review. Annals of Operations Research, 53, 199–247.
[11] Sargent, R.G.: "Research Issues in Metamodeling" Proceedings of the 1991 Winter Simulation Conference, 1991.
[12] Xu, R.: "Survey of Clustering Algorithms" IEEE Transactions on Neural Networks, Vol. 16, No. 3, pp. 645–678, May 2005.
[13] B. Everitt, S. Landau, and M. Leese, Cluster Analysis. London: Arnold, 2001.
[14] J. Hartigan, Clustering Algorithms. New York: Wiley, 1975.
[15] A. Jain, M. Murty, and P. Flynn, "Data clustering: A review," ACM Comput. Surv., vol. 31, no. 3, pp. 264–323, 1999.
[16] Kelton, W. D., Sadowski, R. P. and Sturrock, D. T. 2007. Simulation with Arena, McGrawHill, Fourth Edition, United States. pp 195-285.
[17] Minitab, http://www.minitab.com, [accessed Jan.2013]
[18] Arena Simulation Software, http://www.arenasimulation.com/, [accessed Jan.2013]
[19] Matlab, http://www.mathworks.com/products/matlab/, [accessed Jan.2013]

Biography

Emre İrfanoglu is pursuing his MSc in Naval Operations Research at the Institute of Naval Science and Engineering. He holds a BSc degree in Industrial Engineering, which he received in 2005 from the Turkish Naval Academy.

Ilker Akgun is an assistant professor at the Turkish Naval Academy. He completed his PhD at Istanbul Technical University and his MSc studies at Middle East Technical University in 2012 and 2002, respectively.

Murat Gunal is an assistant professor at the Turkish Naval Academy. He completed his PhD and MSc studies at Lancaster University, UK, in 2008 and 2000, respectively. The title of his PhD thesis is "Simulation Modelling for Performance Measurement in Hospitals". He has conducted research and worked in the simulation field for many years.