26 18MI31032 Shubham Shubhadarshi

SUMMER TRAINING REPORT
RESEARCH PROJECT | IIT KHARAGPUR
BY
SHUBHAM SHUBHADARSHI
(18MI31032)
DEPARTMENT OF MINING ENGINEERING

INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR
TABLE OF CONTENTS
Acknowledgment
Abstract
1. Introduction
2. Literature Review
3. Methods
3.1. Data Acquisition
3.2. Data Pre-processing
3.3. Data Processing
3.4. Association Rule Mining
4. Results and Discussion
5. Conclusion
6. References
Acknowledgment
I want to express my gratitude to Professor J. Maiti and Baneswar Sarker (Ph.D. Research
Scholar) from the Department of Industrial and Systems Engineering, IIT Kharagpur, for their
help in gathering the information needed to complete this project. I am highly thankful to the
Department of Industrial and Systems Engineering, IIT Kharagpur, for supporting this project to
successful completion.
A MACHINE LEARNING-BASED MODEL FOR DEVELOPMENT OF HAZARD
TRIANGLE USING ASSOCIATION RULE MINING
Shubham Shubhadarshi (a), Baneswar Sarker (b)

(a) Department of Mining Engineering, IIT Kharagpur
(b) Department of Industrial and Systems Engineering, IIT Kharagpur
Abstract
Prior knowledge of the hazardous accidents that occurred at industrial facilities in the
past helps management make decisions that improve the safety of the industrial workspace. This
project develops association rules for understanding the relationship between the components of
the hazard triangle and identifies the hazard triangles that occur most frequently in industrial
areas. A hazard is a combination of three components known as the hazard triangle: (i) Hazardous
Element (HE), (ii) Initiating Mechanisms (IM), and (iii) Target/Threat (T/T). More than ten
thousand accidents from industrial areas were analyzed, and the components of their hazard
triangles were categorized into 21 groups using the unsupervised topic modeling algorithm Latent
Dirichlet Allocation (LDA). This yielded 397 unique hazard triangles, of which 31 account for
50 percent of all accidents. Eighteen significant association rules were extracted based on
three criteria: support (S), confidence (C), and lift (L). For example, the results show that
accidents involving exposure to chemicals, metals, and materials lead to burns in the arm and
elbow (S=2.1%, C=90.3%, L=2.32). Similarly, accidents involving any fall, slip, or trip due to
machinery or a walkway mostly lead to fracture, wound, and muscle injury (S=1.5%, C=87.4%,
L=1.74). It is also found that accidents involving workers exposed to equipment or machinery
lead to thermal or electrical burns in the head, neck, or trunk (S=2.3%, C=84.0%, L=2.16). The
results of this project can be used to improve safety protocols and minimize frequently
occurring accidents.
1. Introduction
For any industry to thrive, its operations must be safe, reliable, and sustainable in the long
run. Hazards related to the industry must be identified so that the associated risks can be
assessed and reduced to a tolerable level. According to the domino hypothesis (Heinrich, 1959),
most industrial accidents arise from controllable hazardous conduct and conditions. According to
Reason's Swiss Cheese model, a hazard becomes an accident when a succession of events aligns,
forming a path from hazard to accident (Reason, 1990). Identifying and assessing such pathways
is critical for preventing accidents.
This hazard and risk analysis project aims to identify and assess hazards, the event
sequences that lead to hazards, and the risk associated with hazardous events. Several methods
are available to identify and assess hazards, ranging from simple qualitative procedures to
complex quantitative ones.
According to Clifton A. Ericson II (2005), a hazard is the combination of three
components known as the hazard triangle. The three components, illustrated in Fig. 1, are: (i)
Hazardous Element (HE): the primary hazardous source that causes the hazard (e.g., a
hazardous energy source). (ii) Initiating Mechanisms (IM): the events that cause the hazard to
arise. (iii) Target/Threat (T/T): a person or object susceptible to harm, together with the
damage or mishap outcomes and the projected damage and loss. Hazard components, particularly
IMs, may only activate a hazard in a specific time sequence, resulting in accidents. A threat
can be produced by a human or an asset that can cause a hazardous scenario. A hazard event is
caused by the interaction of several triggering events or persons, which might result in a
potential accident. The relations between hazard and mishap are shown in Fig. 2.
Fig. 1. Hazard Triangle according to Hazard Theory
Fig. 2. Hazard components and Hazard actuation
With the availability of extensive data and high-speed computing facilities, the demand for
data mining techniques that support rational management decision-making in information-related
applications has grown sharply. Data mining technologies include machine learning, cluster
analysis, regression analysis, and neural networks. A machine learning algorithm creates a set
of models, often in the form of decision rules, that emphasize the most important links between
the input features and the decision. In cluster analysis, similar items are grouped into one
cluster, and dissimilar ones are separated into other clusters depending on specific features.
Association rule mining, a machine learning technique, uses simple 'If-Then'
statements to examine frequently occurring patterns in a dataset or to identify intrinsic links
between independent and dependent variables. These rules apply to non-numeric, categorical data
and are generated purely by counting. An association rule consists of two parts: an
antecedent (if) and a consequent (then) (Kaur, 2014; Meenakshi, 2014). An antecedent is a data
item found in the dataset, whereas a consequent is a data item seen in conjunction with the
antecedent. Thus, the 'If-Then' statement has the form 'If condition Then conclusion.' These
rules are created by examining the dataset for frequent 'If-Then' patterns, and the most
relevant associations are then evaluated using the support, confidence, and lift criteria
(Gupta and Chauhan, 2013).
In this project, a machine learning model has been developed to identify frequent
hazard triangles in different industrial areas by analyzing 10351 accidents that happened in the
US between January 2020 and February 2021. A topic modeling algorithm was used to cluster
similar accidents into different categories. Association rule mining was then applied to find
frequently occurring hazard triangles using the criteria of support, confidence, and lift.
2. Literature Review
Several statistical models have been developed to investigate the factors that lead to
hazardous events. Khanzode et al. (2012) conducted a generational assessment of
accident-causation theories. Among the pioneering efforts in accident data analysis are those of
Cooper (2000) and Maher and Summersgill (1996). In recent years, researchers have become more
interested in studying accident data using data mining techniques and algorithms (Arunraj et
al., 2013; Cheng et al., 2013; Verma et al., 2014).
Association rule mining aims to discover frequent patterns, interesting correlations,
associations, or causal structures among groups of objects in transaction databases, relational
databases, or other data repositories (Jaiswal and Agarwal, 2012). Association rules are widely
employed in various fields, including communications networks, risk and market management, and
inventory control (Bala et al., 2010; Adewole et al., 2014; Agarwal and Mittal, 2019;
Chakraborty et al., 2022).
The apriori algorithm is the favored approach for association rule mining (Agrawal and
Ramakrishnan, 1994). Aside from the apriori algorithm, other algorithms for mining association
rules include the aprioriTid and aprioriHybrid algorithms (Agrawal and Ramakrishnan, 1994), the
Eclat algorithm (Zaki, 2000), the FP-Growth algorithm (Han et al., 2000), and the continuous
association rule mining algorithm (CARMA) (Hidber, 1999). Although there has been considerable
research on association rule mining, the literature on its application to safety data analysis
is limited.
3. Methods
3.1. Data Acquisition
The dataset used in this project is an open-source dataset containing details of
accidents that occurred in US industries between January 2015 and February 2021. The dataset
contains more than sixty thousand accidents. For this project, only the recent accidents
between January 2020 and February 2021 were considered, accounting for 10351 accidents. The
dataset contains 25 features, of which four (SourceTitle, EventTitle, NatureTitle, and Part of
Body Title) are used to develop the hazard triangles of the accidents. These features
correspond directly to the components of the hazard triangle: SourceTitle corresponds to
Hazardous Element, EventTitle to Initiating Mechanism, NatureTitle to Threat, and Part of Body
Title to Target.
3.2. Data Pre-processing

The four selected features have many unique values: Hazardous Elements has 676,
Initiating Mechanisms has 254, Threat has 118, and Target has 101 unique values. If every
unique value is treated as a separate component of the hazard triangle, the number of possible
hazard triangles is too high. Hence, similar values need to be categorized into groups to
reduce the number of unique elements.
In Natural Language Processing (NLP), text preprocessing is the practice of cleaning and
preparing text data for analysis. Here it was carried out with the open-source software library
spaCy. The text was broken into small tokens of lemmatized words. From the tokens, a dictionary
was built that gives each token a unique ID number, which was then used to create a corpus, or
Bag of Words, representing the frequency of the tokens.
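The dictionary and Bag of Words steps can be illustrated with a minimal sketch. It uses only the Python standard library: the simple lowercase split is a stand-in for spaCy lemmatization, and the function names and sample strings are hypothetical, not taken from the project code.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on non-letters (stand-in for lemmatization)."""
    return [tok for tok in re.split(r"[^a-z]+", text.lower()) if tok]


def build_dictionary(token_lists):
    """Assign each distinct token an integer ID in order of first appearance."""
    token_to_id = {}
    for tokens in token_lists:
        for tok in tokens:
            token_to_id.setdefault(tok, len(token_to_id))
    return token_to_id


def to_bag_of_words(tokens, token_to_id):
    """Return sorted (token_id, count) pairs for one document."""
    counts = Counter(token_to_id[tok] for tok in tokens)
    return sorted(counts.items())


# Hypothetical feature values, for illustration only.
documents = ["Floor, walkway, ground", "Ladder", "Walkway or floor surface"]
token_lists = [tokenize(doc) for doc in documents]
token_to_id = build_dictionary(token_lists)
corpus = [to_bag_of_words(tokens, token_to_id) for tokens in token_lists]
```

Each entry of `corpus` lists which dictionary IDs occur in a feature value and how often, which is the input form a topic model expects.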
3.3. Data Processing

The unsupervised machine learning model was then trained on the data. The Latent
Dirichlet Allocation (LDA) technique from the Gensim package and Mallet's implementation
(through Gensim) were utilized. Mallet's LDA implementation is known to run faster and to
provide better topic separation. LDA treats each document as a mixture of topics in specific
proportions and each topic as a mixture of keywords in specific proportions. Once the
algorithm is given the number of topics, it rearranges the topic allocation within the
documents and the keyword allocation within the topics to obtain a good topic-keyword
composition (Blei et al., 2003).
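In symbols, the two sets of proportions described above can be written as follows. The notation is assumed here for illustration and does not appear in the original report: K is the number of topics, θ holds the topic proportions of a document d, and φ holds the keyword proportions of a topic k.

```latex
p(w \mid d) = \sum_{k=1}^{K} \theta_{d,k}\,\phi_{k,w},
\qquad \sum_{k=1}^{K} \theta_{d,k} = 1,
\qquad \sum_{w} \phi_{k,w} = 1
```

The probability of seeing word w in document d is thus a weighted sum over topics, and training adjusts θ and φ so that this mixture explains the observed word counts.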
The selected LdaMulticore algorithm uses multiple CPU cores to parallelize and
accelerate model training. The model was given the previously created corpus and dictionary as
input and set to perform up to 50 iterations over the corpus when optimizing the model
parameters (the default value). The number of topics was set to five, and the number of workers
to four (the number of cores in the processor). The number of passes was set to 10, meaning
that the model traverses the corpus ten times during training.
After training the model, the next step is to assess it. A coherence score can be
computed once the topics have been formed. The score measures the degree of semantic similarity
between the highest-scoring words in each topic. By changing the number of topics, a coherence
score can be obtained for each setting. The coherence score can be calculated using various
measures (C_v, C_p, C_uci, C_umass, C_npmi, C_a). C_v was chosen because it is easy to
implement and evaluate. The C_v coherence score ranges from 0 (complete incoherence) to
1 (complete coherence). According to John McLevey, values above 0.5 are considered quite
good. Based on the coherence scores, the 676 Hazardous Elements were clustered into seven
categories, the 254 Initiating Mechanisms into six categories, the 118 Threats into five
categories, and the 101 Targets into three categories. Details of the categorization are
shown in Table 1.
Component of          Number of        Coherence  Number of   Name of the Categories
Hazard Triangle       Unique Elements  Score      Categories
Hazardous Element     676              0.65       7           Conveyor, Truck, and Other Machinery
                                                              Fuel, Petroleum and Natural Gas
                                                              Powered Machinery and Wooden objects
                                                              Chemical, Metals, and Materials
                                                              Injured or Ill Worker and Vehicle Machinery
                                                              Non-Powered Hand Tool Machinery
                                                              Machinery, Appliances, and Equipment for Heating, Logging Cleaning
Initiating Mechanism  254              0.58       6           Explosion, Fire and Bodily Exertions
                                                              Any Type of Fall, Slip and Trip
                                                              Structure or Equipment Collapse
                                                              Collision in a Roadway by Anything
                                                              Struck by Object, Equipment or Vehicle
                                                              Caught or Exposed to Equipment, Machinery
Threat                118              0.63       5           Allergenic, Toxic and Noxious Poisoning
                                                              Fracture, Wound and Muscle Injuries
                                                              Thermal and Electrical Burns
                                                              Chemical Burns and Corrosions
                                                              Intracranial, Spinal Injuries and Disorders
Target                101              0.75       3           Arm, Elbow and Wrist
                                                              Leg, Foot, and Lower Body
                                                              Head, Trunk, and Neck

Table 1. Classification of Components of Hazard Triangle
3.4. Association Rule Mining
Association rules are primarily used to discover 'interesting' hidden links among
attributes in a large database. A standard association rule is written in the form X → Y,
where X is the antecedent and Y is the consequent, indicating that X occurs together with Y in
the database with at least a minimum level of significance. Each rule may have several items in
its antecedent and consequent, i.e., each is a set of items. The apriori algorithm finds
frequent itemsets and generates association rules from them. Let I = {i1, i2, i3, …, in} be the
set of all items, and let N be the total number of records in the dataset. Let X ⊆ I and Y ⊆ I
be two disjoint, non-empty subsets of I, i.e., X ≠ ∅, Y ≠ ∅, and X ∩ Y = ∅.
The two main criteria for association rules are support and confidence. Support
indicates how frequently the items appear together in the dataset, whereas confidence is the
proportion of records containing the antecedent in which the 'If-Then' statement is found to be
true. Association rule mining discovers correlations and rules by searching the data for
frequent 'If-Then' patterns. The rules must satisfy user-specified minimum levels of support
and confidence (Sadoyan et al., 2006).
Support (S) and Confidence (C) are expressed as follows:
$$S(X \rightarrow Y) = P(X \cup Y) = \frac{N(X \cup Y)}{N} \tag{1}$$
$$C(X \rightarrow Y) = \frac{P(X \cup Y)}{P(X)} = \frac{S(X \cup Y)}{S(X)} \tag{2}$$
Zhang and Zhang (2002) developed a third assessment metric, 'lift' or 'interest,' to
improve the rule-generating technique. Lift (L) is the ratio of the observed frequency of
co-occurrence of the antecedent and the consequent to the frequency expected if the two were
independent. Lift thus measures the correlation between X and Y. It is expressed as follows:
$$L(X \rightarrow Y) = \frac{P(X \cup Y)}{P(X)\,P(Y)} = \frac{S(X \cup Y)}{S(X)\,S(Y)} \tag{3}$$
According to Lee et al. (2012), when:
● L = 1, there is no correlation between the antecedent and the consequent.
● L > 1, there is a positive correlation between the antecedent and the consequent.
● L < 1, there is a negative correlation between the antecedent and the consequent.
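The three measures can be computed directly by counting, as in the following minimal sketch. The toy records and function names are hypothetical and serve only to illustrate the definitions of support, confidence, and lift; each record is treated as a set of item codes.

```python
def support(records, itemset):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= record for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(records, both) / support(records, antecedent)


def lift(records, antecedent, consequent):
    """Confidence divided by the support of the consequent."""
    return confidence(records, antecedent, consequent) / support(records, consequent)


# Hypothetical records, for illustration only (not the project data).
records = [
    {"HE4", "IM6", "TA1", "TH3"},
    {"HE4", "IM6", "TA1", "TH3"},
    {"HE4", "IM6", "TA1", "TH2"},
    {"HE3", "IM2", "TA2", "TH2"},
    {"HE3", "IM2", "TA2", "TH2"},
    {"HE1", "IM2", "TA3", "TH3"},
    {"HE1", "IM5", "TA3", "TH2"},
    {"HE6", "IM6", "TA1", "TH5"},
]
X, Y = {"HE4", "IM6", "TA1"}, {"TH3"}
s = support(records, X | Y)
c = confidence(records, X, Y)
l = lift(records, X, Y)
```

In this toy data the rule X → Y holds in 2 of 8 records (support 0.25) and in 2 of the 3 records containing X (confidence about 0.67). Because TH3 appears in 3 of 8 records overall, the lift is about 1.78, a positive correlation.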
4. Results and Discussion
Accident data from January 2020 to February 2021 is used for this project. In total, 10351
accidents were analyzed. Table 2 shows the coding and frequency of information fields for
relevant components of the hazard triangle.
Component of          Code  Frequency  Information Fields (Items)
Hazard Triangle
Hazardous Elements    HE1   2323       Conveyor, Truck and Other Machinery
                      HE2   1060       Fuel, Petroleum and Natural Gas
                      HE3   1965       Powered Machinery and Wooden objects
                      HE4   1190       Chemical, Metals and Materials
                      HE5   1010       Injured or Ill Worker and Vehicle Machinery
                      HE6   1378       Non Powered and Hand Tool Machinery
                      HE7   1425       Machinery, Appliances, and Equipment for Heating, Logging Cleaning
Initiating Mechanisms IM1   350        Explosion, Fire and Bodily Exertions
                      IM2   3801       Any Type of Fall, Slip, and Trip
                      IM3   385        Structure or Equipment Collapse
                      IM4   637        Collision in a Roadway by Anything
                      IM5   1668       Struck by Object, Equipment or Vehicle
                      IM6   3510       Caught or Exposed to Equipment, Machinery
Threats               TH1   174        Allergenic, Toxic and Noxious Poisoning
                      TH2   5210       Fracture, Wound, and Muscle Injuries
                      TH3   4026       Thermal and Electrical Burns
                      TH4   146        Chemical Burns and Corrosions
                      TH5   795        Intracranial, Spinal Injuries and Disorders
Targets               TA1   2469       Arm, Elbow, and Wrist
                      TA2   4131       Leg, Foot, and Lower Body
                      TA3   2751       Head, Trunk, and Neck

Table 2. Codification and frequency distribution of each component of the hazard triangle.
After clustering the dataset's features, each value was substituted with the title of its
topic, which yields a hazard triangle for each accident. In total, 10351 hazard triangles were
formed, but not every hazard triangle is unique: 397 unique hazard triangles were found. Of
these, 31 hazard triangles account for 50 percent of all accidents, and 105 hazard triangles
account for 80 percent of all accidents. A Pareto chart showing the cumulative percentage of
accidents, up to 80 percent, is shown in Fig. 3.
Fig. 3. Pareto chart showing the 105 most frequent hazard triangles.
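The Pareto counts quoted above (how many of the most frequent hazard triangles are needed to cover a given share of accidents) can be obtained by sorting triangle frequencies and accumulating them. The sketch below assumes each accident is represented as a (HE, IM, TH, TA) tuple; the function name and sample data are hypothetical.

```python
from collections import Counter


def triangles_needed(triangles, share):
    """Smallest number of most-frequent triangles covering `share` of all accidents."""
    counts = sorted(Counter(triangles).values(), reverse=True)
    total = sum(counts)
    covered = 0
    for rank, count in enumerate(counts, start=1):
        covered += count
        if covered >= share * total:
            return rank
    return len(counts)


# Hypothetical accidents, for illustration only: 10 accidents, 4 unique triangles.
triangles = (
    [("HE1", "IM2", "TH2", "TA2")] * 5
    + [("HE3", "IM2", "TH2", "TA2")] * 3
    + [("HE4", "IM6", "TH3", "TA3")] * 1
    + [("HE6", "IM6", "TH3", "TA1")] * 1
)
needed_for_half = triangles_needed(triangles, 0.5)
needed_for_80 = triangles_needed(triangles, 0.8)
```

In this toy data a single triangle already covers half of the accidents and two triangles cover 80 percent, which mirrors the concentration reported for the real dataset.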
As the Pareto chart shows, 31 hazard triangles account for 50 percent of accidents. This
means that eliminating these 31 hazard triangles could reduce accidents by up to 50 percent.
Among these 31 hazard triangles, the frequencies of the Hazardous Elements HE1, HE2, HE3, HE4,
HE5, HE6, and HE7 are 7, 4, 6, 2, 4, 4, and 4, respectively. The frequencies of the Initiating
Mechanisms IM1, IM2, IM3, IM4, IM5, and IM6 are 0, 13, 3, 0, 0, and 15, respectively. The
frequencies of the Threats TH1, TH2, TH3, TH4, and TH5 are 0, 15, 15, 0, and 1, respectively.
The frequencies of the Targets TA1, TA2, and TA3 are 9, 11, and 11, respectively.
The distribution of Hazardous Elements among the hazard triangles that account for the top
50 percent of accidents is fairly even: all seven types of Hazardous Elements appear, with
between two and seven triangles each. The distribution of Targets is also fairly even (9, 11,
and 11 triangles), meaning that no body part is markedly more likely than the others to be the
target during an accident. However, these accidents are initiated mainly by two Initiating
Mechanisms, IM2 (Any Type of Fall, Slip, and Trip) and IM6 (Caught or Exposed to Equipment,
Machinery), and they mainly cause two Threats, TH2 (Fracture, Wound, and Muscle Injuries) and
TH3 (Thermal and Electrical Burns).
Hazard actuation can be prevented by interrupting the sequence of hazardous element,
initiating mechanism, and threat/target. The 31 hazard triangles that account for fifty percent
of accidents in industrial areas are dominated by two initiating mechanisms: Any Type of Fall,
Slip, and Trip, and Caught or Exposed to Equipment, Machinery. Management should therefore
decide how to eliminate these initiating mechanisms to reduce the chance of accidents.
Likewise, two Threats dominate these triangles: Fracture, Wound, and Muscle Injuries, and
Thermal and Electrical Burns. Safety arrangements that prevent these threats could reduce
accidents by up to fifty percent.
To find the association rules, minimum levels of support and confidence must be
provided. There is no defined criterion for determining the support and confidence thresholds.
In this study, the support and confidence thresholds were 1% and 70%, respectively. This means
that rules with support below the minimum support, confidence below the minimum confidence, or
a lift value not greater than one were not considered. The association rules generated with
three antecedents are shown in Table 3.
Sl. No.  Antecedents     Consequents Support  Confidence  Lift
1        HE4, IM6, TA1   TH3         0.021    0.903       2.32
2        HE3, IM2, TA1   TH2         0.015    0.874       1.74
3        HE5, IM6, TA2   TH5         0.018    0.863       11.24
4        HE7, IM6, TA1   TH3         0.017    0.840       2.16
5        HE4, IM6, TA3   TH3         0.023    0.840       2.16
6        HE1, IM2, TA1   TH2         0.013    0.825       1.64
7        HE3, IM6, TA3   TH3         0.011    0.824       2.12
8        HE6, IM6, TA1   TH3         0.019    0.807       2.08
9        HE2, IM2, TA3   TH2         0.012    0.795       1.58
10       HE7, IM6, TA3   TH3         0.019    0.788       2.03
11       HE6, IM6, TA3   TH3         0.021    0.779       2.00
12       HE1, IM6, TA1   TH3         0.019    0.756       1.94
13       HE7, IM2, TA2   TH2         0.018    0.755       1.50
14       HE3, IM2, TA2   TH2         0.039    0.745       1.48
15       HE3, IM2, TA3   TH2         0.023    0.738       1.47
16       HE2, IM2, TA2   TH2         0.019    0.735       1.46
17       HE1, IM2, TA2   TH2         0.042    0.729       1.45
18       HE7, IM2, TA3   TH2         0.011    0.711       1.41

Table 3. Association rule for components of hazard triangle with three antecedents.
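As a consistency check, the lift values can be recomputed from the reported confidence and the Threat frequencies of Table 2, since dividing confidence by the overall probability of the consequent gives lift. The sketch below does this for the first three rules; the variable and function names are hypothetical, while the numbers are copied from Tables 2 and 3.

```python
TOTAL_ACCIDENTS = 10351
# Frequencies of the consequent Threats, from Table 2.
THREAT_FREQUENCY = {"TH2": 5210, "TH3": 4026, "TH5": 795}

# (rule number, consequent, confidence, reported lift), from Table 3.
RULES = [
    (1, "TH3", 0.903, 2.32),
    (2, "TH2", 0.874, 1.74),
    (3, "TH5", 0.863, 11.24),
]


def lift_from_confidence(confidence, consequent):
    """Lift = confidence / P(consequent), with P estimated from Table 2."""
    return confidence / (THREAT_FREQUENCY[consequent] / TOTAL_ACCIDENTS)


recomputed = {
    number: round(lift_from_confidence(conf, threat), 2)
    for number, threat, conf, _ in RULES
}
```

The recomputed values agree with the reported lifts to two decimals. The unusually high lift of rule 3 follows from TH5 being rare overall (795 of 10351 accidents), so a confidence of 0.863 is far above its base rate.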
Association rules with higher lift values (greater than one) are stronger and more
interesting. Eighteen association rules have three antecedents (HE, IM, TA). Only rules with
exactly these three antecedents were considered, because the antecedents together with the
consequent (TH) form a hazard triangle. The first rule has L=2.32, with antecedents HE4, IM6,
TA1 and consequent TH3 (S=2.1%, C=90.3%), which signifies accidents in which a person was
exposed to chemicals, metals, or materials, resulting in a thermal or electrical burn in the
arm or elbow. The second rule has L=1.74, with antecedents HE3, IM2, TA1 and consequent TH2
(S=1.5%, C=87.4%), which signifies that a fall, slip, or trip involving powered machinery or a
wooden object resulted in a fracture or wound in the arm or elbow. The fourth rule has L=2.16,
with antecedents HE7, IM6, TA1 and consequent TH3 (S=1.7%, C=84.0%), which signifies that an
arm or hand caught in appliances used for heating or cleaning generally suffers thermal or
electrical burns. The other association rules can be interpreted in the same way.
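The rule selection described in this section, with thresholds of 1% support and 70% confidence and a lift greater than one, can be sketched for rules of the fixed form (HE, IM, TA) → TH. This is a direct counting approach rather than the apriori algorithm itself, and the function name and sample records are hypothetical.

```python
from collections import Counter


def mine_triangle_rules(records, min_support=0.01, min_confidence=0.70):
    """Keep rules (HE, IM, TA) -> TH that pass both thresholds and have lift > 1."""
    n = len(records)
    antecedent_counts = Counter((he, im, ta) for he, im, th, ta in records)
    threat_counts = Counter(th for _, _, th, _ in records)
    rule_counts = Counter(((he, im, ta), th) for he, im, th, ta in records)

    rules = []
    for (antecedent, threat), count in rule_counts.items():
        support = count / n
        confidence = count / antecedent_counts[antecedent]
        lift = confidence / (threat_counts[threat] / n)
        if support >= min_support and confidence >= min_confidence and lift > 1:
            rules.append((antecedent, threat, support, confidence, lift))
    # Strongest rules first, ordered by confidence as in Table 3.
    return sorted(rules, key=lambda rule: rule[3], reverse=True)


# Hypothetical (HE, IM, TH, TA) records, for illustration only.
records = (
    [("HE4", "IM6", "TH3", "TA1")] * 9
    + [("HE4", "IM6", "TH2", "TA1")] * 1
    + [("HE3", "IM2", "TH2", "TA2")] * 6
    + [("HE3", "IM2", "TH3", "TA2")] * 4
)
rules = mine_triangle_rules(records, min_support=0.01, min_confidence=0.70)
```

With these toy records only the rule (HE4, IM6, TA1) → TH3 survives, because the other candidate rules have a confidence of 0.6 or less.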
5. Conclusion
This project aimed to develop hazard triangles from past accidents and identify
patterns in the accidents so that safety measures can be upgraded to prevent or minimize such
incidents in the future. A structured methodology was used to build the hazard triangles. The
results indicate that out of 397 unique hazard triangles, only 31 account for 50 percent of all
accidents, and 105 account for 80 percent of all accidents. Two initiating mechanisms are the
most frequent: Fall, Slip, or Trip and Caught or Exposed to Equipment or Machinery. They result
mainly in two types of Threats: Fracture, Wound, or Muscle Injury and Thermal or Electrical
Burns. Eighteen association rules were found using the apriori algorithm of association rule
mining. These rules will help the management of industrial facilities to understand different
accidents and make better decisions concerning safety. One limitation of the study lies in the
data used. Text clustering works better on longer texts, but the texts clustered in this
project are short; some are only one word. Thus, the clustering accuracy is not the best, but
it is within the range of average to good. In the future, different topic modeling algorithms
may cluster the data with better accuracy. Furthermore, different variants of the apriori
algorithm or the other algorithms outlined in the literature review can be used to establish
association rules for safety-related occurrences, and their results can be compared with those
of the apriori method. Nonetheless, the current project allows learning from past accident
experiences.
6. References
Adewole, K. S., Akintola, A. G., Ajiboye, A. R., Abdulsalam, K. S., 2014. Frequent pattern and
association rule mining from inventory database using apriori algorithm. African Journal
of Computing & ICT, 7(3), 35-41.
Agarwal, R., Mittal, M., 2019. Inventory classification using multi-level association rule mining.
International Journal of Decision Support System Technology, 11(2), 1-9.
Agrawal, R., Ramakrishnan, S., 1994. Fast algorithms for mining association rules. International
Conference on Very Large Data Bases, pp. 1–32.
Arunraj, N.S., Mandal, S., Maiti, J., 2013. Modeling uncertainty in risk assessment: An
integrated approach with fuzzy set theory and Monte Carlo simulation. Accident
Analysis and Prevention, 55, 242–255.
Bala, P. K., Sural, S., Banerjee, R. N., 2010. Association rule for purchase dependence in multi-
item inventory. Production Planning and Control, 21(3), 274-285.
Blei, D. M., Ng, A. Y., Jordan, M. I., 2003. Latent Dirichlet Allocation. Journal of Machine
Learning Research, 3, 993-1022.
Cheng, C.-W., Yao, H.-Q., Wu, T.-C., 2013. Applying data mining techniques to analyze the
causes of major occupational accidents in the petrochemical industry. Journal of Loss
Prevention in the Process Industry.
Chakraborty, S., Mallick, B., Chakraborty, S., 2022. Mining of association rules for the
treatment of dental diseases. Journal of Decision Analytics and Intelligent Computing,
2(1), 1-11.
Ericson, C. A., II, 2005. Hazard Analysis Techniques for System Safety. John Wiley &
Sons, Hoboken, New Jersey, USA.
Cooper, M.D., 2000. Towards a model of safety culture. Safety Science, 36, 111–136.
Gupta, D., Chauhan, A. S., 2013. Mining association rules from infrequent itemsets: A survey.
International Journal of Innovative Research in Science, Engineering and Technology,
2(10), 5801-5808.
Han, J., Pei, J., Yin, Y., 2000. Frequent pattern tree: design and construction. SIGMOD ’00
Proceedings of the 2000 ACM SIGMOD International Conference on Management of
Data, Dallas, pp. 1–12.
Heinrich, H., 1959. Industrial Accident Prevention. McGraw-Hill, New York.
Hidber, C., 1999. Online association rule mining. SIGMOD, Association for Computing
Machinery, Philadelphia, PA, pp. 145–156.
Jaiswal, V., Agarwal, J., 2012. The evolution of the association rules. International Journal of
Modeling and Optimization, 2(6), 726-729.
Kaur, G., 2014. Association rule mining: A survey. International Journal of Computer Science
and Information Technologies, 5(2), 2320-2324.
Khanzode, V.V., Maiti, J., Ray, P.K., 2012. Occupational injury and accident research: a
comprehensive review. Safety Science, 50 (5), 1355–1367.
Lee, C., Song, B., Park, Y., 2012. Design convergent product concepts based on functionality:
an association rule mining and decision tree approach. Expert Systems with
Applications, 39 (10), 9534–9542.
Maher, M.J., Summersgill, I., 1996. A comprehensive methodology for the fitting of predictive
accident models. Accident Analysis and Prevention, 28 (3), 281–296.
Meenakshi, R., 2014. A review on association rule mining. International Journal of Advance
Research in Science and Engineering, 3(5), 299-303.
Reason, J., 1990. Human error. BMJ, New York: Cambridge University Press, vol. 320, pp.
768–770.
Sadoyan, H., Zakarian, A., Mohanty, P., 2006. Data mining algorithm for manufacturing
process control. International Journal of Advanced Manufacturing Technology, 28, 342-
350.
Verma, A., Khan, S. D., Maiti, J., Krishna, O. B., 2014. Identifying patterns of safety-related
incidents in a steel plant using association rule mining of incident investigation reports.
Safety Science, 70, 89–98.
Zaki, Mohammed J., Parthasarthy, S., Ogihara, M., 1997. Parallel algorithms for discovery of
association rules. Data Mining and Knowledge Discovery, 373, 343–373.
Zhang, C., Zhang, S., 2002. Association Rule Mining: Models and Algorithms. Springer, New
York.
