Titanic

The document discusses the Apriori algorithm for association rule mining. It describes how Apriori uses a bottom-up approach to find frequent itemsets in transaction data by generating and testing candidate itemsets in multiple passes over the data. The document also notes some inefficiencies of Apriori and applies the algorithm to Titanic passenger data to generate rules related to survival. Redundant rules are pruned before the results are visualized.

Original Description:

Titanic program in RStudio

Uploaded by

Rohini Bhosale

#adjust this line to the folder where you will be saving this code and the data file

load("~/R_code_&_data/titanic.raw.rdata")

head(titanic.raw)

attach(titanic.raw)

install.packages(c("Matrix", "arules"))  # run once; arules depends on Matrix

library(arules)

# find association rules with default settings

rules = apriori(titanic.raw)

inspect(rules)

# In computer science and data mining, Apriori is a classic algorithm for learning association rules. Apriori
# is designed to operate on databases containing transactions. As is common in association rule mining,
# given a set of itemsets, the algorithm attempts to find subsets which are common to at least a minimum
# number C of the itemsets. Apriori uses a "bottom up" approach, where frequent subsets are extended
# one item at a time (a step known as candidate generation), and groups of candidates are tested against
# the data. The algorithm terminates when no further successful extensions are found.

# Apriori uses breadth-first search and a tree structure to count candidate item sets efficiently. It
# generates candidate item sets of length k from item sets of length k-1. Then it prunes the candidates
# which have an infrequent sub pattern. According to the downward closure lemma, the candidate set
# contains all frequent k-length item sets. After that, it scans the transaction database to determine
# frequent item sets among the candidates.
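
# The level-wise search described above can be sketched in base R. This is a minimal illustration only, not
# how arules implements apriori(): the toy transactions, the min_count threshold and the helper
# count_support() are all hypothetical names and data chosen for the example.

```r
# Toy transactions (hypothetical data, not the Titanic set)
transactions <- list(
  c("bread", "milk"),
  c("bread", "butter", "milk"),
  c("bread", "butter"),
  c("butter", "milk"),
  c("bread", "butter", "milk")
)
min_count <- 3  # minimum number of supporting transactions (assumed threshold)

# Number of transactions that contain every item of `itemset`
count_support <- function(itemset) {
  sum(vapply(transactions, function(t) all(itemset %in% t), logical(1)))
}

# Level 1: frequent single items
items <- sort(unique(unlist(transactions)))
frequent <- Filter(function(s) count_support(s) >= min_count, as.list(items))
all_frequent <- frequent

while (length(frequent) > 1) {
  k <- length(frequent[[1]]) + 1
  keys <- vapply(frequent, paste, character(1), collapse = ",")

  # Candidate generation: unions of two frequent (k-1)-itemsets that have exactly k items
  candidates <- list()
  for (i in seq_along(frequent)) {
    for (j in seq_along(frequent)) {
      if (i < j) {
        u <- sort(union(frequent[[i]], frequent[[j]]))
        if (length(u) == k) candidates[[paste(u, collapse = ",")]] <- u
      }
    }
  }

  # Prune: every (k-1)-subset of a candidate must itself be frequent (downward closure)
  candidates <- Filter(function(cand) {
    subs <- combn(cand, k - 1, simplify = FALSE)
    all(vapply(subs, function(s) paste(s, collapse = ",") %in% keys, logical(1)))
  }, candidates)

  # Scan the data and keep the candidates that meet the threshold
  frequent <- unname(Filter(function(s) count_support(s) >= min_count, candidates))
  all_frequent <- c(all_frequent, frequent)
}

print(vapply(all_frequent, paste, character(1), collapse = ","))
```

# The loop stops once a level yields fewer than two frequent itemsets, because no larger candidate can then
# be formed by joining two of them.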

# Apriori, while historically significant, suffers from a number of inefficiencies or trade-offs, which have
# spawned other algorithms. Candidate generation generates large numbers of subsets (the algorithm
# attempts to load up the candidate set with as many as possible before each scan). Bottom-up subset
# exploration (essentially a breadth-first traversal of the subset lattice) finds any maximal subset S only
# after all 2^|S| - 1 of its proper subsets.

# Next, we set rhs=c("Survived=No", "Survived=Yes") in appearance to make sure that only
# "Survived=No" and "Survived=Yes" will appear in the rhs of rules.
# rules with rhs containing "Survived" only

rules <- apriori(titanic.raw,
                 parameter = list(minlen = 2, supp = 0.005, conf = 0.8),
                 appearance = list(rhs = c("Survived=No", "Survived=Yes"), default = "lhs"),
                 control = list(verbose = FALSE))

rules.sorted <- sort(rules, by="lift")

inspect(rules.sorted)
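
# The support, confidence and lift that inspect() reports can be recomputed by hand. The sketch below does
# this for the rule {Class=2nd, Age=Child} => {Survived=Yes} using R's built-in Titanic contingency table.
# It assumes that titanic.raw is the one-row-per-person form of that table; if the data file differs, the
# numbers will differ too.

```r
# Built-in 4-way contingency table: Class x Sex x Age x Survived
tab <- datasets::Titanic

n      <- sum(tab)                            # everyone on board
n_lhs  <- sum(tab["2nd", , "Child", ])        # people matching the lhs
n_both <- sum(tab["2nd", , "Child", "Yes"])   # people matching lhs and rhs
n_rhs  <- sum(tab[, , , "Yes"])               # people matching the rhs

support    <- n_both / n            # share of all records covered by the rule
confidence <- n_both / n_lhs        # P(rhs | lhs)
lift       <- confidence / (n_rhs / n)  # confidence relative to the base survival rate

print(round(c(support = support, confidence = confidence, lift = lift), 3))
```

# A lift above 1 means the lhs makes the rhs more likely than it is overall, which is why the rules are
# sorted by lift.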

# Pruning Redundant Rules

# In the above result (the output of inspect(rules.sorted)), rule 2 provides no extra knowledge in addition
# to rule 1, since rule 1 tells us that all 2nd-class children survived.

# Generally speaking, when a rule (such as rule 2) is a super rule of another rule (such as rule 1) and the
# former has the same or a lower lift, the former rule (rule 2) is considered to be redundant. Below we
# prune redundant rules.

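# find redundant rules

# sparse = FALSE asks for a dense logical matrix so that the NA assignment below works
# (recent arules versions return a sparse matrix by default)
subset.matrix <- is.subset(rules.sorted, rules.sorted, sparse = FALSE)

subset.matrix[lower.tri(subset.matrix, diag = TRUE)] <- NA

# a rule is redundant if at least one higher-ranked rule is a subset of it
redundant <- colSums(subset.matrix, na.rm = TRUE) >= 1

which(redundant)

# [1] 2 4 7 8

# remove redundant rules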

rules.pruned <- rules.sorted[!redundant]

inspect(rules.pruned)
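
# How the masking step flags redundant rules can be seen on a toy matrix in base R. The 3-rule matrix
# below is hypothetical: m[i, j] is TRUE when rule i is a subset of rule j, and the rules are assumed to be
# sorted by decreasing lift, as rules.sorted is.

```r
# Hypothetical subset matrix for three rules sorted by lift
m <- matrix(c(TRUE,  TRUE,  FALSE,
              FALSE, TRUE,  FALSE,
              FALSE, FALSE, TRUE),
            nrow = 3, byrow = TRUE,
            dimnames = list(paste0("rule", 1:3), paste0("rule", 1:3)))

# Drop the diagonal (every rule is a subset of itself) and the lower triangle,
# so that only comparisons against higher-ranked rules remain
m[lower.tri(m, diag = TRUE)] <- NA

# A column with at least one TRUE has a higher-ranked rule contained in it
redundant <- colSums(m, na.rm = TRUE) >= 1

print(which(redundant))
```

# Here only rule2 is flagged, because rule1 is a subset of it and ranks above it.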

# Visualizing Association Rules

# Package arulesViz supports visualization of association rules with scatter plot, balloon plot, graph,
# parallel coordinates plot, etc.
# grid ships with base R, so it does not need to be installed
install.packages(c("arules", "scatterplot3d", "vcd", "seriation", "igraph", "cluster", "TSP", "gclus", "colorspace"))

install.packages("arulesViz")

library(arulesViz)

plot(rules.pruned)

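library(readxl)

Prof <- read_excel("Prof.xlsx", sheet = "Sheet1",
                   col_types = c("date", "numeric", "text",
                                 "text", "text", "text", "text", "text",
                                 "numeric", "numeric"))

# read.transactions() parses delimited text files, not .xlsx, so the sheet is exported to CSV first
# (Prof.csv is an assumed file name; skip = 1 drops the header row)
write.csv(Prof, "Prof.csv", row.names = FALSE)

aa <- read.transactions("Prof.csv", format = "basket", sep = ",", skip = 1)

inspect(aa)

# Path to Prof.xlsx: "C:/Users/Rohan Bhosale/Documents/pro2/proo/Prof.xlsx"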
