
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/348294788

Data mining

Presentation · January 2021


DOI: 10.13140/RG.2.2.20064.76803


1 author:

Hind el Atmani
Shandong University of Finance and Economics

All content following this page was uploaded by Hind el Atmani on 24 March 2022.



Data mining
HIND

OUTLINE

➤ introduction
➤ definition
➤ data mining used to
➤ vector
➤ matrix
➤ cbind and rbind
➤ arrays
➤ lists
➤ data frame
➤ function
➤ visualisation

Introduction

R is commonly used for saving online data, but I will show yet another useful function of this generous application. In this presentation I will also show you how to use the R language to solve mathematical problems in statistics, using vectors, matrices, extraction of elements from lists, data frames, functions and graphs.
Let's begin.

Definition

Data mining (knowledge discovery in databases):

◦ Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases

Data mining used to

Data mining is used to:

◦ Detect patterns in events and behaviors
◦ Optimize product performance and manufacturing processes

Data mining can be utilized in any organization that needs to find patterns or relationships in its data, wherever the derived insights will deliver business value.

Vector

The most basic data object in R is a vector. Even when you assign a single number to an object (as in x <- 45.3) you are creating a vector containing a single element. All data objects have a mode and a length. The mode determines the kind of data stored in the object. It can take the values character, logical, numeric or complex. Thus you may have vectors of characters, logical values (T or F, or TRUE or FALSE), numbers or complex numbers.
Most of the time you will be using vectors with length larger than 1. You can create a vector in R using the c() function.

➤ An arithmetic sequence with common difference 1 or -1 is created with the command: c(a:b)
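
To make the ideas of mode, length and the a:b sequence concrete, here is a minimal sketch. The object names (v, up, down, flags) and the values are chosen only for illustration.

```r
# Create a numeric vector with c() and inspect its mode and length.
v <- c(4, 7, 23.5, 76.2, 80)
mode(v)     # "numeric"
length(v)   # 5

# Even a single number is a vector of length 1.
x <- 45.3
length(x)   # 1

# a:b builds a sequence with step 1 (or -1 when a > b).
up <- 1:5     # 1 2 3 4 5
down <- 5:1   # 5 4 3 2 1

# Character and logical vectors are created the same way.
letters_vec <- c("a", "b", "c")
flags <- c(TRUE, FALSE, TRUE)
mode(letters_vec)  # "character"
mode(flags)        # "logical"
```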



Matrix and arrays

Data elements can be stored in an object with more than one dimension. This may be useful in several situations. Arrays arrange data elements in several dimensions. Matrices are a special case of arrays with just two dimensions.

Let us see an example. Suppose you have the vector of numbers c(45,23,66,77,33,44,56,12,78,23). The following would "organize" these 10 numbers as a matrix,
> m <- c(45,23,66,77,33,44,56,12,78,23)
> m
 [1] 45 23 66 77 33 44 56 12 78 23
> dim(m) <- c(2,5)
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]   45   66   33   56   78
[2,]   23   77   44   12   23
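
As a complement to setting dim() on a vector, the sketch below builds the same 2 x 5 matrix with the matrix() function and reads individual elements. The name m_by_row is introduced here only for illustration.

```r
# Build the 2 x 5 matrix with matrix(); values fill column by column.
m <- matrix(c(45, 23, 66, 77, 33, 44, 56, 12, 78, 23), nrow = 2, ncol = 5)

# byrow = TRUE fills the matrix row by row instead.
m_by_row <- matrix(c(45, 23, 66, 77, 33, 44, 56, 12, 78, 23),
                   nrow = 2, ncol = 5, byrow = TRUE)

dim(m)         # 2 5
m[2, 3]        # element in row 2, column 3: 44
m[1, ]         # first row: 45 66 33 56 78
m_by_row[1, ]  # first row when filled by row: 45 23 66 77 33
```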

cbind and rbind

Functions cbind() and rbind() may be used to join together two or more vectors or matrices, by columns or by rows, respectively.

The following examples should illustrate this,

rbind: works by row. It is used to join datasets and to build a matrix by adding rows.
To join two data frames (datasets) vertically, use the rbind function.
The two data frames must have the same variables, but they do not have to be in the same order.
cbind: works by column. It is used to join vectors, matrices or tables side by side, that is, by adding columns.
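
A short sketch of both functions follows. The vectors a and b and the data frames df1 and df2 are hypothetical; they only serve to show the shapes that result and that rbind() matches data frame columns by name.

```r
a <- c(1, 2, 3)
b <- c(10, 20, 30)

by_col <- cbind(a, b)  # a and b become the columns of a 3 x 2 matrix
by_row <- rbind(a, b)  # a and b become the rows of a 2 x 3 matrix
dim(by_col)  # 3 2
dim(by_row)  # 2 3

# rbind() on data frames matches columns by name,
# so the column order of the second data frame may differ.
df1 <- data.frame(id = c(1, 2), score = c(14.3, 12))
df2 <- data.frame(score = c(15, 19), id = c(3, 4))
stacked <- rbind(df1, df2)
stacked$id  # 1 2 3 4
```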

Arrays
Arrays are extensions of matrices to more than two dimensions.

The following is an example of its use,

> a <- array(1:50,dim=c(2,5,5))
> a
, , 1

     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10

, , 2

     [,1] [,2] [,3] [,4] [,5]
[1,]   11   13   15   17   19
[2,]   12   14   16   18   20

, , 3

     [,1] [,2] [,3] [,4] [,5]
[1,]   21   23   25   27   29
[2,]   22   24   26   28   30

, , 4

     [,1] [,2] [,3] [,4] [,5]
[1,]   31   33   35   37   39
[2,]   32   34   36   38   40

, , 5

     [,1] [,2] [,3] [,4] [,5]
[1,]   41   43   45   47   49
[2,]   42   44   46   48   50
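
Elements of an array are addressed with one index per dimension. The following sketch, using the same array, shows a few typical accesses.

```r
a <- array(1:50, dim = c(2, 5, 5))

dim(a)                 # 2 5 5
a[1, 3, 2]             # row 1, column 3, slice 2: 15
a[2, , 1]              # row 2 of the first slice: 2 4 6 8 10
fifth <- a[, , 5]      # the whole fifth slice, a 2 x 5 matrix
dim(fifth)             # 2 5
```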

Lists
R lists consist of an ordered collection of other objects known as its components. These
components do not need to be of the same type, mode or length. The components of a list are
always numbered and may also have a name attached to them. Let us start by seeing a simple
example of creating a list,
> my.lst <- list(stud.id=34453,
+ stud.name="Hind",
+ stud.marks=c(14.3,12,15,19))

To show the contents of a list you simply type its name, as with any other object,
> my.lst
$stud.id
[1] 34453

$stud.name
[1] "Hind"

$stud.marks
[1] 14.3 12.0 15.0 19.0
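
Since the introduction mentions extracting elements from a list, here is a minimal sketch of the usual ways to do so, by name and by position, on the same list.

```r
my.lst <- list(stud.id = 34453,
               stud.name = "Hind",
               stud.marks = c(14.3, 12, 15, 19))

my.lst$stud.name          # by name: "Hind"
my.lst[["stud.id"]]       # by name with [[ ]]: 34453
my.lst[[3]]               # by position: 14.3 12.0 15.0 19.0
my.lst$stud.marks[2]      # second mark: 12
avg <- mean(my.lst$stud.marks)  # average of the four marks
names(my.lst)             # names of the components
```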

Data frames
➤ A data.frame is just like an Excel spreadsheet in that it has columns and rows.
➤ There are numerous ways to construct a data.frame, the simplest being to use the data.frame function.
A data frame is similar to a matrix but with named columns. However, contrary to matrices, data frames may include data of a different type in each column. In this sense they are more similar to lists, and in effect, for R, data frames are a special class of lists.
Each row of a data frame is an observation (or case), described by a set of variables (the named columns of the data frame).

You can create a data frame like this,

It is like a matrix in which the columns are not necessarily all of the same type: numeric, Boolean and character columns in this example.

➤ To view the first lines: head
➤ To define the names of the rows: row.names
➤ To define the names of the columns: names
➤ The dimensions of the object are given by dim
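
A minimal sketch of building and inspecting a data frame follows. The data frame students and all of its values are hypothetical, invented only to show columns of different types together with head, row.names, names and dim.

```r
# Hypothetical data: the names and values are invented for illustration.
students <- data.frame(
  name   = c("Hind", "Sara", "Omar"),   # character column
  mark   = c(14.3, 12, 15),             # numeric column
  passed = c(TRUE, TRUE, TRUE),         # logical (Boolean) column
  row.names = c("s1", "s2", "s3"),
  stringsAsFactors = FALSE
)

head(students, 2)     # first two rows
row.names(students)   # "s1" "s2" "s3"
names(students)       # "name" "mark" "passed"
dim(students)         # 3 3
students$mark         # one column, extracted like a list component
```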











Function
R allows the user to create new functions. This is a useful feature, particularly when you want to automate certain tasks that you have to repeat over and over. Instead of writing the instructions that perform this task every time you want to execute it, it is better to create a new function containing these instructions, and then simply use it whenever necessary.
> se <- function(x) {
+   v <- var(x)
+   n <- length(x)
+   return(sqrt(v/n))
+ }
Thus, to create a function object you assign to its name something with the general form,
function(<list of parameters>) { <list of R instructions> }

After creating this function you can use it as follows,
> se(c(45,2,3,5,76,2,4))
[1] 11.1031
The body of a function can be written either on different lines (like the example above) or on the same line by separating each instruction with the ';' character.

The value returned by any function can be "decided" using the function return(); alternatively, R returns the result of the last expression that was evaluated within the function. The following function illustrates this and also the use of parameters with default values.
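
The function shown on the original slide is not available here, so the sketch below is a stand-in and not the author's example. It assumes a hypothetical variant of se() called se2, with a parameter na.rm that defaults to FALSE, and it relies on the last evaluated expression instead of return(). The one-line form se3 is likewise only an illustration of the ';' separator.

```r
# Hypothetical variant of se(): na.rm has a default value of FALSE.
se2 <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]   # drop missing values when requested
  # No return(): the value of the last expression is what the function returns.
  sqrt(var(x) / length(x))
}

r1 <- se2(c(45, 2, 3, 5, 76, 2, 4))                    # uses the default na.rm = FALSE
r2 <- se2(c(45, 2, 3, 5, 76, 2, 4, NA), na.rm = TRUE)  # overrides the default

# The same computation written on one line, with ';' between instructions.
se3 <- function(x) { v <- var(x); n <- length(x); sqrt(v / n) }
r3 <- se3(c(45, 2, 3, 5, 76, 2, 4))
```

All three calls compute the standard error of the same seven numbers, so they agree with the result of se() shown earlier.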


Some functions
Data Visualization and Summarization

Comparison of two independent quantitative variables

We compare the number of calls a month of two stands.

Here is the boxplot: evenly distributed.

Here is the T-table: evenly distributed.

Here is the plot: evenly distributed.
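
The figures from the slides are not reproduced here, so the following is only a sketch of how such a comparison could be produced in R. The monthly call counts are invented, and the column names stand_a and stand_b are hypothetical. It assumes the T-table refers to a two-sample t-test, computed here with t.test().

```r
# Hypothetical monthly call counts for two stands (invented numbers).
calls <- data.frame(
  stand_a = c(120, 135, 128, 140, 132, 125),
  stand_b = c(110, 150, 118, 145, 122, 138)
)

summary(calls)  # minimum, quartiles, mean and maximum of each column

# Side-by-side boxplots, one per stand.
bp <- boxplot(calls, main = "Calls per month", ylab = "Number of calls")

# Welch two-sample t-test comparing the two means.
tt <- t.test(calls$stand_a, calls$stand_b)
tt$p.value

# Line plot of both series over the months.
plot(calls$stand_a, type = "b", ylim = range(unlist(calls)),
     xlab = "Month", ylab = "Number of calls")
lines(calls$stand_b, type = "b", lty = 2)
```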


Thank you

HIND
