Hind Data
net/publication/348294788
Data mining
1 author:
Hind el Atmani
Shandong University of Finance and Economics
All content following this page was uploaded by Hind el Atmani on 24 March 2022.
OUTLINE
➤ introduction
➤ definition
➤ data mining used to
➤ vector
➤ matrix
➤ cbind and rbind
➤ arrays
➤ lists
➤ data frame
➤ function
➤ visualisation
Introduction
vector
The most basic data object in R is a vector. Even when you assign a single number to an object (as in
x <- 45.3) you are creating a vector containing a single element. All data objects have a mode and a
length. The mode determines the kind of data stored in the object. It can take the values character,
logical, numeric or complex. Thus you may have vectors of characters, logical values (T or F, or
TRUE or FALSE), numbers, or complex numbers.
Most of the time you will be using vectors with length larger than 1. You can create a vector in R
using the c() function.
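A minimal sketch of creating vectors with c() and inspecting their mode and length. The values and object names (v, s, x) are hypothetical and chosen only for illustration.

```r
# Hypothetical example values, not taken from the original slides.
v <- c(4, 7, 23.5, 76.2, 80)   # numeric vector built with c()
length(v)                      # number of elements: 5
mode(v)                        # "numeric"

s <- c("a", "b", "c")          # character vector
mode(s)                        # "character"

x <- 45.3                      # a single number is still a vector
length(x)                      # 1
```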
Data elements can be stored in an object with more than one dimension. This may
be useful in several situations. Arrays arrange data elements in several dimensions.
Matrices are a special case of arrays with just two dimensions.
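As an illustration, a matrix can be built from a vector with the matrix() function. The object name m and its values are assumptions made for this sketch.

```r
# Hypothetical example: a 2 x 3 matrix filled column by column (R's default).
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)       # 2 3 (rows, columns)
m[2, 3]      # element in row 2, column 3: 6
m[1, ]       # the whole first row: 1 3 5
```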
Functions cbind() and rbind() may be used to join together two or more vectors or matrices, by
columns or by rows, respectively.
rbind: the "r" stands for row. It is used to join datasets and to build a matrix row by row (e.g. adding rows).
To join two data frames (datasets) vertically, use the rbind function.
The two data frames must have the same variables, but they do not have to be in the same order.
cbind: the "c" stands for column. It joins vectors, matrices or tables side by side, adding columns.
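The behaviour described above can be sketched as follows. All object names (a, b, df1, df2) and values are hypothetical; the second part shows that rbind() on data frames matches columns by name, so the variable order may differ.

```r
# Hypothetical vectors joined by columns and by rows.
a <- c(1, 2, 3)
b <- c(4, 5, 6)
by.cols <- cbind(a, b)   # 3 rows, 2 columns
by.rows <- rbind(a, b)   # 2 rows, 3 columns

# Two hypothetical data frames with the same variables in a different order.
df1 <- data.frame(id = 1:2, score = c(10, 12))
df2 <- data.frame(score = c(15, 9), id = 3:4)
stacked <- rbind(df1, df2)   # columns are matched by name
stacked$id                   # 1 2 3 4
```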
Arrays
Arrays are extensions of matrices to more than two dimensions.
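A small sketch of a three-dimensional array created with array(). The name arr and the dimensions are assumptions for illustration only.

```r
# Hypothetical 2 x 3 x 4 array holding the numbers 1 to 24.
arr <- array(1:24, dim = c(2, 3, 4))
dim(arr)        # 2 3 4
arr[1, 2, 3]    # row 1, column 2, third "layer": 15
arr[, , 1]      # the first layer is an ordinary 2 x 3 matrix
```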
Lists
R lists consist of an ordered collection of other objects known as its components. These
components do not need to be of the same type, mode or length. The components of a list are
always numbered and may also have a name attached to them. Let us start by seeing a simple
example of creating a list,
> my.lst <- list(stud.id=34453,
+ stud.name="Hind",
+ stud.marks=c(14.3,12,15,19))
To show the contents of a list you simply type its name, as with any other object,
> my.lst
$stud.id
[1] 34453
$stud.name
[1] "Hind"
$stud.marks
[1] 14.3 12.0 15.0 19.0
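Since components are both numbered and (optionally) named, they can be extracted either way. The sketch below rebuilds the same list so that it runs on its own; the access patterns shown are standard R and are added here only as an illustration.

```r
# Same list as in the example above, recreated so the block is self-contained.
my.lst <- list(stud.id = 34453,
               stud.name = "Hind",
               stud.marks = c(14.3, 12, 15, 19))

my.lst$stud.name      # extract a component by name
my.lst[[3]]           # extract a component by position (the marks)
length(my.lst)        # number of components: 3
names(my.lst)         # "stud.id" "stud.name" "stud.marks"
```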
Data frames
! A data.frame is just like an Excel spreadsheet in that it has columns
and rows.
! There are numerous ways to construct a data.frame, the simplest
being to use the data.frame function.
A data frame is similar to a matrix but with named columns. However,
contrary to matrices, data frames may include data of a different
type in each column. In this sense they are more similar to lists,
and in effect, for R, data frames are a special class of lists.
Each row of a data frame is an observation (or case), being described by a set of variables (the named columns
of the data frame).
A data frame looks like a matrix, but its columns do not all have to be
of the same type: one column may be numeric, another logical (Boolean), another character.
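A minimal sketch of building a data frame with the data.frame function. The column names and values are hypothetical; each column holds a different type, and each row is one observation.

```r
# Hypothetical data: three observations described by three variables.
my.df <- data.frame(
  name   = c("A", "B", "C"),        # character column
  age    = c(21, 25, 30),           # numeric column
  passed = c(TRUE, FALSE, TRUE),    # logical (Boolean) column
  stringsAsFactors = FALSE          # keep text as character, not factor
)

nrow(my.df)            # 3 observations
ncol(my.df)            # 3 variables
sapply(my.df, class)   # type of each column
is.list(my.df)         # TRUE: a data frame is a special kind of list
```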
Function
R allows the user to create new functions. This is a useful feature, particularly when you want to automate certain
tasks that you have to repeat over and over. Instead of writing the instructions that perform this task every time you
want to execute it, it is better to create a new function containing these instructions, and then simply use it
whenever necessary.
> se <- function(x) {
+ v <- var(x)
+ n <- length(x)
+ return(sqrt(v/n))
+ }
Thus, to create a function object you assign to its name something with the general form,
function(<list of parameters>) { <list of R instructions> }
After creating this function you can use it as follows,
> se(c(45,2,3,5,76,2,4))
[1] 11.1031
The body of a function can be written either on different lines (like the
example above) or on the same line by separating each instruction with the ';'
character.
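For illustration, the standard error function can also be written on a single line with ';' separators. The name se.one.line is hypothetical; the logic is the same as in the multi-line version.

```r
# Hypothetical one-line variant: instructions are separated by ';'.
se.one.line <- function(x) { v <- var(x); n <- length(x); sqrt(v/n) }

se.one.line(c(45, 2, 3, 5, 76, 2, 4))   # same result as the multi-line se()
```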
The value returned by any function can be "decided" using the function return() or,
alternatively, R returns the result of the last expression that was evaluated within
the function. The following function illustrates this and also the use of parameters
with default values.
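The function from the original slides is not available here, so the following is only a stand-in sketch. The name raise.to and its parameter p are hypothetical. It has no return() call, so the value of the last evaluated expression is returned, and p takes the default value 2 when the caller does not supply it.

```r
# Hypothetical function: p defaults to 2 when omitted.
raise.to <- function(x, p = 2) {
  x^p   # last evaluated expression, so this is the returned value
}

raise.to(3)          # uses the default p = 2, gives 9
raise.to(2, 3)       # p supplied by position, gives 8
raise.to(2, p = 10)  # p supplied by name, gives 1024
```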
Some functions
Data Visualization and Summarization
HIND