
#a) 27(38-17)

27*(38-17)

#b) ln(14^7)
log(14^7)

#c) sqrt(436/12)
sqrt(436/12)

[1] 567

[1] 18.4734

[1] 6.027714
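Part b) relies on log() being the natural logarithm in R. A minimal sketch of that, with the log rule as a cross-check; the variable names are only illustrative:

```r
# log() is the natural logarithm (ln); other bases must be requested explicitly
x <- 14^7

natural <- log(x)           # ln(14^7), the value printed for part b)
by_rule <- 7 * log(14)      # log rule: ln(a^n) = n * ln(a)
base10  <- log10(x)         # base-10 logarithm, for comparison
base2   <- log(x, base = 2) # any other base goes in the 'base' argument

print(c(natural = natural, by_rule = by_rule, base10 = base10, base2 = base2))
```

If the exercise had asked for a base-10 logarithm, log10() would be the function to use instead.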

#a = (5, 10, 15, 20, ..., 160)


a= seq(5,160, by= 5)

#b = (87, 86, 85, ..., 56)


b= seq(87,56, by= -1)

#Use vector arithmetic to multiply these vectors and call the result d.
d= a * b

#What are the 19th, 20th, and 21st elements of d?


d[19:21]

#What are all of the elements of d which are less than 2000?
d[d < 2000]

#How many elements of d are greater than 6000?


sum(d > 6000)

[1] 6555 6800 7035

[1] 435 860 1275 1680

[1] 16
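The filtering and counting above both rest on logical vectors. A short sketch of what happens behind d[d < 2000] and sum(d > 6000); the names big and positions are hypothetical helpers, not part of the exercise:

```r
a <- seq(5, 160, by = 5)
b <- seq(87, 56, by = -1)

# Element-wise multiplication silently recycles the shorter vector,
# so it is worth checking that both have the same length first
stopifnot(length(a) == length(b))
d <- a * b

big <- d > 6000          # logical vector: TRUE where the condition holds
positions <- which(big)  # indices of the TRUE entries
count <- sum(big)        # TRUE counts as 1, FALSE as 0

print(positions)
print(count)
print(d[big])            # the values themselves
```

which() is handy when the positions matter, while sum() on the logical vector only gives the count.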

#sum of d
sum(d)

#median of d
median(d)

#standard deviation of d
sd(d)

[1] 175120

[1] 5897.5
[1] 2608.563
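To see what median() and sd() compute, the same values can be rebuilt by hand. This is only a sketch; note that sd() uses the sample formula with n - 1 in the denominator:

```r
d <- seq(5, 160, by = 5) * seq(87, 56, by = -1)
n <- length(d)

# Sample standard deviation: squared deviations from the mean, divided by n - 1
manual_sd <- sqrt(sum((d - mean(d))^2) / (n - 1))

# With an even number of elements, the median is the mean of the two middle values
sorted <- sort(d)
manual_median <- mean(sorted[c(n / 2, n / 2 + 1)])

print(c(manual_sd = manual_sd, builtin_sd = sd(d)))
print(c(manual_median = manual_median, builtin_median = median(d)))
```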

#1st matrix
a = c(7,9,12)
b= c(2,4,13)
X= rbind(a, b)

#2nd matrix
w= c(1,7,12,19)
z= c(2,8,13,20)
l= c(3,9,14,21)
Y= rbind(w, z, l)

#multiplication
X%*%Y

  [,1] [,2] [,3] [,4]
a   61  229  369  565
b   49  163  258  391
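%*% only works when the number of columns of the first matrix equals the number of rows of the second. A small sketch of that check, plus one entry of the product recomputed by hand; the name P is hypothetical:

```r
X <- rbind(a = c(7, 9, 12), b = c(2, 4, 13))
Y <- rbind(w = c(1, 7, 12, 19), z = c(2, 8, 13, 20), l = c(3, 9, 14, 21))

print(dim(X))   # 2 rows, 3 columns
print(dim(Y))   # 3 rows, 4 columns

# Matrix multiplication needs ncol(X) == nrow(Y); the result is nrow(X) x ncol(Y)
stopifnot(ncol(X) == nrow(Y))
P <- X %*% Y
print(dim(P))

# Each entry is a dot product: row "a" of X with column 1 of Y
first_entry <- sum(X["a", ] * Y[, 1])
print(first_entry)
```

Note that X * Y (without the percent signs) would attempt element-wise multiplication and fail here, because the two matrices have different shapes.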

as.data.frame(X)

  V1 V2 V3
a  7  9 12
b  2  4 13

Tasla <- read.csv("TSLA.csv")


head(Tasla)

        Date  Open  High   Low Close Adj.Close   Volume
1 2010-06-29 3.800 5.000 3.508 4.778     4.778 93831500
2 2010-06-30 5.158 6.084 4.660 4.766     4.766 85935500
3 2010-07-01 5.000 5.184 4.054 4.392     4.392 41094000
4 2010-07-02 4.600 4.620 3.742 3.840     3.840 25699000
5 2010-07-06 4.000 4.000 3.166 3.222     3.222 34334500
6 2010-07-07 3.280 3.326 2.996 3.160     3.160 34608500
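read.csv() does not parse dates on its own, so the Date column arrives as text (or as a factor in older R versions). A possible sketch of converting it, using only the six rows printed by head() as inline text so it runs without TSLA.csv; the names csv_text and tsla are hypothetical:

```r
# The six rows shown by head(), supplied inline instead of reading the file
csv_text <- "Date,Open,High,Low,Close,Adj.Close,Volume
2010-06-29,3.800,5.000,3.508,4.778,4.778,93831500
2010-06-30,5.158,6.084,4.660,4.766,4.766,85935500
2010-07-01,5.000,5.184,4.054,4.392,4.392,41094000
2010-07-02,4.600,4.620,3.742,3.840,3.840,25699000
2010-07-06,4.000,4.000,3.166,3.222,3.222,34334500
2010-07-07,3.280,3.326,2.996,3.160,3.160,34608500"

# Keep Date as plain text on read, then convert it to a real Date
tsla <- read.csv(text = csv_text, stringsAsFactors = FALSE)
tsla$Date <- as.Date(tsla$Date)

str(tsla)
print(range(tsla$Date))
```

With a real Date column, plots over time get a proper time axis and summary() reports a date range instead of counting each date once.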

Why Visualize Your Data?

1- Examining the distribution of a single variable

2- Analyzing the relationship between two variables

3- Data exploration

4- Analyzing a single variable over time

5- Identifying problems in the data

#Summary statistics give us some sense of the data


summary(Tasla)
         Date           Open               High               Low
 2010-06-29:   1   Min.   :   3.228   Min.   :   3.326   Min.   :   2.996
 2010-06-30:   1   1st Qu.:  19.627   1st Qu.:  20.402   1st Qu.:  19.128
 2010-07-01:   1   Median :  46.657   Median :  47.487   Median :  45.820
 2010-07-02:   1   Mean   : 138.691   Mean   : 141.772   Mean   : 135.426
 2010-07-06:   1   3rd Qu.:  68.057   3rd Qu.:  69.358   3rd Qu.:  66.912
 2010-07-07:   1   Max.   :1234.410   Max.   :1243.490   Max.   :1217.000
 (Other)   :2950

     Close           Adj.Close           Volume
 Min.   :   3.16   Min.   :   3.16   Min.   :   592500
 1st Qu.:  19.61   1st Qu.:  19.61   1st Qu.: 13102875
 Median :  46.55   Median :  46.55   Median : 24886800
 Mean   : 138.76   Mean   : 138.76   Mean   : 31314486
 3rd Qu.:  68.10   3rd Qu.:  68.10   3rd Qu.: 39738750
 Max.   :1229.91   Max.   :1229.91   Max.   :304694000

#To visualize a single variable, use plot() or hist()

#histogram --> visualizes how often values fall into each interval (bin)
#column 4 of Tasla is Low
hist(Tasla[,4])
#density plot (a smoothed, continuous version of the histogram) --> visualizes where the values concentrate
plot(density(Tasla[,4]))
From these graphs we can conclude that the data may have problems such as:

1- Values that are very widely spread, biased, imbalanced, or extremely skewed

2- Outliers, anomalies

3- Shape of the Distribution

4- Distribution of data
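Skew can also be checked numerically before drawing anything. The sketch below uses synthetic right-skewed values (generated with rlnorm(), NOT the TSLA data) purely to illustrate the idea; the seed and parameters are arbitrary assumptions:

```r
set.seed(1)
# Synthetic right-skewed sample; stands in for a price column such as Low
x <- rlnorm(1000, meanlog = 3.8, sdlog = 1.2)

# hist() and density() return objects that can be inspected without plotting
h <- hist(x, breaks = 30, plot = FALSE)
dens <- density(x)

print(h$counts)          # number of observations per bin
print(length(dens$x))    # number of points on the estimated density curve

# In a right-skewed variable the long upper tail pulls the mean above the median
print(c(mean = mean(x), median = median(x)))
right_skewed <- mean(x) > median(x)
print(right_skewed)
```

Calling plot(h) and plot(dens) draws the two graphs. The summary(Tasla) output shows the same pattern for the price columns: the mean is far above the median.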

boxplot(Tasla$Open)
#plot(sort(.)) --> use this to identify outliers and see the distribution of the data.
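A boxplot draws individual points for values more than 1.5 times the interquartile range beyond the box. A rough sketch of those fences for Open, using the quartiles from the summary(Tasla) output; boxplot() itself uses hinges, which are close to the quartiles but not always identical, and the toy vector is hypothetical:

```r
# Quartiles and maximum of Open, copied from the summary(Tasla) output
q1 <- 19.627
q3 <- 68.057
max_open <- 1234.410

iqr <- q3 - q1
upper_fence <- q3 + 1.5 * iqr   # values above this are drawn as separate points
lower_fence <- q1 - 1.5 * iqr   # values below this are drawn as separate points

print(c(lower = lower_fence, upper = upper_fence))
print(max_open > upper_fence)   # the maximum lies far beyond the upper fence

# Toy vector (hypothetical values) showing how to list the flagged points
toy <- c(1, 2, 3, 4, 100)
flagged <- boxplot.stats(toy)$out
print(flagged)
```

On the full data, boxplot.stats(Tasla$Open)$out would list the points that boxplot(Tasla$Open) draws individually.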
plot(Open~Close, data= Tasla)
model <- lm(Open ~ Close, data = Tasla)

#A scatter plot lets us compare the position of each data point to the fitted regression line (the "mean line" drawn by abline()).
#It shows how closely the data points cluster around that line and helps identify any outliers in the data.
plot(Open~Close, data= Tasla)

abline(model, col = "red")
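The direction and strength that the red line shows can also be read off as numbers. A minimal sketch using only the six rows printed by head(), so the values will differ from the model fitted on the full file; the names open_px, close_px and fit are hypothetical:

```r
# Open and Close from the six rows shown by head(Tasla)
open_px  <- c(3.800, 5.158, 5.000, 4.600, 4.000, 3.280)
close_px <- c(4.778, 4.766, 4.392, 3.840, 3.222, 3.160)

fit <- lm(open_px ~ close_px)

slope <- coef(fit)[["close_px"]]   # sign gives the direction of the relationship
r <- cor(open_px, close_px)        # closer to -1 or 1 means a tighter cluster

print(coef(fit))
print(c(slope = slope, correlation = r, r_squared = summary(fit)$r.squared))
```

For a simple regression with one predictor, the slope always has the same sign as the correlation, and R-squared is the correlation squared.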


negative relationship --> the mean line slopes downward: one variable decreases as the other increases 📉

positive relationship --> the mean line slopes upward: both variables increase together 📈

weak relationship --> the points are widely scattered around the line

hexbinplot (from the hexbin package) --> visualizes the distribution and density of the data points in each region of the plot

pairs() --> examines many two-way relationships at once by generating a scatter plot for each pair of variables
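A correlation matrix is a numeric companion to pairs(): one number per pair of variables instead of one scatter plot. A sketch on the six rows printed by head() only, so the numbers are illustrative rather than results for the full data set; the names prices and cm are hypothetical:

```r
# Price columns from the six rows shown by head(Tasla)
prices <- data.frame(
  Open  = c(3.800, 5.158, 5.000, 4.600, 4.000, 3.280),
  High  = c(5.000, 6.084, 5.184, 4.620, 4.000, 3.326),
  Low   = c(3.508, 4.660, 4.054, 3.742, 3.166, 2.996),
  Close = c(4.778, 4.766, 4.392, 3.840, 3.222, 3.160)
)

# One correlation per pair of columns; the matrix is symmetric with 1s on the diagonal
cm <- cor(prices)
print(round(cm, 3))

# pairs(prices) draws the matching grid of scatter plots
```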
