Data Munging - Ipynb - Colaboratory - Yodhi Adhi Sanjaya


11/18/2020 Data Munging.ipynb - Colaboratory

Nama = 'Yodhi Adhi Sanjaya'


NPM = '2006608680'

print(Nama,NPM)

Yodhi Adhi Sanjaya 2006608680

from google.colab import drive


drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force

cd /content/drive/My Drive/Python-Data-Science-Essentials-Third-Edition-master/Chapter2

/content/drive/My Drive/Python-Data-Science-Essentials-Third-Edition-master/Chapter2

import pandas as pd
# The file holds the housing regression dataset, so the variable is named accordingly.
housing_filename = 'regression-datasets-housing.csv'
df = pd.read_csv(housing_filename, sep=',', decimal='.', header=None)
print(df)

0 1 2 3 4 5 ... 8 9 10 11 12 13
0 0.00632 18 2.31 0 0.538 6.575 ... 1 296 15 396.90 4.98 24.0
1 0.02731 0 7.07 0 0.469 6.421 ... 2 242 17 396.90 9.14 21.6
2 0.02729 0 7.07 0 0.469 7.185 ... 2 242 17 392.83 4.03 34.7
3 0.03237 0 2.18 0 0.458 6.998 ... 3 222 18 394.63 2.94 33.4
4 0.06905 0 2.18 0 0.458 7.147 ... 3 222 18 396.90 5.33 36.2
.. ... .. ... .. ... ... ... .. ... .. ... ... ...
501 0.06263 0 11.93 0 0.573 6.593 ... 1 273 21 391.99 9.67 22.4
502 0.04527 0 11.93 0 0.573 6.120 ... 1 273 21 396.90 9.08 20.6
503 0.06076 0 11.93 0 0.573 6.976 ... 1 273 21 396.90 5.64 23.9
504 0.10959 0 11.93 0 0.573 6.794 ... 1 273 21 393.45 6.48 22.0
505 0.04741 0 11.93 0 0.573 6.030 ... 1 273 21 396.90 7.88 11.9

[506 rows x 14 columns]

type(df)

pandas.core.frame.DataFrame

df.shape

(506, 14)
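Because the file is read with header=None, pandas labels the columns with the integers 0 to 13, which is why the later cells index with df[0] and df1[12]. A minimal sketch of that behavior, assuming a made-up two-row CSV held in memory (the values are invented, not taken from the housing file):

```python
import io
import pandas as pd

# Invented two-row CSV text standing in for a headerless file.
csv_text = "0.5,10\n1.2,20\n"

# header=None tells pandas the first line is data, so columns get integer labels.
toy = pd.read_csv(io.StringIO(csv_text), sep=',', decimal='.', header=None)

print(toy.columns.tolist())
print(toy.shape)
```

Without header=None, the first data row would be consumed as column names and the frame would have one row fewer.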

# Filter the CSV data where column 1 (label 0) is greater than 1.

filtered_column1 = df[df[0] > 1]


filtered_column1


0 1 2 3 4 5 6 7 8 9 10 11 12 13
16 1.05393 0 8.14 0 0.538 5.935 29.3 4.4986 4 307 21 386.85 6.58 23.1
20 1.25179 0 8.14 0 0.538 5.570 98.1 3.7979 4 307 21 376.57 21.02 13.6
22 1.23247 0 8.14 0 0.538 6.142 91.7 3.9769 4 307 21 396.90 18.72 15.2
29 1.00245 0 8.14 0 0.538 6.674 87.3 4.2390 4 307 21 380.23 11.98 21.0
30 1.13081 0 8.14 0 0.538 5.713 94.1 4.2330 4 307 21 360.17 22.60 12.7
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
483 2.81838 0 18.10 0 0.532 5.762 40.3 4.0983 24 666 20 392.92 10.42 21.8
484 2.37857 0 18.10 0 0.583 5.871 41.9 3.7240 24 666 20 370.73 13.34 20.6
485 3.67367 0 18.10 0 0.583 6.312 51.9 3.9917 24 666 20 388.62 10.58 21.2
486 5.69175 0 18.10 0 0.583 6.114 79.8 3.5459 24 666 20 392.68 14.98 19.1
487 4.83567 0 18.10 0 0.583 5.905 53.2 3.1523 24 666 20 388.22 11.45 20.6

174 rows × 14 columns

type(filtered_column1)

pandas.core.frame.DataFrame

filtered_column1.shape

(174, 14)
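The filter on column 0 relies on boolean indexing: the comparison produces a Series of True/False values, and indexing the frame with that Series keeps only the rows marked True. A minimal sketch on a made-up three-row frame (the names toy, mask, and filtered_toy and all values are illustrative assumptions):

```python
import pandas as pd

# Invented frame with integer column labels, mimicking a file read with header=None.
toy = pd.DataFrame({0: [0.5, 1.2, 3.4], 1: [10, 20, 30]})

# Step 1: the comparison yields a boolean Series aligned with the frame's index.
mask = toy[0] > 1

# Step 2: indexing with the mask keeps the True rows and their original index labels.
filtered_toy = toy[mask]

print(mask.tolist())
print(filtered_toy.index.tolist())
```

The surviving rows keep their original index labels, which is why the filtered housing output starts at index 16 rather than 0.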

# Summary statistics for the rows where column 1 (label 0) is greater than 1.


mean_df = filtered_column1.mean()
max_df = filtered_column1.max()
min_df = filtered_column1.min()

stat = pd.DataFrame({'Mean': mean_df,'Max': max_df,'Min': min_df})


stat

Mean Max Min

0 10.138975 88.9762 1.00245

1 0.000000 0.0000 0.00000

2 17.836437 21.8900 8.14000

3 0.086207 1.0000 0.00000

4 0.677006 0.8710 0.53200

5 6.021885 8.7800 3.56100

6 90.339655 100.0000 29.30000

7 2.124741 4.4986 1.12960

8 19.344828 24.0000 4.00000

9 597.373563 666.0000 304.00000

10 19.017241 21.0000 14.00000

11 297.773793 396.9000 0.32000

12 17.815115 37.9700 1.73000

13 17.613793 50.0000 5.00000
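The same Mean/Max/Min table can probably be built in a single call with DataFrame.agg, which avoids the three intermediate Series. This is an alternative sketch on invented data, not the notebook's own method:

```python
import pandas as pd

# Invented values; column labels are integers as in the notebook.
toy = pd.DataFrame({0: [1.0, 2.0, 6.0], 1: [4, 4, 10]})

# agg returns one row per statistic; transposing gives one row per column.
stat_toy = toy.agg(['mean', 'max', 'min']).T
stat_toy.columns = ['Mean', 'Max', 'Min']

print(stat_toy)
```

Both approaches compute the statistics per column; the agg form simply keeps the list of statistics in one place.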

# Filter the CSV data based on the NPM (student ID): 2006608680


# NPM suffix: 80

df1 = filtered_column1

filtered_column2 = df1[df1[12].astype(int)==80]
filtered_column2

0 1 2 3 4 5 6 7 8 9 10 11 12 13

# No rows found in the CSV data where column 13 (label 12) has the integer value 80.


df2 = filtered_column2
df2.shape

(0, 14)

# Filter the CSV data based on the NPM (student ID): 2006608680


# NPM suffix: 8

df1 = filtered_column1
filtered_column3 = df1[df1[12].astype(int)==8]
filtered_column3

0 1 2 3 4 5 6 7 8 9 10 11 12 13

372 8.26725 0 18.1 1 0.668 5.875 89.6 1.1296 24 666 20 347.88 8.88 50.0

# Found a row in the CSV data where column 13 (label 12) has the integer value 8.


df3 = filtered_column3
df3.shape

(1, 14)
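Row 372 matches because astype(int) truncates the fractional part instead of rounding, so 8.88 becomes 8. For non-negative values, a comparison with n therefore matches anything from n up to, but not including, n + 1. A small sketch with illustrative values showing how truncation differs from rounding:

```python
import pandas as pd

# Illustrative values for a column like the one being filtered.
s = pd.Series([8.88, 6.43, 6.58, 7.99])

# astype(int) drops the fractional part (truncation toward zero).
truncated = s.astype(int)

# Rounding first gives different integers for the same inputs.
rounded = s.round().astype(int)

print(truncated.tolist())
print(rounded.tolist())
```

If the intent were to match values that round to 8, the filter would need s.round() first; the notebook's cells use truncation.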

# Filter the CSV data based on the NPM (student ID): 2006608680


# NPM suffix: 6

df1 = filtered_column1
filtered_column4 = df1[df1[12].astype(int)==6]
filtered_column4

0 1 2 3 4 5 6 7 8 9 10 11 12 13

16 1.05393 0 8.14 0 0.538 5.935 29.3 4.4986 4 307 21 386.85 6.58 23.1

158 1.34284 0 19.58 0 0.605 6.066 100.0 1.7573 5 403 14 353.89 6.43 24.3

# Found rows in the CSV data where column 13 (label 12) has the integer value 6.


df4 = filtered_column4
df4.shape

(2, 14)
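The three NPM-suffix cells differ only in the value being compared, so they could be folded into one helper and a loop. The function name filter_by_int_value and the toy frame below are hypothetical, introduced only for illustration:

```python
import pandas as pd

def filter_by_int_value(frame, column, value):
    """Return rows whose `column`, truncated to an integer, equals `value`."""
    return frame[frame[column].astype(int) == value]

# Invented frame; label 12 stands in for the column filtered in the notebook.
toy = pd.DataFrame({0: [1.5, 2.5, 3.5], 12: [6.43, 8.88, 6.58]})

# Try each suffix in turn and report how many rows match.
for suffix in (80, 8, 6):
    match = filter_by_int_value(toy, 12, suffix)
    print(suffix, match.shape)
```

An empty result, as with 80 here, still has the full set of columns; only the row count drops to zero.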

# Summary statistics for the rows where column 13 (label 12) has the integer value 6.


mean_df1 = filtered_column4.mean()
max_df1 = filtered_column4.max()
min_df1 = filtered_column4.min()

stat1 = pd.DataFrame({'Mean': mean_df1,'Max': max_df1,'Min': min_df1})


stat1


Mean Max Min

0 1.198385 1.34284 1.05393

1 0.000000 0.00000 0.00000

2 13.860000 19.58000 8.14000

3 0.000000 0.00000 0.00000

4 0.571500 0.60500 0.53800

5 6.000500 6.06600 5.93500

6 64.650000 100.00000 29.30000

7 3.127950 4.49860 1.75730

8 4.500000 5.00000 4.00000

9 355.000000 403.00000 307.00000

10 17.500000 21.00000 14.00000

11 370.370000 386.85000 353.89000

12 6.505000 6.58000 6.43000

13 23.700000 24.30000 23.10000

https://colab.research.google.com/drive/1-lZ0Q5iCEUVJkdLT-E4xtjVO4p6Du-Ei?authuser=1#scrollTo=AAL3VUCBEAjS&printMode=true
