Data Cleaning Script
Reading the data
# read the data
import pandas as pd
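The notebook's actual read call is not shown in this print view. As a minimal sketch, assuming the data is stored as a CSV file, loading it could look like the following. The inline `csv_text` and its toy values are hypothetical stand-ins for the real file, whose name and format are not given.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for the real file (name and format not shown).
csv_text = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
A001,S01,ITEM A,6,2010-12-01 08:26:00,2.55,17850,United Kingdom
A002,S02,ITEM B,3,2010-12-01 08:28:00,1.85,,United Kingdom
"""

# Read identifier-like columns as strings so codes are not coerced to numbers.
data = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"InvoiceNo": str, "StockCode": str},
)

print(data.shape)
print(data.dtypes)
```

With a real file, the `io.StringIO(csv_text)` argument would be replaced by the file path.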
                 InvoiceNo  StockCode  Description  Quantity  InvoiceDate  UnitPrice  CustomerID  Country
column type      object     object     object       int64     object       float64    float64     object
null values (%)  0.0        0.0        0.268311     0.0       0.0          0.0        24.926694   0.0
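The cell that produced this summary is not shown. One way to build a table with the same two rows is sketched below. The helper name `summarize_columns` and the toy DataFrame are assumptions for illustration, not the notebook's own code.

```python
import numpy as np
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a two-row table: dtype and percentage of null values per column."""
    return pd.DataFrame(
        {
            "column type": df.dtypes.astype(str),
            "null values (%)": df.isnull().mean() * 100,
        }
    ).T  # transpose so that each original column becomes a table column


# Toy stand-in for the real data.
toy = pd.DataFrame(
    {
        "Description": ["ITEM A", None, "ITEM C", "ITEM D"],
        "Quantity": [1, 2, 3, 4],
        "CustomerID": [17850.0, np.nan, np.nan, 13047.0],
    }
)

summary = summarize_columns(toy)
print(summary)
```

`df.isnull().mean()` gives the fraction of missing values per column, so multiplying by 100 yields the percentage.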
Dropping records that have null values in the CustomerID or Description columns
data.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 401604 entries, 0 to 541908
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 InvoiceNo 401604 non-null object
1 StockCode 401604 non-null object
2 Description 401604 non-null object
3 Quantity 401604 non-null int64
4 InvoiceDate 401604 non-null object
5 UnitPrice 401604 non-null float64
6 CustomerID 401604 non-null float64
7 Country 401604 non-null object
dtypes: float64(2), int64(1), object(5)
memory usage: 27.6+ MB
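The code for the null-removal step is not included in the print view. A minimal sketch of how it might be done with `DataFrame.dropna` follows, using a small toy frame in place of the real data. The variable name `cleaned` is an assumption.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data.
data = pd.DataFrame(
    {
        "InvoiceNo": ["A001", "A002", "A003", "A004"],
        "Description": ["ITEM A", None, "ITEM C", "ITEM D"],
        "CustomerID": [17850.0, 13047.0, np.nan, 12583.0],
    }
)

# Drop every row in which CustomerID or Description is missing.
cleaned = data.dropna(subset=["CustomerID", "Description"])

cleaned.info()
```

`dropna` keeps the original index labels of the surviving rows, which is consistent with the `data.info()` output above, where 401604 entries span index labels 0 to 541908.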
https://colab.research.google.com/drive/1RN8ynXJgi1K9A3NwpVHhfch6f8XN0BHK#printMode=true 1/2
5/30/23, 1:43 PM test1.ipynb - Colaboratory
Converting the 'InvoiceDate' column to the datetime data type and displaying the time range of the data
<class 'pandas.core.frame.DataFrame'>
Int64Index: 401604 entries, 0 to 541908
Data columns (total 8 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 InvoiceNo 401604 non-null object
1 StockCode 401604 non-null object
2 Description 401604 non-null object
3 Quantity 401604 non-null int64
4 InvoiceDate 401604 non-null datetime64[ns]
5 UnitPrice 401604 non-null float64
6 CustomerID 401604 non-null float64
7 Country 401604 non-null object
dtypes: datetime64[ns](1), float64(2), int64(1), object(4)
memory usage: 27.6+ MB
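The conversion cell itself is missing from the print view. A sketch of the conversion and the time-range check, assuming the raw values are date strings that `pd.to_datetime` can parse, is shown below. The toy dates are illustrative only, and the format of the strings in the real data is not shown in the source.

```python
import pandas as pd

# Toy stand-in for the real data; the dates are illustrative only.
data = pd.DataFrame(
    {
        "InvoiceDate": [
            "2010-12-01 08:26:00",
            "2011-12-09 12:50:00",
            "2011-06-15 10:05:00",
        ]
    }
)

# Convert the string column to a datetime dtype.
data["InvoiceDate"] = pd.to_datetime(data["InvoiceDate"])

# The time range is the earliest and latest timestamp in the column.
start = data["InvoiceDate"].min()
end = data["InvoiceDate"].max()
print(f"Data covers {start} to {end}")
```

If the real strings use a different layout, passing an explicit `format=` argument to `pd.to_datetime` avoids ambiguous parsing between day-first and month-first dates.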