Untitled

4/14/23, 9:18 PM    eda-analysis

    # Import standard libraries
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns  # For plotting
    from itertools import cycle
    import warnings
    warnings.filterwarnings('ignore')
    plt.style.use('ggplot')
    color = plt.rcParams["axes.prop_cycle"].by_key()['color']
    color_cycle = cycle(plt.rcParams["axes.prop_cycle"].by_key()['color'])

Load the data

    store = pd.read_csv('/kaggle/input/customer-shopping-dataset/customer_shopping_data.csv')
    store.head()

      invoice_no customer_id  gender  age  category  quantity    price payment_method invoice_date
    0    I138884     C241288  Female   28  Clothing         5  1500.40    Credit Card     5/8/2022
    1    I317333     C111565    Male   21     Shoes         3  1800.51     Debit Card   12/12/2021
    2    I127801     C266599    Male   20  Clothing         1   300.08           Cash    9/11/2021
    3    I173702     C988172  Female   66     Shoes         5  3000.85    Credit Card   16/05/2021
    4    I337046     C189076  Female   53     Books         4    60.60           Cash   24/10/2021

    store.info()

    RangeIndex: 99457 entries, 0 to 99456
    Data columns (total 10 columns):
     #   Column          Non-Null Count  Dtype
     0   invoice_no      99457 non-null  object
     1   customer_id     99457 non-null  object
     2   gender          99457 non-null  object
     3   age             99457 non-null  int64
     4   category        99457 non-null  object
     5   quantity        99457 non-null  int64
     6   price           99457 non-null  float64
     7   payment_method  99457 non-null  object
     8   invoice_date    99457 non-null  object
     9   shopping_mall   99457 non-null  object
    dtypes: float64(1), int64(2), object(7)
    memory usage: 7.6+ MB

Exploratory Data Analysis

1. Visualize the descriptive statistics
2. Visualize the pairplot
3. Visualize the histplots and unique values

    store.describe().T.style.background_gradient(axis=1, cmap='Blues')

                     count        mean         std        min        25%         50%          75%     max
    age       99457.000000   43.427089   14.990054  18.000000  30.000000   43.000000    56.000000    69.0
    quantity  99457.000000    3.003429    1.413025   1.000000   2.000000    3.000000     4.000000     5.0
    price     99457.000000  689.256321  941.184567   5.230000  45.450000  203.300000  1200.320000  5250.0

    # Checking the fraction of null values in each column of the dataset
    store.isna().sum() / len(store)

    invoice_no        0.0
    customer_id       0.0
    gender            0.0
    age               0.0
    category          0.0
    quantity          0.0
    price             0.0
    payment_method    0.0
    invoice_date      0.0
    shopping_mall     0.0
    dtype: float64

    sns.pairplot(store, hue="gender")

[Figure: pairplot of age, quantity, and price, colored by gender (Female, Male)]

    # Checking the number of unique values in each column
    for i in store.columns:
        print(i, '-->', store[i].nunique())

    invoice_no --> 99457
    customer_id --> 99457
    gender --> 2
    age --> 52
    category --> 8
    quantity --> 5
    price --> 48
    payment_method --> 3
    invoice_date --> 797
    shopping_mall --> 10

    print("The gender in the dataset", end="")
    print(store["gender"].unique())
    print("The category in the dataset", end="")
    print(store["category"].unique())
    print("The payment_method in the dataset", end="")
    print(store["payment_method"].unique())
    print("The shopping mall in the dataset", end="")
    print(store["shopping_mall"].unique())

    The gender in the dataset['Female' 'Male']
    The category in the dataset['Clothing' 'Shoes' 'Books' 'Cosmetics' 'Food & Beverage' 'Toys' 'Technology' 'Souvenir']
    The payment_method in the dataset['Credit Card' 'Debit Card' 'Cash']
    The shopping mall in the dataset['Kanyon' 'Forum Istanbul' 'Metrocity' 'Metropol AVM' 'Istinye Park' 'Mall of Istanbul' 'Emaar Square Mall' 'Cevahir AVM' 'Viaport Outlet' 'Zorlu Center']
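The null-value check and the unique-value loop above can each be written as a single pandas call. The following is a minimal sketch on a tiny synthetic frame; `demo` and its values are made up for illustration and are not taken from the shopping dataset.

```python
import pandas as pd

# Tiny synthetic frame standing in for `store`; the values are illustrative only.
demo = pd.DataFrame({
    "gender": ["Female", "Male", "Male", None],
    "age": [28, 21, 20, 66],
    "category": ["Clothing", "Shoes", "Clothing", "Shoes"],
})

# Fraction of missing values per column; equivalent to demo.isna().sum() / len(demo).
null_fraction = demo.isna().mean()

# Number of distinct non-null values per column, without an explicit loop.
unique_counts = demo.nunique()

# Put both checks side by side for a quick overview.
summary = pd.DataFrame({"null_fraction": null_fraction, "n_unique": unique_counts})
print(summary)
```

On the notebook's own DataFrame the same two calls would be `store.isna().mean()` and `store.nunique()`.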
    # numeric_only=True keeps the call working on pandas versions that no longer drop object columns silently
    sns.heatmap(store.corr(numeric_only=True), annot=True, cmap="ocean_r")

[Figure: correlation heatmap of the numeric columns (age, quantity, price); the off-diagonal values that remain legible are 0.00067 and 0.0017]

    store.hist(figsize=(15, 3))

[Figure: histograms of the numeric columns, including age and quantity]

Box plot and violin plot

* For the numerical columns, draw the box plot and the violin plot using a for loop

    # To visualize the box plot
    for i in store.select_dtypes(include="int"):
        plt.figure(figsize=(16, 3))
        sns.boxplot(x=store[i], data=store, color=color[4])
        plt.xticks(rotation=90)
        plt.show()

[Figure: one box plot per integer column]

    # To visualize the violin plot
    for i in store.select_dtypes(include="int"):
        plt.figure(figsize=(16, 3))
        sns.violinplot(x=store[i], data=store, color=color[4], palette="husl")
        plt.xticks(rotation=90)
        plt.show()

[Figure: one violin plot per integer column]

Count the categories and plot them with matplotlib to visualize the category column in the dataset

    ax = store['category'].value_counts()\
        .sort_index()\
        .plot(kind='bar', title="category visualization", figsize=(16, 7), color=color[5], lw=5, edg…
    ax.set_xlabel("Category")
    ax.set_ylabel("Count of the Value")

    Text(0, 0.5, 'Count of the Value')

[Figure: bar chart titled "category visualization"]

    # Let's visualize the gender in the dataset
    store['gender'].value_counts()\
        .plot(kind='pie', labels=store["gender"].value_counts().index, autopct='%1.1f%%', colors=…

[Figure: pie chart titled "Gender percentage in the data" with Female and Male slices]

Questions asked about the data

* Count each age by gender and apply a background gradient style
* Create a distribution plot of the price with gender as the hue

    # To calculate the age counts by gender
    store.groupby('age')['gender'].value_counts()\
        .sort_index()\
        .unstack()\
        .style.background_gradient(axis=0, cmap="YlOrRd")

[Table: one row per age with Female and Male counts, shaded with the YlOrRd gradient; the cell values are not recoverable from the source]

    # Let's plot the distribution of the price along with gender
    plt.subplots(1, 2, figsize=(25, 6))
    plt.subplot(121)
    sns.histplot(data=store, bins=30, x='price', kde=True)
    plt.subplot(122)
    sns.histplot(data=store, x='price', kde=True, hue="gender")
    plt.show()

[Figure: price histograms with KDE, overall (left) and split by gender (right)]

Create a function

* Create a function that builds a new column converting the numerical age into a category under the given conditions:
1. If age >= 65, return "old age"
2. Else if age is at or above the middle-age threshold (the value is missing in the source), return "Middle aged"
3. Otherwise return adult age
* Then we visualize the shopping mall in the data

    # Write a function to convert the numerical ages to age categories
    def ages(a):
        if a >= 65:
            return "Senior aged person"
        elif a >= 4?:  # second digit unreadable in the source
            return "Middle aged person"
        return "Adult age"

    store['ages_cate'] = store['age'].apply(ages)
    store.head()

      invoice_no customer_id  gender  age  category  quantity    price payment_method invoice_date
    0    I138884     C241288  Female   28  Clothing         5  1500.40    Credit Card     5/8/2022
    1    I317333     C111565    Male   21     Shoes         3  1800.51     Debit Card   12/12/2021
    2    I127801     C266599    Male   20  Clothing         1   300.08           Cash    9/11/2021
    3    I173702     C988172  Female   66     Shoes         5  3000.85    Credit Card   16/05/2021
    4    I337046     C189076  Female   53     Books         4    60.60           Cash   24/10/2021

    # Visualize the shopping_mall
    shoping = store['shopping_mall'].value_counts()\
        .sort_index()\
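Returning to the age-category function described in this section, the thresholds can be kept as named parameters so the rule is easy to change. This is only a sketch: the 65 cutoff comes from the text, while `MIDDLE_MIN = 45` is a hypothetical placeholder because the original value is unreadable, and `age_bucket` is an illustrative name rather than the notebook's own function.

```python
from bisect import bisect_right

SENIOR_MIN = 65  # stated in the text: age >= 65
MIDDLE_MIN = 45  # hypothetical placeholder; the source threshold is unreadable

LABELS = ["Adult age", "Middle aged person", "Senior aged person"]


def age_bucket(age, middle_min=MIDDLE_MIN, senior_min=SENIOR_MIN):
    """Return the age category label for a single age."""
    # bisect_right counts how many lower bounds are <= age,
    # which is exactly the index of the matching label.
    return LABELS[bisect_right([middle_min, senior_min], age)]


# Ages taken from the first rows shown by store.head().
print([age_bucket(a) for a in (28, 53, 66)])
# → ['Adult age', 'Middle aged person', 'Senior aged person']
```

With the notebook's DataFrame, this would be applied the same way as the original function, for example `store['age'].apply(age_bucket)`.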
