Q1 VIDEO GAMES SALES

The video game sales data set (https://www.kaggle.com/gregorut/videogamesales) consists of
rank, genre, publisher, and global sales amount (in millions) worldwide. Using Python,
create a pie chart and find out which genre accounts for the largest portion of global
video game sales. This would help in understanding the potential demand for a video game
that the company plans to publish.
# Import the libraries
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset
df = pd.read_csv('/content/sample_data/vgsales.csv')

# Sum global sales per genre and plot each genre's share as a pie chart
df.groupby('Genre')['Global_Sales'].sum().plot(
    kind='pie', autopct='%1.0f%%', figsize=(20, 15))
plt.show()
The Action genre accounts for the largest portion of global video game sales, about 20%.
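The share read off the pie chart can also be computed directly. The following is a minimal sketch on a made-up miniature table, not the actual vgsales.csv data; the names sales, genre_totals, genre_share and top_genre are introduced here only for illustration.

```python
import pandas as pd

# Hypothetical miniature stand-in for vgsales.csv (values are made up)
sales = pd.DataFrame({
    "Genre": ["Action", "Sports", "Action", "Puzzle", "Sports"],
    "Global_Sales": [4.0, 3.0, 2.0, 1.0, 0.0],
})

# Total sales per genre, then each genre's percentage of the grand total
genre_totals = sales.groupby("Genre")["Global_Sales"].sum()
genre_share = (genre_totals / genre_totals.sum() * 100).sort_values(ascending=False)

# The first entry after sorting is the genre with the largest share
top_genre = genre_share.index[0]
print(genre_share.round(1))
print("Largest share:", top_genre)
```

Run against the real data frame, the same two lines would give the exact percentage behind each pie slice instead of the rounded label.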

Q2 BIKE SHARING DEMAND


Rental bikes have gained more popularity these days due to their flexibility, mobility, low
cost, and health benefits in urban cities. Understanding the demand for rental bikes is
essential to estimate rental bikes’ availability over various time periods and locations.
Demand information can also help to allocate the optimal amount of supply of rental bikes.
These factors could be critically related to customer satisfaction and traffic.
Create the following visualizations using Python:
The line chart of rented bike count over the months.
The line chart of temperature over the months.
The pie chart of rented bike count by seasons.
The stacked bar chart of rented bike count by the holiday.
The scatter plot of rented bike count and temperature.
The scatter plot of rented bike count and rainfall.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('/content/sample_data/SeoulBikeData - SeoulBikeData.csv')

# The month is the second '/'-separated field of the Date string
df['Month'] = df['Date'].str.split('/').str[1].astype(int)

# Total rented bike count per month
df2 = df.groupby('Month')['Rented Bike Count'].sum().reset_index()
sns.lineplot(data=df2, x="Month", y="Rented Bike Count")
plt.show()

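# Mean absolute difference in rented bike count between every pair of months
RBC = df2['Rented Bike Count']
total = 0
n_pairs = 0
for i in range(len(RBC)):
    for j in range(i + 1, len(RBC)):
        total += abs(RBC.iloc[i] - RBC.iloc[j])
        n_pairs += 1
total / n_pairs  # 12 months give 66 pairs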

291843.8484848485
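The divisor 66 above is the number of unordered month pairs, 12 * 11 / 2. A minimal sketch of the same mean absolute pairwise difference using itertools.combinations follows; the monthly totals are made-up numbers, and the names monthly_totals, pairs and mean_abs_diff are illustrative only.

```python
from itertools import combinations

# Hypothetical monthly totals (made-up numbers, not the Seoul data)
monthly_totals = [100, 250, 400, 150]

# Every unordered pair of values; n values give n * (n - 1) / 2 pairs
pairs = list(combinations(monthly_totals, 2))

# Average of the absolute differences over all pairs
mean_abs_diff = sum(abs(a - b) for a, b in pairs) / len(pairs)

print(len(pairs), mean_abs_diff)
```

Dividing by len(pairs) instead of a hard-coded constant keeps the calculation correct if some months are missing from the data.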
# Mean temperature per month. The column name is kept exactly as it appears in
# this CSV; its odd character is likely a mis-encoded degree sign.
df2 = df.groupby('Month')['Temperature(蚓)'].mean().reset_index()
sns.lineplot(data=df2, x="Month", y="Temperature(蚓)")
plt.ylabel("Temperature")  # plain label avoids the missing-glyph font warning
plt.show()

# Share of the total rented bike count by season
df.groupby('Seasons')['Rented Bike Count'].sum().plot(
    kind='pie', autopct='%1.0f%%', figsize=(7, 7))
plt.show()
# Average rented bike count on holidays and on non-holidays
df2 = df.groupby('Holiday')['Rented Bike Count'].mean()
# Groups are sorted alphabetically: 'Holiday' first, then 'No Holiday'
plt.bar(["Holiday Type"], df2.iloc[0], color='r')
plt.bar(["Holiday Type"], df2.iloc[1], bottom=df2.iloc[0], color='b')
plt.ylabel("Rented Bike Count")
plt.legend(["Holiday", "No Holiday"])
plt.title("Stacked bar chart of rented bike count by the holiday")
plt.show()
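# Rented bike count (x-axis) against temperature (y-axis)
plt.scatter(df['Rented Bike Count'], df['Temperature(蚓)'], s=5)
plt.xlabel("Rented Bike Count")
plt.ylabel("Temperature")
plt.show()

# Rented bike count (x-axis) against rainfall (y-axis)
plt.scatter(df['Rented Bike Count'], df['Rainfall(mm)'], s=5)
plt.xlabel("Rented Bike Count")
plt.ylabel("Rainfall (mm)")
plt.show()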
Answer the following questions:
1) What is the trend of rented bike count over the months?
The rented bike count increases gradually over the first months of the year and then decreases.
2) Which seasons show greater demand for rented bikes than other seasons?
Summer.
3) Is the demand for rented bikes affected by the holiday season?
Yes. The demand for rented bikes is higher on non-holidays than on holidays.
4) Is there a relationship between the rented bike demand and temperature?
No, there is no relationship between the rented bike demand and temperature.
5) Is there a relationship between the rented bike demand and the amount of rainfall?
No, there is no relationship between the rented bike demand and rainfall.
• Does the demand for bicycle rental vary by season?
Yes, summer has the highest demand for bicycle rental.
• What is the difference between the monthly demand for bike rental?
The mean absolute difference between the monthly totals is about 291,844 rentals
(291843.8484848485).
• Is the demand for bike rental affected by holidays?
Yes. The demand for rented bikes is higher on non-holidays than on holidays.
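Questions 4 and 5 are judged from the scatter plots by eye. A correlation coefficient is one way to put a number on such a relationship. The sketch below uses made-up observations rather than the Seoul data, and the names obs, r_temp and r_rain are illustrative only.

```python
import pandas as pd

# Hypothetical observations (made-up numbers, not the Seoul data)
obs = pd.DataFrame({
    "Rented Bike Count": [120, 180, 410, 590, 800],
    "Temperature": [-5.0, 0.0, 10.0, 20.0, 30.0],
    "Rainfall(mm)": [0.0, 2.0, 0.0, 5.0, 0.0],
})

# Pearson correlation: +1 or -1 is a perfect linear relationship, 0 is none
r_temp = obs["Rented Bike Count"].corr(obs["Temperature"])
r_rain = obs["Rented Bike Count"].corr(obs["Rainfall(mm)"])

print(f"count vs temperature: {r_temp:.2f}")
print(f"count vs rainfall:    {r_rain:.2f}")
```

Pearson correlation only measures linear association, so a value near zero does not by itself rule out a non-linear relationship.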

Q3 REAL TIME VOICE CALL QUALITY DATA FROM CUSTOMERS


Since 2000, mobile phones have spread rapidly, and since 2010, many people have used data
communication through smartphones. Voice communication is the most basic service in the
mobile communication business, and understanding the quality and performance of
voice calls is critical to ensuring great customer experiences. Bad call experiences lead to
frustrated customers and lost customer relationships, and have a real financial impact on
businesses. However, measuring call quality has not been easy for mobile carriers because
it depends heavily on users’ subjective factors. Thus, mobile communication companies have
used customer surveys to check call quality and to monitor and trace call
performance, improving service quality based on the survey data. The data set in this case
captures customer feedback collected through the MyCall app developed by TRAI (Telecom
Regulatory Authority of India), a statutory body set up by the Government of India.
The data is captured for various service providers in India at multiple locations and includes
network types (2G, 3G, 4G), ratings, coordinates, etc. Customers rate their experience with
voice call quality in real time, which helps TRAI gather customer experience data along with
network data.
Create data visualizations using Python for:
Vertical bar chart of average call quality rate per operator.
Vertical bar chart of the quality level for each state in India.
Vertical bar chart showing the relationship between the call quality and the network type.
Horizontal bar chart of average call quality rate per Call Drop Category.
Heat map of rating by state and Network Type.
Vertical bar chart of average call quality rate per Indoor_Outdoor_Travelling.
import pandas as pd
import matplotlib.pyplot as plt

df1=pd.read_csv('/content/sample_data/CallVoiceQualityExperience-2018-April.csv')
df2=pd.read_csv('/content/sample_data/CallVoiceQuality_Data_2018_May.csv')

df=pd.concat([df1,df2])

df
       Operator  Indoor_Outdoor_Travelling  Network Type  Rating  \
0        Airtel                     Indoor            3G       5
1          RJio                     Indoor            4G       4
2        Airtel                    Outdoor            3G       5
3        Airtel                 Travelling            3G       5
4          RJio                     Indoor            4G       5
...         ...                        ...           ...     ...
31976      RJio                        NaN            4G       4
31977    Airtel                        NaN            4G       5
31978      RJio                        NaN            4G       5
31979      RJio                        NaN            4G       5
31980      BSNL                        NaN       Unknown       5

       Call Drop Category   Latitude  Longitude   State Name  In Out Travelling
0            Satisfactory  28.422966  76.912324      Haryana                NaN
1            Satisfactory  11.158358  77.301897   Tamil Nadu                NaN
2            Satisfactory  28.422931  76.912253      Haryana                NaN
3            Satisfactory  28.422947  76.912260      Haryana                NaN
4            Satisfactory  25.625990  85.094294        Bihar                NaN
...                   ...        ...        ...          ...                ...
31976        Satisfactory  20.979739  75.580521  Maharashtra             Indoor
31977        Satisfactory  17.438340  78.382000    Telangana             Indoor
31978        Satisfactory  -1.000000  -1.000000          NaN             Indoor
31979        Satisfactory  28.533182  77.216453          NCT             Indoor
31980        Satisfactory  17.406222  78.438804    Telangana             Indoor

[95317 rows x 9 columns]
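The preview above has nine columns, with Indoor_Outdoor_Travelling filled only in the first rows and In Out Travelling filled only in the last rows. This suggests that the two monthly files name the same field differently, so pd.concat keeps them as separate, half-empty columns. If that is the case, renaming one column before concatenating would merge them. The sketch below uses made-up rows; the frames april and may and the assumption about which file uses which name are illustrative only.

```python
import pandas as pd

# Hypothetical miniature stand-ins for the two monthly files (made-up rows)
april = pd.DataFrame({
    "Operator": ["Airtel", "RJio"],
    "Indoor_Outdoor_Travelling": ["Indoor", "Outdoor"],
    "Rating": [5, 4],
})
may = pd.DataFrame({
    "Operator": ["BSNL"],
    "In Out Travelling": ["Indoor"],
    "Rating": [3],
})

# Give both frames the same column name so concat stacks them into one column
may = may.rename(columns={"In Out Travelling": "Indoor_Outdoor_Travelling"})
combined = pd.concat([april, may], ignore_index=True)

print(combined.columns.tolist())
print(combined["Indoor_Outdoor_Travelling"].isna().sum())
```

Without the rename, any chart grouped by Indoor_Outdoor_Travelling would silently use only the rows from the file that has that column.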

df3=df.groupby('Operator')['Rating'].mean()
df3=df3.reset_index()
df3

df3.plot.bar(x="Operator", y="Rating", rot=70,
             title="Average call quality rate per operator")
plt.show()
df4=df.groupby('State Name')['Rating'].mean()
df4=df4.reset_index()

f, ax = plt.subplots(figsize=(28,7))
plt.bar(df4['State Name'],df4['Rating'])
plt.xticks( df4['State Name'], df4['State Name'],rotation='vertical')
plt.xlabel("States",fontsize=20)
plt.ylabel("Rating",fontsize=20)
plt.show()
df5=df.groupby('Network Type')['Rating'].mean()
df5=df5.reset_index()

f, ax = plt.subplots(figsize=(7,7))
plt.bar(df5['Network Type'],df5['Rating'])
plt.xticks(df5['Network Type'], df5['Network Type'], rotation='vertical')
plt.xlabel("Network Type",fontsize=20)
plt.ylabel("Rating",fontsize=20)
plt.show()

df6=df.groupby('Call Drop Category')['Rating'].mean()
df6=df6.reset_index()
f, ax = plt.subplots(figsize=(7,7))
plt.barh(df6['Call Drop Category'],df6['Rating'])
plt.yticks(df6['Call Drop Category'], df6['Call Drop Category'], rotation='horizontal')
plt.ylabel("Call Drop Category",fontsize=15)
plt.xlabel("Rating",fontsize=15)
plt.show()

df8=df[['Rating','State Name','Network Type']]


df8=df8.groupby(['State Name','Network Type'])['Rating'].mean()
df8=df8.reset_index()
df8

                      State Name  Network Type    Rating
0                     Adis Abeba            3G  4.000000
1                     Adis Abeba            4G  3.166667
2                     Adis Abeba       Unknown  2.000000
3    Andaman and Nicobar Islands       Unknown  1.000000
4                 Andhra Pradesh            2G  2.506329
..                           ...           ...       ...
127                  Uttarakhand       Unknown  3.651341
128                  West Bengal            2G  3.542373
129                  West Bengal            3G  3.687309
130                  West Bengal            4G  3.613048
131                  West Bengal       Unknown  3.541707

[132 rows x 3 columns]

df9=df8.pivot(values='Rating',index='State Name',columns='Network Type')
df9

Network Type                       2G        3G        4G   Unknown
State Name
Adis Abeba                        NaN  4.000000  3.166667  2.000000
Andaman and Nicobar Islands       NaN       NaN       NaN  1.000000
Andhra Pradesh               2.506329  2.480427  3.627483  3.509749
Arunachal Pradesh                 NaN       NaN  2.750000  1.000000
Assam                        4.707627  4.527897  4.230769  4.495652
Bihar                        3.957746  3.293706  2.818471  3.892351
Central Region                    NaN       NaN       NaN  2.875000
Chandigarh                        NaN  3.466667  4.849673  4.200000
Chhattisgarh                 2.888889  4.005952  3.390374  4.119266
Chhukha                           NaN  2.666667  2.500000  3.000000
Dadra and Nagar Haveli       3.000000  3.200000  3.000000  4.142857
Eastern Region                    NaN  1.000000  2.636364  3.500000
Goa                          3.375000  3.127660  2.947761  2.888889
Gujarat                      3.425532  3.183147  3.421384  3.615234
Haryana                      2.265306  3.293872  3.350866  3.475921
Himachal Pradesh             3.689655  3.621622  4.506912  4.307143
Jharkhand                    3.040000  3.129252  3.887430  3.081481
Karnataka                    3.012121  3.234538  3.724341  3.896007
Kashmir                      4.600000  4.875000  4.277778  4.220779
Kerala                       2.075534  3.136364  3.085791  3.153477
Madhya Pradesh               2.666667  2.870801  3.401747  3.839181
Maharashtra                  2.131944  3.373690  3.492145  3.492904
Manipur                           NaN  1.000000       NaN       NaN
Meghalaya                         NaN       NaN  1.500000  1.000000
NCT                          3.703704  3.164278  3.138142  2.991091
Nagaland                          NaN       NaN  3.625000  3.500000
New York                          NaN       NaN       NaN  3.000000
Odisha                       2.903846  3.658824  3.949609  3.390041
Pondicherry                  1.600000  2.428571  3.750000  2.500000
Punjab                       2.791667  3.634043  4.413462  4.338983
Rajasthan                    2.273846  2.085809  3.019257  3.436702
Samchi                            NaN       NaN  5.000000       NaN
Samdrup Jongkhar             5.000000  1.500000       NaN  2.750000
Sikkim                       3.250000  1.000000       NaN  4.000000
Tamil Nadu                   3.147826  2.994455  3.939166  4.013100
Telangana                    3.128205  3.568106  3.426002  4.118526
Tripura                      5.000000  5.000000  5.000000       NaN
Uttar Pradesh                3.097561  3.374741  3.832611  3.664495
Uttarakhand                  2.900000  4.205882  3.603448  3.651341
West Bengal                  3.542373  3.687309  3.613048  3.541707
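Several cells in the table above hold round values such as 1.000000 or 5.000000, and a few state names (for example New York) fall outside India, which may indicate averages over very few ratings. One way to check is to compute the number of ratings next to each mean and keep only well-supported cells. The sketch below uses made-up rows; the names ratings, cell_stats, min_count and mean_table, and the threshold of 3, are illustrative assumptions.

```python
import pandas as pd

# Hypothetical miniature stand-in for the ratings data (made-up rows)
ratings = pd.DataFrame({
    "State Name": ["Goa", "Goa", "Goa", "Kerala", "Kerala", "Sikkim"],
    "Network Type": ["4G", "4G", "4G", "4G", "3G", "3G"],
    "Rating": [3, 4, 5, 2, 4, 1],
})

# Mean rating and number of ratings for each state/network combination
cell_stats = (
    ratings.groupby(["State Name", "Network Type"])["Rating"]
    .agg(["mean", "count"])
    .reset_index()
)

# Keep only cells backed by at least min_count ratings (threshold is arbitrary)
min_count = 3
reliable = cell_stats[cell_stats["count"] >= min_count]
mean_table = reliable.pivot(index="State Name", columns="Network Type", values="mean")
print(mean_table)
```

A table filtered this way can be passed to a heat map in place of the unfiltered pivot, so that single-rating cells do not dominate the color scale.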

import seaborn as sns

# Heat map of the mean rating for each state (rows) and network type (columns)
fig, ax = plt.subplots(figsize=(20,15))
sns.heatmap(df9, annot=True, linewidths=.7, ax=ax)
plt.show()

df7=df.groupby('Indoor_Outdoor_Travelling')['Rating'].mean()
df7=df7.reset_index()

f, ax = plt.subplots(figsize=(7,7))
plt.bar(df7['Indoor_Outdoor_Travelling'],df7['Rating'])
plt.xticks( df7['Indoor_Outdoor_Travelling'],
df7['Indoor_Outdoor_Travelling'],rotation='vertical')
plt.ylabel("Call Quality",fontsize=15)
plt.show()
poor_calls=len(df[(df['Call Drop Category']=='Poor Voice Quality')])
total_calls=len(df)
freq_poor_quality_calls=poor_calls/total_calls
freq_poor_quality_calls

0.08036341890743519
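The frequency above is the share of rows whose Call Drop Category equals 'Poor Voice Quality'. A minimal sketch of two equivalent pandas idioms for the same calculation follows; the rows are made up, the category labels other than 'Poor Voice Quality' are assumptions, and the names calls, category_share and poor_share are illustrative only.

```python
import pandas as pd

# Hypothetical miniature stand-in for the feedback data (made-up rows)
calls = pd.DataFrame({
    "Call Drop Category": [
        "Satisfactory", "Poor Voice Quality", "Call Dropped",
        "Satisfactory", "Satisfactory",
    ],
})

# Share of every category among all feedback rows
category_share = calls["Call Drop Category"].value_counts(normalize=True)

# Share of a single category: the mean of a boolean mask
poor_share = (calls["Call Drop Category"] == "Poor Voice Quality").mean()

print(category_share)
print(poor_share)
```

The value_counts form has the advantage of showing the shares of all categories at once, which puts the poor-quality figure in context.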

Answer the following questions:

1. What is the level of overall call quality (by operator, by network)?
2. Which operators provide low-value services?
3. What is the frequency of poor quality calls?
4. Based on the analysis results, what are the suggestions to improve
call quality?
Answers:
1. The average call quality rating is close to 3 for every operator except RComm. The
average call quality rating is above 2.5 for every network type.

2. The RComm and Telenor operators provide low-value services.

3. The frequency of poor quality calls is 0.08036341890743519, i.e. about 8% of all
feedback records.

4. Call quality is rated higher when travelling, on the 4G network, and with the Vodafone
and Tata operators.
