Data Analytics: Submitted by





Submitted to DR. QUANG NGUYEN



Introduction

Descriptive analysis

Regression analysis

Managerial interpretation and implications

Conclusion


Introduction

This report analyses catalog data on people from different states and cities in the
USA, with a focus on their salaries. The data set covers 1000 respondents and relates
to Hytex, a textile manufacturing company. The analysis was carried out in Microsoft
Excel. The data were collected in the past and cover home ownership, gender, marital
status, number of children and the amount spent.
The report addresses the following questions:
1. Which gender spends the most money?
2. Is there any correlation between amount spent and owned home?
3. Is there any negative correlation between married people and owned home?
4. Is there any correlation between children and owned home?

1. Descriptive Analysis
Descriptive analysis summarises the features of a sample drawn from a population
through a small set of measures. The most widely used measures are the mean, median
and mode. Together they give quantitative insight into a large data set. This section
reports measurable insights into the age, gender and catalog groups of the data set,
and then relates each of them to salary.

1.1 Age
The age analysis uses the given data on 1000 respondents. The graph below was built
with the COUNTIF formula, and the counts were converted into percentages with Excel
division so that they are easier to read. The graph shows the percentage of young,
middle-aged and elderly respondents. Middle-aged respondents make up 50% of the
sample, whereas elderly respondents form the smallest group at 20%.
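
The COUNTIF-and-divide step described above can also be sketched outside Excel. The following is a minimal Python sketch under stated assumptions, not the report's workbook: the function name `count_and_share` and the ten-row `age_codes` list are hypothetical, chosen only to mirror the reported 50% and 20% shares. Only the coding of the age groups comes from the report.

```python
from collections import Counter

def count_and_share(codes):
    """Return {code: (count, share)} for a list of category codes.

    Mirrors =COUNTIF(range, code) followed by count / total.
    """
    counts = Counter(codes)
    total = len(codes)
    return {code: (n, n / total) for code, n in sorted(counts.items())}

# Hypothetical ten-row sample, NOT the report's 1000 respondents.
age_codes = [1, 2, 2, 3, 2, 1, 2, 3, 2, 1]
labels = {1: "30 or younger", 2: "31 to 55", 3: "56 and older"}

age_table = count_and_share(age_codes)
for code, (n, share) in age_table.items():
    print(f"{labels[code]:<14} count={n:>2}  share={share:.1%}")
```

The same function applies to the gender column (0 = female, 1 = male), since the Excel approach there is identical.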

[Chart: Respondent Age. Count and percentage of respondents for 1 = 30 or younger, 2 = 31 to 55, 3 = 56 and older.]

Age     Assumption          Count formula
1       1 = 30 or younger   =COUNTIF(B:B,1)
2       2 = 31 to 55        =COUNTIF(B:B,2)
3       3 = 56 and older    =COUNTIF(B:B,3)
Total                       =SUM(AC4,AC5,AC6)

Gender  Assumption   Count formula     Percentage formula
0       0 = female   =COUNTIF(C:C,0)   =T4/T6
1       1 = male     =COUNTIF(C:C,1)   =T5/T6
Total                =SUM(T4,T5)       1

1.2 Gender
The gender data summarise the split between male and female respondents among the
1000 people in the data set. The graph below, built with the Excel COUNTIF formula,
shows that female respondents outnumber male respondents: 50.6% of the sample is
female and 49.4% is male. The difference is small, but the sample leans slightly
female.

[Chart: Respondent Gender. Count and percentage of respondents for 0 = female and 1 = male.]

The graph is computed by converting the gender counts into percentages.

1.3 Catalog

Analysis with the Excel COUNTIF function shows how the respondents are spread across the
catalog groups. Catalog 12 has the largest share, 28.2% of the data, and catalogs 18 and 24
share the lowest percentage, 23.3% each. Catalog 6 accounts for 25.2% of the data.

[Chart: Respondents catalog. Count and percentage of respondents for catalogs 6, 12, 18 and 24.]

Catalog   No. of catalogs given   Count   Percentage
1         6                       252     25.2%
2         12                      282     28.2%
3         18                      233     23.3%
4         24                      233     23.3%
Total                             1000    100%
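
The percentages in the catalog table follow directly from the counts. As a quick check, the short Python sketch below recomputes them from the counts reported in the table. The variable names are illustrative assumptions.

```python
# Counts per number of catalogs given, taken from the catalog table.
catalog_counts = {6: 252, 12: 282, 18: 233, 24: 233}

total = sum(catalog_counts.values())
shares = {k: n / total for k, n in catalog_counts.items()}

for k, n in catalog_counts.items():
    print(f"catalog {k:>2}: count={n}  share={shares[k]:.1%}")
print(f"total: {total}")

# Group with the largest share of respondents.
largest = max(shares, key=shares.get)
```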


The salary analysis was carried out with Microsoft Excel pivot tables, as shown below. It
reports the average salary of the respondents by age, then by gender, and finally by the
number of catalogs. The graph shows that middle-aged respondents have the highest pay,
with an average salary of $72036.41, compared with $56365.85 for elderly respondents.
Young respondents are the least paid, with an average salary of $27715.67.
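
A pivot table that averages salary by a category is a group-by followed by a mean. The Python sketch below illustrates the idea under stated assumptions: the rows, the column names `age`, `gender` and `salary`, and the helper `average_by` are all hypothetical and do not reproduce the report's data.

```python
from collections import defaultdict
from statistics import mean

def average_by(rows, key, value):
    """Average `value` within each group of `key`, like a pivot table."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {k: mean(v) for k, v in sorted(groups.items())}

# Hypothetical rows for illustration only (age and gender use the report's coding).
rows = [
    {"age": 1, "gender": 0, "salary": 25000},
    {"age": 1, "gender": 1, "salary": 31000},
    {"age": 2, "gender": 0, "salary": 64000},
    {"age": 2, "gender": 1, "salary": 80000},
    {"age": 3, "gender": 0, "salary": 52000},
    {"age": 3, "gender": 1, "salary": 60000},
]

avg_by_age = average_by(rows, "age", "salary")
avg_by_gender = average_by(rows, "gender", "salary")
print(avg_by_age)
print(avg_by_gender)
```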

Fig 2.1 illustration of average salary by age

Fig 2.2 shows the average salary by gender. The graph shows that male respondents in the
USA have the higher pay, with an average salary of $64202.42, whereas female respondents
earn an average salary of $48197.43.

Fig 2.2 illustration of average salary by gender

Fig 2.3 shows the average salary by number of catalogs. For catalogs 18 and 24 the average
salaries are $60408.15 and $62268.24 respectively. The lowest average salary, $46892.06,
is for catalog 6.

Fig 2.3 illustration of average salary by catalog

Fig 2.4 shows a histogram of the respondents' salaries. It was produced with the Histogram tool in the
Data Analysis toolbox of Microsoft Excel. The figure shows that salaries are concentrated toward the left
(lower) end of the range, and that most respondents earn between $30000 and $50000.

[Histogram: frequency and cumulative % of respondents' salaries, with bins from 30000 to 190000 in steps of 20000, plus More.]
Fig 2.4 illustration of respondents' salaries using a histogram
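
The frequency and cumulative percentage series of such a histogram can be sketched in a few lines. The Python example below is a hedged illustration, not the Excel tool itself. It assumes the usual convention that a value is counted in the first bin whose upper edge is greater than or equal to it, with everything above the last edge going to "More". The `salaries` list and the `histogram` helper are hypothetical. Only the bin edges (30000 to 190000 in steps of 20000) follow the figure.

```python
from bisect import bisect_left

def histogram(values, bin_edges):
    """Return (counts, cumulative shares) for ascending upper bin edges.

    A value v is counted in the first bin whose upper edge is >= v.
    The final extra slot plays the role of the "More" bucket.
    """
    counts = [0] * (len(bin_edges) + 1)
    for v in values:
        counts[bisect_left(bin_edges, v)] += 1
    total = len(values)
    cumulative, running = [], 0
    for c in counts:
        running += c
        cumulative.append(running / total)
    return counts, cumulative

# Hypothetical salaries for illustration only.
salaries = [18000, 27000, 30000, 42000, 48000,
            55000, 69000, 88000, 120000, 200000]
bin_edges = list(range(30000, 190001, 20000))  # 30000, 50000, ..., 190000

counts, cumulative = histogram(salaries, bin_edges)
for edge, c, cum in zip(bin_edges + ["More"], counts, cumulative):
    print(f"{edge!s:>6}  frequency={c}  cumulative={cum:.1%}")
```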

3. Regression analysis

Regression analysis is a predictive technique that estimates the relationship between a
dependent variable and one or more independent variables. It is mainly used for forecasting,
for time series, and for examining cause-and-effect relationships between variables. Unlike
descriptive statistics, it supports conclusions that go beyond the observed data and can be
used to predict future values.
The following regression analyses were carried out to answer the research questions.
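
To make the slope, intercept and R squared of a line fit plot concrete, here is a minimal least-squares sketch in Python. It is an illustration under stated assumptions, not the Excel regression tool. The helper `fit_line` and the eight data points are hypothetical, with x standing for a 0/1 predictor such as gender and y for a 0/1 owned-home flag.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_resid / ss_total

# Hypothetical data for illustration only.
xs = [0, 0, 0, 0, 1, 1, 1, 1]
ys = [0, 0, 1, 1, 0, 1, 1, 1]

slope, intercept, r_squared = fit_line(xs, ys)
print(f"f(x) = {slope:.2f} x + {intercept:.2f}, R^2 = {r_squared:.4f}")
```

With a 0/1 predictor, the intercept is the mean of y in the group coded 0 and the slope is the difference between the two group means. A small R squared means the predictor explains little of the variation in y.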

3.1 Correlation between gender and owned home.

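Fig 3.1 below shows the relationship between gender and home ownership in the USA. With
gender coded 0 = female and 1 = male, the fitted trendline f(x) = 0.08 x + 0.47 gives
female respondents the lower predicted ownership. The trendline's R squared of 0.01 means
that gender explains only about 1% of the variation in home ownership.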

[Chart: Gender Line Fit Plot. Series: Y and Predicted Y, each with a linear trendline. Trendline label: f(x) = 0.08 x + 0.47, R² = 0.01.]

Fig 3.1

3.2 Correlation between amount spent and owned home.

Fig 3.2 shows a positive correlation between the amount spent and home ownership in the
USA. The trendline's R squared of 0.12 means that the amount spent explains about 12% of
the variation in home ownership.

[Chart: X amount spent Line Fit Plot. Series: Y and Predicted Y, each with a linear trendline. Trendline label: f(x) = 0 x + 0.29, R² = 0.12. X axis: amount spent ($).]

Fig 3.2

3.3 Correlation between marital status and owned home.

Fig 3.3 shows a negative correlation between being married and home ownership. The
trendline's R squared of 0.07 means that marital status explains about 7% of the variation
in home ownership.

[Chart: X marital status Line Fit Plot. Series: Y and Predicted Y, each with a linear trendline. Trendline label: f(x) = 0.26 x + 0.38, R² = 0.07. X axis: X marital status.]

Fig 3.3

3.4 Correlation between children and owned home.

Fig 3.4 shows a weak negative correlation between the number of children and home
ownership. The trendline's R squared rounds to 0, so the number of children explains
almost none of the variation in home ownership.

[Chart: X children Line Fit Plot. Series: Y and Predicted Y, each with a linear trendline. Trendline label: f(x) = − 0.02 x + 0.53, R² = 0. X axis: X children.]

Fig 3.4


Regression Statistics
Multiple R          0.36942807
R Square            0.1364771
Adjusted R Square   0.13300565
Standard Error      0.46555706
Observations        1000

             df    SS           MS           F            Significance F
Regression   4     34.0843363   8.52108407   39.3141606   1.3629E-30
Residual     995   215.659664   0.21674338
Total        999   249.744

               Coefficients   Standard Error   t Stat       P-value      Lower 95%    Upper 95%
Intercept      0.24465719     0.03225959       7.58401441   7.6708E-14   0.18135255   0.30796183
X Variable 1   0.01391569     0.03014079       0.46168972   0.64440476   -0.0452311   0.07306251
X Variable 2   0.00015532     1.8231E-05       8.51940702   5.87E-17     0.00011954   0.0001911
X Variable 3   0.11992313     0.03380421       3.54758011   0.00040693   0.05358741   0.18625885
X Variable 4   0.01635942     0.01453879       1.12522599   0.26076463   -0.0121708   0.04488963
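
The summary figures in the regression output above are linked by standard identities: R Square is the regression sum of squares divided by the total sum of squares, F is the regression mean square divided by the residual mean square, and each t Stat is a coefficient divided by its standard error. The Python sketch below recomputes them from the reported values as a consistency check. The variable names are illustrative, and small differences in the last digits come from rounding in the printed inputs.

```python
# Values copied from the regression output above.
ss_regression, ss_total = 34.0843363, 249.744
ms_regression, ms_residual = 8.52108407, 0.21674338
n_obs, n_predictors = 1000, 4

r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)
f_stat = ms_regression / ms_residual

# (coefficient, standard error) per term, as printed in the output.
terms = {
    "Intercept":    (0.24465719, 0.03225959),
    "X Variable 1": (0.01391569, 0.03014079),
    "X Variable 2": (0.00015532, 1.8231e-05),
    "X Variable 3": (0.11992313, 0.03380421),
    "X Variable 4": (0.01635942, 0.01453879),
}
t_stats = {name: coef / se for name, (coef, se) in terms.items()}

print(f"R Square          = {r_squared:.7f}")
print(f"Adjusted R Square = {adj_r_squared:.7f}")
print(f"F                 = {f_stat:.4f}")
for name, t in t_stats.items():
    print(f"{name:<13} t Stat = {t:.4f}")
```

An R Square of about 0.136 means the four predictors together explain roughly 14% of the variation in home ownership.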

Managerial interpretation and implications

The findings of this report point to some differences between the given data and
the current situation in the USA. The analysis finds a negative relationship
between being female and home ownership. In contrast, Yannis Ioannides and Stuart
(2020) discuss studies of ownership that include gender through a variable denoting
female-headed households, and argue that the relationship is positive. Joseph
Gyourko and Peter Linneman (1996) analysed home ownership among married couples in
the USA and found that it is positive for married people, whereas the findings here
show a negative correlation between marital status and home ownership. Children are
not a driver of home ownership in the USA, and the negative relationship found here
is consistent with the current situation: the age of first-time home buyers in the
USA is 32 years, which is well beyond childhood.

Only the relationship between amount spent and home ownership matches the existing
situation, in that the two are positively related.
However, it is important to note that no link was expected between the catalog data
and average salaries, yet the respondents with catalog 24 have the highest average
salary. Age is related to salary: in the findings, middle-aged respondents earn an
average of $72036.41, more than the $56365.85 earned by elderly respondents, whereas
published US pay data show that older people earn more (What’s The Average Salary By
Age In The US? |, 2020). The data are consistent with the gender pay gap that exists
in some sectors in the USA, where salaries differ by gender: the graph and data above
show that men have a higher average salary than women.


Conclusion

The report has analysed how catalog, gender, age and salary differ among respondents in the
USA. The findings indicate that the amount spent is positively related to home ownership.
Gender has a negative relationship with home ownership, which mainly concerns women. A
negative correlation is also found for marital status and for the number of children. It can
be concluded that the USA could promote home ownership by encouraging adults, especially
women and those with children.


Investopedia. 2020. Descriptive Statistics. [online] Available at: <> [Accessed 10 May 2020].

SurveyGizmo. 2020. What Is Regression Analysis And Why Should I Use It? | Surveygizmo Blog.
[online] Available at: <> [Accessed 10 May 2020].

2020. [online] Available at: <Y7ptA1mjKMA> [Accessed 11 May 2020].

2020. [online] Available at: <> [Accessed 11 May 2020].

