REPORT Data-Science

Subject: MAS202
Group member:

Part I. Introduction & Methodology
a. What is the topic of your group project?
The topic is the salaries of 300 data scientists working in the U.S. in 2023.
b. What are the main issues you plan to address? What questions do you have
about your project?
Our analysis examines salaries in data science, a career in high demand in 2023.
We ask whether mean salaries differ among data analysts, data scientists, and
data engineers. We also want to find out how experience level and company size
affect salary in this field.
c. What do experts think about your research issues? Provide background
information on the research topic with in-text reliable references.
Experts may be interested in this topic because salary is one of the important
factors affecting a student's choice of major and career.
d. Identify the continuous variables (independent and dependent) between which
you would like to find the relationship. Explain why you are choosing these
Salary is a continuous variable and is the dependent variable in this study. The
independent variables are experience level and company size, both of which are
categorical. We chose these two variables because we assume they are related to
salary.
e. Identify the population in your research about which you’ll be making inference.
The population is the salaries of data science professionals across the world in 2023.
f. Identify the samples and the sampling method that you will use to collect the
Our data is a sample of 300 data scientists working in the U.S. in 2023. The
data was collected using a random sampling method.
g. Submit the designed questionnaire which you’ll be using to collect the data if
you use survey in your data collecting step.
Our data is secondary data from Kaggle.
h. Identify the survey errors that might have occurred in your research while
collecting data.
Sampling error is likely to occur because the sample is small compared to the
population.
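
The question in (b), whether mean salary differs across three job titles, is the kind of question a one-way ANOVA is commonly used for. The report does not name a test, so the following is only a minimal sketch of how the F statistic could be computed, assuming ANOVA is the chosen method. The function name one_way_anova_f and the toy salaries are hypothetical and are not taken from the Kaggle data.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up salaries in $1,000/year, for illustration only (NOT from the dataset)
toy = {
    "data analyst": [100, 110, 120],
    "data scientist": [140, 150, 160],
    "data engineer": [130, 140, 150],
}

f_stat, df_b, df_w = one_way_anova_f(list(toy.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the critical value for the given degrees of freedom would suggest that at least one group mean differs. The actual conclusion depends on the real data and on the usual ANOVA assumptions (independent samples, roughly normal groups, similar variances).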

Part II. Descriptive Statistics Results
a. Demographics information
The sample consists of 300 data scientists working in the U.S. in 2023.
b. Descriptive statistics
i. A table of the measures of Central Tendency &
ii. A table with the measures of Variation


Statistic             Value
Mean                  164132.3133
Standard Error        3414.468727
Median                155000
Mode                  145000
Standard Deviation    59140.33315
Sample Variance       3497579005
Kurtosis              0.0470
Skewness              0.4623
Range                 317310
Minimum               25500
Maximum               342810
Sum                   49239694
Count                 300
Table 1: Descriptive statistics of Salary (Unit: $/year)
Key Findings
In the sample, the mean salary is $164,132.31 per year, with a median of $155,000
and a mode of $145,000. Salaries range from $25,500 to $342,810 (a range of
$317,310), with a standard deviation of $59,140.33.
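
The figures in Table 1 can be cross-checked against one another, since the mean, range, variance, and standard error all follow from other rows of the table. The sketch below copies the values from Table 1 and verifies those relationships. The 95% interval at the end is an addition that assumes a normal approximation (multiplier 1.96). It is not a result reported in the table.

```python
import math

# Values copied from Table 1 (salary in $/year)
n = 300
total = 49239694
mean = 164132.3133
std_dev = 59140.33315
std_err = 3414.468727
variance = 3497579005
minimum, maximum, value_range = 25500, 342810, 317310

# Each entry pairs a relationship with whether the table satisfies it
checks = {
    "mean = sum / count": math.isclose(total / n, mean, rel_tol=1e-6),
    "range = max - min": maximum - minimum == value_range,
    "variance = sd^2": math.isclose(std_dev ** 2, variance, rel_tol=1e-6),
    "std error = sd / sqrt(n)": math.isclose(
        std_dev / math.sqrt(n), std_err, rel_tol=1e-6
    ),
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'INCONSISTENT'}")

# Assumption: normal approximation for a 95% interval around the sample mean
margin = 1.96 * std_err
ci_low, ci_high = mean - margin, mean + margin
print(f"Approximate 95% interval for the mean: {ci_low:,.0f} to {ci_high:,.0f}")
```

The maximum used here is $342,810, the value in the Maximum row. The figure 49,239,694 is the Sum row, meaning the total of all 300 salaries.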

iii. The Box-and-Whisker Plot / Histogram or other graphs if necessary

Table 2: Bar chart of count of job_title by experience_level
(Bar chart; x-axis: experience level, y-axis: count.)

Key Findings
In the sample, 12 people are at the executive level, 241 are seniors, 28 are at
mid level, and 19 are at entry level.
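
The experience-level counts are easier to compare as shares of the sample. The following is a small sketch that converts the four counts reported above into percentages. The dictionary keys are shorthand labels chosen here, not column values from the dataset.

```python
# Counts by experience level as reported in the key findings
counts = {"executive": 12, "senior": 241, "mid": 28, "entry": 19}

total = sum(counts.values())
# Share of the sample in each level, in percent
shares = {level: 100 * c / total for level, c in counts.items()}

for level, pct in shares.items():
    print(f"{level:<10}{counts[level]:>5}{pct:>8.1f}%")
print(f"{'total':<10}{total:>5}")
```

The counts sum to 300, matching the sample size, and seniors make up roughly four fifths of the sample.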

Table 3: Bar chart of count of company_size
(Bar chart; x-axis: company size (L, M, S), y-axis: count.)

Key Findings
In the sample, 28 people work in large companies, 270 work in medium companies,
and only 2 work in small companies.
