SFM A1
- Techniques for collecting data: There are a variety of techniques for collecting data,
such as observation, recording and documentation, surveys, questionnaires, interviews,
and studies of specific populations. Once collected, data can be divided into two
categories: qualitative (descriptions and categories) and quantitative (numbers and counts).
Qualitative
- A conversational interview in which one party asks questions and the other party
responds in order to obtain information.
- Observation: In order to get the necessary information, researchers must keep an eye
on the participants' behavior.
- Focus group study also includes observation, questionnaires, and interviews because
the participants must be a group or team with similar characteristics. To acquire data,
it is important to have a thorough grasp of the group.
Quantitative
- This type of data collection makes use of surveys and questionnaires, in which
respondents are asked a variety of questions. The researchers can then compile the
information the individuals submitted into a database.
- They will look for records and materials pertaining to the study participants to make
sure they have all the data they need.
Part B:
Missing Value
Statistics
             The type of property   Is the property Furnished or not   In what floor the property is?
N Valid      501                    501                                501
N Missing    0                      0                                  0
There are no missing values for the qualitative variables in my data set of 501
observations on home prices.
Statistics
             The price of property   Number of bedrooms   Number of bathrooms   The Area of the property by m2
N Valid      501                     501                  501                   500
N Missing    0                       0                    0                     1
For the quantitative variables (price, number of bedrooms, number of bathrooms, and
area), one value is missing, in the area variable. Because only one observation was
affected, I removed it rather than keeping an incomplete record. As a result, I have
500 observations instead of 501.
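The removal of the incomplete observation can be sketched as listwise deletion. This is a
minimal illustration, not the SPSS procedure used for the report; the record layout and
the field names (price, bedrooms, bathrooms, area_m2) are hypothetical.

```python
def drop_incomplete(rows):
    """Keep only the rows in which no field is None (listwise deletion)."""
    return [row for row in rows if all(value is not None for value in row.values())]


# Hypothetical records; the field names are illustrative, not taken from the data set.
houses = [
    {"price": 2500000, "bedrooms": 3, "bathrooms": 2, "area_m2": 121},
    {"price": 800000, "bedrooms": 2, "bathrooms": 1, "area_m2": None},  # missing area
    {"price": 3100000, "bedrooms": 3, "bathrooms": 2, "area_m2": 125},
]

complete = drop_incomplete(houses)
print(len(houses), "->", len(complete))
```

Dropping a single record out of 501 has little effect on the sample size, which is why
deletion is a reasonable choice here.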
Outliers
Standardization plays an important role in statistical analysis, especially when
trying to find outliers. It transforms each variable to a common scale with a mean of
zero and a standard deviation of one, so that observations lying many standard
deviations from the mean stand out. Box plots illustrate data distributions, allowing us
to assess the dispersion of the data points, the symmetry and width of the distribution,
the minimum and maximum, and any exceptional points, all of which will be used to
identify outliers.
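Two common ways of screening for outliers can be sketched as follows: standardized scores
(z-scores) and the box-plot fence rule. The 1.5 x IQR multiplier is the usual box-plot
convention and is an assumption here, since the report does not state which cutoff was
applied; the sample areas are made up.

```python
import statistics


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values outside the box-plot fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical areas in m2; 850 plays the role of an extreme observation.
areas = [95, 110, 121, 125, 130, 155, 850]

scores = z_scores(areas)
flagged = iqr_outliers(areas)
print(flagged)
```

The fence rule depends only on the quartiles, so it is less affected by the extreme
values themselves than a z-score cutoff is.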
Statistics
                          The price of property   Number of bedrooms   Number of bathrooms   The Area of the property by m2
N Valid                   500                     500                  500                   500
N Missing                 0                       0                    0                     0
Mean                      3318944,172             2,62                 2,14                  142,73
Median                    2500000,000             3,00                 2,00                  121,00
Mode                      3100000,000             3                    2                     125
Std. Deviation            4271883,361822007       1,028                1,023                 91,608
Variance                  18248987457011,690      1,058                1,046                 8391,939
Skewness                  3,758                   1,134                1,555                 3,454
Std. Error of Skewness    ,109                    ,109                 ,109                  ,109
Range                     34967000,000            6                    6                     815
Minimum                   33000,000               1                    1                     35
Maximum                   35000000,000            7                    7                     850
Percentiles 25            800000,000              2,00                 2,00                  95,00
Percentiles 50            2500000,000             3,00                 2,00                  121,00
Percentiles 75            3840000,000             3,00                 2,00                  155,00
The outliers of the number of bedrooms
Based on the data sheet of house types, chalets are the most common, with 335
houses, or nearly 91% of the total. The other types account for very small shares.
Studios rank second with 11 units, or about 3%, while there is only 1 standalone
villa, which accounts for 0.3%.
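The shares of each house type follow directly from the counts. The sketch below assumes a
total of 369 houses, taken from the valid N reported in the tables for the cleaned data;
only the three counts quoted in the text are included.

```python
# Counts quoted in the text; the total of 369 is an assumption based on the valid N
# of the cleaned data set.
counts = {"Chalet": 335, "Studio": 11, "Standalone villa": 1}
total = 369

shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name}: {share}%")
```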
Statistics
The price of property
N Valid                   369
N Missing                 0
Mean                      1991043,027100271
Median                    1950000,000
Mode                      330000,000a
Std. Deviation            1465532,6953924506
Variance                  2147786081264,261
Skewness                  ,727
Std. Error of Skewness    ,127
Range                     7867000,000
Minimum                   33000,000
Maximum                   7900000,000
Sum                       734694877,000
Percentiles 25            600000,000
Percentiles 50            1950000,000
Percentiles 75            3000000,000
a. Multiple modes exist. The smallest value is shown
The average price of the 369 houses in my observation is $1991043.03. 50% of
properties have a value below $1950000. The price has multiple modes, the smallest of
which is $330000. The most expensive house costs $7900000 and the cheapest costs
$33000. With a skewness coefficient of 0.727, the distribution is right-skewed.
Statistics
Number of bedrooms
N Valid 369
Missing 0
Mean 2,25
Median 2,00
Mode 2
Std. Deviation ,697
Variance ,485
Skewness -,183
Std. Error of Skewness ,127
Range 3
Minimum 1
Maximum 4
Sum 829
Percentiles 25 2,00
50 2,00
75 3,00
For the bedrooms variable, the largest number of bedrooms is 4 and the smallest is
1. The mean number of bedrooms across the 369 houses is 2.25. The range is small (3),
and the skewness is -0.183 (standard error 0.127), so the distribution is close to
symmetrical.
Statistics
Number of bathrooms
N Valid 369
Missing 0
Mean 1,68
Median 2,00
Mode 2
Std. Deviation ,468
Variance ,219
Skewness -,763
Std. Error of Skewness ,127
Range 1
Minimum 1
Maximum 2
Sum 619
Percentiles 25 1,00
50 2,00
75 2,00
For the bathrooms variable, every house has at least 1 bathroom and at most 2. The
most common number of bathrooms is 2, and the average is 1.68 per house. The range is
small (1), and the skewness is -0.763 (standard error 0.127), so the distribution is
left-skewed.
Statistics
The Area of the property by m2
N Valid 369
Missing 0
Mean 108,42
Median 109,00
Mode 125
Std. Deviation 32,144
Variance 1033,260
Skewness ,259
Std. Error of Skewness ,127
Range 189
Minimum 35
Maximum 224
Sum 40006
Percentiles 25 90,00
50 109,00
75 125,00
In the statistics table, the smallest house area is 35 m2 and the largest is 224 m2,
giving a range of 189 m2. The average area of the 369 houses is 108.42 m2. In my
observation data, 50% of houses have an area greater than 109 m2, and the most
common area is 125 m2. The skewness is 0.259 (standard error 0.127), so the
distribution is close to symmetrical.
Based on the correlation table of price and the other quantitative variables in my
observation, the correlation between price and the number of bedrooms is 0.136. Likewise, the
correlations between price and the number of bathrooms and between price and area are 0.145 and
0.166. All three relationships are positive but weak: as the number of bedrooms, the number of
bathrooms, or the area of the house increases, the price tends to increase as well.
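For reference, a Pearson correlation coefficient of this kind can be computed as in the
sketch below. The function name and the sample values are hypothetical; only the
coefficient 0.166 in the last lines comes from the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical areas (m2) and prices, for illustration only.
areas = [60, 80, 100, 120, 140]
prices = [900000, 1500000, 1400000, 2300000, 2100000]
r = pearson_r(areas, prices)

# Squaring a coefficient gives the share of variance the two variables have in common.
shared_variance = round(0.166 ** 2, 3)
print(round(r, 3), shared_variance)
```

A coefficient of 0.166 corresponds to less than 3% shared variance, which is why the
relationships are described as weak.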
The coefficient of variation (CV) is used to compare price variability between
property categories because the mean prices of furnished and unfurnished properties,
and of properties on different floors, differ.
This gives the CV table of the price variable for the furnished and level-group
variables.
Since the CV data of level group 1 is lower than that of level group 2, homes with
levels in group 1 (the ground, first, and second floors) have less price variation than
homes in group 2 (the other levels).
Since different property types have different mean prices, the coefficient of
variation (CV) is also needed to compare price variability between them. Comparing
the CV values across property types shows that studios have the highest CV, indicating
that studio prices fluctuate more than those of other types of dwellings.
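The coefficient of variation is the standard deviation divided by the mean. Because the
group-level CV table is not reproduced in the text, the sketch below applies the formula
to the overall price and area figures from the descriptive statistics tables (N = 369)
instead; the function name is hypothetical.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = standard deviation / mean, a unit-free measure of relative spread."""
    return std_dev / mean


# Standard deviation and mean taken from the descriptive statistics tables (N = 369).
cv_price = coefficient_of_variation(1465532.695, 1991043.027)
cv_area = coefficient_of_variation(32.144, 108.42)

print(f"price CV: {cv_price:.1%}, area CV: {cv_area:.1%}")
```

Because the CV is unit-free, it allows a spread measured in dollars to be compared with
one measured in square metres, or groups with very different mean prices to be compared
with each other.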
Outliers can be identified using box plots. Summary statistics and histograms
are used to examine quantitative variables, whereas frequency tables and pie or bar
charts are used to examine qualitative ones. Histograms show the spread of numerical
data and the direction of its skewness. Frequency tables, pie charts, and bar charts
allow readers to easily compare the prevalence and distribution of categories. The link
between price and the other quantitative variables was examined using correlation
coefficients and scatter plots. The mean, standard deviation, and coefficient of
variation are computed and then compared across groups to analyze the relationship
between price and the qualitative features.
Continuous quantitative data can be visualized using histograms, with the
proportion of observations in each interval represented by the height of the bars. Bar
charts represent multiple data categories with bars of varying heights. A pie chart
visualizes how several categories contribute to the whole. Qualitative values can be
compared easily using either a pie chart or a bar chart.
Part C
H0: σ²(furnished) = σ²(unfurnished)
H1: σ²(furnished) ≠ σ²(unfurnished)
The test's P value is 0.322, which is higher than α = 0.05, hence H0 is not
rejected. As a result, the results with equal variances assumed will be used in the test
for equality of means.
H0: μ(furnished) = μ(unfurnished)
H1: μ(furnished) ≠ μ(unfurnished)
Since the P value of the test is 0.446, which is higher than α = 0.05, we do not
reject H0: there is no evidence of a difference between the average prices of furnished
and unfurnished properties.
Level-group T-Test
Independent Samples Test (The price of property)
                                            Equal variances assumed   Equal variances not assumed
Levene's Test for Equality of Variances
  F                                         ,080
  Sig.                                      ,778
t-test for Equality of Means
  t                                         2,202                     2,196
  df                                        367                       99,296
  Sig. (2-tailed)                           ,028                      ,030
  Mean Difference                           430963,098055501470       430963,098055501470
  Std. Error Difference                     195755,007516202600       196227,638855224850
  95% Confidence Interval of the Difference
    Lower                                   46020,869790035180        41619,238672550480
    Upper                                   815905,326320967800       820306,957438452400
Test for equality of variances
H0: σ²(Level_group1) = σ²(Level_group2)
H1: σ²(Level_group1) ≠ σ²(Level_group2)
Do not reject H0, since the P value of the test is 0.778, which is larger than
α = 0.05. The results with equal variances assumed in the preceding table will be used
for the test for equality of means.
H0: μ(Level_group1) = μ(Level_group2)
H1: μ(Level_group1) ≠ μ(Level_group2)
The P value of the test is 0.028, which is smaller than α = 0.05, so we reject
H0. The average price of houses on the ground, first, and second floors therefore
differs from that of houses on other floors.
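The reported t statistics can be checked from the other columns of the Independent
Samples Test table, since t is the mean difference divided by its standard error. The
sketch below uses the table values rounded to three decimals; the variable names are
hypothetical.

```python
# Values from the Independent Samples Test table, rounded to three decimals.
mean_diff = 430963.098
se_equal_var = 195755.008      # equal variances assumed
se_unequal_var = 196227.639    # equal variances not assumed
ci_upper_equal_var = 815905.326

# t statistic = mean difference / standard error of the difference
t_equal_var = mean_diff / se_equal_var
t_unequal_var = mean_diff / se_unequal_var

# The 95% interval is mean_diff +/- t_crit * SE, so the critical value can be backed out.
t_crit = (ci_upper_equal_var - mean_diff) / se_equal_var

print(round(t_equal_var, 3), round(t_unequal_var, 3), round(t_crit, 3))
```

The confidence interval for the difference does not contain zero, which agrees with
rejecting H0 at the 5% level.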
Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       ,220a   ,048       ,035                1439537,359707030700
a. Predictors: (Constant), Furnished_Dummy, Level_Group, Number of
bathrooms, The Area of the property by m2, Number of bedrooms
With an R Square value of 0.048, the variables furnished, bedrooms, level group,
area, and bathrooms together explain 4.8% of the variance in price.
ANOVAa
Model          Sum of Squares        df    Mean Square          F       Sig.
1 Regression   38152062878045,875    5     7630412575609,175    3,682   ,003b
  Residual     752233215027201,000   363   2072267809992,289
  Total        790385277905246,900   368
a. Dependent Variable: The price of property
b. Predictors: (Constant), Furnished_Dummy, Level_Group, Number of bathrooms, The Area of the
property by m2, Number of bedrooms
H0: R² = 0
H1: R² > 0
Since the P value of the test is 0.003, which is less than α = 0.05, H0 is
rejected: the model with 5 predictors is significant overall.
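R Square, Adjusted R Square, and the F statistic all follow from the sums of squares in
the ANOVA table. The sketch below recomputes them with the standard formulas, using
n = 369 observations and k = 5 predictors; the variable names are hypothetical.

```python
# Sums of squares from the ANOVA table.
ss_regression = 38152062878045.875
ss_residual = 752233215027201.0
ss_total = 790385277905246.9

n_obs = 369      # observations
k = 5            # predictors in the model

# Share of the variance in price explained by the model.
r_squared = ss_regression / ss_total

# Adjusted R Square penalizes the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - k - 1)

# F = mean square of the regression / mean square of the residuals.
f_stat = (ss_regression / k) / (ss_residual / (n_obs - k - 1))

print(round(r_squared, 3), round(adj_r_squared, 3), round(f_stat, 3))
```

The model is significant overall, yet it explains under 5% of the variance in price, so
most of the price variation is driven by factors outside the model.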
Coefficientsa
                                   Unstandardized Coefficients   Standardized Coefficients
Model                              B              Std. Error     Beta      t        Sig.
1 (Constant)                       1517634,695    395007,401               3,842    ,000
  Number of bedrooms               -74239,165     185530,937     -,035     -,400    ,689
  Number of bathrooms              276822,145     198570,951     ,088      1,394    ,164
  The Area of the property by m2   6958,531       3753,247       ,153      1,854    ,065
  Level_Group                      -452414,877    193518,848     -,120     -2,338   ,020
  Furnished_Dummy                  -86331,475     150458,004     -,029     -,574    ,566
a. Dependent Variable: The price of property
Estimated Price = 1517634.695 + 276822.145*Bathrooms - 74239.165*Bedrooms + 6958.531*Area
- 452414.877*Level_Group - 86331.475*Furnished
Holding the other variables constant, the price increases by $276822.145 on average for
each additional bathroom and drops by $74239.165 on average for each additional bedroom. The
price increases by $6958.531 for every additional 1 m2 of area. Compared with other properties
(Level group 1), properties on the third level or higher (Level group 2) cost about
$452414.877 less. Furnished properties cost about $86331.475 less than unfurnished ones.
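The fitted equation can be applied as in the sketch below. The 0/1 coding of the two
dummy variables (1 for Level group 2 and 1 for furnished) is an assumption, as the text
does not state how they were coded, and the example house is hypothetical.

```python
def predict_price(bedrooms, bathrooms, area_m2, level_group, furnished):
    """Estimated price from the fitted regression coefficients.

    level_group and furnished are assumed to be 0/1 dummies
    (1 = Level group 2, 1 = furnished); this coding is not confirmed by the text.
    """
    return (
        1517634.695
        - 74239.165 * bedrooms
        + 276822.145 * bathrooms
        + 6958.531 * area_m2
        - 452414.877 * level_group
        - 86331.475 * furnished
    )


# Hypothetical house: 2 bedrooms, 2 bathrooms, 100 m2, Level group 1, unfurnished.
base = predict_price(2, 2, 100, 0, 0)
one_more_bathroom = predict_price(2, 3, 100, 0, 0)

print(round(base, 3), round(one_more_bathroom - base, 3))
```

Changing one input while holding the others fixed shifts the estimate by exactly that
variable's coefficient, which is how the coefficients are interpreted.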
H0: βi = 0
H1: βi ≠ 0
Bathrooms
P value = 0,164 > α = 0,1 => Do not reject H0
Bedrooms
P value = 0,689 > α = 0,1 => Do not reject H0
Area
P value = 0,065 < α = 0,1 => Reject H0
Level_group
P value = 0,020 < α = 0,1 => Reject H0
Furnished
P value = 0,566 > α = 0,1 => Do not reject H0
The figures above show that H0 is rejected for the level-group and area variables
at α = 0.1, indicating that the floor level and the size of a property do have an
effect on its price. For the number of bedrooms, the number of bathrooms, and the
furnished variable, the P values are larger than α = 0.1, so there is insufficient
evidence to reject H0.
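The decision rule used for each coefficient (reject H0 when the P value is below α) can
be sketched as follows, using the P values from the Coefficients table. The helper name
is hypothetical.

```python
# P values (Sig.) from the Coefficients table.
p_values = {
    "Bathrooms": 0.164,
    "Bedrooms": 0.689,
    "Area": 0.065,
    "Level_Group": 0.020,
    "Furnished": 0.566,
}


def significant_at(alpha):
    """Names of the variables for which H0: beta_i = 0 is rejected at level alpha."""
    return [name for name, p in p_values.items() if p < alpha]


print(significant_at(0.10))
print(significant_at(0.05))
```

The conclusion for area depends on the chosen significance level: it is significant at
α = 0.1 but would not be at α = 0.05, whereas the level group is significant at both.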