Oup 6
LAVANYA M, P221826
MAHESHPRIYA L, P221827
MALAVIKA K, P221828
MD ALTAMASH AYUB, P221829
MD GUFRAN KHAN, P221830
August 2023
Submitted to
Dr.S.VISALAKSHMI
Assistant Professor
Department of Management
CENTRAL UNIVERSITY OF TAMIL NADU
Thiruvarur – 610 005
TABLE OF CONTENTS
1 FREQUENCIES
11 GROUP PHOTO
1. FREQUENCIES
STEP 3: ANALYZE FREQUENCY DATA
Comments
Filter <none>
Weight <none>
Missing Value Handling Definition of Missing User-defined missing values are treated as
missing.
Cases Used Statistics are based on all cases with valid data.
Syntax FREQUENCIES VARIABLES=ID Gender Height Weight
  /NTILES=4
  /NTILES=10
  /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SUM
    SKEWNESS SESKEW KURTOSIS SEKURT
  /ORDER=ANALYSIS.
[DataSet1]
Statistics
                  ID      Gender   Height   Weight
N    Valid        10      10       10       10
     Missing      0       0        0        0
Range             9                8        37
Minimum           1                62       116
Maximum           10               70       153
Percentiles 10    1.10             62.20    116.20
Frequency Table
ID
Gender
Height
Weight
INTERPRETATION:
2. DESCRIPTIVES 1
STEP 3: SELECT THE VARIABLE
OUTPUT
Notes
Comments
Filter <none>
Weight <none>
Missing Value Handling Definition of Missing User defined missing values are treated as
missing.
                     N     Range    Minimum  Maximum   Mean      SE Mean    Std. Dev.   Variance  Skewness  SE Skew.  Kurtosis  SE Kurt.
DATE                 169   42.0     1959.1   2001.1    1.980E3   .9409      12.2322     149.626   .001      .187      -1.200    .371
GDP                  169   9747.5   496.1    10243.6   3.573E3   221.0122   2873.1581   8.255E6   .701      .187      -.751     .371
Valid N (listwise)   169
INTERPRETATION:
3. DESCRIPTIVES 2
STEP 3: OUTPUT OF DESCRIPTIVE DATA
Descriptive Statistics
                     N    Range  Minimum  Maximum  Sum    Mean    SE Mean  Std. Dev.  Variance  Skewness  SE Skew.  Kurtosis  SE Kurt.
Age                  60   47     17       64       2261   37.68   1.423    11.026     121.576   .087      .309      -.441     .608
Valid N (listwise)   60
INTERPRETATION:
The descriptive statistics describe the variable "Age" for a sample of
60 individuals. Ages span a range of 47 years, from a minimum of 17 to
a maximum of 64. The sum of ages is 2261, giving a mean age of
approximately 37.68 years with a standard error of 1.423. The standard
deviation of 11.026 measures the typical dispersion of ages around the
mean, and the variance (its square) is 121.576.
The skewness of 0.087 (standard error 0.309) is close to zero, so the
distribution is approximately symmetric. The kurtosis of -0.441
(standard error 0.608) indicates a distribution slightly flatter than
the normal, with lighter tails. Both statistics lie well within two
standard errors of zero, so neither departure from normality is notable.
All 60 cases are valid. In summary, "Age" is centred near 38 years, is
widely spread around that centre, and is roughly symmetric.
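The relationships between the figures in the "Age" table can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the reported values, not a rerun of the analysis; the standard-error formulas for skewness and kurtosis are the usual sample formulas and are assumed here to be the ones behind the SPSS output.

```python
import math

# Values read from the Descriptive Statistics table for "Age"
n, total, sd = 60, 2261, 11.026

mean = total / n                 # mean = sum / N
se_mean = sd / math.sqrt(n)      # standard error of the mean
variance = sd ** 2               # variance is the squared standard deviation

# Standard errors of skewness and kurtosis depend only on N
# (assumed formulas, see the lead-in)
se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))

print(round(mean, 2), round(se_mean, 3), round(variance, 2))
print(round(se_skew, 3), round(se_kurt, 3))
```

The computed standard errors match the .309 and .608 columns of the table, which helps tell the statistic columns apart from the standard-error columns.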
4. SIMPLE CORRELATION
Click Analyze > Correlate > Bivariate... on the main menu.
SIMPLE CORRELATION:
STEP 3 : VIEW THE DATA WHICH IS IMPORTED IN VARIABLE VIEW.
STEP 5: SELECT THE FILTER OF YOUR CHOICE IN FILTERS AND
SELECT THE DIFFERENT TYPES OF METHODS AND DATA VIEWS.
Correlations:
Correlation is a statistical measure that expresses the extent to which two variables are
linearly related (meaning they change together at a constant rate). It’s a common tool for
describing simple relationships without making a statement about cause and effect.
Simple correlation:
Simple linear correlation is a measure of the degree to which two variables vary
together, or a measure of the intensity of the association between two variables.
Correlation is often misused: a significant coefficient shows only that two variables vary
together. Showing that one variable actually affects the other requires further evidence.
[DataSet1]
Descriptive Statistics
               Mean    Std. Deviation   N
Income         56.00   11.121           7
Expenditure    53.57   8.886            7
Correlations
                                                  Income     Expenditure
Income        Pearson Correlation                 1          .830*
              Sig. (2-tailed)                                .021
              Sum of Squares and Cross-products   742.000    492.000
              Covariance                          123.667    82.000
              N                                   7          7
Expenditure   Pearson Correlation                 .830*      1
              Sig. (2-tailed)                     .021
              Sum of Squares and Cross-products   492.000    473.714
              Covariance                          82.000     78.952
              N                                   7          7
*. Correlation is significant at the 0.05 level (2-tailed).
INTERPRETATION:
Descriptive Statistics:
Income: Mean = 56.00, Std. Deviation = 11.121, N = 7
Expenditure: Mean = 53.57, Std. Deviation = 8.886, N = 7
Correlations:
The Pearson correlation coefficient measures the strength and direction of a linear
relationship between two variables. In this case, it is used to assess the relationship
between "Income" and "Expenditure." The coefficient is 0.830 with a two-tailed p-value of
0.021, a strong positive linear relationship that is significant at the 0.05 level.
The Sum of Squares and Cross-products and the Covariance are the intermediate quantities
from which the coefficient is computed; they do not change the interpretation of
the correlation coefficient.
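How the rows of the Correlations table relate to one another can be shown by recomputing the Pearson coefficient from the reported sums of squares and cross-products. This is a minimal sketch using only the figures printed in the table; the variable names are illustrative, and the t statistic is the standard test of a zero correlation with n - 2 degrees of freedom.

```python
import math

# Figures copied from the Correlations table (Income vs Expenditure)
n = 7
ss_income, ss_expenditure = 742.000, 473.714   # sums of squares
cross_products = 492.000                        # sum of cross-products

# Pearson r = cross-products / sqrt(SS_x * SS_y)
r = cross_products / math.sqrt(ss_income * ss_expenditure)

# Covariance and variance are the same sums divided by n - 1
covariance = cross_products / (n - 1)
var_income = ss_income / (n - 1)

# t statistic for H0: rho = 0, with n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(round(r, 3), round(covariance, 3), round(var_income, 3), round(t_stat, 2))
```

The recomputed r, covariance, and variance agree with the table to three decimals.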
Nonparametric Correlations
Correlations
                                                          Income    Expenditure
Kendall's tau_b   Income        Correlation Coefficient   1.000     .619
                                Sig. (2-tailed)           .         .051
                                N                         7         7
                  Expenditure   Correlation Coefficient   .619      1.000
                                Sig. (2-tailed)           .051      .
                                N                         7         7
Spearman's rho    Income        Correlation Coefficient   1.000     .750
                                Sig. (2-tailed)           .         .052
                                N                         7         7
                  Expenditure   Correlation Coefficient   .750      1.000
                                Sig. (2-tailed)           .052      .
                                N                         7         7
Nonparametric correlations (here Kendall's tau_b and Spearman's rho)
are used when the data do not meet the assumptions required for
parametric correlations such as Pearson's, typically because of
non-normality or because the relationship is monotonic but not linear.
The coefficients for "Income" and "Expenditure" are interpreted below.
For Kendall's tau_b: The correlation coefficient between "Income" and
"Expenditure" is 0.619. This suggests a moderate positive correlation.
However, with a p-value of 0.051, the correlation isn't statistically
significant at the 0.05 level. This means that there isn't enough evidence
to conclude that the observed correlation is more than what could be
expected by random chance.
For Spearman's rho: The correlation coefficient between "Income" and
"Expenditure" is 0.750, indicating a relatively strong positive
monotonic relationship. This means that as one variable increases, the
other generally tends to increase, although the relationship might not be
strictly linear. Similar to Kendall's tau_b, the p-value of 0.052 is slightly
above the 0.05 level, indicating no statistically significant correlation.
Both nonparametric correlation coefficients suggest positive
associations between "Income" and "Expenditure," but neither is
statistically significant at the 0.05 level, although both p-values
fall only just above it. With only 7 observations the rank-based tests
have little power, so the evidence for a monotonic association is
suggestive rather than conclusive.
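The two rank-based coefficients discussed above can be sketched as follows. The report's raw Income and Expenditure values are not reproduced in the output, so the data below are hypothetical and serve only to show the mechanics. The helper names are illustrative, and both functions assume untied data, in which case Kendall's tau_b reduces to the simple concordant-minus-discordant ratio.

```python
from itertools import combinations

def ranks(values):
    """Rank values from 1 to n (assumes no ties)."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rho via the no-ties shortcut 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def kendall_tau(x, y):
    """Kendall's tau for untied data: (concordant - discordant) / number of pairs."""
    score = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        product = (x2 - x1) * (y2 - y1)
        if product > 0:
            score += 1      # concordant pair
        elif product < 0:
            score -= 1      # discordant pair
    pairs = len(x) * (len(x) - 1) // 2
    return score / pairs

# Hypothetical data, NOT the report's Income/Expenditure values
x = [10, 20, 30, 40, 50]
y = [2, 1, 4, 3, 5]
print(spearman_rho(x, y), kendall_tau(x, y))
```

Spearman's rho works on rank differences while Kendall's tau counts pair orderings, which is why the two coefficients in the report differ in size even though they describe the same data.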
5. MULTIPLE CORRELATIONS
STEP 3: PLOT THE DATA
OUTPUT:
Output Created 15-Aug-2023 23:08:55
Comments
Filter <none>
Weight <none>
Missing Value Handling Definition of Missing User-defined missing values are treated as
missing.
Syntax CORRELATIONS
/VARIABLES=GDP GOVTEXP CONS INV
/PRINT=TWOTAIL NOSIG
/STATISTICS DESCRIPTIVES XPROD
/MISSING=PAIRWISE.
Resources Processor Time 00:00:00.047
[DataSet1]
Descriptive Statistics
Correlations
             GDP        GOVTEXP    CONS       INV
Covariance   3.072E10   3.969E9    1.856E10   1.152E10
N            37         37         37         37
INTERPRETATION:
The presented output is the result of a correlation analysis performed on four variables: "GDP,"
"GOVTEXP," "CONS," and "INV" using the dataset "DataSet1." Here's the interpretation of the key
statistics: Descriptive Statistics: The variables' means, standard deviations, and the number of cases are
as follows:
GDP: Mean = 1.14E6, Std. Deviation = 233846.634, N = 37
GOVTEXP: Mean = 1.26E5, Std. Deviation = 32913.541, N = 37
CONS: Mean = 6.75E5, Std. Deviation = 136232.151, N = 37
INV: Mean = 3.74E5, Std. Deviation = 88811.183, N = 37
Correlation Analysis: Pearson correlation coefficients were calculated between all pairs of variables.
The results are as follows:
GDP and GOVTEXP: Pearson correlation coefficient = 0.928, p < 0.001
GDP and CONS: Pearson correlation coefficient = 0.964, p < 0.001
GDP and INV: Pearson correlation coefficient = 0.977, p < 0.001
GOVTEXP and CONS: Pearson correlation coefficient = 0.885, p < 0.001
GOVTEXP and INV: Pearson correlation coefficient = 0.872, p < 0.001
CONS and INV: Pearson correlation coefficient = 0.952, p < 0.001
All correlation coefficients are highly significant (p < 0.001), indicating strong linear relationships
between the variables.
These correlations reveal significant positive relationships among all four variables. GDP has
strong positive correlations with GOVTEXP, CONS, and INV, and the other variable pairs are also
strongly and positively correlated. Because every coefficient is positive, the variables tend to move
together in this dataset.
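A covariance is a correlation rescaled by the two standard deviations, so the one covariance row that survives in the output can be cross-checked against the reported coefficients. The sketch below assumes that row belongs to CONS. The output does not label it, and the assumption is made only because the figures line up. Small differences are expected because the correlations are rounded to three decimals.

```python
# Standard deviations and Pearson correlations as reported in the output
sd = {"GDP": 233846.634, "GOVTEXP": 32913.541, "CONS": 136232.151, "INV": 88811.183}
r_with_cons = {"GDP": 0.964, "GOVTEXP": 0.885, "CONS": 1.0, "INV": 0.952}

# Covariance row printed in the output (assumed to be the CONS row)
reported = {"GDP": 3.072e10, "GOVTEXP": 3.969e9, "CONS": 1.856e10, "INV": 1.152e10}

# cov(X, Y) = r(X, Y) * sd(X) * sd(Y); cov(X, X) is the variance of X
cov_with_cons = {name: r * sd[name] * sd["CONS"] for name, r in r_with_cons.items()}

for name, value in cov_with_cons.items():
    print(f"{name:8s} computed {value:.3e}   reported {reported[name]:.3e}")
```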
6. SIMPLE REGRESSION
STEP 2 : ANALYSE THE DATA
STEP 4 : THE OUTPUT WILL BE GENERATED
Variables Entered/Removed(a)
Model   Variables Entered     Variables Removed   Method
1       Maintenance cost(b)   .                   Enter
a. Dependent Variable: Age of a car
b. All requested variables entered.
Model Summary(b)
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
1       .991(a)   .982       .977                .284                         1.669
Change Statistics
Model   R Square Change   F Change   df1   df2   Sig. F Change
1       .982              212.761    1     4     .000
a. Predictors: (Constant), Maintenance cost
b. Dependent Variable: Age of a car
ANOVA(a)
Model            Sum of Squares   df   Mean Square   F         Sig.
1   Regression   17.177           1    17.177        212.761   .000(b)
    Residual     .323             4    .081
    Total        17.500           5
a. Dependent Variable: Age of a car
b. Predictors: (Constant), Maintenance cost
Coefficients(a)
(B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient;
Lower and Upper are the bounds of the 95.0% confidence interval for B.)
Model                  B      Std. Error   Beta   t        Sig.   Lower   Upper
1   (Constant)         .528   .234                2.254    .087   -.122   1.179
    Maintenance cost   .435   .030         .991   14.586   .000   .352    .518
a. Dependent Variable: Age of a car
Residuals Statistics(a)
                       Minimum   Maximum   Mean   Std. Deviation   N
Predicted Value        1.40      6.18      3.50   1.853            6
Residual               -.398     .297      .000   .254             6
Std. Predicted Value   -1.134    1.447     .000   1.000            6
Std. Residual          -1.401    1.046     .000   .894             6
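The Model Summary and ANOVA figures are linked by simple formulas, and the Coefficients table defines a prediction equation. The sketch below recomputes those quantities from the printed values; the function name predict_age is illustrative, and small discrepancies from the table arise because the inputs are already rounded to three decimals.

```python
import math

# Figures from the ANOVA and Coefficients tables
ss_regression, ss_residual, ss_total = 17.177, 0.323, 17.500
df_regression, df_residual = 1, 4
intercept, slope = 0.528, 0.435

r_squared = ss_regression / ss_total
# Adjusted R^2 penalises for the number of predictors: 1 - (1 - R^2)(n - 1)/(n - k - 1)
adj_r_squared = 1 - (1 - r_squared) * (df_regression + df_residual) / df_residual
mean_square_error = ss_residual / df_residual
f_stat = (ss_regression / df_regression) / mean_square_error
std_error_estimate = math.sqrt(mean_square_error)

def predict_age(maintenance_cost):
    """Fitted equation: Age of a car = .528 + .435 * Maintenance cost."""
    return intercept + slope * maintenance_cost

print(round(r_squared, 3), round(adj_r_squared, 3), round(std_error_estimate, 3))
print(round(f_stat, 1), round(predict_age(10), 3))
```

With a single predictor, F is the square of the slope's t statistic (14.586 squared is about 212.8), which is why the ANOVA and Coefficients tables report the same significance.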
INTERPRETATION:
7. MULTIPLE REGRESSION
Multiple regression is a statistical technique that can be used to analyze the relationship
between a single dependent variable and several independent variables.
THE PROBLEM:
To investigate whether Government Expenditure, Consumption, and Investment have a significant
impact on GDP.
Using SPSS, we analyse the given data and draw a conclusion.
STEP 2: ANALYSE THE DATA
ANALYSE → REGRESSION → LINEAR REGRESSION
STEP 4: THE OUTPUT WILL BE GENERATED.
OUTPUT
Regression
Descriptive Statistics
Mean Std. Deviation N
GDP 1.14E6 233846.634 37
GOVTEXP 1.26E5 32913.541 37
CONS 6.75E5 136232.151 37
INV 3.74E5 88811.183 37
Correlations
GDP GOVTEXP CONS INV
Pearson Correlation GDP 1.000 .928 .964 .977
GOVTEXP .928 1.000 .885 .872
CONS .964 .885 1.000 .952
INV .977 .872 .952 1.000
Sig. (1-tailed) GDP . .000 .000 .000
GOVTEXP .000 . .000 .000
CONS .000 .000 . .000
INV .000 .000 .000 .
N GDP 37 37 37 37
GOVTEXP 37 37 37 37
CONS 37 37 37 37
INV 37 37 37 37
Variables Entered/Removedb
Model Summaryb
ANOVAb
Coefficientsa
Unstandardized Standardized 95% Confidence Interval for
Coefficients Coefficients B
Model B Std. Error Beta t Sig. Lower Bound Upper Bound
1 (Constant) 123496.030 29386.413 4.202 .000 63708.923 183283.136
GOVTEXP 1.918 .365 .270 5.260 .000 1.176 2.660
CONS .360 .141 .210 2.561 .015 .074 .646
INV 1.427 .205 .542 6.963 .000 1.010 1.843
a. Dependent Variable: GDP
Residuals Statisticsa
Minimum Maximum Mean Std. Deviation N
Predicted Value 7.28E5 1.51E6 1.14E6 231730.967 37
Residual -5.809E4 7.023E4 .000 31384.824 37
Std. Predicted Value -1.789 1.602 .000 1.000 37
Std. Residual -1.772 2.142 .000 .957 37
INTERPRETATION:
The provided output represents the results of a regression analysis involving the
variables "GDP" (dependent variable), "GOVTEXP," "CONS," and "INV"
(independent variables). Here is the interpretation of the key statistics:
Descriptive Statistics: The mean and standard deviation of the variables are as
follows:
GDP: Mean = 1.14E6, Std. Deviation = 233846.634, N = 37
GOVTEXP: Mean = 1.26E5, Std. Deviation = 32913.541, N = 37
CONS: Mean = 6.75E5, Std. Deviation = 136232.151, N = 37
INV: Mean = 3.74E5, Std. Deviation = 88811.183, N = 37
Correlations: There are strong positive correlations among the variables:
GDP and GOVTEXP: 0.928
GDP and CONS: 0.964
GDP and INV: 0.977
GOVTEXP and CONS: 0.885
GOVTEXP and INV: 0.872
CONS and INV: 0.952
All correlations have p-values less than 0.001, indicating statistical significance.
Model Summary: The regression model has a high R-squared value of 0.982,
suggesting that around 98.2% of the variation in GDP can be explained by the
independent variables (INV, GOVTEXP, CONS) included in the model.
ANOVA: The ANOVA table indicates that the regression model is statistically
significant (p < 0.001) in explaining the variance in GDP.
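Coefficients: All three independent variables (GOVTEXP, CONS, INV) show significant
positive relationships with GDP (p < 0.001 for GOVTEXP and INV, p = 0.015 for
CONS). Holding the other predictors constant, each unit increase in GOVTEXP is
associated with an increase of approximately 1.918 units in GDP, while CONS and
INV have respective coefficients of 0.360 and 1.427.
Residuals Statistics: The residuals (differences between actual and predicted
values) have a mean of 0, as they always do in a least-squares model with an
intercept, so the mean itself says nothing about fit. Their standard deviation of
31384.824 is small relative to the standard deviation of GDP (233846.634), which
is consistent with a close fit. The predicted GDP values have a mean of 1.14E6
and a standard deviation of 231730.967.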
In summary, the regression analysis suggests that the model with GOVTEXP,
CONS, and INV as independent variables is a good fit for explaining the variation
in GDP. The model's strong R-squared, significant ANOVA, and meaningful
coefficients indicate a substantial relationship between the variables.
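The fitted equation and the fit statistics can be cross-checked from the printed tables. The following sketch is illustrative only: predict_gdp is a hypothetical helper, the means are the rounded values from the Descriptive Statistics table, and R-squared is recovered as the squared ratio of the standard deviation of the predicted values to that of GDP, which holds for a least-squares model with an intercept.

```python
# Unstandardized coefficients (B) and their standard errors from the Coefficients table
coefficients = {
    "GOVTEXP": (1.918, 0.365),
    "CONS": (0.360, 0.141),
    "INV": (1.427, 0.205),
}
intercept = 123496.030

def predict_gdp(govtexp, cons, inv):
    """Fitted equation: GDP = constant + sum of B * predictor."""
    values = {"GOVTEXP": govtexp, "CONS": cons, "INV": inv}
    return intercept + sum(b * values[name] for name, (b, _) in coefficients.items())

# Each t statistic is the coefficient divided by its standard error
t_ratios = {name: b / se for name, (b, se) in coefficients.items()}

# A least-squares fit passes through the point of means (means here are rounded)
gdp_at_means = predict_gdp(1.26e5, 6.75e5, 3.74e5)

# R^2 = variance of predicted values / variance of observed GDP
r_squared = (231730.967 / 233846.634) ** 2

print({name: round(t, 2) for name, t in t_ratios.items()})
print(round(gdp_at_means), round(r_squared, 3))
```

The prediction at the rounded means lands close to the reported mean GDP of 1.14E6, and the recovered R-squared agrees with the value quoted in the Model Summary paragraph.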
8. CHI SQUARE TEST
STEP 2: ANALYSE THE DATA
STEP 3 : OUTPUT
Crosstabs
[DataSet1]
Case Processing Summary
                 Cases
                 Valid           Missing         Total
                 N    Percent    N    Percent    N    Percent
Smoke * Cancer   50   100.0%     0    0.0%       50   100.0%
Smoke * Cancer Crosstabulation
Count
            Cancer
            1     2     Total
Smoke   1   14    12    26
        2   13    11    24
Total       27    23    50
Chi-Square Tests
                               Value     df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square             .001(a)   1    .982
Continuity Correction(b)       .000      1    1.000
Likelihood Ratio               .001      1    .982
Fisher's Exact Test                                                   1.000                  .603
Linear-by-Linear Association   .001      1    .982
N of Valid Cases               50
a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 11.04.
b. Computed only for a 2x2 table
INTERPRETATION:
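The Pearson chi-square in the output can be reproduced by hand from the Smoke by Cancer counts. The sketch below computes the expected counts, the statistic, and its p-value; the p-value uses the identity that, for 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(sqrt(x / 2)), which avoids any dependency beyond the standard library.

```python
import math

# Observed counts: rows are Smoke = 1, 2; columns are Cancer = 1, 2
observed = [[14, 12], [13, 11]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence = row total * column total / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

# Upper-tail probability for 1 degree of freedom
p_value = math.erfc(math.sqrt(chi_square / 2))

print(min(min(row) for row in expected), round(chi_square, 3), round(p_value, 3))
```

The observed counts are almost identical to the expected counts, so the statistic is close to zero and the p-value close to 1: these data give no evidence of an association between Smoke and Cancer.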
9. ONE SAMPLE T-TEST
The one sample t-test is a statistical procedure used to determine whether a sample of
observations could have been generated by a process with a specific mean. Suppose you are
interested in determining whether an assembly line produces laptop computers that weigh
five pounds. To test this hypothesis, you could collect a sample of laptop computers from the
assembly line, measure their weights, and compare the sample with a value of five using a
one-sample t-test.
STEP 1: IMPORT DATA FROM EXCEL
One-Sample Statistics
One-Sample Test
Test Value = 0
INTERPRETATION:
The provided data presents the results of a one-sample statistical test conducted on a variable
called "Scores." The sample consists of 10 data points. The mean of the "Scores" is 31.30,
with a standard deviation of 1.767 and a standard error mean of 0.559.
The one-sample test was performed with a test value of 0. The t-statistic is 56.016 with
9 degrees of freedom, and the two-tailed p-value is reported as .000 (that is, p < 0.001),
which is below the conventional significance level of 0.05. The sample mean is therefore
significantly different from the test value of 0.
The mean difference is 31.300, and the 95% confidence interval for this difference ranges
from 30.04 to 32.56. Since the confidence interval does not include the test value of 0, this
further supports the conclusion that the sample mean is significantly different from 0.
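The t-statistic and confidence interval quoted above follow directly from the sample size, mean, and standard deviation. This is a minimal sketch of that arithmetic; the critical value 2.262 (two-tailed, 5% level, 9 degrees of freedom) is taken from a standard t table and is an assumption, since it does not appear in the report.

```python
import math

# Summary figures for "Scores" as quoted in the interpretation
n, mean, sd = 10, 31.30, 1.767
test_value = 0
t_critical = 2.262   # assumed: t table value for 9 df, two-tailed 5% level

std_error = sd / math.sqrt(n)
mean_difference = mean - test_value
t_stat = mean_difference / std_error

# 95% confidence interval for the mean difference
ci_low = mean_difference - t_critical * std_error
ci_high = mean_difference + t_critical * std_error

print(round(std_error, 3), round(t_stat, 2), round(ci_low, 2), round(ci_high, 2))
```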
In summary, the mean of "Scores" is significantly greater than the test value of 0, as
shown by the large t-statistic and very small p-value. Because the test value is 0, the result
establishes only that the population mean is not zero.
10. PAIRED SAMPLE T-TEST
T-Test
STEP 3: ANALYZE > COMPARE MEANS > PAIRED SAMPLE T-TEST
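N Correlation Sig.
Paired Samples Test
(Mean, Std. Deviation, Std. Error Mean, and the 95% confidence interval are Paired Differences.)
                                      Mean    Std. Deviation   Std. Error Mean   95% CI Lower   95% CI Upper   t       df   Sig. (2-tailed)
Pair 1   After Sales - Before Sales   3.000   3.432            1.085             .545           5.455          2.764   9    .022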
INTERPRETATION:
The provided data presents the results of a paired samples analysis comparing
"After Sales" and "Before Sales" data points. For a sample size of 10 pairs, the
mean "After Sales" value is 17.50, with a standard deviation of 3.567 and a
standard error mean of 1.128. The mean "Before Sales" value is 14.50, with a
standard deviation of 5.836 and a standard error mean of 1.845.
The correlation between the "After Sales" and "Before Sales" values is strong
and positive (0.841), indicating a significant relationship between the two
variables (p = 0.002).
In the paired samples test, the mean difference between "After Sales" and
"Before Sales" is 3.000 units, with a standard deviation of 3.432 and a standard
error mean of 1.085. The 95% confidence interval for this difference is between
0.545 and 5.455 units. The t-statistic is 2.764 with 9 degrees of freedom, and the
two-tailed p-value is 0.022, which is less than the conventional significance
level of 0.05. This indicates that the difference between "After Sales" and
"Before Sales" is statistically significant.
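In summary, values increase significantly from "Before Sales" to "After Sales,"
by 3.000 units on average. The strong correlation shows that the paired
measurements move together, which makes the paired test more sensitive, but it is
the t-test on the differences that establishes the increase. Whether the increase
is caused by the intervening change cannot be determined from this comparison alone.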
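The paired-differences figures can be rebuilt from the two sample standard deviations and their correlation, using var(A - B) = var(A) + var(B) - 2 r sd(A) sd(B). The sketch below works only from the summary values quoted in the interpretation. The critical value 2.262 (9 degrees of freedom, two-tailed 5% level) is an assumed t-table value, and small differences from the report come from rounding in the inputs.

```python
import math

# Summary figures quoted in the interpretation
n = 10
mean_after, mean_before = 17.50, 14.50
sd_after, sd_before, r = 3.567, 5.836, 0.841
t_critical = 2.262   # assumed: t table value for 9 df, two-tailed 5% level

mean_diff = mean_after - mean_before

# Standard deviation of the paired differences
sd_diff = math.sqrt(sd_after ** 2 + sd_before ** 2 - 2 * r * sd_after * sd_before)
se_diff = sd_diff / math.sqrt(n)
t_stat = mean_diff / se_diff

ci_low = mean_diff - t_critical * se_diff
ci_high = mean_diff + t_critical * se_diff

print(round(sd_diff, 2), round(se_diff, 2), round(t_stat, 2))
print(round(ci_low, 2), round(ci_high, 2))
```

The high correlation between the two measurements shrinks the standard deviation of the differences, which is what gives the paired design its sensitivity.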