
PROJECT REPORT

ON
“RESEARCH METHODOLOGY”
SUBMITTED IN PARTIAL FULFILLMENT FOR THE
AWARD OF THE DEGREE OF
BACHELOR OF COMMERCE (H)
2019-2022
UNDER THE GUIDANCE OF
MS. NUPUR ARORA
FACULTY, VIPS
SUBMITTED BY:
Name: HRISHABH SRIVASTAVA
Enrolment No. 12517788819
B.COM (H)

Vivekananda School of Business Studies


Vivekananda Institute of Professional Studies
AU Block (Outer Ring Road), Pitampura
Delhi-110034

Research Methodology File 1 Hrishabh Srivastava


INDEX
TOPICS
FUNCTIONS
 Count
 Count A
 Count Blank
 Sum
 Average
 Count If
 Average If
 Concatenate
 VLOOKUP
 HLOOKUP
OTHER TOOLS
 Transpose table
 Conditional Formatting - Highlight Cell rules (greater than, less than, between, equal to, text that contains)
 Conditional Formatting - Duplicate values
 Conditional Formatting - Top/ Bottom rules
 Conditional Formatting - Data Bars
 Conditional Formatting - Color Scales
 Format Cells – Number, Alignment, Font, Border, Fill
 Data validation – settings (Any value, number, custom)
 Data validation – Input message
 Data validation – Error alert
 Customization- Quick access toolbar
 Save as adobe pdf
DATA VISUALIZATION AND ANALYSIS
 Pivot Table and its tools
 Pivot Chart and its tools
 Pivot Slicer
 Sparkline Tool
 Histogram using Graph tab
 Histogram frequency distribution
 Histogram – Chart output
 Histogram – Pareto (sorted diagram)
 Histogram – Cumulative percentage
 Descriptive statistics
 Correlation

HYPOTHESIS TESTING
 T-test one sample test using Dummy (One Tailed)
 t-Test Two-Sample Assuming Equal Variances
 t-Test Paired Two Sample for Means
 Two sample - Independent sample t test
 Two sample - Paired Sample t test

 Two sample z test
 ANOVA – Single Factor
 ANOVA – Two Factor without replication
 ANOVA – Two Factor with replication
 F test
 Chi square test
INTRODUCTION TO R
 How to install R Studio
 Four Panes in R
 Import of Data Sheet in Excel
 Correlation
 Hypothesis Testing: Two sample - Independent sample t test
 Hypothesis Testing: Two sample - Paired Sample t test
 Hypothesis Testing: One-way ANOVA
 Hypothesis Testing: F test
 Hypothesis Testing: Chi square test



RESEARCH METHODOLOGY
Meaning of research:
Research, in simple terms, refers to a search for knowledge. It is a scientific and systematic
search for information on a particular topic or issue. It is also known as the art of scientific
investigation. Several social scientists have defined research in different ways. In the
Encyclopedia of Social Sciences, D. Slesinger and M. Stephenson (1930) defined research as
“the manipulation of things, concepts or symbols for the purpose of generalizing to extend,
correct or verify knowledge, whether that knowledge aids in the construction of theory or in
the practice of an art”. According to Redman and Mory (1923), research is a “systematized
effort to gain new knowledge”. It is an academic activity and therefore the term should be
used in a technical sense. According to Clifford Woody (Kothari, 1988), research comprises
“defining and redefining problems, formulating hypotheses or suggested solutions; collecting,
organizing and evaluating data; making deductions and reaching conclusions; and finally,
carefully testing the conclusions to determine whether they fit the formulated hypotheses”.
Thus, research is an original addition to the available knowledge, which contributes to its
further advancement. It is an attempt to pursue truth through the methods of study,
observation, comparison and experiment. In sum, research is the search for knowledge, using
objective and systematic methods to find a solution to a problem.

Objectives of Research:
The objective of research is to find answers to questions by applying scientific
procedures. In other words, the main aim of research is to find out the truth which is hidden
and has not yet been discovered. Although every research study has its own specific
objectives, the research objectives may be broadly grouped as follows:
1. To gain familiarity with a phenomenon or to achieve new insights into it (i.e., formulative
research studies);
2. To accurately portray the characteristics of a particular individual, group, or a situation
(i.e., descriptive research studies);
3. To analyze the frequency with which something occurs (i.e., diagnostic research studies);
and
4. To examine the hypothesis of a causal relationship between two variables (i.e., hypothesis-
testing research studies).



FUNCTIONS
COUNT - The COUNT function counts the number of cells that contain numbers,
and counts numbers within the list of arguments.

COUNT A - The COUNTA function counts cells containing any type of information,
including error values and empty text ("")



COUNT BLANK - Use the COUNTBLANK function to count blank cells in a range,
where the word blank means empty.
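
The difference between the three counting functions can be sketched outside Excel. The following is a minimal Python sketch, not Excel's own implementation: the function names mirror the Excel ones, the sample column is hypothetical, `None` stands for an empty cell and `""` for a formula that returns empty text.

```python
def count(cells):
    # Like COUNT: only cells holding numbers (logical values in a range are not counted)
    return sum(1 for c in cells
               if isinstance(c, (int, float)) and not isinstance(c, bool))

def counta(cells):
    # Like COUNTA: every cell that is not truly empty, including empty text
    return sum(1 for c in cells if c is not None)

def countblank(cells):
    # Like COUNTBLANK: truly empty cells plus cells holding empty text
    return sum(1 for c in cells if c is None or c == "")

# Hypothetical column: None = empty cell, "" = formula returning empty text
column = [10, "text", None, "", 3.5, True]
print(count(column), counta(column), countblank(column))
```

Note that the cell holding empty text is counted by both COUNTA and COUNTBLANK, so the two results need not add up to the number of cells.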

SUM - The SUM function adds values. You can add individual values, cell references or
ranges or a mix of all three



AVERAGE - The Excel AVERAGE function calculates the average (arithmetic
mean) of supplied numbers. AVERAGE can handle up to 255 individual arguments, which
can include numbers, cell references, ranges, arrays, and constants.

COUNT IF - COUNTIF is an Excel function to count cells in a range that meet a single
condition. COUNTIF can be used to count cells that contain dates, numbers, and text.



AVERAGE IF - The Excel AVERAGEIF function calculates the average of numbers
in a range that meet supplied criteria. AVERAGEIF criteria can include logical operators (>,
<, <>, =) and wildcards (*, ?) for partial matching.

CONCATENATE - The CONCATENATE function allows you to combine text
from different cells into one cell. In our example, we can use it to combine the text in column
A and column B to create a combined name in a new column.

VLOOKUP - VLOOKUP stands for 'Vertical Lookup'. It is a function that
makes Excel search for a certain value in the first column of a table (the so called 'table
array'), in order to return a value from a different column in the same row.
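
The lookup logic can be illustrated with a short Python sketch. It assumes an exact-match lookup only; the function name `vlookup` and the student table are hypothetical and are not part of the report's worksheet.

```python
def vlookup(lookup_value, table, col_index):
    # Search the first column of the table for an exact match and return
    # the value from column col_index (1-based, as in Excel) of that row.
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]
    return None  # Excel would show #N/A when nothing matches

# Hypothetical table array: roll number, name, marks
students = [
    (101, "Asha", 78),
    (102, "Ravi", 65),
    (103, "Meena", 91),
]

print(vlookup(102, students, 3))  # marks of roll number 102
```

HLOOKUP follows the same idea with rows and columns swapped: the first row is searched and the value is returned from a different row of the matching column.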



HLOOKUP - HLOOKUP in Excel stands for 'Horizontal Lookup'. It is a function that
makes Excel search for a certain value in the first row of a table (the so called 'table array'),
in order to return a value from a different row in the same column.

OTHER TOOLS
TRANSPOSE TABLE –
The TRANSPOSE function returns a vertical range of cells as a horizontal range, or vice
versa. The TRANSPOSE function must be entered as an array formula in a range that has the
same number of rows and columns, respectively, as the source range has columns and rows.



CONDITIONAL FORMATTING –
Conditional formatting is a feature included in the popular spreadsheet creation
programs Excel and Google Sheets. This feature automatically applies formatting, such as
font colour or bolding, to a cell when the data in that cell meets specific criteria. For example,
in the image, the font colour is automatically changed to red in all cells with negative values.



1. Highlight Cell Rules:
 Text That Contains:



 Greater Than:



 Less Than:



 Between:



 Equal To:



 Duplicate Values:



 A Date Occurring:



2. Top Bottom Rules:
 Top 10 Items:



 Top 10%:



 Bottom 10 Items:



 Bottom 10%:



 Above Average:



 Below Average:



3. Data Bars:



4. Colour Scales:

FORMAT CELLS-
When we format cells in Excel, we change the appearance of a number without changing the
number itself. We can apply a number format (0.8, $0.80, 80%, etc.) or
other formatting (alignment, font, border, etc.).

1. Numbers:



2. Alignment:
 Left Alignment:



 Centre Alignment:

 Right Alignment:



3. Font:



4. Border:

5. Fill:

DATA VALIDATION:
The data validation feature helps you control what can be entered in your worksheet. For
example, you can: create a drop down list of items in a cell. Restrict entries, such as a date
range or whole numbers only.



1. Settings:
 Any Value:

 Number:



 Custom:

2. Input Message:



3. Error Alert:



CUSTOMIZATION:
1. Quick access toolbar:

2. Save as PDF:



DATA VISUALISATION AND ANALYSIS
1. Pivot Table and its tools:
A pivot table is a statistics tool that summarizes and reorganizes selected columns and
rows of data in a spreadsheet or database table to obtain a desired report. The tool does
not actually change the spreadsheet or database itself, it simply “pivots” or turns the data
to view it from different perspectives.

2. Pivot Chart and Its Tools:
A Pivot Chart is a built-in Excel tool that summarizes selected rows and columns of data
in a spreadsheet. It is the visual representation of a pivot table or any tabular data, and it
helps to summarize and analyze datasets, patterns, and trends.



3. PIVOT SLICER–
Slicers provide buttons that you can click to filter tables, or PivotTables. In addition to quick
filtering, slicers also indicate the current filtering state, which makes it easy to understand
what exactly is currently displayed.

4. SPARKLINE TOOLS –
 A Sparkline is a tiny chart in a worksheet cell that provides a visual representation of data.
Use sparklines to show trends in a series of values, such as seasonal increases or decreases,
economic cycles, or to highlight maximum and minimum values.



5. Histogram using Graph tab:
A histogram is a common data analysis tool in the business world. It’s a column chart that
shows the frequency of the occurrence of a variable in the specified range.



6. Histogram frequency distribution

7. Histogram-Chart output:

8. Histogram-Pareto



9. Histogram- Cumulative Percentage



10. Descriptive Statistics:
Using the descriptive statistics feature in Excel means that you won't have to type in
individual functions like AVERAGE or MODE. One button click will return a dozen
different stats for your data set. If you want to calculate Excel descriptive statistics, you
must have the Data Analysis ToolPak loaded in Excel.

11. Correlation:



Hypothesis Testing
T-test one sample test using Dummy (One Tailed)
Research Problem: To determine whether the population mean of age is greater than 40 at
α=0.05 assuming equal variances

Age DUMMY
42 0
76 0
56  
56  
67  
65  
65  
89  
76  
45  
45  
65  
78  
55  
52  

53  
44  
65  
76  
89  
44  
54  
45  
56  
56  
56  
76  
Hypothesis Testing:
Null Hypothesis: The population mean is less than or equal to 40.
H0: µ ≤ 40

Alternate Hypothesis: The population mean is greater than 40.
H1: µ > 40

Result:
                               Age        DUMMY
Mean                           60.96296   0
Variance                       189.1909   0
Observations                   27         2
Pooled Variance                182.1838
Hypothesized Mean Difference   40
df                             27
t Stat                         2.11932
P(T<=t) one-tail               0.0217
t Critical one-tail            1.70329

Decision Rule:
 If t Stat is greater than t Critical, reject the Null Hypothesis; otherwise, accept it.
 If the p value is less than α = 0.05, reject the Null Hypothesis; otherwise, accept it.
Inference:
Since t Stat = 2.12 is greater than t Critical = 1.70, reject the Null Hypothesis.
Since P = 0.022 is less than α = 0.05, reject the Null Hypothesis.



Conclusion:
Therefore, the mean age of the population is greater than 40 at alpha = 0.05 assuming equal
variances.
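
The t Stat in the Excel output can be cross-checked by hand. The sketch below is a Python illustration, not the Excel procedure itself: it assumes the dummy column holds the two zeros shown in the data and applies the pooled-variance formula of the "t-Test: Two-Sample Assuming Equal Variances" tool. For comparison only, it also computes the textbook one-sample t statistic (df = n - 1), which is not part of the report's output and gives a different value.

```python
from math import sqrt
from statistics import mean, variance

age = [42, 76, 56, 56, 67, 65, 65, 89, 76, 45, 45, 65, 78, 55, 52,
       53, 44, 65, 76, 89, 44, 54, 45, 56, 56, 56, 76]
dummy = [0, 0]   # the two dummy observations used in the worksheet
mu0 = 40         # hypothesized mean

n1, n2 = len(age), len(dummy)

# Pooled variance and t statistic, as in the two-sample equal-variance tool
pooled = ((n1 - 1) * variance(age) + (n2 - 1) * variance(dummy)) / (n1 + n2 - 2)
t_dummy = (mean(age) - mean(dummy) - mu0) / sqrt(pooled * (1 / n1 + 1 / n2))

# Textbook one-sample t statistic, shown only for comparison
t_one_sample = (mean(age) - mu0) / sqrt(variance(age) / n1)

t_critical = 1.70329  # one-tail critical value quoted in the Excel output
print(round(t_dummy, 5), round(t_one_sample, 3), t_dummy > t_critical)
```

Both statistics lie above the critical value, so the decision to reject the null hypothesis does not depend on which of the two formulas is used.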

T-test Two-Sample Assuming Equal Variances


Research Problem – To find out whether the results of students for statistical software
in May are better than the results in January.
Data:

Jan May
45 56
54 57
44 45
56 67
34 44
45 34
34 34
67 76
45 56
54 45
67 76
56 87
56 66
56 65
76 45
76 76
Hypothesis Testing:
Null Hypothesis: Result in January is better than or equal to result in May
H0: µJan ≥ µMay or µJan - µMay ≥ 0

Alternate Hypothesis: Result in January is less than the result in May
H1: µJan < µMay or µJan - µMay < 0

Result:

                               Jan        May
Mean                           54.0625    58.0625
Variance                       164.3292   258.0625
Observations                   16         16
Pooled Variance                211.1958
Hypothesized Mean Difference   0
df                             30
t Stat                         -0.77851
P(T<=t) one-tail               0.221184
t Critical one-tail            1.697261

Decision Rule:
 This is a left-tailed test (H1: µJan < µMay), so if t Stat is less than -t Critical, reject the
Null Hypothesis; otherwise, accept it.
 If the p value is less than α = 0.05, reject the Null Hypothesis; otherwise, accept it.

Inference:
Since t Stat = -0.77851 is not less than -t Critical = -1.697261, accept the Null Hypothesis.
Since P = 0.221184 is greater than α = 0.05, accept the Null Hypothesis.
Conclusion:
Therefore, there is no evidence that the results of students for statistical software in May are
better than the results in January.
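
As a cross-check of the Excel output, the pooled-variance t statistic can be recomputed from the January and May marks. This is a minimal Python sketch; the helper name `pooled_t` is an assumption introduced for illustration, and the critical value is copied from the Excel result rather than computed.

```python
from math import sqrt
from statistics import mean, variance

jan = [45, 54, 44, 56, 34, 45, 34, 67, 45, 54, 67, 56, 56, 56, 76, 76]
may = [56, 57, 45, 67, 44, 34, 34, 76, 56, 45, 76, 87, 66, 65, 45, 76]

def pooled_t(x, y, hypothesized_diff=0.0):
    # Return (t statistic, degrees of freedom) for an equal-variance two-sample t test
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y) - hypothesized_diff) / sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, df

t_stat, df = pooled_t(jan, may)
t_critical = 1.697261                # one-tail critical value from the Excel output
reject_h0 = t_stat < -t_critical     # left-tailed test: H1 is muJan < muMay
print(round(t_stat, 5), df, reject_h0)
```

Because the alternative hypothesis points to the left tail, the statistic is compared with the negative of the tabulated critical value.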

T-Test: Paired Two Sample for Means


Research Problem: To determine whether the mean time to exhaustion is greater after chocolate
milk than after carbohydrate replacement drink.
Use a significance level of 0.05.

Data:

Cyclist   Chocolate Milk   Carbohydrate Replacement Drink
1         50.46            42.9
2         47.08            50.1
3         57.51            41.67
4         46.6             32.69
5         29.1             46.33
6         57.5             31.63
7         23.87            20.61
8         28.65            14.99
9         35.37            20.11

Hypothesis Testing:
Null Hypothesis: The mean time to exhaustion after chocolate milk is less than or equal to
that after the carbohydrate replacement drink.
H0: µcm ≤ µcd OR µcm - µcd ≤ 0

Alternate Hypothesis: The mean time to exhaustion is greater after chocolate milk.
H1: µcm > µcd OR µcm - µcd > 0

Result:

                               Chocolate Milk   Carbohydrate Replacement Drink
Mean                           41.79333333      33.44777778
Variance                       164.53125        160.9338194
Observations                   9                9
Pearson Correlation            0.508406248
Hypothesized Mean Difference   0
df                             8
t Stat                         1.979280834
P(T<=t) one-tail               0.0415706
t Critical one-tail            1.859548038

Decision Rule:
 If t Stat is greater than t Critical, reject the Null Hypothesis; otherwise, accept it.
 If the p value is less than α = 0.05, reject the Null Hypothesis; otherwise, accept it.
Inference:
Since t Stat = 1.98 is greater than t Critical = 1.86, reject the Null Hypothesis.
Since P = 0.042 is less than α = 0.05, reject the Null Hypothesis.
Conclusion:
Therefore, the mean time to exhaustion is greater after chocolate milk than after carbohydrate
replacement drink.
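
The paired t statistic depends only on the per-cyclist differences, which the following Python sketch makes explicit. It is an illustrative cross-check of the Excel output, with the critical value copied from that output rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

chocolate_milk = [50.46, 47.08, 57.51, 46.6, 29.1, 57.5, 23.87, 28.65, 35.37]
carb_drink = [42.9, 50.1, 41.67, 32.69, 46.33, 31.63, 20.61, 14.99, 20.11]

# A paired test works on the difference within each pair
diffs = [cm - cd for cm, cd in zip(chocolate_milk, carb_drink)]
n = len(diffs)
df = n - 1

# t = mean difference divided by its standard error
t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))

t_critical = 1.859548  # one-tail critical value from the Excel output
print(round(t_stat, 5), df, t_stat > t_critical)
```

Working on differences removes the variation between cyclists, which is why the paired test is used here instead of an independent-samples test.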

Two-Sample Independent sample test


Research Problem: To analyse whether there is a significant difference between the marks
scored by class groups A & B in mathematics.

Group A Group B
76 95
87 97
98 87
45 89
66 87
78 45
76 76
88 56
78 76
87 87
54 45
65 76
76 45
89 88
65 76
78 66
54 78
87 56
45 77
Hypothesis Testing:
H0: µA - µB = 0

H1: µA - µB ≠ 0

Result:
                               Variable 1     Variable 2
Mean                           73.26315789    73.78947368
Variance                       236.7602339    287.3976608
Observations                   19             19
Hypothesized Mean Difference   0
df                             36
t Stat                         -0.100205633
P(T<=t) one-tail               0.460368525
t Critical one-tail            1.688297714
P(T<=t) two-tail               0.92073705
t Critical two-tail            2.028094001

Decision Rule:
 This is a two-tailed test, so if the absolute value of t Stat is greater than t Critical two-tail,
reject the Null Hypothesis; otherwise, accept it.
 If the two-tail p value is less than α = 0.05, reject the Null Hypothesis; otherwise, accept it.
Inference:
Since |t Stat| = 0.100 is less than t Critical two-tail = 2.03, accept the Null Hypothesis.
Since P = 0.92 is greater than α = 0.05, accept the Null Hypothesis.
Conclusion:
Therefore, there is no significant difference between the marks scored by class groups A & B in
mathematics.

Two Sample – Paired Sample t test


Research Problem: To determine whether the mean weight after the diet is less than the mean
weight before the diet.

Data:

Before After

162 168
170 136
184 147
164 159
172 143
176 161
159 143
170 145
Hypothesis Testing:
Null Hypothesis: The mean weight after the diet is more than or equal to the mean weight
before the diet.
H0: µa ≥ µb OR µa - µb ≥ 0

Alternate Hypothesis: The mean weight after the diet is less than the mean weight before the
diet.
H1: µa < µb OR µa - µb < 0

Result:
t-Test: Paired Two Sample for Means

                               Variable 1 (Before)   Variable 2 (After)
Mean                           169.625               150.25
Variance                       65.125                121.9285714
Observations                   8                     8
Pearson Correlation            -0.176747772
Hypothesized Mean Difference   0
df                             7
t Stat                         3.706873373
P(T<=t) one-tail               0.003792994
t Critical one-tail            1.894578605

Decision Rule:
 If t Stat is greater than t Critical, reject the Null Hypothesis; otherwise, accept it.
 If the p value is less than α = 0.05, reject the Null Hypothesis; otherwise, accept it.
Inference:
Since t Stat = 3.707 (computed as Before minus After) is greater than t Critical = 1.895, reject
the Null Hypothesis.
Since P = 0.004 is less than α = 0.05, reject the Null Hypothesis.
Conclusion:
Therefore, the mean weight after the diet is less than the mean weight before the diet.

Two Sample Z Test


Research Problem: The net annual returns (the returns on investment after deducting all
relevant fees) in percentage are given. Can investors do better by buying mutual funds
directly from banks or other financial institutions than by purchasing mutual funds through
brokers? Can we conclude at the 5% significance level that directly-purchased mutual funds
outperform mutual funds bought through brokers?
Direct Broker
9.33 3.24
6.94 -6.76
16.17 12.8

16.97 11.1
5.94 2.73
12.61 -0.13
3.33 18.22
16.13 -0.8
11.2 -5.75
1.14 2.59
4.68 3.71
3.09 13.15
7.26 11.05
2.05 -3.12
13.07 8.94
0.59 2.74
13.57 4.07
0.35 5.6
2.69 -0.85
18.45 -0.28
4.23 16.4
10.28 6.39
7.1 -1.9
-3.09 9.49
5.6 6.7
5.27 0.19
8.09 12.39
15.05 6.54
13.21 10.92
1.72 -2.15
14.69 4.36
-2.97 -11.07
10.37 9.24
-0.63 -2.67
-0.15 8.97
0.27 1.87
4.59 -1.53
6.38 5.23
-0.24 6.87
10.32 -1.69
10.29 9.43
4.39 8.31
-2.06 -3.99
7.66 -4.44
10.83 8.63
14.48 7.06
4.8 1.57
13.12 -8.44
-6.54 -5.72
-1.06 6.95



Hypothesis Testing:
H0: µd ≤ µb or µd - µb ≤ 0

H1: µd > µb or µd - µb > 0

Result:

  Direct Broker
Mean 6.6312 3.7232
Known Variance 37.488 43.339
Observations 50 50
Hypothesized Mean Difference 0
z 2.28718437
P(Z<=z) one-tail 0.011092532
z Critical one-tail 1.644853627

Decision Rule:
 If z Stat is greater than z Critical, reject the Null Hypothesis; otherwise, accept it.
 If the p value is less than α = 0.05, reject the Null Hypothesis; otherwise, accept it.
Inference:
Since z Stat = 2.29 is greater than z Critical = 1.645, reject the Null Hypothesis.
Since P = 0.011 is less than α = 0.05, reject the Null Hypothesis.
Conclusion:
Therefore, the net annual returns on mutual funds are greater when investors purchase them directly
from banks rather than through brokers.
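
The z statistic, p value and critical value follow directly from the summary figures in the result table. The Python sketch below is an illustrative cross-check that starts from those reported means and known variances instead of the raw returns.

```python
from math import sqrt
from statistics import NormalDist

# Summary figures taken from the Excel result table
mean_direct, mean_broker = 6.6312, 3.7232
var_direct, var_broker = 37.488, 43.339
n_direct = n_broker = 50

# z = difference of means divided by the standard error of the difference
z = (mean_direct - mean_broker) / sqrt(var_direct / n_direct + var_broker / n_broker)

standard_normal = NormalDist()
p_one_tail = 1 - standard_normal.cdf(z)       # right-tail probability
z_critical = standard_normal.inv_cdf(0.95)    # one-tail critical value at alpha = 0.05

print(round(z, 5), round(p_one_tail, 6), round(z_critical, 6))
```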



ANOVA – Single Factor
Research Problem: To test whether the mean marks of the students in subjects- Economics,
Science and History are all equal

Marks ( one factor/variable)


economics science history
42 69 35
53 54 40
49 58 53
53 64 42
43 64 50
44 55 39
45 56 55
52   39
54   40

Hypothesis Testing:
H0: Mean marks of all subjects are equal
μe = μs = μh
H1: Mean marks of at least one subject are different

Result:
SUMMARY
Groups      Count   Sum   Average       Variance
economics   9       435   48.33333333   23.5
science     7       420   60            32.33333
history     9       393   43.66666667   50.5

ANOVA
Source of Variation   SS        df   MS            F          P-value    F crit
Between Groups        1085.84   2    542.92        15.19623   7.16E-05   3.443357
Within Groups         786       22   35.72727273
Total                 1871.84   24

Decision Rule:
If F stat > F critical, reject H0
If p value < 0.05, reject H0
Inference:
Since F Stat = 15.196 is greater than F Critical = 3.44, reject the Null Hypothesis.
Since P = 7.16E-05 (0.0000716) is less than α = 0.05, reject the Null Hypothesis.
Conclusion:
Therefore, the mean marks of the students in subjects- Economics, Science and History are
not all equal.
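
The sums of squares and the F statistic in the ANOVA table can be recomputed from the marks. This Python sketch is a hand calculation for illustration; the critical value is copied from the Excel output rather than computed.

```python
from statistics import mean

groups = {
    "economics": [42, 53, 49, 53, 43, 44, 45, 52, 54],
    "science": [69, 54, 58, 64, 64, 55, 56],
    "history": [35, 40, 53, 42, 50, 39, 55, 39, 40],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = mean(all_values)

# Between-groups: spread of the group means around the grand mean
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
# Within-groups: spread of each observation around its own group mean
ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

f_critical = 3.443357  # F crit from the Excel output
print(round(ss_between, 2), round(ss_within, 2), round(f_stat, 4), f_stat > f_critical)
```

The P-value 7.16E-05 in the table is scientific notation for 0.0000716, which is far below 0.05 and agrees with the F comparison.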



ANOVA – Two factor Without Replication
Research Problem: To test whether the mean marks differ between students (row wise) and
between subjects (column wise).

student economics science history


a 42 69 35
b 53 54 40
c 49 58 53
d 53 64 42
e 43 64 50

Hypothesis Testing:
Row wise
H0: no significant difference in mean marks of students
H1: Mean marks of at least one of the students is different
Column wise
H0: no significant difference in marks of subjects
H1: Mean marks of at least one of the subjects is different

Result:
Anova: Two-Factor Without Replication

SUMMARY     Count   Sum   Average    Variance
a           3       146   48.66667   322.3333
b           3       147   49         61
c           3       160   53.33333   20.33333
d           3       159   53         121
e           3       157   52.33333   114.3333

economics   5       240   48         28
science     5       309   61.8       34.2
history     5       220   44         54.5

ANOVA
Source of Variation   SS        df   MS         F          P-value    F crit
Rows                  60.9333   4    15.23333   0.300263   0.869889   3.837853
Columns               872.133   2    436.0667   8.595269   0.010172   4.45897
Error                 405.867   8    50.73333
Total                 1338.93   14

Decision Rule:
If F stat > F critical, reject H0
If p value < 0.05, reject H0
Inference:
Row Wise
Since F Stat = 0.300 is less than F Critical = 3.84, accept the Null Hypothesis.
Since P = 0.87 is greater than α = 0.05, accept the Null Hypothesis.
Column Wise
Since F Stat = 8.60 is greater than F Critical = 4.46, reject the Null Hypothesis.
Since P = 0.01 is less than α = 0.05, reject the Null Hypothesis.

Conclusion:
Therefore,
The mean marks do not differ significantly between students (row wise), but the mean marks
of the subjects- Economics, Science and History differ (column wise).

ANOVA – Two factor With Replication


Research Problem: To test whether or not marks of students differ with respect to school,
subject wise and school wise in conjunction with the subjects.
           economics   science   history
SCHOOL A   42          69        35
           53          54        40
           49          58        53
           53          64        42
           43          64        50
SCHOOL B   44          55        39
           45          56        55
           52          0         39
           54          0         40
           0           0         0

Hypothesis Testing:

H0: No significant difference between the mean marks of School A and School B (row wise)
H0: No significant difference between the mean marks of economics, science and history (column wise)
H0: No significant difference between School A and School B subject-wise (interactions)



Result:
Anova: Two-Factor With Replication

SUMMARY    economics   science    history    Total
SCHOOL A
Count      5           5          5          15
Sum        240         309        220        769
Average    48          61.8       44         51.26667
Variance   28          34.2       54.5       95.6381
SCHOOL B
Count      5           5          5          15
Sum        195         111        173        479
Average    39          22.2       34.6       31.93333
Variance   494         924.2      420.3      579.4952
Total
Count      10          10         10
Sum        435         420        393
Average    43.5        42         39.3
Variance   254.5       861.5556   235.5667

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Sample                2803.333   1    2803.333   8.6027     0.007272   4.259677
Columns               90.6       2    45.3       0.139014   0.870912   3.402826
Interaction           1540.467   2    770.2333   2.363646   0.115611   3.402826
Within                7820.8     24   325.8667
Total                 12255.2    29

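Decision Rule:
If F stat > F critical, reject H0
If p value < 0.05, reject H0

Inference:
Sample (school): Since F = 8.60 is greater than F crit = 4.26 and p value = 0.007 is less than α = 0.05, reject H0
Columns (subject): Since F = 0.14 is less than F crit = 3.40 and p value = 0.87 is greater than α = 0.05, accept H0
Interaction: Since F = 2.36 is less than F crit = 3.40 and p value = 0.116 is greater than α = 0.05, accept H0
Conclusion:
Therefore, marks of students differ with respect to school, but not subject wise or school wise in
conjunction with the subjects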

F-Test Two-Sample for Variances



Research Problem: To determine whether the variance of Class 1 is greater than the variance of
Class 2 in mathematics.

Class1 Class2
65 76
76 54
65 67
76 65
56 76
45 66

Hypothesis Testing:
H0: Variance group 1 = Variance group 2

H1: Variance group 1 > Variance group 2

Result:

  Class1 Class2

Mean 63.83333333 67.33333333

Variance 142.9666667 67.06666667

Observations 6 6

df 5 5

F 2.131709742

P(F<=f) one-tail 0.212888468


F Critical one-tail 5.050329058  

Decision Rule:
 If F is greater than F Critical, reject the Null Hypothesis.
 If F is less than or equal to F Critical, accept the Null Hypothesis.
Inference:
Since F = 2.13 is less than F Critical one-tail = 5.05 (and P = 0.21 is greater than α = 0.05), we
accept the null hypothesis
Conclusion:
Therefore, there is no evidence that the variance of Class 1 is greater than the variance of Class 2 in mathematics
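
The F statistic is simply the ratio of the two sample variances, which can be verified with a few lines of Python. This is an illustrative cross-check; the critical value is copied from the Excel output rather than computed.

```python
from statistics import variance

class1 = [65, 76, 65, 76, 56, 45]
class2 = [76, 54, 67, 65, 76, 66]

# F = sample variance of Class 1 divided by sample variance of Class 2
f_stat = variance(class1) / variance(class2)

f_critical = 5.050329           # F Critical one-tail from the Excel output
reject_h0 = f_stat > f_critical
print(round(f_stat, 5), reject_h0)
```

With only six observations per class, a variance ratio of about 2 is well within what chance alone can produce, which is why the critical value is as high as 5.05.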



Chi Square test
Research Problem: To determine whether brand preference is independent of age group
Age/Brand      Brand1   Brand2   Brand3   Row total
15-25          65       76       72       213
26-35          60       40       64       164
36-45          45       52       50       147
46-55          55       65       60       180
Column Total   225      233      246      704

Hypothesis Testing:
H0: There is no association between brand preference and age group

H1: There is an association between brand preference and age group

Result:
Observed   Expected      O-E            (O-E)^2/E
65         68.07528409   -3.075284091   0.138925197
60         52.41477273   7.585227273    1.097699557
45         46.98153409   -1.981534091   0.083574907
55         57.52840909   -2.528409091   0.11112514
76         70.49573864   5.504261364    0.429769143
40         54.27840909   -14.27840909   3.756060091
52         48.65198864   3.348011364    0.230395106
65         59.57386364   5.426136364    0.494226059
72         74.42897727   -2.428977273   0.079269269
64         57.30681818   6.693181818    0.781733907
50         51.36647727   -1.366477273   0.036351727
60         62.89772727   -2.897727273   0.13349963
Chi Square Value                        7.373

Decision Rule:
If the calculated value is greater than the critical value, we reject the null hypothesis; otherwise,
we accept it
Inference:
With (4 - 1) x (3 - 1) = 6 degrees of freedom, the critical value at α = 0.05 is 12.592. Since the
calculated value 7.373 is less than the critical value, accept the null hypothesis
Conclusion:
Therefore, brand preference is independent of age group
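
The expected frequencies and the chi-square value can be reproduced from the observed table. The Python sketch below is a hand calculation for illustration; the critical value 12.592 is the standard table value for 6 degrees of freedom at α = 0.05 and is an addition, since the report does not state it.

```python
# Observed frequencies: rows are age groups, columns are Brand1, Brand2, Brand3
observed = [
    [65, 76, 72],   # 15-25
    [60, 40, 64],   # 26-35
    [45, 52, 50],   # 36-45
    [55, 65, 60],   # 46-55
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        # Expected count under independence = row total * column total / grand total
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
chi_critical = 12.592   # table value for df = 6, alpha = 0.05 (not stated in the report)
reject_h0 = chi_square > chi_critical
print(round(chi_square, 3), df, reject_h0)
```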

INTRODUCTION TO R
1. How to Install RStudio:
To Install R

1. Open an internet browser and go to www.r-project.org.

2. Click the "download R" link in the middle of the page under "Getting Started."

3. Select a CRAN location (a mirror site) and click the corresponding link.

4. Click on the "Download R for (Mac) OS X" link at the top of the page.

5. Click on the file containing the latest version of R under "Files."

6. Save the .pkg file, double-click it to open, and follow the installation
instructions.

7. Now that R is installed, you need to download and install RStudio.

To Install RStudio

1. Go to www.rstudio.com and click on the "Download RStudio" button.

2. Click on "Download RStudio Desktop."

3. Click on the version recommended for your system, or the latest Mac version,
save the .dmg file on your computer, double-click it to open, and then drag and drop it to
your applications folder.

2. Four panes in RStudio
RStudio has four main panes each in a quadrant of your screen: Source
Editor, Console, Workspace Browser (and History), and Plots (and Files, Packages, Help).
These can also be adjusted under the ‘Preferences’ menu. Note that there might be subtle
differences between RStudio installations on different operating systems

 TOP RIGHT:
Environment and history window. The environment window contains objects (data, values,
functions) R has currently stored in its memory. The history window shows all commands
that were executed in the Console.



 TOP LEFT:
Text editor or script window. This is where you can save and edit collections of commands.

 BOTTOM RIGHT:
Files, plots, packages, help, and viewer pane. Here you can open files, view plots, install and
load packages, read man pages, and view markdown and other documents in the viewer tab.



 BOTTOM LEFT:
Console or command window. Here you can type any valid R command after the prompt
followed by Enter and R will execute that command.

3. Import an Excel Data Sheet into RStudio:


Importing data into R is a necessary step that, at times, can become time intensive. To
ease this task, RStudio includes new features to import data from: csv, xls, xlsx, sav, dta,
por, SAS, SPSS and Stata files.



Steps:
 Go to File > Import Dataset > From Excel.

 Click Browse and select the file to be imported.



 Select the sheet to be imported from Default.

 Click OK after selecting the sheet to be imported.

4. Correlation:
Correlation is a statistical technique that can show whether and how strongly pairs of
variables are related.



 Data Set
Engine capacity weight in kg
250 150
500 200
650 269
1550 3500
2000 4000
3000 5000
4000 6000

 Import the data sheet to RStudio

 Attach the data sheet using attach() command.



 Run the correlation test using the cor.test() command.

5. Independent Samples:
 Data Set
Score shift
3.1 Part time
3.4 Part time
4.6 Part time
2.8 Part time

2.3 Part time
1.5 Part time
3.8 Part time
9.5 Part time
4.3 Part time
2.7 Part time
1.6 Part time
1.6 Part time
3.2 Part time
4.2 Part time
3.9 Part time
1.2 Part time
3.2 Full time
1.5 Full time
6.5 Full time
0.2 Full time
3.7 Full time
3.3 Full time
1.7 Full time
3.6 Full time
3.8 Full time
5.3 Full time
6.9 Full time
3.6 Full time
1.7 Full time
1.2 Full time
7.2 Full time
3.9 Full time
1.9 Full time
5.3 Full time



 Import the file to RStudio.

 Attach the file and run the test by using t.test() command.

6. Paired Sample
 Data Set
Subject Before After
1 135 127
2 142 145
3 137 131

4 122 125
5 147 132
6 151 147
7 131 119
8 117 125
9 154 132
10 143 139
11 133 122
12 125 125
13 154 138
14 156 121
15 132 126
16 125 127
17 136 128
18 141 132
19 129 131
20 148 139
21 136 120
22 120 119
23 147 135
24 138 142
25 158 130
 Import the data sheet to RStudio

 Attach the file by using the attach() command.



 Run the test by using t.test() command.

7. One way ANOVA


 Data Set
economics medicine history
42 69 35
53 54 40
49 58 53

53 64 42
43 64 50
44 55 39
45 56 55
52 39
54 40
 Import the data sheet to RStudio

 Attach the data sheet using the attach() command.

 Use the Following commands to run the test.

8. F test
 Data Set

Class1 Class2
65 76
76 54

65 67
76 65
56 76
45 66
 Import the data sheet to RStudio

 Attach the data sheet using the attach() command.



 Run the test using var.test() command.

9. Chi Square test


 Data Set
Hair eyes height sex
Brown Blue 63 M
Brown Brown 62 F
Brown Brown 60 F
Brown Brown 69 M
Brown Brown 71 F
blonde Blue 75 F
black green 65 M
black blue 55 F
black green 68 F
black green 69 M
Brown green 59 M
blonde green 72 M



 Import the data sheet to RStudio

 Attach the data sheet using the attach() command.

 Use the following commands to run chi square test.
