Hrishabh-Research Methodology File

PROJECT REPORT
ON
“RESEARCH METHODOLOGY”
SUBMITTED IN PARTIAL FULFILLMENT FOR THE
AWARD OF THE DEGREE OF
BACHELOR OF COMMERCE (H)
2019-2022
UNDER THE GUIDANCE OF
MS. NUPUR ARORA
FACULTY, VIPS
SUBMITTED BY:
Name: HRISHABH SRIVASTAVA
Enrolment No. 12517788819
B.COM (H)
Vivekananda School of Business Studies

Vivekananda Institute of Professional studies
AU Block (Outer Ring Road), Pitampura
Delhi-110034
Research Methodology File 1 Hrishabh Srivastava

INDEX
TOPICS
FUNCTIONS
 Count
 Count A
 Count Blank
 Sum
 Average
 Count If
 Average If
 Concatenate
 VLOOKUP
 HLOOKUP
OTHER TOOLS
 Transpose table
 Conditional Formatting- Highlight Cell rules (greater than, less than,
 between, equal to, text that contains)
 Conditional Formatting - Duplicate values
 Conditional Formatting - Top/ Bottom rules
 Conditional Formatting - Data Bars
 Conditional Formatting - Color Scales
 Format Cells – Number, Alignment, Font, Border, Fill
 Data validation – settings (Any value, number, custom)
 Data validation – Input message
 Data validation – Error alert
 Customization- Quick access toolbar
 Save as adobe pdf
DATA VISUALIZATION AND ANALYSIS
 Pivot Table and its tools
 Pivot Chart and its tools
 Pivot Slicer
 Sparkline Tool
 Histogram using Graph tab
 Histogram frequency distribution
 Histogram – Chart output
 Histogram – Pareto (sorted diagram)
 Histogram – Cumulative percentage
 Descriptive statistics
 Correlation
HYPOTHESIS TESTING
 T-test one sample test using Dummy (One Tailed)
 t-Test Two-Sample Assuming Equal Variances
 t-Test Paired Two Sample for Means
 Two sample - Independent sample t test
 Two sample - Paired Sample t test

 Two sample z test
 ANOVA – Single Factor
 ANOVA – Two Factor without replication
 ANOVA – Two Factor with replication
 F test
 Chi square test
INTRODUCTION TO R
 How to install R Studio
 Four Panes in R
 Import of Data Sheet in Excel
 Correlation
 Hypothesis Testing: Two sample - Independent sample t test
 Hypothesis Testing: Two sample - Paired Sample t test
 Hypothesis Testing: One-way ANOVA
 Hypothesis Testing: F test
 Hypothesis Testing: Chi square test

RESEARCH METHODOLOGY
Meaning of research:
Research, in simple terms, refers to a search for knowledge. It is a scientific and systematic
search for information on a particular topic or issue. It is also known as the art of scientific
investigation. Several social scientists have defined research in different ways. In the
Encyclopedia of Social Sciences, D. Slesinger and M. Stephenson (1930) defined research as
“the manipulation of things, concepts or symbols for the purpose of generalizing to extend,
correct or verify knowledge, whether that knowledge aids in the construction of theory or in
the practice of an art”. According to Redman and Mory (1923), research is a “systematized
effort to gain new knowledge”. It is an academic activity and therefore the term should be
used in a technical sense. According to Clifford Woody (Kothari, 1988), research comprises
“defining and redefining problems, formulating hypotheses or suggested solutions; collecting,
organizing and evaluating data; making deductions and reaching conclusions; and finally,
carefully testing the conclusions to determine whether they fit the formulated hypotheses”.
Thus, research is an original addition to the available knowledge, which contributes to its
further advancement. It is an attempt to pursue truth through the methods of study,
observation, comparison and experiment. In sum, research is the search for knowledge, using
objective and systematic methods to find a solution to a problem.
Objectives of Research:
The objective of research is to find answers to the questions by applying scientific
procedures. In other words, the main aim of research is to find out the truth which is hidden
and has not yet been discovered. Although every research study has its own specific
objectives, the research objectives may be broadly grouped as follows:
1. To gain familiarity with a phenomenon or to achieve new insights into it (i.e., exploratory
or formulative research studies);
2. To accurately portray the characteristics of a particular individual, group, or a situation
(i.e., descriptive research studies);
3. To analyze the frequency with which something occurs (i.e., diagnostic research studies);
and
4. To examine the hypothesis of a causal relationship between two variables (i.e., hypothesis-
testing research studies).

FUNCTIONS
COUNT - The COUNT function counts the number of cells that contain numbers,
and counts numbers within the list of arguments.
COUNT A - The COUNTA function counts cells containing any type of information,
including error values and empty text ("")

COUNT BLANK - Use the COUNTBLANK function to count blank cells in a range,
where the word blank means empty.
SUM - The SUM function adds values. You can add individual values, cell references or
ranges or a mix of all three

AVERAGE - The Excel AVERAGE function calculates the average (arithmetic
mean) of supplied numbers. AVERAGE can handle up to 255 individual arguments, which
can include numbers, cell references, ranges, arrays, and constants.
COUNT IF - COUNTIF is an Excel function to count cells in a range that meet a single
condition. COUNTIF can be used to count cells that contain dates, numbers, and text.

AVERAGE IF - The Excel AVERAGEIF function calculates the average of numbers
in a range that meet supplied criteria. AVERAGEIF criteria can include logical operators (>,
<, <>, =) and wildcards (*, ?) for partial matching.
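The counting and averaging functions described above can be imitated outside Excel to see what each one includes. The following is a minimal sketch in Python, not Excel's own implementation; the list `cells` is a hypothetical worksheet column in which `None` stands for an empty cell, and the helper names are made up for illustration.

```python
# Hypothetical worksheet column: None stands for an empty cell.
cells = [12, 7.5, "apple", None, "", 30, "pear", None, 7.5]


def is_number(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# COUNT: only cells that hold numbers.
count = sum(1 for v in cells if is_number(v))

# COUNTA: every non-empty cell, including empty text ("").
count_a = sum(1 for v in cells if v is not None)

# COUNTBLANK: empty cells, plus cells holding empty text.
count_blank = sum(1 for v in cells if v is None or v == "")

# SUM and AVERAGE: use the numeric cells only.
total = sum(v for v in cells if is_number(v))
average = total / count


def count_if(values, predicate):
    # COUNTIF: count the cells that satisfy a single condition.
    return sum(1 for v in values if predicate(v))


def average_if(values, predicate):
    # AVERAGEIF: average the numeric cells that satisfy the condition.
    matches = [v for v in values if is_number(v) and predicate(v)]
    return sum(matches) / len(matches)


greater_than_10 = lambda v: is_number(v) and v > 10
count_gt_10 = count_if(cells, greater_than_10)
average_gt_10 = average_if(cells, greater_than_10)
```

Here the condition plays the role of the criteria string `">10"` that would be typed in Excel.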
CONCATENATE - The CONCATENATE function allows you to combine text
from different cells into one cell. In our example, we can use it to combine the text in column
A and column B to create a combined name in a new column.
VLOOKUP - VLOOKUP stands for 'Vertical Lookup'. It is a function that
makes Excel search for a certain value in the first column of a range (the so-called 'table
array'), in order to return a value from a different column in the same row.

HLOOKUP - HLOOKUP in Excel stands for 'Horizontal Lookup'. It is a function that
makes Excel search for a certain value in the first row of a range (the so-called 'table
array'), in order to return a value from a different row in the same column.
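The lookup logic of VLOOKUP and HLOOKUP can be sketched in a few lines. This is only an illustration of the exact-match case (the FALSE option in Excel), written in Python; the function names `vlookup` and `hlookup` and both small tables are hypothetical and are not taken from the worksheet used in this file.

```python
def vlookup(lookup_value, table, col_index):
    # Search the first column; return the value from col_index in the same row.
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]  # Excel column indexes start at 1
    return None  # Excel would show #N/A here


def hlookup(lookup_value, table, row_index):
    # Search the first row; return the value from row_index in the same column.
    for col, header in enumerate(table[0]):
        if header == lookup_value:
            return table[row_index - 1][col]  # Excel row indexes start at 1
    return None  # Excel would show #N/A here


# Hypothetical vertical table: roll number, name, marks.
students = [
    [101, "Asha", 76],
    [102, "Ravi", 87],
    [103, "Meena", 98],
]

# Hypothetical horizontal table: month names in the first row.
monthly = [
    ["Jan", "May"],
    [45, 56],
    [54, 57],
]

marks_of_102 = vlookup(102, students, 3)
may_first_value = hlookup("May", monthly, 2)
missing = vlookup(999, students, 2)
```

The only difference between the two functions is the direction of the search: down the first column for VLOOKUP, across the first row for HLOOKUP.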
OTHER TOOLS
TRANSPOSE TABLE –
The TRANSPOSE function returns a vertical range of cells as a horizontal range, or vice
versa. The TRANSPOSE function must be entered as an array formula in a range that has the
same number of rows and columns, respectively, as the source range has columns and rows.
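The effect of TRANSPOSE can be sketched as follows. This is a minimal Python illustration of swapping rows and columns, not the Excel function itself; the names in the first column are hypothetical.

```python
# Hypothetical 3 x 3 table: a header row followed by two data rows.
table = [
    ["Name", "Jan", "May"],
    ["Asha", 45, 56],
    ["Ravi", 54, 57],
]

# zip(*table) groups the first items of every row, then the second items,
# and so on, so each original column becomes a row.
transposed = [list(column) for column in zip(*table)]

for row in transposed:
    print(row)
```

A source range with R rows and C columns always gives a result with C rows and R columns, which is why Excel asks for a target range of that shape.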

CONDITIONAL FORMATTING –
Conditional formatting is a feature included in the popular spreadsheet creation
programs Excel and Google Sheets. This feature automatically applies formatting, such as
font colour or bolding, to a cell when the data in that cell meets specific criteria. For example,
in the image, the font colour is automatically changed to red in all cells with negative values.

1. Highlight Cell Rules:
 Text That Contains:

 Greater Than:

 Less Than:

 Between:

 Equal To:

 Duplicate Values:

 A Date Occurring:

2. Top Bottom Rules:
 Top 10 Items:

 Top 10%:

 Bottom 10 Items:

 Bottom 10%:

 Above Average:

 Below Average:

3. Data Bars:

4. Colour Scales:

FORMAT CELLS-
When we format cells in Excel, we change the appearance of a number without changing the
number itself. We can apply a number format (0.8, $0.80, 80%, etc.) or
other formatting (alignment, font, border, etc.).
1. Numbers:

2. Alignment:
 Left Alignment:

 Centre Alignment:
 Right Alignment:

3. Font:

4. Border:

5. Fill:
DATA VALIDATION:
The data validation feature helps you control what can be entered in your worksheet. For
example, you can create a drop-down list of items in a cell, or restrict entries to a date
range or to whole numbers only.

1. Settings:
 Any Value:
 Number:

 Custom:
2. Input Message:

3. Error Message:

CUSTOMIZATION:
1. Quick access toolbar:

2. Save as PDF:

DATA VISUALISATION AND ANALYSIS
1. Pivot Table and its tools:
A pivot table is a statistics tool that summarizes and reorganizes selected columns and
rows of data in a spreadsheet or database table to obtain a desired report. The tool does
not actually change the spreadsheet or database itself, it simply “pivots” or turns the data
to view it from different perspectives.

2. Pivot Chart and Its Tools:
A Pivot Chart is a built-in Excel tool that helps you summarize selected rows and
columns of data in a spreadsheet. It is the visual representation of a pivot table or any
tabular data, and it helps to summarize and analyze datasets, patterns, and trends.

3. PIVOT SLICER–
Slicers provide buttons that you can click to filter tables, or PivotTables. In addition to quick
filtering, slicers also indicate the current filtering state, which makes it easy to understand
what exactly is currently displayed.

4. SPARKLINE TOOLS –
A Sparkline is a tiny chart in a worksheet cell that provides a visual representation of data.
Use sparklines to show trends in a series of values, such as seasonal increases or decreases,
economic cycles, or to highlight maximum and minimum values.

5. Histogram using Graph tab:
A histogram is a common data analysis tool in the business world. It’s a column chart that
shows the frequency of the occurrence of a variable in the specified range.

6. Histogram frequency distribution

7. Histogram-Chart output:
8. Histogram-Pareto

9. Histogram- Cumulative Percentage

10. Descriptive Statistics:
Using the descriptive statistics feature in Excel means that you won't have to type in
individual functions like AVERAGE or MODE. One button click will return a dozen
different stats for your data set. If you want to calculate Excel descriptive statistics, you
must have the Data Analysis ToolPak loaded in Excel.
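To show the kind of figures such a summary contains, here is a minimal sketch in Python using the standard `statistics` module. It is a stand-in for the ToolPak output, not a copy of it; the data are the Age values used in the one-sample t-test later in this file, and the dictionary keys are chosen only for illustration.

```python
import statistics

# Age column from the one-sample t-test section of this file.
ages = [42, 76, 56, 56, 67, 65, 65, 89, 76, 45, 45, 65, 78, 55, 52,
        53, 44, 65, 76, 89, 44, 54, 45, 56, 56, 56, 76]

summary = {
    "count": len(ages),
    "sum": sum(ages),
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "variance": statistics.variance(ages),      # sample variance (n - 1)
    "std_deviation": statistics.stdev(ages),    # sample standard deviation
    "minimum": min(ages),
    "maximum": max(ages),
    "range": max(ages) - min(ages),
}

for name, value in summary.items():
    print(f"{name:15s} {value}")
```

The mean and variance computed this way agree with the values Excel reports for the Age column in the t-test result table.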

11. Correlation:

Hypothesis Testing
T-test one sample test using Dummy (One Tailed)
Research Problem: To determine whether the population mean of age is greater than 40 at
α=0.05 assuming equal variances
Age DUMMY
42 0
76 0
56
56
67
65
65
89
76
45
45
65
78
55
52

53
44
65
76
89
44
54
45
56
56
56
76
Hypothesis Testing:
Null Hypothesis: The population mean is less than or equal to 40.
H 0: µ≤40
Alternate Hypothesis: The population mean is greater than 40.

H 1: µ>40

Result:
                               Age        DUMMY
Mean                           60.96296   0
Variance                       189.1909   0
Observations                   27         2
Pooled Variance                182.1838
Hypothesized Mean Difference   40
df                             27
t Stat                         2.11932
P(T<=t) one-tail               0.0217
t Critical one-tail            1.70329
Decision Rule:
 If t Stat is greater than t Critical, reject Null Hypothesis.
 If t Stat is less than t Critical, accept Null Hypothesis.
 If p value > α (0.05), accept Null Hypothesis.
 If p value < α (0.05), reject Null Hypothesis.
Inference:
Since t Stat=2.119 is greater than t Critical=1.703, reject Null Hypothesis.
Since P=0.0217 is less than α=0.05, reject Null hypothesis.

Conclusion:
Therefore, the mean age of the population is greater than 40 at alpha = 0.05 assuming equal
variances.
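The figures in the result table can be re-derived by hand. The sketch below, in Python, follows the same route as the Excel tool used here: a two-sample pooled-variance t-test of Age against a DUMMY column of two zeros, with a hypothesized mean difference of 40. The function name `pooled_t` is hypothetical, and the critical value is copied from the result table rather than computed.

```python
import math
import statistics


def pooled_t(sample1, sample2, hypothesized_diff=0.0):
    # Two-sample t statistic assuming equal variances.
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)
    var2 = statistics.variance(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    t_stat = (diff - hypothesized_diff) / std_err
    return t_stat, df, pooled_var


ages = [42, 76, 56, 56, 67, 65, 65, 89, 76, 45, 45, 65, 78, 55, 52,
        53, 44, 65, 76, 89, 44, 54, 45, 56, 56, 56, 76]
dummy = [0, 0]  # the DUMMY column from the data sheet

t_stat, df, pooled_var = pooled_t(ages, dummy, hypothesized_diff=40)

t_critical = 1.70329  # one-tail value taken from the result table
reject_h0 = t_stat > t_critical
print(round(t_stat, 5), df, round(pooled_var, 4), reject_h0)
```

Because the DUMMY column has zero variance, the pooled variance depends on the Age column alone, which is how this workaround makes a two-sample tool serve a one-sample question.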
T-test Two-Sample Assuming Equal Variances

Research Problem – To find out whether the results of students for statistical software
in May are better than the results in January.
Data:
Jan May
45 56
54 57
44 45
56 67
34 44
45 34
34 34
67 76
45 56
54 45
67 76
56 87
56 66
56 65
76 45
76 76
Hypothesis Testing:
Null Hypothesis: Result in January is better than or equal to result in May
H 0: µJan ≥ µMay or µJan - µMay ≥ 0
Alternate Hypothesis: Result in January is less than the result in May

H 1: µJan < µMay or µJan - µMay < 0

Result:
                      Jan        May
Mean                  54.0625    58.0625
Variance              164.3292   258.0625
Observations          16         16
Pooled Variance       211.1958
df                    30
t Stat                -0.77851
P(T<=t) one-tail      0.221184
t Critical one-tail   1.697261
Decision Rule:

Inference:
Since t Stat= -0.77851 is greater than -t Critical= -1.697261 (it does not fall in the lower-tail
rejection region), accept Null Hypothesis.
Since P=0.221184 is greater than α=0.05, accept Null hypothesis.
Conclusion:
Therefore, there is not enough evidence that the results of students for statistical software in
May are better than the results in January.
T-Test: Paired Two Sample for Means

Research Problem: To determine that the mean time to exhaustion is greater after chocolate
milk than after carbohydrate replacement drink.
Use a significance level of 0.05.
Data:
Cyclist   Chocolate Milk   Carbohydrate Replacement Drink
1         50.46            42.9
2         47.08            50.1
3         57.51            41.67
4         46.6             32.69
5         29.1             46.33
6         57.5             31.63
7         23.87            20.61
8         28.65            14.99
9         35.37            20.11
Hypothesis Testing:
Null Hypothesis: The mean time to exhaustion after chocolate milk is less than or equal to
that after the carbohydrate replacement drink.
H 0: µcm ≤ µcd OR µcm - µcd ≤ 0
Alternate Hypothesis: The mean time to exhaustion is greater after chocolate milk.
H 1: µcm > µcd OR µcm - µcd > 0

Result:
                               Chocolate Milk   Carbohydrate Replacement Drink
Mean                           41.79333333      33.44777778
Variance                       164.53125        160.9338194
Observations                   9                9
Pearson Correlation            0.508406248
Hypothesized Mean Difference   0
df                             8
t Stat                         1.979280834
P(T<=t) one-tail               0.0415706
t Critical one-tail            1.859548038
Decision Rule:

Inference:
Since t Stat=1.98 is greater than t Critical=1.86, reject Null Hypothesis.
Since P=0.042 is less than α=0.05, reject Null hypothesis.
Conclusion:
Therefore, the mean time to exhaustion is greater after chocolate milk than after carbohydrate
replacement drink.
Two-Sample Independent sample test

Research Problem: To analyse whether there is a significant difference between the marks
scored by class groups A & B in mathematics.
Group A Group B
76 95
87 97
98 87
45 89
66 87
78 45
76 76
88 56
78 76
87 87
54 45
65 76
76 45
89 88
65 76
78 66
54 78
87 56
45 77
Hypothesis Testing:
H 0: µA - µB = 0
H 1: µA - µB ≠ 0

Result:
                               Variable 1     Variable 2
Mean                           73.26315789    73.78947368
Variance                       236.7602339    287.3976608
Observations                   19             19
Hypothesized Mean Difference   0
df                             36
t Stat                         -0.100205633
P(T<=t) one-tail               0.460368525
t Critical one-tail            1.688297714
P(T<=t) two-tail               0.92073705
t Critical two-tail            2.028094001
Decision Rule:

Inference:
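Since the absolute value of t Stat (0.100) is less than t Critical two-tail=2.028, accept Null
Hypothesis.
Since P (two-tail)=0.921 is greater than α=0.05, accept Null hypothesis.

Conclusion:
Therefore, there is no significant difference between the marks scored by class groups A & B in
mathematics.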
Two Sample – Paired Sample t test

Research Problem: To determine that the mean weight after the diet is less than the mean
weight before the diet.
Data:
Before After
162 168
170 136
184 147
164 159
172 143
176 161
159 143
170 145
Hypothesis Testing:
Null Hypothesis: The mean weight after the diet is more than or equal to the mean weight
before the diet.
H 0: µa ≥ µb OR µa - µb ≥ 0
Alternate Hypothesis: The mean weight after the diet is less than the mean weight before the
diet.
H 1: µa < µb OR µa - µb < 0

Result:
t-Test: Paired Two Sample for Means
                      Variable 1    Variable 2
Mean                  169.625       150.25
Variance              65.125        121.9285714
Observations          8             8
Pearson Correlation   -0.176747772
df                    7
t Stat                3.706873373
P(T<=t) one-tail      0.003792994
t Critical one-tail   1.894578605
Decision Rule:

Inference:
Since t Stat= 3.706 is greater than t Critical=1.895, reject Null Hypothesis.
Since P = 0.003 is less than α=0.05, reject Null hypothesis.
Conclusion:
Therefore, the mean weight after the diet is less than the mean weight before the diet.
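The paired t statistic above can be checked by working on the differences directly. This is a minimal sketch in Python, assuming the differences are taken as Before minus After (which matches the positive t Stat in the result table); the critical value is copied from the table rather than computed.

```python
import math
import statistics

before = [162, 170, 184, 164, 172, 176, 159, 170]
after = [168, 136, 147, 159, 143, 161, 143, 145]

# A paired test works on the per-person differences, not on the two columns.
diffs = [b - a for b, a in zip(before, after)]

mean_diff = statistics.mean(diffs)
std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
t_stat = mean_diff / std_err
df = len(diffs) - 1

t_critical = 1.894578605  # one-tail value taken from the result table
reject_h0 = t_stat > t_critical
print(mean_diff, round(t_stat, 5), df, reject_h0)
```

A positive mean difference means weight before the diet was higher on average, so a large positive t supports the alternate hypothesis that the mean weight after the diet is lower.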
Two Sample Z Test

Research Problem: The net annual returns (the returns on investment after deducting all
relevant fees) in percentage are given. Can investors do better by buying mutual funds
directly from banks or other financial institutions than by purchasing mutual funds through
brokers? Can we conclude at the 5% significance level that directly-purchased mutual funds
outperform mutual funds bought through brokers?
Direct Broker
9.33 3.24
6.94 -6.76
16.17 12.8

16.97 11.1
5.94 2.73
12.61 -0.13
3.33 18.22
16.13 -0.8
11.2 -5.75
1.14 2.59
4.68 3.71
3.09 13.15
7.26 11.05
2.05 -3.12
13.07 8.94
0.59 2.74
13.57 4.07
0.35 5.6
2.69 -0.85
18.45 -0.28
4.23 16.4
10.28 6.39
7.1 -1.9
-3.09 9.49
5.6 6.7
5.27 0.19
8.09 12.39
15.05 6.54
13.21 10.92
1.72 -2.15
14.69 4.36
-2.97 -11.07
10.37 9.24
-0.63 -2.67
-0.15 8.97
0.27 1.87
4.59 -1.53
6.38 5.23
-0.24 6.87
10.32 -1.69
10.29 9.43
4.39 8.31
-2.06 -3.99
7.66 -4.44
10.83 8.63
14.48 7.06
4.8 1.57
13.12 -8.44
-6.54 -5.72
-1.06 6.95

Hypothesis Testing:
H 0: µd ≤ µb or µd - µb ≤ 0
H 1: µd > µb or µd - µb > 0

Result:
Direct Broker
Mean 6.6312 3.7232
Known Variance 37.488 43.339
Observations 50 50
z 2.28718437
P(Z<=z) one-tail 0.011092532
z Critical one-tail 1.644853627
Decision Rule:
 If z Stat is greater than z Critical, reject Null Hypothesis.
If z Stat is less than z Critical, accept Null Hypothesis.
Inference:
Since z Stat = 2.287 is greater than z Critical = 1.645, reject Null Hypothesis.
Since P = 0.011 is less than α=0.05, reject Null Hypothesis.
Conclusion:
Therefore, the net annual returns on mutual funds are greater when investors purchase them directly
from banks rather than through brokers.
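The z statistic and its one-tail p value can be reproduced from the summary figures in the result table. The sketch below is in Python and uses only the means, known variances and sample sizes reported above; the function name `two_sample_z` is hypothetical, and the normal tail probability is obtained from the complementary error function in the standard `math` module.

```python
import math


def two_sample_z(mean1, var1, n1, mean2, var2, n2):
    # z statistic for the difference of two means with known variances.
    std_err = math.sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / std_err
    # Upper-tail probability of the standard normal: P(Z > z).
    p_one_tail = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_one_tail


# Summary values copied from the result table (Direct, then Broker).
z_stat, p_value = two_sample_z(6.6312, 37.488, 50, 3.7232, 43.339, 50)

alpha = 0.05
reject_h0 = p_value < alpha
print(round(z_stat, 4), round(p_value, 5), reject_h0)
```

Working from the rounded summary values gives the same z and p value as the table to about four decimal places.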

ANOVA – Single Factor
Research Problem: To test whether the mean marks of the students in subjects- Economics,
Science and History are all equal or not
Marks ( one factor/variable)

economics   science   history
42          69        35
53          54        40
49          58        53
53          64        42
43          64        50
44          55        39
45          56        55
52                    39
54                    40
Hypothesis Testing:
H 0: Mean marks of all subjects are equal
μe =μs=μh
H 1: Mean marks of at least one group is different

Result:
SUMMARY
Groups      Count   Sum   Average       Variance
economics   9       435   48.33333333   23.5
science     7       420   60            32.33333
history     9       393   43.66666667   50.5
ANOVA
Source of Variation   SS        df   MS            F          P-value    F crit
Between Groups        1085.84   2    542.92        15.19623   7.16E-05   3.443357
Within Groups         786       22   35.72727273
Total                 1871.84   24
Decision Rule:
f stat > f critical, reject Ho

If p value < 0.05 , reject H0
Inference:
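Since f Stat=15.196 is greater than f Critical=3.44, reject Null Hypothesis.
Since P=7.16E-05 is less than α=0.05, reject Null hypothesis.
Conclusion:
Therefore, the mean marks of the students in subjects- Economics, Science and History are
not all equal; the mean of at least one subject is different.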
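The sums of squares and the F statistic in the ANOVA table can be recomputed from the raw marks. This is a minimal sketch in Python of the single-factor calculation; the critical value is copied from the table rather than computed, and the variable names are chosen only for illustration.

```python
import statistics

groups = {
    "economics": [42, 53, 49, 53, 43, 44, 45, 52, 54],
    "science": [69, 54, 58, 64, 64, 55, 56],
    "history": [35, 40, 53, 42, 50, 39, 55, 39, 40],
}

all_values = [v for g in groups.values() for v in g]
grand_mean = sum(all_values) / len(all_values)

# Between-groups: how far each group mean lies from the grand mean.
ss_between = sum(
    len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups.values()
)
# Within-groups: spread of the marks around their own group mean.
ss_within = sum(
    (v - statistics.mean(g)) ** 2 for g in groups.values() for v in g
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

f_critical = 3.443357  # value taken from the ANOVA table
reject_h0 = f_stat > f_critical
print(round(ss_between, 2), round(ss_within, 2), round(f_stat, 4), reject_h0)
```

The groups may have different sizes (9, 7 and 9 here), which is why each squared deviation of a group mean is weighted by the size of its group.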

ANOVA – Two factor Without Replication
Research Problem: To test whether mean marks of at least one of the students is
different.
student economics science history

a 42 69 35
b 53 54 40
c 49 58 53
d 53 64 42
e 43 64 50
Hypothesis Testing:
Row wise
H0: no significant difference in mean marks of students
H1: Mean marks of at least one of the students is different
Column wise
H0: no significant difference in marks of subjects
H1: Mean marks of at least one of the subjects is different

Result:
Anova: Two-Factor Without Replication
SUMMARY     Count   Sum   Average    Variance
a           3       146   48.66667   322.3333
b           3       147   49         61
c           3       160   53.33333   20.33333
d           3       159   53         121
e           3       157   52.33333   114.3333
economics   5       240   48         28
science     5       309   61.8       34.2
history     5       220   44         54.5
ANOVA
Source of Variation   SS        df   MS         F          P-value    F crit
Rows                  60.9333   4    15.23333   0.300263   0.869889   3.837853
Columns               872.133   2    436.0667   8.595269   0.010172   4.45897
Error                 405.867   8    50.73333
Total                 1338.93   14
Decision Rule:
f stat > f critical, reject Ho
Inference:
Row Wise
Since f Stat=0.300 is less than f Critical=3.83, accept Null Hypothesis.
Column Wise
Since f Stat=8.59 is greater than f Critical=4.45, reject Null Hypothesis.
Since P= 0.01 is less than α=0.05, reject Null hypothesis.
Conclusion:
Therefore,
the mean marks do not differ significantly between students (row wise), but the mean marks of
at least one of the subjects- Economics, Science and History- are different (column wise).
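The row and column F values can be re-derived from the 5 x 3 table of marks. The sketch below, in Python, splits the total variation into a row part (students), a column part (subjects) and an error part; the critical values are copied from the ANOVA table rather than computed.

```python
# Rows are students a to e; columns are economics, science, history.
marks = {
    "a": [42, 69, 35],
    "b": [53, 54, 40],
    "c": [49, 58, 53],
    "d": [53, 64, 42],
    "e": [43, 64, 50],
}

rows = list(marks.values())
r, c = len(rows), len(rows[0])
grand_mean = sum(sum(row) for row in rows) / (r * c)

row_means = [sum(row) / c for row in rows]
col_means = [sum(col) / r for col in zip(*rows)]

ss_rows = c * sum((m - grand_mean) ** 2 for m in row_means)
ss_cols = r * sum((m - grand_mean) ** 2 for m in col_means)
ss_total = sum((v - grand_mean) ** 2 for row in rows for v in row)
ss_error = ss_total - ss_rows - ss_cols  # what rows and columns do not explain

df_rows, df_cols = r - 1, c - 1
df_error = df_rows * df_cols
ms_error = ss_error / df_error

f_rows = (ss_rows / df_rows) / ms_error
f_cols = (ss_cols / df_cols) / ms_error

# Critical values taken from the ANOVA table.
reject_rows = f_rows > 3.837853
reject_cols = f_cols > 4.45897
print(round(f_rows, 5), reject_rows, round(f_cols, 5), reject_cols)
```

With one observation per student and subject there is no separate within-cell variation, so the error term is simply what remains after the row and column effects are removed.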
ANOVA – Two factor With Replication

Research Problem: To test whether or not marks of students differ with respect to school,
subject wise and school wise in conjunction with the subjects.
           economics   science   history
SCHOOL A   42          69        35
           53          54        40
           49          58        53
           53          64        42
           43          64        50
SCHOOL B   44          55        39
           45          56        55
           52          0         39
           54          0         40
           0           0         0
Hypothesis Testing:
H0: No significant difference between the mean marks of School A and School B (row wise)
H0: No significant difference between the mean marks of economics, science and history (column wise)
H0: No significant difference between School A and School B subject-wise (interactions)

Result:
Anova: Two-Factor With Replication
SUMMARY    economics   science    history    Total
SCHOOL A
Count      5           5          5          15
Sum        240         309        220        769
Average    48          61.8       44         51.26667
Variance   28          34.2       54.5       95.6381
SCHOOL B
Count      5           5          5          15
Sum        195         111        173        479
Average    39          22.2       34.6       31.93333
Variance   494         924.2      420.3      579.4952
Total
Count      10          10         10
Sum        435         420        393
Average    43.5        42         39.3
Variance   254.5       861.5556   235.5667
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Sample                2803.333   1    2803.333   8.6027     0.007272   4.259677
Columns               90.6       2    45.3       0.139014   0.870912   3.402826
Interaction           1540.467   2    770.2333   2.363646   0.115611   3.402826
Within                7820.8     24   325.8667
Total                 12255.2    29
Decision Rule:
f stat > f critical, reject H0
Inference:
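Sample (school wise): Since F = 8.603 is greater than F crit = 4.260 and p value = 0.007 is less than α=0.05, reject H0
Columns (subject wise): Since F = 0.139 is less than F crit = 3.403 and p value = 0.871 is greater than α=0.05, accept H0
Interaction: Since F = 2.364 is less than F crit = 3.403 and p value = 0.116 is greater than α=0.05, accept H0
Conclusion:
Therefore, marks of students differ significantly with respect to school, but not subject wise and not
school wise in conjunction with the subjects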
F-Test Two-Sample for Variances

Research Problem: To determine whether variance of Class 1 is greater than variance of
class 2 in mathematics.
Class1 Class2
65 76
76 54
65 67
76 65
56 76
45 66
Hypothesis Testing:
H 0: Variance group 1 = Variance group 2
H 1: Variance group 1> Variance group 2

Result:
Class1 Class2
Mean 63.83333333 67.33333333
Variance 142.9666667 67.06666667
Observations 6 6
df 5 5
F 2.131709742
P(F<=f) one-tail 0.212888468

F Critical one-tail 5.050329058
Decision Rule:
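 If F is greater than F Critical, reject Null Hypothesis.
 If F is less than or equal to F Critical, accept Null Hypothesis.
Inference:
Since F = 2.132 is less than F Critical one-tail = 5.050, accept the Null Hypothesis.
Since P = 0.213 is greater than α=0.05, accept the Null Hypothesis.
Conclusion:
Therefore, there is not enough evidence that the variance of Class 1 is greater than the variance
of class 2 in mathematics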
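The F statistic in the result table is simply the ratio of the two sample variances. The following is a minimal sketch in Python that recomputes it from the raw marks and compares it with the one-tail critical value copied from the table (the critical value itself is not computed here).

```python
import statistics

class1 = [65, 76, 65, 76, 56, 45]
class2 = [76, 54, 67, 65, 76, 66]

var1 = statistics.variance(class1)  # sample variance, n - 1 in the denominator
var2 = statistics.variance(class2)

# The variance named first in the alternate hypothesis goes in the numerator.
f_stat = var1 / var2
df1, df2 = len(class1) - 1, len(class2) - 1

f_critical = 5.050329058  # one-tail value taken from the result table
reject_h0 = f_stat > f_critical
print(round(var1, 4), round(var2, 4), round(f_stat, 6), reject_h0)
```

With these data F is about 2.13, which is below the critical value of about 5.05, so the comparison does not lead to rejecting the null hypothesis of equal variances.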

Chi Square test
Research Problem: To determine whether brand preference is independent of age group
Age/Brand      Brand1   Brand2   Brand3   Row total
15-25          65       76       72       213
26-35          60       40       64       164
36-45          45       52       50       147
46-55          55       65       60       180
Column Total   225      233      246      704
Hypothesis Testing:
H 0: There is no association between brand preference and age group
H 1: There is association between brand preference and age group

Result:
Observed Expected O-E (O-E)^2/E
65 68.07528409 -3.075284091 0.138925197
60 52.41477273 7.585227273 1.097699557
45 46.98153409 -1.981534091 0.083574907
55 57.52840909 -2.528409091 0.11112514
76 70.49573864 5.504261364 0.429769143
40 54.27840909 -14.27840909 3.756060091
52 48.65198864 3.348011364 0.230395106
65 59.57386364 5.426136364 0.494226059
72 74.42897727 -2.428977273 0.079269269
64 57.30681818 6.693181818 0.781733907
50 51.36647727 -1.366477273 0.036351727
60 62.89772727 -2.897727273 0.13349963
Chi Square Value 7.373
Decision Rule:
If calculated value > critical value, we reject the null hypothesis
Inference:
Since the calculated value (7.373) is less than the critical value (12.592 for df = 6 at α=0.05),
accept null hypothesis
Conclusion:

Therefore, brand preference is independent of age group
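The expected frequencies and the chi square value can be recomputed from the observed table. This is a minimal sketch in Python; each expected count is (row total x column total) / grand total, and the critical value 12.592 is the standard chi square table value for 6 degrees of freedom at the 5% level, entered by hand rather than computed.

```python
# Observed counts: rows are age groups, columns are Brand1, Brand2, Brand3.
observed = [
    [65, 76, 72],  # 15-25
    [60, 40, 64],  # 26-35
    [45, 52, 50],  # 36-45
    [55, 65, 60],  # 46-55
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected count
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)

critical_value = 12.592  # chi square table value for df = 6, alpha = 0.05
reject_h0 = chi_square > critical_value
print(round(chi_square, 3), df, reject_h0)
```

The statistic stays well below the critical value, so the data give no reason to reject the hypothesis that brand preference and age group are unrelated.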
INTRODUCTION TO R
1. How to Install RStudio:
To Install R
1. Open an internet browser and go to www.r-project.org.
2. Click the "download R" link in the middle of the page under "Getting Started."
3. Select a CRAN location (a mirror site) and click the corresponding link.
4. Click on the "Download R for (Mac) OS X" link at the top of the page.
5. Click on the file containing the latest version of R under "Files."
6. Save the .pkg file, double-click it to open, and follow the installation
instructions.
7. Now that R is installed, you need to download and install RStudio.
To Install RStudio
1. Go to www.rstudio.com and click on the "Download RStudio" button.
2. Click on "Download RStudio Desktop."
3. Click on the version recommended for your system, or the latest Mac version,
save the .dmg file on your computer, double-click it to open, and then drag and drop it to
your applications folder.
2. Four panes in RStudio
RStudio has four main panes, each in a quadrant of your screen: Source
Editor, Console, Workspace Browser (and History), and Plots (and Files, Packages, Help).
These can also be adjusted under the ‘Preferences’ menu. Note that there might be subtle
differences between RStudio installations on different operating systems.
 TOP RIGHT:
Environment and history window. The environment window contains objects (data, values,
functions) R has currently stored in its memory. The history window shows all commands
that were executed in the Console.

 TOP LEFT:
Text editor or script window. This is where you can save and edit collections of commands.
 BOTTOM RIGHT:
Files, plots, packages, help, and viewer pane. Here you can open files, view plots, install and
load packages, read man pages, and view markdown and other documents in the viewer tab.

 BOTTOM LEFT:
Console or command window. Here you can type any valid R command after the prompt
followed by Enter and R will execute that command.
3. Import Data Sheet from Excel:

Importing data into R is a necessary step that, at times, can become time intensive. To
ease this task, RStudio includes features to import data from csv, xls, xlsx, sav, dta,
por, SAS, SPSS and Stata files.

Steps:
 Go to File > Import Dataset > From Excel.
 Click Browse and select the file to be imported.

 Select the sheet to be imported from Default.
 Click OK after selecting the sheet to be imported.
4. Correlation:
Correlation is a statistical technique that can show whether and how strongly pairs of
variables are related.

 Data Set
Engine capacity weight in kg
250 150
500 200
650 269
1550 3500
2000 4000
3000 5000
4000 6000
 Import the data sheet into RStudio.
 Attach the data sheet using the attach() command.
 Run the correlation test using the cor.test() command.
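For readers who want to cross-check the correlation by hand, the calculation can be sketched as follows. This is written in Python rather than R and is not the output of cor.test(); it computes the Pearson correlation coefficient for the data set above, together with the t statistic (n - 2 degrees of freedom) that a Pearson correlation test is based on. The function name `pearson_r` is hypothetical.

```python
import math

engine_capacity = [250, 500, 650, 1550, 2000, 3000, 4000]
weight_kg = [150, 200, 269, 3500, 4000, 5000, 6000]


def pearson_r(x, y):
    # Pearson correlation: covariance divided by the product of the spreads.
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(engine_capacity, weight_kg)

# Test statistic for H0: true correlation is zero.
n = len(engine_capacity)
df = n - 2
t_stat = r * math.sqrt(df / (1 - r ** 2))

print(round(r, 4), df, round(t_stat, 4))
```

A coefficient close to +1 indicates that weight rises almost in step with engine capacity in this small data set.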
5. Independent Samples:
 Data Set
Score shift
3.1 Part time
3.4 Part time
4.6 Part time
2.8 Part time

2.3 Part time
1.5 Part time
3.8 Part time
9.5 Part time
4.3 Part time
2.7 Part time
1.6 Part time
1.6 Part time
3.2 Part time
4.2 Part time
3.9 Part time
1.2 Part time
3.2 Full time
1.5 Full time
6.5 Full time
0.2 Full time
3.7 Full time
3.3 Full time
1.7 Full time
3.6 Full time
3.8 Full time
5.3 Full time
6.9 Full time
3.6 Full time
1.7 Full time
1.2 Full time
7.2 Full time
3.9 Full time
1.9 Full time
5.3 Full time

 Import the file into RStudio.
 Attach the file and run the test using the t.test() command.
6. Paired Sample
 Data Set
Subject Before After
1 135 127
2 142 145
3 137 131

4 122 125
5 147 132
6 151 147
7 131 119
8 117 125
9 154 132
10 143 139
11 133 122
12 125 125
13 154 138
14 156 121
15 132 126
16 125 127
17 136 128
18 141 132
19 129 131
20 148 139
21 136 120
22 120 119
23 147 135
24 138 142
25 158 130
 Import the data sheet into RStudio.
 Attach the file using the attach() command.
 Run the test using the t.test() command with paired = TRUE.
7. One-way ANOVA

 Data Set
economics medicine history
42 69 35
53 54 40
49 58 53

53 64 42
43 64 50
44 55 39
45 56 55
52 39
54 40
 Import the data sheet into RStudio.
 Attach the data sheet using the attach() command.
 Use the following commands to run the test.

8. F test
 Data Set
Class1 Class2
65 76
76 54

65 67
76 65
56 76
45 66
 Attach the data sheet using the attach() command.
 Run the test using the var.test() command.
9. Chi Square test

 Data Set
Hair eyes height sex
Brown Blue 63 M
Brown Brown 62 F
Brown Brown 60 F
Brown Brown 69 M
Brown Brown 71 F
blonde Blue 75 F
black green 65 M
black blue 55 F
black green 68 F
black green 69 M
Brown green 59 M
blonde green 72 M

 Attach the data sheet using the attach() command.
 Use the following commands to run the chi square test.

