RM Notes

RESEARCH METHODOLOGY
Prepared by Sanjib Kumar Patra, PhD, Assistant Professor, Division of Yoga and Life Sciences, SVYASA University, PSK, Bangalore
What is research?
According to the Oxford Dictionary, research is a course of critical study. According to Mosby's Medical and Nursing Dictionary, research is a diligent inquiry or examination of data, reports and observations in search of facts and principles. Hence, research is all about making observations, proposing a hypothesis to explain them, testing the hypothesis by experiment and reaching a conclusion.
How to become a good researcher?

To become a good researcher, one should have the following attributes: curiosity, competence, determination, dedication, honesty, integrity, common sense and a sense of humour.
Research process
1. Asking a question (the idea)
2. Literature review
3. Refining the idea
4. Preparing a guide plan or design for action
5. Deciding what to measure and how to measure it
6. Actual measuring or data taking
7. Organizing the data in the form of tables and graphs
8. Data analysis
9. Writing up the results and constructing the underlying mechanisms for the results
10. Writing the thesis or submitting to a journal
What are the attributes of a good research question?

A good research question is useful to society; answerable and practically feasible; acceptable (suitable) to the IEC as well as to society; ethical to the subjects; exciting; unique; and specific (a clear-cut idea or specific question).
Hypothesis
An idea or suggestion put forward for reasoning or explanation is known as a hypothesis. Forming a hypothesis means setting out a logical assumption in answer to a specific question. Hence, a research hypothesis is the statement researchers create when they speculate upon the outcome of a study or experiment.
Null hypothesis
The null hypothesis states that there is no association between the intervention and the outcome variable. It is an essential part of any research design and is always tested, even if only indirectly. An observation or assumption based on underlying facts or previous experience, when put in neutral (negative) or rational terms, is called a null hypothesis.
Literature search
Manual searches: Cumulated Index Medicus. Key words: year, author, name of the journal or title of the research.
Computerized searches: search engines and databases such as Medline, PsycLIT, PsycINFO, Current Contents, Embase, Excerpta Medica, etc. Key words: year, author, name of the journal or title of the research, intervention.
Research is often divided into

Research: questioning, setting hypotheses and checking them.
Audit: counting how many cases fit a set criterion. For example, make three separate records of BP readings (>100 mmHg, <100 mmHg and equal to 100 mmHg) and note how many fit the set criterion (here, >100 mmHg is taken as hypertension).
Monitoring: a form of counting, for example recording the heights of children of a specific income group.
How to frame a good research question?

Follow these steps while exploring and refining your question:
1. Begin by asking: is the answer already known? Has someone else worked on it?
2. Consult the experts.
3. Use textbooks to get an idea of the well-established facts.
4. Do a literature review: read review articles in the same area of research, search manually in the library and use Internet search engines.
How would you design a study?

A study is always designed by taking the following into consideration:
Review of literature: helps you to refine your idea.
Ethics of research: establishes certain norms before the research is carried out.
Design of the study: explains why and how the trial will take place.
Ethics of research
Research ethics is mainly observed in three vital areas: the researcher's dealings with subjects, laboratory ethics and authorship ethics.
Ethical committee
Two bodies review the research and the ethics followed in it, and their concerns must be addressed before the research begins:
Institutional Review Board (IRB): reviews the research carried out at the institution.
Institutional Ethical Committee (IEC): approves research projects that involve subjects, taking the ethics into consideration.
Typically an ethical committee comprises the following experts: a scientist, a clinician, a nurse, a lawyer, and a sociologist or any other person.
Nuremberg code:
During the Second World War (1939-1945), many human experiments at the concentration camps physically harmed the people subjected to them. This drew the attention of the world scientific community, and a code of ethics for experimentation involving human subjects was set at Nuremberg, Germany. This set of norms is called the Nuremberg Code and includes the following:
The voluntary consent of the human subject is absolutely essential.
The results of the experiment should be good for society.
The experiment should avoid unnecessary physical or mental suffering.
No experiment should be done when there is reason to fear injury.
The subject should be at liberty to end the experiment at his or her own choice.

Declaration of Helsinki:
The ethics of research were subsequently revised at World Medical Association meetings, from Helsinki (1964) to Hong Kong (1989), with a revision in 1975. The Declaration of Helsinki gives recommendations guiding physicians in biomedical research involving human subjects. The purpose of biomedical research involving human subjects must be to improve diagnostic tools, therapeutic interventions and prophylactic measures, and the understanding of the etiology and pathogenesis of diseases.

Role of the safety regulation board of the IEC:
The role of the Safety Regulation Committee (researcher with subjects) is vital whenever any of the following is at issue:
Electrical facilities should not exceed 120 V.
The laboratory needs to be kept free from light, noise and vibration.
There is a risk to the subjects from heavy apparatus.
There is a risk from the use of radioactive substances.

Laboratory ethics:
Data recorded during the research should be kept in the custody of the first author for about five years.
Data need to be stored in both hard and soft copies to guard against technical difficulties.
Publication Ethics
Authorship of a manuscript is given on the basis of contribution. The researcher who is involved in all aspects of the work and contributes the most becomes the first author, and the person who gives the maximum intellectual input to the manuscript becomes the corresponding author.
Authorship signifies a definite contribution to:
The conceptualization
The design
The execution and/or interpretation of the results
It also signifies a willingness to take responsibility for the study and to defend it, if needed.
Authorship is not given for:
Advice
Reagents or other materials
Equipment or other support
Patients or subjects (recruitment)
Acknowledge the people who contributed in these ways during the research.
Sampling
Sampling is the process of selecting a number of subjects from all the subjects in a particular group or "population".
Types of sampling: random, systematic, stratified and quota.

Random sampling
A sampling method in which all members of a group (population or universe) have an equal and independent chance of being selected. This kind of sampling is generally done either by the bingo method or with a computer-generated random number table.
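As an illustration of the computer-generated alternative to the bingo method, the following is a minimal sketch in Python. The subject IDs 1 to 100, the sample size of 10 and the function name simple_random_sample are assumptions made only for this example.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size members without replacement; every member is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(list(population), sample_size)

# Hypothetical sampling frame: a complete list of subject IDs 1..100
subjects = range(1, 101)
chosen = simple_random_sample(subjects, 10, seed=42)
print(sorted(chosen))
```

Note that the sketch needs the complete list of subjects up front, which is exactly the practical limitation of simple random sampling.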
Advantages and disadvantages of random sampling
One of the best things about simple random sampling is the ease of assembling the sample. It is also considered a fair way of selecting a sample from a given population, since every member has an equal opportunity of being selected. The most obvious limitation of simple random sampling is its need for a complete and up-to-date list of all the members of the population. Such a list is usually not available for large populations, and in such cases it is wiser to use other sampling techniques.

Systematic sampling
In systematic random sampling, the researcher first randomly picks the first item or subject from the population and then selects every n'th subject from the list.

Advantages and disadvantages of systematic sampling
The main advantage of systematic sampling over simple random sampling is its simplicity. It allows the researcher to add a degree of system or process to the random selection of subjects. Another advantage is the assurance that the population will be evenly sampled: simple random sampling can, by chance, produce a clustered selection of subjects, and systematic sampling eliminates this. The main disadvantage is that the process of selection can interact with a hidden periodic trait within the population. If the sampling interval coincides with the periodicity of the trait, the sampling will no longer be random and the representativeness of the sample is compromised.

Stratified sampling
Stratified random sampling refers to a sampling method that has the following properties:
The population consists of N elements.
The population is divided into H groups, called strata.
Each element of the population can be assigned to one, and only one, stratum.
The number of observations within each stratum, N_h, is known, and N = N_1 + N_2 + N_3 + ... + N_(H-1) + N_H.

Advantages and disadvantages of stratified sampling
Stratified sampling offers several advantages over simple random sampling:
A stratified sample can provide greater precision than a simple random sample of the same size.
Because it provides greater precision, a stratified sample often requires a smaller sample, which saves money.
A stratified sample can guard against an "unrepresentative" sample (e.g., an all-male sample from a mixed-gender population).
We can ensure that we obtain sufficient sample points to support a separate analysis of any subgroup.
The main disadvantage of a stratified sample is that it may require more administrative effort than a simple random sample.

Quota sampling
Select individuals as they come to fill a quota by characteristics proportional to populations.

Advantages and disadvantages of quota sampling
Advantage: ensures selection of adequate numbers of subjects with appropriate characteristics.
Disadvantage: it is not possible to prove that the sample is representative of the designated population.

Sampling techniques: Advantages and disadvantages
Technique: Simple random
Description: Random sample from whole population.
Advantages: Highly representative if all subjects participate; the ideal.
Disadvantages: Not possible without complete list of population members; potentially uneconomical to achieve; can be disruptive to isolate members from a group; time-scale may be too long, data/sample could change.

Technique: Stratified random
Description: Random sample from identifiable groups (strata), subgroups, etc.
Advantages: Can ensure that specific groups are represented, even proportionally, in the sample(s) (e.g., by gender), by selecting individuals from strata list.
Disadvantages: More complex, requires greater effort than simple random; strata must be carefully defined.

Technique: Cluster
Description: Random samples of successive clusters of subjects (e.g., by institution) until small groups are chosen as units.
Advantages: Possible to select randomly when no single list of population members exists, but local lists do; data collected on groups may avoid introduction of confounding by isolating members.
Disadvantages: Clusters in a level must be equivalent and some natural ones are not for essential characteristics (e.g., geographic: numbers equal, but unemployment rates differ).

Technique: Stage
Description: Combination of cluster (randomly selecting clusters) and random or stratified random sampling of individuals.
Advantages: Can make up probability sample by random at stages and within groups; possible to select random sample when population lists are very localized.
Disadvantages: Complex, combines limitations of cluster and stratified random sampling.

Technique: Purposive
Description: Hand-pick subjects on the basis of specific characteristics.
Advantages: Ensures balance of group sizes when multiple groups are to be selected.
Disadvantages: Samples are not easily defensible as being representative of populations due to potential subjectivity of researcher.

Technique: Quota
Description: Select individuals as they come to fill a quota by characteristics proportional to populations.
Advantages: Ensures selection of adequate numbers of subjects with appropriate characteristics.
Disadvantages: Not possible to prove that the sample is representative of designated population.

Technique: Snowball
Description: Subjects with desired traits or characteristics give names of further appropriate subjects.
Advantages: Possible to include members of groups where no lists or identifiable clusters even exist (e.g., drug abusers, criminals).
Disadvantages: No way of knowing whether the sample is representative of the population.

Technique: Volunteer, accidental, convenience
Description: Either asking for volunteers, or the consequence of not all those selected finally participating, or a set of subjects who just happen to be available.
Advantages: Inexpensive way of ensuring sufficient numbers of a study.
Disadvantages: Can be highly unrepresentative.
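The systematic and stratified techniques described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sampling frame of 100 subject IDs, the two strata ("female", "male"), the 10% sampling fraction and the function names are all hypothetical, and proportional allocation is just one possible way of dividing a stratified sample.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Random start within the first interval, then every k-th subject.

    Assumes sample_size is no larger than len(frame).
    """
    rng = random.Random(seed)
    k = len(frame) // sample_size      # sampling interval
    start = rng.randrange(k)           # random start in [0, k)
    return [frame[start + i * k] for i in range(sample_size)]

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        n_h = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical sampling frame of 100 subject IDs
frame = list(range(1, 101))
sys_sample = systematic_sample(frame, 10, seed=1)

# Hypothetical strata: 60 female and 40 male subjects
strata = {"female": list(range(1, 61)), "male": list(range(61, 101))}
strat_sample = stratified_sample(strata, 0.1, seed=1)

print(sys_sample)
print({name: len(ids) for name, ids in strat_sample.items()})
```

If the list happened to be ordered with a period equal to the sampling interval, the systematic sample would pick up only one phase of that pattern, which is the periodicity risk noted above.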
Variables and its types

In research, mainly three types of variables are considered:
1. Independent variable
2. Dependent variable
3. Confounding variable
An independent variable, sometimes called an experimental or predictor variable, is a variable that is manipulated in an experiment in order to observe the effect on a dependent variable, sometimes called an outcome variable. A confounding variable, also known as a third variable (and sometimes loosely called a mediator variable), can distort the apparent relation between the independent and dependent variables.
Example: A soccer coach wanted to improve the team's playing ability, so he had them run two miles a day. At the same time the players decided to take vitamins. In two weeks the team was playing noticeably better, but the coach and players did not know whether it was from the running or the vitamins.
Research Designs
The design is the structure of any scientific work. It gives direction and systematizes the research. The method you choose will affect your results and how you conclude the findings. Most scientists are interested in getting reliable observations that can help the understanding of a phenomenon.
There are two main approaches to a research problem: quantitative research and qualitative research.
In a nutshell, quantitative research generates numerical data or information that can be converted into numbers. Only measurable data are gathered and analyzed in this type of research. Qualitative research, on the other hand, generates non-numerical data. It focuses on gathering mainly verbal data rather than measurements. The gathered information is then analyzed in an interpretative manner, which may be subjective, impressionistic or even diagnostic.
There are five important designs that we follow in biomedical research:
1. Controlled studies
2. Cross-sectional studies
3. Longitudinal studies
4. Time series design
5. Correlational studies
Controlled studies:
In controlled studies, a control intervention is mandatory.

Randomized controlled trials (RCT)
Randomized controlled trials are one of the most efficient ways of reducing the influence of external variables. In any research program, especially those using human subjects, these external factors can skew the results wildly, and attempts by researchers to isolate and neutralize their influence can be counter-productive and magnify them. Any experiment that relies upon selecting subjects and placing them into groups is at risk if the researcher is biased or simply incorrect. The researcher may fail to take into account all of the potential confounding variables, causing severe validity issues.

Advantages of RCT
Random allocation balances these extraneous variables across the groups, on average, without the researcher having to isolate them or even be aware of them. Randomized designs also guard against accusations of conscious or subconscious bias on the part of the researcher and greatly strengthen the internal validity of the study. As an example, imagine that a school seeks to test whether introducing a healthy meal at lunchtime improves the overall fitness of the children. It decides to do this by giving a randomly chosen half of the children healthy salads and wholesome meals, whilst the control group carries on as before. At regular intervals, the researchers note the cardiovascular fitness of the children, looking to see if it improves.

Disadvantages of RCT
Ideally, randomized controlled trials would be used for most experiments, but there are some disadvantages. Firstly, researchers often do not have the resources, or time, to test larger groups, so they have to try to find a sample that is representative of the population as a whole. This selective sampling can make it very difficult to generalize the results to the population as a whole. Secondly, randomized experiment designs, especially when combined with crossover studies, are extremely powerful for understanding underlying trends and causalities, but they are a poor choice for research where temporal factors are an issue, for which a repeated measures design is better. Whilst randomized controlled trials are regarded as the most accurate experimental design in the social sciences, education, medicine and psychology, they can be extremely resource heavy, requiring very large sample groups, so they are used less often than would be ideal. Instead, researchers sacrifice generalization for convenience, leaving large-scale randomized controlled trials to researchers with bigger budgets and research departments.

Matched/paired control
Subjects are matched according to their age, gender, occupation and the nature and severity of the ailment. Matched subjects designs are often used in education, giving researchers a useful way to compare treatments without having to use huge randomized groups.
In a matched subjects design, researchers attempt to emulate some of the strengths of both within subjects and between subjects designs. A matched subjects design uses separate experimental groups for each treatment, but relies upon matching every subject in one group with an equivalent subject in another. The idea is that matching cancels out an influential variable and so reduces the chance of it skewing the results.

Advantages of matched/paired control
The overall goal of a matched subjects design is to emulate the conditions of a within subjects design, whilst avoiding the temporal effects that can influence results. A within subjects design tests the same people, whereas a matched subjects design comes as close as possible to that and even uses the same statistical methods to analyze the results. This reduces the possibility of differences between individuals affecting the results. The matched subjects design also uses the strength of the between subjects design, in that every subject is tested only once, eliminating the possibility of temporal factors, known as order effects, affecting the results.

Disadvantages of matched/paired control
Whilst the design is an excellent compromise between reducing order effects and smoothing out variation between individuals, it is certainly not perfect. Even with careful matching of the pairs, there will always be some variation. In a study of cardiovascular fitness, for example, so many factors influence the outcome that the researchers can only hope to match the most influential variables, which is an approximation. In addition, the researcher might be incorrect in their assumptions about which variables are the most important and miss a major confounding variable. Even a single matching variable may have been measured incorrectly; in a reading comprehension study, for example, one of the children may have had a really bad day, been ill or suffered from nerves, giving her a much lower score than her true reading comprehension would indicate. Despite these disadvantages, matched subjects designs are useful, allowing researchers to perform streamlined and focused research programs whilst maintaining a good degree of validity.

Subject (self as control)/within subjects design
In a within subjects design, unlike a between subjects design, every single participant is subjected to every single treatment, including the control.
This gives as many data sets as there are conditions for each participant. The fact that subjects act as their own control reduces the amount of error arising from natural variance between individuals. These tests are common in many research disciplines. An education researcher, for example, might want to study the effect of a new program on children and test them before and after the new method has been applied.

Advantages of (self as control)/within subjects design
The main advantage that the within subjects design has over the between subjects design is that it requires fewer participants, making the process much more streamlined and less resource heavy. For example, if you want to test four conditions, using four groups of 30 participants is unwieldy and expensive. Using one group, which is tested under all four conditions, is much easier. Ease is not the only advantage: a well-planned within subjects design allows researchers to monitor the effect upon individuals much more easily and lowers the possibility of individual differences skewing the results.

Disadvantages of (self as control)/within subjects design
One disadvantage of this research design is the problem of carryover effects, where the first test influences the later ones. Two examples of this, with opposite effects, are fatigue and practice. In a long experiment with multiple conditions, the participants may become tired and thoroughly fed up with researchers prying, asking questions and pressuring them into taking tests. This could decrease their performance on the last condition. Alternatively, the practice effect might mean that they perform better after the first condition, simply because the experience has made them more confident and accomplished at taking tests. As a result, for many experiments, a counterbalanced design, where the order of treatments is varied, is preferred, but this is not always possible.
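The counterbalancing idea mentioned above, in which the order of treatments is varied across participants, could be sketched as follows. This is a hedged illustration only: the two conditions "A" and "B", the eight participant IDs and the function name counterbalanced_orders are assumptions for the example, and listing every permutation is practical only for a small number of conditions.

```python
import itertools
import random

def counterbalanced_orders(conditions, participants, seed=None):
    """Give each participant one treatment order, cycling through all possible orders."""
    rng = random.Random(seed)
    orders = list(itertools.permutations(conditions))  # every possible order
    shuffled = list(participants)
    rng.shuffle(shuffled)                              # who gets which order is random
    # Cycling through the orders keeps their usage as equal as possible
    return {p: orders[i % len(orders)] for i, p in enumerate(shuffled)}

# Hypothetical example: two conditions, eight participants
schedule = counterbalanced_orders(["A", "B"], range(1, 9), seed=3)
for participant in sorted(schedule):
    print(participant, "-".join(schedule[participant]))
```

With two conditions, half of the participants receive the order A-B and the other half B-A, so fatigue and practice effects are spread evenly over both conditions rather than removed.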
Cross-over design
A research design in which the subjects get both treatments in sequence. Contrast this with a parallel groups design, where some subjects get the first treatment and different subjects get the second treatment. The crossover design represents a special situation where there is no separate comparison group. In effect, each subject serves as his or her own control. Also, since the same subject receives both treatments, there is no possibility of covariate imbalance. Ideally, in a crossover design, a subject is randomly assigned to a specific treatment order. Some subjects will receive the standard therapy first, followed by the new therapy (AB). Others will receive the new therapy first, followed by the standard therapy (BA).

Advantages:
A crossover study has two advantages over a non-crossover longitudinal study. First, the influence of confounding covariates is reduced because each crossover patient serves as his or her own control. In a non-crossover study, even a randomized one, it is often the case that different treatment groups are found to be unbalanced on some covariates. In a controlled, randomized crossover design, such imbalances are implausible (unless covariates were to change systematically during the study). Second, optimal crossover designs are statistically efficient and so require fewer subjects than do non-crossover designs (even other repeated measures designs). Optimal crossover designs are discussed in the graduate textbook by Jones and Kenward and in the review article by Stufken. Crossover designs are discussed along with more general repeated-measurements designs in the graduate textbook by Vonesh and Chinchilli.

Disadvantages:
These studies are often done to improve the symptoms of patients with chronic conditions; for curative treatments or rapidly changing conditions, crossover trials may be infeasible or unethical. Crossover studies often have two problems. First is the issue of "order" effects, because the order in which treatments are administered may affect the outcome. An example might be a drug with many adverse effects given first, making patients who then take a second, less harmful medicine more sensitive to any adverse effect. Second is the issue of "carry-over" between treatments, which confounds the estimates of the treatment effects. In practice, "carry-over" effects can be avoided with a sufficiently long "wash-out" period between treatments. However, planning a sufficiently long wash-out period requires expert knowledge of the dynamics of the treatment, which is often unknown. There might also be a "learning" effect. This is important where the controls are meant to be naive to the intended therapy. In such a case you cannot make a group (typically the group which learned the skill first) unlearn a skill such as yoga and then act as a control in the second phase of the study.
How to improve a control group?

A control intervention is improved by giving emphasis to the following:
1. Placebo: given to please the subjects and to make them feel that the experimental and control interventions are identical.
2. Randomization: done by using either the bingo method or a computer-generated random number table.
3. Blinding: this refers to being unaware of which intervention is being given.
4. Masking: this refers to temporary covering, generally done either during selection of subjects or (more commonly) during organization of the acquired data.
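The randomization step in the list above can be done by computer instead of the bingo method. The following is a minimal sketch, assuming 20 hypothetical subject IDs and simple (unrestricted) random allocation into two arms; the function name randomize_to_groups is made up for this example, and real trials often use more elaborate schemes.

```python
import random

def randomize_to_groups(subject_ids, seed=None):
    """Shuffle the subjects and split them into experimental and control arms.

    With an odd number of subjects the control arm receives the extra one.
    """
    rng = random.Random(seed)  # record the seed so the allocation can be audited
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"experimental": shuffled[:half], "control": shuffled[half:]}

# Hypothetical list of 20 enrolled subjects
groups = randomize_to_groups(range(1, 21), seed=7)
print(sorted(groups["experimental"]))
print(sorted(groups["control"]))
```

Because allocation is decided by the shuffle and not by the researcher, known and unknown characteristics of the subjects tend to be spread evenly over the two groups.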
Cross-sectional study design

The study of groups of individuals differing on the basis of specified criteria (for example, age, gender, occupation, socioeconomic status and severity of the disease) at the same point in time. Cross-sectional studies are mainly used in survey-type research to gather information on a population at a single point in time. An example of a cross-sectional survey would be a questionnaire that collects data on how parents feel about Internet filtering as of March 1999. A different cross-sectional survey questionnaire might try to determine the relationship between two factors, such as the religiousness of parents and their views on Internet filtering.

Advantages and disadvantages of cross-sectional design
If the sample is large enough, it may represent the population from which it is drawn. It is usually held that a cross-sectional sample should be at least ten percent of the target population. If the sample is too small, it may not represent the population from which it is drawn. The individuals in the sample need to be characteristically similar to the general population you are interested in. So if you were studying the short-term memory capacity of eighteen-year-olds, your sample should contain eighteen-year-olds of average ability, not ones with exceptional memory, for the results to be generalizable. If people in the sample have exceptional memory ability, the results could be confounded, as they do not represent the general population. If the sample comprises people from different areas, they may have different characteristics that confound the results. On the other hand, finding a similar characteristic in different areas of the country can show how widespread the phenomenon is.
Longitudinal studies
A basic type of research method in which subjects are tested one or more times after initial testing. Typically, subjects are assigned randomly to an experimental group (e.g. a group that performs a specific type of training) and a control group after the initial testing. Both groups are then tested again, at the same times, one or more times during the period of the study. In this way, the effects of an experimental procedure can be measured over a period of time.
Advantages are:
High validity: events are recorded as they happen, so the study does not depend on people remembering their past, which they often cannot do accurately.
Long-term changes can be picked up.
Disadvantages are:
It takes a long period of time to gather results.
A large sample size and accurate sampling are needed to reach representativeness.
Participants may drop out; this is called subject attrition.

Time series/Cohort design
A cohort study is an analytical study in which individuals with differing exposures to a suspected factor are identified and then observed for the occurrence of certain health effects over some period, commonly years rather than weeks or months. The occurrence rates of the disease of interest are measured and related to estimated exposure levels. Cohort studies can be performed either prospectively or retrospectively from historical records.

Advantages and disadvantages of cohort and case control study designs

Suited for rare diseases
Cohort studies: No
Case control studies: Yes, since starting with a set of cases

Suited for rare exposures
Cohort studies: Yes, since starting with exposure status
Case control studies: No

Allows for studying several exposures
Cohort studies: Difficult, but examples exist (Framingham study)
Case control studies: Yes

Allows for studying several outcomes
Cohort studies: Yes
Case control studies: No

Disease status easy to ascertain
Cohort studies: Sometimes difficult
Case control studies: Easier, since it is the starting point of the study

Exposure status easier to ascertain
Cohort studies: Yes, since it is the starting point of the study, except for retrospective cohorts
Case control studies: Sometimes difficult. Information biases.

Allows computation of risk and rates
Cohort studies: Yes
Case control studies: No

Allows computation of effect
Cohort studies: Computation of risk ratio and rate ratio
Case control studies: Estimation of risk ratio and rate ratio from odds ratio

Allows studying natural history of disease
Cohort studies: Yes. Easier to show that cause precedes effect.
Case control studies: More difficult. Temporality between cause and effect difficult to establish.

Based on existing data sources
Cohort studies: Difficult
Case control studies: Yes, but access to information sometimes difficult

Easiness to find a reference group
Cohort studies: Usually not difficult to identify an unexposed population
Case control studies: No. Major potential biases when selecting a control group.

Sample size
Cohort studies: Large
Case control studies: Small

Cost
Cohort studies: Elevated, except if retrospective cohorts
Case control studies: Smaller

Time required
Cohort studies: Long, sometimes very long, except if retrospective cohorts
Case control studies: Shorter

Follow up
Cohort studies: Difficult, loss to follow up
Case control studies: No follow up

Logistics
Cohort studies: Heavy. Many staff, large data sets, long duration.
Case control studies: Easier

Concept
Cohort studies: Easy to understand
Case control studies: Difficult to understand, particularly if case cohort or density case control study

Ethical issues
Cohort studies: Major if studying risk factors. Interruption of study if exposure shown to be harmful. Need for intermediate analysis.
Case control studies: None, since outcome already happened.
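To make the "computation of effect" row concrete, the following sketch computes a risk ratio from cohort counts and an odds ratio from case control counts. All the counts are invented for illustration and the function names are assumptions, not part of any standard package.

```python
def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed divided by risk in the unexposed (cohort data)."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed

def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the disease
rr = risk_ratio(30, 100, 10, 100)

# Hypothetical case control study: 40 of 100 cases and 20 of 100 controls were exposed
or_estimate = odds_ratio(40, 60, 20, 80)

print(f"risk ratio = {rr:.2f}, odds ratio = {or_estimate:.2f}")
```

A case control study cannot give risks directly because the investigator fixes the number of cases and controls, which is why only the odds ratio is available there; it approximates the risk ratio when the disease is rare.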
Correlational study design:

A correlational study is a quantitative method of research in which you have 2 or more quantitative variables from the same group of subjects, & you are trying to determine if there is a relationship (or co-variation) between the 2 variables (a similarity between them, not a difference between their means). Theoretically, any 2 quantitative variables can be correlated (for example, midterm scores & number of body piercings!) as long as you have scores on these variables from the same participants; however, it is probably a waste of time to collect & analyze data when there is little reason to think these two variables would be related to each other.
Advantages and uses of correlational research
Useful for studying problems in education and other social sciences.
Enables researchers to analyze the relationships among a large number of variables.
Helps to identify possible cause-and-effect relationships in educational phenomena for further study, although correlation by itself does not establish causation.
Provides information concerning the degree of the relationship between the variables being studied.

Disadvantages of correlational studies
A large number of variables are often measured and subjected to correlation analysis even when the researcher has no theoretical basis or even common-sense rationale to justify their inclusion. As a result:
The participants are inconvenienced.
The study is expensive when a large number of measures are used.
Some measures are likely to be correlated by chance.
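The degree of relationship between two quantitative variables is commonly expressed as Pearson's correlation coefficient r. The sketch below computes it from first principles; the study hours and test scores are invented data and pearson_r is a name chosen for this example. Other coefficients (for example rank correlations) may suit other kinds of data better.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between paired scores from the same participants."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical paired data: hours of study and test score for five students
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship and a value near 0 a weak one; in either case it says nothing about which variable, if any, is the cause.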
How to organize the data?

Data are always organized in the form of tables and graphs. The mean and standard deviation are needed to summarize the data and to look for significant changes between and within the groups.
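As a small illustration, the mean and standard deviation of each set of readings can be obtained with Python's standard statistics module. The pre and post values below are invented, and the sample standard deviation (n - 1 in the denominator) is assumed to be the one wanted.

```python
import statistics

# Hypothetical readings from the same five subjects before and after an intervention
readings = {
    "pre": [120, 132, 128, 140, 125],
    "post": [115, 126, 121, 133, 120],
}

# Summarize each set as (mean, sample standard deviation)
summary = {
    name: (statistics.mean(values), statistics.stdev(values))
    for name, values in readings.items()
}

for name, (mean, sd) in summary.items():
    print(f"{name}: mean = {mean:.1f}, SD = {sd:.2f}")
```

The summary alone does not show whether a change is significant; that requires an appropriate statistical test on the same data.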
Graphs
Data recorded in experiments or surveys are displayed by a statistical graph. We will discuss eleven types of statistical graphs. The choice of graph is determined by the type and breadth of the data, the audience it is directed to, and the questions being asked. Each type of graph has its advantages and disadvantages, which are listed with each graph below.
Pictograph A pictograph uses an icon to represent a quantity of data values in order to decrease the size of the graph. A key must be used to explain the icon.
Advantages: Easy to read; visually appealing; handles large data sets easily using keyed icons
Disadvantages: Hard to quantify partial icons; icons must be of consistent size; best for only 2-6 categories; very simplistic
Line plot A line plot can be used as an initial record of discrete data values. The range determines a number line which is then plotted with X's for each data value.
Advantages: Quick analysis of data; shows range, minimum & maximum, gaps & clusters, and outliers easily; exact values retained
Disadvantages: Not as visually appealing; best for under 50 data values; needs small range of data
Pie chart A pie chart displays data as a percentage of the whole. Each pie section should have a label and percentage. A total data number should be included.
Advantages: Visually appealing; shows percent of total for each category
Disadvantages: No exact numerical data; hard to compare 2 data sets; "Other" category can be a problem; total unknown unless specified; best for 3 to 7 categories; use only with discrete data
Map chart A map chart displays data by shading sections of a map, and must include a key. A total data number should be included.
Advantages: Good visual appeal; overall trends show well
Disadvantages: Needs limited categories; no exact numerical values; color key can skew visual interpretation
Histogram A histogram displays continuous data in ordered columns. Categories are of continuous measure such as time, inches, temperature, etc.
Advantages: Visually strong; can compare to normal curve; usually vertical axis is a frequency count of items falling into each category
Disadvantages: Cannot read exact values because data is grouped into categories; more difficult to compare two data sets; use only with continuous data
Bar graph A bar graph displays discrete data in separate columns. A double bar graph can be used to compare two data sets. Categories are considered unordered and can be rearranged alphabetically, by size, etc.
Advantages: Visually strong; can easily compare two or three data sets
Disadvantages: Graph categories can be reordered to emphasize certain effects; use only with discrete data
Line graph A line graph plots continuous data as points and then joins them with a line. Multiple data sets can be graphed together, but a key must be used.
Advantages: Can compare multiple continuous data sets easily; interim data can be inferred from graph line
Disadvantages: Use only with continuous data
Frequency polygon A frequency polygon can be made from a line graph by shading in the area beneath the graph. It can be made from a histogram by joining midpoints of each column.
Advantages: Visually appealing
Disadvantages: Anchors at both ends may imply zero as data points; use only with continuous data
Scatterplot A scatterplot displays the relationship between two factors of the experiment. A trend line is used to determine positive, negative, or no correlation.
Advantages: Shows a trend in the data relationship; retains exact data values and sample size; shows minimum/maximum and outliers
Disadvantages: Hard to visualize results in large data sets; flat trend line gives inconclusive results; data on both axes should be continuous
Stem and leaf plot Stem and leaf plots record data values in rows, and can easily be made into a histogram. Large data sets can be accommodated by splitting stems.
Advantages: Concise representation of data; shows range, minimum & maximum, gaps & clusters, and outliers easily; can handle extremely large data sets
Disadvantages: Not visually appealing; does not easily indicate measures of centrality for large data sets
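The way a stem and leaf plot records values in rows can be sketched in a few lines. The helper `stem_and_leaf` below is a hypothetical name, and it assumes non-negative whole numbers with the tens digit as the stem and the units digit as the leaf. The data are the nine values used in the mean, median and mode example in these notes.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return stem-and-leaf rows for non-negative integers (stem = tens, leaf = units)."""
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    # Include empty stems so that gaps in the data stay visible
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in rows.get(stem, []))}".rstrip()
        for stem in range(min(rows), max(rows) + 1)
    ]

plot_rows = stem_and_leaf([13, 18, 13, 14, 13, 16, 14, 21, 13])
for row in plot_rows:
    print(row)
```

Each printed row keeps the exact values, which is why the range, clusters and gaps can be read directly from the plot.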
Box plot A box plot is a concise graph showing the five-point summary. Multiple box plots can be drawn side by side to compare more than one data set.
Advantages: Shows 5-point summary and outliers; easily compares two or more data sets; handles extremely large data sets easily
Disadvantages: Not as visually appealing as other graphs; exact values not retained
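The five-point summary behind a box plot (minimum, lower quartile, median, upper quartile, maximum) can be sketched as follows. The function name is hypothetical, and the quartile rule used here (the median of the lower and upper halves, leaving out the middle value when the count is odd) is only one of several conventions, so other software may report slightly different quartiles.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least 2 values")
    half = n // 2
    lower_half = data[:half]
    # Skip the middle value itself when the number of values is odd
    upper_half = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower_half), median(data), median(upper_half), data[-1])

summary = five_number_summary([13, 18, 13, 14, 13, 16, 14, 21, 13])
print(summary)
```

The summary discards the individual values, which is the "exact values not retained" disadvantage listed for box plots.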
Measuring tools:
The following are the measuring tools for qualitative data and analysis: 1. Questionnaire 2. Interview 3. Observation
Questionnaire: Information elicited from a questionnaire covers attitudes, behavior, beliefs and attributes.
Types of questionnaire: 1. OPEN ENDED 2. CLOSED: the nature of the expected answer is indicated (dichotomous, scaled, checklist, grid, ranking) 3. ORGANIZATIONAL
CLOSED ENDED QUESTIONS:
Dichotomous questionnaire: A dichotomous question usually ends with two options, YES/NO. Ex: 1. Has your ailment been treated (/managed) with yoga therapy? 2. Do you feel there is an improvement in your condition following yoga?
Scaled questionnaire: A scale is provided and the response is scaled. Ex: Do you take care to eat a healthful diet? Always / Sometimes / Never
Checklist questionnaire: Respondents are asked to tick [ ] one, or sometimes more than one, out of many choices. Ex: Why did you choose yoga therapy? [ ] No other option, nothing else worked [ ] Your doctor suggested it [ ] Read about it and felt curious [ ] You know the importance of mind-body medicine today
Ranking questionnaire: Subjects are asked to rank a list of things. Ex: Arrange the following practices based on the amount of relaxation you feel after them (Most / Midway / Least): Cyclic Meditation, Bhramari, OM meditation, Breathing exercises
OPEN ENDED QUESTIONS: Open questions are descriptive (a qualitative method of research). Ex: Describe how you feel ..
ORGANIZATIONAL: Responses are built up based on the first response. Funnel (built on a response); Filter (directs the response).
Ex: (Funnel) 1. Which department do you work in? 2. How many people work in that dept.? 3. How long have you been working in that dept.?
Ex: (Filter) 1. Do you practice yoga? If Yes go to 2, if No go to 3. 2. How long have you been practicing? 3. Do you practice any technique thought to increase relaxation?
Advantages and disadvantages of questionnaires:
Some disadvantages of questionnaires:
Questionnaires, like many evaluation methods, occur after the event, so participants may forget important issues.
Questionnaires are standardized, so it is not possible to explain any points in the questions that participants might misinterpret. This could be partially solved by piloting the questions on a small group of students or at least friends and colleagues. It is advisable to do this anyway.
Open-ended questions can generate large amounts of data that can take a long time to process and analyse. One way of limiting this would be to limit the space available to students so their responses are concise, or to sample the students and survey only a portion of them.
Respondents may answer superficially, especially if the questionnaire takes a long time to complete. The common mistake of asking too many questions should be avoided.
Students may not be willing to answer the questions. They might not wish to reveal the information, or they might think that they will not benefit from responding, perhaps even be penalized for giving their real opinion. Students should be told why the information is being collected and how the results will be beneficial. They should be asked to reply honestly and told that a negative response is just as useful as a more positive opinion. If possible the questionnaire should be anonymous.
Some advantages of questionnaires:
The responses are gathered in a standardized way, so questionnaires are more objective, certainly more so than interviews.
Generally it is relatively quick to collect information using a questionnaire. However, in some situations they can take a long time not only to design but also to apply and analyse.
Potentially, information can be collected from a large portion of a group. This potential is not often realized, as returns from questionnaires are usually low. However, return rates can be dramatically improved if the questionnaire is delivered and responded to in class time.
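The filter example given earlier in this section ("If Yes go to 2, if No go to 3") can be sketched as a small routing table, for instance when a questionnaire is administered electronically. The names `ROUTES` and `questions_asked` are hypothetical. The sketch also assumes that a respondent who answers question 2 simply continues to question 3, which the notes do not state.

```python
# Hypothetical routing table for the filter example: for each question number,
# the next question depends on the answer ("default" applies to any answer).
ROUTES = {
    1: {"yes": 2, "no": 3},   # Do you practice yoga?
    2: {"default": 3},        # How long have you been practicing? (assumed to continue to 3)
    3: {"default": None},     # Do you practice any technique thought to increase relaxation?
}

def questions_asked(answers):
    """Return the question numbers a respondent sees, given their answers."""
    asked = []
    current = 1
    while current is not None:
        asked.append(current)
        routes = ROUTES[current]
        # Follow the route for this answer, or the default route if there is one
        current = routes.get(answers.get(current), routes.get("default"))
    return asked

print(questions_asked({1: "yes", 2: "3 years"}))
print(questions_asked({1: "no"}))
```

A respondent who does not practice yoga is never shown question 2, which is the purpose of a filter question.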
Interview:
Types of interview: Structured, Semi-structured, Unstructured, Focus groups, Telephone
Interviews are very useful and descriptive, but test-retest reliability and inter-tester reliability are poor.
Structured:
Highly formalized. Responses are often pre-coded. Questions are always read in the same order, in the same way, with the same inflection (to be specified). Types of questions: dichotomous, multiple choice, scaled.
Advantages: Takes less time than the other types. Gives quantitative statistics. Like a questionnaire, but with less concern about layout and ambiguity. Questions from the respondent can be cleared up as they arise (unlike a questionnaire).
Disadvantages: Sensitive questions less likely to evoke honest responses compared with questionnaires
Semi structured
Begin with closed-ended questions and move on to open-ended questions.
Ex: Role of a physiotherapist in a burns unit: How long have you been working here? What do you do here during a typical day?
Unstructured
The interviewee takes control. The interviewer promptly gives direction to the interview.
Ex: Physiotherapy burns unit: What are your experiences of working in a burns unit? What is the impact of working in a burns unit on yourself?
Focus group
A group interview with 8 to 12 persons who all have something in common; a facilitator leads the group discussion. Another researcher observes what is said (and any relevant non-verbal behavior).
Telephone
Advantages: Reduces time, money and travel. Prevents shying away from a face-to-face meeting (and questioning) with a stranger.
Disadvantages: Cost; may avoid people; lines busy; no phones; deaf people
Interview skills: Listen more than you speak! Questions should be clear, non-threatening and straightforward.
Avoid cues: "You enjoy your work, don't you?" is bad; "How do you feel about your work?" is good.
Enjoy the process of interviewing. Avoid looking bored or scared! Establish a good rapport.
Observation:
Types of observation: Structured, Semi-structured, Unstructured
Structured observations: Ex: Effect of Yoga on Schizophrenics
Advantages: Used for large scale studies; measurable, quantifiable data; can be used to test a hypothesis; easy to train observers; cost and time effective; strong reliability
Disadvantages: Pre-determined points may neglect important (new) observations, e.g., may not record a suddenly awakened interest. The context in which the data are collected gets ignored, e.g., "Did the subject attend the therapy session? Yes / No" may not allow for "No, because he had fever".
Semi structured and unstructured observations:
Advantages: Allows more creativity; less planning required
Disadvantages: More labor during observation; takes time; difficult to quantify
Types of Data/Variables:
Nominal variables are variables that have two or more categories but which do not have an intrinsic order. For example, a real estate agent could classify their types of property into distinct categories such as houses, condos, co-ops or bungalows. So "type of property" is a nominal variable with 4 categories called houses, condos, co-ops and bungalows. Of note, the different categories of a nominal variable can also be referred to as groups or levels of the nominal variable. Another example of a nominal variable would be classifying where people live in the USA by state. In this case there will be many more levels of the nominal variable (50 in fact).
Ordinal variables are variables that have two or more categories just like nominal variables, only the categories can also be ordered or ranked. So if you asked someone if they liked the policies of the Democratic Party and they could answer either "Not very much", "They are OK" or "Yes, a lot", then you have an ordinal variable. Why? Because you have 3 categories, namely "Not very much", "They are OK" and "Yes, a lot", and you can rank them from the most positive (Yes, a lot), to the middle response (They are OK), to the least positive (Not very much). However, whilst we can rank the levels, we cannot place a "value" on them; we cannot say that "They are OK" is twice as positive as "Not very much", for example.
Interval variables are variables whose central characteristic is that they can be measured along a continuum and they have a numerical value (for example, temperature measured in degrees Celsius or Fahrenheit). So the difference between 20 °C and 30 °C is the same as the difference between 30 °C and 40 °C. However, temperature measured in degrees Celsius or Fahrenheit is NOT a ratio variable.
Ratio variables are interval variables but with the added condition that 0 (zero) of the measurement indicates that there is none of that variable. So, temperature measured in degrees Celsius or Fahrenheit is not a ratio variable because 0 °C does not mean there is no temperature. However, temperature measured in Kelvin is a ratio variable as 0 Kelvin (often called absolute zero) indicates that there is no temperature whatsoever. Other examples of ratio variables include height, mass, distance and many more. The name "ratio" reflects the fact that you can use the ratio of measurements. So, for example, a distance of ten meters is twice the distance of 5 meters.
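The difference between interval and ratio variables can be checked with a little arithmetic. The sketch below uses illustrative temperatures of 20 °C and 40 °C (not values from the notes) and the standard conversion K = °C + 273.15.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin (0 K is absolute zero)."""
    return celsius + 273.15

# Interval scale: equal differences are meaningful
assert 30 - 20 == 40 - 30

# Dividing Celsius readings gives 2.0, but 40 °C is not "twice as hot" as 20 °C,
# because 0 °C is an arbitrary zero point
celsius_ratio = 40.0 / 20.0

# Ratio scale: after converting to Kelvin the ratio is meaningful
kelvin_ratio = celsius_to_kelvin(40.0) / celsius_to_kelvin(20.0)

# Distance is a ratio variable, so ten meters really is twice five meters
distance_ratio = 10 / 5

print(celsius_ratio, round(kelvin_ratio, 3), distance_ratio)
```

On the Kelvin scale the warmer temperature is only about 7% higher, which shows why ratios should be taken only for ratio variables.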
Analysis of the data:

Data are analysed using two kinds of statistical methods: 1. Descriptive statistics 2. Inferential statistics
Descriptive statistics
Descriptive statistics describe the data in two ways: measures of central tendency and measures of dispersion.
Measures of central tendency are 1. Mean 2. Median and 3. Mode
Mean, median, and mode are three kinds of "averages". There are many "averages" in statistics, but these are, I think, the three most common, and are certainly the three you are most likely to encounter in your pre-statistics courses, if the topic comes up at all.
The "mean" is the "average" you're used to, where you add up all the numbers and then divide by the number of numbers. The "median" is the "middle" value in the list of numbers. To find the median, your numbers have to be listed in numerical order, so you may have to rewrite your list first. The "mode" is the value that occurs most often. If no number is repeated, then there is no mode for the list.
Find the mean, median and mode for the following list of values: 13, 18, 13, 14, 13, 16, 14, 21, 13
The mean is the usual average, so: (13 + 18 + 13 + 14 + 13 + 16 + 14 + 21 + 13) ÷ 9 = 15
Note that the mean isn't a value from the original list. This is a common result. You should not assume that your mean will be one of your original numbers.
The median is the middle value, so I'll have to rewrite the list in order: 13, 13, 13, 13, 14, 14, 16, 18, 21
There are nine numbers in the list, so the middle one will be the (9 + 1) ÷ 2 = 10 ÷ 2 = 5th number. So the median is 14.
The mode is the number that is repeated more often than any other, so 13 is the mode.
The largest value in the list is 21, and the smallest is 13, so the range is 21 - 13 = 8.
Mean: 15 Median: 14 Mode: 13 Range: 8
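The worked example above can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable names are illustrative.

```python
from statistics import mean, median, mode

values = [13, 18, 13, 14, 13, 16, 14, 21, 13]

average = mean(values)                    # sum of the values divided by their count
middle = median(values)                   # middle value of the sorted list
most_common = mode(values)                # value that occurs most often
value_range = max(values) - min(values)   # largest minus smallest

print(average, middle, most_common, value_range)
```

One difference from the rule stated above: in Python 3.8 and later, `mode` returns the first value it encounters when several values are tied, so for a list with no repeated number it returns the first element rather than reporting that there is no mode.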
Measures of dispersion are 1. Range 2. Standard deviation and 3. Variance
The "range" is just the difference between the largest and smallest values. In {4, 6, 9, 3, 7} the lowest value is 3 and the highest is 9, so the range is 9 - 3 = 6. The variance of a data set is the arithmetic average of the squared differences between the values and the mean. Again, when we summarize a data set in a frequency distribution, we are approximating the data set by "rounding" each value in a given class to the class mark. Thus, writing $x_j$ for the mark of class $j$, $f_j$ for the number of values in that class, $n = \sum_j f_j$ for the total number of values and $m = \frac{1}{n} \sum_j f_j x_j$ for the mean, the variance of a frequency distribution is given by
$$s^2 = \frac{1}{n} \sum_j f_j (x_j - m)^2$$
The standard deviation is the square root of the variance:
$$s = \sqrt{\frac{1}{n} \sum_j f_j (x_j - m)^2}$$
The variance and the standard deviation are both measures of the spread of the distribution about the mean. The variance is the nicer of the two measures of spread from a mathematical point of view, but as you can see from the algebraic formula, the physical unit of the variance is the square of the physical unit of the data. For example, if our variable represents the weight of a person in pounds, the variance measures spread about the mean in squared pounds. On the other hand, standard deviation measures spread in the same physical unit as the original data, but because of the square root, is not as nice mathematically. Both measures of spread are useful. Again we can think of the relative frequency distribution as the probability distribution of a random variable X that gives the mark of the class containing a randomly chosen value from the data set. With this interpretation, the variance and standard deviation of the frequency distribution are the same as the variance and standard deviation of X.
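The measures of dispersion can be sketched in Python as follows. `pvariance` and `pstdev` from the standard `statistics` module divide by the number of values, matching the "arithmetic average of the squared differences" described above (the sample versions, `variance` and `stdev`, divide by one less). The helper `grouped_variance` is a hypothetical name for the frequency-distribution version, in which each squared difference between a class mark and the mean is weighted by the class frequency.

```python
from statistics import pstdev, pvariance

values = [4, 6, 9, 3, 7]
value_range = max(values) - min(values)   # 9 - 3
variance = pvariance(values)              # average squared difference from the mean
std_dev = pstdev(values)                  # square root of the variance

def grouped_variance(class_marks, frequencies):
    """Variance of a frequency distribution given class marks and class frequencies."""
    n = sum(frequencies)
    m = sum(f * x for x, f in zip(class_marks, frequencies)) / n
    return sum(f * (x - m) ** 2 for x, f in zip(class_marks, frequencies)) / n

# The nine values from the mean/median/mode example, tallied by value
raw = [13, 18, 13, 14, 13, 16, 14, 21, 13]
tallied = grouped_variance([13, 14, 16, 18, 21], [4, 2, 1, 1, 1])

print(value_range, variance, round(std_dev, 3), round(tallied, 3))
```

When each class holds a single distinct value, as in the tally here, the grouped variance equals the variance of the raw data; with wider classes it is only an approximation.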
Inferential statistics
Inferential statistics are used to draw inferences from the data by using various statistical tests. They test a hypothesis and check whether differences/effects are REAL or due to chance. Most hypotheses involve an independent variable (e.g., Drug A / B; Yoga / non-yoga) and a dependent variable (e.g., ability to walk).
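As a sketch of what such a test computes, the code below calculates the t statistic for a paired t test, the within-group comparison listed under the parametric tests in these notes. The scores are made up for illustration, the function name is hypothetical, and the p-value lookup is left out.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic for paired measurements on the same subjects."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need at least 2 paired measurements")
    differences = [a - b for b, a in zip(before, after)]
    # Standard error of the mean difference (sample standard deviation / sqrt(n))
    standard_error = stdev(differences) / math.sqrt(len(differences))
    return mean(differences) / standard_error

# Hypothetical scores for four subjects before and after an intervention
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]
t = paired_t_statistic(before, after)
print(t)
```

The statistic is compared against a t distribution with n - 1 degrees of freedom (3 here) to judge whether the mean change is likely to be real or due to chance. The sketch fails with a division error if every subject changes by exactly the same amount.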
How to choose Statistical tests?

Check the data 1. Homogeneous/Heterogeneous 2. Normally distributed/Not normally distributed 3. Type of data
Parametric tests, if a. The data are normally distributed b. The data are interval or ratio type c. Data are homogeneous
Non-parametric tests, if a. The data are not normally distributed b. The data are nominal or ordinal type c. Data are heterogeneous
Parametric tests
Comparison: 1. Within the group: Paired t test 2. Between the groups: Independent t test 3. More than two groups: Analysis of variance (ANOVA) 4. Many variables: Repeated measures ANOVA
Correlation: Correlation between the variables: Pearson correlation coefficient test
Non-parametric tests
Comparison: 1. Within the group: Wilcoxon signed rank test 2. Between the groups: Mann-Whitney test 3. More than two groups: Kruskal-Wallis test
Correlation: Correlation between the variables: Spearman correlation coefficient test
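The selection of a test can be summarized as a small lookup. The sketch below is only one reading of these notes: it assumes a parametric test is chosen when all three conditions hold (normal distribution, interval or ratio data, homogeneous data) and a non-parametric test otherwise. The function and key names are hypothetical, and repeated measures ANOVA is left out because the notes list no non-parametric counterpart for it.

```python
PARAMETRIC = {
    "within_group": "Paired t test",
    "between_groups": "Independent t test",
    "more_than_two_groups": "Analysis of variance (ANOVA)",
    "correlation": "Pearson correlation coefficient test",
}

NON_PARAMETRIC = {
    "within_group": "Wilcoxon signed rank test",
    "between_groups": "Mann-Whitney test",
    "more_than_two_groups": "Kruskal-Wallis test",
    "correlation": "Spearman correlation coefficient test",
}

def choose_test(question, normally_distributed, interval_or_ratio, homogeneous):
    """Pick a test name for the given research question and data checks."""
    # Assumption: every condition must hold before a parametric test is used
    parametric = normally_distributed and interval_or_ratio and homogeneous
    table = PARAMETRIC if parametric else NON_PARAMETRIC
    return table[question]

print(choose_test("within_group", True, True, True))
print(choose_test("correlation", False, True, True))
```

For example, pre and post scores from one group that pass all three checks lead to the paired t test, while the same design with non-normal data leads to the Wilcoxon signed rank test.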
