
Motherhood University, Roorkee

Faculty of Commerce and Business Studies


Assignment
Academic Session: 2020-21
Subject Name: Quantitative Analysis
Paper Code: MUBBA-302
Name of Student: Sahil Saini
Roll no. 1905000012

___________________________________________________
UNIT-1
Q1. What are the uses of statistics and quantitative techniques?

Ans- Uses of Statistics:

(1) Statistics helps in providing a better understanding and exact description of a phenomenon of
nature.

(2) Statistics helps in the proper and efficient planning of a statistical inquiry in any field of study.

(3) Statistics helps in collecting appropriate quantitative data.

(4) Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for
easy and clear comprehension of the data.

(5) Statistics helps in understanding the nature and pattern of variability of a phenomenon through
quantitative observations.

(6) Statistics helps in drawing valid inferences, along with a measure of their reliability about the
population parameters from the sample data.
A small business owner is always making decisions under uncertainty. In the world of business, nothing
is ever done with total confidence that you have made the right decision. Fortunately, numerous
quantitative techniques are available to help organize and assess the risks of various issues.
Quantitative models give managers a better grasp of the problems so that they can make the best
decisions based on the information available. Quantitative techniques are used by managers in
practically all aspects of a business.

1.Project Management

Quantitative methods have found wide applications in project management. These techniques are used
for optimizing the allocation of manpower, machines, materials, money and time. Projects are
scheduled with quantitative methods and synchronized with delivery of material and workforce.

2.Production Planning and Scheduling

Determining the size and location of new production facilities is a complex issue. Quantitative
techniques aid in evaluating multiple proposals for costs, timing, location and availability of
transportation. Product mix and scheduling get analyzed to meet customer demands and maximize
profits.

3.Purchasing and Inventory

Predicting the amount of demand for a product is always dicey. Quantitative techniques offer guidance
on how much raw material to purchase, levels of inventory to keep and costs to ship and store finished
products.

4.Marketing

Marketing campaigns get evaluated with large amounts of data. Marketers apply quantitative methods
to set budgets, allocate media purchases, adjust product mix and adapt to customers'
preferences. Surveys produce data about viewers' responses to advertisements: how many people saw
the ads, and how many purchased the products? All of this information is evaluated to calculate the
return on the dollars invested in an advertising campaign.

5.Finance

Financial managers rely heavily on quantitative techniques. They evaluate investments with discounted
cash flow models and return on capital calculations. Products get analyzed for profit contribution and
cost of production. Workers are scrutinized for productivity standards and for hiring or firing to meet
changing workloads. Predicting cash flow is always a critical concern for managers, and quantitative
measurements help them to predict cash surpluses and shortfalls. They use probabilities and statistics to
prepare annual profit plans.

6.Research and Development


Risking funds on research and development is always a best-guess scenario. The outcomes are never
certain. So, managers look to mathematical projections about the probability of success and eventual
profitability of products to make investment decisions.

7.Agriculture

Operations research techniques have long been employed by farmers. They utilize decision trees and
make assumptions about weather forecasts to decide which crops to plant. If forecasters predict cold
weather, is it more profitable to plant corn or wheat? What happens if the weather is warm? These are
all probabilities that farmers use to plan their crop rotations. A variety of quantitative methods of
analysis are finding more applications in business as managers learn how to use these techniques to
provide more insight into problems and aid in daily decision-making.

Q2. Differentiate between histogram and bar chart.

Ans- Have you ever noticed that the histogram and bar chart look quite similar, and wondered why we
need two different types of chart? Well, if you look at them closely, you can see that there are
many differences between a histogram and a bar chart. In this blog, let's look at the bar chart, the
histogram chart, and the differences between them.

Bar chart
A bar chart, which is also widely known as a column chart, is used to compare the frequency, count,
total, or average of data in different categories by using vertical or horizontal bars. Discrete categories
comparison is graphically visualized using a bar chart.

Bars require two values, x and y, to render. The x value might be string, numeric, date-time, log, etc. The
y value should always be numeric. The x-axis shows the categories being compared, whereas the y-axis
shows the measured value. Each bar represents a category and is rendered with equal width but with
varying length. The length of the bar is based on its y value.
[Screenshot: bar chart]

In the above screenshot, the first bar represents that Tokyo’s land area is 6,993 sq. km.

Histogram chart
A histogram chart looks like a bar chart. It is used to represent the distribution of numerical data by
rendering vertical bars. Non-discrete value comparisons are graphically visualized using a histogram chart.
For example, the count of students who earned marks on an exam in various ranges can be visualized
easily with a histogram chart.

Only one set of numeric values is required to render the histogram chart. The x-axis shows the ranges of
value, whereas the y-axis shows the count of the occurrences. Each block represents the number of
occurrences within a specific range. The ranges are split based on the specified bin value.
[Screenshot: histogram chart]

In the previous screenshot, the bin value is set to 20, so the x-axis is split with 20 as the interval. The first bar
represents that there are 10 values lying between 0 and 20 in the provided data.

Comparison table

Comparison terms | Bar chart | Histogram chart
Usage | To compare different categories of data. | To display the frequency of occurrences.
Indicates | Discrete values. | Non-discrete values.
Data | Categorical data. | Quantitative data.
Rendering | Each data point is rendered as a separate bar. | The data points are grouped and rendered based on the bin value.
Space between bars | Can have space. | No space.
Reordering bars | Can be reordered. | Cannot be reordered.
Axis label placement | Axis labels can be placed on or between the ticks. | Axis labels are placed on the ticks.
Required values | x and y. | Only y.

[Figure: bar chart versus histogram chart elements representation]

Comparison example
Here, you can see the output in a bar chart and a histogram chart for the same data set [0, 7, 9, 15, 19,
13, 8, 4, 24, 8].
[Screenshot: bar chart versus histogram chart comparison]

From the previous screenshot, we can understand that the bar chart renders bars for all the data points.
The length of the bars is based on the y value. The width and spacing between the bars are maintained
equally. But, in the histogram chart, the bars are rendered for grouped values. Here, the bin value is
specified as 10 and six values lie between 0 and 10, three values lie between 10 and 20, and one value
lies between 20 and 30.
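
The same contrast can be reproduced with a short script. Here is a minimal sketch (assuming matplotlib is available; any plotting library would do) that draws both charts for the data set above:

```python
# Render the same data as a bar chart (one bar per data point) and as a
# histogram (data points grouped into bins of width 10).
import matplotlib.pyplot as plt

data = [0, 7, 9, 15, 19, 13, 8, 4, 24, 8]

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

ax_bar.bar(range(len(data)), data)           # one bar per data point
ax_bar.set_title("Bar chart")

ax_hist.hist(data, bins=[0, 10, 20, 30], edgecolor="black")
ax_hist.set_title("Histogram (bins of 10)")  # counts per bin: 6, 3, 1

plt.show()
```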

Q3. What is frequency distribution table?

Ans- A frequency distribution is a representation, either in a graphical or tabular format, that displays
the number of observations within a given interval. The interval size depends on the data being analyzed
and the goals of the analyst. The intervals must be mutually exclusive and exhaustive.
Frequency distributions are typically used within a statistical context. Generally, frequency distribution
can be associated with the charting of a normal distribution.

KEY TAKEAWAYS

 Frequency distribution in statistics is a representation that displays the number of observations
within a given interval.

 The representation of a frequency distribution can be graphical or tabular so that it is easier to
understand.

 Frequency distributions are particularly useful for normal distributions, which show the
observations of probabilities divided among standard deviations.

 In finance, traders use frequency distributions to take note of price action and identify trends.

Understanding Frequency Distribution

As a statistical tool, a frequency distribution provides a visual representation for the distribution of
observations within a particular test. Analysts often use frequency distribution to visualize or illustrate
the data collected in a sample. For example, the height of children can be split into several different
categories or ranges. In measuring the height of 50 children, some are tall and some are short, but there
is a high probability of a higher frequency or concentration in the middle range. The most important
factors for gathering data are that the intervals used must not overlap and must contain all of the
possible observations.

Visual Representation of Frequency Distribution

Both histograms and bar charts provide a visual display using columns, with the y-axis representing the
frequency count, and the x-axis representing the variable to be measured. In the height of children, for
example, the y-axis is the number of children, and the x-axis is the height. The columns represent the
number of children observed with heights measured in each interval.

In general, a histogram chart will typically show a normal distribution, which means that the majority of
occurrences will fall in the middle columns. Frequency distributions can be a key aspect of charting
normal distributions which show observation probabilities divided among standard deviations.
Frequency distributions can be presented as a frequency table, a histogram, or a bar chart. Below is an
example of a frequency distribution as a table.

Height of Children in a School

Interval (Height) | 4'-4'5" | 5'-5'2"
Frequency | 25 | 63
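
A frequency table like this can be produced directly from raw measurements. Here is a minimal sketch (pandas assumed; the heights are hypothetical) that bins observations into mutually exclusive, exhaustive intervals:

```python
# Build a frequency distribution table from raw data.
import pandas as pd

heights = [48, 50, 53, 55, 56, 57, 57, 58, 59, 60,
           60, 61, 61, 62, 63, 64, 65, 66, 68, 70]  # inches, hypothetical

bins = [45, 50, 55, 60, 65, 70]  # interval edges
table = pd.Series(pd.cut(heights, bins=bins)).value_counts().sort_index()
print(table)  # count of observations falling in each interval
```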

Frequency Distribution in Trading

Frequency distributions are not commonly used in the world of investments. However, traders who
follow Richard D. Wyckoff, a pioneering early 20th-century trader, use an approach to trading that
involves frequency distribution. Investment houses still use the approach, which requires considerable
practice, to teach traders. The frequency chart is referred to as a point-and-figure chart and was created
out of a need for floor traders to take note of price action and to identify trends.

The y-axis is the variable measured, and the x-axis is the frequency count. Each change in price action is
denoted in Xs and Os. Traders interpret it as an uptrend when three X's emerge; in this case, demand
has overcome supply. In the reverse situation, when the chart shows three O's, it indicates that supply
has overcome demand.
Q4. Write down the different sources of data collection.

Ans- Sources of Data Collection

Normally we can gather data from two sources, namely primary and secondary. Data gathered through
observation or questionnaire surveys in a natural setting are examples of data obtained in an
uncontrolled environment. Secondary data is the data acquired from secondary sources like magazines, books,
documents, journals, reports, the web and more.

→ Sources of Primary Data Collection

Primary data is the data that you gather specifically for the purpose of your research project. An
advantage of primary data is that it is tailored to your analysis needs. A drawback is that it
is expensive to obtain. Primary data is also called raw data: the information gathered from
the original source in a controlled or an uncontrolled environment. Examples of a controlled environment are
experimental studies where certain variables are controlled by the researcher. The source of primary
data is the population sample from which you collect the data. The first step in the process is
determining your target population. For instance, if you are researching the marketability of a new washing
machine, your target population may be newlyweds.

Clearly, it’s impracticable to gather information from everybody, so you will need to decide on the sample
size and kind of sample. The sample should be random, and a stratified random sample is frequently
sensible. In our washing machine example, subpopulations might include young couples,
middle-aged couples, older couples, and previously married couples.

→ Sources of Secondary Data Collection

You can break the sources of secondary data into internal and external sources. Internal sources
include data that already exists and is stored in your organization. External data refers to data that is
gathered by other individuals or organizations in your organization’s external environment.

Examples of internal sources of data include, but are not restricted to, the following:

 Profit and loss statements

 Balance sheets

 Sales figures

 Inventory records

 Previous marketing studies

If the secondary data you have gathered from internal sources is not sufficient, you can turn to external
sources of data collection. Some external sources of data collection include:
 Universities

 Government sources

 Foundations

 Media, including telecast, print and Internet

 Trade, business and expert affiliations

 Corporate filings

 Commercial information services, which are organizations that find the data for you

________________________________________________________________________________

UNIT-2
Q1. Find the mean of the following data.

(a) 9, 7, 11, 13, 2, 4, 5, 5.


(b) 16, 18, 19, 21, 23, 23, 27 29, 29, 35.

Ans- (a) Mean = (sum of all outcomes) ÷ (number of outcomes)

So,

Mean = (9 + 7 + 11 + 13 + 2 + 4 + 5 + 5) ÷ 8

= 56 ÷ 8

= 7

(b) Mean = (sum of all outcomes) ÷ (number of outcomes)

So,

Mean = (16 + 18 + 19 + 21 + 23 + 23 + 27 + 29 + 29 + 35) ÷ 10

= 240 ÷ 10

= 24
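
Both results can be cross-checked with a few lines of Python; a minimal sketch using the standard library's statistics module:

```python
# Verifying the two mean calculations above.
from statistics import mean

print(mean([9, 7, 11, 13, 2, 4, 5, 5]))                # 7
print(mean([16, 18, 19, 21, 23, 23, 27, 29, 29, 35]))  # 24
```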

Q2. Find the mode of the following data.


(a) 12, 8, 4, 8, 1, 8, 9, 11, 9, 10, 12, 8.
(b) 15, 22, 17, 19, 22, 17, 29, 24, 17, 15.

Ans- (a) Mode = 8, because it occurs 4 times in the given data, more often than any other
value.

(b) Mode = 17, because it occurs 3 times in the given data, more often than any other value.
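
Again, a quick check with the standard library:

```python
# Verifying the modes above.
from statistics import mode

print(mode([12, 8, 4, 8, 1, 8, 9, 11, 9, 10, 12, 8]))  # 8
print(mode([15, 22, 17, 19, 22, 17, 29, 24, 17, 15]))  # 17
```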

Q3. What is the standard deviation? Calculate it for the monthly salaries (in thousands) of a
sample of 5 sales representatives: 12, 9, 7, 6, 6.

Ans-The standard deviation is a statistic that measures the dispersion of a dataset relative to
its mean and is calculated as the square root of the variance. The standard deviation is calculated as the
square root of variance by determining each data point's deviation relative to the mean. If the data
points are further from the mean, there is a higher deviation within the data set; thus, the more spread
out the data, the higher the standard deviation.

KEY TAKEAWAYS:

 Standard deviation measures the dispersion of a dataset relative to its mean.


 A volatile stock has a high standard deviation, while the deviation of a stable blue-chip stock is
usually rather low.
 As a downside, the standard deviation calculates all uncertainty as risk, even when it’s in the
investor's favor, such as above-average returns.

For the given sample of salaries (in thousands) 12, 9, 7, 6, 6: mean = (12+9+7+6+6)/5 = 40/5 = 8. The
squared deviations from the mean are 16, 1, 1, 4 and 4, which sum to 26. The sample variance =
26/(5 − 1) = 6.5, so the sample standard deviation = √6.5 ≈ 2.55 (thousand).

Q4. Find the variance for the following sample: 12, 13, 24, 24, 25, 26, 34, 35, 38, 45, 46, 46, 52,
53, 78, 78, and 89.

Ans- Mean = (12 + 13 + 24 + 24 + 25 + 26 + 34 + 35 + 38 + 45 + 46 + 46 + 52 + 53 + 78 + 78 + 89)/17 =
718/17 ≈ 42.24. The sum of squared deviations from the mean is Σ(x − x̄)² = 38450 − 718²/17 ≈ 8125.06.
Since this is a sample, divide by n − 1: variance = 8125.06/16 ≈ 507.82.
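
Both answers can be verified in Python; a minimal sketch using the standard library, which computes the sample versions (dividing by n − 1):

```python
# Sample standard deviation (Q3) and sample variance (Q4).
from statistics import stdev, variance

salaries = [12, 9, 7, 6, 6]
print(round(stdev(salaries), 2))   # 2.55

data = [12, 13, 24, 24, 25, 26, 34, 35, 38, 45, 46, 46, 52, 53, 78, 78, 89]
print(round(variance(data), 2))    # 507.82
```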

____________________________________________________________________________________

UNIT-3
Q1. What is the role of probability in decision making?

Ans- The Role Of Probability In Business Decision Making

Still, the real role of probability in business decision making doesn’t even deal with statistics, numbers or
math. It deals with probability’s story: how to look at events.

For example, most would look at an outcome made by three events and think, “Doing those three things
again would give the same outcome.” Probability’s story says, “Don’t count on it.”
Pilot programs are great business examples. When they go well, they usually launch a bigger program.
When they don’t, they don’t. Probability says, “Not so fast.”

Probability’s Story Has Two Key Themes

That is because probability’s story has two key themes that affect decision making:

1. The same events produce a range of outcomes over time, not the same one all the time.

2. While it’s hard to predict a single outcome for a specific time, it’s easier to predict a range of
outcomes for an extended period.

For example, a salesperson follows a sales process. It does not always produce a sale. Sales can be of
different sizes too. Yet, the process can come close to saying what the sales will be over time.

The importance of probability in business decision making processes shows up in four ways.

Four Ways Probability Impacts Business Decision Making Processes

These two themes impact business decision making processes in four ways:

1. Don’t overweight a single outcome.

2. Aim for a range of outcomes centered on the most likely one.

3. List all the unknowns to set a good range.

4. Look for patterns and trends in outcomes.

A single outcome might just be an outlier. Simply, “It’s neither as bad nor as good as it seems.” Thus, a
range of outcomes is better. Also, it’s easier to hit a range of outcomes than it is a single one. Listing
unknowns can help with the range. The more there are, the broader the range needs to be. The fewer,
the narrower. Beware though. People prefer to talk about what they know. Thus, people easily short the
unknowns.

Finally, patterns and trends help a lot. They often substitute when one can’t see causes. For instance, science
does not know what causes people to like music. Yet, patterns and trends in behaviors show they do.

Taking Probability Beyond Numbers

In the end, the importance of probability in business decision making goes beyond statistics, numbers,
math and data. Its importance rests in its story. That story shows events in a context, a spectrum.
They’re no longer lonely points linked by a thread. Only better decisions can result.

Q2. What is an exclusive event table? Draw it for the rolling of two dice.

Ans- When rolling two dice, distinguish between them in some way: a first one and a second one, a left
and a right, a red and a green, etc. Let (a,b) denote a possible outcome of rolling the two dice, with a the
number on the top of the first die and b the number on the top of the second die. Note that each of a
and b can be any of the integers from 1 through 6. Here is a listing of all the joint possibilities for (a,b):

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)

(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)

(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)

(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)

(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)

(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

Note that there are 36 possibilities for (a,b). This total number of possibilities can be obtained from the
multiplication principle: there are 6 possibilities for a, and for each outcome for a, there are 6
possibilities for b. So, the total number of joint outcomes (a,b) is 6 times 6 which is 36. The set of all
possible outcomes for (a,b) is called the sample space of this probability experiment.

With the sample space now identified, formal probability theory requires that we identify the possible
events. These are always subsets of the sample space, and must form a sigma-algebra. In an example
such as this, where the sample space is finite because it has only 36 different outcomes, it is perhaps
easiest to simply declare ALL subsets of the sample space to be possible events. That will be a
sigma-algebra and avoids what might otherwise be an annoying technical difficulty. We make that
declaration with this example of two dice. With the above declaration, the outcomes where the sum of
the two dice is equal to 5 form an event. If we call this event E, we have

E={(1,4),(2,3),(3,2),(4,1)}.

Note that we have listed all the ways a first die and second die add up to 5 when we look at their top
faces.

Consider next the probability of E, P(E). Here we need more information. If the two dice are fair and
independent, each possibility (a,b) is equally likely. Because there are 36 possibilities in all, and the sum
of their probabilities must equal 1, each singleton event {(a,b)} is assigned probability equal to 1/36.
Because E is composed of 4 such distinct singleton events, P(E) = 4/36 = 1/9.

In general, when the two dice are fair and independent, the probability of any event is the number of
elements in the event divided by 36. What if the dice aren't fair, or aren't independent of each other?
Then each outcome {(a,b)} is assigned a probability (a number in [0,1]) whose sum over all 36 outcomes
is equal to 1. These probabilities aren't all equal, and must be estimated by experiment or inferred from
other hypotheses about how the dice are related and how likely each number is on each of the dice.
Then the probability of an event such as E is the sum of the probabilities of the singleton events {(a,b)}
that make up E.
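
The sample space, the event E and these probabilities can all be reproduced by brute-force enumeration; a minimal Python sketch (which also covers the sum of 7 asked about in Q3 below):

```python
# Enumerate the 36 equally likely outcomes for two fair, independent dice.
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))  # all (a, b) pairs
assert len(sample_space) == 36

E = [(a, b) for (a, b) in sample_space if a + b == 5]
print(E, len(E) / 36)     # [(1, 4), (2, 3), (3, 2), (4, 1)] -> 4/36 = 1/9

sevens = [(a, b) for (a, b) in sample_space if a + b == 7]
print(len(sevens) / 36)   # 6/36 = 1/6 = 0.1666...
```
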
Q3. Find out the probability for the occurrence of 7 in the toss of a dice.

Ans- A single die cannot show a 7, so the question refers to the sum of 7 when two dice are tossed. Of
the 36 equally likely outcomes, 6 combinations sum to 7: (1,6), (2,5), (3,4), (4,3), (5,2) and (6,1). So:

P(sum = 7) = 6/36 = 1/6 ≈ 16.67%

Q4. What is probability distribution?

Ans- Probability Distribution

A probability distribution is a statistical function that describes all the possible values and likelihoods
that a random variable can take within a given range. This range will be bounded between the minimum
and maximum possible values, but precisely where the possible value is likely to be plotted on the
probability distribution depends on a number of factors. These factors include the distribution's mean
(average), standard deviation, skewness, and kurtosis.

How Probability Distributions Work

Perhaps the most common probability distribution is the normal distribution, or "bell curve," although
several distributions exist that are commonly used. Typically, the data-generating process of some
phenomenon will dictate its probability distribution. This process is described by the probability density
function. Probability distributions can also be used to create cumulative distribution functions (CDFs),
which add up the probability of occurrences cumulatively and will always start at zero and end at 100%.

Academics, financial analysts and fund managers alike may determine a particular stock's probability
distribution to evaluate the possible expected returns that the stock may yield in the future. The stock's
history of returns, which can be measured from any time interval, will likely be composed of only a
fraction of the stock's returns, which will subject the analysis to sampling error. By increasing the sample
size, this error can be dramatically reduced.
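
As an illustration, the link between a distribution and its CDF can be sketched in a few lines; a minimal example (NumPy assumed; the simulated "returns" are hypothetical):

```python
# Build an empirical probability distribution from simulated returns and
# check that its cumulative version starts near 0 and ends at 100%.
import numpy as np

rng = np.random.default_rng(seed=0)
returns = rng.normal(loc=0.05, scale=0.2, size=10_000)  # hypothetical data

counts, edges = np.histogram(returns, bins=30)
pmf = counts / counts.sum()   # probability of landing in each bin
cdf = np.cumsum(pmf)          # cumulative distribution function

print(round(cdf[0], 4), round(cdf[-1], 4))  # tiny first value, 1.0 last
```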

KEY TAKEAWAYS

 A probability distribution depicts the expected outcomes of possible values for a given data
generating process.
 Probability distributions come in many shapes with different characteristics, as defined by the
mean, standard deviation, skewness, and kurtosis.
 Investors use probability distributions to anticipate returns on assets such as stocks over time
and to hedge their risk.

_____________________________________________________________________________________
UNIT-4
Q1. What is the importance of sampling?

Ans- Importance of sampling

1.Save Time

Contacting everyone in a population takes time. And, invariably, some people will not respond to the
first effort at contacting them, meaning researchers have to invest more time for follow-up. Random
sampling is much faster than surveying everyone in a population, and obtaining a non-random sample is
almost always faster than random sampling. Thus, sampling saves researchers lots of time.

2.Save Money

The number of people a researcher contacts is directly related to the cost of a study. Sampling saves
money by allowing researchers to gather the same answers from a sample that they would receive from
the population. Non-random sampling is significantly cheaper than random sampling, because it lowers
the cost associated with finding people and collecting data from them. Because all research is conducted
on a budget, saving money is important.

3.Collect Richer Data

Sometimes, the goal of research is to collect a little bit of data from a lot of people (e.g., an opinion poll).
At other times, the goal is to collect a lot of information from just a few people (e.g., a user study or
ethnographic interview). Either way, sampling allows researchers to ask participants more questions and
to gather richer data than does contacting everyone in a population.

The Importance of Knowing Where to Sample

Efficient sampling has a number of benefits for researchers. But just as important as knowing how to
sample is knowing where to sample. Some research participants are better suited for the purposes of a
project than others. Finding participants that are fit for the purpose of a project is crucial, because it
allows researchers to gather high-quality data.

For example, consider an online research project. A team of researchers who decides to conduct a study
online has several different sources of participants to choose from. Some sources provide a random
sample, and many more provide a non-random sample. When selecting a non-random sample,
researchers have several options to consider. Some studies are especially well-suited to an online panel
that offers access to millions of different participants worldwide. Other studies, meanwhile, are better
suited to a crowdsourced site that generally has fewer participants overall but more flexibility for
fostering participant engagement.

To make these options more tangible, let’s look at examples of when researchers might use different
kinds of online samples.
Q2. What are the possible errors in result interpretation?

Ans- The knowledge we have of the physical world is obtained by doing experiments and making
measurements. It is important to understand how to express such data and how to analyze and draw
meaningful conclusions from it.

In doing this it is crucial to understand that all measurements of physical quantities are subject to
uncertainties. It is never possible to measure anything exactly. It is good, of course, to make the error as
small as possible but it is always there. And in order to draw valid conclusions the error must be
indicated and dealt with properly.

Take the measurement of a person's height as an example. Assuming that her height has been
determined to be 5' 8", how accurate is our result?

Well, the height of a person depends on how straight she stands, whether she just got up (most people
are slightly taller when getting up from a long rest in horizontal position), whether she has her shoes on,
and how long her hair is and how it is made up. These inaccuracies could all be called errors of
definition. A quantity such as height is not exactly defined without specifying many other circumstances.

Even if you could precisely specify the "circumstances," your result would still have an error associated
with it. The scale you are using is of limited accuracy; when you read the scale, you may have to
estimate a fraction between the marks on the scale, etc.

Q3. When do we use snowball sampling in research?

Ans- Snowball Sampling: Definition

Snowball sampling or chain-referral sampling is defined as a non-probability sampling technique in
which the samples have traits that are rare to find. This is a sampling technique in which existing
subjects provide referrals to recruit the samples required for a research study.

For example, if you are studying the level of customer satisfaction among the members of an elite
country club, you will find it extremely difficult to collect primary data sources unless a member of the
club agrees to have a direct conversation with you and provides the contact details of the other
members of the club.

This sampling method involves a primary data source nominating other potential data sources that will
be able to participate in the research studies. The snowball sampling method is purely based on referrals,
and that is how a researcher is able to generate a sample; therefore, this method is also called the
chain-referral sampling method. Snowball sampling is a popular business study method. The snowball
sampling method is extensively used where a population is unknown and rare, and it is tough to choose
subjects to assemble them as samples for research.
This sampling technique can go on and on, just like a snowball increasing in size (in this case the sample
size), till the time a researcher has enough data to analyze to draw conclusive results that can help an
organization make informed decisions.

Types of Snowball Sampling

Linear Snowball Sampling: The formation of a sample group starts with one individual subject providing
information about just one other subject, and then the chain continues with only one referral from one
subject. This pattern is continued until enough subjects are available for the sample.

Exponential Non-Discriminative Snowball Sampling: In this type, the first subject is recruited and then
he/she provides multiple referrals. Each new referral then provides more referrals, and so on, until there
are enough subjects for the sample.

Exponential Discriminative Snowball Sampling: In this technique, each subject gives multiple referrals;
however, only one subject is recruited from each referral. The choice of a new subject depends on the
nature of the research study.

Snowball Sampling Method

The nature of snowball sampling is such that it cannot be considered for a representative sample, or for
statistical studies. However, this sampling technique can be used extensively for conducting qualitative
research with a population that is hard to locate. Let us now explore how snowball sampling can be
carried out. Consider, hypothetically, that you as a researcher are studying the homeless in Texas City.
It is obviously difficult to find a list with the details of all the homeless people there. However, you are
able to identify one or two homeless individuals who are willing to participate in your research studies.

Now, these homeless individuals provide you with the details of other homeless individuals they know.
The first homeless individual that you found for your research is the primary data source. You can collect
the information and tabulate data from the primary data source and move on to other individuals whom
the primary data source has referred to. You as a researcher can continue to tap as many homeless
people as you can find through the references provided, till you know you have collected enough data
for your research. The same strategy can be followed to conduct research on or study individuals
belonging to certain underground subcultures, individuals who have a hidden identity, or members of a
cult, etc., who don’t want to be identified easily. Trust is an important part of any research. An individual
who is ready to share information needs to know that the information will be used discreetly, and this
kind of trust is especially important in snowball sampling. For a participant to agree to identify
themselves or their group, researchers first need to develop that kind of rapport with the participants.
Please know that this sampling technique may consume more time than anticipated because of its nature.

Snowball sampling analysis is conducted once the respondents submit their feedback and opinions. The
data collected can be qualitative or quantitative in nature, and can be represented in graphs and charts
on the online survey software dashboard such as the one provided by QuestionPro.

Snowball Sampling Applications


Snowball sampling is usually used in cases where there is no precalculated list of target population
details (homeless people), where it is very difficult to contact members of the target population (victims
of rare diseases), where members of the target population are not inclined to contribute due to a social
stigma attached to them (hate-crime, rape or sexual abuse victims, sexuality, etc.), or where
confidentiality is demanded by the organization respondents work for (CIA, FBI or a terrorist organization).

Medical Practices: There are many less-researched diseases. There may be a restricted number of
individuals suffering from diseases such as progeria, porphyria, Alice in Wonderland syndrome, etc. Using
snowball sampling, researchers can get in touch with these hard-to-contact sufferers and convince them
to participate in the survey research.

Social research: Social research is a field which requires as many participants as possible, as it is a
process where scientists learn about their target sample. Snowball sampling is useful when social
research is to be conducted in domains where participants might not necessarily be willing to contribute,
such as the homeless or the less fortunate.

Cases of discord: In case of disputes such as an act of terrorism, violation of civil rights and other similar
situations, the individuals involved may oppose giving their statements for evidential purposes. The
researchers or management can use snowball sampling, to filter out those people from a population
who are most likely to have caused the situation or are witness to the event to gather proof around the
event.

Snowball Sampling Examples

For some population, snowball sampling is the only way of collecting data and meaningful information.
Following are the instances, where snowball sampling can be used:

No official list of names of the members: This sampling technique can be used for a population for which
there is no easily available data, such as demographic information. For example, homeless people, or the
members of an elite club, whose personal details cannot be obtained easily.

Difficulty to locate people: People with rare diseases are quite difficult to locate. However, if a
researcher is carrying out a research study similar in nature, finding the primary data source can be a
challenge. Once he/she is identified, they usually have information about more such similar individuals.

People who are not willing to be identified: If a researcher is carrying out a study which involves
collecting information/data from sex workers or victims of sexual assault or individuals who don’t want
to disclose their sexual orientations, these individuals will fall under this category.

Secretiveness about their identity: People who belong to a cult or are religious extremists or hackers
usually fall under this category. A researcher will have to use snowball sampling to identify these
individuals and extract information from them.

Advantages of Snowball Sampling


It’s quicker to find samples: Referrals make it easy and quick to find subjects, as they come from reliable
sources. This saves the researcher an additional task, and the time saved can be used in conducting the study.

Cost effective: This method is cost effective, as the referrals are obtained from a primary data source. It
is convenient and not as expensive as other methods.

Sample hesitant subjects: Some people do not want to come forward and participate in research
studies because they don’t want their identity to be exposed. Snowball sampling helps in this situation,
as subjects are asked for references to people known to them. There are some sections of the target
population which are hard to contact. For example, if a researcher intends to understand the difficulties
faced by HIV patients, other sampling methods will not be able to provide these sensitive samples. In
snowball sampling, researchers can closely examine and filter members of a population infected by HIV
and conduct research by talking to them, making them understand the objective of the research and,
eventually, analyzing the received feedback.

Disadvantages of Snowball Sampling

Sampling bias and margin of error: Since people refer those whom they know and who have similar traits,
this sampling method can have a potential sampling bias and margin of error. This means a researcher
might only be able to reach out to a small group of people and may not be able to complete the study
with conclusive results.

Lack of cooperation: There are fair chances that, even after referrals, people might not be cooperative
and might refuse to participate in the research studies.

Q4. Differentiate between qualitative and quantitative techniques of sampling.

Ans- Difference between qualitative and quantitative

Quantitative research investigates a large number of people by submitting questionnaires based on
multiple-choice, numeric answers (0 to 10) and open-ended questions (open answers, just a few in a
quantitative questionnaire). Qualitative research investigates a small number of people, by physically
presenting them with the product itself, thus collecting a great number of behavioral details on a small
sample of users. I realize that most people get confused about the difference between qualitative and
quantitative research, even those who have been working in the sector for years, and even more so the
brand, my client’s client, who is quite unfamiliar with it. The red line that divides these branches is very
thin in some ways but often like the Chinese wall in others. Qualitative research identifies abstract
concepts while quantitative research collects numerical data.

Qualitative and quantitative research

Qualitative research

Qualitative research is a type of empathic, empirical, exploratory, direct, physical research. It helps you
understand the reasons, motivations, opinions and trends that hide behind the more quantitative data of
quantitative research. The most commonly used method for qualitative research is the F2F (face to face),
the so-called focus group, where a small sample of respondents gets interviewed for a long time, even
hours, in front of a one-way mirror, behind which the brand and research institute observe and listen.
The F2F gets video-recorded and then transcribed as storytelling, in images and tales.

Quantitative research

As the word itself says, quantitative research helps you quantify, using numeric data or just data that can
easily be transformed into statistics, and it measures the behavior, opinions and attitudes of a large
sample of respondents. Let’s say that we must have interviewed at least 30 people to talk about
“quantitative”, but there are usually many more than that. Quantitative research can expand its scope if
the brand is a multinational, by implementing multi-country investigations. The more data you obtain,
the more accurate the statistics will be. Methods of collecting quantitative data are mainly CATI
(Computer-Assisted Telephone Interviewing), i.e. telephone interviews, and CAWI (Computer-Assisted
Web Interviewing), i.e. online questionnaires, both lasting approximately 7-10 minutes. Questions often
require a rating from 0 to 10. You can measure a level of satisfaction with a product, buying frequency,
brand awareness, market segments, and so on. Data will then be transcribed into numbers, graphs, and
statistics.

Usually the tandem metaphor is: qualitative research is on the top and it identifies the problem and
clarifies the objective that will be further investigated by quantitative research which is sitting behind.

Theory / research ratio

Qualitative research: Inductive setting that is articulated in the context of “discovery”, the researcher
rejects the formulation of theories. Theory and research work simultaneously.

Quanti: Sequential phases, based on a deductive approach that is articulated in the context of
“justification”. The theory precedes the research.

Concepts

Quali: They seek to find the character of uniqueness.

Quanti: Definitive and operative, they are the theory and are converted from the beginning into
variables.

Relationship with the studied environment

Quali: (active subject) Naturalistic approach: space and actions are analyzed in the present time during
the research.

Quanti: (passive subject) Experimental approach: the subject is not responsive but this is not a problem.

Interaction researcher/respondent
Quali: Essential, it is necessary that empathy arises between the two parts.

Quanti: Almost absent, the interviewer must be warm and human but must not interact outside the
questionnaire.

Search design

Quali: Without a structure, open, in search of unexpected options, it gets modified in progress.

Quanti: Closed structure, planned in advance.

Representativeness of the respondent

Quali: Inexistent. Different information is gathered at different levels of depth.

Quanti: It is necessary to use representative samples.

Uniformity of the detection instrument

Quali: Absent. Not necessarily always the same.

Quanti: It is necessary to use a standard.

Nature of data

Quali: Soft: Data collected in their integrity, subjective.

Quanti: Hard: objective and standardized data.

Type of respondent

Quali: Unique individual.

Quanti: Variable individual.

Type of analysis

Quali: Case based, a holistic perspective on human behavior.

Quanti: Variable based, mathematical and statistical techniques.

Presentation of data

Quali: Quotes, narrative-style extracts, to show reality as it was experienced during the study.

Quanti: Tables and graphs, statistics, analysis and comparison with data obtained and data from past
years and with estimates.
Generalization

Quali: Absent. Identification of Weberian ideal types, interpretation of reality.

Quanti: Necessary. Individual fragmentation, correlation between variables, conceptual unit in the
random model.

Scope of results

Quali: Limited number of cases.

Quanti: Significant number, representativity.

Methodology

Quali: Observation of the respondent in the focus room, interviews with privileged witnesses.

Quanti: Structured questionnaire for CATI, CAWI or PAP.

_____________________________________________________________________________________

UNIT-5
Q1. Write down the use place of correlation and regression.

Ans- Correlation Analysis

Correlation analysis is applied in quantifying the association between two continuous variables, for
example, a dependent and an independent variable, or two independent variables.

Regression Analysis

Regression analysis refers to assessing the relationship between the outcome variable and one or more
variables. The outcome variable is known as the dependent or response variable, and the risk factors
and confounders are known as predictors or independent variables. The dependent variable is denoted
by “y” and the independent variables by “x” in regression analysis.

The sample correlation coefficient is estimated in the correlation analysis. It ranges between -1 and
+1, is denoted by r, and quantifies the strength and direction of the linear association between two variables.
The correlation between two variables can be either positive, i.e. a higher level of one variable is related
to a higher level of the other, or negative, i.e. a higher level of one variable is related to a lower level of
the other.

The sign of the coefficient of correlation shows the direction of the association. The magnitude of the
coefficient shows the strength of the association.

For example, a correlation of r = 0.8 indicates a positive and strong association between two variables,
while a correlation of r = -0.3 shows a negative and weak association. A correlation near zero shows
the absence of a linear association between two continuous variables.

Correlation and Regression Differences

There are some differences between Correlation and regression:

Correlation quantifies the degree to which two variables are associated. It does not fit a line
through the data points. You compute a correlation coefficient that shows how much one variable changes
as the other changes. When r is 0.0, the relationship does not exist. When r is positive, one
variable goes up as the other goes up. When r is negative, one variable goes down as the other goes
up.

Linear regression finds the best line that predicts y from x, but Correlation does not fit a line.

Correlation is used when you measure both variables, while linear regression is mostly applied when x is
a variable that is manipulated.

Correlation and Regression Statistics

The degree of association is measured by “r”, named after its originator (Pearson), and is a measure of
linear association. Other, more complicated measures are used if a curved line is needed to represent
the relationship.

The coefficient of correlation is measured on a scale that varies from +1 through 0 to -1. Complete
correlation between two variables is represented by either +1 or -1. The correlation is positive when one
variable increases and so does the other, and negative when one decreases as the other increases.
The absence of correlation is described by 0.
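
To make the coefficient concrete, here is a minimal sketch (NumPy assumed; the x and y values are hypothetical) computing r:

```python
# Pearson correlation coefficient for two hypothetical variables.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # e.g. advertising spend
y = np.array([2.1, 3.9, 6.2, 8.0, 9.8])  # e.g. resulting sales

r = np.corrcoef(x, y)[0, 1]  # off-diagonal entry of the 2x2 matrix
print(round(r, 3))           # close to +1: strong positive association
```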

Q2. What is the role of statistics in business forecasting?

Ans- Statistical Methods of Business Forecasting

Various statistical forecasting methods exist designed for use with slow-moving products, new product
introductions, stable mature products and products with erratic demand. Determining which statistical
forecasting method works best for a product often boils down to trial and error. Because of the
confusion surrounding the method(s) to use, some companies bring in forecasting experts to help
analyze data and determine where to start the forecasting process.

Basics

When a company uses statistical sales forecasting techniques, it uses its historical sales or demand data
to try to predict future sales. Because of the complex mathematical formulas used to create the
forecast, most companies rely on advanced software to accomplish this task. Each type of demand
requires a different statistical method to best predict the future forecast.
Seasonal Models

A number of seasonal forecasting methods exist. Seasonal forecasting methods, such as Box-Jenkins,
Census X-11, decomposition and Holt-Winters exponential smoothing models, all utilize the seasonal
component of a product's demand profile as a major input to determine the future forecast. Seasonality
represents a trend that repeats during specific periods. For example, dining room tables exhibit high
seasonal demand in the months leading up to Thanksgiving and Christmas.

Simple models

Businesses that don’t have advanced forecasting software often rely on simple forecasting models
managed in a spreadsheet. Some of these methods include Holt’s double exponential smoothing,
adaptive exponential smoothing, the weighted moving average and the very common moving average
method. Although an easy-to-use model, the moving average method fails to alert a business to future
trends in a product’s data. The moving average only shows trends already formed. Each time a new
period gets added to the moving average formula, the last period gets removed; thus the whole time
series “moves” forward one period.
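
The mechanics of the moving average are easy to see in code; a minimal sketch with hypothetical demand data:

```python
# Simple moving average: each new period pushes the oldest one out of the
# window, so the whole series "moves" forward one period at a time.
def moving_average(series, window=3):
    return [sum(series[i - window + 1 : i + 1]) / window
            for i in range(window - 1, len(series))]

monthly_sales = [100, 120, 110, 130, 150, 140]   # hypothetical data
print(moving_average(monthly_sales, window=3))   # [110.0, 120.0, 130.0, 140.0]
```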

New Product Models

Forecasting new products remains one of the toughest forecasting tasks available. New product
forecasting requires input from human and computer generated sources. New product forecasting
methods, such as Gompertz curve and Probit curve, seek to manage the high ramp up period associated
with a new product introduction. These methods also work for maturing products approaching the end
of their life cycle.

Slow-Moving Models

Products that exhibit slow-moving demand or have sporadic demand require a specific type of statistical
forecast model. Croston’s intermittent model works for products with erratic demand. Products with
erratic demand do not exhibit a seasonal component; instead, a graph of the product’s demand shows
peaks and flat periods at intermittent points along the time series. The goal of Croston’s model is to
provide a safety stock value instead of a forecast value. The safety stock value allows for just enough
inventory to cover needs.

Q3. Explain the main components of time series data.

Ans- The Components of Time Series

The factors that are responsible for bringing about changes in a time series, also called the components
of time series, are as follows:

1. Secular Trends (or General Trends)
2. Seasonal Movements
3. Cyclical Movements
4. Irregular Fluctuations

Secular Trends

The secular trend is the main component of a time series which results from long term effects of socio-
economic and political factors. This trend may show the growth or decline in a time series over a long
period. This is the type of tendency which continues to persist for a very long period. Prices and export
and import data, for example, reflect obviously increasing tendencies over time.

Seasonal Trends

These are short term movements occurring in data due to seasonal factors. The short term is generally
considered as a period in which changes occur in a time series with variations in weather or festivities.
For example, it is commonly observed that the consumption of ice-cream during summer is generally
high and hence an ice-cream dealer’s sales would be higher in some months of the year while relatively
lower during winter months. Employment, output, exports, etc., are subject to change due to variations
in weather. Similarly, the sale of garments, umbrellas, greeting cards and fire-works are subject to large
variations during festivals like Valentine’s Day, Eid, Christmas, New Year’s, etc. These types of variations
in a time series are isolated only when the series is provided biannually, quarterly or monthly.

Cyclic Movements

These are long term oscillations occurring in a time series. These oscillations are mostly observed in
economic data, and the periods of such oscillations generally extend from five to twelve years or
more. These oscillations are associated with the well-known business cycles. These cyclic movements
can be studied provided a long series of measurements, free from irregular fluctuations, is available.

Irregular Fluctuations

These are sudden changes occurring in a time series which are unlikely to be repeated. They are
components of a time series which cannot be explained by trends, seasonal or cyclic movements. These
variations are sometimes called residual or random components. These variations, though accidental in
nature, can cause a continual change in the trends, seasonal and cyclical oscillations during the
forthcoming period. Floods, fires, earthquakes, revolutions, epidemics, strikes etc., are the root causes
of such irregularities.
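
These components can be separated in practice with a decomposition routine; a minimal sketch (statsmodels assumed, with hypothetical monthly sales showing a summer peak):

```python
# Split a monthly series into trend, seasonal and residual (irregular) parts.
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

sales = pd.Series(
    [30, 32, 45, 60, 80, 95, 100, 92, 70, 50, 35, 31] * 2,  # two years
    index=pd.date_range("2019-01-01", periods=24, freq="MS"),
)

result = seasonal_decompose(sales, model="additive", period=12)
print(result.trend.dropna().head())  # secular (long-term) movement
print(result.seasonal.head(12))      # repeating 12-month seasonal pattern
```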

Q4. Calculate the correlation coefficient for the following heights (in inches) of fathers (X) and
their sons (Y).

X: 65 66 67 67 68 69 70 72

Y: 67 68 65 68 72 72 69 71

Ans- Using Spearman's rank correlation, where d is the absolute difference between the two ranks:

Height (Father) | Rank | Height (Son) | Rank | d | d²
65 | 8 | 67 | 7 | 1 | 1
66 | 7 | 68 | 5 | 2 | 4
67 | 5 | 65 | 8 | 3 | 9
67 | 5 | 68 | 5 | 0 | 0
68 | 4 | 72 | 1 | 3 | 9
69 | 3 | 72 | 1 | 2 | 4
70 | 2 | 69 | 4 | 2 | 4
72 | 1 | 71 | 3 | 2 | 4

n = 8, Σd² = 35

r = 1 − 6Σd² / (n(n² − 1)) = 1 − (6 × 35) / (8 × (64 − 1)) = 1 − 210/504 = 0.58

Since r = 0.58, the variables X and Y are positively correlated, i.e. the heights of fathers and their
respective sons are positively correlated.
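
The hand calculation can be compared against SciPy; note that scipy.stats.spearmanr assigns averaged ranks to ties (e.g. 5.5 and 5.5 rather than 5 and 5), so its value differs slightly from the whole-rank calculation above:

```python
# Spearman rank correlation for the fathers' and sons' heights.
from scipy.stats import spearmanr

X = [65, 66, 67, 67, 68, 69, 70, 72]
Y = [67, 68, 65, 68, 72, 72, 69, 71]

rho, p_value = spearmanr(X, Y)
print(round(rho, 2))  # about 0.68 with averaged ranks, vs 0.58 above
```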

__________________________________________________________________________________
