Data Collection and Analysis: Interpretation and Providing Solution


Chapter 4

Data Collection and Analysis: Interpretation and Providing Solution
Chapter Objectives
By the end of this chapter, students will be able to:
• Describe the importance of carefully planning the data collection, including how to develop data collection instruments and the scales of measurement with their validity and reliability
• Describe the tools of research with respect to data collection and analysis
• Describe analysis with respect to quantitative, qualitative and design science research
• Describe what interpretation and discussion mean
Developing Data collection Instruments
• Two general approaches (when the respondents are people)
1. Adopting an instrument from similar previous research
• Mention how it was adopted and which part
• This has two advantages
• there may be evidence validating it, and
• you can compare your results with previous results.
2. Crafting the instrument yourself, in line with the specific objectives
• There should be no orphan questions or objectives
• Run a pilot study to validate the instrument
• Describe the procedures for accessing and acquiring data (in case of non-human data sources)

4.1. Data Collection – Quantitative & Qualitative
• Data collection comes after
• developing the instruments
• for both the pilot and the main (actual) study
• determining the data sources
• in case of experimental, analytical or predictive research
• setting the objectives of the design artifact, and later again after the construction of the artifact
Actual Data Collection
• One should carefully plan the data collection, as this is the point of departure for executing the research
• Pre-data collection
• Training of data collectors might be crucial
• Supporting letters might be necessary
• Post-data collection tasks, such as editing returned questionnaires

Cont…
• The data you have collected may be presented using
• Tabular methods
• Graphical methods
• Presentation is followed by analysis
• This strict sequence holds for quantitative research but not for qualitative research
• In qualitative research you collect, analyze, collect again…
• until “saturation”
• Analysis could start
• by arranging and presenting the data
• by description (as in NLP and DM)
• by structuring requirements

Exploring and Organizing a Data Set
• Look closely at your data and explore various ways of
organising them – detect patterns
• Example : reading test scores of 11 children.
Ruth, 96; Robert, 60; Chuck, 68; Margaret, 88; Tom, 56; Mary, 92; Ralph,
64; Bill, 72; Alice, 80; Adam, 76; Kathy, 84
What do you see? Arrange the data and look for patterns
• Alphabetical arrangement: look for meaning or pattern
• Observable fact:
• the highest score was earned by a girl and the lowest score by a boy; although silly and meaningless, it is an observable fact
• Symmetrical pattern: arrange by sex, girls vs. boys
• the girls’ scores increase as we proceed through the alphabet, and the boys’ scores decrease
• Every researcher should be able to provide a clear, logical rationale for the procedure used to arrange and organise the data, because the arrangement affects the meaning that those data reveal
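The two arrangements described above can be reproduced with a short Python sketch. The scores come from the example; the split into girls and boys is an assumption inferred from the names, since the slide does not list it explicitly.

```python
# Reading test scores from the example above.
scores = {
    "Ruth": 96, "Robert": 60, "Chuck": 68, "Margaret": 88, "Tom": 56,
    "Mary": 92, "Ralph": 64, "Bill": 72, "Alice": 80, "Adam": 76, "Kathy": 84,
}

# Assumption: grouping by sex is inferred from the names, not given in the data.
girls = {"Ruth", "Margaret", "Mary", "Alice", "Kathy"}
boys = set(scores) - girls

def alphabetical(names):
    """Return (name, score) pairs sorted alphabetically by name."""
    return [(name, scores[name]) for name in sorted(names)]

def steps(pairs):
    """Differences between consecutive scores in the given order."""
    values = [score for _, score in pairs]
    return [b - a for a, b in zip(values, values[1:])]

girls_sorted = alphabetical(girls)
boys_sorted = alphabetical(boys)

print(girls_sorted, steps(girls_sorted))  # scores rise in equal steps
print(boys_sorted, steps(boys_sorted))    # scores fall in equal steps
```

Changing the sort key (by name, by score, by group) changes which pattern becomes visible, which is why the arrangement procedure needs a stated rationale.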
Drawing Conclusions from the Data
• Questions from the example
• Why were all the scores of the girls higher than those of the
boys?
• Why is this algorithm working better?
• Why were the intervals between each of the scores
equidistant for both boys and girls?
• Knowledge springs from questions like these
• But one must be careful not to make snap judgments
• Even the most thorough research can go astray (wrong) at the point of drawing conclusions
• In the example, one might conclude, without thinking carefully, that girls read better than boys
• Reading is a complex and multifaceted skill
Tabular Methods of Data Presentation
• Tabulated data can be more easily understood than facts
• They help facilitate statistical treatment of data
• When data are tabulated, all unnecessary details and
repetitions are avoided.
• Type of tables
• Simple (one way) table: shows one characteristic
• Two-way table: shows two characteristics
• Higher order table: shows three or more characteristics
Tabular method - Frequency distributions
• Steps:
• Begin by arranging the data from smallest to largest
• Count values that repeat by making tallies
• Group observations of comparable magnitude into classes
• Stop the classification when you are sure that the first and the last classes contain the smallest and largest values, respectively.
Cont…
• Indicate how many values are included in a class
• Note: If the number of classes k has been fixed, then class width
may be fixed as w = range/k
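As a rough illustration of the steps and the class-width rule w = range/k, the sketch below groups the eleven reading scores used earlier into k = 5 classes. The function name `frequency_table` and the choice of k are assumptions made for the example; the sketch also assumes the data are not all identical (otherwise the width would be zero).

```python
def frequency_table(data, k):
    """Group data into k equal-width classes; return (lower, upper, count) rows."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / k          # class width w = range / k
    counts = [0] * k
    for x in data:
        # Classes are [lower, upper); the largest value is put in the last class.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, counts[i]) for i in range(k)]

reading_scores = [96, 60, 68, 88, 56, 92, 64, 72, 80, 76, 84]
table = frequency_table(reading_scores, k=5)

for lower, upper, count in table:
    print(f"{lower:5.1f} to {upper:5.1f}: {count}")
```

The first class starts at the smallest value and the last class ends at the largest, matching the stopping rule in the steps.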
Graphical Methods Of Data Presentation
• Data in a frequency distribution can be presented graphically
or diagrammatically
• Graphs are the natural choice to represent continuous data
• For discrete or qualitative data, we have
• Pie chart (multiply relative frequency by 360°), Pictogram (use of pictures) or Bar graph (class limit and absolute frequency)
• For continuous data, we have
• Histogram (class boundary and abs. freq.), Frequency polygon
(Class mark and abs freq.), Cumulative frequency graph (class
mark and cumulative frequency) – also known as Ogive
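For the pie chart rule mentioned above, a minimal sketch of the angle calculation follows. The category labels and counts are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical category counts, for illustration only.
counts = {"Yes": 5, "No": 3, "Undecided": 2}
total = sum(counts.values())

# Sector angle = relative frequency x 360 degrees.
angles = {label: count / total * 360 for label, count in counts.items()}

for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```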
Summarizing Data Numerically
• Measures of Central Tendency
• describe the characteristics of a frequency distribution
• We can calculate the
• Mean, Median, and Mode for ungrouped data
• Mean, Median, and Mode for grouped data
• We need to determine how representative the average is as
a description of a given set of data.
• We need to calculate : Range (very simple), Quartile Deviations;
Mean Deviations, Standard Deviations
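The measures listed above can be computed for ungrouped data with Python's standard `statistics` module. This is only a sketch using the eleven reading scores from the earlier example; the quartile method (the module's default, "exclusive") is one of several conventions, so quartile deviations from other textbooks or tools may differ slightly.

```python
import statistics

reading_scores = [96, 60, 68, 88, 56, 92, 64, 72, 80, 76, 84]

# Measures of central tendency.
mean = statistics.mean(reading_scores)
median = statistics.median(reading_scores)
modes = statistics.multimode(reading_scores)  # every value is a mode if none repeats

# Measures of dispersion: how representative is the average?
data_range = max(reading_scores) - min(reading_scores)
q1, q2, q3 = statistics.quantiles(reading_scores, n=4)
quartile_deviation = (q3 - q1) / 2
mean_deviation = sum(abs(x - mean) for x in reading_scores) / len(reading_scores)
sample_std = statistics.stdev(reading_scores)       # divides by n - 1
population_std = statistics.pstdev(reading_scores)  # divides by n

print(mean, median, data_range, quartile_deviation)
print(round(mean_deviation, 2), round(sample_std, 2), round(population_std, 2))
```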
Estimation and Hypothesis Testing
• Two important problems of statistical inference are:
• Estimation of parameters (point estimation and interval
estimation) and
• Test of hypothesis
Cont…
• Estimation of parameters
• producing sound and reasonable substitutes for unknown parameters of a population (mean, variance, correlation coefficient, etc.)
• Test of Hypothesis
• The problem in hypothesis testing refers to a speculation made about the value of an unknown parameter of a distribution.
• We can test a hypothesis based on sample data.
• Procedures to follow in Tests of Hypothesis
• The assumption about the parameter is called the null hypothesis, and is denoted by H0
• The counter hypothesis is known as the alternative hypothesis and is denoted by H1
• There should always be a level of significance α in testing a hypothesis (the probability of rejecting the null hypothesis when it is actually true)
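To make point estimation, interval estimation and the H0/H1/α procedure concrete, here is a minimal sketch of a two-sided one-sample z-test. It rests on assumptions that are not in the slides: the sample values are invented, the population standard deviation `sigma` is treated as known, and the normal approximation is taken to be adequate. With an unknown standard deviation and a small sample, a t-test would normally be used instead.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test of H0: population mean == mu0."""
    n = len(sample)
    standard_error = sigma / sqrt(n)
    point_estimate = mean(sample)

    # Interval estimate: point estimate +/- critical value * standard error.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    interval = (point_estimate - z_crit * standard_error,
                point_estimate + z_crit * standard_error)

    # Test statistic and two-sided p-value computed under H0.
    z = (point_estimate - mu0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return point_estimate, interval, z, p_value, p_value < alpha

# Hypothetical sample; H0: mu = 50, H1: mu != 50, alpha = 0.05.
sample = [52, 48, 55, 51, 49, 53, 50, 54]
estimate, interval, z, p_value, reject_h0 = z_test(sample, mu0=50, sigma=4)

print(estimate, interval, round(z, 3), round(p_value, 3), reject_h0)
```

In this made-up case the interval contains 50 and H0 is not rejected, which shows how the interval estimate and the test lead to the same decision at the same α.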
Test of Associations
• You can also test the relationship between two
variables. We have
• Test of independence
• Testing association between two variables
• Chi square test
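• You have to know how to find the calculated value
$\chi^2 = \sum \frac{(o - e)^2}{e}$, where o is an observed frequency and e is the corresponding expected frequency
• You have to know how to refer to the chi square table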
• You have to know how to make conclusions
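The three steps (calculated value, table lookup, conclusion) can be sketched for a test of independence as follows. The 2×2 table of observed counts is hypothetical; the critical value 3.841 is the standard chi square table entry for 1 degree of freedom at α = 0.05.

```python
def chi_square_statistic(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence: row total * column total / n.
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e
    return chi2

# Hypothetical observed counts for two variables with two categories each.
observed = [[30, 20],
            [20, 30]]

chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_5PCT_DF1 = 3.841  # chi square table value for df = 1, alpha = 0.05
reject_independence = chi2 > CRITICAL_5PCT_DF1

print(chi2, df, reject_independence)
```

If the calculated value exceeds the table value, the hypothesis that the two variables are independent is rejected at the chosen level of significance.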
Analysis
• Quantitative
• Statistical descriptive and inferential or experiments
• Understanding numbers is very important
• Parameter /Variable settings (assumptions..)
• Qualitative
• Understanding meanings through pattern matching,
content analysis, time series analysis … using coding for
• Providing “Thick” description is important including rival
explanations.
• Design science
• Design and Construction of an artifact
• Following standards and principles is important
• May need input from qualitative type of analysis

Interpretation and Discussion
• Explain the results in light of previous literature and theories.
• There is no clear distinction between the two in the case of qualitative research
• Involves demonstration and evaluation in the case of design research
• May require collecting and analysing data with the same procedures as we have seen before
• Interpreting the data means:
1. Relating the findings to the original research problem and
to the specific research questions and hypotheses.
• Researchers must eventually come full circle to their
starting point – why they conducted a research study in the
first place and what they hoped to discover – and relate
their results to their initial concerns and questions.

Cont…
2. Relating the findings to preexisting literature, concepts,
theories, and research studies.
• To be useful, research findings must in some way be
connected to the larger picture – to what people already
know or believe about the topic in question.
• Perhaps the new findings confirm a current theoretical
perspective, perhaps they cast doubt on common
“knowledge”, or perhaps they simply raise new questions that
must be addressed before we can truly understand the
phenomenon in question
3. Determining whether the findings have practical significance
as well as statistical significance.
• Statistical significance is one thing; practical significance –
whether findings are actually useful – is something else
altogether.
Cont…
4. Identifying limitations of the study.
• Finally, interpreting the data involves outlining the
weaknesses of the study that yielded them.
• No research study can be perfect, and its imperfections inevitably cast at least a hint of doubt on its findings. Good researchers know this, and they report the weaknesses along with the strengths of their research. E.g., in design science
Exercise
• Read the following problem descriptions and
1. Craft a candidate general objective for each.
2. Identify possible data sources
3. Determine candidate data collection techniques and procedures
4. Propose data analysis techniques and procedures
Example 1: problem identification
• Distance learning puts learners in isolation, with a lack of observation by teachers and more freedom for learners
• Researchers in distance learning are interested in developing collaborative tools that support student interactions
• This is not sufficient; collaboration among tutors is also necessary for effective distance learning
• So far, no system supports tutors.

Cont…
Example 2: Amharic
• Due to the advent of the internet, many Amharic documents
are now available online. Additionally, the popular search
engine, Google, has provided an Amharic interface. However,
to date, no tolerant-retrieval mechanism based on spelling
correction has been employed for Amharic, and there is not even any
published prior work on spelling correction for the
Amharic language.
Objectives:
• Example 1- To develop a collaborative tool that supports
tutors in distance education.
• Example 2- To develop an Amharic spelling corrector to assist
in the development of tolerant-retrieval Amharic search
systems

The Tools of Research: with respect to data collection and analysis
• Tools are chosen to facilitate research tasks
• Be careful not to equate tools of research with the
methodology of research
• Research tool = a specific mechanism or strategy the researcher uses to collect, manipulate, or interpret data.
• Research methodology = the general approach the researcher takes in carrying out the research project; to some extent, this approach dictates the particular tools the researcher selects.
• There are six general tools of research:
• The library and its resources
• The computer and its software
• Techniques of measurement
• Statistics
• The human mind
• Language
1. The Library and Its Resources
• After selecting a research problem, the library is the FIRST place to clarify the dimensions of the problem
• Learn what others have done in the area or in corollary investigations, and gather ideas to sharpen the focus of the research
• Catalogue is the heart of the library – books, films, filmstrip,
tapes, phonograph records, maps, pictures, slides, CDs, …
• E.g., ACM Digital Library, UPM Online Database

UPM Library Catalogue


2. The Computer and its Software – Tool of Research
a. Taking advantage of the Internet
• The World Wide Web (WWW) is the world of knowledge.
• Web browsers: Mozilla Firefox, Netscape, IE, etc.
• Websites: journals, publishers, organisations, individuals, …
• Search engines: Google, Yahoo, AltaVista, etc.
b. E-mail
• Fast, and can reach an individual or a group of people.
• Asking questions of authors, experts, etc.
• Facilitates collaboration among people.
• Attached files (reports, etc.)
c. News
• List servers: e-discussion groups.
• Many groups with particular interests.
3. Measurement – Tool of Research
• Researchers strive for objectivity: not influenced by own
perceptions, impressions, and biases.
• Therefore, must identify systematic way of measuring a
phenomenon
• Measurement is quantifying of any phenomenon,
substantial or insubstantial, concrete or abstract, and
involves the comparison of the data being measured to a
pre-established standard.
• Quantifying means “how much”, “how many”, “to what degree”
• You think of the world and its manifestations through the data observed in terms of magnitude and significance.
• Example of moving from a concept to variables to a working definition:
Concept    Variable    Working definition
Rich       Income      If > $100,000
           Assets      If > $250,000
Scales of Measurement
• Four types of measurements
• nominal, ordinal, interval, and ratio
• Nominal scale
• means “name”: measures data by assigning names to data
• enables the classification of individuals, objects or responses into subgroups based on a common/shared property or characteristic.
• A variable measured on a nominal scale may have one, two or more subcategories depending upon the extent of variation.
• For example, variables with
• only one subgroup: ‘water’ or ‘tree’,
• two sub-categories: ‘gender’: male and female; children: boys & girls
• more than two sub-categories: ‘Hotels’
• The sequence in which subgroups are listed makes no difference as there is no relationship among subgroups.
Cont…
• Ordinal / Ranking scale:
• categorizes variables into subgroups on the basis of common
characteristic, and ranks the subgroups in a certain order.
• They are arranged either in ascending or descending order
according to the extent a subcategory reflects the magnitude of
variation in the variable.
• Think of the quantity being measured in terms of the symbol <
and >, higher or lower, greater or lesser, younger or older
• Always an asymmetrical relationship.
• Examples:
• Level of education: unschooled, primary, secondary, college,
graduate; Work force: unskilled, semi-skilled, skilled
• income: ‘above average’, ‘average’ and ‘below average’.
• The ‘distance’ between these subcategories is not necessarily equal, as there is no quantitative unit of measurement.
• ‘Socioeconomic status’ and ‘attitude’ are other variables
that can be measured on ordinal scale.
Cont…
• Interval:
• has all the characteristics of an ordinal scale.
• In addition, it uses a unit of measurement with arbitrary starting and terminating points.
• equal units of measurement: equal distance between each interval, e.g., 1, 2, 3; AND
• a zero point established arbitrarily.
• Examples:
• Thermometer: water freezes at 0 °C (32 °F) and boils at 100 °C (212 °F); temperature does not cease to exist at 0 degrees.
• A common use is the rating scale, e.g., a 5-point measurement of academic teaching effectiveness
• Can also determine mode, mean, standard deviation, t-test, F-test, product moment correlation
Cont…
• Ratio:
• A ratio scale has all the properties of nominal, ordinal and interval scales plus its own property: the zero point of a ratio scale is fixed, which means it has a fixed starting point.
• a scale that measures in terms of equal intervals and absolute
zero point of origin.
• Since the difference between intervals is always measured from
a zero point, this scale can be used for mathematical
operations.
• The measurement of variables like income ($0=$0), age, height
and weight are examples of this scale.
• Can also determine the geometric mean, the harmonic mean,
the percent variation and all other inferential statistical analysis
• Difference between interval and ratio:
• Interval: we cannot say 80 °F is twice as warm as 40 °F, because the Fahrenheit scale does not originate from a point of absolute zero. If it did, we could not measure temperatures below zero.
• Ratio:
• we can say a person who is 40 years old is twice as old as one who is 20 years old.
• true zero means “the total absence of the quantity being measured”; one cannot measure a negative distance
• Measurement Summary: If you can say that:
• One object is different from another, you have a nominal scale
• One object is bigger or better or more of anything than
another, you have an ordinal scale
• One object is so many units (degrees, inches) more than
another, you have an interval scale
• One object is so many times as big or bright or tall or heavy
as another, you have a ratio scale
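The interval versus ratio distinction can be checked numerically with the temperature example. Fahrenheit has an arbitrary zero, so a ratio of Fahrenheit readings is not meaningful; converting to Kelvin, which is a ratio scale with an absolute zero, gives the physically meaningful ratio. The helper name `fahrenheit_to_kelvin` is chosen for this sketch only.

```python
def fahrenheit_to_kelvin(f):
    """Convert degrees Fahrenheit to kelvin (a ratio scale with a true zero)."""
    return (f - 32) * 5 / 9 + 273.15

# Naive ratio on the interval scale: suggests "twice as warm".
ratio_fahrenheit = 80 / 40

# Ratio on a scale with an absolute zero: only about 8% warmer.
ratio_kelvin = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)

print(ratio_fahrenheit, round(ratio_kelvin, 3))
```

Differences, by contrast, are meaningful on both scales: 80 °F is 40 Fahrenheit degrees warmer than 40 °F regardless of where zero is placed.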
Measurement Considerations: Validity & Reliability
• Validity
• is concerned with the soundness and effectiveness of the measurement instrument
• Example: for a standardized test, what does the test measure? Does it, in fact, measure what it is supposed to measure? How well, how comprehensively, how accurately does it measure?
• Example: a scale that measures a professor’s availability,
• “always available”: what does “always” mean?
• Validity looks to the end results: are we really measuring what we think we are measuring?
• Reliability:
• is concerned with consistency: how consistently does the measure (test, instrument, inventory, questionnaire) measure what it is intended to measure?
4. Statistics as a Tool of Research
• All tools are more suitable for some purposes than for others
• Statistics can be a powerful tool when used correctly (for specific kinds of data and research questions)
• However, insisting on the use of statistics would deny the validity of research in non-quantitative investigation
• REMEMBER, the statistical values obtained are never the end of research nor the final answer to the research problem.
• The final question is “What do the data indicate?”, not what their numerical configuration is.
• Statistics give information about the data
• they are only calculated numbers that help in interpretation
• they CANNOT capture the nuances of the data.
• A researcher must discover the meaning of the data and its relevance to the research problem.
Primary Function of Statistics
• Descriptive Statistics:
• describe the relationship between variables;
• summarize the general nature of the data: average, variability, closeness of two or more characteristics, etc.
• E.g., frequencies, means, standard deviation
• E.g., How many men work at UoG?
• Inferential Statistics:
• make inferences about the population, based on a random sample.
• help in making decisions about the data: decide whether the differences observed between two groups in an experiment are large enough to be attributed to the experimental intervention rather than to a once-in-a-blue-moon fluke.
Cont…
• E.g., What risk factors most predict heart disease?
• Both involve summarizing data in some way and create entities that have no counterpart in reality.
• Example: students work 24, 22, 12, and 16 hours per week. The average is 18.5, but NO student works exactly 18.5 hours/week.
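The arithmetic in this example is easy to verify; the short sketch below computes the average and confirms that it is not one of the observed values.

```python
from statistics import mean

hours_per_week = [24, 22, 12, 16]
average = mean(hours_per_week)   # (24 + 22 + 12 + 16) / 4

# The average summarizes the group but matches no individual student.
is_observed = average in hours_per_week

print(average, is_observed)
```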
5. The Human Mind – Tool of Research
• Statistics can tell us the center, the spread, and the relationships of data BUT cannot interpret them and arrive at a logical conclusion or meaning.
• Only the mind can do that. The mind is the most important tool.
• Strategies to make use of the human mind to better understand include:
• Deductive logic
• Inductive reasoning
Cont…
• Scientific method
• Critical thinking
• Collaboration with others
Critical Thinking
• During the literature review (LR), don’t just accept research findings and theories at face value.
• Good researchers engage in critical thinking.
• Scrutinize for faulty assumptions, questionable logic, weaknesses in methodology, inappropriate statistical analyses, and unwarranted conclusions.
• Evaluate information or arguments in terms of their accuracy and worth.
• Critical thinking takes a variety of forms, depending on the context.
• Verbal reasoning: Understanding and evaluating the
persuasive techniques found in oral and written language.
Cont….
• Argument analysis: Discriminating between reasons that do and do
not support a particular conclusion.
• Decision making: Identifying and judging several alternatives and
selecting the best alternative.
• Critical analysis of prior research.
Critical Analysis of Prior Research
• Evaluating the value of data and research results in terms of the methods used to obtain them and their potential relevance to a particular conclusion.
• Consider these questions
• Was an appropriate method used to measure a particular outcome?
• Are the data and results derived from a relatively large number of people, objects, or events?
• Have other possible explanations or conclusions been eliminated?
Cont…
• Can the results obtained in one situation be reasonably
generalized to other situations?
Collaboration with Others
• A researcher has certain perspectives, assumptions, and theoretical biases, not to mention holes in knowledge about the subject matter, that limit the research approaches of a project.
• Bring in colleagues who have somewhat different perspectives, backgrounds, and areas of expertise: more cognitive resources to tackle the research problem and to find meaning.
• They can be equal partners or simply offer suggestions and advice.
• Typically, students are assigned an advisor or advisory committee.
• A prudent (careful) student selects a committee that will make a genuine contribution.
6. Language as a Tool of Research
• One of humankind’s greatest achievements: it facilitates communication and effective thinking.
• We can think more clearly and efficiently when we can represent thoughts with specific WORDS and PHRASES
• Words, even simple ones, can
• Reduce the world’s complexity and allow abstraction of the environment
• Facilitate generalization and inference drawing in new situations.
• Enhance the power of thought
• FINALLY: Make sure that your data collection and analysis is logical!
Developing Data collection Instruments
• Exercise
• A study is conducted to assess the effect of the provision of 1mg
folic acid per day to pregnant women on the birth weight of their
babies. In this study, is the researcher interested in quantitative or
qualitative results?
• If the above study was to observe the effect on the alleviation of
post-partum depression would your answer be different? Suggest
ways in which one can ‘measure’ post-partum depression.

Exploring and Organizing a Data Set
• Before employing any statistical procedure, develop the habit of
looking closely at your data and exploring various ways of
organising them, so as to detect patterns
• Example : reading test scores of 11 children.
Ruth, 96; Robert, 60; Chuck, 68; Margaret, 88; Tom, 56;
Mary, 92; Ralph, 64; Bill, 72; Alice, 80; Adam, 76; Kathy, 84
• What do you see? Arrange and look for patterns
Arranging The Data
• Alphabetical arrangement – look for meaning or pattern,
• Observable fact – highest score was earned by a girl and
that the lowest score was earned by a boy – although silly
and meaningless, it’s an observable fact, and it may come in
handy at a future time
• Careful researchers discover everything possible about their data,
whether the information is immediately useful or not
Arranging The Data…
• Symmetrical pattern: arrange by sex, girls vs. boys
• The graph shows dramatic trends –
• the girls’ scores increase as we proceed through the
alphabet, and the boys’ scores decrease
• The researcher should be aware of the dynamics, the
phenomena, that are active within the data, whether those
phenomena are important to the purpose of the research or
not
• Fundamental Guideline for Looking at the Data
• Whatever the researcher does with the data to prepare it for
inspection or interpretation will affect the meaning that
those data reveal
• Therefore, every researcher should be able to provide a clear,
logical rationale for the procedure used to arrange and
organise the data