
1. Population

Population in research is generally a large collection of individuals or objects that is the main focus of a scientific inquiry. It is for the benefit of the population that research is done. However, because populations are so large, researchers often cannot test every individual: doing so would be too expensive and time-consuming. This is why researchers rely on sampling techniques. A population can also be defined as a well-defined collection of individuals or objects known to have similar characteristics; all individuals or objects within a given population usually share a common, binding characteristic or trait.

2. Sampling

A sample is a subset of the population being studied. It represents the larger population and is used to draw inferences about that population. Sampling is a research technique widely used in the social sciences as a way to gather information about a population without having to measure every one of its members. There are several different ways of choosing a sample from a population, ranging from simple to complex.

2.1 Non-probability Sampling Techniques

Non-probability sampling is a sampling technique in which the samples are gathered in a process that does not give all the individuals in the population equal chances of being selected.

a) Reliance on Available Subjects. Relying on available subjects, such as stopping people on a street corner as they pass by, is one method of sampling, although it is extremely risky and comes with many cautions. This method, sometimes referred to as a convenience sample, does not allow the researcher any control over the representativeness of the sample. It is justified only if the researcher wants to study the characteristics of people passing by the street corner at a certain point in time, or if other sampling methods are not possible. The researcher must also take care not to generalize results from a convenience sample to a wider population.

b) Purposive or Judgmental Sample. A purposive, or judgmental, sample is one that is selected based on knowledge of the population and the purpose of the study. For example, a researcher studying the nature of school spirit as exhibited at a school pep rally might interview people who did not appear to be caught up in the emotions of the crowd, or students who did not attend the rally at all. In this case, the researcher is using a purposive sample because those being interviewed fit a specific purpose or description.

c) Snowball Sample. A snowball sample is appropriate to use in research when the members of a population are difficult to locate, such as homeless individuals, migrant workers, or undocumented immigrants. A snowball sample is one in which the researcher collects data on the few members of the target population he or she can locate, then asks those individuals to provide information needed to locate other members of that population whom they know. For example, if a researcher wishes to interview undocumented immigrants from Mexico, he or she might interview a few undocumented individuals that he or she knows or can locate and would then rely on those subjects to help locate more undocumented individuals. This process continues until the researcher has all the interviews he or she needs or until all contacts have been exhausted.

d) Quota Sample. A quota sample is one in which units are selected on the basis of pre-specified characteristics, so that the total sample has the same distribution of characteristics assumed to exist in the population being studied. For example, if you are a researcher conducting a national quota sample, you might need to know what proportion of the population is male and what proportion is female, as well as what proportions of each gender fall into different age categories, race or ethnic categories, educational categories, and so on. The researcher would then collect a sample with the same proportions as the national population.
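The selection rule behind a quota sample is easy to express in code. The Python sketch below uses hypothetical quota proportions and a simulated stream of respondents; all names and numbers are illustrative, not taken from the text above.

```python
import random

random.seed(1)

# Assumed (hypothetical) population proportions: 49% male, 51% female
quotas = {"male": 49, "female": 51}          # target counts for a sample of 100
counts = {"male": 0, "female": 0}
sample = []

# Hypothetical stream of available respondents
volunteers = [{"id": i, "gender": random.choice(["male", "female"])} for i in range(1000)]

for person in volunteers:
    g = person["gender"]
    if counts[g] < quotas[g]:                # accept only while this quota is still open
        sample.append(person)
        counts[g] += 1
    if len(sample) == sum(quotas.values()):  # stop once every quota is filled
        break

print(counts)   # counts match the assumed population proportions
```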

2.2 Probability Sampling Techniques

Probability sampling is a sampling technique in which the samples are gathered in a process that gives all the individuals in the population equal chances of being selected.

a) Simple Random Sample. The simple random sample is the basic sampling method assumed in statistical methods and computations. To collect a simple random sample, each unit of the target population is assigned a number. A set of random numbers is then generated, and the units having those numbers are included in the sample. For example, let's say you have a population of 1,000 people and you wish to choose a simple random sample of 50 people. First, each person is numbered 1 through 1,000. Then you generate a list of 50 random numbers (typically with a computer program), and the individuals assigned those numbers are the ones you include in the sample.

b) Systematic Sample. In a systematic sample, the elements of the population are put into a list and then every kth element in the list is chosen (systematically) for inclusion in the sample. For example, if the population of study contained 2,000 students at a high school and the researcher wanted a sample of 100 students, the students would be put into list form and then every 20th student would be selected for inclusion in the sample. To guard against any possible human bias in this method, the researcher should select the first individual at random. This is technically called a systematic sample with a random start.
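Both procedures can be sketched in a few lines of code. The following Python sketch reuses the made-up sizes from the examples above (1,000 people with n = 50 for the simple random sample; 2,000 students with n = 100 for the systematic sample):

```python
import random

random.seed(42)

population = list(range(1, 1001))            # 1,000 people numbered 1..1000

# Simple random sample: 50 units drawn with equal probability, without replacement
srs = random.sample(population, 50)

# Systematic sample with a random start: every k-th student from a list of 2,000
students = list(range(1, 2001))
k = len(students) // 100                     # sampling interval (k = 20 for n = 100)
start = random.randrange(k)                  # random start guards against human bias
systematic = students[start::k]

print(len(srs), len(systematic))             # 50 100
```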

c) Stratified Sample. A stratified sample is a sampling technique in which the researcher divides the entire target population into different subgroups, or strata, and then randomly selects the final subjects proportionally from the different strata. This type of sampling is used when the researcher wants to highlight specific subgroups within the population. For example, to obtain a stratified sample of university students, the researcher would first organize the population by college class and then select appropriate numbers of freshmen, sophomores, juniors, and seniors. This ensures that the researcher has an adequate number of subjects from each class in the final sample.
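A minimal Python sketch of proportional allocation across strata, using a hypothetical student population (the class sizes are invented for illustration):

```python
import random

random.seed(0)

# Hypothetical population of 1,000 university students grouped by class (the strata)
strata = {
    "freshman": list(range(400)),
    "sophomore": list(range(400, 700)),
    "junior": list(range(700, 900)),
    "senior": list(range(900, 1000)),
}

total = sum(len(units) for units in strata.values())
sample_size = 100
sample = []

# Draw from each stratum in proportion to its share of the population
for name, units in strata.items():
    n_stratum = round(sample_size * len(units) / total)
    sample.extend(random.sample(units, n_stratum))

print(len(sample))   # 100, with each class represented proportionally
```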

d) Cluster Sample. Cluster sampling may be used when it is either impossible or impractical to compile an exhaustive list of the elements that make up the target population. Usually, however, the population elements are already grouped into subpopulations, and lists of those subpopulations already exist or can be created. For example, let's say the target population in a study is church members in the United States. There is no list of all church members in the country. The researcher could, however, create a list of churches in the United States, choose a sample of churches, and then obtain lists of members from those churches.

3. Instrument

An instrument is the generic term that researchers use for a measurement device (survey, test, questionnaire, etc.). To help distinguish between instrument and instrumentation, consider that the instrument is the device and instrumentation is the course of action (the process of developing, testing, and using the device). Instruments fall into two broad categories, researcher-completed and subject-completed, distinguished by whether the instrument is administered by the researcher or completed by the participants. Researchers choose which type of instrument, or instruments, to use based on the research question.

3.1 Validity

Validity is the degree to which a research study measures what it intends to measure. There are two main types of validity, internal and external. Internal validity refers to the validity of the measurement and the test itself, whereas external validity refers to the ability to generalize the findings to the target population. Both are very important in analysing the appropriateness, meaningfulness and usefulness of a research study. However, here I will focus on the validity of the measurement technique (i.e. internal validity). There are four main types of validity used when assessing internal validity. Each type views validity from a different perspective and evaluates different relationships between measurements.

a) Face validity. This refers to whether a technique looks as if it should measure the variable it intends to measure. For example, a method in which a participant is required to click a button as soon as a stimulus appears, with this time measured, appears to have face validity for measuring reaction time. An example of analysing research for face validity is given by Hardesty and Bearden (2004).

b) Concurrent validity. This compares the results from a new measurement technique to those of a more established technique that claims to measure the same variable, to see if they are related. Often two measurements will behave in the same way but are not necessarily measuring the same variable, so this kind of validity must be examined thoroughly. An example, and some weaknesses associated with this type of validity, are discussed by Shuttleworth (2009).

c) Predictive validity. This is when the results obtained from measuring a construct can be used accurately to predict behaviour. There are obvious limitations to this, as behaviour cannot be predicted with complete accuracy, but this type of validity helps predict basic trends to a certain degree. A meta-analysis by van IJzendoorn (1995) examines the predictive validity of the Adult Attachment Interview.

d) Construct validity. This is whether the measurements of a variable in a study behave in exactly the same way as the variable itself. It involves examining past research regarding different aspects of the same variable. The use of construct validity in psychology is examined by Cronbach and Meehl (1955).

3.2 Reliability

Reliability refers to the consistency of scores or answers provided by an instrument. Errors of measurement refer to variations in the scores obtained by the same individuals on the same instrument.

a) Test-retest reliability. Test-retest reliability is a measure of reliability obtained by administering the same test twice, over a period of time, to a group of individuals. The scores from Time 1 and Time 2 can then be correlated in order to evaluate the test for stability over time. Example: A test designed to assess student learning in psychology could be given to a group of students twice, with the second administration perhaps coming a week after the first. The obtained correlation coefficient would indicate the stability of the scores.
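In practice, this amounts to computing a correlation coefficient between the two administrations. A small Python sketch with invented scores:

```python
import numpy as np

# Hypothetical scores from the same 8 students tested twice, one week apart
time1 = np.array([78, 85, 62, 90, 71, 88, 67, 80])
time2 = np.array([75, 88, 65, 92, 70, 85, 70, 78])

# Pearson correlation between the two administrations = test-retest reliability
r = np.corrcoef(time1, time2)[0, 1]
print(round(r, 3))   # values near 1 indicate stable scores over time
```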

b) Parallel forms reliability. Parallel forms reliability is a measure of reliability obtained by administering different versions of an assessment tool (both versions must contain items that probe the same construct, skill, knowledge base, etc.) to the same group of individuals. The scores from the two versions can then be correlated in order to evaluate the consistency of results across alternate versions.

Example: If you wanted to evaluate the reliability of a critical thinking assessment, you might create a large set of items that all pertain to critical thinking and then randomly split the questions up into two sets, which would represent the parallel forms.
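A minimal Python sketch of that procedure, using an invented item pool and simulated 0/1 responses (all numbers are illustrative):

```python
import random
import numpy as np

random.seed(3)

# Hypothetical pool of 40 critical-thinking items, identified by index
items = list(range(40))
random.shuffle(items)
form_a, form_b = items[:20], items[20:]      # random split into two parallel forms

# Hypothetical 0/1 responses of 10 examinees to all 40 items
responses = np.random.default_rng(3).integers(0, 2, size=(10, 40))

# Score each examinee on each form and correlate the two sets of scores
scores_a = responses[:, form_a].sum(axis=1)
scores_b = responses[:, form_b].sum(axis=1)
r = np.corrcoef(scores_a, scores_b)[0, 1]
print(round(r, 3))
```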

c) Inter-rater reliability. Inter-rater reliability is a measure of reliability used to assess the degree to which different judges or raters agree in their assessment decisions. Inter-rater reliability is useful because human observers will not necessarily interpret answers the same way; raters may disagree as to how well certain responses or material demonstrate knowledge of the construct or skill being assessed. Example: Inter-rater reliability might be employed when different judges are evaluating the degree to which art portfolios meet certain standards. Inter-rater reliability is especially useful when judgments can be considered relatively subjective. Thus, the use of this type of reliability would probably be more likely when evaluating artwork as opposed to math problems.
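Agreement between two raters is often summarized as simple percent agreement or, correcting for chance, as Cohen's kappa. A small Python sketch with invented ratings:

```python
import numpy as np

# Hypothetical ratings of 10 art portfolios by two judges (1 = meets standard, 0 = does not)
judge1 = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0])
judge2 = np.array([1, 0, 1, 0, 0, 1, 1, 1, 1, 0])

# Observed agreement: proportion of portfolios the judges rated identically
p_obs = np.mean(judge1 == judge2)

# Expected chance agreement, from each judge's marginal rating rates
p1, p2 = judge1.mean(), judge2.mean()
p_exp = p1 * p2 + (1 - p1) * (1 - p2)

# Cohen's kappa corrects observed agreement for chance agreement
kappa = (p_obs - p_exp) / (1 - p_exp)
print(round(p_obs, 2), round(kappa, 2))
```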

d) Internal consistency reliability. Internal consistency reliability is a measure of reliability used to evaluate the degree to which different test items that probe the same construct produce similar results. Average inter-item correlation is a subtype of internal consistency reliability. It is obtained by taking all of the items on a test that probe the same construct (e.g., reading comprehension), determining the correlation coefficient for each pair of items, and finally taking the average of all of these correlation coefficients. This final step yields the average inter-item correlation. Split-half reliability is another subtype of internal consistency reliability. The process of obtaining split-half reliability begins by splitting in half all items of a test that are intended to probe the same area of knowledge (e.g., World War II) in order to form two sets of items. The entire test is administered to a group of individuals, the total score for each set is computed, and finally the split-half reliability is obtained by determining the correlation between the two total set scores.
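Both subtypes can be computed directly from an item-by-person score matrix. A Python sketch with simulated 0/1 item responses (the data are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical 0/1 answers of 30 examinees to 20 items probing the same topic
items = rng.integers(0, 2, size=(30, 20))

# Split-half reliability: split the items into two halves (here, odd vs. even positions),
# total each half, and correlate the two sets of half-test scores
half1 = items[:, 0::2].sum(axis=1)
half2 = items[:, 1::2].sum(axis=1)
r_halves = np.corrcoef(half1, half2)[0, 1]

# Average inter-item correlation: mean of the correlations between every pair of items
item_corr = np.corrcoef(items, rowvar=False)      # 20 x 20 item correlation matrix
upper = item_corr[np.triu_indices(20, k=1)]       # keep each item pair once
avg_inter_item = upper.mean()

print(round(r_halves, 3), round(avg_inter_item, 3))
```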

4. Measurement Scales

Statistical information, including numbers and sets of numbers, has specific qualities that are of interest to researchers. These qualities, including magnitude, equal intervals, and absolute zero, determine what scale of measurement is being used and therefore what statistical procedures are best. Magnitude refers to the ability to know whether one score is greater than, equal to, or less than another score. Equal intervals means that the possible scores are each an equal distance from each other. Finally, absolute zero refers to a point where none of the scale exists or where a score of zero can be assigned.

When we combine these three scale qualities, we can determine that there are four scales of measurement. The lowest level is the nominal scale, which represents only names and therefore has none of the three qualities. A list of students in alphabetical order, a list of favorite cartoon characters, or the names on an organizational chart would all be classified as nominal data. The second level, called ordinal data, has magnitude only, and can be thought of as any set of data that can be placed in order from greatest to lowest but where there is no absolute zero and no equal intervals. Examples of this type of scale include Likert scales and the Thurstone technique. The third type of scale is called an interval scale, and it possesses both magnitude and equal intervals, but no absolute zero. Temperature is a classic example of an interval scale because we know that each degree is the same distance apart and we can easily tell if one temperature is greater than, equal to, or less than another. Temperature measured in degrees Celsius or Fahrenheit, however, has no absolute zero because a reading of zero does not mean the absence of the quantity being measured. Finally, the fourth and highest scale of measurement is called a ratio scale. A ratio scale contains all three qualities and is often the scale that statisticians prefer because the data can be more easily analyzed. Age, height, weight, and scores on a 100-point test would all be examples of ratio scales. If you are 20 years old, you not only know that you are older than someone who is 15 years old (magnitude), but you also know that you are five years older (equal intervals). With a ratio scale, we also have a point where none of the scale exists: when a person is born, his or her age is zero.
Scales of Measurement

Scale Level   Scale of Measurement   Scale Qualities                             Example(s)
4             Ratio                  Magnitude, Equal Intervals, Absolute Zero   Age, Height, Weight, Percentage
3             Interval               Magnitude, Equal Intervals                  Temperature
2             Ordinal                Magnitude                                   Likert Scale, Anything rank ordered
1             Nominal                None                                        Names, Lists of words

5. Parametric and Nonparametric Data

Several fundamental statistical concepts are helpful prerequisite knowledge for fully understanding the terms parametric and nonparametric. These statistical fundamentals include random variables, probability distributions, parameters, population, sample, sampling distributions and the Central Limit Theorem. I cannot explain these topics in a few paragraphs, as they would usually comprise two or three chapters in a statistics textbook. Thus, I will limit my explanation to a few helpful links among terms.

The field of statistics exists because it is usually impossible to collect data from all individuals of interest (the population). Our only solution is to collect data from a subset (a sample) of the individuals of interest, but our real desire is to know the truth about the population. Quantities such as means, standard deviations and proportions are all important values and are called parameters when we are talking about a population. Since we usually cannot get data from the whole population, we cannot know the values of the parameters for that population. We can, however, calculate estimates of these quantities for our sample. When they are calculated from sample data, these quantities are called statistics: a statistic estimates a parameter.

Parametric statistical procedures rely on assumptions about the shape of the distribution in the underlying population (i.e., they assume a normal distribution) and about the form or parameters (i.e., means and standard deviations) of the assumed distribution. Nonparametric statistical procedures rely on no or few assumptions about the shape or parameters of the population distribution from which the sample was drawn.
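The parameter-versus-statistic distinction is easy to see in a small simulation. The Python sketch below treats a large simulated data set as the "population" and draws one sample from it (all values are invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of 100,000 values; its mean is a parameter
population = rng.normal(loc=50, scale=10, size=100_000)
mu = population.mean()

# A random sample of 200 values; its mean is a statistic that estimates the parameter
sample = rng.choice(population, size=200, replace=False)
x_bar = sample.mean()

print(round(mu, 2), round(x_bar, 2))   # the statistic lands close to the parameter
```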

                          Parametric                    Non-parametric
Assumed distribution      Normal                        Any
Assumed variance          Homogeneous                   Any
Typical data              Ratio or Interval             Ordinal or Nominal
Data set relationships    Independent                   Any
Usual central measure     Mean                          Median
Correlation test          Pearson                       Spearman
Benefits                  Can draw more conclusions     Simplicity; less affected by outliers
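For example, the choice between the Pearson and Spearman correlation in the table above maps directly onto scipy. A minimal sketch with simulated paired data (the values are invented; in practice the choice depends on the measurement scale and distributional assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical paired measurements on 40 cases
x = rng.normal(size=40)
y = 0.6 * x + rng.normal(scale=0.8, size=40)

# Parametric choice: Pearson correlation (interval/ratio data, normality assumed)
r, p_r = stats.pearsonr(x, y)

# Nonparametric choice: Spearman correlation (rank-based, fewer assumptions)
rho, p_rho = stats.spearmanr(x, y)

print(round(r, 3), round(rho, 3))
```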
