Biostatistics BY SALAMA
A census is a comprehensive process involving the collection, compilation, evaluation, analysis, and
publication of demographic, economic, and social data for all individuals in a country or a specific area at a given time.
1. De Facto Method:
• In this method, a specific date is chosen for the census, and the enumeration is carried out on that day
throughout the entire country.
• For example, if the census is scheduled for April 1st, everyone present in the country on that day is
included in the count.
2. De Jure Method:
• Only individuals who are permanent residents of a place are counted, excluding those who are there
temporarily.
• This method is also known as the real and direct enumeration method.
A population pyramid is a graphical representation illustrating the distribution of different age groups and genders
within a population. It typically resembles a pyramid, with the younger age groups at the bottom and the older age
groups at the top. This visual tool helps demographers analyze population structures and predict demographic trends.
1. Mean:
• Formula for the sample mean: sum of all scores divided by the number of scores.
2. Mode:
Examples:
MOHAMED SALAMA 1
• Unimodal (one peak): 3, 4, 6, 5, 4, 1, 3, 3, 2, 7, 5 (Mode = 3)
• No Mode: 2, 3, 5, 4, 6, 4, 3, 2, 6, 5
Problem:
• The mode may give limited information and can be misleading. For example, in the sequence 7, 7, 7, 20, 20, 21,
22, 22, 23, 24, the mode is 7, even though most of the scores lie between 20 and 24.
3. Median:
• The median is the score at the 50th percentile, or the middle score.
Determining the Median:
Examples:
In Summary:
• Median is the middle score, a robust measure, especially with skewed data.
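The three measures can be checked on the sequences used in the examples above. This is a minimal sketch using Python's standard `statistics` module; the helper name `summarize` and the variable names are illustrative and not part of the notes.

```python
import statistics

# Sequences taken from the mode examples above.
unimodal = [3, 4, 6, 5, 4, 1, 3, 3, 2, 7, 5]
no_mode = [2, 3, 5, 4, 6, 4, 3, 2, 6, 5]
misleading = [7, 7, 7, 20, 20, 21, 22, 22, 23, 24]


def summarize(scores):
    """Return (mean, median, list of most frequent values) for a list of scores."""
    return (
        statistics.mean(scores),
        statistics.median(scores),
        statistics.multimode(scores),  # every value tied for the highest count
    )


mean_u, median_u, modes_u = summarize(unimodal)
mean_n, median_n, modes_n = summarize(no_mode)
mean_m, median_m, modes_m = summarize(misleading)

# When every distinct value is tied for "most frequent", the data has no mode.
has_no_mode = len(modes_n) == len(set(no_mode))

print("unimodal:  ", mean_u, median_u, modes_u)
print("no mode:   ", mean_n, median_n, modes_n, has_no_mode)
print("misleading:", mean_m, median_m, modes_m)
```

For the last sequence the mode is 7 while the mean is 17.3 and the median is 20.5, which shows why the mode alone can misrepresent the center of the data.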
Measures of Dispersion:
1. Range:
• Definition: The range is the difference between the highest and lowest values in a dataset.
• Example: In a group of individuals arrested for DUI, ages 18, 18, 19, 21, 26, and 30, the range is 30 - 18 = 12
years.
2. Variance:
• Definition: It is the sum of squared deviations from the mean divided by the number of values minus 1.
• Calculation: s² = Σ(x − x̄)² / (n − 1), where x̄ is the sample mean and n is the number of values.
• Example: For the ages 18, 18, 19, 21, 26, and 30, the variance is 24.4 years².
3. Standard Deviation:
• Definition: The positive square root of the variance. It is a crucial measure of dispersion.
• Calculation: s = √s², the square root of the variance.
• Example: If the variance is 24.4 years², the standard deviation is √24.4 ≈ 4.94 years.
4. Coefficient of Variation (CV):
• Definition: The ratio of the standard deviation to the mean, expressed as a percentage.
• Calculation: CV = (standard deviation / mean) × 100%.
• Example: In a class of medical students with a mean weight of 60 kg and a standard deviation of 12 kg, the CV is
(12 / 60) × 100% = 20%.
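The worked figures for the measures of dispersion can be reproduced with a short script. This is a sketch using the standard `statistics` module, whose `variance` and `stdev` functions divide by n − 1 as in the definition above; the function name `coefficient_of_variation` is an illustrative choice.

```python
import statistics

# Ages of the individuals arrested for DUI, from the range example.
ages = [18, 18, 19, 21, 26, 30]

value_range = max(ages) - min(ages)    # highest value minus lowest value
variance = statistics.variance(ages)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(ages)       # positive square root of the variance


def coefficient_of_variation(standard_deviation, mean):
    """Standard deviation as a percentage of the mean."""
    return standard_deviation / mean * 100


# Medical students example: mean weight 60 kg, standard deviation 12 kg.
cv_weights = coefficient_of_variation(12, 60)

print(value_range, variance, round(std_dev, 2), cv_weights)
```

Because the CV has no units, it can be used to compare the spread of variables measured on different scales, such as weight in kg and age in years.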
Deviation:
• Deviations indicate the error in prediction when using the mean to predict scores.
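A quick check of a property the MCQs rely on: deviations around the mean always sum to zero. The sketch below reuses the DUI ages from the range example; the variable names are illustrative.

```python
# Ages from the range example; their mean is 22.
ages = [18, 18, 19, 21, 26, 30]

mean_age = sum(ages) / len(ages)

# Each deviation is the error made when the mean is used to predict that score.
deviations = [age - mean_age for age in ages]
total_deviation = sum(deviations)

print(deviations)
print(total_deviation)
```

The negative and positive deviations cancel exactly, which is why the variance squares them before summing. With data that is not exactly representable in floating point, the computed sum may differ from zero by a tiny rounding error.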
Example 2:
• Females: 20 to 26
• Males: 17 to 26
• Calculate mean, mode, median for both males and females separately.
• Characteristics:
• Total area under the curve above the x-axis = 1 or 100%.
Empirical Rule:
• Describes the percentage of data within specific standard deviations from the mean in a normal distribution.
(68%, 95%, 99.7%)
2. Mean=Median=Mode.
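The Empirical Rule percentages are rounded areas under the normal curve. The sketch below derives them from the standard normal distribution using `statistics.NormalDist` (available from Python 3.8); the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k) * 100:.2f}%")
```

The same proportions hold for any normal distribution, whatever its mean and standard deviation, because the rule is stated in units of standard deviations from the mean.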
Morbidity:
• Morbidity measures characterize the number of individuals in a population who become ill (incidence) or are ill
at a given time (prevalence).
Mortality:
• Mortality rates measure the frequency of death occurrence in a defined population during a specified time
interval.
Incidence:
• Incidence is a measure of disease representing a person's probability of being diagnosed during a given period.
• Incidence rate = (Number of new cases) / (Number of persons at risk) during a specified time.
Example:
• In a gastroenteritis outbreak, 30 of 99 persons who ate potato salad developed the illness.
Prevalence:
• Prevalence rate = (Total cases of disease) / (Total population) during a specified time.
• The key difference: Incidence considers new cases, while prevalence includes both new and preexisting cases.
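The two formulas can be written as small functions. This is a sketch: the incidence figures come from the gastroenteritis example above, while the prevalence figures (50 cases in a population of 1,000) are hypothetical and only illustrate the calculation.

```python
def incidence_rate(new_cases, persons_at_risk):
    """New cases divided by the number of persons at risk during the period."""
    if persons_at_risk <= 0:
        raise ValueError("persons_at_risk must be positive")
    return new_cases / persons_at_risk


def prevalence_rate(total_cases, total_population):
    """All existing cases (new and preexisting) divided by the total population."""
    if total_population <= 0:
        raise ValueError("total_population must be positive")
    return total_cases / total_population


# Gastroenteritis outbreak: 30 of the 99 persons who ate potato salad became ill.
outbreak_incidence = incidence_rate(30, 99)

# Hypothetical figures, not from the notes: 50 existing cases among 1,000 people.
example_prevalence = prevalence_rate(50, 1000)

print(f"incidence:  {outbreak_incidence:.1%}")
print(f"prevalence: {example_prevalence:.1%}")
```

In the outbreak, about 30.3% of the people at risk developed the illness.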
• Deaths assigned to a specific cause / Total deaths from all causes during the same time per 1,000.
• Infant mortality rate: deaths among children < 1 year / number of live births during the same time, per 1,000.
• Maternal mortality rate: deaths assigned to pregnancy-related causes / number of live births during the same time, per 100,000.
Natality Rates:
1. Crude Birth Rate:
• Number of live births in a geographical area during a year / Mid-year total population of the area per
1,000.
2. General Fertility Rate:
• Ratio of live births to women in childbearing years during a time period, divided by the length of the
period.
3. Fecundity Rate:
• Number of live births among married women in childbearing period in a geographical area during a year
per 1,000.
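The mortality and natality rates above share one shape: a count of events divided by a reference population, scaled by a multiplier such as 1,000 or 100,000. The sketch below captures that shape in a helper called `rate_per` (an illustrative name). All the input figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def rate_per(events, reference_population, multiplier=1000):
    """Events per `multiplier` members of the reference population."""
    if reference_population <= 0:
        raise ValueError("reference_population must be positive")
    return events * multiplier / reference_population


# Hypothetical figures for one area and one year (not taken from the notes).
live_births = 8_000
infant_deaths = 120            # deaths among children under 1 year
maternal_deaths = 4            # deaths assigned to pregnancy-related causes
mid_year_population = 400_000

infant_mortality = rate_per(infant_deaths, live_births)                # per 1,000 live births
maternal_mortality = rate_per(maternal_deaths, live_births, 100_000)   # per 100,000 live births
crude_birth = rate_per(live_births, mid_year_population)               # per 1,000 population

print(infant_mortality, maternal_mortality, crude_birth)
```

Only the numerator, the denominator, and the multiplier change from one rate to the next, so naming the denominator correctly is the main step in each calculation.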
Sampling Techniques:
• Probability Sampling: Every eligible individual has a chance of being chosen, allowing for more generalizable
results.
• Non-Probability Sampling: Some individuals have no chance of being selected, leading to a risk of non-
representative samples. However, these methods are often more cost-effective and convenient.
1. Simple Random Sampling:
• A random sample can be obtained by assigning numbers to individuals and using a table of random
numbers.
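A pseudo-random number generator can stand in for the table of random numbers. The sketch below assumes a sampling frame of 1,000 numbered individuals; the function name `simple_random_sample` and the seed value are illustrative.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw sample_size distinct individuals, each with the same chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(sampling_frame, sample_size)


# Assumed frame: individuals numbered 1 to 1000.
frame = list(range(1, 1001))
srs = simple_random_sample(frame, 100, seed=42)

print(len(srs), srs[:5])
```

`random.sample` draws without replacement, so no individual can appear in the sample twice.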
2. Systematic Sampling:
• Systematic sampling selects individuals from a sampling frame at regular intervals. The interval (k) is
the population size (x) divided by the desired sample size (n). For instance, to obtain a sample of 100
from a population of 1000, the researcher selects every 1000/100 = 10th member of the sampling frame.
• In practice, this means choosing a random starting point within the first interval and then taking every
kth individual after it. The fixed interval makes the method easy to manage with large populations, but
the sample is representative only if the sampling frame has no periodic pattern that coincides with the
interval.
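The interval calculation from the example (1000/100 = 10) can be sketched as follows. The random starting point within the first interval is a common convention that is assumed here, and the function name `systematic_sample` is illustrative.

```python
import random


def systematic_sample(sampling_frame, sample_size, seed=None):
    """Take every k-th individual, where k = population size // sample size."""
    interval = len(sampling_frame) // sample_size
    if interval < 1:
        raise ValueError("sample_size cannot exceed the size of the sampling frame")
    # Assumed convention: start at a random position within the first interval.
    start = random.Random(seed).randrange(interval)
    return sampling_frame[start::interval][:sample_size]


# Population of 1000 and a desired sample of 100, as in the example.
frame = list(range(1, 1001))
systematic = systematic_sample(frame, 100, seed=7)

print(systematic[:5])
```

Whatever the starting point, consecutive selections are exactly 10 positions apart in the frame.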
3. Stratified Random Sampling:
• The target population is divided into strata based on attributes like age, sex, occupation, etc.
• Random samples are drawn from each stratum, and the final sample is the sum of samples from all
strata.
• Advantage: Provides accurate data when the distribution of the studied variable is not uniform among
strata.
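The two steps, dividing the population into strata and then sampling randomly within each stratum, can be sketched as below. Proportional allocation (the same sampling fraction in every stratum) is an assumption, since the notes do not say how many individuals to draw per stratum. The names `stratified_sample` and `stratum_of` and the example population are illustrative.

```python
import random
from collections import defaultdict


def stratified_sample(individuals, stratum_of, fraction, seed=None):
    """Draw a random sample from each stratum using the same sampling fraction."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in individuals:
        strata[stratum_of(person)].append(person)

    sample = []
    for members in strata.values():
        # Assumed proportional allocation, with at least one person per stratum.
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical population: 50 males and 150 females.
people = [{"id": i, "sex": "M" if i % 4 == 0 else "F"} for i in range(200)]
stratified = stratified_sample(people, lambda p: p["sex"], fraction=0.1, seed=3)

males = sum(1 for p in stratified if p["sex"] == "M")
females = len(stratified) - males
print(males, females)
```

The final sample is the sum of the per-stratum samples, so the smaller stratum is guaranteed to be represented.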
4. Multistage Sampling:
• It involves dividing the population into clusters and taking samples from these clusters.
5. Clustered Sampling:
• Subgroups or clusters of the population are used as the sampling unit instead of individual members.
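In code, the difference from the earlier methods is that the random draw is made over clusters rather than over individuals. The sketch below assumes one-stage cluster sampling, where every member of a chosen cluster enters the sample. The schools data and the function name `cluster_sample` are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # the sampling unit is the cluster
    return {name: clusters[name] for name in chosen}


# Hypothetical population: 10 schools with 30 students each.
schools = {
    f"school_{i}": [f"student_{i}_{j}" for j in range(30)]
    for i in range(10)
}
selected = cluster_sample(schools, n_clusters=3, seed=1)

print(sorted(selected))
print(sum(len(students) for students in selected.values()))
```

Sampling again within the chosen clusters, instead of keeping all their members, would turn this into the multistage design described above.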
1. Convenience Sampling:
• Individuals are selected based on ease of access or convenience.
2. Quota Sampling:
• Researchers establish quotas for different categories and then sample individuals who fit those
categories until the quota is met.
3. Judgement Sampling:
• Researchers use their judgment to select individuals who meet specific criteria for the study.
4. Snowball Sampling:
• Existing participants recruit new participants, creating a chain effect.
Numerical Data:
• Definition: Data that have meaning as a measurement, such as a person's height, weight, IQ, or blood pressure.
• Discrete Data:
• Continuous Data:
• Definition: Represents measurements where possible values cannot be counted and are described using
intervals on the real number line.
Qualitative Data:
• Categorical Data:
• Definition: Represents characteristics like a person's gender, marital status, hometown, or movie
preferences.
• Other Names: Qualitative data, non-numeric data.
In summary:
• Numerical Data is about measurements, and it can be either discrete (countable items) or continuous
(measurable intervals).
• Qualitative Data is about characteristics, and it falls into the category of categorical data, which includes non-
numeric attributes like gender or hometown.
MCQS
MCQ 1:
a) Economic structures
b) Human populations
c) Political systems
d) Environmental sustainability
MCQ 2:
Question: In the De Facto Method of conducting a census, when are individuals included in the count?
MCQ 3:
a) Economic analysis
b) Political forecasting
c) Analyzing population structures and trends
d) Environmental conservation
MCQ 4:
MCQ 5:
MCQ 6:
Question: When is the mode most useful?
MCQ 7:
MCQ 8:
MCQ 9:
MCQ 10:
a) Calculating the mean
b) Measuring the spread of data relative to the mean
c) Determining the mode
d) Analyzing population structures
MCQ 11:
Answer: b) The percentage of data within specific standard deviations from the mean
MCQ 12:
Question: In a normal distribution, what is the total area under the curve above the x-axis?
a) 50%
b) 75%
c) 100%
d) 25%
Answer: c) 100%
MCQ 13:
a) Triangular
b) Asymptotic
c) Square
d) Exponential
Answer: b) Asymptotic
MCQ 14:
Question: Which measure of central tendency is most suitable for skewed data?
a) Mean
b) Mode
c) Median
d) Range
Answer: c) Median
MCQ 15:
Question: What does the deviation of a score from the mean indicate?
MCQ 16:
Question: In which method of conducting a census are only permanent residents counted?
a) De Facto Method
b) De Jure Method
c) Both methods
d) Neither method
MCQ 17:
a) In symmetrical data
b) In skewed data
c) In sequences with distinct categories
d) In sequences with similar values
MCQ 18:
Question: What is the primary purpose of the De Facto Method in census enumeration?
Answer: c) To count everyone present on a specific date
MCQ 19:
Question: In the context of a population pyramid, where are younger age groups typically represented?
a) At the top
b) In the middle
c) At the bottom
d) In the center
MCQ 20:
Question: Which method of conducting a census is also known as the real and direct enumeration method?
a) De Facto Method
b) De Jure Method
c) Both methods
d) Neither method
MCQ 21:
MCQ 22:
MCQ 23:
a) In symmetrical data
b) In skewed data
c) In sequences with similar values
d) In sequences with distinct categories
MCQ 24:
a) Triangular shape
b) Asymptotic to the x-axis
c) Square shape
d) Exponential shape
MCQ 25:
a) Mean
b) Mode
c) Median
d) Range
Answer: b) Mode
MCQ 26:
Answer: b) To describe the percentage of data within specific standard deviations from the mean
MCQ 27:
a) The median
b) The mode
c) The range
d) The standard deviation
MCQ 28:
MCQ 29:
a) Economic forecasting
b) Analyzing population structures and trends
c) Political analysis
d) Environmental conservation
MCQ 30:
Question: What does the sum of deviations around the mean always equal?
a) The median
b) The mode
c) The range
d) Zero
Answer: d) Zero
MCQ 31:
c) Number of individuals who become ill
d) Number of individuals who migrate
MCQ 32:
MCQ 33:
MCQ 34:
a) Incidence includes both new and preexisting cases, while prevalence considers only new cases.
b) Incidence and prevalence measure the same aspect of disease occurrence.
c) Incidence considers only new cases, while prevalence includes both new and preexisting cases.
d) Incidence and prevalence are interchangeable terms.
Answer: c) Incidence considers only new cases, while prevalence includes both new and preexisting cases.
MCQ 35:
Answer: c) The total number of cases existing in a population
MCQ 36:
Answer: b) Total deaths during a time interval / Mid-interval population per 1,000
MCQ 37:
Answer: b) Deaths assigned to a specific cause / Total number of cases per 1,000
MCQ 38:
MCQ 39:
Answer: b) Number of live births in a geographical area during a year / Mid-year total population of the area per 1,000
MCQ 40:
Answer: a) Ratio of live births to women in childbearing years during a time period
MCQ 41:
Answer: c) Number of live births among married women in childbearing period in a geographical area during a year per
1,000
MCQ 42:
Answer: c) Ensure that every eligible individual has an equal chance of being chosen
MCQ 43:
MCQ 44:
Question: What is the primary advantage of Stratified Random Sampling?
Answer: c) It provides accurate data when the distribution of the studied variable is not uniform among strata
MCQ 45:
MCQ 46:
MCQ 47:
Answer: c) Establish quotas for different categories and sample individuals who fit those categories until the quota is met
MCQ 48:
Question: In which sampling method do existing participants recruit new participants, creating a chain effect?
a) Simple Random Sampling
b) Systematic Sampling
c) Snowball Sampling
d) Stratified Random Sampling
MCQ 49:
MCQ 50:
TESTS BY SALAMA
TEST 1
Section 1: Multiple Choice Questions (MCQs)
1. What is demography? a. The study of rocks and minerals b. The study of human populations and their changes c.
The study of weather patterns d. The study of ancient civilizations
2. Which method of conducting a census involves an enumeration period of two or three weeks? a. De Facto
Method b. De Jure Method c. Both methods involve the same duration d. None of the above
3. What is the mean in statistics? a. The most frequently occurring score b. The score at the 50th percentile c. A
score that represents the center of a distribution d. The difference between the highest and lowest values
4. What does the mode measure? a. The middle score b. The most frequently occurring score c. The average of the
two middle scores d. The distance from the mean
5. What is the primary characteristic of a bell-shaped curve? a. Asymptotic to the y-axis b. Mean ≠ Median ≠ Mode
c. Bell-shaped and symmetrical d. Total area under the curve below the x-axis = 1
Section 2: Matching
A. Crude Death Rate B. Incidence C. Clustered Sampling D. Continuous Data E. Stratified Random Sample
1. Measure of disease representing a person's probability of being diagnosed during a given period.
2. Subgroups or clusters of the population are used as the sampling unit instead of individual members.
3. Represents measurements where possible values cannot be counted and are described using intervals on the
real number line.
4. Total deaths during a time interval divided by the mid-interval population per 1,000.
5. The target population is divided into strata based on attributes, and random samples are drawn from each
stratum.
Explain the significance of the "Empirical Rule" in the context of a normal distribution. How does it help researchers in
understanding and interpreting data? Provide examples to illustrate its application.
Answers:
1. b
2. b
3. c
4. b
5. c
Matching:
1. B - Incidence
2. C - Clustered Sampling
3. D - Continuous Data
4. A - Crude Death Rate
5. E - Stratified Random Sample
Written Response:
The Empirical Rule is crucial in understanding and interpreting data in a normal distribution. It states that in a normal
distribution, approximately 68% of the data falls within one standard deviation of the mean, 95% within two standard
deviations, and 99.7% within three standard deviations. This rule helps researchers assess the spread of data and identify
the probability of data falling within specific ranges. For example, if a variable follows a normal distribution, researchers
can use the Empirical Rule to predict that about 95% of the values will fall within two standard deviations from the
mean. This aids in making statistical inferences and understanding the distribution's characteristics, facilitating more
informed decision-making in various fields.
TEST 2
Section 1: Multiple Choice Questions (MCQs)
1. What is the primary purpose of a census? a. To study weather patterns b. To collect and analyze data about a
population c. To analyze economic trends d. To examine geological formations
2. Which sampling method involves selecting individuals based on ease of access or convenience? a. Simple
Random Sampling b. Clustered Sampling c. Convenience Sampling d. Stratified Random Sample
3. What is the mode in statistics? a. The average of the two middle scores b. The score at the 50th percentile c. The
most frequently occurring score d. The difference between the highest and lowest values
4. What does the coefficient of variation (CV) measure? a. The difference between the highest and lowest values
b. The ratio of the standard deviation to the mean c. The middle score d. The average of the two middle scores
5. In a population pyramid, where are the older age groups typically represented? a. At the bottom b. In the
middle c. At the top d. Throughout the pyramid
Section 2: Matching
2. The sum of squared deviations from the mean divided by the number of values minus 1.
3. An enumeration method where a specific date is chosen for the census, and everyone present on that day is
included.
Explain the differences between the De Facto and De Jure methods of conducting a census. Provide examples to illustrate
when each method might be more appropriate.
Answers:
1. b
2. c
3. c
4. b
5. c
Matching:
1. C - Quota Sampling
2. B - Variance
3. A - De Facto Method
4. D - Prevalence
5. E - Multistage Sampling
Written Response:
The De Facto method involves choosing a specific date for the census, and the enumeration is carried out on that day
throughout the entire country. Everyone present in the country on that day is included in the count. On the other hand,
the De Jure method sets an enumeration period of two or three weeks, and only individuals who are permanent
residents of a place are counted, excluding those who are there temporarily.
The choice between De Facto and De Jure methods depends on the study's objectives and the population's
characteristics. For example, the De Facto method might be more suitable for capturing the actual population present in
a country at a specific moment, such as for assessing real-time demographic trends. In contrast, the De Jure method
might be preferred in situations where temporary residents significantly affect the population but are not the focus of
the study, ensuring a more accurate representation of the permanent population.
TEST 3
Section 1: Multiple Choice Questions (MCQs)
1. What is the primary purpose of a population pyramid? a. To study economic trends b. To represent the
distribution of different age groups and genders within a population c. To analyze weather patterns d. To
measure mortality rates
2. Which measure of central tendency is most reliable with skewed data? a. Mean b. Mode c. Median d. Range
3. What does the Empirical Rule describe in a normal distribution? a. The difference between the highest and
lowest values b. The percentage of data within specific standard deviations from the mean c. The most
frequently occurring score d. The total area under the curve below the x-axis
4. Which vital rate measures the frequency of death occurrence in a defined population during a specified time
interval? a. Crude Birth Rate b. Case Fatality Rate c. Crude Death Rate d. Maternal Mortality Rate
5. What is the primary characteristic of a bell-shaped curve? a. Mean ≠ Median ≠ Mode b. Asymptotic to the y-axis
c. Total area under the curve above the x-axis = 1 d. Bell-shaped and symmetrical
Section 2: Matching
Match the following:
A. Judgement Sampling B. Standard Deviation C. General Fertility Rate D. Mode E. Infant Mortality Rate
3. Deaths among children less than 1 year old divided by the number of live births during the same time.
4. A sampling method where researchers use their judgment to select individuals who meet specific criteria for the
study.
Explain the significance of the "Clustered Sampling" method in research. Provide examples of situations where clustered
sampling might be more practical or advantageous compared to other sampling methods.
Answers:
1. b
2. c
3. b
4. c
5. d
Matching:
1. B - Standard Deviation
2. D - Mode
3. E - Infant Mortality Rate
4. A - Judgement Sampling
5. C - General Fertility Rate
Written Response:
Clustered Sampling involves using subgroups or clusters of the population as the sampling unit. This method is
advantageous in situations where the population is naturally grouped or when it is more practical to sample clusters
rather than individual members. For example, in a study involving schools in a city, clusters of schools could be randomly
selected, and then all students within those schools would be included in the sample. This approach is more efficient and
cost-effective than attempting to sample individual students from across the entire city. Clustered Sampling is particularly
useful when there is heterogeneity within clusters but homogeneity between clusters.
Do not forget me in your prayers.