Gea Tutorial 2


Let us designate Covid-19 patients aged 70 years and above as “Old”, and all other Covid-19
patients as “Young”. Let “D” represent death from Covid-19.

a. What proportion of patients in Italy are old? How about China? Which country, Italy or
China, is positively associated with old patients?

In Italy, rate(Old | Italy) = 41%.
In China, rate(Old | China) = 12%.
Italy is positively associated with old patients, since rate(Old | Italy) > rate(Old | China).

b. In Italy, what is the death rate amongst the old patients, rate(D|Old)? Give your answers
to 2 decimal places.

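In Italy, rate(D|Old) = [13%(0.19) + 6.2%(0.22)] / [0.19 + 0.22] = 3.834% / 0.41 = 9.35%
Note that simply adding the two death rates and dividing by the share of old patients, (6.2 + 13)/41 = 0.47, does not give the answer.
We need to take the weighted average of the death rates in the two old age groups, weighted by each group's share of patients.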
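The weighted average in (b) can be checked with a short Python sketch. The function name `pooled_rate` is made up for illustration; the death rates and patient shares are the ones quoted in the answer to (b).

```python
def pooled_rate(group_rates, group_shares):
    """Weighted average of per-group rates; weights are each group's share of all patients."""
    weighted_sum = sum(rate * share for rate, share in zip(group_rates, group_shares))
    return weighted_sum / sum(group_shares)


# Italy: the two "Old" age groups, death rates in % and shares of all patients.
italy_old_rate = pooled_rate(group_rates=[13.0, 6.2], group_shares=[0.19, 0.22])
print(f"rate(D|Old) in Italy is about {italy_old_rate:.2f}%")
```

Dividing by the sum of the shares (0.41) is what turns the two age-group rates into a rate among old patients only.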

c. From Fig 1, the overall death rate in Italy is 4.2%. Using the basic rule of rates, what must be
the possible range of the death rate amongst the young patients in Italy? Is there an association
between age and death among Covid-19 patients in Italy?

From Fig 1, we know rate(D) = 4.2%.
From (b), rate(D|Old) = 9.35%, which is greater than rate(D) = 4.2%.
By the basic rule of rates, rate(D) lies between rate(D|Young) and rate(D|Old), so rate(D|Young) < rate(D) < rate(D|Old).
Thus the possible range is: 0 < rate(D|Young) < 4.2%
Since rate(D|Young) < rate(D|Old), being old is positively associated with death among Covid-19
patients in Italy.
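The basic rule of rates holds because the overall rate is a weighted average of the two group rates. As a rough sketch that goes one step beyond the range asked for, the rate for young patients can be solved for directly, using the 41% share of old patients from (a). The helper name `other_group_rate` is hypothetical.

```python
def other_group_rate(overall_rate, known_rate, known_share):
    """Solve overall = known_share * known_rate + (1 - known_share) * other_rate for other_rate."""
    return (overall_rate - known_share * known_rate) / (1.0 - known_share)


# Italy: rate(D) = 4.2%, rate(D|Old) = 9.35%, rate(Old) = 41%.
italy_young_rate = other_group_rate(overall_rate=4.2, known_rate=9.35, known_share=0.41)
print(f"Implied rate(D|Young) in Italy is about {italy_young_rate:.3f}%")
```

The implied value falls inside the range 0 < rate(D|Young) < 4.2%, in line with the basic rule of rates.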

d. Repeat parts (b) and (c) for China.
In China, rate(D|Old) = [15%(0.03) + 8%(0.09)] / [0.03 + 0.09] = 9.75%
From Fig 1, rate(D) = 2.2% in China.
Hence 0 < rate(D|Young) < 2.2% and rate(D|Young) < rate(D|Old).
Being old is also positively associated with death in China.
e. Let's assume the following rough estimates from Fig 1:
In Italy, rate(D|Young) = 0.621%. In China, rate(D|Young) = 1.17%.

Using the information from Q1(a) to (d), explain how it is possible for the overall death rate in
Italy to be higher than that in China, despite Italy having a lower death rate than China for every
age group, as shown in Fig 1.

Hint: You may use the following table to help you:

                 Italy (%)         China (%)
rate(D|Old)      9.35 (from b)     9.75 (from d)
rate(D|Young)    0.621             1.17
rate(D)          4.2               2.2
rate(Old)        41                12

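Italy has a much larger proportion of old patients than China (41% versus 12%), and in both
countries old patients have a far higher death rate than young patients. The overall death rate
is a weighted average of rate(D|Old) and rate(D|Young), so Italy's overall rate is pulled towards
its high rate(D|Old), while China's is pulled towards its low rate(D|Young). Hence, despite Italy
having lower death rates than China within each age group, Italy has the higher overall death
rate once the groups are combined. The overall trend opposes the trend in every subgroup, which
is Simpson's Paradox.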
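A minimal sketch of this reversal, using only the figures from the table above (the dictionary layout and the function name `overall_rate` are illustrative choices, not part of the tutorial):

```python
# Death rates in %, shares as fractions of all patients in that country.
countries = {
    "Italy": {"old": 9.35, "young": 0.621, "share_old": 0.41},
    "China": {"old": 9.75, "young": 1.17, "share_old": 0.12},
}


def overall_rate(c):
    """Overall rate as the weighted average of the old and young rates."""
    return c["share_old"] * c["old"] + (1.0 - c["share_old"]) * c["young"]


overall = {name: overall_rate(c) for name, c in countries.items()}

# Italy is lower within each age group...
lower_in_each_group = (
    countries["Italy"]["old"] < countries["China"]["old"]
    and countries["Italy"]["young"] < countries["China"]["young"]
)
# ...yet higher once the age groups are combined.
higher_overall = overall["Italy"] > overall["China"]
print(overall, lower_in_each_group, higher_overall)
```

The weights do the work here: 41% of Italy's patients sit in the high-rate old group, against only 12% in China.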
Car data set
Radiant recode for Brand: 'B.M.W.' = 'Luxury'; 'MERCEDES BENZ' = 'Luxury'; else = 'Normal'
Normalise by total: all cells sum to 1
Normalise by row: each row sums to 1
The variable you vary is the one you normalise by.

Let the car brands MERCEDES BENZ and B.M.W be regarded as Luxury car types. The other
brands are regarded as Normal car types.

a. Calculate the overall proportion of Luxury cars in the dataset.

rate(Luxury) = 163/506 = 32.21%
b. Determine if there is any association between the variable “Car Type” which indicates if
the car brand was luxury/normal, and the variable “Year”.
rate(Luxury | 2008) = 46/218 = 21.10%
rate(Luxury | 2021) = 117/288 = 40.63%

Since rate(Luxury | 2008) < rate(Luxury | 2021), there is a negative association between the
Luxury car type and the year 2008.
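The comparison in (b) can be reproduced from the counts quoted above. This is only a sketch: the helper `rate` and the variable names are made up for illustration.

```python
def rate(count, total):
    """Proportion of `count` out of `total`."""
    return count / total


luxury_given_2008 = rate(46, 218)
luxury_given_2021 = rate(117, 288)
overall_luxury = rate(46 + 117, 218 + 288)  # the 163/506 from part (a)

# Compare the two conditional rates to read off the direction of association.
if luxury_given_2008 < luxury_given_2021:
    direction = "negative"
elif luxury_given_2008 > luxury_given_2021:
    direction = "positive"
else:
    direction = "none"

print(f"{luxury_given_2008:.1%} vs {luxury_given_2021:.1%}: "
      f"{direction} association between Luxury and 2008")
```

The overall rate of 32.21% lies between the two yearly rates, as the basic rule of rates requires.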

c. Determine whether COE Category is a confounder when examining the association
between “Car Type” and “Year”.
rate(Luxury | Cat A) = 22/131 = 16.79%
rate(Luxury | Cat B) = 95/157 = 60.51%
(These counts are the 2021 figures: 131 Cat A cars and 288 - 131 = 157 Cat B cars.)
Since the two rates differ, there is an association between COE Category and Car Type.
rate(Cat A | 2008) = 177/218 = 81.19%
rate(Cat A | 2021) = 131/288 = 45.49%
Since the two rates differ, there is also an association between COE Category and Year.
COE Category is associated with both Car Type and Year; hence, COE Category is a
confounder.
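A rough sketch of the confounder check, using the counts quoted in (c). The function `is_associated` and its `tolerance` parameter are hypothetical: the tutorial simply compares the two rates, and how large a gap counts as an association is a judgment call.

```python
def is_associated(rate_a, rate_b, tolerance=0.0):
    """Treat two conditional rates as showing an association when they differ by more than tolerance."""
    return abs(rate_a - rate_b) > tolerance


# Third variable (COE Category) versus Car Type, counts as quoted in (c).
luxury_given_cat_a = 22 / 131
luxury_given_cat_b = 95 / 157

# Third variable (COE Category) versus Year.
cat_a_given_2008 = 177 / 218
cat_a_given_2021 = 131 / 288

coe_vs_car_type = is_associated(luxury_given_cat_a, luxury_given_cat_b)
coe_vs_year = is_associated(cat_a_given_2008, cat_a_given_2021)

# A confounder must be associated with both variables under study.
is_confounder = coe_vs_car_type and coe_vs_year
print(f"COE Category is a confounder: {is_confounder}")
```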

d. In relation to (c), do we observe Simpson’s Paradox when investigating the association
between the variables “Car Type” and “Year”?
Recall from part (b) that there was a negative association between Luxury cars and 2008.
After slicing by COE Category, we check the association amongst Cat A only, and amongst Cat B only.
Amongst Cat A, rate(Luxury|2008) = 14.12% < 16.79% = rate(Luxury|2021). There is still a
negative association between Luxury cars and 2008 when looking at Cat A only.
Amongst Cat B, rate(Luxury|2008) = 51.22% < 60.51% = rate(Luxury|2021). There is still a
negative association between Luxury cars and 2008 when looking at Cat B only.
Thus there is no Simpson’s Paradox: even after slicing the data by COE Category,
Luxury cars are still negatively associated with 2008 in every slice.
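The check in (d) can be sketched as follows. It assumes the working definition that Simpson's Paradox occurs when a trend seen in more than half of the slices disappears or reverses in the combined data. The function names are illustrative, and the rates are the percentages quoted in (b) and (d).

```python
def direction(rate_2008, rate_2021):
    """Direction of the association between Luxury and 2008."""
    if rate_2008 < rate_2021:
        return "negative"
    if rate_2008 > rate_2021:
        return "positive"
    return "none"


def simpsons_paradox(overall, slice_directions):
    """True when a trend held by a majority of slices is absent or reversed overall."""
    for trend in ("positive", "negative"):
        in_majority = sum(d == trend for d in slice_directions) > len(slice_directions) / 2
        if in_majority and overall != trend:
            return True
    return False


overall_direction = direction(21.10, 40.63)
slice_directions = {
    "Cat A": direction(14.12, 16.79),
    "Cat B": direction(51.22, 60.51),
}
paradox = simpsons_paradox(overall_direction, list(slice_directions.values()))
print(overall_direction, slice_directions, paradox)
```

Both slices and the combined data point the same way, so the check reports no paradox.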

e. A student claims: “Since a financial crisis occurred in 2008, this would cause
Singaporeans to be less able to afford Luxury cars as compared to 2021, thus resulting
in a lower proportion of Luxury cars sold.”
Discuss how you might refute or support the student's claim based on the given dataset.
Association does not imply causation, and causation cannot be definitively proven using any single
data set. This study is observational in nature, and thus the causal effect of the crisis cannot be
established. The data set also contains no data on the financial crisis itself. There could be many
other hidden reasons (such as confounders not shown in this data set) for the lower proportion of
Luxury cars sold in 2008.
Furthermore, the data set simply compares the year 2008 with 2021, so it is difficult to say that it
was specifically the financial crisis in 2008 that led to lower Luxury car sales; many other events
in 2008 besides the crisis could have affected car sales. In short, our analysis deals with
association, which tells us that certain variable outcomes tend to occur together, but not which
variable, if any, is causing them to occur together.
3a) (i) Calculate the conditional rate of getting polio given that they are vaccinated for each study.
P(polio | vac) = P(polio ∩ vac) / P(vac)
● NFIP = 0.0249%
● Exclusion = 0.028%

(ii) Calculate the conditional rate of getting polio given that they are in control group for each
study.
P(polio | control) = P(polio ∩ control) / P(control)
● NFIP = 0.0539%
● Exclusion = 0.071%

(iii) Comment on the appropriateness of using rates rather than absolute number of polio cases
to compare the effectiveness of the vaccine in the NFIP study.
Rates are more appropriate, as they account for the difference in size between the treatment and control groups.

3b) Evaluate the two study designs by answering the questions below.

(i) To what extent were the study participants randomly assigned into treatment and control
groups?

NFIP study: non-random assignment
- All children without parental consent were assigned to the control group, i.e. not randomly assigned
- Control group larger than treatment group

Exclusion: random assignment
- Children with parental consent were randomly assigned
- Children without parental consent were excluded

(ii) Discuss how this assignment might affect the way we interpret results from both studies.

NFIP study
- Children without parental consent (lower income group, with similar features) are less
  susceptible to polio, and they were assigned to the control group
- The difference in rates may not be due to the vaccine

Exclusion
- As both groups are big, random assignment makes the features of both the treatment and
  control groups similar
- The only difference is thus due to the presence/absence of the vaccine
- This, together with the double blinding done, allows us to conclude that the difference in
  rates is due to the vaccine

(iii) According to the table of results, by how much does the vaccine reduce the polio rate? Do
you think the vaccine is actually more or less effective than what you calculated?

NFIP study
- The vaccine reduces the polio rate by 0.0539% - 0.0249% = 0.0290%
- The actual effect is likely lower than 0.043%
- The actual effect of the vaccine is likely to be larger than 0.0290%
- The vaccine is more effective than what the above calculation tells us
- The calculated reduction is understated

Exclusion
- The vaccine reduces the polio rate by 0.0710% - 0.0280% = 0.0430%
- The actual effect is likely to be similar to 0.043%, since the study was conducted with
  randomised assignment, double blinding and a large sample size
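The two reductions can be recomputed with a few lines of Python. This is a sketch that uses only the conditional rates from 3a. The conversion to cases per 100,000 is added here for readability and is not part of the original answer.

```python
# Polio rates in %, taken from 3a(i) and 3a(ii).
polio_rates = {
    "NFIP": {"vaccinated": 0.0249, "control": 0.0539},
    "Exclusion": {"vaccinated": 0.028, "control": 0.071},
}

reductions = {}
for study, rates in polio_rates.items():
    # Absolute reduction in percentage points: control rate minus vaccinated rate.
    reductions[study] = rates["control"] - rates["vaccinated"]
    # 1% equals 1,000 per 100,000, so multiply the percentage by 1,000.
    per_100k = reductions[study] * 1000
    print(f"{study}: {reductions[study]:.4f} percentage points "
          f"(about {per_100k:.0f} fewer cases per 100,000)")
```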
