Homework Problem Set #3
P1. An epidemiologist conducted two cohort studies investigating the association between high cholesterol and development of coronary heart disease (CHD) in two different populations (A and B). A total of 10,000 persons were sampled (5,000 from each population). Using the data in the table below, calculate the relative risk (RR) and odds ratio (OR) for the association between high cholesterol and CHD in Population A and Population B, then answer the following questions.
| Population | Cholesterol | CHD | No CHD |
| A | High | 80 | 920 |
| A | Low | 160 | 3840 |
| B | High | 280 | 1120 |
| B | Low | 360 | 3240 |
Q1. Which measure of association is identical in the two populations, and which is not? What can account for the measure that differs, and to what degree does it overestimate the magnitude of the association between high cholesterol and CHD in each population? (Refer to the equation discussed in class.)
Q2. Given the study design, which of the two measures of association would you report? Also, provide an interpretation of the measure you would report.
Q3. Which of the following scenarios would yield an odds ratio and relative risk that are identical?
A. Common disease
B. Rare disease
C. Null association between exposure and outcome
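The arithmetic behind P1 can be checked with a short script. The following is a minimal sketch in Python, assuming the standard 2x2 definitions (RR as the ratio of risks, OR as the cross-product ratio). The function and variable names are illustrative and do not come from the course materials. The identity OR = RR × (1 − risk in unexposed) / (1 − risk in exposed) is used to express how far the OR departs from the RR; it may or may not be the equation referred to in class.

```python
def risk(cases, non_cases):
    """Cumulative incidence (risk) in one exposure group."""
    return cases / (cases + non_cases)


def relative_risk(a, b, c, d):
    """RR for a 2x2 table.

    a, b = exposed with / without disease
    c, d = unexposed with / without disease
    """
    return risk(a, b) / risk(c, d)


def odds_ratio(a, b, c, d):
    """OR computed as the cross-product ratio (a*d)/(b*c)."""
    return (a * d) / (b * c)


# Counts copied from the P1 table: (CHD, no CHD) for high, then low cholesterol.
populations = {
    "A": (80, 920, 160, 3840),
    "B": (280, 1120, 360, 3240),
}

results = {}
for name, (a, b, c, d) in populations.items():
    rr = relative_risk(a, b, c, d)
    or_ = odds_ratio(a, b, c, d)
    # Factor by which the OR exceeds the RR:
    # OR / RR = (1 - risk_unexposed) / (1 - risk_exposed)
    inflation = (1 - risk(c, d)) / (1 - risk(a, b))
    results[name] = (rr, or_, inflation)
    print(f"Population {name}: RR = {rr:.3f}, OR = {or_:.3f}, OR/RR = {inflation:.3f}")
```

Comparing the printed OR/RR factor with the baseline risk in each population shows how the gap between the two measures depends on how common the outcome is.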
P2. How does the investigation of a rare versus a common disease affect the accuracy of the estimate of association reported in a case-control study?
P3. When testing an association from a case-control study, epidemiologists often make the ‘rare disease assumption’ in interpreting the estimate. Why doesn’t the epidemiologist estimate disease prevalence from the case-control study rather than assume a rare disease?
P4. Explain how we can interpret an odds ratio (OR) from a case-control study in terms of the ‘odds of developing disease’ when cases are recruited after developing disease.
P5. A case-control study examining the association between hormone replacement therapy (HRT) and uterine cancer revealed that 15% of the 300 cases and 10% of the 600 controls had used HRT. Calculate the odds ratio and its 95% confidence interval, interpret the estimate and indicate whether it is statistically significant.
| HRT Use | Cases | Controls |
| Yes | | |
| No | | |
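One way to check the P5 calculation is sketched below in Python. It assumes the log-based (Woolf) method for the confidence interval, which is a common textbook choice but may differ from the method taught in class. The variable names are illustrative, and the cell counts are derived from the percentages stated in the problem.

```python
import math

# Cell counts derived from the percentages given in P5.
cases_total, controls_total = 300, 600
a = round(0.15 * cases_total)     # cases who used HRT
b = round(0.10 * controls_total)  # controls who used HRT
c = cases_total - a               # cases who did not use HRT
d = controls_total - b            # controls who did not use HRT

# Exposure odds ratio (cross-product ratio).
odds_ratio = (a * d) / (b * c)

# Woolf standard error of ln(OR); z = 1.96 gives an approximate 95% interval.
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z = 1.96
lower = math.exp(math.log(odds_ratio) - z * se_log_or)
upper = math.exp(math.log(odds_ratio) + z * se_log_or)

print(f"Cells: a={a}, b={b}, c={c}, d={d}")
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

Under this method, the estimate is conventionally called statistically significant at the 5% level when the interval excludes the null value of 1.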
P6. A cohort study of 10,000 participants was conducted to estimate the risk of lung cancer and coronary heart disease (CHD) attributable to cigarette smoking. Use the information below to answer the following questions.
| Smoking status | Developed lung cancer: Yes | Developed lung cancer: No | Developed CHD: Yes | Developed CHD: No |
| Smoker | 65 | 1935 | 175 | 1825 |
| Non-Smoker | 17 | 7983 | 389 | 7611 |
Q1. Calculate the attributable risk (AR), the attributable fraction (AF), and relative risk (RR) for the two outcomes. Explain why AR does not show the same pattern across the two disease outcomes as AF and RR.
Q2. Suppose the cohort study was conducted in California, which has a smoking prevalence of 15%. Calculate the population attributable risk (PAR) and population attributable fraction (PAF) for lung cancer. Why is the PAF considerably lower than the AF?
Q3. Under what conditions would the PAF be equal to AF?
A. 0% of population smoked cigarettes
B. 50% of population smoked cigarettes
C. 100% of population smoked cigarettes
D. Smoking prevalence in the sample = smoking prevalence in the population
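The measures in P6 can be computed with a short sketch such as the one below, written in Python. It assumes the following common definitions, which may be written differently in the course notes: AR = Re − Ru, AF = AR / Re, RR = Re / Ru, PAR = Rt − Ru, and PAF = PAR / Rt, where Re and Ru are the risks in smokers and non-smokers and Rt is the prevalence-weighted risk in the total population. The function names are illustrative.

```python
def cohort_measures(exp_cases, exp_non, unexp_cases, unexp_non):
    """RR, AR, and AF among the exposed, from cohort counts."""
    r_e = exp_cases / (exp_cases + exp_non)        # risk in exposed
    r_u = unexp_cases / (unexp_cases + unexp_non)  # risk in unexposed
    ar = r_e - r_u
    return {"Re": r_e, "Ru": r_u, "RR": r_e / r_u, "AR": ar, "AF": ar / r_e}


def population_measures(r_e, r_u, prevalence):
    """PAR and PAF for a given exposure prevalence in the population."""
    r_total = prevalence * r_e + (1 - prevalence) * r_u  # overall risk
    par = r_total - r_u
    return {"PAR": par, "PAF": par / r_total}


# Counts copied from the P6 table.
lung = cohort_measures(65, 1935, 17, 7983)
chd = cohort_measures(175, 1825, 389, 7611)

# Smoking prevalence of 15% as stated in Q2.
lung_pop = population_measures(lung["Re"], lung["Ru"], prevalence=0.15)

for label, m in (("Lung cancer", lung), ("CHD", chd)):
    print(f"{label}: RR = {m['RR']:.2f}, AR = {m['AR']:.4f}, AF = {m['AF']:.3f}")
print(f"Lung cancer: PAR = {lung_pop['PAR']:.5f}, PAF = {lung_pop['PAF']:.3f}")
```

Calling `population_measures` with different `prevalence` values is a quick way to explore how the PAF moves relative to the AF as exposure becomes more or less common.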
P7. The following table provides descriptions of four different studies and how they were sampled to estimate the association between Exposure X and Disease Y. Answer the following questions based on the descriptions.
| Study | Description of Sample |
| A | Selected a random sample of residents from the Orange County population (n=5000 participants) |
| B | Selected all participants with Disease Y and a subset of participants without Disease Y from Study A (n=500 participants) |
| C | Selected all participants without Disease Y from Study A to be followed for five years (n=4750 participants) |
| D | Selected all participants who developed Disease Y and a subset of participants who did not develop Disease Y from Study C (n=400 participants) |
Q1. Identify which study (A, B, C, D) corresponds to which study design in the two-dimensional classification system discussed in class.
Q2. What measure of morbidity can only be estimated from Study A, and why?
Q3. What is the primary advantage of over-sampling participants on the disease outcome, as in Study B compared with Study A?
P8. Explain why cross-sectional (prevalence) studies that utilize probability sampling tend to have great external validity but poor internal validity.
P9. An epidemiologist tested the association between use of NSAIDs (anti-inflammatory medications) and colorectal cancer based on a hospital case-control study in which cases were admitted for colorectal cancer and controls were admitted for arthritis.
Q1. Does selecting cases and controls from the same hospital ensure that study participants came from the same source population? Why/why not?
Q2. What fundamental principle was violated in selecting controls with arthritis, and how did this violation affect the measure of association between use of NSAIDs and colorectal cancer?