Explain the Mantel–Haenszel method in detail and its need in community medicine

I now have comprehensive material. Here is a detailed explanation:

Mantel-Haenszel Method: A Detailed Explanation


1. Background and Why It Was Developed

In community medicine and epidemiology, a core challenge is confounding - a situation where the observed association between an exposure and an outcome is distorted by a third variable (the confounder), which is independently associated with both the exposure and the outcome.
For example, if you study the association between obesity and cardiovascular disease (CVD), age can be a confounder - older people are both more likely to be obese and more likely to have CVD. If you ignore age, your crude odds ratio will be inflated beyond the true association.
The Mantel-Haenszel (MH) method, published by Nathan Mantel and William Haenszel in 1959, provides a statistically rigorous way to produce an adjusted (pooled) summary measure of association after controlling for one or more confounders through stratification. The original paper dealt with the odds ratio (OR); the same weighting idea was later extended to the relative risk (RR) and the rate ratio.

2. What Is Stratification?

Stratification means dividing the study population into homogeneous subgroups (strata) based on the levels of the confounding variable. Within each stratum, the confounder either does not vary or varies very little, so the association measured within each stratum is largely free of confounding by that variable.
For example, to control for age:
  • Stratum 1: Age < 40
  • Stratum 2: Age >= 40
You compute a separate 2x2 table and OR/RR for each stratum, then combine them using the MH formula.

3. The Cochran-Mantel-Haenszel Framework: The 2 x 2 x K Table

The MH method works with a 2 x 2 x K table structure:
  • 2 rows: Exposure (Yes/No)
  • 2 columns: Outcome (Yes/No)
  • K layers: The K strata of the confounding variable
Notation for each stratum i:
| | Outcome + | Outcome - | Total |
|---|---|---|---|
| Exposed | a_i | b_i | m1_i |
| Unexposed | c_i | d_i | m0_i |
| Total | n1_i | n0_i | n_i |

4. The MH Formula

For Odds Ratio (Case-Control Studies):

$$OR_{MH} = \frac{\sum_i \frac{a_i d_i}{n_i}}{\sum_i \frac{b_i c_i}{n_i}}$$
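Each stratum contributes the products a_i d_i and b_i c_i, each divided by the stratum total n_i. Equivalently, OR_MH is a weighted average of the stratum-specific odds ratios with weights b_i c_i / n_i; because b_i c_i grows faster than n_i, larger and better-balanced strata contribute more to the pooled estimate.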
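The weighting can be made explicit with a short derivation. This is a sketch that assumes every stratum has b_i c_i > 0; the symbol w_i is introduced here for illustration only. Writing the stratum-specific odds ratio as OR_i = a_i d_i / (b_i c_i):

$$OR_{MH} = \frac{\sum_i \frac{a_i d_i}{n_i}}{\sum_i \frac{b_i c_i}{n_i}} = \frac{\sum_i \frac{b_i c_i}{n_i} \cdot \frac{a_i d_i}{b_i c_i}}{\sum_i \frac{b_i c_i}{n_i}} = \frac{\sum_i w_i \, OR_i}{\sum_i w_i}, \qquad w_i = \frac{b_i c_i}{n_i}$$

So the pooled value always lies between the smallest and largest stratum-specific OR, and a stratum's influence depends on b_i c_i / n_i rather than on 1/n_i alone.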

For Relative Risk / Risk Ratio (Cohort Studies):

$$RR_{MH} = \frac{\sum_i \frac{a_i \cdot m0_i}{n_i}}{\sum_i \frac{c_i \cdot m1_i}{n_i}}$$

For Rate Ratio (incidence rate data):

$$IRR_{MH} = \frac{\sum_i \frac{a_i \cdot T_{0i}}{T_i}}{\sum_i \frac{c_i \cdot T_{1i}}{T_i}}$$
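where a_i and c_i are the event counts among the exposed and unexposed, T_{1i} and T_{0i} are the corresponding person-time totals, and T_i = T_{1i} + T_{0i} is the total person-time in stratum i.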
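As a rough illustration of the rate-ratio formula, the sketch below pools two strata of person-time data. The counts are hypothetical and chosen so that each stratum has a rate ratio of 2.0; the function name `mh_rate_ratio` is an assumption for this example, not a library call.

```python
def mh_rate_ratio(strata):
    """Pooled Mantel-Haenszel incidence rate ratio.

    Each stratum is (a, t1, c, t0): events and person-time among the
    exposed, then events and person-time among the unexposed.
    """
    num = sum(a * t0 / (t1 + t0) for a, t1, c, t0 in strata)
    den = sum(c * t1 / (t1 + t0) for a, t1, c, t0 in strata)
    return num / den


# Hypothetical data (events, person-years), not taken from any study.
strata = [
    (20, 1000.0, 20, 2000.0),  # younger stratum: 0.020 vs 0.010 per year
    (60, 2000.0, 15, 1000.0),  # older stratum:   0.030 vs 0.015 per year
]

# Crude rate ratio: collapse the strata and ignore the stratifying variable.
exp_events = sum(s[0] for s in strata)
exp_time = sum(s[1] for s in strata)
unexp_events = sum(s[2] for s in strata)
unexp_time = sum(s[3] for s in strata)
crude_irr = (exp_events / exp_time) / (unexp_events / unexp_time)

irr_mh = mh_rate_ratio(strata)
print(f"crude IRR = {crude_irr:.2f}, MH IRR = {irr_mh:.2f}")
```

With these made-up numbers the crude ratio is about 2.29 while the pooled MH ratio is 2.0, matching the within-stratum value.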

5. Step-by-Step Application

Step 1: Calculate the Crude (Unadjusted) OR/RR

From the overall 2x2 table (ignoring the confounder):
OR = (a × d) / (b × c)

Step 2: Stratify by the suspected confounder

Create a separate 2x2 table for each stratum of the confounder.

Step 3: Calculate Stratum-Specific OR/RR

Compute OR or RR within each stratum. If the stratum-specific values are similar to each other but differ markedly from the crude estimate, confounding is present.

Step 4: Check for Homogeneity of Effect Across Strata

Before pooling, assess whether the stratum-specific measures are similar (homogeneous). This is the homogeneity assumption - you can use the Breslow-Day test to formally test this.
  • If stratum-specific estimates are similar → pool them with MH; confounding is then judged by comparing the pooled estimate with the crude one (Step 6)
  • If stratum-specific estimates are very different → do NOT pool; this is effect modification (interaction), and you should report stratum-specific results separately

Step 5: Calculate the MH Pooled OR/RR

Apply the MH formula to get the adjusted, unconfounded estimate.

Step 6: Assess Magnitude of Confounding

$$\%\ \text{Confounding} = \frac{OR_{crude} - OR_{adjusted}}{OR_{crude}} \times 100$$
A difference of >10-15% is generally considered meaningful confounding.

Step 7: MH Chi-Square Test

The MH chi-square statistic tests the null hypothesis that there is no association between exposure and outcome after adjusting for the confounder:
$$\chi^2_{MH} = \frac{\left[\sum_i \left(a_i - E(a_i)\right)\right]^2}{\sum_i Var(a_i)}$$
This is the Cochran-Mantel-Haenszel test of conditional independence.

6. Worked Example

Study: Association between obesity and CVD, with age as a potential confounder.
Crude data:
  • OR (unadjusted) = (46 × 640) / (254 × 60) = 1.93
After stratifying by age (young vs. old):
| Stratum | Stratum OR |
|---|---|
| Age < 40 | ~1.50 |
| Age ≥ 40 | ~1.52 |
Both stratum-specific ORs are similar to each other (~1.5) but lower than the crude OR (1.93). This pattern indicates positive confounding by age (age inflated the crude estimate).
After applying MH formula:
  • OR (MH adjusted) = ~1.52
Magnitude of confounding = (1.93 - 1.52) / 1.93 × 100 = ~21%

7. Key Assumptions of the MH Method

  1. Homogeneity of effect: The association measure (OR or RR) must be approximately the same across all strata. If it is not, MH pooling is inappropriate - you are dealing with effect modification, not confounding.
  2. No residual confounding within strata: Within each stratum, the confounder is assumed to be sufficiently controlled (subjects are homogeneous for the confounder).
  3. Adequate data: The MH pooled estimate tolerates small individual strata reasonably well, but the chi-square approximation needs adequate overall counts, and the Breslow-Day test may be underpowered or unreliable when strata are sparse.

8. Confounding vs. Effect Modification: A Critical Distinction

| Feature | Confounding | Effect Modification |
|---|---|---|
| Stratum-specific estimates | Similar to each other | Different from each other |
| Action | Pool using MH | Report separately |
| Crude vs. adjusted | Crude differs from adjusted | Crude is misleading |
| Biological meaning | Bias to remove | Real biological phenomenon |

9. Uses in Community Medicine

The MH method has wide applications across community medicine and public health:

a) Controlling Confounding in Observational Studies

  • Case-control studies: Adjusting OR for age, sex, socioeconomic status
  • Cohort studies: Adjusting RR or rate ratios for known confounders
  • Example: Studying smoking and lung cancer while controlling for age and occupation

b) Multi-centre / Multi-site Studies

  • When a study is conducted across multiple clinics or hospitals, "site" can confound results (sicker patients may be concentrated at one site)
  • MH pools results across sites to give a single unconfounded summary estimate
  • The pooling logic is the same as in Mantel and Haenszel's 1959 paper: combine 2x2 tables across strata into one summary estimate

c) Meta-Analysis

  • MH is the foundation for pooling 2x2 data across multiple independent studies
  • In systematic reviews, MH weighting gives a pooled OR/RR across studies and is a common alternative to inverse-variance weighting, particularly when event counts are small
  • Used in Cochrane reviews and other evidence syntheses

d) Survival Analysis - Mantel-Haenszel Log-Rank Test

  • The MH principle extends to survival data via the log-rank test (also called Mantel-Haenszel life table test)
  • Compares survival curves between groups at each event time point
  • Widely used to compare treatment arms in clinical trials and cohort studies

e) Programme Evaluation

  • In health programme evaluation, MH helps assess whether an intervention effect holds across subgroups (rural vs. urban, different age groups)
  • Adjusts for baseline differences between intervention and control populations

f) Disease Surveillance and Outbreak Investigation

  • In field epidemiology, MH is used to adjust for age/sex when comparing attack rates across groups
  • Helps identify the true exposure-disease association when demographic confounders exist

g) Vaccine Effectiveness Studies

  • Adjusting vaccine effectiveness estimates for age, comorbidities, healthcare-seeking behaviour

h) Environmental Epidemiology

  • Studying occupational exposures while controlling for smoking status, age, duration of employment

10. Advantages of the MH Method

| Advantage | Detail |
|---|---|
| Simple and intuitive | No complex software needed for basic calculations |
| Non-parametric | Does not assume a distribution for the data |
| Transparent | Each stratum's contribution is explicit and can be examined |
| Handles sparse data better | More robust than logistic regression with small samples |
| Widely accepted | Gold standard for stratified analysis in epidemiology |
| Works for multiple study designs | Case-control, cohort, cross-sectional, survival |

11. Limitations

  • Limited to one or two confounders at a time (stratification becomes impractical with many confounders - you run into sparse cell counts)
  • Cannot handle continuous confounders directly (must be categorized, which introduces residual confounding within categories)
  • Homogeneity assumption may not hold - requires checking before pooling
  • Sparse data in strata reduces precision
  • For multiple confounders simultaneously, multivariable logistic regression or Poisson regression is preferred

12. MH vs. Multivariable Regression

| Feature | Mantel-Haenszel | Multivariable Regression |
|---|---|---|
| Number of confounders | 1-2 | Many |
| Model assumptions | Fewer | Distributional assumptions |
| Interpretability | High (transparent) | Moderate |
| Sparse data | Better | May be unstable |
| Standard use | Teaching, field epi | Complex analyses |
In practice, both methods are complementary. MH stratified analysis is often used first to check for confounding and effect modification, and then multivariable regression is used for the final adjusted analysis.

Summary

The Mantel-Haenszel method is one of the cornerstone techniques in community medicine and epidemiology. It provides a statistically valid way to calculate a weighted, pooled summary measure of association (OR or RR) that is adjusted for confounding, by combining stratum-specific 2x2 tables. It distinguishes confounding from effect modification, underpins meta-analytic methods, and is applicable across case-control studies, cohort studies, multi-site trials, and outbreak investigations. Its conceptual clarity makes it an essential teaching tool and practical analytical method in public health practice.
References:
  • Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst. 1959.
  • Creasy & Resnik's Maternal-Fetal Medicine, Chapter 15 (Stratified Analysis)
  • Kaplan & Sadock's Comprehensive Textbook of Psychiatry (Cochran-Mantel-Haenszel statistic)
  • Stratification for Confounding - The MH Formula - Nephron Clinical Practice

How do I calculate the Mantel-Haenszel estimate manually?

Here is a complete manual calculation guide with worked examples from scratch.

Manual Calculation of the Mantel-Haenszel Method


The Core Idea

You have data from a study with a confounder. You split the data into strata (subgroups) based on that confounder, compute a weighted contribution from each stratum, then pool them.

PART 1: MH Odds Ratio (for Case-Control Studies)

The Formula

$$OR_{MH} = \frac{\sum_i \frac{a_i \cdot d_i}{n_i}}{\sum_i \frac{b_i \cdot c_i}{n_i}}$$

Standard 2x2 Table Notation (for each stratum i)

| | Disease + | Disease - | Total |
|---|---|---|---|
| Exposed | a | b | a+b |
| Unexposed | c | d | c+d |
| Total | a+c | b+d | n |

Worked Example: Smoking and Lung Cancer (Controlled for Age)

Scenario: You want to know if smoking causes lung cancer, but age is a confounder. You split the data into two age strata.

Step 1: Write Out the Crude (Unstratified) Table

| | Cancer + | Cancer - | Total |
|---|---|---|---|
| Smoker | 80 | 40 | 120 |
| Non-smoker | 20 | 60 | 80 |
| Total | 100 | 100 | 200 |
Crude OR = (a × d) / (b × c)
$$OR_{crude} = \frac{80 \times 60}{40 \times 20} = \frac{4800}{800} = \mathbf{6.0}$$

Step 2: Stratify by Age

Stratum 1: Age < 50

| | Cancer + | Cancer - | Total |
|---|---|---|---|
| Smoker | a₁ = 10 | b₁ = 20 | 30 |
| Non-smoker | c₁ = 5 | d₁ = 35 | 40 |
| Total | 15 | 55 | n₁ = 70 |
OR₁ = (10 × 35) / (20 × 5) = 350 / 100 = 3.5

Stratum 2: Age ≥ 50

| | Cancer + | Cancer - | Total |
|---|---|---|---|
| Smoker | a₂ = 70 | b₂ = 20 | 90 |
| Non-smoker | c₂ = 15 | d₂ = 25 | 40 |
| Total | 85 | 45 | n₂ = 130 |
OR₂ = (70 × 25) / (20 × 15) = 1750 / 300 = 5.83

Step 3: Check Homogeneity

OR₁ = 3.5 and OR₂ = 5.83 are in the same direction and roughly similar (not wildly different). Crude OR = 6.0 is higher than both stratum-specific ORs. This suggests positive confounding by age - proceed with MH pooling.
Rule of thumb: If stratum-specific ORs are very different (e.g., one is 0.5 and another is 4.0), you have effect modification, and should NOT pool - report them separately.
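The eyeball check can be backed by a formal homogeneity test. The sketch below uses Woolf's test, swapped in here as a simpler stand-in for the Breslow-Day test mentioned earlier (it is not the Breslow-Day statistic). It assumes no cell is zero, and the function name `woolf_homogeneity` is hypothetical.

```python
import math


def woolf_homogeneity(strata):
    """Woolf's chi-square test that the stratum odds ratios are equal.

    Each stratum is (a, b, c, d); all cells must be non-zero.
    Returns (statistic, degrees_of_freedom).
    """
    log_ors = [math.log((a * d) / (b * c)) for a, b, c, d in strata]
    # Inverse-variance weights for each stratum's log odds ratio.
    weights = [1.0 / (1 / a + 1 / b + 1 / c + 1 / d) for a, b, c, d in strata]
    pooled = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
    stat = sum(w * (l - pooled) ** 2 for w, l in zip(weights, log_ors))
    return stat, len(strata) - 1


# Smoking / lung cancer strata from the worked example: (a, b, c, d).
strata = [(10, 20, 5, 35), (70, 20, 15, 25)]
stat, df = woolf_homogeneity(strata)

# With 1 degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(stat / 2))
print(f"Woolf X2 = {stat:.2f}, df = {df}, p = {p_value:.2f}")
```

For these two strata the statistic is small (roughly 0.5 on 1 degree of freedom), so the data give no strong evidence against a common odds ratio, which supports pooling.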

Step 4: Calculate the MH Numerator

For each stratum, calculate: a_i × d_i / n_i
| Stratum | a | d | n | (a × d) / n |
|---|---|---|---|---|
| 1 (Age < 50) | 10 | 35 | 70 | (10 × 35) / 70 = 350/70 = 5.00 |
| 2 (Age ≥ 50) | 70 | 25 | 130 | (70 × 25) / 130 = 1750/130 = 13.46 |
Sum of Numerator = 5.00 + 13.46 = 18.46

Step 5: Calculate the MH Denominator

For each stratum, calculate: b_i × c_i / n_i
| Stratum | b | c | n | (b × c) / n |
|---|---|---|---|---|
| 1 (Age < 50) | 20 | 5 | 70 | (20 × 5) / 70 = 100/70 = 1.43 |
| 2 (Age ≥ 50) | 20 | 15 | 130 | (20 × 15) / 130 = 300/130 = 2.31 |
Sum of Denominator = 1.43 + 2.31 = 3.74

Step 6: Calculate OR (MH)

$$OR_{MH} = \frac{18.46}{3.74} = \mathbf{4.94}$$

Step 7: Assess Confounding

$$\%\ \text{Confounding} = \frac{OR_{crude} - OR_{MH}}{OR_{crude}} \times 100 = \frac{6.0 - 4.94}{6.0} \times 100 \approx \mathbf{17.7\%}$$
Age inflated the crude OR by roughly 18%. The age-adjusted association between smoking and lung cancer is OR ≈ 4.94, not 6.0.
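The hand calculation in Steps 1 to 7 can be cross-checked with a few lines of Python. This is a minimal sketch using the stratum counts above; the function names are illustrative, and the code carries full precision instead of rounding the intermediate sums, so the last decimal place can differ slightly from a hand calculation.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)


def mh_odds_ratio(strata):
    """Pooled Mantel-Haenszel odds ratio over (a, b, c, d) strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# (a, b, c, d) for Age < 50 and Age >= 50.
strata = [(10, 20, 5, 35), (70, 20, 15, 25)]

# Collapsing the strata cell by cell gives the crude table (80, 40, 20, 60).
crude_cells = [sum(cell) for cell in zip(*strata)]
crude_or = odds_ratio(*crude_cells)

stratum_ors = [odds_ratio(*s) for s in strata]
or_mh = mh_odds_ratio(strata)
pct_confounding = (crude_or - or_mh) / crude_or * 100

print(f"crude OR = {crude_or:.2f}")
print("stratum ORs =", [round(x, 2) for x in stratum_ors])
print(f"MH OR = {or_mh:.2f}, confounding = {pct_confounding:.1f}%")
```

At full precision the pooled OR is about 4.94 and the confounding is about 17.6%.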

PART 2: MH Relative Risk (for Cohort Studies)

The Formula

$$RR_{MH} = \frac{\sum_i \frac{a_i \cdot (c_i + d_i)}{n_i}}{\sum_i \frac{c_i \cdot (a_i + b_i)}{n_i}}$$
Here:
  • (a_i + b_i) = total exposed in stratum i = m1_i
  • (c_i + d_i) = total unexposed in stratum i = m0_i

Worked Example: Obesity and Diabetes (Controlled for Sex)

Stratum 1: Males

| | Diabetes + | Diabetes - | Total |
|---|---|---|---|
| Obese | a₁ = 30 | b₁ = 70 | m1₁ = 100 |
| Non-obese | c₁ = 10 | d₁ = 90 | m0₁ = 100 |
| Total | 40 | 160 | n₁ = 200 |
RR₁ = (30/100) / (10/100) = 0.30 / 0.10 = 3.0

Stratum 2: Females

| | Diabetes + | Diabetes - | Total |
|---|---|---|---|
| Obese | a₂ = 20 | b₂ = 80 | m1₂ = 100 |
| Non-obese | c₂ = 5 | d₂ = 95 | m0₂ = 100 |
| Total | 25 | 175 | n₂ = 200 |
RR₂ = (20/100) / (5/100) = 0.20 / 0.05 = 4.0

MH Numerator: Σ [a_i × m0_i / n_i]

| Stratum | a | m0 | n | (a × m0) / n |
|---|---|---|---|---|
| Males | 30 | 100 | 200 | 3000/200 = 15.0 |
| Females | 20 | 100 | 200 | 2000/200 = 10.0 |
Sum = 25.0

MH Denominator: Σ [c_i × m1_i / n_i]

| Stratum | c | m1 | n | (c × m1) / n |
|---|---|---|---|---|
| Males | 10 | 100 | 200 | 1000/200 = 5.0 |
| Females | 5 | 100 | 200 | 500/200 = 2.5 |
Sum = 7.5

RR (MH)

$$RR_{MH} = \frac{25.0}{7.5} = \mathbf{3.33}$$
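This is a weighted average of RR₁ (3.0) and RR₂ (4.0), adjusted for sex. Because obesity is equally common in both sexes in this example (100 of 200 in each stratum), the crude RR is also (50/200) / (15/200) = 3.33, so sex does not confound the association here.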
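The same arithmetic for the risk ratio, as a minimal sketch using the obesity and diabetes counts above (the function name `mh_risk_ratio` is illustrative):

```python
def mh_risk_ratio(strata):
    """Pooled Mantel-Haenszel risk ratio over (a, b, c, d) strata."""
    num = den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * (c + d) / n  # a_i * m0_i / n_i
        den += c * (a + b) / n  # c_i * m1_i / n_i
    return num / den


# (a, b, c, d) = (obese with diabetes, obese without,
#                 non-obese with diabetes, non-obese without)
strata = {
    "males": (30, 70, 10, 90),
    "females": (20, 80, 5, 95),
}

# Risk ratio within each stratum: (a / m1) / (c / m0).
stratum_rr = {
    name: (a / (a + b)) / (c / (c + d))
    for name, (a, b, c, d) in strata.items()
}
rr_mh = mh_risk_ratio(strata.values())

for name, rr in stratum_rr.items():
    print(f"RR ({name}) = {rr:.2f}")
print(f"MH RR = {rr_mh:.2f}")
```

The pooled value lands between the two stratum-specific risk ratios, as a weighted average should.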

PART 3: MH Chi-Square Test (Testing Significance After Adjustment)

This tests: Is there still a statistically significant association after controlling for the confounder?

Formula

$$\chi^2_{MH} = \frac{\left[\left|\sum_i (a_i - E_i)\right| - 0.5\right]^2}{\sum_i V_i}$$
Where:
  • E_i (expected value of a_i) = (row₁ total × col₁ total) / n_i = (a_i + b_i)(a_i + c_i) / n_i
  • V_i (variance) = (a_i+b_i)(c_i+d_i)(a_i+c_i)(b_i+d_i) / [n_i² × (n_i - 1)]
  • The 0.5 is a Yates' continuity correction (optional, conservative)

Using the Smoking/Cancer Example:

Stratum 1 (n₁ = 70):

E₁ = (30 × 15) / 70 = 450/70 = 6.43
V₁ = (30 × 40 × 15 × 55) / (70² × 69) = 990,000 / 338,100 = 2.928

Stratum 2 (n₂ = 130):

E₂ = (90 × 85) / 130 = 7650/130 = 58.85
V₂ = (90 × 40 × 85 × 45) / (130² × 129) = 13,770,000 / 2,180,100 = 6.316

Chi-Square:

| | Stratum 1 | Stratum 2 |
|---|---|---|
| Observed a_i | 10 | 70 |
| Expected E_i | 6.43 | 58.85 |
| (a - E) | +3.57 | +11.15 |
$$\sum(a_i - E_i) = 3.57 + 11.15 = 14.72$$
$$\sum V_i = 2.928 + 6.316 = 9.244$$
$$\chi^2_{MH} = \frac{(14.72 - 0.5)^2}{9.244} = \frac{(14.22)^2}{9.244} = \frac{202.21}{9.244} = \mathbf{21.9}$$
At 1 degree of freedom, 21.9 far exceeds both 3.84 (the critical value for p = 0.05) and 10.83 (the critical value for p = 0.001), so p < 0.001 - the adjusted association is highly statistically significant.
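A short sketch of the same test in Python, useful for checking the hand arithmetic. It assumes the (a, b, c, d) cell layout used above; the function name `cmh_chi_square` is illustrative. The p-value line relies on the fact that, for 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).

```python
import math


def cmh_chi_square(strata, continuity=True):
    """Mantel-Haenszel chi-square (1 df) over (a, b, c, d) strata."""
    sum_diff = 0.0
    sum_var = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        m1, m0 = a + b, c + d  # row totals: exposed, unexposed
        n1, n0 = a + c, b + d  # column totals: outcome +, outcome -
        sum_diff += a - m1 * n1 / n                       # a_i - E_i
        sum_var += m1 * m0 * n1 * n0 / (n ** 2 * (n - 1))  # V_i
    correction = 0.5 if continuity else 0.0
    return (abs(sum_diff) - correction) ** 2 / sum_var


# Smoking / lung cancer strata: Age < 50 and Age >= 50.
strata = [(10, 20, 5, 35), (70, 20, 15, 25)]
chi2 = cmh_chi_square(strata)
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi-square = {chi2:.1f}, p = {p_value:.2g}")
```

Passing `continuity=False` drops the 0.5 correction and gives a slightly larger statistic.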

PART 4: Summary Table of All Calculations

| Step | What You Calculate | Formula |
|---|---|---|
| 1 | Crude OR or RR | ad/bc or (a/m1)/(c/m0) |
| 2 | Stratum-specific OR or RR | Same formula within each stratum |
| 3 | MH Numerator | Σ (a_i × d_i / n_i) for OR; Σ (a_i × m0_i / n_i) for RR |
| 4 | MH Denominator | Σ (b_i × c_i / n_i) for OR; Σ (c_i × m1_i / n_i) for RR |
| 5 | Pooled MH estimate | Numerator / Denominator |
| 6 | % Confounding | (Crude - Adjusted) / Crude × 100 |
| 7 | MH Chi-Square | [abs(Σ(a - E)) - 0.5]² / Σ V (abs = absolute value) |

PART 5: Decision Rule After Getting Results

Crude OR/RR vs. Stratum-Specific OR/RR
        |
        |-- Stratum-specific are SIMILAR to each other?
              |
              YES → Check if crude ≠ adjusted by >10-15%?
                        |
                        YES → CONFOUNDING present → Report MH adjusted value
                        NO  → No meaningful confounding → Report crude
              |
              NO  → EFFECT MODIFICATION present → Report strata separately, do NOT pool
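The decision tree can be written as a small helper. This is only a sketch: the function name, the `homogeneous` flag (which the analyst must supply from inspection or a formal homogeneity test), and the 10% default threshold are assumptions made for illustration.

```python
def interpret_stratified(crude, adjusted, homogeneous, threshold_pct=10.0):
    """Apply the decision rule to a crude and an MH-adjusted estimate.

    homogeneous: True if the stratum-specific estimates are judged similar.
    threshold_pct: change from crude to adjusted treated as meaningful.
    """
    if not homogeneous:
        return "effect modification: report stratum-specific estimates"
    pct_change = abs(crude - adjusted) / crude * 100
    if pct_change > threshold_pct:
        return "confounding: report the MH-adjusted estimate"
    return "no meaningful confounding: report the crude estimate"


# Smoking example: crude OR 6.0, MH OR about 4.94, strata judged similar.
print(interpret_stratified(6.0, 4.94, homogeneous=True))
# Obesity example: crude RR and MH RR both about 3.33.
print(interpret_stratified(3.33, 3.33, homogeneous=True))
```

The threshold is a convention, not a statistical test, so borderline cases still call for judgement.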

Quick Cheat Sheet for the Exam

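| Parameter | Formula |
|---|---|
| OR_MH | Σ(a_i d_i / n_i) ÷ Σ(b_i c_i / n_i) |
| RR_MH | Σ(a_i m0_i / n_i) ÷ Σ(c_i m1_i / n_i) |
| E(a_i) | m1_i × n1_i / n_i |
| V(a_i) | m1_i × m0_i × n1_i × n0_i / (n_i² × (n_i-1)) |
| χ²_MH | [abs(Σ(a_i - E_i)) - 0.5]² / Σ V_i (abs = absolute value) |
| % Confounding | (Crude - Adjusted) / Crude × 100 |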
The key insight: the MH estimate is a weighted average of the stratum-specific estimates, with weights b_i c_i / n_i for the OR and c_i m1_i / n_i for the RR, so larger strata get more weight. That simplicity is what makes it practical to compute by hand.