Here is a thorough explanation of cumulative incidence:
Cumulative Incidence
Definition
Cumulative incidence (also called incidence proportion, or attack rate in outbreak settings) is the proportion of an initially disease-free, at-risk population that develops the outcome of interest over a defined time period.
"The numerator would be the number of subjects who developed the disease during follow-up and the denominator would be the total number of subjects at the beginning of the study."
- Scott-Brown's Otorhinolaryngology Head & Neck Surgery
Formula
$$\text{Cumulative Incidence} = \frac{\text{Number of new cases during time period}}{\text{Population at risk at the start of the period}}$$
- Result: A dimensionless proportion (0 to 1, or expressed as %)
- Always tied to a specific time window (e.g., "1-year cumulative incidence of 5%")
Example: 1,000 smokers followed for 5 years; 15 develop head and neck cancer.
- Cumulative incidence = 15/1,000 = 1.5% over 5 years
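The arithmetic above can be sketched in a few lines of Python. The function name `cumulative_incidence` and its argument names are illustrative choices, not taken from any library:

```python
def cumulative_incidence(new_cases: int, at_risk_at_start: int) -> float:
    """Proportion of the initially at-risk population that develops the outcome."""
    if at_risk_at_start <= 0:
        raise ValueError("at_risk_at_start must be positive")
    if not 0 <= new_cases <= at_risk_at_start:
        raise ValueError("new_cases must lie between 0 and at_risk_at_start")
    return new_cases / at_risk_at_start


# Figures from the example: 1,000 smokers, 15 new cases over 5 years.
risk_5yr = cumulative_incidence(new_cases=15, at_risk_at_start=1000)
print(f"5-year cumulative incidence: {risk_5yr:.1%}")
```

The result only makes sense alongside its time window, so it helps to carry the period in the variable name or the label, as done here.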
Key Characteristics
| Feature | Cumulative Incidence |
|---|---|
| Type | Proportion (dimensionless) |
| Time | Fixed observation period specified |
| Denominator | Persons at risk at start |
| Assumption | All subjects followed for full period |
| Range | 0 to 1 (or 0% to 100%) |
Relationship to Incidence Rate
The incidence rate (incidence density) uses person-time in the denominator instead of persons, accounting for variable follow-up durations (dropouts, late entries).
- Incidence rate = new cases / total person-time at risk (units: cases per person-year)
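- Same example: 15 cases / (5 years × 1,000 people) = 3 cases per 1,000 person-years, assuming every subject contributes the full 5 years. In practice, cases stop contributing person-time once they develop the disease, so the true rate is slightly higher.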
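As a minimal sketch of the person-time calculation, the snippet below reproduces the example figure and then applies the same function to a small cohort with unequal follow-up. The mini-cohort values are hypothetical and chosen only for illustration:

```python
def incidence_rate(new_cases: int, person_time: float) -> float:
    """New cases divided by total person-time at risk."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return new_cases / person_time


# Example from the text, assuming all 1,000 subjects contribute the full 5 years.
rate_full_follow_up = incidence_rate(15, 5 * 1000)

# Hypothetical mini-cohort: (years observed, developed the disease?).
# Follow-up ends at diagnosis, dropout, or the end of the study.
cohort = [(5.0, False), (5.0, False), (2.5, True), (1.0, False), (4.0, True)]
person_years = sum(years for years, _ in cohort)
cases = sum(1 for _, is_case in cohort if is_case)
rate_variable_follow_up = incidence_rate(cases, person_years)

print(f"Full follow-up: {rate_full_follow_up * 1000:.1f} per 1,000 person-years")
print(f"Variable follow-up: {cases} cases over {person_years} person-years")
```

Summing each subject's own observed time is what lets the rate handle dropouts and late entries, which a simple count of persons at baseline cannot do.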
When incidence is low, a useful approximation holds:
Cumulative incidence ≈ Incidence rate × Average follow-up time
This approximation breaks down when cumulative incidence is high (roughly above 10%) or follow-up is long, because the population still at risk shrinks as cases accumulate.
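To see where the approximation starts to drift, it can be compared with the exponential formula, risk = 1 − exp(−rate × time). That formula is not stated above; it is brought in here as an assumption and holds for a constant incidence rate in a closed cohort with no competing risks. The rates below, apart from the first, are hypothetical:

```python
import math


def risk_linear(rate: float, years: float) -> float:
    """Low-incidence approximation: rate multiplied by follow-up time."""
    return rate * years


def risk_exponential(rate: float, years: float) -> float:
    """Risk under a constant rate, closed cohort, no competing risks (assumed)."""
    return 1.0 - math.exp(-rate * years)


years = 5
# 0.003 per person-year matches the example; the larger rates are hypothetical.
for rate in (0.003, 0.03, 0.1):
    approx = risk_linear(rate, years)
    exact = risk_exponential(rate, years)
    print(f"rate={rate:.3f}/yr  linear={approx:.4f}  exponential={exact:.4f}")
```

The linear version always sits above the exponential one, and the gap widens as rate × time grows, which is the pattern the 10% rule of thumb guards against.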
Cumulative Incidence vs. Prevalence
| Feature | Cumulative Incidence | Prevalence |
|---|---|---|
| Measures | New cases only | All existing cases |
| Denominator | At-risk persons at start | Total population at a point in time |
| Time | Period (longitudinal) | Point or period |
| Use | Measuring disease risk | Measuring disease burden |
Competing Risks and the Cumulative Incidence Function (CIF)
In survival analysis, when subjects can experience multiple types of events (e.g., death from cancer vs. death from other causes), the complement of the Kaplan-Meier estimate (1 − KM) overestimates cumulative incidence. It treats competing events as censored observations, as though those subjects could still go on to experience the event of interest, which violates the independent censoring assumption.
The solution is the Cumulative Incidence Function (CIF), which accounts for competing risks:
$$\text{CIF}_k(t) = \int_0^t S(u^-) \, h_k(u) \, du$$
Where:
- S(u⁻) = overall survival probability just before time u
- h_k(u) = cause-specific hazard for the event of interest k
Key points about competing risks:
- The sum of all cause-specific CIFs cannot exceed 1
- Fine-Gray subdistribution hazard model is commonly used for regression with competing risks
- The CIF, rather than 1 − Kaplan-Meier, should be used to estimate the absolute risk of an event when competing events are present
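The integral has a simple discrete counterpart: at each event time, add S(t⁻) × (events of type k / number at risk), then update overall survival using events of any type. This nonparametric version is usually called the Aalen-Johansen estimator, a name not used above. The sketch below is a plain-Python illustration on hypothetical data (cause 0 = censored, 1 = event of interest, 2 = competing event), not a substitute for a tested survival library:

```python
from collections import Counter


def cumulative_incidence_function(times, causes, cause_of_interest):
    """Return [(time, CIF)] at each distinct event time.

    causes: 0 means censored; any other value is an event type.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    overall_survival = 1.0  # S(t-): event-free of ALL causes just before t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        counts = Counter()
        j = i
        while j < len(data) and data[j][0] == t:  # group tied times
            counts[data[j][1]] += 1
            j += 1
        d_interest = counts.get(cause_of_interest, 0)
        d_any = sum(n for cause, n in counts.items() if cause != 0)
        if d_any > 0:
            cif += overall_survival * d_interest / n_at_risk
            overall_survival *= 1.0 - d_any / n_at_risk
            curve.append((t, cif))
        n_at_risk -= j - i  # events and censored subjects leave the risk set
        i = j
    return curve


def one_minus_km(times, causes, cause_of_interest):
    """Naive estimate: recode competing events as censored, then take 1 - KM."""
    recoded = [c if c == cause_of_interest else 0 for c in causes]
    # With a single event type the estimator above reduces to 1 - Kaplan-Meier.
    return cumulative_incidence_function(times, recoded, cause_of_interest)


# Hypothetical data: 8 subjects, follow-up in years.
times = [1, 2, 3, 4, 5, 6, 7, 8]
causes = [1, 2, 1, 0, 2, 1, 2, 1]

cif_interest = cumulative_incidence_function(times, causes, 1)[-1][1]
cif_competing = cumulative_incidence_function(times, causes, 2)[-1][1]
naive_interest = one_minus_km(times, causes, 1)[-1][1]

print(f"CIF, event of interest: {cif_interest:.4f}")
print(f"CIF, competing event:   {cif_competing:.4f}")
print(f"1 - KM (naive):         {naive_interest:.4f}")
```

On this toy data the two CIFs add up to 1 while the naive 1 − KM figure for the event of interest is much larger, which illustrates both the sum constraint and the overestimation described above.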
Clinical Applications
- Cohort studies: Primary measure of disease risk over follow-up
- Outbreak investigations: Called the "attack rate" (e.g., 40% of attendees developed gastroenteritis)
- Clinical trials: Reporting event rates over trial duration
- Cancer epidemiology: Lifetime cumulative incidence (e.g., 1-in-8 lifetime risk of breast cancer)
- Survival analysis: Plotted as 1 - Kaplan-Meier curve when no competing risks exist
Common Pitfalls
- Omitting the time period - cumulative incidence is meaningless without specifying the time window
- Using K-M when competing risks exist - leads to overestimation; use CIF instead
- Confusing with incidence rate - incidence rate has units (per person-time); cumulative incidence is unitless
- Ignoring losses to follow-up - if substantial dropout occurs, person-time methods (incidence rate) are more appropriate