Here is a critical appraisal of the THRIVE-AA2 trial — what it did well and where genuine gaps remain.
Critical Appraisal of THRIVE-AA2
What the Trial Did Well
- Robust design: Randomized, double-blind, placebo-controlled with two active doses — this allows dose-response characterization, not just a binary yes/no answer
- Clinically meaningful primary endpoint: SALT ≤ 20 reflects ≥ 80% scalp coverage, which patients and clinicians both recognize as meaningful recovery
- Breadth of secondary endpoints: Including eyebrow/eyelash regrowth and PROs (DLQI, AAPPO, HADS) demonstrates awareness that AA is not purely a scalp disease and that psychological burden matters
- Companion trial design: Running THRIVE-AA2 in parallel with THRIVE-AA1 rather than as a sequential replication strengthens reproducibility claims considerably
Key Limitations and What Could Be Improved
1. Trial Duration is Too Short
24 weeks is sufficient to demonstrate acute efficacy, but AA is a lifelong, relapsing-remitting disease. Most responders are expected to relapse within weeks to months of stopping treatment, and the trial does not adequately characterize:
- Time to relapse after discontinuation
- Whether continuous therapy is required indefinitely
- Whether a maintenance strategy (dose reduction, intermittent dosing) is feasible
What would be better: A pre-specified withdrawal phase with structured follow-up to quantify relapse rates and time-to-relapse, similar to what baricitinib's BRAVE-AA extension provided.
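As a rough illustration of how time-to-relapse could be summarized in such a withdrawal phase, here is a minimal Kaplan-Meier sketch. The follow-up times and relapse flags are invented for illustration and are not THRIVE-AA2 data; `kaplan_meier` and `median_time` are hypothetical helper names.

```python
from collections import Counter

def kaplan_meier(weeks, relapsed):
    """Kaplan-Meier estimate of the probability of remaining relapse-free.

    weeks    -- follow-up time after withdrawal for each patient
    relapsed -- True if relapse was observed, False if censored
    Returns a list of (week, relapse_free_probability) at each relapse time.
    """
    relapses_at = Counter(t for t, e in zip(weeks, relapsed) if e)
    leaving_at = Counter(weeks)      # relapsed or censored at each time
    at_risk = len(weeks)
    surviving = 1.0
    curve = []
    for t in sorted(leaving_at):
        d = relapses_at.get(t, 0)
        if d:
            surviving *= 1 - d / at_risk
            curve.append((t, surviving))
        at_risk -= leaving_at[t]     # drop everyone whose follow-up ended at t
    return curve

def median_time(curve):
    """First time at which the relapse-free probability falls to 0.5 or below."""
    return next((t for t, s in curve if s <= 0.5), None)

# Hypothetical data: 8 responders followed for up to 24 weeks off treatment.
weeks = [4, 8, 8, 12, 16, 24, 24, 24]
relapsed = [True, True, True, False, True, False, False, False]

curve = kaplan_meier(weeks, relapsed)
for t, s in curve:
    print(f"week {t:>2}: {s:.3f} relapse-free")
print("median time to relapse (weeks):", median_time(curve))
```

In this toy data set the estimated median time to relapse is 16 weeks. Patients who reach the end of follow-up without relapsing are treated as censored rather than as non-relapsers, which is the reason to use this estimator instead of a simple proportion.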
2. Responder Rates Are Modest in Absolute Terms
Even at the higher dose (12 mg BID), roughly 38–41% reached SALT ≤ 20 at 24 weeks. That means the majority of patients — including those with the most severe disease — did not achieve the primary endpoint.
What would be better:
- Subgroup analyses powered to identify predictors of response (disease duration, baseline SALT, atopy comorbidity, prior treatment history, genetic markers like HLA subtypes) — these are often mentioned post-hoc but rarely prospectively powered
- Head-to-head data against baricitinib would help clinicians choose among the approved oral JAK inhibitors, but no such comparative trial exists
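To see why prospectively powering a subgroup comparison is demanding, here is a sketch of the standard normal-approximation sample-size formula for comparing two proportions. The subgroup response rates are hypothetical and chosen only to show the scaling; `n_per_group` is an illustrative helper, not something taken from the trial's statistical plan.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Patients needed in each subgroup to detect p1 vs p2 (two-sided test),
    using the normal-approximation formula for two independent proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

# Hypothetical subgroup response rates (NOT trial results), e.g. shorter
# versus longer disease duration within a single active arm.
n_large_gap = n_per_group(0.50, 0.30)   # 20-point difference
n_small_gap = n_per_group(0.45, 0.35)   # 10-point difference
print(n_large_gap, n_small_gap)
```

Halving the assumed difference roughly quadruples the required subgroup size, which is why subgroup findings from a trial sized for its primary endpoint are usually exploratory.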
3. The SALT Score Has Well-Known Limitations
SALT measures scalp surface area affected but does not capture:
- Hair density or quality (thin, vellus regrowth is scored the same as terminal hair)
- Cosmetic acceptability to the patient: a SALT of 25 may look very different depending on distribution
- The patient's own threshold for meaningful regrowth, which varies widely between individuals
What would be better: Incorporating standardized photography scoring, trichoscopy metrics, or hair pull test outcomes alongside SALT to validate that measured regrowth translates to cosmetically relevant outcomes.
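The point about distribution can be made concrete with the regional weighting that SALT uses (top of scalp 40%, back 24%, each side 18%). The sketch below is illustrative only: the two patients are invented, and `salt_score` is a hypothetical helper, not a validated scoring tool.

```python
# Regional weights of the SALT score, as percentages of total scalp area.
SALT_WEIGHTS = {"top": 40, "back": 24, "left": 18, "right": 18}

def salt_score(percent_loss_by_region):
    """Weighted sum of regional hair loss (each 0-100), giving SALT 0-100."""
    for region, loss in percent_loss_by_region.items():
        if region not in SALT_WEIGHTS:
            raise KeyError(f"unknown region: {region}")
        if not 0 <= loss <= 100:
            raise ValueError(f"loss out of range for {region}: {loss}")
    return sum(SALT_WEIGHTS[r] * percent_loss_by_region.get(r, 0)
               for r in SALT_WEIGHTS) / 100

# Two hypothetical patients with very different patterns of loss.
single_crown_patch = {"top": 62.5, "back": 0, "left": 0, "right": 0}
scattered_patches = {"top": 25, "back": 25, "left": 25, "right": 25}

print(salt_score(single_crown_patch), salt_score(scattered_patches))
```

Both patients score 25 and both would miss a SALT ≤ 20 threshold, yet one has a single large bald area on top of the head and the other has smaller patches spread across the whole scalp. The score alone cannot tell them apart.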
4. Pediatric and Adolescent Population Excluded
AA frequently begins in childhood or adolescence, and severe early-onset AA carries significant psychosocial consequences during critical developmental years. THRIVE-AA2 enrolled adults only (≥ 18 years).
What would be better: A dedicated pediatric cohort or age-stratified analysis down to at least age 12, particularly given that the FDA has required pediatric study commitments as part of JAK inhibitor approvals.
5. Long-Term Safety Evidence is Limited
The 24-week controlled period is too short to detect:
- Malignancy risk (requires years of follow-up)
- Cardiovascular events in higher-risk subgroups
- Cumulative infection burden, including herpes zoster reactivation
- Lipid trajectory over time
The open-label extension (OLE) provides some data, but uncontrolled OLE data are subject to survivor bias and confounding from dropout.
What would be better: A mandated long-term safety registry, similar to what has been implemented for biologics in psoriasis, with pre-specified annual safety reporting.
6. Narrow Eligibility Criteria Limit Generalizability
The inclusion window of current episode ≥ 6 months and ≤ 10 years and SALT ≥ 50 excludes:
- Patients with very chronic, treatment-refractory disease (> 10 years)
- Patients with moderate AA (SALT 25–49) who still have significant burden
- Those with major comorbidities commonly seen in real-world AA populations (e.g., thyroid disease, atopy)
What would be better: Broader eligibility with pre-specified subgroup analyses for chronic/refractory disease, and a separate arm or trial for moderate AA (SALT 25–49) where treatment decisions are currently entirely off-label.
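A small sketch of how the core criteria described in this appraisal (adults ≥ 18, SALT ≥ 50, current episode between 6 months and 10 years) filter a patient list. It deliberately ignores the rest of the protocol, and the patients and function names are hypothetical.

```python
def severity_band(salt):
    """Bands as used in this appraisal: moderate 25-49, severe >= 50."""
    if salt >= 50:
        return "severe"
    if salt >= 25:
        return "moderate"
    return "below moderate"

def meets_core_criteria(age_years, salt, episode_months):
    """Apply only the core criteria named above, not the full protocol."""
    return (age_years >= 18
            and salt >= 50
            and 6 <= episode_months <= 10 * 12)

# Hypothetical patients, for illustration only.
patients = [
    {"id": "A", "age_years": 34, "salt": 72, "episode_months": 30},
    {"id": "B", "age_years": 41, "salt": 38, "episode_months": 18},   # moderate AA
    {"id": "C", "age_years": 55, "salt": 90, "episode_months": 150},  # > 10 years
    {"id": "D", "age_years": 15, "salt": 65, "episode_months": 12},   # adolescent
]

eligible = [p["id"] for p in patients
            if meets_core_criteria(p["age_years"], p["salt"], p["episode_months"])]
print(eligible)
```

Only patient A passes. Patients B, C, and D each represent a group that clinicians see in practice but for whom the trial provides no direct evidence.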
7. Placebo Response Rate Was Very Low — Appropriately So, But Worth Noting
The ~2–4% placebo response reflects the natural history of severe AA, where spontaneous remission is uncommon. This makes the treatment effect look large, but it also means the trial was optimally designed to detect a signal rather than to reflect real-world mixed-severity populations.
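To put the size of that contrast in numbers, here is a worked sketch of the absolute risk difference and number needed to treat (NNT). The inputs are the rounded ranges quoted in this appraisal (roughly 38–41% on the higher dose, roughly 2–4% on placebo), not exact trial figures, and the function names are illustrative.

```python
def risk_difference(p_active, p_placebo):
    """Absolute difference in responder proportions."""
    return p_active - p_placebo

def number_needed_to_treat(p_active, p_placebo):
    """Reciprocal of the absolute risk difference."""
    diff = risk_difference(p_active, p_placebo)
    if diff <= 0:
        raise ValueError("active arm must outperform placebo")
    return 1 / diff

# Approximate ranges quoted in this appraisal (not exact trial figures).
conservative = number_needed_to_treat(0.38, 0.04)  # lowest active, highest placebo
favourable = number_needed_to_treat(0.41, 0.02)    # highest active, lowest placebo
print(f"NNT between {favourable:.1f} and {conservative:.1f}")
```

On these rounded inputs, roughly three patients would need to be treated with the higher dose for one additional patient to reach SALT ≤ 20 at week 24. That figure applies to the severe population enrolled and should not be assumed for milder disease, where placebo response may be higher.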
8. Biomarker and Mechanistic Data Are Absent from the Main Report
The paper is a clinical outcomes report. It does not address:
- Serum IFN-γ, CXCL9/10, or IL-15 levels as pharmacodynamic markers
- Whether JAK inhibition is associated with changes in the peribulbar immune infiltrate
- Predictive biomarkers for non-response
What would be better: Translational sub-studies linking clinical response to mechanistic endpoints would help understand why ~60% of patients do not achieve SALT ≤ 20, and whether those patients have a fundamentally different disease biology.
Summary Table
| Domain | Current Limitation | Suggested Improvement |
|---|---|---|
| Follow-up duration | 24 weeks only | Structured withdrawal + relapse data |
| Responder rates | ~40% at best | Biomarker-driven patient selection |
| Primary endpoint | SALT score alone | Add trichoscopy + cosmetic rating |
| Age range | Adults ≥ 18 only | Pediatric cohort |
| Long-term safety | OLE only | Mandatory registry |
| Eligibility breadth | Narrow | Include moderate AA, chronic disease |
| Mechanistic insight | None in main paper | Translational sub-studies |
| Comparator | Placebo only | Active comparator arm vs. baricitinib |
The trial is methodologically sound and met its regulatory purpose. The gaps above are not unique to THRIVE-AA2 — they reflect broader challenges in AA trial design — but they represent real unanswered questions that clinicians face when translating these results to individual patients.