Randomisation, parametric and non-parametric, test for statistical significance, p-value, null hypothesis, hypothesis, sampling - make a short note for each topic

Here are concise notes on each topic:

1. Randomisation

Randomisation is the process of assigning subjects to groups (e.g., treatment vs. control) by chance, so that each participant has a known probability (usually equal) of being placed in any group. It prevents selection bias in allocation and tends to balance confounding variables, known and unknown, across groups on average. Types include:

Simple randomisation - coin flip or random number table
Block randomisation - ensures balanced group sizes at regular intervals
Stratified randomisation - randomises within subgroups (strata) to balance key variables like age or sex
Cluster randomisation - entire groups (e.g., villages, clinics) are randomised rather than individuals

Randomisation is the hallmark of a well-designed RCT and strengthens internal validity.

2. Parametric Tests

Parametric tests assume the data follow a specific distribution (usually normal/Gaussian) and deal with population parameters (mean, variance). They require:

Data measured on interval or ratio scale
Approximate normal distribution
Homogeneity of variance (for group comparisons)

Common examples:

Test	Use
t-test (independent)	Compare means of two independent groups
t-test (paired)	Compare means before/after in same group
One-way ANOVA	Compare means of 3+ groups
Pearson's r	Correlation between two continuous variables

Parametric tests are generally more statistically powerful when assumptions are met.

3. Non-Parametric Tests

Non-parametric tests do not assume a specific form (such as normality) for the underlying population distribution. They are used when:

Data are ordinal, ranked, or not normally distributed
Sample size is small
Outliers are present

Common examples:

Test	Parametric Equivalent
Mann-Whitney U	Independent t-test
Wilcoxon signed-rank	Paired t-test
Kruskal-Wallis	One-way ANOVA
Spearman's rho	Pearson's r
Chi-square test	- (for categorical data)

They are less powerful than parametric tests when the parametric assumptions hold, but more broadly applicable.

4. Tests for Statistical Significance

A test of statistical significance determines whether an observed result (difference, association) is likely to be real or due to chance. The process:

State the null and alternative hypotheses
Choose an appropriate test (t-test, ANOVA, chi-square, etc.)
Calculate the test statistic
Compare to a critical value or compute the p-value
Reject or fail to reject the null hypothesis

The choice of test depends on the type of data, number of groups, and whether the data are paired or independent.

5. P-Value

The p-value is the probability of obtaining a result at least as extreme as the observed result, assuming the null hypothesis is true.

p < 0.05 - conventionally considered statistically significant (if H₀ were true, a result at least this extreme would occur less than 5% of the time)
p < 0.01 - highly significant
p > 0.05 - result is not statistically significant; insufficient evidence to reject H₀

Important caveats:

A small p-value does NOT mean the effect is large or clinically important
p-value does not measure the probability that H₀ is true
Significance threshold (alpha, α) must be set before the study, not after

6. Null Hypothesis (H₀)

The null hypothesis is the default assumption that there is no difference, no effect, or no association between variables in the population. It is what researchers attempt to disprove.

Example: "There is no difference in blood pressure between drug A and placebo groups."
It is never "proved" - only rejected or failed to be rejected
Rejecting H₀ when it is actually true = Type I error (false positive), controlled by alpha (α)
Failing to reject H₀ when it is false = Type II error (false negative), with probability beta (β)

The alternative hypothesis (H₁) is what you accept if H₀ is rejected - it states there IS an effect or difference.

7. Hypothesis

A statistical hypothesis is a formal statement about a population parameter that can be tested using sample data.

Directional (one-tailed): Specifies the direction of the effect - "Drug A lowers BP more than placebo"
Non-directional (two-tailed): States only that a difference exists - "There is a difference in BP between groups"

A good hypothesis is:

Testable and falsifiable
Based on prior knowledge or theory
Stated before data collection (a priori)

8. Sampling

Sampling is the process of selecting a subset of individuals from a population to estimate characteristics of the whole.

Types of sampling:

Type	Description
Simple random	Every individual has equal chance of selection
Systematic	Every nth individual from a list
Stratified	Population divided into strata, random sample from each
Cluster	Population divided into clusters; entire clusters selected
Convenience	Readily available individuals (prone to bias)
Purposive	Deliberately selected based on specific criteria

Key concepts:

Sampling error - difference between sample statistic and true population parameter
Sample size - larger samples reduce sampling error and increase statistical power
Representativeness - the sample should reflect the population to allow generalisation

These topics form the foundation of research methodology and biostatistics in medical and scientific research.

Make it elaborate

Excellent - I now have rich textbook content from multiple authoritative sources. Let me compile the elaborate notes.

Elaborate Notes on Biostatistics and Research Methodology

1. RANDOMISATION

Definition and Purpose

Randomisation is the process by which each participant in a study is assigned to one of the study groups (e.g., treatment or control) purely by chance - not by any decision made by the investigator or participant. As Scott-Brown's Otorhinolaryngology states:

"With randomization, you can expect that the prognostic characteristics of the randomized groups or 'arms' of the trial are similar except for differences due to chance variation. Randomization eliminates the influence of both known and unknown confounders that are present at the time of randomization. Without randomized treatment allocation... imbalances in prognostic factors between the groups occur that are the result of selection bias - a type of bias often called 'confounding by clinical indication'."

Scott-Brown's Otorhinolaryngology, p. 526

Randomisation is the cornerstone of a Randomised Controlled Trial (RCT) and is what distinguishes it from all other study designs. It is not merely a statistical technicality - it is the single most important mechanism for ensuring internal validity.

Why Randomisation Works

Randomisation simultaneously controls for:

Known confounders - variables the researcher is aware of (e.g., age, sex, disease severity)
Unknown confounders - variables the researcher has not even thought of measuring

No other method - not matching, not stratification, not statistical adjustment - can control for unknown confounders. This is the unique power of randomisation.

Allocation Concealment

A critical but often misunderstood aspect of randomisation is allocation concealment - keeping the future group assignment hidden from investigators until a participant is definitively enrolled.

"Proper allocation concealment requires that the investigators do not know the arm to which a participant will be allocated until the participant has definitively been recruited and included in the study. Concealment of the randomization is the only way to prevent the investigators influencing the balance of the prognostic characteristics between the groups."

Scott-Brown's Otorhinolaryngology, p. 526

Without allocation concealment, investigators may (consciously or not) delay enrolling a participant until they know the next assignment will be to the preferred group - completely defeating the purpose of randomisation.

Types of Randomisation

Type	Description	When to Use
Simple randomisation	Coin flip, random number table or computer generator. Every allocation is independent.	Adequate for large trials (n > 200)
Block (restricted) randomisation	Participants randomised in fixed-size blocks (e.g., blocks of 4 or 6). Ensures balance at regular intervals throughout enrolment.	Whenever balanced group sizes matter at any point in the trial
Stratified randomisation	Randomise separately within subgroups (strata) defined by important prognostic factors (e.g., age, sex, disease stage). Combines stratification with block randomisation within each stratum.	When a key variable is strongly associated with outcome
Minimisation	Adaptive algorithm that dynamically assigns the next participant to the group that minimises overall imbalance on multiple variables simultaneously.	Large multi-centre trials with many stratification variables
Cluster randomisation	Entire groups (villages, clinics, schools, hospital wards) are randomised rather than individuals.	Community-level interventions where individual randomisation is not feasible
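
As an illustration of the block (restricted) randomisation row above, here is a minimal Python sketch, assuming two arms labelled "A" and "B" and a fixed block size of 4. The function name `block_randomise` and the seed are hypothetical choices for the example; a real trial would use a validated, concealed allocation system.

```python
import random

def block_randomise(n_participants, block_size=4, arms=("A", "B"), seed=None):
    """Return an allocation list built from permuted blocks of fixed size."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)            # seed only to make the example repeatable
    per_arm = block_size // len(arms)
    allocations = []
    while len(allocations) < n_participants:
        block = list(arms) * per_arm     # equal numbers of each arm in every block
        rng.shuffle(block)               # random order within the block
        allocations.extend(block)
    return allocations[:n_participants]

allocation = block_randomise(12, block_size=4, seed=1)
print(allocation)
```

Because every complete block contains two of each arm, the group sizes can never differ by more than two at any point during enrolment.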

Ethical Basis

The ethical justification for randomisation rests on the concept of clinical equipoise - a genuine state of uncertainty about which treatment is superior. If a physician genuinely does not know which treatment is better, it is ethically justifiable to allow chance to decide rather than personal preference.

Blinding (Related Concept)

Randomisation assigns participants to groups. Blinding keeps those groups concealed throughout the trial:

Single-blind - only the participant is unaware of their assignment
Double-blind - both participant and investigator are unaware
Triple-blind - participants, investigators, and outcome assessors are all unaware

Blinding prevents performance bias (differential care based on group knowledge) and detection bias (differential outcome assessment based on group knowledge).

2. PARAMETRIC TESTS

Definition

Parametric tests are statistical tests that make specific assumptions about the distribution and parameters of the population from which the data are drawn. The term "parametric" refers to the fact that these tests involve estimating population parameters (mean, variance, standard deviation).

As Goldman-Cecil Medicine explains:

"The distribution of values within a population (e.g., blood pressures) is often categorized as normal (i.e., Gaussian). A normal distribution is often characterized using both measures of central tendency (i.e., mean, median, and mode) and measures of dispersion around the center of the distribution (e.g., standard deviation)."

Goldman-Cecil Medicine, p. 2689

Core Assumptions

Parametric tests generally require ALL of the following:

Normality - the data (or the sampling distribution of the mean) should follow a normal (Gaussian) distribution
Interval or ratio scale - data must be measured on a continuous scale with meaningful numeric values
Homogeneity of variance (for group comparisons) - the variances of the groups being compared should be approximately equal (homoscedasticity)
Independence - observations must be independent of one another (except in paired tests)

Commonly Used Parametric Tests

Test	What it Compares	Data Requirements
Independent samples t-test	Means of 2 unrelated groups	Continuous, normal, equal variance
Paired t-test	Mean difference within the same group (before/after)	Continuous, differences normally distributed
One-way ANOVA	Means of 3 or more independent groups	Continuous, normal, equal variance
Repeated measures ANOVA	Means of the same group measured at 3+ time points	Continuous, normal, sphericity assumed
Two-way ANOVA	Effect of 2 independent variables + their interaction	Continuous, normal
Pearson's r	Linear correlation between 2 continuous variables	Both variables continuous, bivariate normal
Linear regression	Relationship between predictor(s) and outcome	Continuous outcome, residuals normal
ANCOVA	Means of groups while controlling for a covariate	Continuous outcome, normal residuals

Why Use Parametric Tests?

Greater statistical power (ability to detect a true effect) compared to non-parametric equivalents when assumptions are met
Produce effect size estimates (e.g., mean difference) that are clinically interpretable
Allow for more complex modelling (e.g., regression, ANOVA with interactions)
Violations of normality are less critical with large samples (Central Limit Theorem - the sampling distribution of the mean tends toward normality as n increases, typically n > 30)

Key Concepts

Mean - the arithmetic average; the parameter estimated by most parametric tests
Standard deviation (SD) - measures spread of data around the mean
Standard error of the mean (SEM) - measures precision of the sample mean as an estimate of the population mean; SEM = SD / √n
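
A small worked sketch of these three quantities in Python, using made-up systolic blood pressure readings (the data are purely illustrative):

```python
import math
import statistics

# Hypothetical systolic BP readings (mmHg) from 8 participants
systolic_bp = [118, 125, 130, 122, 135, 128, 120, 132]

mean = statistics.mean(systolic_bp)
sd = statistics.stdev(systolic_bp)          # sample SD (n - 1 in the denominator)
sem = sd / math.sqrt(len(systolic_bp))      # SEM = SD / sqrt(n)

print(f"mean = {mean:.2f}, SD = {sd:.2f}, SEM = {sem:.2f}")
```

The SD describes the spread of individual readings, while the SEM shrinks as n grows because it describes the precision of the mean.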

3. NON-PARAMETRIC TESTS

Definition

Non-parametric tests (also called distribution-free tests) make no assumptions about the shape of the underlying population distribution. They typically work by ranking the raw data and performing calculations on the ranks rather than the actual values.

The Harriet Lane Handbook notes:

"Nonparametric tests are used when a particular distribution cannot be assumed. They rank data rather than taking absolute differences into account."

The Harriet Lane Handbook, p. 957

When to Use Non-Parametric Tests

Data are ordinal (ranked, scored on a scale like Likert scales, pain scores)
Data are not normally distributed (confirmed by visual inspection or formal tests like Shapiro-Wilk)
Small sample sizes (where normality cannot be verified)
Significant outliers that would distort the mean
Outcome is a median rather than a mean
Data are in the form of categorical frequencies (e.g., Chi-square)

Parametric vs Non-Parametric Equivalents

Parametric Test	Non-Parametric Equivalent	When to Use Non-Parametric
Independent t-test	Mann-Whitney U test (Wilcoxon rank-sum)	Non-normal continuous or ordinal data, 2 independent groups
Paired t-test	Wilcoxon signed-rank test	Non-normal paired data
One-way ANOVA	Kruskal-Wallis test	Non-normal data, 3+ independent groups
Repeated measures ANOVA	Friedman test	Non-normal repeated measurements
Pearson's r	Spearman's rank correlation (ρ)	Ordinal or non-normal data
N/A	Chi-square test (χ²)	Categorical data - compare observed vs expected frequencies
N/A	Fisher's exact test	Categorical data with small expected cell counts (< 5)
N/A	McNemar's test	Paired categorical data

How Non-Parametric Tests Work (Ranking)

Consider comparing pain scores between two groups using Mann-Whitney U:

Pool all observations from both groups
Rank all values from smallest to largest (tied values get the average rank)
Sum the ranks for each group separately
If the groups truly have the same distribution, the rank sums should be approximately equal

By operating on ranks, these tests are robust against outliers and skewed distributions.
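
The ranking steps above can be sketched in Python as follows. This is a minimal illustration that computes only the rank sums and the U statistic (no p-value); the pain scores and the helper names `average_ranks` and `mann_whitney_u` are hypothetical.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(group_a, group_b):
    """Return (rank sum A, rank sum B, U) for two independent groups."""
    pooled = list(group_a) + list(group_b)       # step 1: pool
    ranks = average_ranks(pooled)                # step 2: rank
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])                # step 3: sum ranks per group
    rank_sum_b = sum(ranks[n_a:])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = rank_sum_b - n_b * (n_b + 1) / 2
    return rank_sum_a, rank_sum_b, min(u_a, u_b)

pain_a = [2, 3, 3, 5, 6]    # hypothetical pain scores, group A
pain_b = [4, 6, 7, 8, 9]    # hypothetical pain scores, group B
result = mann_whitney_u(pain_a, pain_b)
print(result)               # (17.5, 37.5, 2.5)
```

The rank sums are far from equal here (17.5 vs 37.5), which is the kind of imbalance that would count as evidence against the two groups having the same distribution.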

Trade-offs

Less statistical power than parametric equivalents when normality assumptions actually hold
Do not directly estimate clinically meaningful parameters (e.g., mean difference)
Some are less suitable for complex multi-variable analyses
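However, the power loss is often small: for normally distributed data the Mann-Whitney U test has an asymptotic efficiency of about 95% relative to the t-test, and it can be more powerful when data are skewed or heavy-tailed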

4. TESTS FOR STATISTICAL SIGNIFICANCE

What is Statistical Significance?

A test of statistical significance answers one question: "Could this result have occurred by chance alone?" It evaluates how compatible the observed data are with the null hypothesis.

As Schwartz's Principles of Surgery explains:

"Many statistical tests can be used to calculate P values and confidence intervals. The appropriate statistical test must be selected according to several factors. This includes (1) determining the number of observations in the comparison groups, (2) the number of groups being compared, (3) whether two or more groups are being compared with each other or one group..."

Schwartz's Principles of Surgery, p. 115

The Process

Formulate hypotheses - null (H₀) and alternative (H₁)
Set significance level (α) - conventionally 0.05 before data collection
Choose the appropriate test - based on data type, number of groups, distribution
Calculate the test statistic - e.g., t, F, χ², z, U
Determine the p-value - probability of the observed test statistic (or more extreme) under H₀
Compare p to α - if p < α, reject H₀; if p ≥ α, fail to reject H₀
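
To make the six steps concrete, here is a minimal Python sketch. It uses an exact permutation test on the difference in means rather than a t-test, purely because that needs nothing beyond the standard library; the blood pressure values and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))    # step 4: test statistic
    extreme = 0
    count = 0
    # Under H0 the group labels are exchangeable, so try every relabelling
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:                 # at least as extreme
            extreme += 1
        count += 1
    return extreme / count                           # step 5: p-value

alpha = 0.05                                         # step 2: set before the data
drug = [120, 122, 125, 127]                          # hypothetical systolic BP
placebo = [130, 133, 135, 138]
p = permutation_p_value(drug, placebo)
reject_h0 = p < alpha                                # step 6: compare p to alpha
print(f"p = {p:.4f}, reject H0: {reject_h0}")
```

With these values only 2 of the 70 possible relabellings are as extreme as the observed split, so p = 2/70 and H₀ is rejected at α = 0.05.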

Selecting the Correct Test

Situation	Recommended Test
2 groups, continuous outcome, normal data, unpaired	Independent t-test
2 groups, continuous outcome, normal data, paired	Paired t-test
3+ groups, continuous outcome, normal data	One-way ANOVA (+ post-hoc test: Tukey, Bonferroni)
2 groups, ordinal/non-normal data, unpaired	Mann-Whitney U
2 groups, ordinal/non-normal data, paired	Wilcoxon signed-rank
3+ groups, ordinal/non-normal data	Kruskal-Wallis
2 categorical variables	Chi-square (if expected counts ≥ 5)
2 categorical variables, small samples	Fisher's exact test
2 continuous variables, assess correlation (normal)	Pearson's r
2 continuous variables, assess correlation (non-normal/ordinal)	Spearman's ρ
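
The table above can be read as a simple decision rule. The following Python sketch encodes it as a lookup; the function `choose_test` and its parameter names are hypothetical, and it covers only the situations listed in the table.

```python
def choose_test(outcome, groups=2, paired=False, normal=True, small_counts=False):
    """Suggest a test following the selection table.

    outcome: "continuous" (group comparison), "categorical", or "correlation".
    """
    if outcome == "categorical":
        return "Fisher's exact test" if small_counts else "Chi-square test"
    if outcome == "correlation":
        return "Pearson's r" if normal else "Spearman's rho"
    if groups >= 3:
        return "One-way ANOVA" if normal else "Kruskal-Wallis"
    if normal:
        return "Paired t-test" if paired else "Independent t-test"
    return "Wilcoxon signed-rank" if paired else "Mann-Whitney U"

print(choose_test("continuous", groups=2, paired=True, normal=False))
```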

The Confidence Interval - an Alternative Expression

Sabiston's Textbook of Surgery notes:

"A confidence interval is a range of values that one can be certain contains the true mean of the population... a 95% confidence interval would include the observed difference 95% of the times that the study was repeated. Factors affecting the width of the confidence interval include the size of the sample, the confidence level, and the variability in the sample."

Sabiston Textbook of Surgery, p. 115

A 95% CI that does NOT include the null value (0 for differences, 1 for ratios) corresponds to a two-sided p < 0.05. Confidence intervals are often preferred over p-values alone because they convey both statistical significance AND the magnitude and precision of the effect.
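
The link between a 95% CI and a two-sided p-value can be checked numerically. This is a minimal sketch, assuming samples large enough for a normal (z) approximation; a t-based interval would normally be used for small samples. The summary statistics and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, confidence=0.95):
    """Normal-approximation CI for the difference in means (A minus B)."""
    diff = mean_a - mean_b
    se = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    return diff - z * se, diff + z * se

def two_sided_p(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sided z-test p-value for the same difference."""
    se = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
    z = (mean_a - mean_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical summary data: drug vs placebo systolic BP (mmHg)
stats = dict(mean_a=128, sd_a=12, n_a=100, mean_b=132, sd_b=12, n_b=100)
low, high = mean_difference_ci(**stats)
p = two_sided_p(**stats)
print(f"95% CI: ({low:.2f}, {high:.2f}), p = {p:.4f}")
```

Here the interval lies entirely below 0 and the p-value is below 0.05, so both expressions lead to the same decision, while the interval also shows how large the difference plausibly is.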

Multiple Testing Problem

Sabiston also highlights:

"Type I errors can occur when the research question and analysis have not been specified a priori or when multiple statistical tests are performed in a study with several subgroups. For example, with a P value set at 0.05, 1 out of every 20 comparisons will be expected by chance to be deemed statistically significant and be a false-positive finding."

Sabiston Textbook of Surgery, p. 115

Corrections such as the Bonferroni correction (divide α by the number of tests) or the Hochberg sequential procedure are applied when multiple comparisons are made.
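
A minimal Python sketch of both corrections, assuming a list of hypothetical p-values from four comparisons. The Hochberg version below is the standard step-up rule (find the largest k with p(k) ≤ α / (m - k + 1) and reject the k smallest); function names are illustrative.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H0 only where p <= alpha / number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def hochberg(p_values, alpha=0.05):
    """Hochberg step-up procedure; returns a reject flag per test."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices, ascending p
    reject = [False] * m
    for k in range(m, 0, -1):                             # start at the largest p
        if p_values[order[k - 1]] <= alpha / (m - k + 1):
            for i in order[:k]:                           # reject the k smallest
                reject[i] = True
            break
    return reject

# Chance of at least one false positive across 20 independent tests at 0.05
family_wise_error = 1 - (1 - 0.05) ** 20

p_values = [0.012, 0.020, 0.030, 0.041]                   # hypothetical
print(bonferroni(p_values))   # [True, False, False, False]
print(hochberg(p_values))     # [True, True, True, True]
print(round(family_wise_error, 2))
```

The example shows why Bonferroni is considered conservative: it rejects only one of the four hypotheses, whereas the sequential procedure rejects all four while still controlling the family-wise error rate under its assumptions.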

5. P-VALUE

Formal Definition

The p-value is defined as the probability of obtaining a result at least as extreme as the one observed, given that the null hypothesis is true.

Schwartz's Principles of Surgery states:

"The definition of a P value is the probability of an observed result given the assumption that the null hypothesis is true. The arbitrary value established for a result having statistical significance rather than 'pure chance' is less than 1 in 20, defined as a P value less than 0.05."

Schwartz's Principles of Surgery, p. 1718

The use of the p-value in significance testing was popularised by Sir Ronald Fisher, one of the founders of modern statistics.

What the P-value IS and IS NOT

The p-value IS...	The p-value is NOT...
The probability of the data (or more extreme) given H₀ is true	The probability that H₀ is true
A measure of evidence against the null hypothesis	A measure of the size or importance of an effect
A basis for a binary decision (reject / don't reject H₀)	A proof that an effect exists or does not exist
Specific to the study's patient sample	Necessarily generalisable to the whole population

As Kaplan & Sadock's Comprehensive Textbook of Psychiatry cautions:

"In the frequentist tradition of statistical inference, the P value cannot be interpreted as the probability that the null hypothesis is true. The hypothesis is not a random event, so it is either true or not true."

Kaplan & Sadock's Comprehensive Textbook of Psychiatry

Interpreting the P-value

p < 0.05 - statistically significant at the conventional threshold; if H₀ were true, a result at least this extreme would be expected less than 5% of the time
p < 0.01 - highly significant
p < 0.001 - very highly significant
p ≥ 0.05 - not statistically significant; insufficient evidence to reject H₀
p = 0.05 exactly - borderline; requires careful judgement

Alpha (α) Level

The α level (significance level) is the threshold for p that is set before data collection. It represents the acceptable risk of a Type I error. The Harriet Lane Handbook explains:

"α: Probability of making a type I error; the probability of rejecting the null hypothesis when the null hypothesis is true. α, the preset level of significance, is typically set at less than 0.05 in medical research, which allows interpretation with 95% certainty that a detected association is true."

The Harriet Lane Handbook, p. 957

Critical Limitations of the P-value

Statistical significance ≠ clinical significance - a trial of a new antiviral that shortens viral URI symptoms by 1 hour may produce p < 0.0001 in a large enough trial, but is clinically meaningless
P-value depends on sample size - with large enough n, trivially small differences become "significant"
P > 0.05 does NOT mean no effect - it means insufficient evidence was found; absence of evidence is not evidence of absence
P-value is not reproducible - even if an effect is real, repeated studies will produce varying p-values due to sampling variation
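
The dependence of the p-value on sample size can be illustrated with a short Python sketch. It assumes a two-group comparison with a known common SD and a normal (z) approximation; the 1 mmHg difference and SD of 15 are hypothetical values chosen to represent a clinically trivial effect.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(mean_diff, sd, n_per_group):
    """Two-sided z-test p-value for a difference between two equal-sized groups."""
    se = sd * sqrt(2 / n_per_group)
    z = mean_diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same tiny effect (1 mmHg, SD 15) evaluated at increasing sample sizes
p_by_n = {n: two_sided_p_value(1.0, 15.0, n) for n in (50, 500, 5000)}
for n, p in p_by_n.items():
    print(f"n per group = {n:>5}: p = {p:.4f}")
```

The effect is identical in every row; only n changes, yet the result moves from clearly non-significant to highly significant.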

6. NULL HYPOTHESIS

Definition

The null hypothesis (H₀) is the default assumption being tested - typically stating that there is no difference, no effect, or no association between variables. It is the hypothesis that statistical tests attempt to reject.

Goldman-Cecil Medicine states:

"This comparison begins with a hypothesis that is stated formally as the null hypothesis and is phrased in relation to an alternative hypothesis. The two hypotheses are mutually exclusive and exhaustive."

Goldman-Cecil Medicine, p. 2694

Structure

H₀ (Null hypothesis): "There is no difference in mean systolic blood pressure between patients treated with Drug A and those given placebo."
H₁ (Alternative hypothesis): "There IS a difference in mean systolic blood pressure between patients treated with Drug A and those given placebo."

H₀ and H₁ must be mutually exclusive (cannot both be true) and exhaustive (together they cover all possibilities).

The Logic of Hypothesis Testing

Statistical testing works by assuming H₀ is true, calculating the probability of observing the data (or more extreme data) under that assumption, and deciding whether this probability is too low to be plausible.

"A statistical test helps to estimate the probability that an association observed in a study is due to chance (the 'p-value'). The 'alternative hypothesis' states that there is such an association."

Scott-Brown's Otorhinolaryngology, p. 2802

Importantly, H₀ is never "proved" - it can only be:

Rejected (when p < α) - evidence suggests the null is implausible
Failed to be rejected (when p ≥ α) - insufficient evidence to reject it

One-Sided vs Two-Sided Hypotheses

Two-sided (two-tailed) test: H₁ states only that a difference exists, without specifying its direction. Example: H₀ "Drug A = placebo" versus H₁ "Drug A ≠ placebo." This is the default in most medical research.
One-sided (one-tailed) test: H₁ specifies the direction of the difference. Example: H₁ "Drug A is better than placebo (not just different)."

Scott-Brown's notes:

"It is convention to use two-sided hypotheses when planning the size of a study as well as two-sided p-values when analyzing the results, unless there are well-argued reasons for the contrary."

Scott-Brown's Otorhinolaryngology, p. 2804

Type I and Type II Errors

These are the two fundamental errors in hypothesis testing:

Decision	H₀ Actually True	H₀ Actually False
Reject H₀	Type I Error (α) - False positive	Correct (True positive)
Fail to reject H₀	Correct (True negative)	Type II Error (β) - False negative

Type I Error (α error):

"A type I error occurs when the null hypothesis is rejected but is actually true in the population. This may also be referred to as a false positive. The type I error rate, denoted by α, is the probability that the null hypothesis is rejected given that it is true."

Schwartz's Principles of Surgery

Type II Error (β error):

"A type II error is the failure to reject the null hypothesis when the null hypothesis is false. This error may also be referred to as a false negative... Power = 1 - β: Probability of correctly rejecting the null hypothesis."

Schwartz's Principles of Surgery

Statistical Power (1 - β) is the ability of a study to detect a true effect when it exists. It is influenced by:

Sample size (larger n = more power)
Effect size (larger effect = easier to detect = more power)
Significance level (α) - higher α = more power but more Type I errors
Variability in the data (less variance = more power)

A power of 0.80 (80%) is conventionally accepted as the minimum for a well-designed study.
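
How the four factors above move power can be seen in a small Python sketch. It assumes a two-sided comparison of two equal-sized groups with a known common SD and a normal approximation, and it ignores the negligible chance of rejecting in the wrong direction; the function name `approx_power` and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * sqrt(2 / n_per_group)
    return NormalDist().cdf(abs(effect) / se - z_crit)

base = approx_power(effect=5, sd=15, n_per_group=50)
more_n = approx_power(effect=5, sd=15, n_per_group=142)
bigger_effect = approx_power(effect=10, sd=15, n_per_group=50)
higher_alpha = approx_power(effect=5, sd=15, n_per_group=50, alpha=0.10)
less_variance = approx_power(effect=5, sd=10, n_per_group=50)

print(f"baseline power:       {base:.2f}")
print(f"larger sample:        {more_n:.2f}")
print(f"larger effect:        {bigger_effect:.2f}")
print(f"higher alpha:         {higher_alpha:.2f}")
print(f"less variability:     {less_variance:.2f}")
```

Each change (more participants, a larger effect, a higher α, or less variability) raises power relative to the baseline.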

7. HYPOTHESIS (Research Hypothesis)

Definition

A research hypothesis is a testable, specific statement that predicts the relationship between variables in a study. It is the formal, a priori statement of what a study is designed to test.

Scott-Brown's describes it as:

"A research hypothesis should be formulated that further refines the study question... it should be simple (addressing one determinant or comparison and the occurrence of one outcome) and specific (defining unambiguously the target population, the control and comparison group, and the outcome of interest)."

Scott-Brown's Otorhinolaryngology, p. 2801

Types of Hypotheses

1. Research (Scientific) Hypothesis The general statement of the expected relationship, usually based on prior evidence or theory. Example: "Statin therapy reduces the incidence of myocardial infarction in patients with hypercholesterolaemia."

2. Null Hypothesis (H₀) The statistical version - states no effect or no difference (see Section 6 above).

3. Alternative Hypothesis (H₁ or Hₐ) Negates the null hypothesis - states that an effect or difference does exist.

4. Directional (One-Tailed) Hypothesis Specifies the direction of the expected effect. Example: "Drug A reduces BP by MORE than placebo."

5. Non-Directional (Two-Tailed) Hypothesis States that a difference exists but does not specify direction. Example: "Drug A produces a DIFFERENT BP response compared to placebo."

Characteristics of a Good Hypothesis

A well-constructed hypothesis should be:

Property	Description
Testable	Can be confirmed or refuted with available methods
Falsifiable	Must be possible to prove it wrong
Specific	Clearly defines the population, exposure, comparator, outcome, and timeframe (PICO format, extended to PICOT when a timeframe is added)
A priori	Stated BEFORE data are collected (post-hoc hypotheses inflate Type I error)
Grounded	Based on prior biological plausibility or existing evidence
Simple	Addresses one primary question (multiple outcomes inflate Type I error)

PICO Framework for Hypothesis Formulation

The standard structure for clinical research hypotheses:

P - Patient/Population
I - Intervention/Exposure
C - Comparison/Control
O - Outcome

Example: "In adults with type 2 diabetes (P), does metformin (I) compared to placebo (C) reduce HbA1c at 12 months (O)?"

Hypothesis and Sample Size

Once the hypothesis is stated, it directly drives sample size calculation. Key inputs needed are:

Effect size - the minimum clinically important difference
α level - conventional 0.05
Power (1 - β) - conventional 0.80 or 0.90
Variance of the outcome variable

The smaller the expected effect size, the larger the sample needed. The Harriet Lane Handbook notes:

"Sample size: The number of subjects required in a study to detect an effect with a predetermined power and α."

The Harriet Lane Handbook, p. 957
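
As a rough sketch of how these inputs combine, the Python snippet below uses the common normal-approximation formula for comparing two means, n per group = 2(z₁₋α/₂ + z₁₋β)² σ² / δ², where δ is the effect size and σ the SD. This formula is an assumption brought in for illustration (it is not taken from the sources quoted here), and the function name and inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / effect ** 2
    return ceil(n)                       # round up to whole participants

n_main = sample_size_per_group(effect=5, sd=15)       # detect a 5 mmHg difference
n_small = sample_size_per_group(effect=2.5, sd=15)    # half the effect size
print(n_main, n_small)
```

Halving the effect size roughly quadruples the required sample, because the effect enters the formula as a squared term in the denominator.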

8. SAMPLING

Definition

Sampling is the process of selecting a subset (sample) of individuals from a larger target population to study, with the intention of drawing inferences about the whole population.

Kaplan & Sadock's Comprehensive Textbook of Psychiatry defines it as:

"Sampling refers to the process of selecting a subset (i.e., sample) of the population of interest for a research study. The goal is to select a sample that is representative of the population of interest... In order to select a representative sample of the population of interest, one needs to have an exhaustive list of the members in the population, called the sampling frame, from which the sample will be drawn."

Kaplan & Sadock's Comprehensive Textbook of Psychiatry, p. 2654

Key Concepts

Term	Definition
Target population	The total group the researcher wants to generalise to (e.g., all adults with hypertension in India)
Accessible population	The subset of the target population that is practically reachable
Sampling frame	The complete list or register of all members of the accessible population from which the sample is drawn
Sample	The subset actually selected and studied
Sampling error	The difference between a sample statistic and the true population parameter; inherent in any sample
Sampling bias	Systematic distortion in sample selection that makes the sample unrepresentative

Two Major Categories of Sampling

A. Probability Sampling

In probability sampling, every member of the population has a known, non-zero probability of being selected. This allows:

Calculation and control of sampling error
Generalisation (external validity) of results to the population

"Probability sampling is the gold standard for ensuring that the study sample is representative of the target population, except for the effect of chance variation."

Scott-Brown's Otorhinolaryngology, p. 2603

Types of Probability Sampling:

1. Simple Random Sampling Every individual has an equal and independent chance of selection (like drawing names from a hat, or using a random number generator).

Advantage: Simple, unbiased
Disadvantage: May not represent small subgroups; requires a complete sampling frame

2. Systematic Sampling Select every k-th individual from a list (where k = population size / desired sample size). The first individual is selected randomly.

Example: Selecting every 10th patient from a clinic register
Advantage: Easy to implement
Risk: If the list has a periodic pattern, systematic bias can occur ("periodicity problem")

3. Stratified Random Sampling Divide the population into non-overlapping subgroups (strata) based on a key variable (e.g., age, sex, disease stage), then randomly sample from each stratum.

As Kaplan & Sadock's explains:

"In stratified random sampling, the sampling frame is divided into a number of nonoverlapping strata based on a factor that may affect the variable of interest, and individuals are randomly selected from within each stratum. Stratified random sampling can ensure that representative samples of all relevant subsamples of the population are selected... Stratified random sampling provides greater statistical precision because there is less variability within a stratum."

Kaplan & Sadock's Comprehensive Textbook of Psychiatry, p. 2654

Proportionate stratified sampling - sample size within each stratum is proportional to its size in the population
Disproportionate stratified sampling - oversample smaller strata to ensure adequate representation (common for rare subgroups)

4. Cluster Sampling The population is divided into clusters (e.g., hospitals, villages, schools). Clusters are randomly selected, then ALL individuals within selected clusters are studied (single-stage cluster sampling) or a random sample from each selected cluster is taken (two-stage cluster sampling).

Advantage: Practical and economical when population is geographically dispersed; no complete sampling frame needed
Disadvantage: Lower statistical precision (intra-cluster correlation); needs specialised analysis (cluster-adjusted statistics)

5. Multi-stage Sampling A combination of sampling methods applied at successive stages. Example: First randomly select districts (clusters), then randomly select villages within those districts, then randomly select households within villages.
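
Three of the probability sampling methods described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a sampling frame of 1,000 hypothetical patient IDs split into two strata; the function names are illustrative, and the stratified version relies on simple rounding, which may not sum exactly to n for awkward proportions.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every individual has an equal chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start, then every k-th individual from the list."""
    k = len(frame) // n                  # sampling interval
    start = rng.randrange(k)             # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

def proportionate_stratified_sample(strata, n, rng):
    """Random sample from each stratum, proportional to stratum size."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        n_stratum = round(n * len(members) / total)
        sample.extend(rng.sample(members, n_stratum))
    return sample

rng = random.Random(42)                  # seeded only so the example is repeatable
frame = list(range(1, 1001))             # hypothetical patient IDs 1..1000
strata = {"male": frame[:600], "female": frame[600:]}

srs = simple_random_sample(frame, 50, rng)
systematic = systematic_sample(frame, 50, rng)
stratified = proportionate_stratified_sample(strata, 50, rng)
print(len(srs), len(systematic), len(stratified))
```

With a 60/40 split in the frame, the stratified sample contains exactly 30 and 20 individuals from the two strata, whereas the simple random sample only matches those proportions on average.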

B. Non-Probability Sampling

In non-probability sampling, the probability of selection is unknown. Results cannot be formally generalised to the population.

"Nonprobability sampling is used when the sampling frame is not available. With nonprobability sampling, information on entire sections of the population may be missing, which affects the ability to estimate the size and effect of the sampling error."

Kaplan & Sadock's Comprehensive Textbook of Psychiatry, p. 2654

Type	Description	Risk
Convenience sampling	Select whoever is most accessible (e.g., volunteers, outpatient attendees)	High selection bias
Consecutive sampling	Recruit all eligible individuals who present within a set time period	Less biased than convenience; common in clinical studies
Quota sampling	Pre-set quotas for subgroups, but selection within each quota is non-random	Quota filled by convenience
Purposive (judgmental) sampling	Researcher deliberately selects cases that best represent the phenomenon (common in qualitative research)	Researcher bias
Snowball sampling	Existing participants recruit further participants (useful for hard-to-reach populations)	Referral bias; non-representative

Sampling Error vs Sampling Bias

These are frequently confused:

	Sampling Error	Sampling Bias
Nature	Random, due to chance	Systematic, directional
Direction	Unpredictable	Consistently over- or under-estimates
Control	Reduced by increasing sample size	Only corrected by better study design
Calculable?	Yes (with probability sampling)	Not reliably

Sample Size and Its Importance

A larger sample:

Reduces sampling error
Increases statistical power (ability to detect true effects)
Produces narrower confidence intervals (more precise estimates)
Makes estimates more stable, although generalisability depends on how representative the sample is, not on its size alone

However, larger samples also cost more, take longer, and (in trials) expose more patients to an experimental treatment. Sample size must be calculated a priori based on expected effect size, α, and desired power.

Summary Comparison Table

Concept	Core Idea	Key Term
Randomisation	Allocate by chance to eliminate confounding	Allocation concealment
Parametric tests	Assume normal distribution; use actual values	t-test, ANOVA, Pearson's r
Non-parametric tests	No distribution assumption; use ranks	Mann-Whitney, Wilcoxon, Chi-square
Significance testing	Is the result due to chance?	Test statistic, p-value
P-value	Probability of data given H₀ true	p < 0.05 = significant
Null hypothesis	Assumes no effect/difference	Rejected or not rejected
Hypothesis	Testable a priori prediction	H₀, H₁, one/two-tailed
Sampling	Select a representative subset	Probability vs non-probability

Sources: Scott-Brown's Otorhinolaryngology Head & Neck Surgery Vol 1; Schwartz's Principles of Surgery, 11th ed.; Sabiston Textbook of Surgery; Goldman-Cecil Medicine; Kaplan & Sadock's Comprehensive Textbook of Psychiatry; The Harriet Lane Handbook, 23rd ed.