Lead
"Eating this during pregnancy doubles the baby's risk." "This additive raises the odds of developmental problems by 1.5 times." Parenting information contains a relentless stream of articles that use numbers to amplify parental anxiety.
Those numbers may not be wrong. But numbers without context mislead. If the underlying risk was 0.1%, a "doubling" still leaves it at 0.2%. The story of one in a thousand lands in a parent's mind as "my child is in danger."
Statistical literacy is not the ability to distrust numbers. It is the ability to read them correctly. A handful of concepts is enough to change how you receive parenting information.
The Starting Point: Why Numbers That Are Accurate Can Still Mislead
There is a distortion in translation between what researchers write in a paper and what media write in a headline. "Statistically significant" does not mean "has a meaningful effect." "Risk elevated" does not mean "dangerous."
Two mechanisms produce most of this distortion: the habit of expressing findings in relative risk, and the habit of omitting study design. Both can be corrected by the reader with a modest amount of knowledge [1].
Relative Risk and Absolute Risk — Knowing That Two Numbers Exist
Medical research primarily uses two ways to express the effect of an exposure or intervention.
Relative Risk (RR) is a ratio using the unexposed group as a reference: the incidence in the exposed group divided by the incidence in the unexposed group. If the baseline risk is 1% and the exposed-group risk is 2%, the relative risk is 2.0 — that is, "twice the risk."
Absolute Risk Difference (ARD) is the actual difference in incidence rates. In the same example: 2% − 1% = 1% (one percentage point). Put in terms of 1,000 people per group, 10 in the unexposed group develop the outcome and 20 in the exposed group do.
The problem is that relative risk makes headlines and absolute risk is unglamorous. "Twice the risk" and "a 1-percentage-point difference" are both accurate descriptions of the same result, but they produce entirely different impressions.
When you read a news article, developing the habit of asking "what was the baseline risk?" and "what is the absolute risk difference?" gives you the tools to weigh a finding accurately [2].
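The two measures can be computed side by side. The following is a minimal Python sketch, not a clinical tool; the function name `risk_summary` and its output keys are invented for illustration, and the inputs are the 1% vs 2% example from this section and the 0.1% vs 0.2% example from the lead.

```python
def risk_summary(baseline_risk: float, exposed_risk: float) -> dict:
    """Return relative risk and absolute risk difference for two incidence rates.

    Both arguments are proportions between 0 and 1 (for example 0.01 for 1%).
    """
    if not 0 < baseline_risk <= 1 or not 0 <= exposed_risk <= 1:
        raise ValueError("risks must be proportions, with a non-zero baseline")
    difference = exposed_risk - baseline_risk
    return {
        "relative_risk": exposed_risk / baseline_risk,
        "absolute_risk_difference": difference,
        # The same difference expressed as extra cases per 1,000 people exposed.
        "extra_cases_per_1000": difference * 1000,
    }


# The example from this section: 1% baseline, 2% in the exposed group.
text_example = risk_summary(0.01, 0.02)
# The example from the lead: a "doubling" from 0.1% to 0.2%.
lead_example = risk_summary(0.001, 0.002)

for label, result in [("1% -> 2%", text_example), ("0.1% -> 0.2%", lead_example)]:
    print(
        f"{label}: RR = {result['relative_risk']:.1f}, "
        f"ARD = {result['absolute_risk_difference'] * 100:.1f} percentage points, "
        f"about {result['extra_cases_per_1000']:.0f} extra cases per 1,000"
    )
```

Both calls report the same relative risk, while the absolute difference shrinks tenfold in the second case. That gap is what a "twice the risk" headline hides.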
NNT — How Many People Must Be Treated for One to Benefit?
NNT (Number Needed to Treat) is the number of people who must receive an intervention for one person to benefit. It is the reciprocal of the absolute risk difference (1 ÷ ARD).
If the ARD is 1%, the NNT is 100: treat 100 people for one to gain from it. If the ARD is 0.1%, the NNT is 1,000 [2].
The mirror concept is NNH (Number Needed to Harm): the number of people who need to be exposed before one is harmed. Comparing NNT and NNH gives you the raw material for judging whether an intervention's benefits outweigh its harms.
Applied to parenting information: when a food or habit is reported to carry "twice the risk of X," if the baseline risk is 0.05%, the NNH is 2,000. Does that translate into a personally relevant concern?
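The arithmetic behind that figure can be sketched in a few lines of Python. The helper `number_needed` is a hypothetical name used only for illustration; it assumes you know the baseline risk and the reported relative risk, which is usually what a reader can recover from a headline and the original abstract.

```python
def number_needed(baseline_risk: float, relative_risk: float) -> float:
    """Number needed to treat (or harm): 1 / |absolute risk difference|.

    baseline_risk is a proportion (0.0005 means 0.05%).
    relative_risk is the ratio reported in the study or headline.
    """
    exposed_risk = baseline_risk * relative_risk
    absolute_difference = abs(exposed_risk - baseline_risk)
    if absolute_difference == 0:
        raise ValueError("no risk difference, so the number needed is undefined")
    return 1 / absolute_difference


# "Twice the risk" on a 0.05% baseline, as in the paragraph above.
nnh_headline = number_needed(baseline_risk=0.0005, relative_risk=2.0)
# The same "twice the risk" on a 1% baseline, as in the earlier RR example.
nnh_common = number_needed(baseline_risk=0.01, relative_risk=2.0)

print(f"Baseline 0.05%, RR 2.0: about {nnh_headline:.0f} people exposed per extra case")
print(f"Baseline 1%,    RR 2.0: about {nnh_common:.0f} people exposed per extra case")
```

The relative risk is identical in both calls; only the baseline changes, and with it the number of people who would need to be exposed before one additional case is expected.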
p-Values — "Statistically Significant" Is Not "Has an Effect"
A p-value is "the probability of obtaining results as extreme as the observed ones, or more extreme, assuming the null hypothesis (no effect) is true." p < 0.05 means that probability is under 5%.
Here is where many misreadings originate. p < 0.05 does not prove "there is an effect" — it means only that "the difference is hard to explain by chance alone" [3]. Furthermore, when sample sizes are large, even trivially small differences — ones with no clinical relevance — can reach statistical significance.
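The effect of sample size can be seen with a small calculation. The sketch below uses a pooled two-proportion z-test, which is one common textbook test and not necessarily the one any given study used. The counts are invented: the same 10.0% vs 10.5% difference, observed once with 1,000 children per group and once with 1,000,000 per group.

```python
import math


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Return (risk difference, two-sided p-value) from a pooled z-test."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, p_value


# Hypothetical small study: 100/1,000 vs 105/1,000.
small_diff, small_p = two_proportion_p_value(100, 1_000, 105, 1_000)
# Hypothetical large study with the same rates: 100,000 vs 105,000 per million.
large_diff, large_p = two_proportion_p_value(100_000, 1_000_000, 105_000, 1_000_000)

print(f"small study: difference = {small_diff:.3f}, p = {small_p:.3g}")
print(f"large study: difference = {large_diff:.3f}, p = {large_p:.3g}")
```

The observed difference is identical in both runs. Only the p-value changes, which is why a p-value alone says nothing about whether half a percentage point matters in practice.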
What matters more than a p-value is effect size (Cohen's d, odds ratio) and the width of the confidence interval. A small effect size with a wide confidence interval, even if statistically significant, may be practically negligible [4].
The distinction between "statistically significant" and "clinically meaningful" sits alongside the well-known formulation from Altman and Bland (1995): "absence of evidence is not evidence of absence" [3].
The Evidence Hierarchy — Study Type Determines Reliability
Not all research produces equivalent evidence. The evidence hierarchy ranks study designs by their resistance to bias, with systematic reviews at the top and expert opinion at the bottom. As systematized by Sackett and colleagues, it runs from highest to lowest reliability as follows [5]:
- Systematic reviews and meta-analyses
- Randomized controlled trials (RCTs)
- Cohort studies and case-control studies
- Case reports and expert opinion
Most research cited in parenting media is observational (cohort or case-control). Observational studies can establish that "X is associated with Y," but they cannot directly establish that "X causes Y." The possibility of confounding is always present: a confounding factor is a third variable associated with both the exposure and the outcome, which can produce a spurious apparent link between them.
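To make confounding concrete, here is a sketch with entirely invented counts. It assumes two hypothetical family types that differ both in how often they use day care and in their background rate of some outcome X. Within each family type, day care makes no difference, yet the pooled comparison shows a large apparent effect.

```python
# Invented counts for illustration: (number of children, children with outcome X).
strata = {
    "family type A": {"day care": (800, 80), "home care": (200, 20)},
    "family type B": {"day care": (200, 4), "home care": (800, 16)},
}


def risk(children: int, cases: int) -> float:
    """Proportion of children with the outcome."""
    return cases / children


# Relative risk within each family type (the confounder held fixed).
stratum_rrs = {
    name: risk(*groups["day care"]) / risk(*groups["home care"])
    for name, groups in strata.items()
}


def pooled_risk(group: str) -> float:
    """Risk for a care group when family type is ignored."""
    children = sum(groups[group][0] for groups in strata.values())
    cases = sum(groups[group][1] for groups in strata.values())
    return cases / children


crude_rr = pooled_risk("day care") / pooled_risk("home care")

for name, rr in stratum_rrs.items():
    print(f"{name}: relative risk for day care = {rr:.2f}")
print(f"ignoring family type: relative risk for day care = {crude_rr:.2f}")
```

In this constructed example the whole association comes from which families choose day care. Real data are rarely this clean, but the mechanism is the same.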
"Children who attend day care show more of X" may reflect not an effect of day care itself but characteristics of the families who use day care. Developing the habit of checking what kind of study is being cited prevents this kind of conflation.
Publication Bias and the Limits of Meta-Analysis
Publication bias refers to the tendency for studies showing a positive effect to be published, while "no-effect" studies remain in file drawers — a phenomenon Rosenthal termed the "file drawer problem" [6].
The visualization tool for this bias is the funnel plot, a scatter plot of each study's effect size against its sample size (or precision). Asymmetry in the plot, with small studies clustered on one side, suggests the presence of publication bias [7].
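A toy simulation shows where that asymmetry comes from. Everything in it is an assumption made for illustration: the true effect is set to zero, the standard error is simplified to 1/sqrt(n), and the "publication" rule keeps only significantly positive results. Real publication processes are messier than this.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

TRUE_EFFECT = 0.0            # assumption: the exposure does nothing at all
SIZES = [20, 50, 200, 1000]  # hypothetical numbers of participants per study

studies = []
for _ in range(4000):
    n = random.choice(SIZES)
    se = 1 / math.sqrt(n)                     # simplified standard error
    estimate = random.gauss(TRUE_EFFECT, se)  # what this study "finds"
    studies.append((n, se, estimate))

# Assumed selection rule: only significantly positive results leave the drawer.
published = [(n, se, est) for n, se, est in studies if est / se > 1.96]

all_mean = statistics.mean(est for _, _, est in studies)
published_mean = statistics.mean(est for _, _, est in published)
mean_by_size = {
    size: statistics.mean(est for n, _, est in published if n == size)
    for size in SIZES
}

print(f"mean effect, all {len(studies)} studies: {all_mean:.3f}")
print(f"mean effect, {len(published)} published studies: {published_mean:.3f}")
for size, mean_effect in mean_by_size.items():
    print(f"  published studies with n = {size}: mean effect {mean_effect:.3f}")
```

Under these assumptions the published studies report a positive effect even though none exists, and the smallest studies report the largest effects. On a funnel plot, that pattern appears as small studies piled up on one side.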
Meta-analyses are particularly susceptible to this distortion. Furthermore, when heterogeneity (I²) is high — meaning the studies being combined show substantial variation in results — the reliability of the pooled estimate falls [8].
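For readers curious how I² is obtained, the following sketch computes it from Cochran's Q under a fixed-effect, inverse-variance model. The effect sizes and standard errors are invented, and the function name `heterogeneity` is hypothetical; real meta-analysis software adds many refinements that are omitted here.

```python
def heterogeneity(effects, standard_errors):
    """Return (pooled effect, Cochran's Q, I-squared in percent)."""
    weights = [1 / se**2 for se in standard_errors]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    # Q sums the weighted squared deviations of each study from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    degrees_of_freedom = len(effects) - 1
    if q == 0:
        return pooled, q, 0.0
    # I-squared: share of the variation beyond what chance alone would produce.
    i_squared = max(0.0, (q - degrees_of_freedom) / q) * 100
    return pooled, q, i_squared


# Hypothetical studies that agree closely with one another.
consistent = heterogeneity([0.20, 0.22, 0.18, 0.21], [0.1, 0.1, 0.1, 0.1])
# Hypothetical studies of the same precision that disagree sharply.
varied = heterogeneity([0.05, 0.60, -0.20, 0.45], [0.1, 0.1, 0.1, 0.1])

for label, (pooled, q, i2) in [("consistent", consistent), ("varied", varied)]:
    print(f"{label}: pooled effect = {pooled:.3f}, Q = {q:.2f}, I^2 = {i2:.1f}%")
```

Both sets yield a pooled estimate, but only in the first does that single number summarize studies that actually agree.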
Ioannidis (2005) argued that most published research findings are more likely to be false than true, and that small studies, studies testing many hypotheses simultaneously, and research produced in contexts of conflict of interest warrant particular caution [9]. "A meta-analysis pooling multiple high-quality RCTs" and "a meta-analysis of a handful of small observational studies" carry very different evidentiary weight.
Translating This into Action
Three steps toward statistical literacy in practice:
1. When you read a news article, look for "what was the baseline risk?" and "what is the absolute risk difference?" If the article doesn't give these numbers, the abstract of the original paper is often enough to find them.
2. Ask once whether this result comes from an RCT or an observational study. "Associated with" and "causes" are not the same thing. Don't read an observational study's conclusion as establishing causation.
3. Try asking your pediatrician "what is the NNT for this?" When discussing the effects of a vaccine or medication, sharing numerical framing with your doctor opens a more concrete dialogue.
Keeping a log of your child's symptoms, visits, and treatments means that when you consult your doctor, you bring specific data about your own child — rather than being buffeted by claims about population averages.
Closing
A headline reading "X times the risk" may be numerically accurate, but without context it is meaningless. Relative or absolute risk? RCT or observational study? What does the p-value mean? Holding those questions in mind is enough to raise your ability to assess the quality of information.
You don't need to become an expert. The habit of asking three questions — "what percentage of people, compared to what, in what kind of study?" — is sufficient to navigate parenting information without being driven by it. Numbers have meaning only when they come with context.
References
1. Greenhalgh T. How to read a paper: statistics for the non-statistician. I: different types of data need different statistical tests. BMJ. 1997;315(7104):364–366. doi:10.1136/bmj.315.7104.364. PMID: 9270463.
2. Cook RJ, Sackett DL. The number needed to treat: a clinically useful measure of treatment effect. BMJ. 1995;310(6977):452–454. doi:10.1136/bmj.310.6977.452. PMID: 7873954.
3. Altman DG, Bland JM. Statistics notes: absence of evidence is not evidence of absence. BMJ. 1995;311(7003):485. doi:10.1136/bmj.311.7003.485. PMID: 7647644.
4. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Hillsdale: Lawrence Erlbaum Associates; 1988.
5. Sackett DL, Rosenberg WMC, Gray JAM, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn't. BMJ. 1996;312(7023):71–72. doi:10.1136/bmj.312.7023.71. PMID: 8555924.
6. Rosenthal R. The file drawer problem and tolerance for null results. Psychol Bull. 1979;86(3):638–641. doi:10.1037/0033-2909.86.3.638.
7. Egger M, Davey Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ. 1997;315(7109):629–634. doi:10.1136/bmj.315.7109.629. PMID: 9310563.
8. Higgins JPT, Thomas J, Chandler J, et al., eds. Cochrane Handbook for Systematic Reviews of Interventions. Version 6.4. Cochrane; 2023. Available from: https://training.cochrane.org/handbook.
9. Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2(8):e124. doi:10.1371/journal.pmed.0020124. PMID: 16060722.
10. Schulz KF, Altman DG, Moher D; CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. BMJ. 2010;340:c332. doi:10.1136/bmj.c332. PMID: 20332509.
11. Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. 2009;6(7):e1000097. doi:10.1371/journal.pmed.1000097. PMID: 19621072.