Lead
"Supports brain development." "Builds vocabulary." "Raises IQ." The claims on toy packaging and retail websites are written with uniform confidence regardless of the quality of evidence behind them.
The educational toy market is growing rapidly worldwide, and a significant portion of that market now uses "evidence-based" as a selling point. When researchers trace the actual evidence, they find a mixed picture: some studies show effects, others do not, and which outcomes were measured, and under what conditions, frequently determines which conclusion you get.
This article covers what peer-reviewed research has and has not established about educational toys. The aim is not to argue that all of them are useless or that all of them work, but to hand over an analytical frame that makes individual choices easier to evaluate.
Montessori materials: what the evidence shows
Montessori education has over a century of history, and its materials — tactile, sensory, and math manipulatives designed for independent child use — have an unusually well-developed research base compared with most "educational" approaches.
Marshall (2017), writing in npj Science of Learning (a Nature-family journal), reviewed published evaluations of Montessori education [1]. The main limitation the review identified: when you restrict the sample to randomized controlled trials, the number of studies drops sharply. Most available evidence comes from observational studies and quasi-experimental designs, which are vulnerable to selection bias, the distortion that arises when the people studied differ systematically from the comparison group in ways that affect the outcome. Families who choose Montessori schools have particular characteristics to begin with, and disentangling the educational method from those family characteristics is methodologically difficult [1].
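To make the selection-bias problem concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the function name naive_gap, the "family background" score, the enrollment probabilities, and the 0.5 weight are assumptions, not values from Marshall's review or any study. It compares children whose families chose a program with children who were assigned by coin flip, in a world where the program itself has no effect at all.

```python
import random
from statistics import mean

def naive_gap(randomized, true_effect=0.0, n=20000, seed=0):
    """Mean outcome of enrolled children minus mean outcome of the rest."""
    rng = random.Random(seed)
    enrolled, others = [], []
    for _ in range(n):
        family = rng.gauss(0, 1)  # hypothetical family-background score
        if randomized:
            # Coin-flip assignment: enrollment is unrelated to family background.
            joins = rng.random() < 0.5
        else:
            # Self-selection: families with higher scores enroll more often.
            joins = rng.random() < (0.7 if family > 0 else 0.3)
        # The outcome depends on family background plus noise; the program
        # itself adds only true_effect, which is zero by default.
        outcome = 0.5 * family + (true_effect if joins else 0.0) + rng.gauss(0, 1)
        (enrolled if joins else others).append(outcome)
    return mean(enrolled) - mean(others)

print(f"self-selected gap: {naive_gap(randomized=False):.2f}")
print(f"randomized gap:    {naive_gap(randomized=True):.2f}")
```

Under these made-up settings the self-selected comparison shows a clearly positive gap even though the program does nothing, while the randomized comparison stays close to zero. That is the pattern that makes observational Montessori comparisons hard to interpret.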
A larger and more recent systematic review by Randolph and colleagues (2023), published in Campbell Systematic Reviews, restricted its analysis to studies with verified baseline equivalence [2]. Even within that constrained sample, the review found positive effect sizes: academic outcomes overall g = 0.24, language g = 0.17, mathematics g = 0.22 [2]. An effect size is a statistical measure of the magnitude of an experimental effect, independent of sample size, and values around 0.2 are conventionally considered small. These results are therefore small-to-moderate: not "dramatically smarter" in magnitude, but not negligible either.
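For readers who want a feel for what a g of about 0.2 means, the sketch below shows one common way such a number is computed and one way to translate it. The function names are hypothetical, the small-sample correction uses a standard approximation, and the translation (sometimes called Cohen's U3) assumes normally distributed scores with equal spread in both groups. None of this reproduces the review's own calculations.

```python
import math
from statistics import mean, variance

def hedges_g(treatment, control):
    """Standardized mean difference with Hedges' small-sample correction."""
    n1, n2 = len(treatment), len(control)
    # Pooled standard deviation from the two sample variances.
    pooled_sd = math.sqrt(
        ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control))
        / (n1 + n2 - 2)
    )
    d = (mean(treatment) - mean(control)) / pooled_sd
    # Common approximation to the correction factor for small samples.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction

def share_of_control_below_treated_mean(g):
    """Cohen's U3: normal CDF evaluated at g (assumes normal, equal-variance groups)."""
    return 0.5 * (1 + math.erf(g / math.sqrt(2)))

# Effect sizes reported in the review [2].
for label, g in [("academic overall", 0.24), ("language", 0.17), ("mathematics", 0.22)]:
    share = share_of_control_below_treated_mean(g)
    print(f"{label}: g = {g:.2f}, share of comparison group below treated mean = {share:.1%}")
```

Read this way, g = 0.24 means the average child in the Montessori group would score above roughly 59 percent of the comparison group, where 50 percent would indicate no difference at all.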
Marshall's important caveat is that these effects cannot currently be attributed to the materials themselves [1]. The Montessori effect — whatever its size — might derive from the extended child-directed exploration time, the trained teacher relationship, or the integrated environment, rather than the physical objects. Buying a set of Montessori materials and placing them in a room does not replicate the conditions under which those effects were observed.
Electronic toys and language development: the experimental evidence
Electronic toys that respond to touch, produce music, or speak words are a large category in the 0–2 market and are routinely marketed as language-development tools. The research points in a different direction.
Zimmerman and colleagues (2007), publishing in the Journal of Pediatrics, surveyed over 1,000 parents of children aged 2 to 24 months and reported an inverse association between exposure to baby-directed video and DVD content and vocabulary acquisition [3]. The finding, that passive electronic media exposure was associated with lower rather than higher vocabulary scores in under-twos, became widely cited as evidence against the "your baby will learn from this screen" premise [3]. Because it comes from a parent survey, it shows a correlation rather than a cause.
More direct experimental evidence comes from Sosa (2016) in JAMA Pediatrics [4]. Twenty-six parent-infant dyads (infants aged 10 to 16 months) each participated in three 15-minute in-home play sessions: one with electronic toys, one with traditional toys (blocks, shape sorters), and one with books. The electronic toy condition was associated with significantly fewer parent words, fewer parental responses, and fewer conversational turns compared with both other conditions [4].
The mechanism Sosa's data support is not "the toy damages the child's language directly." It is that an electronic toy producing continuous sounds reduces the space for parent-child verbal exchange [4]. Language development research has long held that the volume and quality of child-directed speech (words and sentences addressed to the child by a caregiver) are among the strongest environmental predictors of vocabulary growth. A toy that fills the room with its own audio competes with that signal.
The "playful learning" framework
Hirsh-Pasek and colleagues (2015), writing in Psychological Science in the Public Interest, proposed a four-element framework for evaluating whether any learning tool — app, toy, or otherwise — is likely to support development [5]. The four conditions they identified as characteristic of effective learning environments are:
- Active (the child is mentally involved, acting on the material rather than passively observing it)
- Engaged (the child's attention stays on the activity rather than being pulled away by distractions)
- Meaningful (connected to the child's existing knowledge and interests)
- Socially interactive (involves exchange with another person)
Against these four criteria, open-ended physical materials — blocks, sand, clay, construction sets — tend to score well because they invite physical manipulation, can be used multiple ways, can connect to any child's current interest, and leave room for adult commentary. An electronic toy that produces a single predetermined response to each input can score poorly on active engagement and social interaction simultaneously if it effectively replaces rather than supplements adult-child exchange [5].
Hirsh-Pasek and colleagues connect this framework to "guided play" — structured exploration in which an adult participates lightly, supporting and extending the child's inquiry without taking it over [5]. The toy is the container; the adult's presence and language determine what develops inside it.
Reading "IQ-raising" claims critically
"Raises IQ" and "strengthens neural circuits" are common marketing claims for toys aimed at the 0–6 age range. Both claims deserve scrutiny.
There is no consistent body of RCT evidence showing that specific toy use in the 0–6 period produces statistically significant IQ gains [1,2]. IQ is substantially heritable and is influenced by the broad richness of the child's environment over years — not by short-term exposure to a specific object.
"Brain development" language in toy marketing typically references early neural plasticity (the brain's ability to reorganize and form new connections in response to experience, especially in early childhood) without specifying what changes are being claimed. Early childhood is a period of high plasticity, but that is not the same as saying specific stimuli produce predictable specific circuits. Among the most influential factors in early neural development are the security of the child's attachment relationships, the richness of language directed at the child, and the opportunity for self-directed exploration, none of which a toy can supply on its own.
Three practical criteria
For parents evaluating educational toy choices, three questions tend to produce better answers than relying on packaging claims.
First: how many ways can the child use this? A toy with a single predetermined output constrains exploration. A toy that can be combined, reconfigured, and reinterpreted extends child-directed play time and aligns with the "active" criterion in Hirsh-Pasek's framework [5].
Second: does this toy leave room for an adult alongside? Since child-directed speech is such a strong predictor of vocabulary growth, a toy that invites a caregiver to comment, narrate, or problem-solve together does more for language than one that fills the acoustic space by itself [4].
Third: does it match what this child is curious about right now? Age labels on packaging are largely safety guidance (choking risk and similar hazards), and they say little about whether a toy will capture the particular curiosity of a particular child at a particular moment. Following the child's demonstrated interests is a better signal than following the age bracket, and it is also one that any parent can observe without specialized knowledge.
Summary
Montessori education shows modest positive effects in systematic reviews, but the effects cannot yet be attributed specifically to the materials as opposed to the overall method and environment [1,2]. Electronic toys with continuous sound output have been experimentally linked to reductions in the quantity and quality of parent-infant language exchange — not because the toy damages language directly, but because it competes with it [4]. Claims that specific toys raise IQ have no consistent RCT support.
The most effective "educational environment" in the literature is one in which a child can explore actively, an adult speaks to and with the child, and failure carries no penalty [5]. Toys are one element of that environment — not the determinative one.
References
[1] Marshall C. Montessori education: a review of the evidence base. npj Sci Learn. 2017;2:11. doi:10.1038/s41539-017-0012-7.
[2] Randolph JJ, Bryson A, Menon L, et al. Montessori education's impact on academic and nonacademic outcomes: a systematic review. Campbell Syst Rev. 2023;19(3):e1330. doi:10.1002/cl2.1330.
[3] Zimmerman FJ, Christakis DA, Meltzoff AN. Associations between media viewing and language development in children under age 2 years. J Pediatr. 2007;151(4):364–368. doi:10.1016/j.jpeds.2007.04.071. PMID: 17889070.
[4] Sosa AV. Association of the type of toy used during play with the quantity and quality of parent-infant communication. JAMA Pediatr. 2016;170(2):132–137. doi:10.1001/jamapediatrics.2015.3753. PMID: 26720437.
[5] Hirsh-Pasek K, Zosh JM, Golinkoff RM, et al. Putting education in "educational" apps: lessons from the science of learning. Psychol Sci Public Interest. 2015;16(1):3–34. doi:10.1177/1529100615569721. PMID: 25985468.
[6] Robb MB, Richert RA, Wartella EA. Just a talking book? Word learning from watching baby videos. Br J Dev Psychol. 2009;27(Pt 1):27–45. doi:10.1348/026151008X320156. PMID: 19972661.