What is semantic diversity, and how might it support children’s reading?

We’ve been thinking a lot about semantic diversity lately and how it might relate to children’s reading development. We invited our latest recruit to the research team, Nicky Dawson, to summarise our recent paper and think about the implications of its findings. Over to Nicky…

Experience with words matters for reading. We know this because children who read more are better word readers than children who read less. Similarly, words that occur more frequently in the language are read more easily by both children and adults compared to words that occur less frequently. What is it about exposure to words in texts that is so important for reading? Yaling Hsiao and Kate Nation set out to address this question in a recent paper (and see here for Open Access version).

Reading experience provides opportunities to build word knowledge. According to the Lexical Quality Hypothesis, knowledge about the meaning, spelling and pronunciation of a word affects how efficiently that word is processed during reading. When these three sources of knowledge are well-specified and closely interlinked in memory, word reading is more fluent and accurate. This means that more attention can be devoted to the primary aim of reading: understanding the text. Importantly, the quality of knowledge about words will vary across both individual people and individual words. More experienced readers will have, on average, more detailed and integrated knowledge about words than less experienced readers. Equally, at any point in time, a given individual will have better knowledge of some words compared to others.

How, then, does reading experience contribute to the quality of word knowledge? One view is that repeated encounters with the same word help to consolidate that word in memory, meaning that it can be processed efficiently during reading. In other words, repetition is important. However, this might not be the only factor at play. In general, words are not read in isolation, but form part of a larger body of text. This text can provide a rich source of information about the individual words it contains. Over time, a reader will encounter words across multiple, diverse contexts. Each encounter with a word in a different context provides an opportunity for the reader to accumulate and refine knowledge about the meaning of that word and its relationship to other words. On this perspective, it is not just the number of times an individual encounters a word that is important, but also the quality and diversity of those encounters. Individuals who read more widely will experience more of these opportunities, which may help to explain why they are better readers.

To test this idea, Yaling and Kate investigated how contextual experience influences children’s word reading. Specifically, they wanted to measure the amount of variation in meaning across the different contexts in which a given word occurs (known as ‘semantic diversity’), and the association between this measure and children’s reading performance on those words. For example, words such as perjury and predicament are each associated with a specific meaning, yet perjury tends to occur across similar contexts (i.e. relating to law), while predicament is used in more semantically varied contexts (Hoffman, Lambon Ralph & Rogers, 2013). This relates to children’s reading experience because the more widely children read, the more opportunity they have to encounter a given word in different meaningful contexts. In turn, this is likely to lead to differences in the quality of knowledge children have about that word, which, as outlined above, is thought to influence how efficiently they process the word during reading.

Their first aim was to establish a measure of semantic diversity. A previous study with children used ‘document count’ as an index of diversity, which simply counts the number of different documents that a word appears in. However, this measure is limited in at least two ways. First, the number of unique documents containing a given word is closely associated with the overall number of occurrences of that word, making it difficult to tease apart the effects of frequency and the effects of diversity. Second, it does not take the content of the documents into account, so it is unclear which aspects of diversity are important for word reading. As an alternative method, Hsiao and Nation used an approach based on ‘Latent Semantic Analysis’. This provides a graded measure of the semantic similarity across all the different contexts (documents) a word appears in. This is less likely to be related to overall frequency of occurrence, and it quantifies differences in semantic content across contexts.
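To make the contrast between the two measures concrete, here is a minimal sketch in Python. It is not the pipeline used in the paper: the six toy "documents" are invented, the function names are hypothetical, and raw word-count vectors stand in for the reduced vectors that Latent Semantic Analysis would produce. The final step, taking the negative log of the mean pairwise similarity between a word's contexts, follows one common formulation of semantic diversity (Hoffman et al., 2013) and is assumed here rather than taken from the post.

```python
import math
from collections import Counter
from itertools import combinations

# Invented toy corpus: each string stands in for one document (context).
DOCUMENTS = [
    "the witness committed perjury in court before the judge",
    "the lawyer said perjury in court is a crime",
    "the judge punished perjury in the court trial",
    "the explorer faced a predicament on the icy mountain",
    "her predicament at school was a missing homework book",
    "the stranded sailors discussed their predicament at sea",
]


def tokenize(text):
    """Lower-case a document and split it on whitespace."""
    return text.lower().split()


def document_count(word, docs):
    """Number of documents that contain the word at least once."""
    return sum(1 for doc in docs if word in tokenize(doc))


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_diversity(word, docs):
    """Negative log of the mean pairwise similarity of a word's contexts.

    Simplifying assumptions: contexts are whole documents, the target word
    is left out of its own context vector, and no LSA reduction or term
    weighting is applied. Returns None if the word has fewer than two
    contexts, and infinity if its contexts share no words at all.
    """
    contexts = [
        Counter(tok for tok in tokenize(doc) if tok != word)
        for doc in docs
        if word in tokenize(doc)
    ]
    if len(contexts) < 2:
        return None
    sims = [cosine(a, b) for a, b in combinations(contexts, 2)]
    mean_sim = sum(sims) / len(sims)
    return float("inf") if mean_sim <= 0 else -math.log(mean_sim)


if __name__ == "__main__":
    for target in ("perjury", "predicament"):
        print(target, document_count(target, DOCUMENTS),
              semantic_diversity(target, DOCUMENTS))
```

In this toy corpus both words appear in three documents, so document count cannot tell them apart. The similarity-based measure can: the contexts for perjury overlap heavily (court, judge), giving a lower diversity value than the contexts for predicament, which share only a few function words.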

The semantic diversity measure was calculated using the Oxford Children’s Corpus, a huge collection of around 35 million words in 12,000 texts written for children aged 5-16 years, developed by Oxford University Press (read more about it here). These texts include stories, non-fiction books, educational texts, as well as websites and magazines, providing a broad range of diverse contexts. Using the Latent Semantic Analysis technique described above, words within the corpus were assigned a value corresponding to the semantic diversity of the contexts in which they appeared across the corpus. These values were then used to test whether the semantic diversity value of a word influences how well children can read it.

In the first experiment, 60 words taken from the corpus were split into two sets: high semantic diversity and low semantic diversity. Thirty-five children aged 8-11 years completed two tasks using these words. In one task, the lexical decision task, children saw half of the high and low semantic diversity words and some nonsense words on a computer screen. Each word or nonsense word was presented individually, out of context and in a random order, and the children had to indicate whether the item they saw was a word by pushing a button as quickly as possible. In the other task (word naming), children were presented with the other half of the high and low semantic diversity words, each printed on its own card, and they were required to read each one aloud.

The main finding from this first experiment was that children were quicker to respond in the lexical decision task to words in the high semantic diversity condition compared to the low semantic diversity condition. More detailed analysis revealed that in addition to the effects of semantic diversity, responses were also faster to words that occurred more frequently in the corpus. Not surprisingly, children who were more proficient readers (as identified by scores on a standardised test of reading ability) were quicker in their responses than children who were less proficient readers. These findings suggest that high semantic diversity supports efficiency of word reading, and that semantic diversity is separable from measures of overall frequency of occurrence (and document count).

One limitation of this experiment is that effects of age of acquisition were not considered. Age of acquisition simply refers to the age at which children typically acquire a given word. It is well established that words learned earlier in development are read more easily by children and adults, and there is some evidence that age of acquisition is linked with the diversity of semantic connections a given word has to other words. To examine whether semantic diversity was important for reading over and above age of acquisition, frequency, and document count, a second experiment was run using 300 words from the Oxford Children’s Corpus, covering a wider range of semantic diversity values. The words were split into 10 lists of 30 words. One hundred and fourteen Year 3 to Year 6 children each responded to two of these lists, one in the decision task and one in the naming task.

Results from Experiment 2 revealed that children were faster and more accurate in the lexical decision task, and more accurate at naming, when words were higher in semantic diversity, occurred more frequently in texts, occurred in a larger number of different texts, and were acquired earlier in development. These findings provide further support for the idea that semantic diversity influences word reading, even when overall frequency, document count and age of acquisition are taken into account.

In a final experiment, Yaling and Kate tested whether the findings would replicate in an existing word reading database. The data came from 350 children aged 6-13 years and were gathered during the standardisation of the Diagnostic Test of Word Reading Processes, an assessment of single word reading. Again, the findings clearly showed an effect of semantic diversity: children were more accurate at reading words that are higher in semantic diversity. Age of acquisition was also important: words that are typically acquired earlier in development were read more accurately than later-acquired words. Interestingly, overall frequency of occurrence and document count did not matter for word reading accuracy in this experiment.

Together, these experiments demonstrate that children read individual words more efficiently when those words appear in semantically diverse contexts across their reading experience. This effect could not be explained by word frequency or age of acquisition, although these variables also influenced ease of reading, as did the children’s own level of reading ability.

So how might semantic diversity contribute to word reading? One suggestion is that if a word is encountered in semantically diverse contexts it becomes less dependent on context, and each new encounter with the word provides an opportunity to update and strengthen its representation in memory. If, however, it is repeatedly encountered in a similar semantic context, this predictability means there is less need for its representation to be strengthened in memory. Over time, this leads to differences in reading behaviour, even when words are read out of context.

It is important to stress that this is novel work that is at an early stage – many questions remain. Nevertheless, the findings highlight the importance of reading experience for children’s reading development. Children who read more often, and more widely, will benefit from exposure to words across a range of diverse and semantically rich contexts. It is also important to note that variation in reading experience is itself a consequence of reading ability: children who are better at reading read more. Once children are equipped with the skills they need to decode words in texts, exposing them to a rich and varied range of reading material will help to support them in the transition from novice to skilled reader.

About the Author

Nicky Dawson is in the final stages of completing her doctorate at Royal Holloway, University of London. Working with supervisors Jessie Ricketts and Kathy Rastle, her work explores the role of morphological knowledge in the development of lexical processing. Together, they recently published a paper in the Journal of Experimental Psychology: Learning, Memory and Cognition showing that processing of morphologically complex items becomes more efficient over the course of adolescence. Nicky recently started as a post-doctoral researcher at the University of Oxford and will be working with Kate Nation and Yaling Hsiao on a new project examining the role of book language in language and literacy development, funded by the Nuffield Foundation.

Word cloud picture credit: This was created by Yaling Hsiao using data from the 134,780 stories written by children and submitted to this year’s BBC Radio 2 500 Words writing competition. It shows the words that co-occurred most frequently with the word ‘read’ in the children’s stories. This is not the same as semantic diversity as discussed above, but it nevertheless demonstrates the rich and varying contexts words appear in. Yaling created the collocation network using visualisation software described in this paper: Brezina, V., McEnery, T., & Wattam, S. (2015). Collocations in context: A new perspective on collocation networks. International Journal of Corpus Linguistics, 20(2), 139-173.
