The vocabulary demands of English and French L2 textbooks: A cross-lingual corpus study

Studies have determined the vocabulary demands of a variety of input types (e.g., novels, audio-visual media) by calculating lexical profiles, i.e., categorisations of the vocabulary into word frequency levels (e.g., Nation, 2006). Such profiles allow researchers to determine the vocabulary knowledge required to reach 95% and 98% coverage, i.e., the proportion of words in the input that are known, thresholds believed to be necessary for minimal and optimal comprehension of a text’s contents, respectively (Laufer & Ravenhorst-Kalovski, 2010). Though textbooks are a vital source of input in the L2 classroom, lexical profiling research into L2 textbooks is limited. Moreover, existing studies tend to focus exclusively on English. Therefore, a corpus consisting of English and French L2 textbook reading materials was compiled (ca. 300,000 words per L2) in order to investigate (RQ1) what the vocabulary demands are, (RQ2) how these demands evolve across all six years of secondary education, and (RQ3) how the target language influences the demands and their evolution.

Typically, lexical profiling research relies on word families as the lexical unit, but recent research has shown that word families may overestimate the vocabulary knowledge of learners who struggle with morphology (e.g., Brown et al., 2022). In light of this criticism and bearing in mind the higher morphological demands of French, we opted for flemmas and individual word types instead and created a custom Python script to ensure cross-language comparability. The lexical profiles were supplemented with indices of lexical diversity and density.
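
To make the notion of a lexical profile concrete, the following is a minimal sketch of how cumulative coverage per frequency band could be computed. It is not the script used in this study: the function names, the `rank_of` frequency list, and the band size of 1,000 are assumptions made for illustration, and the input is assumed to be already tokenised and flemmatised.

```python
from collections import Counter


def lexical_profile(tokens, rank_of, band_size=1000):
    """Return cumulative coverage (0-1) after each frequency band.

    tokens    : list of flemmatised tokens from the text (assumed input)
    rank_of   : dict mapping an item to its frequency rank, 1 = most frequent
                (hypothetical stand-in for a reference frequency list)
    band_size : number of items per frequency band (assumed to be 1,000)

    Tokens missing from rank_of count as off-list and are never covered.
    """
    total = len(tokens)
    band_counts = Counter()
    for token in tokens:
        rank = rank_of.get(token)
        if rank is not None:
            band_counts[(rank - 1) // band_size] += 1
    if total == 0 or not band_counts:
        return []

    cumulative = []
    running = 0
    for band in range(max(band_counts) + 1):
        running += band_counts[band]  # Counter returns 0 for empty bands
        cumulative.append(running / total)
    return cumulative


def items_needed(cumulative, threshold, band_size=1000):
    """Return the number of most frequent items needed to reach the
    coverage threshold (e.g., 0.95 or 0.98), or None if it is never reached."""
    for band, coverage in enumerate(cumulative):
        if coverage >= threshold:
            return (band + 1) * band_size
    return None
```

Calling `items_needed(lexical_profile(tokens, rank_of), 0.98)` on one grade level's reading materials would then give the size of the frequency list needed for 98% coverage, under the assumptions stated above.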

Results showed that knowledge of the 15,000 most frequent flemmas was required to reach 98% coverage of the first-grade English reading materials, as opposed to 11,000 for French, even though French instruction starts in primary school and English instruction only in secondary school. From there, a gradual and moderately systematic increase in demands across grade levels was observed in both the English and French segments of the corpus. However, some grade levels did not differ in terms of lexical profiles. In addition, English consistently required more vocabulary knowledge than French (ca. 5,000 flemmas more in each grade level). Overall, these findings suggest that publishers take into account the considerably higher English vocabulary knowledge that has been observed in adolescents as a consequence of out-of-school exposure (Peters et al., 2019).