Estimating child linguistic experience from historical corpora

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

Child language acquisition is often identified as one of the primary drivers of language change, but the lack of historical child data presents a challenge for empirically investigating its effect. In this work, I observe the relationship between lexicons extracted from modern child-directed speech and those drawn from modern and historical literary corpora in order to better understand when language acquisition can be modeled over historical and non-child corpora as it is over child corpora. Among high token frequency items, morphophonological and syntactic-semantic patterns occur at similar type frequencies in these corpora, and furthermore, when a learning algorithm is applied to lexicons sampled from these sources, it consistently achieves the same learning outcomes in each. With appropriate care and pre-processing, modern and historical text corpora are effectively interchangeable with child-directed speech corpora for the purpose of estimating child lexical experience, opening a path for modeling language acquisition where child-directed corpora are not available.

Original language: English
Article number: 122
Journal: Glossa
Volume: 4
Issue number: 1
State: Published - 2019

Keywords

  • Child language acquisition
  • Corpus linguistics
  • English
  • Historical linguistics
  • Latin
  • Paradigm saturation
  • Proto-Germanic
  • Spanish
