Abstract
Child language acquisition is often identified as one of the primary drivers of language change, but the lack of historical child data presents a challenge for empirically investigating its effect. In this work, I examine the relationship between lexicons extracted from modern child-directed speech and those drawn from modern and historical literary corpora in order to better understand when language acquisition can be modeled over historical and non-child corpora as it is over child corpora. Morphophonological and syntactic-semantic patterns occur at similar type frequencies in these corpora among high token frequency items, and furthermore, when a learning algorithm is applied to lexicons sampled from these sources, it consistently achieves the same learning outcomes in each. With appropriate care and pre-processing, modern and historical text corpora are effectively interchangeable with child-directed speech corpora for the purpose of estimating child lexical experience, opening a path for modeling language acquisition where child-directed corpora are not available.
| Original language | English |
|---|---|
| Article number | 122 |
| Journal | Glossa |
| Volume | 4 |
| Issue number | 1 |
| DOIs | |
| State | Published - 2019 |
Keywords
- Child language acquisition
- Corpus linguistics
- English
- Historical linguistics
- Latin
- Paradigm saturation
- Proto-Germanic
- Spanish
Fingerprint
Dive into the research topics of 'Estimating child linguistic experience from historical corpora'. Together they form a unique fingerprint.