This article has two primary aims. The first is to introduce a new Vietnamese text-based corpus. The Corpora of Vietnamese Texts (CVT; Tang, 2006a) consists of approximately 1 million words drawn from newspapers and children's literature, and is available online at www.vnspeechtherapy.com/vi/CVT. The second aim is to investigate potential differences in lexical frequency and distributional characteristics in the CVT on the basis of place of publication (Vietnam or Western countries) and intended audience: adult-directed texts (newspapers) or child-directed texts (children's literature). We found clear differences between adult- and child-directed texts, particularly in the distributional frequencies of pronouns or kinship terms, which were more frequent in children's literature. Within child- and adult-directed texts, lexical characteristics did not differ on the basis of place of publication. Implications of these findings for future research are discussed.
Bibliographical noteFunding Information:
Funding for this project was provided by the Graduate Research Partnership Program at the University of Minnesota and was awarded to the first author under the faculty mentorship of the second author.