Title: Word Frequency Extraction and Summarization
Version: 0.1.0
Description: Provides tools to extract word frequencies from the CHILDES (Child Language Data Exchange System) corpus. The main function allows users to input a list of words and receive speaker-role-specific frequency counts and a summary of the dataset. The output includes Excel-formatted tables of word counts and metadata summaries such as number of speakers, transcripts, children, and token counts. Useful for researchers studying early language acquisition, corpus linguistics, and speaker role variation. The CHILDES database is maintained at https://childes.talkbank.org/.
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.3.2
Imports: childesr, readr, dplyr, tidyr, writexl, rlang, magrittr, stats
URL: https://github.com/n-albudoor/childeswordfreq
BugReports: https://github.com/n-albudoor/childeswordfreq/issues
Suggests: knitr, rmarkdown, testthat (≥ 3.0.0)
VignetteBuilder: knitr
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2025-04-01 15:53:07 UTC; albudoor.1
Author: Nahar Albudoor [aut, cre]
Maintainer: Nahar Albudoor <n.albudoor@gmail.com>
Repository: CRAN
Date/Publication: 2025-04-08 15:20:02 UTC

Get Word Counts by Speaker Role

Description

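Reads a word list from a CSV file, retrieves frequency counts for those words from CHILDES broken down by speaker role, and writes the counts and a dataset summary to an Excel file.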

Usage

word_counts(
  word_list_file,
  output_file,
  collection = NULL,
  language = NULL,
  corpus = NULL,
  age = NULL,
  sex = NULL
)

Arguments

word_list_file

Path to a CSV file containing a column named "word".

output_file

Path to output Excel (.xlsx) file.

collection

Language collection (default = NULL).

language

Character vector of language codes, e.g. "eng".

corpus

Character vector of corpus names.

age

Numeric vector giving either a single age or a minimum and maximum age, e.g. c(18, 36).

sex

Character vector: "male", "female", or both.

Value

Writes an Excel file with two sheets: one with the word frequencies and one with the dataset summary.
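
Because the result is written to disk, it has to be read back in before it can be inspected in R. The following is a minimal sketch of one way to do that, not part of this package. It assumes the readxl package is installed (it is not listed among this package's imports), and read_all_sheets is a hypothetical helper name. Sheets are looked up by whatever names the workbook contains, since the sheet names are not documented here.

```r
# Assumption: readxl is installed; it is not a dependency of childeswordfreq.
library(readxl)

# Hypothetical helper: read every sheet of an .xlsx workbook into a
# named list of data frames, one element per sheet.
read_all_sheets <- function(path) {
  sheets <- excel_sheets(path)  # sheet names, in workbook order
  out <- lapply(sheets, function(s) read_excel(path, sheet = s))
  names(out) <- sheets
  out
}
```

Once word_counts() has finished, calling read_all_sheets() on the same path passed as output_file should return a list of two data frames; names() on that list shows how the sheets are labelled.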

Examples


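# Example word list shipped with the package
word_file <- system.file("extdata", "word_list.csv", package = "childeswordfreq")

# Write the results to a temporary .xlsx file
output_file <- tempfile(fileext = ".xlsx")

# Count the listed words for language code "eng" and ages 18 to 36
word_counts(
  word_list_file = word_file,
  output_file = output_file,
  collection = NULL,
  language = "eng",
  age = c(18, 36),
  sex = NULL
)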
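
The example above uses the word list bundled with the package. To supply a custom list, the only documented requirement is a CSV file with a "word" column. The sketch below builds such a file with base R; the three words are placeholders chosen for illustration, and the final check is an optional safeguard rather than something the package requires.

```r
# Placeholder words for illustration; substitute the words of interest.
words <- data.frame(word = c("dog", "ball", "mommy"), stringsAsFactors = FALSE)

# Save the list as a CSV with a single "word" column.
word_file <- tempfile(fileext = ".csv")
write.csv(words, word_file, row.names = FALSE)

# Read the file back and confirm the required column is present.
check <- read.csv(word_file, stringsAsFactors = FALSE)
stopifnot("word" %in% names(check))
```

The resulting word_file path can then be passed as the word_list_file argument of word_counts().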