Title: | Access and Harmonize Childfree Demographic Data |
Version: | 0.0.4 |
Description: | Reads demographic data from a variety of public data sources, extracting and harmonizing variables useful for the study of childfree individuals. The identification of childfree individuals and those with other family statuses uses Neal & Neal's (2024) "A Framework for Studying Adults who Neither have Nor Want Children" <doi:10.1177/10664807231198869>; A pre-print is available at <doi:10.31234/osf.io/fa89m>. |
License: | GPL-3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.1 |
Depends: | R (≥ 2.10) |
Imports: | rio, utils, RCurl |
Suggests: | knitr, |
VignetteBuilder: | knitr |
URL: | https://www.zacharyneal.com/childfree-home, https://github.com/zpneal/childfree |
BugReports: | https://github.com/zpneal/childfree/issues |
NeedsCompilation: | no |
Packaged: | 2025-03-20 12:56:41 UTC; zacharyneal |
Author: | Zachary Neal |
Maintainer: | Zachary Neal <zpneal@msu.edu> |
Repository: | CRAN |
Date/Publication: | 2025-03-20 13:10:06 UTC |
childfree: Access and harmonize childfree demographic data
Description
Reads demographic data from a variety of public data sources, extracting and harmonizing variables useful for the study of childfree individuals. The identification of childfree individuals and those with other family statuses uses the framework described by Neal & Neal (2024).
Data can be generated from:
Demographic and Health Surveys data using dhs()
Michigan State University State of the State data using soss()
US CDC National Survey of Family Growth data using nsfg()
An introduction to the package is available using vignette("childfree"), and the detailed codebooks generated by these functions are available using vignette("codebooks").
Author(s)
Maintainer: Zachary Neal zpneal@msu.edu (ORCID)
Authors:
Jennifer Watling Neal jneal@msu.edu (ORCID)
References
Neal, Z. P. and Neal, J. W. (2024). A framework for studying adults who neither have nor want children. The Family Journal, 32, 121-130. Version of record: doi:10.1177/10664807231198869 Preprint: doi:10.31234/osf.io/fa89m
See Also
Useful links:
Report bugs at https://github.com/zpneal/childfree/issues
Read and recode Demographic and Health Surveys (DHS) individual data
Description
Read and recode Demographic and Health Surveys (DHS) individual data
Usage
dhs(files, extra.vars = NULL, progress = TRUE)
Arguments
files | vector: a character vector containing the paths for one or more Individual Recode DHS data files (see details) |
extra.vars | vector: a character vector containing the names of variables to be retained from the raw data |
progress | boolean: display a progress bar |
Details
The Demographic and Health Surveys (DHS) program has regularly collected health data from population-representative samples in many countries using standardized surveys since 1984. The "individual recode" data files contain women's responses, while the "men recode" files contain men's responses. These files are available in SPSS, SAS, and Stata formats from https://www.dhsprogram.com/; however, access requires a free application. The dhs() function reads one or more of these files, extracts and recodes selected variables useful for studying childfree adults and other family statuses, then returns an unweighted data frame.
Although access to DHS data requires an application, the DHS program provides a model dataset for practice. The example provided below uses the model data file "ZZIR62FL.SAV", which contains fictitious women's data, but has the same structure as a real DHS data file. The example can be run without prior application for data access.
Sampling weights
The DHS is collected using a complex survey design. The survey package can be used to perform analyses that take these design features into account and make it possible to obtain population-representative estimates. In most cases, a svydesign object for a single country and wave can be created using survey::svydesign(data = data, ids = ~cluster, strata = ~strata, weights = ~weight, nest = TRUE).
Additional information about analyzing DHS data using weights is available from the DHS program and in the documentation provided with the downloaded data files.
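As a rough illustration of why the weights matter, the sketch below compares unweighted and weighted family-status proportions. The data frame `toy` is entirely made up and only mimics the column names used in the svydesign call above (cluster, strata, weight) plus famstat; the two family-status labels are placeholders, not the package's actual categories. The design-based step runs only if the survey package is installed.

```r
# Toy data standing in for the output of dhs(); every value is invented.
toy <- data.frame(
  cluster = c(1, 1, 2, 2, 3, 3, 4, 4),
  strata  = c(1, 1, 1, 1, 2, 2, 2, 2),
  weight  = c(0.5, 0.5, 1.5, 1.5, 1.0, 1.0, 2.0, 2.0),
  famstat = factor(c("Parent", "Childfree", "Parent", "Parent",
                     "Childfree", "Parent", "Parent", "Childfree"))
)

# Unweighted proportions: every respondent counts equally.
unweighted <- prop.table(table(toy$famstat))

# Weighted point estimates in base R: each respondent counts by their weight.
weighted <- prop.table(xtabs(weight ~ famstat, data = toy))

print(unweighted)
print(weighted)

# Design-based estimates with standard errors, if 'survey' is available.
if (requireNamespace("survey", quietly = TRUE)) {
  design <- survey::svydesign(data = toy, ids = ~cluster, strata = ~strata,
                              weights = ~weight, nest = TRUE)
  print(survey::svymean(~famstat, design))
}
```

The base R step only reproduces point estimates; standard errors that respect the clustering and stratification require the survey design object.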
Non-biological children
Information about non-biological children (e.g., adopted children, foster children, etc.) is not available in the DHS, which means that a respondent with only non-biological children would be classified as a non-parent. This does not exactly match the approach described by the ABC Framework (Neal & Neal, 2024), and may lead to discrepancies when comparing DHS estimates to estimates derived from other data where information about non-biological children is available.
Additional notes
The SPSS-formatted files containing data from Gabon Recode 4 (GAIR41FL.SAV, GAMR41FL.SAV) and Turkey Recode 4 (TRIR41FL.SAV, TRMR41FL.SAV) contain encoding errors. Use the SAS-formatted files (GAIR41FL.SAS7BDAT, GAMR41FL.SAS7BDAT, TRIR41FL.SAS7BDAT, TRMR41FL.SAS7BDAT) instead.
In some cases, DHS makes available individual recode data files for specific regions. For example, women's data from individual states in India from 1999 are contained in files named XXIR42FL.SAV, where the "XX" is a two-letter state code. The dhs() function has only been tested using whole-country files, and may not perform as expected for regional files.
Variables containing women's responses in the individual recode files begin with v, while variables containing men's responses in the men recode files begin with mv. When applying dhs() to both female and male data, these are automatically harmonized. However, if extra variables are requested using the extra.vars option, be sure to specify both names (e.g., extra.vars = c("v201", "mv201")).
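One way to avoid forgetting the men's variable names is to build them from the women's names. The sketch below assumes the naming pattern described above (a women's variable "vNNN" corresponds to a men's variable "mvNNN"); the second variable, "v012", is included purely for illustration, and the file paths in the commented call are hypothetical placeholders.

```r
# Women's variable names to retain from the raw data (illustrative choice).
women_vars <- c("v201", "v012")

# Derive the matching men's names by prefixing "m" ("v201" becomes "mv201").
men_vars <- paste0("m", women_vars)

# Request both sets so the variables are kept for female and male files.
extra <- c(women_vars, men_vars)
print(extra)

# Hypothetical call; replace the paths with your own downloaded DHS files.
# dat <- dhs(files = c("path/to/women_file.SAV", "path/to/men_file.SAV"),
#            extra.vars = extra)
```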
Value
A data frame containing variables described in the codebook available using vignette("codebooks").
If you are offline, or if the requested data are otherwise unavailable, NULL is returned.
References
ABC Framework: Neal, Z. P. and Neal, J. W. (2024). A framework for studying adults who neither have nor want children. The Family Journal, 32, 121-130. doi:10.1177/10664807231198869
Examples
dat <- dhs(files = c("ZZIR62FL.SAV"), extra.vars = c("v201")) #Request data for fictitious country
if (!is.null(dat)) { #If data was available...
table(dat$famstat)/nrow(dat) #Fraction of respondents with each family status
}
Read and recode National Survey of Family Growth (NSFG) data
Description
Read and recode National Survey of Family Growth (NSFG) data
Usage
nsfg(years, nonbio = TRUE, keep_source = FALSE, progress = TRUE)
Arguments
years | vector: a numeric vector containing the starting year of NSFG waves to include (2002, 2006, 2011, 2013, 2015, 2017, 2022) |
nonbio | boolean: should non-biological children be included |
keep_source | boolean: keep the raw variables used to construct |
progress | boolean: display a progress bar |
Details
The U.S. Centers for Disease Control National Survey of Family Growth (NSFG) regularly collects fertility and other health information from a population-representative sample of adults in the United States. Between 1973 and 2002, the NSFG was conducted periodically. Starting in 2006, the NSFG transitioned to continuous data collection, releasing data in multi-year waves (e.g., 2006-2010, 2011-2013). The nsfg() function reads the raw data from CDC's website, extracts and recodes selected variables useful for studying childfree adults and other family statuses, then returns an unweighted data frame.
Sampling weights
The NSFG is collected using a complex survey design. The survey package can be used to perform analyses that take these design features into account and make it possible to obtain population-representative estimates. In most cases, a svydesign object for a single wave can be created using survey::svydesign(data = data, ids = ~cluster, strata = ~strata, weights = ~weight, nest = TRUE).
Additional information about analyzing NSFG data using weights is available in the NSFG documentation.
Non-biological children
When nonbio = TRUE (default), non-biological children (e.g., adopted children, foster children, etc.) are treated the same as biological children when determining a respondent's family status. This matches the approach described by the ABC Framework (Neal & Neal, 2024), and should generally be used. However, non-biological children can be ignored by setting nonbio = FALSE, which may be useful when comparing NSFG estimates to estimates derived from other data where information about non-biological children is not available.
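The effect of this option can be pictured with a deliberately simplified sketch. The helper `is_parent()` and the columns `bio_children` and `nonbio_children` are hypothetical and are not the package's implementation or variable names; they only show how a respondent with no biological children but one non-biological child changes category when non-biological children are ignored.

```r
# Invented respondents; column names are hypothetical, not NSFG variables.
toy <- data.frame(
  id              = 1:4,
  bio_children    = c(0, 2, 0, 0),
  nonbio_children = c(0, 0, 1, 0)
)

# Hypothetical helper: is the respondent counted as a parent?
is_parent <- function(data, nonbio = TRUE) {
  total <- data$bio_children
  if (nonbio) {
    # Count adopted, foster, and other non-biological children too.
    total <- total + data$nonbio_children
  }
  total > 0
}

with_nonbio    <- is_parent(toy, nonbio = TRUE)
without_nonbio <- is_parent(toy, nonbio = FALSE)

# Respondent 3 is a parent only when non-biological children are counted.
print(data.frame(id = toy$id, with_nonbio, without_nonbio))
```

In this toy case only respondent 3 moves between categories, which is the kind of discrepancy to expect when comparing against sources, such as the DHS, that record biological children only.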
Additional notes
Starting in 2006, "hispanic" was a response option for race; however, "hispanic" is an ethnicity, not a racial category. When a respondent chose this option, their actual race is unknown.
Partnership status only describes a respondent's status with respect to an opposite-sex partner. Information about current or former same-sex partnerships is not available.
The NSFG manual explains that "sample sizes for a single year are too small to provide estimates with adequate levels of precision," and therefore recommends avoiding analysis of data from single years. Instead, these data are designed to be analyzed by wave.
Value
A data frame containing variables described in the codebook available using vignette("codebooks").
References
NSFG Classification: Neal, J. W. and Neal, Z. P. (2025). Tracking types of non-parents in the United States. Journal of Marriage and Family. doi:10.1111/jomf.13097
ABC Framework: Neal, Z. P. and Neal, J. W. (2024). A framework for studying adults who neither have nor want children. The Family Journal, 32, 121-130. doi:10.1177/10664807231198869
Examples
unweighted <- nsfg(years = 2017) #Unweighted data
table(unweighted$famstat) / nrow(unweighted) #Fraction of respondents with each family status
Read and recode Michigan State of the State (SOSS) data
Description
Read and recode Michigan State of the State (SOSS) data
Usage
soss(waves, extra.vars = NULL, progress = TRUE)
Arguments
waves | vector: a numeric vector containing the SOSS waves to include (currently available: 79, 82, 84, 85, 86) |
extra.vars | vector: a character vector containing the names of variables to be retained from the raw data |
progress | boolean: display a progress bar |
Details
The State of the State Survey (SOSS) is regularly collected by the Institute for Public Policy and Social Research (IPPSR) at Michigan State University (MSU). Each wave is collected from a sample of 1000 adults in the US state of Michigan, and includes sampling weights to obtain a sample that is representative of the state's population with respect to age, gender, race, and education. The soss() function reads the raw data from IPPSR's website, extracts and recodes selected variables useful for studying childfree adults and other family statuses, then returns an unweighted data frame. Questions necessary for identifying childfree adults have been asked in five waves, which each include unique questions that may be of interest:
- Wave 79 (May 2020) - Neighborhoods, Health care, COVID, Personality
- Wave 82 (September 2021) - Trust in government, Critical Race Theory
- Wave 84 (April 2022) - Trust in scientists, Autonomous vehicles, Morality
- Wave 85 (September 2022) - Reproductive rights, Race equity
- Wave 86 (December 2022) - Education, Infrastructure
Sampling weights
The SOSS includes sampling weights that can be incorporated into analyses using the survey package to obtain population-representative estimates. A svydesign object for a single wave can be created using survey::svydesign(data = data, ids = ~1, weights = ~weight).
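For a quick sense of what the weights do in a single wave, the sketch below computes a weighted share with base R and, if the survey package is installed, the corresponding design-based estimate. The data frame `toy` is invented and merely reuses the weight and famstat column names; the family-status labels are placeholders.

```r
# Invented respondents standing in for the output of soss() for one wave.
toy <- data.frame(
  weight  = c(0.8, 1.2, 1.0, 1.0),
  famstat = factor(c("Childfree", "Parent", "Parent", "Childfree"))
)

# Weighted share of respondents in one (placeholder) family status.
childfree_share <- weighted.mean(toy$famstat == "Childfree", w = toy$weight)
print(childfree_share)

# The same design as in the text above, if 'survey' is available.
if (requireNamespace("survey", quietly = TRUE)) {
  design <- survey::svydesign(data = toy, ids = ~1, weights = ~weight)
  print(survey::svymean(~famstat, design))
}
```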
Non-biological children
Non-biological children (e.g., adopted children, foster children, etc.) are treated the same as biological children when determining a respondent's family status. This matches the approach described by the ABC Framework (Neal & Neal, 2024). However, it can lead to discrepancies when comparing SOSS estimates to estimates derived from other data where information about non-biological children is not available.
Additional notes
Wave 79 did not include a "do not know" option for selected questions. Therefore, it is not possible to identify "undecided" or "ambivalent non-parent" respondents. This may lead other family status categories to be inflated.
Wave 82 originally included a 500 person oversample of parents, but they are excluded from soss(waves = 82).
The provided sampling weights are designed to be used in the analyses of individual waves. Combining data from multiple waves may require using adjusted weights.
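What an adjustment might look like depends on the analysis. As one hedged possibility, and not a recommendation from the package or from IPPSR, the sketch below rescales the weights so that each wave sums to 1 and therefore contributes equally to a pooled estimate. The data and the `wave` column name are hypothetical.

```r
# Invented pooled data from two waves; 'wave' is a hypothetical column name.
toy <- data.frame(
  wave   = c(84, 84, 84, 85, 85),
  weight = c(0.5, 1.0, 1.5, 2.0, 4.0)
)

# Divide each weight by the total weight of its own wave.
toy$adj_weight <- toy$weight / ave(toy$weight, toy$wave, FUN = sum)

# Check: the adjusted weights now sum to 1 within every wave.
wave_totals <- tapply(toy$adj_weight, toy$wave, sum)
print(wave_totals)
```

Other schemes, such as scaling each wave to its sample size, may suit a given analysis better.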
Value
A data frame containing variables described in the codebook available using vignette("codebooks").
If you are offline, or if the requested data are otherwise unavailable, NULL is returned.
References
ABC Framework: Neal, Z. P. and Neal, J. W. (2024). A framework for studying adults who neither have nor want children. The Family Journal, 32, 121-130. doi:10.1177/10664807231198869
Examples
dat <- soss(waves = 86) #Request data for December 2022
if (!is.null(dat)) { #If data was available...
table(dat$famstat) / nrow(dat) #Fraction of respondents with each family status
}