Type: Package
Title: Modified Generalized Estimating Equations for Binary Outcome
Version: 1.0.0
Maintainer: Ryota Ishii <r.ishii0808@gmail.com>
Description: Analyze small-sample clustered or longitudinal data with a binary outcome using modified generalized estimating equations (GEE) with bias-adjusted covariance estimators. The package supports any combination of three GEE methods and 12 covariance estimators.
Depends: R (≥ 3.5.0)
Imports: MASS (≥ 7.3-45)
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Encoding: UTF-8
LazyData: true
URL: https://github.com/rtishii/geessbin
RoxygenNote: 7.3.2
Suggests: testthat (≥ 3.0.0)
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2024-09-06 05:43:11 UTC; rishii
Author: Ryota Ishii [aut, cre], Tomohiro Ohigashi [ctb], Kazushi Maruo [ctb], Masahiko Gosho [ctb]
Repository: CRAN
Date/Publication: 2024-09-08 23:10:02 UTC

Modified Generalized Estimating Equations for Binary Outcome

Description

geessbin analyzes small-sample clustered or longitudinal data using modified generalized estimating equations (GEE) with bias-adjusted covariance estimators. The function assumes a binary outcome and uses the logit link function. It supports any combination of three GEE methods (the conventional GEE and two modified GEE methods) and 12 covariance estimators (one unadjusted and 11 bias-adjusted estimators).

Usage

geessbin(
  formula,
  data = parent.frame(),
  id = NULL,
  corstr = "independence",
  repeated = NULL,
  beta.method = "PGEE",
  SE.method = "MB",
  b = NULL,
  maxitr = 50,
  tol = 1e-05,
  scale.fix = FALSE,
  conf.level = 0.95
)

Arguments

formula

Object of class formula: symbolic description of model to be fitted (see documentation of lm and formula for details).

data

Data frame.

id

Vector that identifies the subjects or clusters (NULL by default).

corstr

Working correlation structure. The following are permitted: "independence", "exchangeable", "ar1", and "unstructured" ("independence" by default).

repeated

Vector that identifies the repeatedly measured variable within each subject or cluster. If repeated = NULL, the data are assumed to be sorted so that observations on a cluster occupy contiguous rows for all entities in the formula, as in the function gee.

beta.method

Method for estimating regression parameters (see Details section). The following are permitted: "GEE", "PGEE", and "BCGEE" ("PGEE" by default).

SE.method

Method for estimating standard errors (see Details section). The following are permitted: "SA", "MK", "KC", "MD", "FG", "PA", "GS", "MB", "WL", "WB", "FW", and "FZ" ("MB" by default).

b

Numeric vector specifying initial values of the regression coefficients. If b = NULL (default value), the initial values are calculated using ordinary or Firth logistic regression, assuming that all observations are independent.

maxitr

Maximum number of iterations (50 by default).

tol

Tolerance used in fitting algorithm (1e-5 by default).

scale.fix

Logical variable; if TRUE, the scale parameter is fixed at 1 (FALSE by default).

conf.level

Numeric value of confidence level for confidence intervals (0.95 by default).
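
To make the role of conf.level concrete, the following base R sketch shows how a confidence level is typically turned into a Wald-type interval for an odds ratio under the logit link. The estimate and standard error are hypothetical numbers chosen for illustration, and the use of a normal quantile is an assumption; the package's internal computation may differ.

```r
# Hypothetical estimate and standard error on the log-odds scale
# (illustrative values, not output from geessbin)
beta_hat <- 0.8
se_hat <- 0.4
conf.level <- 0.95

# Two-sided critical value from the standard normal distribution
z_crit <- qnorm(1 - (1 - conf.level) / 2)

# Wald interval on the log-odds scale, then mapped to the odds-ratio scale
ci_logit <- beta_hat + c(-1, 1) * z_crit * se_hat
odds_ratio <- exp(beta_hat)
ci_or <- exp(ci_logit)

# Wald z-value and two-sided p-value for H0: beta = 0
z_value <- beta_hat / se_hat
p_value <- 2 * pnorm(-abs(z_value))

print(c(OR = odds_ratio, lower = ci_or[1], upper = ci_or[2], p = p_value))
```

Raising conf.level widens the interval because z_crit grows; the point estimate is unaffected.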

Details

Details of beta.method are as follows:

Details of SE.method are as follows:

Descriptions and performance evaluations of some of these methods can be found in Gosho et al. (2023).

Value

An object of class "geessbin" representing the results of modified generalized estimating equations with bias-adjusted covariance estimators. The generic function summary provides details of the results.

References

Examples

data(wheeze)

# analysis using the PGEE method with the Morel et al. covariance estimator
res <- geessbin(formula = Wheeze ~ City + factor(Age), data = wheeze, id = ID,
                corstr = "ar1", repeated = Age, beta.method = "PGEE",
                SE.method = "MB")

# hypothesis tests for regression coefficients
summary(res)
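
As a further illustration, the sketch below refits the same model once for each permitted value of beta.method while keeping SE.method = "MB". It uses only arguments documented above; the guard around the call and the names methods and fits are assumptions added so that the snippet does nothing when the package is not installed.

```r
# Run only if the geessbin package is available
if (requireNamespace("geessbin", quietly = TRUE)) {
  data("wheeze", package = "geessbin")

  methods <- c("GEE", "PGEE", "BCGEE")
  fits <- list()

  # Fit the same model with each method for the regression parameters
  for (m in methods) {
    fits[[m]] <- geessbin::geessbin(formula = Wheeze ~ City + factor(Age),
                                    data = wheeze, id = ID,
                                    corstr = "ar1", repeated = Age,
                                    beta.method = m, SE.method = "MB")
  }

  # Inspect each fit with the generic summary function
  for (m in methods) {
    cat("beta.method =", m, "\n")
    print(summary(fits[[m]]))
  }
}
```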


Function for analysis using all combinations of GEE methods and covariance estimators

Description

geessbin_all provides analysis results using all combinations of three GEE methods and 12 covariance estimators.

Usage

geessbin_all(
  formula,
  data = parent.frame(),
  id = NULL,
  corstr = "independence",
  repeated = NULL,
  b = NULL,
  maxitr = 50,
  tol = 1e-05,
  scale.fix = FALSE,
  conf.level = 0.95
)

Arguments

formula

Object of class formula: symbolic description of model to be fitted (see documentation of lm and formula for details).

data

Data frame.

id

Vector that identifies the subjects or clusters (NULL by default).

corstr

Working correlation structure. The following are permitted: "independence", "exchangeable", "ar1", and "unstructured" ("independence" by default).

repeated

Vector that identifies the repeatedly measured variable within each subject or cluster. If repeated = NULL, the data are assumed to be sorted so that observations on a cluster occupy contiguous rows for all entities in the formula, as in the function gee.

b

Numeric vector specifying initial values of the regression coefficients. If b = NULL (default value), the initial values are calculated using ordinary or Firth logistic regression, assuming that all observations are independent.

maxitr

Maximum number of iterations (50 by default).

tol

Tolerance used in fitting algorithm (1e-5 by default).

scale.fix

Logical variable; if TRUE, the scale parameter is fixed at 1 (FALSE by default).

conf.level

Numeric value of confidence level for confidence intervals (0.95 by default).
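
The requirement attached to repeated = NULL, that observations on a cluster occupy contiguous rows, can be met by sorting the data before fitting. The following base R sketch uses a small hypothetical data frame (toy, with columns ID, Age, and y) to show one way of doing this and of checking the result; it is not part of the package.

```r
# Hypothetical unsorted data: three clusters measured at two ages
toy <- data.frame(
  ID  = c(2, 1, 2, 1, 3, 3),
  Age = c(10, 9, 9, 10, 10, 9),
  y   = c(0, 1, 1, 0, 1, 1)
)

# Sort by cluster identifier, then by the repeatedly measured variable
toy_sorted <- toy[order(toy$ID, toy$Age), ]
rownames(toy_sorted) <- NULL

# Check contiguity: each ID should appear in exactly one run of rows
runs <- rle(toy_sorted$ID)
contiguous <- !anyDuplicated(runs$values)

print(toy_sorted)
print(contiguous)
```

Sorting by the within-cluster variable as well keeps the measurement order consistent across clusters, which matters for order-dependent working correlation structures such as "ar1".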

Value

A list containing two data frames. The first is a table of estimates of regression coefficients, standard errors, z-values, and p-values. The second is a table of odds ratios and confidence intervals.
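
A minimal usage sketch for geessbin_all follows, mirroring the model from the geessbin example on the wheeze data. The guard around the call and the names res_all, coef_table, and or_table are assumptions for illustration; the two list elements are accessed by position because only their order is described above.

```r
# Run only if the geessbin package is available
if (requireNamespace("geessbin", quietly = TRUE)) {
  data("wheeze", package = "geessbin")

  # Fit all combinations of GEE methods and covariance estimators
  res_all <- geessbin::geessbin_all(formula = Wheeze ~ City + factor(Age),
                                    data = wheeze, id = ID,
                                    corstr = "ar1", repeated = Age)

  # Overview of the returned list
  str(res_all, max.level = 1)

  # First element: coefficient estimates, standard errors, z- and p-values
  coef_table <- res_all[[1]]
  # Second element: odds ratios and confidence intervals
  or_table <- res_all[[2]]

  print(head(coef_table))
  print(head(or_table))
}
```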


Square root of nonsymmetric matrix

Description

sqrtmat calculates the square root of E_i - H_{ii}, which is an adjustment factor in the Kauermann and Carroll-type method.

Usage

sqrtmat(M)

Arguments

M

Square matrix whose square root is to be computed.

Value

The square root of M.
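
To illustrate what a square root of a nonsymmetric matrix means, the base R sketch below computes one through an eigendecomposition, M = V diag(lambda) V^{-1}, so that S = V diag(sqrt(lambda)) V^{-1} satisfies S %*% S = M. The function name sqrt_nonsym is hypothetical, and this is not claimed to be how sqrtmat is implemented; the sketch assumes M is diagonalizable with real, non-negative eigenvalues.

```r
# Hypothetical helper: matrix square root via eigendecomposition.
# Assumes M is diagonalizable with real, non-negative eigenvalues.
sqrt_nonsym <- function(M) {
  e <- eigen(M)
  V <- e$vectors
  # Rebuild the matrix with square roots of the eigenvalues on the diagonal
  V %*% diag(sqrt(e$values), nrow = length(e$values)) %*% solve(V)
}

# A nonsymmetric 2 x 2 matrix with eigenvalues 5 and 2
M <- matrix(c(4, 1, 2, 3), nrow = 2)
S <- sqrt_nonsym(M)

# S multiplied by itself should reproduce M up to rounding error
print(S %*% S)
print(all.equal(S %*% S, M))
```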

References

Kauermann, G. and Carroll, R. J. (2001). A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association, 96, 1387–1396, doi:10.1198/016214501753382309.


Wheeze dataset

Description

The data come from a study of the effect of air pollution on the health of 16 children. The outcome variable was wheezing status, measured once a year on four occasions, at ages 9, 10, 11, and 12 years.

Format

A data frame with 64 observations on the following 5 variables:

ID

child identifier.

Wheeze

binary indicator of wheezing presence.

City

binary indicator of whether the child lives in Kingston (0 = Portage; 1 = Kingston).

Age

age of child in years ranging from 9 to 12.

Smoke

measure of smoking habits (cigarettes per day) of child's mother.

References

Examples

data(wheeze)
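
A short sketch for inspecting the dataset follows. It checks the dimensions stated in the description (16 children, four observations each) and tabulates the outcome by age. The guard around the code and the names n_children and obs_per_child are assumptions for illustration.

```r
# Run only if the geessbin package is available
if (requireNamespace("geessbin", quietly = TRUE)) {
  data("wheeze", package = "geessbin")

  # Structure of the data frame
  str(wheeze)

  # Number of children and number of observations per child
  n_children <- length(unique(wheeze$ID))
  obs_per_child <- table(wheeze$ID)
  print(n_children)
  print(obs_per_child)

  # Wheezing status cross-tabulated by age
  print(table(Age = wheeze$Age, Wheeze = wheeze$Wheeze))
}
```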