Help for package fastdid

Type:

Package

Title:

Fast Staggered Difference-in-Difference Estimators

Version:

1.0.5

Date:

2025-06-13

Maintainer:

Lin-Tung Tsai <tsaidondon@gmail.com>

Description:

A fast and flexible implementation of Callaway and Sant'Anna's (2021) <doi:10.1016/j.jeconom.2020.12.001> staggered Difference-in-Differences (DiD) estimators. 'fastdid' reduces the computation time from hours to seconds and incorporates extensions such as time-varying covariates and multiple events.

License:

MIT + file LICENSE

Depends:

R (≥ 4.1.0)

Imports:

data.table (≥ 1.15.0), stringr, BMisc, collapse, dreamerr (≥ 1.4.0), parglm, ggplot2

Suggests:

did, knitr, parallel, rmarkdown, tinytest

Encoding:

UTF-8

RoxygenNote:

7.3.2

URL:

https://github.com/TsaiLintung/fastdid, https://tsailintung.github.io/fastdid/

BugReports:

https://github.com/TsaiLintung/fastdid/issues

VignetteBuilder:

knitr

NeedsCompilation:

Packaged:

2025-06-23 17:26:07 UTC; lttsai

Author:

Lin-Tung Tsai [aut, cre, cph], Maxwell Kellogg [ctb], Kuan-Ju Tseng [ctb]

Repository:

CRAN

Date/Publication:

2025-06-23 17:40:02 UTC

Fast Staggered DID Estimation

Description

Performs Difference-in-Differences (DID) estimation.

Usage

fastdid(
  data,
  timevar,
  cohortvar,
  unitvar,
  outcomevar,
  control_option = "both",
  result_type = "group_time",
  balanced_event_time = NA,
  control_type = "ipw",
  allow_unbalance_panel = FALSE,
  boot = FALSE,
  biters = 1000,
  cband = FALSE,
  alpha = 0.05,
  weightvar = NA,
  clustervar = NA,
  covariatesvar = NA,
  varycovariatesvar = NA,
  copy = TRUE,
  validate = TRUE,
  anticipation = 0,
  anticipation2 = 0,
  base_period = "universal",
  exper = NULL,
  full = FALSE,
  parallel = FALSE,
  cohortvar2 = NA,
  event_specific = TRUE,
  double_control_option = "both"
)

Arguments

data

data.table, the dataset.

timevar

character, name of the time variable.

cohortvar

character, name of the cohort (group) variable.

unitvar

character, name of the unit (id) variable.

outcomevar

character vector, name(s) of the outcome variable(s).

control_option

character, control units used for the DiD estimates, options are "both", "never", or "notyet".

result_type

character, type of result to return, options are "group_time", "time", "group", "simple", "dynamic" (time since event), "group_group_time", or "dynamic_stagger".

balanced_event_time

number, max event time to balance the cohort composition.

control_type

character, estimator for controlling for covariates, options are "ipw" (inverse probability weighting), "reg" (outcome regression), or "dr" (doubly-robust).

allow_unbalance_panel

logical, whether to allow an unbalanced panel as input or coerce the dataset into a balanced one.

boot

logical, whether to use bootstrap standard errors.

biters

number, bootstrap iterations. Default is 1000.

cband

logical, whether to use a uniform confidence band instead of a point-wise one.

alpha

number, the significance level. Default is 0.05.

weightvar

character, name of the weight variable.

clustervar

character, name of the cluster variable.

covariatesvar

character vector, names of time-invariant covariate variables.

varycovariatesvar

character vector, names of time-varying covariate variables.

copy

logical, whether to copy the dataset.

validate

logical, whether to validate the dataset.

anticipation

number, periods with anticipation.

anticipation2

number, periods with anticipation for the second event.

base_period

character, type of base period in pre-periods, options are "universal" or "varying".

exper

list, arguments for experimental features.

full

logical, whether to return the full result (influence function, call, weighting scheme, etc.).

parallel

logical, whether to use parallelization on Unix systems.

cohortvar2

character, name of the second cohort (group) variable.

event_specific

logical, whether to recover the target treatment effect or use the combined effect.

double_control_option

character, control units used for the double DiD, options are "both", "never", or "notyet".

Details

'balanced_event_time' is only meaningful when 'result_type == "dynamic"'.

The 'result_type' options "group_group_time" and "dynamic_stagger" are only meaningful when using double DiD.

'biters' and 'clustervar' are only used when 'boot == TRUE'.
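
To make the first and third notes concrete, here is a minimal sketch that uses only arguments documented above. The values 'balanced_event_time = 2' and 'biters = 200' are illustrative assumptions, not recommendations, and the column names ("time", "G", "unit", "y") follow the package examples.

```r
library(fastdid)

# Simulated panel, as in the package examples
simdt <- sim_did(1e+02, 10, seed = 1)
dt <- simdt$dt

# Event-study aggregation: balanced_event_time is only meaningful here
# because result_type is "dynamic". The value 2 is an arbitrary choice.
res_dynamic <- fastdid(
  data = dt, timevar = "time", cohortvar = "G",
  unitvar = "unit", outcomevar = "y",
  result_type = "dynamic",
  balanced_event_time = 2
)

# Bootstrap standard errors: biters is only used because boot = TRUE.
# 200 iterations keeps the sketch quick; the default is 1000.
res_boot <- fastdid(
  data = dt, timevar = "time", cohortvar = "G",
  unitvar = "unit", outcomevar = "y",
  result_type = "dynamic",
  boot = TRUE, biters = 200
)

print(res_dynamic)
print(res_boot)
```

Leaving 'boot' at its default of FALSE while setting 'biters' would, per the note above, leave 'biters' unused.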

Value

A data.table containing the estimated treatment effects and standard errors, or a list of all results when 'full == TRUE'.
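
The two return shapes can be compared with a short sketch. It only inspects the class and the element names of the 'full = TRUE' result; the names themselves are not listed on this page, so the code prints them rather than assuming any.

```r
library(fastdid)

simdt <- sim_did(1e+02, 10, seed = 1)
dt <- simdt$dt

# Default: a data.table of estimates and standard errors
res <- fastdid(
  data = dt, timevar = "time", cohortvar = "G",
  unitvar = "unit", outcomevar = "y",
  result_type = "group_time"
)

# full = TRUE: a list of all results
res_full <- fastdid(
  data = dt, timevar = "time", cohortvar = "G",
  unitvar = "unit", outcomevar = "y",
  result_type = "group_time",
  full = TRUE
)

class(res)        # expected to include "data.table"
names(res_full)   # inspect which components are returned
```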

Examples

# simulated data
simdt <- sim_did(1e+02, 10, cov = "cont", second_cov = TRUE, second_outcome = TRUE, seed = 1)
dt <- simdt$dt

# basic call
result <- fastdid(
  data = dt, timevar = "time", cohortvar = "G",
  unitvar = "unit", outcomevar = "y",
  result_type = "group_time"
)
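
As a further sketch, the choice of control group can be compared on the same simulated data by aggregating to a single estimate ('result_type = "simple"') under two 'control_option' values. The helper 'estimate_simple()' is hypothetical and defined only for this illustration.

```r
library(fastdid)

simdt <- sim_did(1e+02, 10, seed = 1)
dt <- simdt$dt

# Hypothetical helper for this sketch: one overall estimate
# for a given choice of control units.
estimate_simple <- function(control) {
  fastdid(
    data = dt, timevar = "time", cohortvar = "G",
    unitvar = "unit", outcomevar = "y",
    result_type = "simple",
    control_option = control
  )
}

res_never  <- estimate_simple("never")   # never-treated units as controls
res_notyet <- estimate_simple("notyet")  # not-yet-treated units as controls

print(res_never)
print(res_notyet)
```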

Plot event study

Description

Plot event study results.

Usage

plot_did_dynamics(x, margin = "event_time")

Arguments

x

A data.table generated by [fastdid] with a one-dimensional index.

margin

character, the variable on the x-axis of the plot. Default is "event_time".

Value

A ggplot2 object.

Examples


# simulated data
simdt <- sim_did(1e+02, 10, seed = 1)
dt <- simdt$dt

# estimation
result <- fastdid(
  data = dt, timevar = "time", cohortvar = "G",
  unitvar = "unit", outcomevar = "y",
  result_type = "dynamic"
)

# plot
plot_did_dynamics(result)
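
Because the return value is a ggplot2 object, it can presumably be customized with ordinary ggplot2 layers. The sketch below assumes nothing about the plot's internals; the title and axis labels are illustrative.

```r
library(fastdid)

simdt <- sim_did(1e+02, 10, seed = 1)
dt <- simdt$dt

result <- fastdid(
  data = dt, timevar = "time", cohortvar = "G",
  unitvar = "unit", outcomevar = "y",
  result_type = "dynamic"
)

# Store the plot instead of printing it, then add standard ggplot2 layers
p <- plot_did_dynamics(result)
p_custom <- p +
  ggplot2::labs(
    title = "Event study",            # illustrative labels
    x = "Periods since treatment",
    y = "Estimated effect"
  ) +
  ggplot2::theme_minimal()

print(p_custom)
```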

Simulate a Difference-in-Differences (DiD) dataset

Description

Simulates a dataset for a Difference-in-Differences analysis with various customizable options.

Usage

sim_did(
  sample_size,
  time_period,
  untreated_prop = 0.3,
  epsilon_size = 0.001,
  cov = "no",
  hetero = "all",
  second_outcome = FALSE,
  second_cov = FALSE,
  vary_cov = FALSE,
  na = "none",
  balanced = TRUE,
  seed = NA,
  stratify = FALSE,
  treatment_assign = "latent",
  second_cohort = FALSE,
  confound_ratio = 1,
  second_het = "all"
)

Arguments

sample_size

The number of units in the dataset.

time_period

The number of time periods in the dataset.

untreated_prop

The proportion of untreated units.

epsilon_size

The standard deviation for the error term in potential outcomes.

cov

The type of covariate to include ("no", "int", or "cont").

hetero

The type of heterogeneity in treatment effects ("all" or "dynamic").

second_outcome

Whether to include a second outcome variable.

second_cov

Whether to include a second covariate.

vary_cov

Whether to include time-varying covariates.

na

Whether to generate missing data ("none", "y", "x", or "both").

balanced

Whether to balance the dataset by random sampling.

seed

Seed for random number generation.

stratify

Whether to stratify the dataset based on a binary covariate.

treatment_assign

The method for treatment assignment ("latent" or "uniform").

second_cohort

Whether to include confounding events.

confound_ratio

The extent to which the events are confounded.

second_het

The heterogeneity of the second event's treatment effects.

Value

A list containing the simulated dataset (dt) and the treatment effect values (att).

Examples

# Simulate a DiD dataset with default settings
data <- sim_did(sample_size = 100, time_period = 5)
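
The returned list can be unpacked as follows. This sketch relies only on the two elements named under Value ('dt' and 'att') and inspects their structure rather than assuming particular column names.

```r
library(fastdid)

# Fix the seed so the simulated data are reproducible
sim <- sim_did(sample_size = 100, time_period = 5, seed = 1)

dt  <- sim$dt    # simulated panel, suitable as the 'data' argument of fastdid()
att <- sim$att   # treatment effect values used in the simulation

str(dt)          # inspect the columns before passing names to fastdid()
head(att)
```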