Type: | Package |
Title: | Tool for Diagnosis of Tables Joins and Complementary Join Features |
Version: | 0.2.4 |
Description: | Tool for diagnosing table joins. It combines the speed of 'collapse' and 'data.table', the flexibility of 'dplyr', and the diagnosis and features of the 'merge' command in 'Stata'. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
URL: | https://github.com/randrescastaneda/joyn, https://randrescastaneda.github.io/joyn/ |
BugReports: | https://github.com/randrescastaneda/joyn/issues |
Suggests: | badger, covr, knitr, rmarkdown, testthat (≥ 3.0.0), withr, dplyr, tibble |
Config/testthat/edition: | 3 |
Imports: | rlang, data.table, cli, utils, collapse (≥ 2.0.15), lifecycle |
Depends: | R (≥ 2.10) |
RoxygenNote: | 7.3.2 |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2024-12-13 23:00:39 UTC; wb384996 |
Author: | R.Andres Castaneda [aut, cre], Zander Prinsloo [aut], Rossana Tatulli [aut] |
Maintainer: | R.Andres Castaneda <acastanedaa@worldbank.org> |
Repository: | CRAN |
Date/Publication: | 2024-12-13 23:20:02 UTC |
joyn: Tool for Diagnosis of Tables Joins and Complementary Join Features
Description
Tool for diagnosing table joins. It combines the speed of 'collapse' and 'data.table', the flexibility of 'dplyr', and the diagnosis and features of the 'merge' command in 'Stata'.
Author(s)
Maintainer: R.Andres Castaneda acastanedaa@worldbank.org
Authors:
Zander Prinsloo zprinsloo@worldbank.org
Rossana Tatulli rtatulli@worldbank.org
See Also
Useful links:
Report bugs at https://github.com/randrescastaneda/joyn/issues
Anti join on two data frames
Description
This is a joyn wrapper that works in a similar fashion to dplyr::anti_join.
Usage
anti_join(
x,
y,
by = intersect(names(x), names(y)),
copy = FALSE,
suffix = c(".x", ".y"),
keep = NULL,
na_matches = c("na", "never"),
multiple = "all",
relationship = "many-to-many",
y_vars_to_keep = FALSE,
reportvar = getOption("joyn.reportvar"),
reporttype = c("factor", "character", "numeric"),
roll = NULL,
keep_common_vars = FALSE,
sort = TRUE,
verbose = getOption("joyn.verbose"),
...
)
Arguments
x |
data frame: referred to as left in R terminology, or master in Stata terminology. |
y |
data frame: referred to as right in R terminology, or using in Stata terminology. |
by |
a character vector of variables to join by. If NULL, the default,
joyn will do a natural join, using all variables with common names across
the two tables. A message lists the variables so that you can check they're
correct (to suppress the message, simply explicitly list the variables that
you want to join). To join by different variables on x and y use a vector
of expressions. For example, |
copy |
If |
suffix |
If there are non-joined duplicate variables in |
keep |
Should the join keys from both
|
na_matches |
Should two |
multiple |
Handling of rows in
|
relationship |
Handling of the expected relationship between the keys of
|
y_vars_to_keep |
character: Vector of variable names in |
reportvar |
character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after the join concludes. |
reporttype |
character: One of "factor", "character" or "numeric". Default is "factor", the first choice listed in the usage. If "numeric", the reporting variable will contain numeric codes of the source and the contents of each observation in the joined table. See below for more information. |
roll |
double: to be implemented |
keep_common_vars |
logical: If TRUE, it will keep the original variable from y when both tables have common variable names. Thus, the prefix "y." will be added to the original name to distinguish from the resulting variable in the joined table. |
sort |
logical: If TRUE, sort by key variables in |
verbose |
logical: if FALSE, it won't display any message (programmer's option). Default is TRUE. |
... |
Arguments passed on to
|
Value
A data frame of the same class as x. The properties of the output are as close as possible to the ones returned by the dplyr alternative.
See Also
Other dplyr alternatives: full_join(), inner_join(), left_join(), right_join()
Examples
# Simple anti join
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = c(1,2, 4),
y = c(11L, 15L, 16))
anti_join(x1, y1, relationship = "many-to-one")
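As a rough illustration of what an anti join returns, the base R sketch below keeps the rows of x whose key has no match in y. It does not call joyn and is not its implementation; the data mirror the example above, and the helper name anti_join_sketch is hypothetical.

```r
# Base R sketch of anti-join semantics (not joyn's implementation).
x1 <- data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
                 t  = c(1L, 2L, 1L, 2L, NA_integer_),
                 x  = 11:15)
y1 <- data.frame(id = c(1, 2, 4),
                 y  = c(11L, 15L, 16))

# Hypothetical helper: keep rows of x whose key value is absent from y.
anti_join_sketch <- function(x, y, by) {
  x[!(x[[by]] %in% y[[by]]), , drop = FALSE]
}

res <- anti_join_sketch(x1, y1, by = "id")
print(res)
```

The row with a missing id is kept here because y1 contains no missing key to match it against.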
Perform necessary preliminary checks on arguments that are passed to joyn
Description
Perform necessary preliminary checks on arguments that are passed to joyn
Usage
arguments_checks(
x,
y,
by,
copy,
keep,
suffix,
na_matches,
multiple,
relationship,
reportvar
)
Arguments
x |
data frame: left table |
y |
data frame: right table |
by |
character vector of variables to join by |
copy |
If |
keep |
Should the join keys from both
|
suffix |
If there are non-joined duplicate variables in |
na_matches |
Should two |
multiple |
Handling of rows in
|
relationship |
Handling of the expected relationship between the keys of
|
reportvar |
character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after the join concludes. |
Value
list of checked arguments to pass on to the main joyn function
Check by input
Description
This function checks the variable name(s) to be used as key(s) of the join
Usage
check_by_vars(by, x, y)
Arguments
by |
A vector of shared column names in |
x , y |
|
Value
list with information about by variables
Examples
## Not run:
x1 = data.frame(
id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.frame(id = 1:2,
y = c(11L, 15L))
# With var "id" shared in x and y
joyn:::check_by_vars(by = "id", x = x1, y = y1)
## End(Not run)
Check dt by vars
Description
Check the variable(s) by which data frames are joined: either a single by var, common to right and left dt, or separate vars specified with by.x and by.y.
Usage
check_dt_by(x, y, by, by.x, by.y)
Arguments
x |
left table |
y |
right table |
by |
character: variable to join by (common variable to x and y) |
by.x |
character: specified var in x to join by |
by.y |
character: specified var in y to join by |
Value
character specifying checked variable(s) to join by
Examples
## Not run:
library(data.table)
x = data.table(id1 = c(1, 1, 2, 3, 3),
id2 = c(1, 1, 2, 3, 4),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = c(16, 12, NA, NA, 15))
y = data.table(id = c(1, 2, 5, 6, 3),
id2 = c(1, 1, 2, 3, 4),
y = c(11L, 15L, 20L, 13L, 10L),
x = c(16:20))
# example specifying by.x and by.y
joyn:::check_dt_by(x, y, by.x = "id1", by.y = "id2")
## End(Not run)
Check if vars in dt have duplicate names
Description
Check if vars in dt have duplicate names
Usage
check_duplicate_names(dt, name)
Arguments
dt |
data.frame to check |
name |
var name to check if has duplicates in dt |
Value
logical either TRUE, if any duplicates are found, or FALSE otherwise
Examples
## Not run:
# When no duplicates
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
joyn:::check_duplicate_names(x1, "x")
# When duplicates
x1_duplicates = data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
x = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15,
check.names = FALSE)
joyn:::check_duplicate_names(x1_duplicates, "x")
## End(Not run)
Check match type consistency
Description
This function checks if the match type chosen by the user is consistent with the data.
(Match type must be one of the valid types: "1:1", "1:m", "m:1", "m:m")
Usage
check_match_type(x, y, by, match_type, verbose = getOption("joyn.verbose"))
Arguments
x , y |
|
by |
A vector of shared column names in |
match_type |
character: one of "m:m", "m:1", "1:m", "1:1". Default is "1:1" since this is the most restrictive. However, following Stata's recommendation, it is better to be explicit and use any of the other three match types (see details in the match types section). |
Value
character vector from split_match_type
Examples
## Not run:
# Consistent match type
x1 = data.frame(
id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.frame(id = 1:2,
y = c(11L, 15L))
joyn:::check_match_type(x = x1, y=y1, by="id", match_type = "m:1")
# Inconsistent match type
joyn:::check_match_type(x = x1, y=y1, by="id", match_type = "1:1")
## End(Not run)
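The idea behind this check can be sketched in base R: a side declared as "1" must be uniquely identified by the by variables. This is a minimal sketch under that assumption, not joyn's internal code; keys_are_unique and match_type_ok are hypothetical helper names, and the sketch does not flag an "m" side that happens to be unique.

```r
# Base R sketch: is a declared match type consistent with the data?
x1 <- data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
                 t  = c(1L, 2L, 1L, 2L, NA_integer_),
                 x  = 11:15)
y1 <- data.frame(id = 1:2, y = c(11L, 15L))

# TRUE when the `by` columns uniquely identify every row of `df`.
keys_are_unique <- function(df, by) {
  anyDuplicated(df[, by, drop = FALSE]) == 0L
}

# Hypothetical helper: every side declared as "1" must have unique keys.
match_type_ok <- function(x, y, by, match_type) {
  sides <- strsplit(match_type, ":", fixed = TRUE)[[1]]
  x_ok <- sides[1] != "1" || keys_are_unique(x, by)
  y_ok <- sides[2] != "1" || keys_are_unique(y, by)
  x_ok && y_ok
}

ok_m1 <- match_type_ok(x1, y1, by = "id", match_type = "m:1")  # consistent
ok_11 <- match_type_ok(x1, y1, by = "id", match_type = "1:1")  # inconsistent: id repeats in x1
print(c(ok_m1, ok_11))
```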
Rename vars in y so they are different to x's when joined
Description
Check vars in y with same names as vars in x, and return new variables names for those y vars for the joined data frame
Usage
check_new_y_vars(x, by, y_vars_to_keep)
Arguments
x |
master table |
by |
character: by vars |
y_vars_to_keep |
character vector of y variables to keep |
Value
vector with new variable names for y
Examples
## Not run:
y2 = data.frame(id = c(1, 2, 5, 6, 3),
yd = c(1, 2, 5, 6, 3),
y = c(11L, 15L, 20L, 13L, 10L),
x = c(16:20))
y_vars_to_keep <- joyn:::check_y_vars_to_keep(TRUE, y2, by = "id")
x2 = data.frame(id = c(1, 1, 2, 3, NA),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = c(16, 12, NA, NA, 15))
joyn:::check_new_y_vars(x = x2, by="id", y_vars_to_keep)
## End(Not run)
Check reporting variable
Description
check reportvar input
If resulting data frame has a reporting variable (storing joyn's report), check and return a valid name.
Usage
check_reportvar(reportvar, verbose = getOption("joyn.verbose"))
Value
if input reportvar is character, return valid name for the report var. If NULL or FALSE, return NULL.
Examples
## Not run:
# When null - reporting variable not returned in merged dt
joyn:::check_reportvar(reportvar = NULL)
# When FALSE - reporting variable not returned in merged dt
joyn:::check_reportvar(reportvar = FALSE)
# When character
joyn:::check_reportvar(reportvar = ".joyn")
## End(Not run)
Conduct all unmatched keys checks and return error if necessary
Description
Conduct all unmatched keys checks and return error if necessary
Usage
check_unmatched_keys(x, y, out, by, jn_type)
Arguments
x |
left table |
y |
right table |
out |
output from join |
by |
character vector of keys that x and y are joined by |
jn_type |
character: "left", "right", or "inner" |
Value
error message
Check tables X and Y
Description
This function performs checks inspired by merge.data.table: it detects errors
if x and/or y have no columns
if x and/or y contain duplicate column names
Usage
check_xy(x, y)
Arguments
x |
data frame: referred to as left in R terminology, or master in Stata terminology. |
y |
data frame: referred to as right in R terminology, or using in Stata terminology. |
Value
invisible TRUE
Examples
## Not run:
# Check passing with no errors
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = c(1,2, 4),
y = c(11L, 15L, 16))
joyn:::check_xy(x = x1, y=y1)
## End(Not run)
Check variables in y that will be kept in returning table
Description
check and return variable names in y to keep in returning table, excluding those that are keys of the merge
Usage
check_y_vars_to_keep(y_vars_to_keep, y, by)
Arguments
y_vars_to_keep |
either TRUE, if keep all vars in |
y |
data frame |
by |
A vector of shared column names in |
Value
character vector with variable names from y table
Examples
## Not run:
library(data.table)
y1 = data.table(id = 1:2,
y = c(11L, 15L))
# With y_vars_to_keep TRUE
joyn:::check_y_vars_to_keep(TRUE, y1, by = "id")
# With y_vars_to_keep FALSE
joyn:::check_y_vars_to_keep(FALSE, y1, by = "id")
# Specifying which y vars to keep
joyn:::check_y_vars_to_keep("y", y1, by = "id")
## End(Not run)
Clearing joyn environment
Description
Clearing joyn environment
Usage
clear_joynenv()
See Also
Messages functions: joyn_msg(), joyn_msgs_exist(), joyn_report(), msg_type_dt(), store_msg(), style(), type_choices()
Examples
## Not run:
# Storing a message
joyn:::store_msg("info", "simple message")
# Clearing the environment
joyn:::clear_joynenv()
# Checking it does not exist in the environment
print(joyn:::joyn_msgs_exist())
## End(Not run)
Function used to correct names in input data frames using by argument
Description
Function used to correct names in input data frames using by argument
Usage
correct_names(by, x, y, order = TRUE)
Arguments
by |
|
x |
left data frame |
y |
right data frame |
Value
list
Create variables that uniquely identify rows in a data table
Description
This function generates unique identifier columns for a given number of rows, based on the specified number of identifier variables.
Usage
create_ids(n_rows, n_ids, prefix = "id")
Arguments
n_rows |
An integer specifying the number of rows in the data table for which unique identifiers need to be generated. |
n_ids |
An integer specifying the number of identifiers to be created. If |
prefix |
A character string specifying the prefix for the identifier variable names (default is |
Value
A named list where each element is a vector representing a unique identifier column. The number of elements in the list corresponds to the number of identifier variables (n_ids). The length of each element is equal to n_rows.
Tabulate simple frequencies
Description
Tabulate the frequencies of one variable
Usage
freq_table(x, byvar, digits = 1, na.rm = FALSE, freq_var_name = "n")
Arguments
x |
data frame |
byvar |
character: name of variable to tabulate. Use Standard evaluation. |
digits |
numeric: number of decimal places to display. Default is 1. |
na.rm |
logical: report NA values in frequencies. Default is FALSE. |
freq_var_name |
character: name for frequency variable. Default is "n" |
Value
data.table with frequencies.
Examples
library(data.table)
x4 = data.table(id1 = c(1, 1, 2, 3, 3),
id2 = c(1, 1, 2, 3, 4),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = c(16, 12, NA, NA, 15))
freq_table(x4, "id1")
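For comparison, a one-variable frequency table can be approximated in base R with table(). This is only a sketch of the idea: the column names n and percent are assumptions for illustration, and the exact layout returned by freq_table() may differ.

```r
# Base R sketch of a one-variable frequency table (layout is an assumption).
x4 <- data.frame(id1 = c(1, 1, 2, 3, 3),
                 id2 = c(1, 1, 2, 3, 4),
                 t   = c(1L, 2L, 1L, 2L, NA_integer_),
                 x   = c(16, 12, NA, NA, 15))

# Count each distinct value of id1, keeping NA as a category if present.
counts <- table(x4$id1, useNA = "ifany")

freq <- data.frame(id1     = names(counts),
                   n       = as.integer(counts),
                   percent = round(100 * as.integer(counts) / sum(counts), 1))
print(freq)
```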
Full join two data frames
Description
This is a joyn wrapper that works in a similar fashion to dplyr::full_join.
Usage
full_join(
x,
y,
by = intersect(names(x), names(y)),
copy = FALSE,
suffix = c(".x", ".y"),
keep = NULL,
na_matches = c("na", "never"),
multiple = "all",
unmatched = "drop",
relationship = "one-to-one",
y_vars_to_keep = TRUE,
update_values = FALSE,
update_NAs = update_values,
reportvar = getOption("joyn.reportvar"),
reporttype = c("factor", "character", "numeric"),
roll = NULL,
keep_common_vars = FALSE,
sort = TRUE,
verbose = getOption("joyn.verbose"),
...
)
Arguments
x |
data frame: referred to as left in R terminology, or master in Stata terminology. |
y |
data frame: referred to as right in R terminology, or using in Stata terminology. |
by |
a character vector of variables to join by. If NULL, the default,
joyn will do a natural join, using all variables with common names across
the two tables. A message lists the variables so that you can check they're
correct (to suppress the message, simply explicitly list the variables that
you want to join). To join by different variables on x and y use a vector
of expressions. For example, |
copy |
If |
suffix |
If there are non-joined duplicate variables in |
keep |
Should the join keys from both
|
na_matches |
Should two |
multiple |
Handling of rows in
|
unmatched |
How should unmatched keys that would result in dropped rows be handled?
|
relationship |
Handling of the expected relationship between the keys of
|
y_vars_to_keep |
character: Vector of variable names in |
update_values |
logical: If TRUE, it will update all values of variables
in x with the actual values of variables in y with the same name as the ones in x.
NAs from y won't be used to update actual values in x. Yet, by default,
NAs in x will be updated with values in y. To avoid this, make sure to set
|
update_NAs |
logical: If TRUE, it will update NA values of all variables
in x with actual values of variables in y that have the same name as the
ones in x. If FALSE, NA values won't be updated, even if |
reportvar |
character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after the join concludes. |
reporttype |
character: One of "factor", "character" or "numeric". Default is "factor", the first choice listed in the usage. If "numeric", the reporting variable will contain numeric codes of the source and the contents of each observation in the joined table. See below for more information. |
roll |
double: to be implemented |
keep_common_vars |
logical: If TRUE, it will keep the original variable from y when both tables have common variable names. Thus, the prefix "y." will be added to the original name to distinguish from the resulting variable in the joined table. |
sort |
logical: If TRUE, sort by key variables in |
verbose |
logical: if FALSE, it won't display any message (programmer's option). Default is TRUE. |
... |
Arguments passed on to
|
Value
A data frame of the same class as x. The properties of the output are as close as possible to the ones returned by the dplyr alternative.
See Also
Other dplyr alternatives: anti_join(), inner_join(), left_join(), right_join()
Examples
# Simple full join
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = c(1,2, 4),
y = c(11L, 15L, 16))
full_join(x1, y1, relationship = "many-to-one")
Get joyn options
Description
This function aims to display and store info on joyn options
Usage
get_joyn_options(env = .joynenv, display = TRUE, option = NULL)
Arguments
env |
environment, which is joyn environment by default |
display |
logical, if TRUE displays (i.e., print) info on joyn options and corresponding default and current values |
option |
character or NULL. If character, name of a specific joyn option. If NULL, all joyn options |
Value
joyn options and values invisibly as a list
See Also
JOYn options functions
set_joyn_options()
Examples
## Not run:
# display all joyn options, their default and current values
joyn:::get_joyn_options()
# store list of option = value pairs AND do not display info
joyn_options <- joyn:::get_joyn_options(display = FALSE)
# get info on one specific option and store it
joyn.verbose <- joyn:::get_joyn_options(option = "joyn.verbose")
# get info on two specific options
joyn:::get_joyn_options(option = c("joyn.verbose", "joyn.reportvar"))
## End(Not run)
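Many defaults in this manual are written as getOption("joyn.verbose") or similar, so they resolve through base R's options mechanism. The sketch below uses only base R to set, read, and restore such an option; the option name joyn.hypothetical is made up for illustration and is not a joyn option.

```r
# Set an option and keep the previous value so it can be restored later.
old <- options(joyn.verbose = FALSE)

# Read the value that a default such as getOption("joyn.verbose") would see.
current <- getOption("joyn.verbose")

# For an option that was never set, a fallback can be supplied.
fallback <- getOption("joyn.hypothetical", default = "fallback")

# Restore the previous state.
options(old)

print(current)
print(fallback)
```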
Inner join two data frames
Description
This is a joyn wrapper that works in a similar fashion to dplyr::inner_join.
Usage
inner_join(
x,
y,
by = intersect(names(x), names(y)),
copy = FALSE,
suffix = c(".x", ".y"),
keep = NULL,
na_matches = c("na", "never"),
multiple = "all",
unmatched = "drop",
relationship = "one-to-one",
y_vars_to_keep = TRUE,
update_values = FALSE,
update_NAs = update_values,
reportvar = getOption("joyn.reportvar"),
reporttype = c("factor", "character", "numeric"),
roll = NULL,
keep_common_vars = FALSE,
sort = TRUE,
verbose = getOption("joyn.verbose"),
...
)
Arguments
x |
data frame: referred to as left in R terminology, or master in Stata terminology. |
y |
data frame: referred to as right in R terminology, or using in Stata terminology. |
by |
a character vector of variables to join by. If NULL, the default,
joyn will do a natural join, using all variables with common names across
the two tables. A message lists the variables so that you can check they're
correct (to suppress the message, simply explicitly list the variables that
you want to join). To join by different variables on x and y use a vector
of expressions. For example, |
copy |
If |
suffix |
If there are non-joined duplicate variables in |
keep |
Should the join keys from both
|
na_matches |
Should two |
multiple |
Handling of rows in
|
unmatched |
How should unmatched keys that would result in dropped rows be handled?
|
relationship |
Handling of the expected relationship between the keys of
|
y_vars_to_keep |
character: Vector of variable names in |
update_values |
logical: If TRUE, it will update all values of variables
in x with the actual values of variables in y with the same name as the ones in x.
NAs from y won't be used to update actual values in x. Yet, by default,
NAs in x will be updated with values in y. To avoid this, make sure to set
|
update_NAs |
logical: If TRUE, it will update NA values of all variables
in x with actual values of variables in y that have the same name as the
ones in x. If FALSE, NA values won't be updated, even if |
reportvar |
character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after the join concludes. |
reporttype |
character: One of "factor", "character" or "numeric". Default is "factor", the first choice listed in the usage. If "numeric", the reporting variable will contain numeric codes of the source and the contents of each observation in the joined table. See below for more information. |
roll |
double: to be implemented |
keep_common_vars |
logical: If TRUE, it will keep the original variable from y when both tables have common variable names. Thus, the prefix "y." will be added to the original name to distinguish from the resulting variable in the joined table. |
sort |
logical: If TRUE, sort by key variables in |
verbose |
logical: if FALSE, it won't display any message (programmer's option). Default is TRUE. |
... |
Arguments passed on to
|
Value
A data frame of the same class as x. The properties of the output are as close as possible to the ones returned by the dplyr alternative.
See Also
Other dplyr alternatives: anti_join(), full_join(), left_join(), right_join()
Examples
# Simple inner join
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = c(1,2, 4),
y = c(11L, 15L, 16))
inner_join(x1, y1, relationship = "many-to-one")
Is data frame balanced by group?
Description
Check if the data frame is balanced by group of columns, i.e., if it contains every combination of the elements in the specified variables
Usage
is_balanced(df, by, return = c("logic", "table"))
Arguments
df |
data frame |
by |
character: variables used to check if |
return |
character: either "logic" or "table". If "logic", returns |
Value
logical, if return == "logic", else returns data frame of unbalanced observations
Examples
x1 = data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
is_balanced(df = x1,
by = c("id", "t"),
return = "table") # returns combination of elements in "id" and "t" not present in df
is_balanced(df = x1,
by = c("id", "t"),
return = "logic") # FALSE
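The notion of balance used here can be sketched in base R by building every combination of the observed key values and listing those absent from the data. This is an illustrative sketch, not joyn's implementation; dropping missing values before building the grid is an assumption of the sketch.

```r
# Base R sketch: which combinations of id and t are missing from the data?
x1 <- data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
                 t  = c(1L, 2L, 1L, 2L, NA_integer_),
                 x  = 11:15)

# Every combination of the observed, non-missing values (assumption: NAs ignored).
grid <- expand.grid(id = unique(x1$id[!is.na(x1$id)]),
                    t  = unique(x1$t[!is.na(x1$t)]))

# Combinations that actually occur in the data, encoded as strings.
present <- paste(x1$id, x1$t)

# Rows of the grid with no counterpart in the data.
missing_combos <- grid[!(paste(grid$id, grid$t) %in% present), , drop = FALSE]

# The data are balanced only if no combination is missing.
balanced <- nrow(missing_combos) == 0L
print(missing_combos)
print(balanced)
```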
Check if dt is uniquely identified by by variable
Description
Report whether dt is uniquely identified by the by variable or, if return_report = TRUE, the duplicates in the by variable.
Usage
is_id(
dt,
by,
verbose = getOption("joyn.verbose", default = FALSE),
return_report = FALSE
)
Arguments
dt |
either right or left table |
by |
variable to merge by |
verbose |
logical: if TRUE messages will be displayed |
return_report |
logical: if TRUE, returns data with summary of duplicates.
If FALSE, returns logical value depending on whether |
Value
logical or data.frame, depending on the value of argument return_report
Examples
library(data.table)
# example with data frame not uniquely identified by `by` var
y <- data.table(id = c("c","b", "c", "a"),
y = c(11L, 15L, 18L, 20L))
is_id(y, by = "id")
is_id(y, by = "id", return_report = TRUE)
# example with data frame uniquely identified by `by` var
y1 <- data.table(id = c("1","3", "2", "9"),
y = c(11L, 15L, 18L, 20L))
is_id(y1, by = "id")
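The same question can be answered in base R by counting how many rows share each key. The sketch below is not joyn's code; is_id_sketch is a hypothetical helper, and the copies column name is an assumption chosen for illustration.

```r
# Base R sketch: do the key columns uniquely identify the rows?
y  <- data.frame(id = c("c", "b", "c", "a"), y = c(11L, 15L, 18L, 20L))
y1 <- data.frame(id = c("1", "3", "2", "9"), y = c(11L, 15L, 18L, 20L))

# Hypothetical helper: TRUE when no key combination is repeated.
is_id_sketch <- function(dt, by) {
  anyDuplicated(dt[, by, drop = FALSE]) == 0L
}

# Number of rows per key value, as a small report of duplicates.
copies <- as.data.frame(table(id = y$id), responseName = "copies")

print(is_id_sketch(y, "id"))
print(is_id_sketch(y1, "id"))
print(copies)
```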
Confirm if match type error
Description
Confirm if match type error
Usage
is_match_type_error(x, name, by, verbose, match_type_error)
Arguments
name |
name of data frame |
by |
A vector of shared column names in |
match_type_error |
logical: from existing code |
Value
logical
Examples
## Not run:
# example with dt not uniquely identified by "id"
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
joyn:::is_match_type_error(x1, name = "x1", by = "id")
## End(Not run)
Check whether specified "many" relationship is valid
Description
When "many" relationship is specified, check if it is valid.
(Specified many relationship not valid if the dt is instead uniquely identified by specified keys)
Usage
is_valid_m_key(dt, by)
Arguments
dt |
data object |
by |
character vector: specified keys, already fixed |
Value
logical: TRUE if valid, FALSE if uniquely identified
Examples
## Not run:
# example with data frame uniquely identified by specified `by` vars
x1 = data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
joyn:::is_valid_m_key(x1, by = c("id", "t"))
# example with valid specified "many" relationship
x2 = data.frame(id = c(1L, 1L, 1L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
joyn:::is_valid_m_key(x2, by = c("id", "t"))
## End(Not run)
Join two tables
Description
This is the primary function in the joyn package. It executes a full join, performs a number of checks, and filters to allow the user-specified join.
Usage
joyn(
x,
y,
by = intersect(names(x), names(y)),
match_type = c("1:1", "1:m", "m:1", "m:m"),
keep = c("full", "left", "master", "right", "using", "inner", "anti"),
y_vars_to_keep = ifelse(keep == "anti", FALSE, TRUE),
update_values = FALSE,
update_NAs = update_values,
reportvar = getOption("joyn.reportvar"),
reporttype = c("factor", "character", "numeric"),
roll = NULL,
keep_common_vars = FALSE,
sort = FALSE,
verbose = getOption("joyn.verbose"),
suffixes = getOption("joyn.suffixes"),
allow.cartesian = deprecated(),
yvars = deprecated(),
keep_y_in_x = deprecated(),
na.last = getOption("joyn.na.last"),
msg_type = getOption("joyn.msg_type")
)
Arguments
x |
data frame: referred to as left in R terminology, or master in Stata terminology. |
y |
data frame: referred to as right in R terminology, or using in Stata terminology. |
by |
a character vector of variables to join by. If NULL, the default,
joyn will do a natural join, using all variables with common names across
the two tables. A message lists the variables so that you can check they're
correct (to suppress the message, simply explicitly list the variables that
you want to join). To join by different variables on x and y use a vector
of expressions. For example, |
match_type |
character: one of "m:m", "m:1", "1:m", "1:1". Default is "1:1" since this is the most restrictive. However, following Stata's recommendation, it is better to be explicit and use any of the other three match types (see details in the match types section). |
keep |
atomic character vector of length 1: One of "full", "left",
"master", "right",
"using", "inner". Default is "full". Even though this is not the
regular behavior of joins in R, the objective of |
y_vars_to_keep |
character: Vector of variable names in |
update_values |
logical: If TRUE, it will update all values of variables
in x with the actual values of variables in y with the same name as the ones in x.
NAs from y won't be used to update actual values in x. Yet, by default,
NAs in x will be updated with values in y. To avoid this, make sure to set
|
update_NAs |
logical: If TRUE, it will update NA values of all variables
in x with actual values of variables in y that have the same name as the
ones in x. If FALSE, NA values won't be updated, even if |
reportvar |
character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after the join concludes. |
reporttype |
character: One of "factor", "character" or "numeric". Default is "factor", the first choice listed in the usage. If "numeric", the reporting variable will contain numeric codes of the source and the contents of each observation in the joined table. See below for more information. |
roll |
double: to be implemented |
keep_common_vars |
logical: If TRUE, it will keep the original variable from y when both tables have common variable names. Thus, the prefix "y." will be added to the original name to distinguish from the resulting variable in the joined table. |
sort |
logical: If TRUE, sort by key variables in |
verbose |
logical: if FALSE, it won't display any message (programmer's option). Default is TRUE. |
suffixes |
A character(2) specifying the suffixes to be used for making non-by column names unique. The suffix behaviour works in a similar fashion as the base::merge method does. |
allow.cartesian |
logical: Check documentation in official web site.
Default is |
yvars |
|
keep_y_in_x |
|
na.last |
|
msg_type |
character: type of messages to display by default |
Value
a data.table joining x and y.
match types
Using the same wording as the Stata manual:
1:1: specifies a one-to-one match merge. The variables specified in by uniquely identify single observations in both tables.
1:m and m:1: specify one-to-many and many-to-one match merges, respectively. This means that in one of the tables the observations are uniquely identified by the variables in by, while in the other table many (two or more) of the observations are identified by the variables in by.
m:m: refers to a many-to-many merge. The variables in by do not uniquely identify the observations in either table. Matching is performed by combining observations with equal values in by; within matching values, the first observation in the master (i.e. left or x) table is matched with the first matching observation in the using (i.e. right or y) table; the second, with the second; and so on. If there is an unequal number of observations within a group, then the last observation of the shorter group is used repeatedly to match with subsequent observations of the longer group.
reporttype
If reporttype = "numeric", then the numeric values have the following meaning:
1: row comes from x, i.e. "x"
2: row comes from y, i.e. "y"
3: row from both x and y, i.e. "x & y"
4: row has NA in x that has been updated with y, i.e. "NA updated"
5: row has a value in x that has been updated with y, i.e. "value updated"
6: row from x that has not been updated, i.e. "not updated"
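The correspondence between numeric codes and labels listed above can be written as a simple lookup in base R. This sketch only illustrates the mapping; the vector of codes is hypothetical, and it is not how joyn builds the reporting variable.

```r
# Labels in the order of their numeric codes (1 to 6), as listed above.
report_labels <- c("x", "y", "x & y", "NA updated", "value updated", "not updated")

# Hypothetical numeric report codes for four rows.
codes <- c(3L, 1L, 2L, 4L)

# Character version: index the labels by code.
as_character <- report_labels[codes]

# Factor version: codes become levels with the same labels.
as_factor <- factor(codes, levels = seq_along(report_labels), labels = report_labels)

print(as_character)
print(as_factor)
```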
NAs order
NAs are placed either first or last in the resulting data.frame, depending on the value of getOption("joyn.na.last"). The default is FALSE, as it is the default value of data.table::setorderv.
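The effect of an na.last setting can be seen with base R's order(), which takes an argument of the same name. This is only an analogy in base R, not a call to data.table::setorderv, and the vector id is made up for illustration.

```r
# Hypothetical key vector with one missing value.
id <- c(2L, NA, 1L, 3L)

# na.last = FALSE places missing values first.
na_first <- id[order(id, na.last = FALSE)]

# na.last = TRUE places missing values last.
na_last <- id[order(id, na.last = TRUE)]

print(na_first)
print(na_last)
```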
Examples
# Simple join
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = 1:2,
y = c(11L, 15L))
x2 = data.table(id = c(1, 1, 2, 3, NA),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = c(16, 12, NA, NA, 15))
y2 = data.table(id = c(1, 2, 5, 6, 3),
yd = c(1, 2, 5, 6, 3),
y = c(11L, 15L, 20L, 13L, 10L),
x = c(16:20))
joyn(x1, y1, match_type = "m:1")
# Bad merge for not specifying by argument or match_type
joyn(x2, y2)
# good merge, ignoring variable x from y
joyn(x2, y2, by = "id", match_type = "m:1")
# Update NAs in variable x of x2 with values from y2
joyn(x2, y2, by = "id", update_NAs = TRUE, match_type = "m:1")
# Update values in x with variables from y
joyn(x2, y2, by = "id", update_values = TRUE, match_type = "m:1")
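The difference between update_NAs and update_values, as described in the arguments above, can be sketched on two vectors that are assumed to be already aligned row by row. This is a base R illustration of the rule only, with hypothetical values, and not joyn's implementation.

```r
# Hypothetical values of a shared variable, aligned row by row after a join.
x_val <- c(16, 12, NA, NA, 15)
y_val <- c(16, 16, 17, NA, 20)

# update_NAs: only missing values in x are filled with values from y.
upd_nas <- ifelse(is.na(x_val), y_val, x_val)

# update_values: x is overwritten wherever y has an actual (non-NA) value;
# NAs in y never replace values in x.
upd_vals <- ifelse(!is.na(y_val), y_val, x_val)

print(upd_nas)
print(upd_vals)
```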
Display type of joyn message
Description
Display type of joyn message
Usage
joyn_msg(msg_type = getOption("joyn.msg_type"), msg = NULL)
Arguments
msg_type |
character: one or more of the following: all, basic, info, note, warn, timing, or err |
msg |
character vector to be parsed to |
Value
Returns a data frame with the messages invisibly and prints the messages in the console.
See Also
Messages functions: clear_joynenv(), joyn_msgs_exist(), joyn_report(), msg_type_dt(), store_msg(), style(), type_choices()
Examples
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = 1:2,
y = c(11L, 15L))
df <- joyn(x1, y1, match_type = "m:1")
joyn_msg("basic")
joyn_msg("all")
Presence of joyn msgs in the environment
Description
Checks for the presence of joyn messages stored in the joyn environment
Usage
joyn_msgs_exist()
Value
invisible TRUE
See Also
Messages functions: clear_joynenv(), joyn_msg(), joyn_report(), msg_type_dt(), store_msg(), style(), type_choices()
Examples
## Not run:
# Storing a message
joyn:::store_msg("info", "simple message")
# Checking if it exists in the environment
print(joyn:::joyn_msgs_exist())
## End(Not run)
Print JOYn report table
Description
Print JOYn report table
Usage
joyn_report(verbose = getOption("joyn.verbose"))
Arguments
verbose |
logical: if FALSE, it won't display any message (programmer's option). Default is TRUE. |
Value
invisible table of frequencies
See Also
Messages functions: clear_joynenv(), joyn_msg(), joyn_msgs_exist(), msg_type_dt(), store_msg(), style(), type_choices()
Examples
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = 1:2,
y = c(11L, 15L))
d <- joyn(x1, y1, match_type = "m:1")
joyn_report(verbose = TRUE)
Internal workhorse join function, used in the back-end of joyn
Description
Always executes a full join.
Usage
joyn_workhorse(
x,
y,
by = intersect(names(x), names(y)),
sort = FALSE,
suffixes = getOption("joyn.suffixes"),
reportvar = getOption("joyn.reportvar")
)
Arguments
x |
data object, "left" or "master" |
y |
data object, "right" or "using" |
by |
atomic character vector: key specifying join |
sort |
logical: sort the result by the columns in |
suffixes |
atomic character vector: give suffixes to columns common to both |
Value
data object of same class as x
Examples
## Not run:
# Full join
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = c(1,2, 4),
y = c(11L, 15L, 16))
joyn:::joyn_workhorse(x = x1, y=y1)
## End(Not run)
Left join two data frames
Description
This is a joyn wrapper that works in a similar fashion to dplyr::left_join
Usage
left_join(
x,
y,
by = intersect(names(x), names(y)),
copy = FALSE,
suffix = c(".x", ".y"),
keep = NULL,
na_matches = c("na", "never"),
multiple = "all",
unmatched = "drop",
relationship = NULL,
y_vars_to_keep = TRUE,
update_values = FALSE,
update_NAs = update_values,
reportvar = getOption("joyn.reportvar"),
reporttype = c("factor", "character", "numeric"),
roll = NULL,
keep_common_vars = FALSE,
sort = TRUE,
verbose = getOption("joyn.verbose"),
...
)
Arguments
x |
data frame: referred to as left in R terminology, or master in Stata terminology. |
y |
data frame: referred to as right in R terminology, or using in Stata terminology. |
by |
a character vector of variables to join by. If NULL, the default,
joyn will do a natural join, using all variables with common names across
the two tables. A message lists the variables so that you can check they're
correct (to suppress the message, simply explicitly list the variables that
you want to join). To join by different variables on x and y use a vector
of expressions. For example, |
copy |
If |
suffix |
If there are non-joined duplicate variables in |
keep |
Should the join keys from both
|
na_matches |
Should two |
multiple |
Handling of rows in
|
unmatched |
How should unmatched keys that would result in dropped rows be handled?
|
relationship |
Handling of the expected relationship between the keys of
|
y_vars_to_keep |
character: Vector of variable names in |
update_values |
logical: If TRUE, it will update all values of variables
in x with the actual values of variables in y that have the same name as the ones in x.
NAs from y won't be used to update actual values in x. Yet, by default,
NAs in x will be updated with values in y. To avoid this, make sure to set
|
update_NAs |
logical: If TRUE, it will update NA values of all variables
in x with actual values of variables in y that have the same name as the
ones in x. If FALSE, NA values won't be updated, even if |
reportvar |
character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after concluding. |
reporttype |
character: One of "factor", "character" or "numeric". Default is "factor", the first option in the usage above. If "numeric", the reporting variable will contain numeric codes of the source and the contents of each observation in the joined table. See below for more information. |
roll |
double: to be implemented |
keep_common_vars |
logical: If TRUE, it will keep the original variable from y when both tables have common variable names. Thus, the prefix "y." will be added to the original name to distinguish from the resulting variable in the joined table. |
sort |
logical: If TRUE, sort by key variables in |
verbose |
logical: if FALSE, it won't display any message (programmer's option). Default is TRUE. |
... |
Arguments passed on to
|
Value
A data frame of the same class as x. The properties of the output are as close as possible to the ones returned by the dplyr alternative.
See Also
Other dplyr alternatives: anti_join(), full_join(), inner_join(), right_join()
Examples
# Simple left join
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = c(1,2, 4),
y = c(11L, 15L, 16))
left_join(x1, y1, relationship = "many-to-one")
Merge two data frames
Description
This is a joyn wrapper that works in a similar fashion to base::merge and data.table::merge, which is why merge masks the other two.
Usage
merge(
x,
y,
by = NULL,
by.x = NULL,
by.y = NULL,
all = FALSE,
all.x = all,
all.y = all,
sort = TRUE,
suffixes = c(".x", ".y"),
no.dups = TRUE,
allow.cartesian = getOption("datatable.allow.cartesian"),
match_type = c("m:m", "m:1", "1:m", "1:1"),
keep_common_vars = TRUE,
...
)
Arguments
x , y |
|
by |
A vector of shared column names in |
by.x , by.y |
Vectors of column names in |
all |
logical; |
all.x |
logical; if |
all.y |
logical; analogous to |
sort |
logical. If |
suffixes |
A |
no.dups |
logical indicating that |
allow.cartesian |
See |
match_type |
character: one of "m:m", "m:1", "1:m", "1:1". Default is "1:1" since this is the most restrictive. However, following Stata's recommendation, it is better to be explicit and use any of the other three match types (see details in the match types section). |
keep_common_vars |
logical: If TRUE, it will keep the original variable from y when both tables have common variable names. Thus, the prefix "y." will be added to the original name to distinguish from the resulting variable in the joined table. |
... |
Arguments passed on to
|
Value
data.table merging x and y
Examples
x1 = data.frame(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.frame(id = c(1,2, 4),
y = c(11L, 15L, 16))
joyn::merge(x1, y1, by = "id")
# example of using by.x and by.y
x2 = data.frame(id1 = c(1, 1, 2, 3, 3),
id2 = c(1, 1, 2, 3, 4),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = c(16, 12, NA, NA, 15))
y2 = data.frame(id = c(1, 2, 5, 6, 3),
id2 = c(1, 1, 2, 3, 4),
y = c(11L, 15L, 20L, 13L, 10L),
x = c(16:20))
jn <- joyn::merge(x2,
y2,
match_type = "m:m",
all.x = TRUE,
by.x = "id1",
by.y = "id2")
# example with all = TRUE
jn <- joyn::merge(x2,
y2,
match_type = "m:m",
by.x = "id1",
by.y = "id2",
all = TRUE)
convert style of joyn message to data frame containing type and message
Description
convert style of joyn message to data frame containing type and message
Usage
msg_type_dt(type, ...)
Value
data frame with two variables, type and msg
See Also
Messages functions: clear_joynenv(), joyn_msg(), joyn_msgs_exist(), joyn_report(), store_msg(), style(), type_choices()
Find possible unique identifiers of a data frame
Description
Identify possible combinations of variables that uniquely identify dt
Usage
possible_ids(
dt,
vars = NULL,
exclude = NULL,
include = NULL,
exclude_classes = NULL,
include_classes = NULL,
verbose = getOption("possible_ids.verbose", default = FALSE),
min_combination_size = 1,
max_combination_size = 5,
max_processing_time = 60,
max_numb_possible_ids = 100,
get_all = FALSE
)
Arguments
dt |
data frame |
vars |
character: A vector of variable names to consider for identifying unique combinations. |
exclude |
character: Names of variables to exclude from analysis |
include |
character: Name of variable to be included, that might belong
to the group excluded in the |
exclude_classes |
character: classes to exclude from analysis (e.g., "numeric", "integer", "date") |
include_classes |
character: classes to include in the analysis (e.g., "numeric", "integer", "date") |
verbose |
logical: If FALSE, no message will be displayed. Default is the value of getOption("possible_ids.verbose"), or FALSE if that option is not set. |
min_combination_size |
numeric: Minimum number of variables in a combination. Default is 1, so all combination sizes are considered. |
max_combination_size |
numeric: Max number of combinations. Default is
5. If there is a combination of identifiers larger than
|
max_processing_time |
numeric: Max time to process in seconds. After that, it returns what it found. |
max_numb_possible_ids |
numeric: Max number of possible IDs to find. See details. |
get_all |
logical: get all possible combinations based on the parameters above. |
Value
list with possible identifiers
Number of possible IDs
The number of possible IDs in a dataframe could be very large. This is why possible_ids() makes use of heuristics to return something useful without wasting the time of the user. In addition, we provide multiple parameters so that the user can fine-tune their search for possible IDs easily and quickly.
Say for instance that you have a dataframe with 10 variables. Testing every possible pair of variables will give you 90 possible unique identifiers for this dataframe. If you want to test all the possible IDs, you will have to test more than 5000 combinations. If the dataframe has many rows, it may take a while.
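To make the arithmetic above concrete, here is a short base R sketch of how the number of candidates grows with 10 variables. The figures quoted above are consistent with counting ordered arrangements (pairs: 10 x 9 = 90; sizes 1 to 4 together: 5860). Ignoring order, which is all that matters for uniqueness, the counts are smaller. Variable names are illustrative only.

```r
n_vars <- 10

# Ordered arrangements of size k: n! / (n - k)!
ordered_by_size <- factorial(n_vars) / factorial(n_vars - 1:4)
ordered_by_size           # 10 90 720 5040
cumsum(ordered_by_size)   # 10 100 820 5860

# Unordered combinations of size k: choose(n, k)
unordered_by_size <- choose(n_vars, 1:5)
unordered_by_size         # 10 45 120 210 252

# All non-empty unordered subsets of the 10 variables.
n_subsets <- sum(choose(n_vars, 1:n_vars))
n_subsets                 # 1023
```

Either way the search space grows quickly with the number of variables, which is what limits such as max_combination_size and max_processing_time are there to contain.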
Examples
library(data.table)
x4 = data.table(id1 = c(1, 1, 2, 3, 3),
id2 = c(1, 1, 2, 3, 4),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = c(16, 12, NA, NA, 15))
possible_ids(x4)
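
# The idea of a "possible ID" can be sketched with a brute-force check in base R.
# This is an illustration only: it is not how possible_ids() is implemented and it
# does not reproduce its output format. is_unique_id is a hypothetical helper.

```r
# Same data as x4 above, as a plain data.frame so the sketch needs no packages.
x4_df <- data.frame(id1 = c(1, 1, 2, 3, 3),
                    id2 = c(1, 1, 2, 3, 4),
                    t   = c(1L, 2L, 1L, 2L, NA_integer_),
                    x   = c(16, 12, NA, NA, 15))

# A set of variables identifies the rows if no combination of values repeats.
is_unique_id <- function(df, vars) anyDuplicated(df[vars]) == 0L

# Check every single variable, then every unordered pair.
single_ok <- vapply(names(x4_df), function(v) is_unique_id(x4_df, v), logical(1))
pairs <- combn(names(x4_df), 2, simplify = FALSE)
unique_pairs <- Filter(function(v) is_unique_id(x4_df, v), pairs)

single_ok
unique_pairs
```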
Process the by vector
Description
Gives as output a vector of names to be used for the specified table that correspond to the by argument for that table
Usage
process_by_vector(by, input = c("left", "right"))
Arguments
by |
character vector: by argument for join |
input |
character: either "left" or "right", indicating
whether to give the left or right side of the equals ("=") if
the equals is part of the |
Value
character vector
Examples
joyn:::process_by_vector(by = c("An = foo", "example"), input = "left")
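
# The left/right logic described above can be sketched in base R. split_by_side is a
# hypothetical helper written for illustration; it is an assumption about the
# behaviour, not the package's code.

```r
split_by_side <- function(by, input = c("left", "right")) {
  input <- match.arg(input)
  # Split each element on "=", if present.
  parts <- strsplit(by, "=", fixed = TRUE)
  idx <- if (input == "left") 1L else 2L
  vapply(parts, function(p) {
    p <- trimws(p)
    # Elements without "=" name the same variable on both sides.
    if (length(p) >= 2L) p[idx] else p[1L]
  }, character(1))
}

left_names  <- split_by_side(c("An = foo", "example"), input = "left")
right_names <- split_by_side(c("An = foo", "example"), input = "right")
left_names
right_names
```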
Rename to syntactically valid names
Description
Rename to syntactically valid names
Usage
rename_to_valid(name, verbose = getOption("joyn.verbose"))
Arguments
name |
character: name to be coerced to syntactically valid name |
verbose |
logical: if FALSE, it won't display any message (programmer's option). Default is TRUE. |
Value
valid character name
Examples
joyn:::rename_to_valid("x y")
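
# For comparison, base R's make.names() is the standard tool for producing
# syntactically valid names. Whether rename_to_valid() relies on it is an assumption;
# the sketch only shows what "syntactically valid" means in practice.

```r
# Spaces become dots, and a leading digit gets an "X" prefix.
valid <- make.names(c("x y", "1st value"))
valid
```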
Report frequencies from attributes in report var
Description
Report frequencies from attributes in report var
Usage
report_from_attr(x, y, reportvar)
Arguments
x |
dataframe from joyn_workhorse |
y |
dataframe from original merge ("right" or "using") |
Value
dataframe with frequencies of report var
Right join two data frames
Description
This is a joyn wrapper that works in a similar fashion to dplyr::right_join
Usage
right_join(
x,
y,
by = intersect(names(x), names(y)),
copy = FALSE,
suffix = c(".x", ".y"),
keep = NULL,
na_matches = c("na", "never"),
multiple = "all",
unmatched = "drop",
relationship = "one-to-one",
y_vars_to_keep = TRUE,
update_values = FALSE,
update_NAs = update_values,
reportvar = getOption("joyn.reportvar"),
reporttype = c("factor", "character", "numeric"),
roll = NULL,
keep_common_vars = FALSE,
sort = TRUE,
verbose = getOption("joyn.verbose"),
...
)
Arguments
x |
data frame: referred to as left in R terminology, or master in Stata terminology. |
y |
data frame: referred to as right in R terminology, or using in Stata terminology. |
by |
a character vector of variables to join by. If NULL, the default,
joyn will do a natural join, using all variables with common names across
the two tables. A message lists the variables so that you can check they're
correct (to suppress the message, simply explicitly list the variables that
you want to join). To join by different variables on x and y use a vector
of expressions. For example, |
copy |
If |
suffix |
If there are non-joined duplicate variables in |
keep |
Should the join keys from both
|
na_matches |
Should two |
multiple |
Handling of rows in
|
unmatched |
How should unmatched keys that would result in dropped rows be handled?
|
relationship |
Handling of the expected relationship between the keys of
|
y_vars_to_keep |
character: Vector of variable names in |
update_values |
logical: If TRUE, it will update all values of variables
in x with the actual values of variables in y that have the same name as the ones in x.
NAs from y won't be used to update actual values in x. Yet, by default,
NAs in x will be updated with values in y. To avoid this, make sure to set
|
update_NAs |
logical: If TRUE, it will update NA values of all variables
in x with actual values of variables in y that have the same name as the
ones in x. If FALSE, NA values won't be updated, even if |
reportvar |
character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after concluding. |
reporttype |
character: One of "factor", "character" or "numeric". Default is "factor", the first option in the usage above. If "numeric", the reporting variable will contain numeric codes of the source and the contents of each observation in the joined table. See below for more information. |
roll |
double: to be implemented |
keep_common_vars |
logical: If TRUE, it will keep the original variable from y when both tables have common variable names. Thus, the prefix "y." will be added to the original name to distinguish from the resulting variable in the joined table. |
sort |
logical: If TRUE, sort by key variables in |
verbose |
logical: if FALSE, it won't display any message (programmer's option). Default is TRUE. |
... |
Arguments passed on to
|
Value
A data frame of the same class as x. The properties of the output are as close as possible to the ones returned by the dplyr alternative.
See Also
Other dplyr alternatives: anti_join(), full_join(), inner_join(), left_join()
Examples
# Simple right join
library(data.table)
x1 = data.table(id = c(1L, 1L, 2L, 3L, NA_integer_),
t = c(1L, 2L, 1L, 2L, NA_integer_),
x = 11:15)
y1 = data.table(id = c(1,2, 4),
y = c(11L, 15L, 16))
right_join(x1, y1, relationship = "many-to-one")
Add x key var and y key var (with suffixes) to x and y when joining by different variables and keep is TRUE
Description
Add x key var and y key var (with suffixes) to x and y when joining by different variables and keep is TRUE
Usage
set_col_names(x, y, by, suffix, jn_type)
Arguments
x |
data table: left table |
y |
data table: right table |
by |
character vector of variables to join by |
suffix |
character(2) specifying the suffixes to be used for making non-by column names unique |
jn_type |
character specifying type of join |
Value
list containing x and y
Set joyn options
Description
This function is used to change the value of one or more joyn options
Usage
set_joyn_options(..., env = .joynenv)
Arguments
... |
pairs of option = value |
env |
environment, which is joyn environment by default |
Value
the new joyn options and their values, invisibly, as a list
See Also
JOYn options functions
get_joyn_options()
Examples
joyn:::set_joyn_options(joyn.verbose = FALSE, joyn.reportvar = "joyn_status")
joyn:::set_joyn_options() # return to default options
Split matching type
Description
Split matching type (one of "1:1", "m:1", "1:m", "m:m") into its two components
Usage
split_match_type(match_type)
Arguments
match_type |
character: one of "m:m", "m:1", "1:m", "1:1". Default is "1:1" since this is the most restrictive. However, following Stata's recommendation, it is better to be explicit and use any of the other three match types (see details in the match types section). |
Value
character vector
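Examples
A minimal base R sketch of the split described above. split_sketch is a hypothetical stand-in written for illustration, and it is an assumption that the package's function returns the two sides in this order.

```r
split_sketch <- function(match_type = c("1:1", "m:1", "1:m", "m:m")) {
  # Reject anything that is not one of the four documented match types.
  match_type <- match.arg(match_type)
  # The part before ":" refers to x, the part after it to y.
  strsplit(match_type, ":", fixed = TRUE)[[1]]
}

parts <- split_sketch("m:1")
parts
```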
store checked variables as possible ids
Description
This function processes a list of possible IDs by removing any NULL entries, storing a set of checked variables as an attribute and in the specified environment, and then returning the updated list of possible IDs.
Usage
store_checked_ids(checked_ids, possible_ids, env = .joynenv)
Arguments
checked_ids |
A vector of variable names that have been checked as possible IDs. |
possible_ids |
A list containing potential identifiers. This list may contain |
env |
An environment where the |
Value
A list of possible IDs with NULL values removed, and the checked_ids stored as an attribute.
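Examples
The three steps in the Description can be sketched in base R as follows. store_ids_sketch, the attribute name "checked_ids", and the use of a fresh environment in place of .joynenv are all assumptions made for illustration.

```r
store_ids_sketch <- function(checked_ids, possible_ids, env = new.env()) {
  # 1. Drop NULL entries from the list of candidates.
  possible_ids <- Filter(Negate(is.null), possible_ids)
  # 2. Keep the checked variables as an attribute and in the environment.
  attr(possible_ids, "checked_ids") <- checked_ids
  assign("checked_ids", checked_ids, envir = env)
  # 3. Return the cleaned list.
  possible_ids
}

demo_env <- new.env()
ids <- store_ids_sketch(checked_ids  = c("id1", "id2", "t"),
                        possible_ids = list(c("id1", "t"), NULL, c("id2", "t")),
                        env          = demo_env)
length(ids)
attr(ids, "checked_ids")
```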
Wrapper for store_msg function
Description
This function serves as a wrapper for the store_msg function, which is used to store various types of messages within the .joyn environment: errors, warnings, timing information, or info.
Usage
store_joyn_msg(err = NULL, warn = NULL, timing = NULL, info = NULL)
Arguments
err |
A character string representing an error message to be stored. Default value is NULL |
warn |
A character string representing a warning message to be stored. Default value is NULL |
timing |
A character string representing a timing message to be stored. Default value is NULL |
info |
A character string representing an info message to be stored. Default value is NULL |
Value
invisible TRUE
How to pass the message string
The function allows for the customization of the message string using cli classes to emphasize specific components of the message. Here's how to format the message string:
* For variables: .strongVar
* For function arguments: .strongArg
* For dt/df: .strongTable
* For text/anything else: .strong
NOTE: By default, the number of seconds specified in timing messages is automatically emphasized using a custom formatting approach. You do not need to apply cli classes nor to specify that the number is in seconds.
Examples
# Timing msg
joyn:::store_joyn_msg(timing = paste(" The entire joyn function, including checks,
is executed in ", round(1.8423467, 6)))
# Error msg
joyn:::store_joyn_msg(err = " Input table {.strongTable x} has no columns.")
# Info msg
joyn:::store_joyn_msg(info = "Joyn's report available in variable {.strongVar .joyn}")
Store joyn message to .joynenv environment
Description
Store joyn message to .joynenv environment
Usage
store_msg(type, ...)
Arguments
... |
combination of type and text in the form |
Value
current message data frame invisibly
See Also
Messages functions: clear_joynenv(), joyn_msg(), joyn_msgs_exist(), joyn_report(), msg_type_dt(), style(), type_choices()
Examples
# Storing msg with msg_type "info"
joyn:::store_msg("info",
ok = cli::symbol$tick, " ",
pale = "This is an info message")
# Storing msg with msg_type "warn"
joyn:::store_msg("warn",
err = cli::symbol$cross, " ",
note = "This is a warning message")
style of text displayed
Description
This is an adaptation from https://github.com/r-lib/pkgbuild/blob/3ba537ab8a6ac07d3fe11c17543677d2a0786be6/R/styles.R
Usage
style(..., sep = "")
Arguments
... |
combination of type and text in the form
|
sep |
a character string to separate the terms to paste |
Value
formatted text
See Also
Messages functions: clear_joynenv(), joyn_msg(), joyn_msgs_exist(), joyn_report(), msg_type_dt(), store_msg(), type_choices()
Choice of messages
Description
Choice of messages
Usage
type_choices()
Value
character vector with choices of types
See Also
Messages functions: clear_joynenv(), joyn_msg(), joyn_msgs_exist(), joyn_report(), msg_type_dt(), store_msg(), style()
Check for unmatched keys
Description
Returns TRUE if there are unmatched keys, FALSE if not.
Usage
unmatched_keys(x, out, by)
Arguments
x |
input table to join |
out |
output of join |
by |
by argument, giving keys for join |
Value
logical
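Examples
One way to picture the check is to ask whether any key combination in the input table is missing from the join output. The base R sketch below does that; has_unmatched is a hypothetical helper and the data are made up, so it should be read as an assumption about the idea rather than the package's implementation.

```r
has_unmatched <- function(x, out, by) {
  # Build one string key per row from the `by` columns.
  make_key <- function(df) do.call(paste, c(unname(as.list(df[by])), sep = "\r"))
  # TRUE if at least one key of x does not appear in the output.
  any(!make_key(x) %in% make_key(out))
}

x_in <- data.frame(id = c(1, 2, 3), v = c("a", "b", "c"))
out  <- data.frame(id = c(1, 2), v = c("a", "b"), w = c(10, 20))

dropped_some <- has_unmatched(x_in, out, by = "id")
dropped_none <- has_unmatched(x_in[1:2, ], out, by = "id")
dropped_some
dropped_none
```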
Update NA and/or values
Description
The function updates NAs and/or values in the following way:
If only update_NAs is TRUE: update NAs of var in x with values of var in y of the same name
If only update_values is TRUE: update all values, but NOT NAs, of var in x with values of var in y of the same name. NAs from y are not used to update values in x (e.g., if x.var = 10 and y.var = NA, x.var remains 10)
If both update_NAs and update_values are TRUE, both NAs and values in x are updated as described above
If both update_NAs and update_values are FALSE, no update is made
Usage
update_na_values(
dt,
var,
reportvar = getOption("joyn.reportvar"),
suffixes = getOption("joyn.suffixes"),
rep_NAs = FALSE,
rep_values = FALSE
)
Arguments
dt |
joined data.table |
var |
variable(s) to be updated |
reportvar |
character: Name of reporting variable. Default is ".joyn". This is the same as variable "_merge" in Stata after performing a merge. If FALSE or NULL, the reporting variable will be excluded from the final table, though a summary of the join will be displayed after concluding. |
suffixes |
A character(2) specifying the suffixes to be used for making non-by column names unique. The suffix behaviour works in a similar fashion as the base::merge method does. |
rep_NAs |
inherited from joyn update_NAs |
rep_values |
inherited from joyn update_values |
Value
data.table
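Examples
The update rules listed in the Description of update_na_values() can be sketched on two plain vectors in base R. update_sketch is a hypothetical helper for illustration; it operates on vectors rather than on a joined data.table and does not touch the reporting variable.

```r
update_sketch <- function(x_var, y_var, update_NAs = FALSE, update_values = FALSE) {
  out <- x_var
  if (update_NAs) {
    # Fill NAs in x with non-missing values from y.
    fill <- is.na(x_var) & !is.na(y_var)
    out[fill] <- y_var[fill]
  }
  if (update_values) {
    # Overwrite actual values in x, but never with an NA from y.
    swap <- !is.na(x_var) & !is.na(y_var)
    out[swap] <- y_var[swap]
  }
  out
}

x_var <- c(10, NA, 30, NA)
y_var <- c(11, 22, NA, NA)

only_nas    <- update_sketch(x_var, y_var, update_NAs = TRUE)
only_values <- update_sketch(x_var, y_var, update_values = TRUE)
both        <- update_sketch(x_var, y_var, update_NAs = TRUE, update_values = TRUE)
neither     <- update_sketch(x_var, y_var)
```

In this sketch only_nas is 10, 22, 30, NA; only_values is 11, NA, 30, NA; both is 11, 22, 30, NA; and neither leaves x_var unchanged.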