Type: | Package |
Title: | Literature Matrix Synthesis Tools for Epidemiology and Health Science Research |
Version: | 1.0.1 |
Maintainer: | JP Monteagudo <jpmonteagudo2014@gmail.com> |
Description: | An easy-to-use workflow that provides tools to create, update and fill literature matrices commonly used in research, specifically epidemiology and health sciences research. The project is born out of need as an easy–to–use tool for my research methods classes. |
License: | AGPL (≥ 3) |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/jpmonteagudo28/matriz |
BugReports: | https://github.com/jpmonteagudo28/matriz/issues |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
Imports: | dplyr, readr, readxl, rlang, stringr, writexl |
Depends: | R (≥ 4.1.0) |
NeedsCompilation: | no |
Packaged: | 2025-02-04 21:21:53 UTC; jpmonteagudo |
Author: | JP Monteagudo |
Repository: | CRAN |
Date/Publication: | 2025-02-05 18:30:01 UTC |
matriz: Literature Matrix Synthesis Tools for Epidemiology and Health Science Research
Description
An easy-to-use workflow that provides tools to create, update and fill literature matrices commonly used in research, specifically epidemiology and health sciences research. The project is born out of need as an easy–to–use tool for my research methods classes.
Author(s)
Maintainer: JP Monteagudo jpmonteagudo2014@gmail.com (ORCID) [copyright holder]
See Also
Useful links:
Report bugs at https://github.com/jpmonteagudo28/matriz/issues
Add Multiple Records to a literature matrix
Description
Adds one or more records to a literature matrix at a specified position. Records can be provided as lists or data frames, and can be inserted before or after specific rows.
Usage
add_batch_record(.data, ..., .before = NULL, .after = NULL)
Arguments
.data |
A data frame to which records will be added |
... |
One or more records to add. Each record can be either:
|
.before |
Row number before which to insert the new records. If NULL (default), and '.after' is also NULL, records are appended to the end. |
.after |
Row number after which to insert the new records. If NULL (default), and '.before' is also NULL, records are appended to the end. |
Value
A data frame with the new records added at the specified position
Examples
# Create sample data frame
df <- data.frame(
name = c("John", "Jane"),
age = c(25, 30)
)
# Add a single record as a list
df <- add_batch_record(df, list(name = "Bob", age = 35))
# Add multiple records as data frames
new_records <- data.frame(
name = c("Alice", "Charlie"),
age = c(28, 40)
)
df <- add_batch_record(df, new_records, .before = 2)
Add an Empty Row to a Data Frame
Description
Adds a single row of NA values to a data frame
Usage
add_empty_row(.data)
Arguments
.data |
A data frame to which an empty row will be added |
Value
Modified data frame with an additional empty row
Add a Record to a Data Frame
Description
Adds a new row to a data frame at a specified position
Usage
add_record(.data, ..., .before = NULL, .after = NULL)
Arguments
.data |
A data frame to which a record will be added |
... |
New record to be added (vector, list, or data frame) |
.before |
Optional. Row number before which to insert the new record |
.after |
Optional. Row number after which to insert the new record |
Value
Modified data frame with the new record inserted
Examples
df <- data.frame(x = 1:3, y = 4:6)
add_record(df, c(4, 7))
add_record(df, c(4, 7), .before = 2)
Append a Column to a Data Frame
Description
Internal function to add a new column to a data frame and optionally position it before or after a specified column.
Usage
append_column(.data, new_col, .before = NULL, .after = NULL)
Arguments
.data |
A data frame to which the column will be added. |
new_col |
A vector containing the values of the new column to append. The name of this vector will be used as the column name in the data frame. |
.before |
(Optional) A string specifying the name of the column before which the new column should be inserted. Defaults to 'NULL'. |
.after |
(Optional) A string specifying the name of the column after which the new column should be inserted. Defaults to 'NULL'. |
Details
If both '.before' and '.after' are provided, '.before' takes precedence. If neither is provided, the new column is appended at the end of the data frame.
The column name is derived from the name of the input vector 'new_col'.
Value
A data frame with the new column added at the specified position or at the end if no position is specified.
Delete Records from a Data Frame
Description
Deletes specific rows from a data frame or clears the entire data frame by leveraging the 'truncate' function. If no position is provided, it will issue a message and either return the unchanged data or use 'truncate' to empty the data frame, depending on additional arguments.
Usage
delete_record(.data, position = NULL, ...)
Arguments
.data |
A data frame from which records will be deleted. |
position |
A numeric vector specifying the row positions to be deleted. If 'NULL', behavior is determined by the number of rows in the data frame and additional arguments passed to the 'truncate' function. |
... |
Additional arguments passed to the 'truncate' function. Specifically, the 'keep_rows' argument can be used to decide whether non-NA cells in the data frame are cleared when truncating. |
Details
- If 'position' is 'NULL' and the data frame has more than one row, a message is issued, and no records are deleted. - If 'position' is a numeric vector, the specified rows are deleted using 'dplyr::slice()'. - If 'position' is empty or invalid (e.g., not numeric), the function stops with an appropriate error message. - When no rows remain after deletion, the function calls 'truncate' to handle the data frame, with behavior controlled by the 'keep_rows' argument passed through '...'.
Value
A modified data frame with the specified rows removed. If 'position' is 'NULL', the function either returns the original data frame or an empty data frame, based on the 'keep_rows' argument in the 'truncate' function.
Examples
df <- data.frame(A = 1:5, B = letters[1:5])
# Delete a specific row
delete_record(df, position = 2)
# Delete multiple rows
delete_record(df, position = c(2, 4))
# Use truncate to clear the data frame
delete_record(df, position = NULL, keep_rows = FALSE)
# Keep non-NA cells but empty rows
delete_record(df, position = NULL, keep_rows = TRUE)
Deparse dots arguments into character vector
Description
Takes dots arguments and deparses them into a character vector.
Usage
deparse_dots(...)
Arguments
... |
Arguments to deparse |
Value
A character vector containing the deparsed expressions
Determine Row Insertion Position
Description
Calculates the position for inserting a new row in a data frame
Usage
determine_position(.data, record, .before = NULL, .after = NULL)
Arguments
.data |
Original data frame |
record |
New row to be inserted |
.before |
Optional. Row number before which to insert |
.after |
Optional. Row number after which to insert |
Value
List containing the potentially modified data frame and insertion position
Count number of dots arguments
Description
Returns the number of arguments passed via ...
Usage
dots_n(...)
Arguments
... |
Arguments to count |
Value
Integer length of dots arguments
Check if two objects have identical column names
Description
Compares column names between two objects element by element.
Usage
equal_names(x, y)
Arguments
x |
First object to compare |
y |
Second object to compare |
Value
Logical indicating if objects have identical column names
Export a Data Matrix to Various File Formats
Description
This function exports a data frame to a specified file format, including CSV, TSV, RDS, XLSX, and TXT. If the format is not provided, it is inferred from the file extension.
Usage
export_matrix(
.data,
file,
format = NULL,
drop_extra = FALSE,
extra_columns = NULL,
silent = FALSE,
...
)
Arguments
.data |
A data frame or tibble to be exported. |
file |
A character string specifying the file name and path. |
format |
A character string specifying the file format. If 'NULL', the format is inferred from the file extension. Supported formats: '"csv"', '"tsv"', '"rds"', '"xlsx"', '"txt"'. |
drop_extra |
Logical. If 'TRUE', removes columns not listed in 'extra_columns' before exporting. Default is 'FALSE'. |
extra_columns |
A character vector specifying additional columns to retain if 'drop_extra = TRUE'. Default is 'NULL'. |
silent |
Logical. If 'TRUE', suppresses messages. Default is 'FALSE'. |
... |
Additional arguments passed to the underlying export functions ('write.csv', 'writexl::write_xlsx', etc.). |
Value
Exports the data to a file and returns 'NULL' invisibly.
Extract elements
Description
Alias for base R extract operator '['
Usage
extract()
Value
typically an array-like R object of a similar class as x.
Extract single element
Description
Alias for base R extract operator '[['
Usage
extract2()
Value
typically an array-like R object of a similar class as x.
Extract field value from BibTeX entry
Description
Extracts the value of a specified field from a BibTeX entry string using regular expressions. The function is case-insensitive and handles various spacing patterns around the field delimiter.
Usage
extract_field(entry, field)
Arguments
entry |
Character string containing a BibTeX entry |
field |
Name of the field to extract (e.g., "title", "author", "year") |
Details
The function searches for patterns of the form "field = {value}" in the BibTeX entry, ignoring case and allowing for variable whitespace around the equals sign. The value is expected to be enclosed in curly braces.
Value
Character string containing the field value if found, NA if the field is not present
Format BibTeX Entry to AMA Citation Style
Description
Converts a BibTeX entry into AMA (American Medical Association) citation format. Handles article, book, and miscellaneous entry types.
Usage
format_ama_citation(bibtex_entry)
Arguments
bibtex_entry |
A character string containing a single BibTeX entry |
Value
An object of class c("bibentry", "character", "citation") containing:
string |
The original BibTeX entry |
year |
The publication year as numeric |
citation |
The formatted AMA citation string |
keywords |
A vector of keywords |
Examples
bibtex <- "@article{key,
author = {Smith J and Jones K},
title = {Example Title},
journal = {Journal Name},
year = {2024},
volume = {1},
pages = {1-10}
}"
citation <- format_ama_citation(bibtex)
Format Multiple BibTeX Entries to AMA Citation Style
Description
Processes multiple BibTeX entries and converts them to AMA (American Medical Association) citation format. Handles article, book, and miscellaneous entry types.
Usage
format_batch_ama_citation(bibtex_entries)
Arguments
bibtex_entries |
A character vector containing one or more BibTeX entries |
Value
If given a single entry, returns a single citation object. If given multiple entries, returns a list of citation objects. Each object is of class c("bibentry", "character", "citation") containing:
string |
The original BibTeX entry |
year |
The publication year as numeric |
citation |
The formatted AMA citation string |
keywords |
A vector of keywords |
Examples
entries <- c(
"@article{key1,
author = {Smith J},
title = {First Example},
journal = {Journal One},
year = {2024}
}",
"@book{key2,
author = {Jones K},
title = {Second Example},
publisher = {Publisher},
year = {2024}
}"
)
citations <- format_batch_ama_citation(entries)
This function imports a matrix (data frame) from various file formats (CSV, TSV, RDS, XLSX, XLS, TXT) and ensures it contains the required columns. It also allows the user to control whether extra columns should be dropped or kept.
Description
This function imports a matrix (data frame) from various file formats (CSV, TSV, RDS, XLSX, XLS, TXT) and ensures it contains the required columns. It also allows the user to control whether extra columns should be dropped or kept.
Usage
import_matrix(
path,
format = NULL,
drop_extra = FALSE,
extra_columns = NULL,
remove_dups = TRUE,
silent = FALSE,
...
)
Arguments
path |
A character string specifying the path to the file to be imported. |
format |
A character string specifying the file format. If not provided, the format is automatically detected based on the file extension. Supported formats: "csv", "tsv", "rds", "xlsx", "xls", "txt". |
drop_extra |
A logical value indicating whether extra columns (not in the list of required columns) should be dropped. Default is 'FALSE'. |
extra_columns |
A character vector of column names that are allowed in addition to the required columns. By default, no extra columns are allowed. |
remove_dups |
A logical value indicating whether to remove duplicate columns before merging. Default is 'TRUE'. |
silent |
A logical value indicating whether to suppress messages. Default is 'FALSE'. |
... |
Additional arguments passed to the specific file-reading functions (e.g., 'read.csv', 'read.delim', 'readRDS', 'readxl::read_xlsx', 'readxl::read_xls', 'read.table'). Refer to the documentation of the corresponding read function for the list of valid arguments. |
Details
The matrix includes the following predefined columns:
- 'year': Numeric. Year of publication. - 'citation': Character. Citation or reference details. - 'keywords': Character. Keywords or tags for the study. - 'profession': Character. Profession of the study participants or target audience. - 'electronic': Logical. Indicates whether the study is available electronically. - 'purpose': Character. Purpose or objective of the study. - 'study_design': Character. Study design or methodology. - 'outcome_var': Character. Outcome variables measured in the study. - 'predictor_var': Character. Predictor variables considered in the study. - 'sample': Numeric. Sample size. - 'dropout_rate': Numeric. Dropout or attrition rate. - 'setting': Character. Study setting (e.g., clinical, educational). - 'inclusion_criteria': Character. Inclusion criteria for participants. - 'ethnicity': Character. Ethnic background of participants. - 'age': Numeric. Age of participants. - 'sex': Factor. Sex of participants. - 'income': Factor. Income level of participants. - 'education': Character. Educational background of participants. - 'measures': Character. Measures or instruments used for data collection. - 'analysis': Character. Analytical methods used. - 'results': Character. Summary of results or findings. - 'limitations': Character. Limitations of the study. - 'implications': Character. Implications or recommendations from the study. - 'ethical_concerns': Character. Ethical concerns addressed in the study. - 'biases': Character. Potential biases in the study. - 'notes': Character. Additional notes or observations.
Extra columns beyond the required ones are handled via the 'extra_columns' argument. If the 'drop_extra' argument is set to 'TRUE', extra columns will be removed. If 'drop_extra' is 'FALSE', extra columns will remain in the imported data, and a message will be shown.
The '...' argument allows you to pass additional parameters directly to the read functions. For instance: - For 'read.csv', '...' could include 'header = TRUE', 'sep = ","', or 'stringsAsFactors = FALSE'. - For 'read.delim', '...' could include 'header = TRUE', 'sep ', or 'stringsAsFactors = FALSE'. - For 'readRDS', '...' could include 'refhook = NULL'. - For 'readxl::read_xlsx', '...' could include 'sheet = 1' or 'col_names = TRUE'. - For 'readxl::read_xls', '...' could include 'sheet = 1' or 'col_Names = TRUE'. - For 'read.table', '...' could include 'header = TRUE', 'sep', or 'stringsAsFactors = FALSE'.
Value
A data frame containing the imported matrix, with the required columns and any allowed extra columns.
Initialize a Literature Review Matrix
Description
Creates a standardized data frame for systematic literature review with predefined columns, allowing the addition of custom columns if needed.
Usage
init_matrix(...)
Arguments
... |
Optional. Additional column names (as character strings) to be appended to the matrix. |
Details
The matrix includes the following predefined columns: - 'year': Numeric. Year of publication. - 'citation': Character. Citation or reference details. - 'keywords': Character. Keywords or tags for the study. - 'profession': Character. Profession of the study participants or target audience. - 'electronic': Logical. Indicates whether the study is available electronically. - 'purpose': Character. Purpose or objective of the study. - 'study_design': Character. Study design or methodology. - 'outcome_var': Character. Outcome variables measured in the study. - 'predictor_var': Character. Predictor variables considered in the study. - 'sample': Numeric. Sample size. - 'dropout_rate': Numeric. Dropout or attrition rate. - 'setting': Character. Study setting (e.g., clinical, educational). - 'inclusion_criteria': Character. Inclusion criteria for participants. - 'ethnicity': Character. Ethnic background of participants. - 'age': Numeric. Age of participants. - 'sex': Factor. Sex of participants. - 'income': Factor. Income level of participants. - 'education': Character. Educational background of participants. - 'measures': Character. Measures or instruments used for data collection. - 'analysis': Character. Analytical methods used. - 'results': Character. Summary of results or findings. - 'limitations': Character. Limitations of the study. - 'implications': Character. Implications or recommendations from the study. - 'ethical_concerns': Character. Ethical concerns addressed in the study. - 'biases': Character. Potential biases in the study. - 'notes': Character. Additional notes or observations.
Custom columns can also be added by passing their names via the '...' argument.
Value
A data frame with predefined columns for literature review analysis.
Examples
# Create a basic literature review matrix
lit_matrix <- init_matrix()
Check if object is empty
Description
Tests if an object has length zero.
Usage
is_empty(x)
Arguments
x |
Object to test |
Value
Logical indicating if object has zero length
Determine if list is nested
Description
Checks for nested lists in batch functions
Usage
is_nested_list(x)
Arguments
x |
List to check |
Value
Logical indicating if list is nested
Display package version for matriz
Description
matriz_message()
produces a message about the package version
and the version of R making use of this package.
Usage
matriz_message()
Value
dmatriz_message()
returns a message about the install version
of matriz.
Author(s)
JP Monteagudo
Examples
matriz_message()
Retrieve Column Classes from deafult literature matrix.
Description
This function calls init_matrix()
to obtain a matrix or data frame,
then extracts the class of each column. It returns a data frame containing
the class information for each column.
Usage
matriz_names(...)
Arguments
... |
extra arguments to pass as column names for the literature matrix |
Details
The purpose of this function is to provide the user with a quick way to check the default names and classes as the matrix is being filled instead of having to type 'str(init_matrix())' every time the user forgets a category in the default matrix.
Value
A data frame with one column named class
that lists the class
of each column from the matrix or data frame returned by init_matrix()
.
Examples
matriz_names()
Merge Two literature matrices by Common Columns
Description
This function merges two literature matrices based on specified key columns, with options for full or inner joins and duplicate column removal.
Usage
merge_matrix(
.data,
.data2,
by = NULL,
all = FALSE,
remove_dups = TRUE,
suffixes = c(".x", ".y"),
silent = FALSE
)
Arguments
.data |
A data frame to be merged. |
.data2 |
A second data frame to be merged with '.data'. |
by |
A character vector specifying the column(s) to merge by. Must exist in both data frames. |
all |
A logical value indicating whether to perform a full join ('TRUE') or an inner join ('FALSE', default). |
remove_dups |
A logical value indicating whether to remove duplicate columns before merging. Default is 'TRUE'. |
suffixes |
A character vector of length 2 specifying suffixes to apply to overlapping column names from '.data' and '.data2', respectively. Default is 'c(".x", ".y")'. |
silent |
A logical value indicating whether to suppress messages about duplicate column removal. Default is 'FALSE'. |
Details
The function first ensures that '.data' and '.data2' are valid data frames and checks that the 'by' columns exist in both. If 'remove_dups = TRUE', duplicate columns are removed before merging. The function then performs either a full or inner join using 'dplyr::full_join()' or 'dplyr::inner_join()', respectively.
Value
A merged data frame with specified join conditions applied.
Examples
df1 <- data.frame(id = c(1, 2, 3), value1 = c("A", "B", "C"))
df2 <- data.frame(id = c(2, 3, 4), value2 = c("X", "Y", "Z"))
# Inner join (default)
merge_matrix(df1, df2, by = "id")
# Full join
merge_matrix(df1, df2, by = "id", all = TRUE)
# Remove duplicate columns before merging
df3 <- data.frame(id = c(1, 2, 3), value1 = c("A", "B", "C"), extra = c(1, 2, 3))
df4 <- data.frame(id = c(2, 3, 4), value2 = c("X", "Y", "Z"), extra = c(4, 5, 6))
merge_matrix(df3, df4, by = "id", remove_dups = TRUE)
Parse Multiple Citations from File
Description
Reads and parses multiple BibTeX citations from a file, handling whitespace and formatting while ensuring proper brace matching.
Usage
parse_batch_citation(entry)
Arguments
entry |
A character string containing the path to a file with BibTeX citations |
Value
If the file contains a single citation, returns a character string. If multiple citations are present, returns a character vector of citations. Returns NULL if no citations are found.
Parse Citation from File
Description
Reads and parses a single BibTeX citation from a file, cleaning up whitespace and formatting.
Usage
parse_citation(entry)
Arguments
entry |
A character string containing the path to a file with a BibTeX citation |
Value
A character string containing the cleaned and parsed BibTeX entry
Process Multiple BibTeX Citations and Update Literature Matrix
Description
Reads multiple BibTeX citations from files and updates the corresponding rows in a literature matrix with formatted citations, keywords, and years.
Usage
process_batch_citation(.data, citations, where = NULL)
Arguments
.data |
A data frame containing at least three columns:
|
citations |
Character vector of file paths to BibTeX citation files |
where |
Numeric vector indicating which rows to update. If NULL (default), all rows will be updated. |
Value
A data frame with updated citation information in the specified rows
See Also
format_batch_ama_citation
, parse_batch_citation
Process a Citation Record
Description
Takes a record list and a citation string, processes the citation into AMA format, and updates the record with the formatted citation, keywords, and year.
Usage
process_citation(.record, citation)
Arguments
.record |
A list containing the record to be updated |
citation |
A character string containing a BibTeX citation |
Value
An updated list containing the original record with added fields:
citation |
The formatted AMA citation |
keywords |
A vector of keywords from the citation |
year |
The publication year |
Remove Duplicates from Vectors or Data Frame Columns
Description
Remove Duplicates from Vectors or Data Frame Columns
Usage
rid_dups(x, incomparables = FALSE, ...)
Arguments
x |
A vector or data frame |
incomparables |
A vector of values that cannot be compared. See ?duplicated |
... |
arguments for particular methods used in 'unique' and 'duplicated' |
Value
The input with duplicates removed
Check if two objects have the same number of columns
Description
Compares number of columns between two objects.
Usage
same_column(x, y)
Arguments
x |
First object to compare |
y |
Second object to compare |
Value
Logical indicating if objects have same number of columns
Check if two objects have the same length
Description
Compares lengths of two objects, handling special cases for lists and data frames.
Usage
same_length(x, y)
Arguments
x |
First object to compare |
y |
Second object to compare |
Value
Logical indicating if objects have same length
Search and Filter Records in a literature matrix
Description
Filters a literature matrix based on a specified condition, with the option to restrict the search to a specific column. The function supports both column names and numeric indices for column selection.
Usage
search_record(.data, column = NULL, where = NULL)
Arguments
.data |
A data frame to search within. |
column |
Optional. The column to search in, specified either by name or numeric index. If NULL (default), the search is performed across all columns. |
where |
A logical expression that defines the search condition. Must evaluate to a logical vector of the same length as the number of rows in '.data'. |
Value
A filtered data frame containing only the rows that match the search condition. If a specific column was selected, only that column is returned.
Examples
df <- data.frame(
id = 1:5,
name = c("John", "Jane", "Bob", "Alice", "John"),
age = c(25, 30, 35, 28, 40)
)
# Search across all columns where age > 30
search_record(df, where = age > 30)
# Search only in the name column for "John"
search_record(df, column = "name", where = name == "John")
# Search using column index
search_record(df, column = 2, where = name == "Jane")
Convert Symbol to Character
Description
Converts a symbol to a character string
Usage
to_char(symbol)
Arguments
symbol |
A symbol to convert |
Value
A character string
Truncate a Data Frame or Matrix
Description
Remove all rows from a literature matrix but preserve the general structure. Mimics SQL's TRUNCATE operation by clearing data while preserving structure.
Usage
truncate(.data, keep_rows = FALSE)
Arguments
.data |
A data frame or matrix to be truncated |
keep_rows |
Logical. If TRUE, replaces non-NA values with NA instead of removing all data |
Value
An empty data frame or matrix with the same structure as the input
Examples
# Completely empty a data frame
df <- data.frame(x = 1:3, y = 4:6)
truncate(df)
# Replace non-NA values with NA while keeping structure
truncate(df, keep_rows = TRUE)
Update Rows in a Data Frame Based on a Condition
Description
Modifies the values in a specified column of a data frame for rows that meet a given condition.
Usage
update_record(.data, column = NULL, where = NULL, set_to = NULL, ...)
Arguments
.data |
A data frame. The dataset to modify. |
column |
A column in the data frame to update. Can be specified as a column name, index, or unquoted column symbol. |
where |
A condition that determines which rows to update. Must evaluate to a logical vector of the same length as the number of rows in '.data'. |
set_to |
The value to assign to the rows in the specified column where the 'where' condition is 'TRUE'. |
... |
Additional arguments (currently unused, reserved for future use). |
Details
This function updates values in a specified column of a data frame for rows that satisfy the given condition. The 'column' parameter can be provided as: - A numeric column index (e.g., '2'). - A column name (e.g., '"value"'). - An unquoted column symbol (e.g., 'value').
Value
The modified data frame with updated values.
Examples
# Example data frame
df <- data.frame(
id = 1:5,
value = c(10, 20, 30, 40, 50)
)
# Update rows where id > 3
updated_df <- update_record(df, column = value, where = id > 3, set_to = 100)
print(updated_df)
# Using column as a string
updated_df <- update_record(df, column = "value", where = id == 2, set_to = 99)
print(updated_df)
Validate and Clean Imported Data Matrix
Description
This function ensures that the imported data contains all required columns, optionally removes unwanted extra columns, and provides informative messages about the dataset's structure.
Usage
validate_columns(
data,
extra_columns = NULL,
drop_extra = FALSE,
silent = FALSE
)
Arguments
data |
A data frame containing the imported matrix. |
extra_columns |
A character vector of allowed additional columns beyond the required ones. Defaults to NULL. |
drop_extra |
A logical value indicating whether to remove extra columns that are not in 'extra_columns'. Defaults to FALSE. |
silent |
A logical value indicating whether to suppress messages. Defaults to FALSE. |
Details
The function checks whether all required columns are present in the data. If any required columns are missing, it stops execution and informs the user.
It also identifies extra columns beyond the required set and compares them against the allowed 'extra_columns'. If 'drop_extra = TRUE', it removes any extra columns not listed in 'extra_columns'. If 'drop_extra = FALSE', it retains the extra columns but issues a message unless 'silent = TRUE'.
Value
A cleaned data frame with required columns intact and, optionally, extra columns removed.
Note
The function assumes that column names in 'data' are correctly formatted and case-sensitive.