Type: Package
Title: Write from Multiple Sources to a Database Table
Version: 0.1.0
Description: Provides unified syntax to write data from lazy 'dplyr' 'tbl' or 'dplyr' 'sql' query or a dataframe to a database table with modes such as create, append, insert, update, upsert, patch, delete, overwrite, overwrite_schema.
License: LGPL (≥ 3)
Imports: DBI (≥ 1.2.3), dplyr (≥ 1.1.0), dbplyr (≥ 2.3.1), glue (≥ 1.5.1), cli (≥ 3.4.0), rlang (≥ 1.1.2)
Suggests: RSQLite, testthat (≥ 3.0.0),
URL: https://github.com/talegari/writer
Config/testthat/edition: 3
Encoding: UTF-8
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2025-01-31 18:22:17 UTC; srikanth-ks1
Author: Komala Sheshachala Srikanth ORCID iD [aut, cre]
Maintainer: Komala Sheshachala Srikanth <sri.teach@gmail.com>
Repository: CRAN
Date/Publication: 2025-02-03 11:40:02 UTC

string to_id

Description

simple string to DBI id

Usage

string_to_id(x)

Validate table name

Description

Handles string, I() or Id objects and returns a string

Usage

validate_table_name(name)

wrap by I()

Description

wraps using a separate function as cannot be done literally

Usage

wrap_by_I(string)

write_df_create

Description

write_df_create

Usage

write_df_append(df, dest_table_name, con, ...)

write_df_create

Description

write_df_create

Usage

write_df_create(df, dest_table_name, con, ...)

write_df_overwrite

Description

write_df_overwrite

Usage

write_df_overwrite(df, dest_table_name, con, ...)

write_df_overwrite_schema

Description

write_df_overwrite_schema

Usage

write_df_overwrite_schema(df, dest_table_name, con, ...)

write_rows

Description

write_rows

Usage

write_rows(
  source,
  dest_table_name,
  con,
  mode = c("append", "insert", "update", "upsert", "patch", "delete"),
  ...
)

Write from multiple sources to a database table

Description

Provides unified syntax to write data from dplyr::tbl() (lazy dplyr query via a DBI connection) or a dplyr::sql() or a data.frame to a database table with modes such as: create, append, insert, update, upsert, patch, delete, overwrite, overwrite_schema.

Usage

write_table(x, table_name, mode, con = NULL, verbose = TRUE, ...)

## S3 method for class 'tbl_lazy'
write_table(
  x,
  table_name,
  mode = c("create", "append", "insert", "update", "upsert", "patch", "delete",
    "overwrite", "overwrite_schema"),
  con = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'sql'
write_table(
  x,
  table_name,
  mode = c("create", "append", "insert", "update", "upsert", "patch", "delete",
    "overwrite", "overwrite_schema"),
  con = NULL,
  verbose = TRUE,
  ...
)

## S3 method for class 'data.frame'
write_table(
  x,
  table_name,
  mode = c("create", "append", "insert", "update", "upsert", "patch", "delete",
    "overwrite", "overwrite_schema"),
  con = NULL,
  verbose = TRUE,
  ...
)

Arguments

x

( dplyr::tbl() or dplyr::sql() or data.frame ) The data to write to the database. Input need not have any rows.

table_name

( string or AsIs or Id ) The name of the table to write to. This has to be one among: string (typically "catalog.schema.table"), Object of class "AsIs" generated by wrapping I() (typically, I("catalog.schema.table")) or Object of class Id generated by DBI::Id (typically, DBI::Id("catalog", "schema", "table")).

mode

( string ) Writing mode. Possible values are one among: create, append, insert, update, upsert, patch, delete, overwrite, overwrite_schema.

con

( default: NULL ) DBI-connection object to use to write operation. When con is NULL and source is a tbl_lazy, then DBI connection object from the source tbl is used. When source is a data.frame, then con should be provided.

verbose

( default: TRUE ) Whether the progress message should be shown

...

Arguments passed to specific function based on mode. See details.

Details

The DBI-dplyr-dbplyr combination provides a great workflow to handle database operations from R. When saving the output from analysis notebooks or scripts, different functions need to be called based on the type of the object we intend to write. writer package solves the problem by exporting one generic write_table to handle multiple input types and multiple modes. Further, overwrite and overwrite_schema refine the idea of table overwrite so that schema of the table is not changed inadvertently.

Modes

Failures

The failures are mostly due to unavailable or wrong sql translation to the specific backend. Please raise issues in dbplyr repo.

Errors are raised with a class (using rlang::abort()). These are the error classes:

 * `error_input`: Related to wrong or unexpected input.
 * `error_connection`: Raised when connection is inactive.
 * `error_table`: Related to table existence or non-existence.
 * `error_operation`: Related to core write operation.

Permissions

 * `create`: create new table
 * `append`, `insert`, `update`, `upsert`, `patch`, `delete`: modify existing table
 * `overwrite`: modify existing table
 * `overwrite_schema`: delete, create and rename table

Value

When successful, returns the output table name as a string. Else, throws an error with informative messages.

Examples

## Not run: 
#' Create an in-memory SQLite database connection
con = DBI::dbConnect(RSQLite::SQLite(), ":memory:")

remove_new_table = function(){
  df = data.frame(id = 1:3, name = c("Alice", "Bob", "Charlie"))
  DBI::dbRemoveTable(con, "new_table", fail_if_missing = FALSE)
}

create_new_table = function(){
  df = data.frame(id = 1:3, name = c("Alice", "Bob", "Charlie"))
  DBI::dbWriteTable(con, "new_table", df, overwrite = TRUE)
}

#' Create a sample data.frame
df = data.frame(id = 1:3, name = c("Alice", "Bob", "Charlie"))
df

#' Create a new table
write_table(df, "new_table", mode = "create", con = con)
dplyr::tbl(con, "new_table")


intermediate = dplyr::tbl(con, "new_table") |>
  dplyr::filter(id >= 2)

write_table(intermediate, "new_filtered_table", mode = "create")
dplyr::tbl(con, "new_filtered_table")

#' Append data to an existing table
create_new_table()
append_df = data.frame(id = 4:5, name = c("Dave", "Eve"))

append_df |>
  write_table("new_table", mode = "append", con = con)
dplyr::tbl(con, "new_table")

create_new_table()
write_table(append_df, "append_table", mode = "create", con)
dplyr::tbl(con, "append_table")

dplyr::tbl(con, "append_table") |>
  write_table("new_table", mode = "append")
dplyr::tbl(con, "new_table")

#' Insert data into an existing table, only if key values do not exist
create_new_table()
dplyr::tbl(con, "new_table")
insert_df = data.frame(id = 3:4, name = c("Dave", "Eve"))
insert_df

insert_df |>
  write_table("new_table",
              mode = "insert",
              con = con,
              by = "id",
              conflict = "ignore"
              )
dplyr::tbl(con, "new_table")

create_new_table()
insert_df |>
  write_table("insert_table", mode = "create", con = con)
dplyr::tbl(con, "insert_table")

dplyr::tbl(con, "insert_table") |>
  write_table("new_table",
              mode = "insert",
              by = "id",
              conflict = "ignore"
              )
dplyr::tbl(con, "new_table")

#' Update data in an existing table, only if key values match
create_new_table()
update_df = data.frame(id = c(1, 3), name = c("Alicia", "Charles"))
update_df
write_table(update_df,
            "new_table",
            mode = "update",
            con = con,
            unmatched = "ignore"
            )
dplyr::tbl(con, "new_table")

create_new_table()
write_table(update_df, "update_table", mode = "create", con = con)
dplyr::tbl(con, "update_table")
write_table(dplyr::tbl(con, "update_table"),
            "new_table",
            mode = "update",
            unmatched = "ignore"
            )
dplyr::tbl(con, "new_table")

#' upsert
create_new_table()
upsert_df = data.frame(id = c(2, 6), name = c("Bobby", "Frank"))
upsert_df
write_table(upsert_df,
            "new_table",
            mode = "upsert",
            con = con,
            by = "id"
            )
dplyr::tbl(con, "new_table")

create_new_table()
write_table(upsert_df,
            "upsert_table",
            mode = "create",
            con = con
            )
dplyr::tbl(con, "upsert_table")
write_table(dplyr::tbl(con, "upsert_table"),
            "new_table",
            mode = "upsert"
            )
dplyr::tbl(con, "new_table")


#' Patch data, updating only missing values
create_new_table()
patch_df = data.frame(id = c(1, 2), name = c("alice", NA))
patch_df
write_table(df_patch, "table_with_na", mode = "create", con)
dplyr::tbl(con, "table_with_na")
df
write_table(df,
            "table_with_na",
            mode = "patch",
            con = con,
            unmatched = "ignore"
            )
dplyr::tbl(con, "table_with_na")

DBI::dbRemoveTable(con, "table_with_na")
write_table(df_patch, "table_with_na", mode = "create", con)
dplyr::tbl(con, "new_table")
write_table(dplyr::tbl(con, "new_table"),
            "table_with_na",
            mode = "patch",
            con = con,
            unmatched = "ignore"
            )
dplyr::tbl(con, "table_with_na")

#' Delete rows for matching key values
create_new_table()
delete_df = data.frame(id = c(3, 4))
delete_df
write_table(df_delete,
            "new_table",
            mode = "delete",
            con = con,
            unmatched = "ignore"
            )
dplyr::tbl(con, "new_table")

create_new_table()
write_table(delete_df,
            "delete_table",
            mode = "create",
            con = con
            )
dplyr::tbl(con, "delete_table")
write_table(df_delete,
            "new_table",
            mode = "delete",
            con = con,
            unmatched = "ignore"
            )
dplyr::tbl(con, "new_table")

#' Overwrite data in an existing table, schema must match
create_new_table()
overwrite_df = data.frame(id = c(2, 6), name = c("Bobby", "Frank"))
overwrite_df
write_table(overwrite_df,
            "new_table",
            mode = "overwrite",
            con = con
            )
dplyr::tbl(con, "new_table")

create_new_table()
overwrite_df = data.frame(id = c(2, 6), name = c("Bobby", "Frank"))
overwrite_df
write_table(overwrite_df,
            "new_table",
            mode = "overwrite",
            con = con
            )
dplyr::tbl(con, "new_table")

#' Overwrite schema
overwrite_schema_df = data.frame(id = c(2, 6),
                                 name = c("Bobby", "Frank"),
                                 age = c(30, 40)
                                 )
write_table(overwrite_schema_df,
            "new_table",
            mode = "overwrite_schema",
            con = con
            )
dplyr::tbl(con, "new_table")

create_new_table()
write_table(overwrite_schema_df,
            "overwrite_schema_table",
            mode = "overwrite_schema",
            con = con
            )
write_table(dplyr::tbl(con, "overwrite_schema_table"),
            "new_table",
            mode = "overwrite_schema",
            con = con
            )
dplyr::tbl(con, "new_table")

#' Disconnect from the database
DBI::dbDisconnect(con)

## End(Not run)

write_tbl_create

Description

write_tbl_create

Usage

write_tbl_create(source_tbl, dest_table_name, con, ...)

write_tbl_overwrite

Description

write_tbl_overwrite

Usage

write_tbl_overwrite(source_tbl, dest_table_name, con, ...)

write_tbl_overwrite_schema

Description

write_tbl_overwrite_schema

Usage

write_tbl_overwrite_schema(source_tbl, dest_table_name, con, ...)