Help for package rolap

Title:

Obtaining Star Databases from Flat Tables

Version:

2.5.2

Description:

Data in multidimensional systems is obtained from operational systems and is transformed to adapt it to the new structure. Frequently, the operations to be performed aim to transform a flat table into a ROLAP (Relational On-Line Analytical Processing) star database. The main objective of the package is to allow the definition of these transformations easily. The implementation of the multidimensional database obtained can be exported to work with multidimensional analysis tools on spreadsheets or relational databases.

License:

MIT + file LICENSE

URL:

https://josesamos.github.io/rolap/, https://github.com/josesamos/rolap

BugReports:

https://github.com/josesamos/rolap/issues

Depends:

R (≥ 4.1.0)

Imports:

dm, dplyr, methods, purrr, readr, rlang, sf, snakecase, tibble, tidyr, tidyselect, tools, utils, when, xlsx

Suggests:

DBI, dbplyr, DiagrammeR, DiagrammeRsvg, knitr, lubridate, magrittr, maps, pander, pivottabler, RMariaDB, rmarkdown, RSQLite, stringr, testthat (≥ 3.0.0)

VignetteBuilder:

knitr

Config/testthat/edition:

Encoding:

UTF-8

Language:

en-GB

LazyData:

true

LazyDataCompression:

RoxygenNote:

7.3.2

NeedsCompilation:

Packaged:

2025-05-21 10:48:29 UTC; jsamos

Author:

Jose Samos [aut, cre], Universidad de Granada [cph]

Maintainer:

Jose Samos <jsamos@ugr.es>

Repository:

CRAN

Date/Publication:

2025-05-22 05:10:02 UTC

Add custom column

Description

Add a column returned by a function that takes the data of the flat table as a parameter.

Usage

add_custom_column(ft, name, definition)

## S3 method for class 'flat_table'
add_custom_column(ft, name = NULL, definition)

Arguments

ft

A flat_table object.

name

A string, new column name.

definition

A function that returns a table column.

Value

A flat_table object.

Examples


f <- function(table) {
  paste0(table$City, ' - ', table$State)
}

ft <- flat_table('ft_num', ft_num) |>
  add_custom_column(name = 'city_state', definition = f)

Add dimension instances

Description

Add dimension instances

Usage

add_dimension_instances(db, name, table)

Arguments

db

A star_database object.

name

A string, dimension name.

table

A table of new instances.

Value

A star_database object.

For each row, add a vector of values

Description

For each row, add a vector of values

Usage

add_dput_column(v, column)

Arguments

v

A tibble, rows of a dimension table.

column

A string, name of the column to include a vector of values.

Value

A tibble, rows of a dimension table.

A `star_operation` object row is added with a new operation

Description

A star_operation object row is added with a new operation

Usage

add_operation(op, op_name, name = NULL, details = NULL, details2 = NULL)

Arguments

op

A star_operation object.

op_name

A string, operation name.

name

A string, element name.

details

A vector of strings, operation details.

details2

A vector of strings, operation additional details.

Value

A star_operation object.

Add the surrogate key from a dimension table to the instances table.

Description

Add the surrogate key from a dimension table to the instances table.

Usage

## S3 method for class 'dimension_table'
add_surrogate_key(dimension_table, instances)

Arguments

dimension_table

A dimension_table object.

instances

A tibble, the instances table.

Value

A tibble.
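The idea of adding a surrogate key can be sketched in base R. This is only an illustration, not the package's implementation: the tables, the data and the key name `where_key` are assumptions made for the example.

```r
# Hypothetical dimension table: one row per distinct attribute combination,
# identified by a surrogate key (the name `where_key` is an assumption).
dim_where <- data.frame(
  where_key = 1:3,
  city = c("Boston", "Cambridge", "Hartford"),
  state = c("MA", "MA", "CT"),
  stringsAsFactors = FALSE
)

# Hypothetical instances table that still carries the dimension attributes.
instances <- data.frame(
  city = c("Cambridge", "Boston", "Cambridge", "Hartford"),
  state = c("MA", "MA", "MA", "CT"),
  deaths = c(10L, 25L, 7L, 12L),
  stringsAsFactors = FALSE
)

# Join on the dimension attributes so that each instance receives the
# surrogate key of its matching dimension row.
with_key <- merge(instances, dim_where, by = c("city", "state"), all.x = TRUE)
```

Once the key is present, the attribute columns can be dropped from the instances, since they can be recovered from the dimension table.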

Apply filter dimension

Description

Select the instances of the dimensions that meet the defined conditions.

Usage

apply_filter_dimension(db, sq)

Arguments

db

A star_database object.

sq

A star_query object.

Apply select dimension

Description

Select dimensions and attributes.

Usage

apply_select_dimension(db, sq)

Arguments

db

A star_database object.

sq

A star_query object.

Apply select fact

Description

Select the facts, measures and define the aggregation functions.

Usage

apply_select_fact(db, sq)

Arguments

db

A star_database object.

sq

A star_query object.

Save as `GeoPackage`

Description

Save the geolayer (geographic information layer) and the variables layer in a file in GeoPackage format to be able to work with other tools.

Usage

as_GeoPackage(gl, dir, name, keep_all_variables_na)

## S3 method for class 'geolayer'
as_GeoPackage(gl, dir = NULL, name = NULL, keep_all_variables_na = FALSE)

Arguments

gl

A geolayer object.

dir

A string.

name

A string, file name.

keep_all_variables_na

A boolean, keep rows with all variables NA.

Details

If the file name is not indicated, it defaults to the name of the geovariable.

By default, rows that are NA for all variables are eliminated.

The GeoPackage format only allows defining a maximum of 1998 columns. If the number of variables and columns in the geographic layer exceeds this number, it cannot be saved in this format.

Value

A string, file name.

Examples


gl <- mrs_db_geo |>
  as_geolayer()

f <- gl |>
  as_GeoPackage(dir = tempdir())

Generate csv files with fact and dimension tables

Description

To port databases to other work environments it is useful to be able to export them as csv files, as this function does.

Usage

as_csv_files(db, dir, type)

## S3 method for class 'star_database'
as_csv_files(db, dir = NULL, type = 1)

Arguments

db

A star_database object.

dir

A string, name of a directory.

type

An integer, 1: uses "." for the decimal point and a comma for the separator; 2: uses a comma for the decimal point and a semicolon for the separator.
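The two conventions selected by type can be illustrated with base R writers. This sketch does not call the package and makes no claim about how as_csv_files is implemented; the data frame and file names are assumptions.

```r
# Hypothetical table with a decimal measure.
x <- data.frame(
  city = c("Boston", "Cambridge"),
  rate = c(1.5, 2.25),
  stringsAsFactors = FALSE
)

f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")

# Convention of type 1: "." as decimal point, "," as separator.
utils::write.csv(x, f1, row.names = FALSE)

# Convention of type 2: "," as decimal point, ";" as separator.
utils::write.csv2(x, f2, row.names = FALSE)

lines1 <- readLines(f1)
lines2 <- readLines(f2)

# Reading the second file back requires the matching reader.
x2 <- utils::read.csv2(f2)
```

The second convention is common in locales where the comma is the decimal mark, which is why a spreadsheet configured for such a locale may misread a file written with the first one.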

Value

A string, name of a directory.

Examples


db1 <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()
tl1 <- db1 |>
  as_csv_files()

db2 <- star_database(mrs_age_schema, ft_age) |>
  snake_case()

ct <- constellation("MRS", db1, db2)
d <- ct |>
  as_csv_files(dir = tempdir())

Generate a `dm` class with fact and dimension tables

Description

To port databases to other work environments it is useful to be able to export them as a dm object, as this function does. In this way, the database can be saved directly in a DBMS.

Usage

as_dm_class(db, pk_facts, fk)

## S3 method for class 'star_database'
as_dm_class(db, pk_facts = TRUE, fk = TRUE)

Arguments

db

A star_database object.

pk_facts

A boolean, include primary key in fact tables.

fk

A boolean, include foreign key in fact tables.

Value

A dm object.

Examples


db1 <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()
dm1 <- db1 |>
  as_dm_class()

db2 <- star_database(mrs_age_schema, ft_age) |>
  snake_case()

ct <- constellation("MRS", db1, db2)
dm <- ct |>
  as_dm_class()

Get a `geolayer` object

Description

From a star_database with at least one geoattribute, we obtain a geolayer object that allows us to select the data to obtain a vector layer with geographic information.

Usage

as_geolayer(db, dimension, attribute, geometry, include_nrow_agg)

## S3 method for class 'star_database'
as_geolayer(
  db,
  dimension = NULL,
  attribute = NULL,
  geometry = NULL,
  include_nrow_agg = FALSE
)

Arguments

db

A star_database object.

dimension

A string, dimension name.

attribute

A vector, attribute names.

geometry

A string, geometry name.

include_nrow_agg

A boolean, include default measure.

Details

If only one geographic attribute is defined, it is not necessary to indicate the dimension or the attribute. By default, polygon geometry is considered.

Value

A geolayer object.

Examples


gl_polygon <- mrs_db_geo |>
  as_geolayer()

gl_point <- mrs_db_geo |>
  as_geolayer(geometry = "point")

Generate a `geomultistar::multistar` object

Description

In order to be able to use the query and integration functions with geographic information offered by the geomultistar package, we can obtain a multistar object from a star database or a constellation.

Usage

as_multistar(db)

## S3 method for class 'star_database'
as_multistar(db)

Arguments

db

A star_database object.

Value

A geomultistar::multistar object.

Examples


db1 <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()
ms1 <- db1 |>
  as_multistar()

db2 <- star_database(mrs_age_schema, ft_age) |>
  snake_case()

ct <- constellation("MRS", db1, db2)
ms <- ct |>
  as_multistar()

Generate tables in a relational database

Description

Given a connection to a relational database, it stores the facts and dimensions in the form of tables. Tables can be overwritten.

Usage

as_rdb(db, con, overwrite)

## S3 method for class 'star_database'
as_rdb(db, con, overwrite = FALSE)

Arguments

db

A star_database object.

con

A DBI::DBIConnection object.

overwrite

A boolean, allow overwriting tables in the database.

Value

Invisible NULL.

Examples


my_db <- DBI::dbConnect(RSQLite::SQLite())

db <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()
db |>
  as_rdb(my_db)

DBI::dbDisconnect(my_db)

Generate a list of tibbles of flat tables

Description

Allows you to transform a star database into a flat table. If we have a constellation, it returns a list of flat tables.

Usage

as_single_tibble_list(db)

## S3 method for class 'star_database'
as_single_tibble_list(db)

Arguments

db

A star_database object.

Value

A list of tibbles.
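Conceptually, flattening a star means joining each dimension back to the facts through its surrogate key. The base R sketch below shows that idea under assumed table and column names; it is not the package's implementation.

```r
# Hypothetical fact and dimension tables of a small star schema.
fact <- data.frame(
  when_key = c(1L, 1L, 2L),
  where_key = c(1L, 2L, 1L),
  deaths = c(10L, 7L, 12L)
)
dim_when <- data.frame(
  when_key = 1:2,
  year = c("1962", "1963"),
  stringsAsFactors = FALSE
)
dim_where <- data.frame(
  where_key = 1:2,
  city = c("Boston", "Cambridge"),
  stringsAsFactors = FALSE
)

# Join every dimension to the facts through its surrogate key ...
flat <- merge(fact, dim_when, by = "when_key")
flat <- merge(flat, dim_where, by = "where_key")

# ... and drop the keys, which carry no meaning in a flat table.
flat <- flat[, c("year", "city", "deaths")]
```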

Examples


db1 <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()
tl1 <- db1 |>
  as_single_tibble_list()

db2 <- star_database(mrs_age_schema, ft_age) |>
  snake_case()

ct <- constellation("MRS", db1, db2)
tl <- ct |>
  as_single_tibble_list()

Get a star database from a flat table

Description

Obtain a star database from the flat table and a star schema.

Usage

as_star_database(ft, schema)

## S3 method for class 'flat_table'
as_star_database(ft, schema)

Arguments

ft

A flat_table object.

schema

A star_schema object.

Value

A star_database object.

Examples


db <- flat_table('ft_num', ft_num) |>
  as_star_database(mrs_cause_schema)

Generate a list of tibbles with fact and dimension tables

Description

To port databases to other work environments it is useful to be able to export them as a list of tibbles, as this function does.

Usage

as_tibble_list(db)

## S3 method for class 'star_database'
as_tibble_list(db)

Arguments

db

A star_database object.

Value

A list of tibbles.

Examples


db1 <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()
tl1 <- db1 |>
  as_tibble_list()

db2 <- star_database(mrs_age_schema, ft_age) |>
  snake_case()

ct <- constellation("MRS", db1, db2)
tl <- ct |>
  as_tibble_list()

Generate an xlsx file with fact and dimension tables

Description

To port databases to other work environments it is useful to be able to export them as an xlsx file, as this function does.

Usage

as_xlsx_file(db, file)

## S3 method for class 'star_database'
as_xlsx_file(db, file = NULL)

Arguments

db

A star_database object.

file

A string, name of a file.

Value

A string, name of a file.

Examples


db1 <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()
tl1 <- db1 |>
  as_xlsx_file()

db2 <- star_database(mrs_age_schema, ft_age) |>
  snake_case()

ct <- constellation("MRS", db1, db2)
f <- ct |>
  as_xlsx_file(file = tempfile())

Cancel deployment

Description

Cancel deployment

Usage

cancel_deployment(db, name)

## S3 method for class 'star_database'
cancel_deployment(db, name)

Arguments

db

A star_database object.

name

A string, name of the deployment.

Value

A star_database object.

Examples


mrs_rdb_file <- tempfile("mrs", fileext = ".rdb")
mrs_sqlite_file <- tempfile("mrs", fileext = ".sqlite")

mrs_sqlite_connect <- function() {
  DBI::dbConnect(RSQLite::SQLite(),
                 dbname = mrs_sqlite_file)
}

mrs_db <- mrs_db |>
  deploy(
    name = "mrs",
    connect = mrs_sqlite_connect,
    file = mrs_rdb_file
  )

mrs_db <- mrs_db |>
  cancel_deployment(name = "mrs")

Check `geoattribute` geometry instances

Description

Get unrelated instances of a geoattribute for a geometry.

Usage

check_geoattribute_geometry(db, dimension, attribute, geometry)

## S3 method for class 'star_database'
check_geoattribute_geometry(
  db,
  dimension = NULL,
  attribute = NULL,
  geometry = "polygon"
)

Arguments

db

A star_database object.

dimension

A string, dimension name.

attribute

A vector, attribute names.

geometry

A string, geometry name ('point' or 'polygon').

Details

We obtain the values of the dimension attribute that do not have an associated geographic element of the indicated geometry.

If only one geoattribute is defined, neither the dimension nor the attribute needs to be indicated.

Value

A tibble.

Examples


db <- mrs_db |>
  define_geoattribute(
    dimension = "where",
    attribute = "state",
    from_layer = us_layer_state,
    by = "STUSPS"
  )

instances <- check_geoattribute_geometry(db,
                                         dimension = "where",
                                         attribute = "state")

Check the result of joining a flat table with a lookup table

Description

Before joining a flat table with a lookup table we can check the result to determine if we need to adapt the values of some instances or add new elements to the lookup table. This function returns the values of the foreign key of the flat table that do not correspond to the primary key of the lookup table.

Usage

check_lookup_table(ft, fk_attributes, lookup)

## S3 method for class 'flat_table'
check_lookup_table(ft, fk_attributes = NULL, lookup)

Arguments

ft

A flat_table object.

fk_attributes

A vector of strings, attribute names.

lookup

A flat_table object.

Details

If no attributes are indicated, those that form the primary key of the lookup table are considered in the flat table.
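The check amounts to an anti-join: keep the foreign key values of the flat table that have no match in the primary key of the lookup table. A minimal base R sketch with assumed data (it does not use the package's classes):

```r
# Hypothetical flat table whose `state` column acts as a foreign key.
ft_rows <- data.frame(
  city = c("Boston", "Hartford", "Toronto"),
  state = c("MA", "CT", "ON"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table with `state` as its primary key.
lookup_rows <- data.frame(
  state = c("MA", "CT", "NY"),
  region = c("1", "1", "2"),
  stringsAsFactors = FALSE
)

# Distinct foreign key values with no matching primary key (an anti-join).
no_match <- !(ft_rows$state %in% lookup_rows$state)
unmatched <- unique(ft_rows[no_match, "state", drop = FALSE])
```

An empty result means that every row of the flat table will find its counterpart when the join is performed.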

Value

A tibble with attribute values.

Examples


lookup <- flat_table('iris', iris) |>
  lookup_table(
    measures = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    measure_agg = c('MAX', 'MIN', 'SUM', 'MEAN')
  )
values <- flat_table('iris', iris) |>
  check_lookup_table(lookup = lookup)

Checks the refresh of the selected star database from the given database

Description

Checks the refresh operation of the selected star database from the given database. Once this operation is carried out, the results can be consulted on the new instances in dimensions or existing instances in the facts.

Usage

check_refesh(db, refresh_db)

Arguments

db

A star_database object.

refresh_db

A star_database object with the same structure with new data.

Value

A list of facts and dimensions, first facts, then dimensions.

Conform dimensions

Description

Generate a dimension from a list of dimensions with the same schema.

Usage

conform_dimensions(to_conform)

Arguments

to_conform

A dimension_table object list.

Value

A dimension_table object.
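One way to picture conforming is as the union of the instances of all the dimensions, without duplicates, followed by assigning new surrogate keys. The sketch below uses plain data frames and an assumed key name (`where_key`); how the package performs this internally is not stated here.

```r
# Two hypothetical instance sets of a "where" dimension with the same
# attributes but partly different rows (surrogate keys omitted).
where_1 <- data.frame(
  city = c("Boston", "Cambridge"),
  state = c("MA", "MA"),
  stringsAsFactors = FALSE
)
where_2 <- data.frame(
  city = c("Cambridge", "Hartford"),
  state = c("MA", "CT"),
  stringsAsFactors = FALSE
)

# Union of the instances without duplicates.
conformed <- unique(rbind(where_1, where_2))

# Sort and assign a fresh surrogate key (the column name is an assumption).
conformed <- conformed[order(conformed$city, conformed$state), ]
conformed <- cbind(where_key = seq_len(nrow(conformed)), conformed)
rownames(conformed) <- NULL
```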

Create constellation

Description

Creates a constellation from a list of star_database objects. A constellation is also represented by a star_database object. All dimensions with the same name in the star schemas have to be conformable (share the same structure, even though they have different instances).

Usage

constellation(name = NULL, ...)

Arguments

name

A string.

...

star_database objects.

Value

A star_database object.

Examples


db1 <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()
db2 <- star_database(mrs_age_schema, ft_age) |>
  snake_case()
ct1 <- constellation("MRS", db1, db2)


db3 <- star_database(mrs_cause_schema_rpd, ft_cause_rpd) |>
  role_playing_dimension(
    rpd = "When",
    roles = c("When Available", "When Received")
  )

db4 <- star_database(mrs_age_schema_rpd, ft_age_rpd) |>
  role_playing_dimension(
    rpd = "When Arrived",
    roles = c("When Available")
  )
ct2 <- constellation("MRS", db3, db4)

Transform coordinates to point geometry

Description

From the coordinates defined in fields such as latitude and longitude, it returns a layer of points.

Usage

coordinates_to_point(table, lon_lat = c("intptlon", "intptlat"), crs = NULL)

Arguments

table

A tibble object.

lon_lat

A vector, name of longitude and latitude attributes.

crs

A coordinate reference system: integer with the EPSG code, or character with proj4string.

Details

If we start from a geographic layer, it initially transforms it into a table.

The CRS of the new layer is indicated. If a CRS is not indicated, it considers the layer's CRS by default and, if it is not a layer, it considers 4326 CRS (WGS84).

Value

An sf object.

Examples


us_state_point <-
  coordinates_to_point(us_layer_state,
                       lon_lat = c("INTPTLON", "INTPTLAT"))

Default disconnect function

Description

Disconnect function that is used if no other is indicated in the parameter of the deploy function.

Usage

default_disconnect(con)

Arguments

con

A DBI::DBIConnection object.

Value

TRUE, invisibly.

Define dimension in a `star_schema` object.

Description

Dimensions are part of a star_schema object. They can be defined directly as a dimension_schema object or by giving the name and a set of attributes.

Usage

define_dimension(
  schema,
  dimension,
  name,
  attributes,
  scd_nk,
  scd_t0,
  scd_t1,
  scd_t2,
  scd_t3,
  scd_t6,
  is_when,
  ...
)

## S3 method for class 'star_schema'
define_dimension(
  schema,
  dimension = NULL,
  name = NULL,
  attributes = NULL,
  scd_nk = NULL,
  scd_t0 = NULL,
  scd_t1 = NULL,
  scd_t2 = NULL,
  scd_t3 = NULL,
  scd_t6 = NULL,
  is_when = FALSE,
  ...
)

Arguments

schema

A star_schema object.

dimension

A dimension_schema object.

name

A string, name of the dimension.

attributes

A vector of attribute names.

scd_nk

A vector of attribute names, scd natural key.

scd_t0

A vector of attribute names, scd T0 attributes.

scd_t1

A vector of attribute names, scd T1 attributes.

scd_t2

A vector of attribute names, scd T2 attributes.

scd_t3

A vector of attribute names, scd T3 attributes.

scd_t6

A vector of attribute names, scd T6 attributes.

is_when

A boolean, is when dimension.

...

When dimension configuration parameters.

Value

A star_schema object.

Examples


s <- star_schema() |>
  define_dimension(
    name = "when",
    attributes = c(
      "Week Ending Date",
      "WEEK",
      "Year"
    )
  )

s <- star_schema()
d <- dimension_schema(
  name = "when",
  attributes = c(
    "Week Ending Date",
    "WEEK",
    "Year"
  )
)
s <- s |>
  define_dimension(d)

Define facts in a `star_schema` object.

Description

Facts are part of a star_schema object. They can be defined directly as a fact_schema object or by giving the name and a set of measures, which can be empty (facts without explicit measures).

Usage

define_facts(schema, facts, name, measures, agg_functions, nrow_agg)

## S3 method for class 'star_schema'
define_facts(
  schema,
  facts = NULL,
  name = NULL,
  measures = NULL,
  agg_functions = NULL,
  nrow_agg = NULL
)

Arguments

schema

A star_schema object.

facts

A fact_schema object.

name

A string, name of the fact.

measures

A vector of measure names.

agg_functions

A vector of aggregation function names, each one for its corresponding measure. If none is indicated, the default is SUM. Additionally they can be MAX or MIN.

nrow_agg

A string, name of a new measure that represents the COUNT of rows aggregated for each resulting row.

Details

Associated with each measure there is an aggregation function that can be SUM, MAX or MIN. AVG is not considered among the possible aggregation functions because calculating AVG over subsets of the data does not necessarily yield the AVG of the total data.

An additional measure corresponding to the COUNT of aggregated rows is added which, together with SUM, allows us to obtain the AVG if needed.
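A small numeric illustration of this point, in base R and with made-up values: the average of partial averages differs from the overall average when the groups have different sizes, whereas SUM and COUNT can be aggregated again and divided at the end.

```r
# Hypothetical measure values split into two groups of different size.
deaths <- c(10, 20, 30, 100)
group <- c("a", "a", "a", "b")

# Partial aggregates per group: SUM and COUNT (the role of `nrow_agg`).
part_sum <- tapply(deaths, group, sum)
part_count <- tapply(deaths, group, length)
part_avg <- part_sum / part_count

# Averaging the partial averages does not give the overall average ...
avg_of_avgs <- mean(part_avg)

# ... but SUM and COUNT can be aggregated again and then divided.
avg_from_sum_count <- sum(part_sum) / sum(part_count)
```

Here the partial averages are 20 and 100, so their mean is 60, while the overall mean obtained from the sums and counts is 160 / 4 = 40.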

Value

A star_schema object.

Examples


s <- star_schema() |>
  define_facts(
    name = "mrs_cause",
    measures = c(
      "Pneumonia and Influenza Deaths",
      "Other Deaths"
    )
  )

s <- star_schema()
f <- fact_schema(
  name = "mrs_cause",
  measures = c(
    "Pneumonia and Influenza Deaths",
    "Other Deaths"
  )
)
s <- s |>
  define_facts(f)

Define `geoattribute` of a dimension

Description

Define a set of attributes as a dimension's geoattribute. The set of attribute values must uniquely designate the instances of the given geographic layer.

Usage

define_geoattribute(db, dimension, attribute, from_layer, by, from_attribute)

## S3 method for class 'star_database'
define_geoattribute(
  db,
  dimension = NULL,
  attribute = NULL,
  from_layer = NULL,
  by = NULL,
  from_attribute = NULL
)

Arguments

db

A star_database object.

dimension

A string, dimension name.

attribute

A vector, attribute names.

from_layer

An sf object.

by

A vector of correspondence between the attributes of the dimension and those of the sf layer.

from_attribute

A vector, attribute names.

Details

The definition can be done in two ways: by associating the instances of the attributes with the instances of a geographic layer, or by deriving it from the geometry of previously defined geographic attributes.

Multiple attributes can be specified in the attribute parameter; the geographic attribute is the combination of all of them.

If defined from a layer (from_layer parameter), additionally the attributes used for the join between the tables (dimension and layer tables) must be indicated (by parameter).

If defined from another attribute, it should have the same or finer granularity, to obtain the result by grouping its instances. The considered attribute can be the pair that defines longitude and latitude.

If other geographic information has previously been associated with that attribute, the new information is considered and previous instances for which no new information is provided are also added.

If the geometry provided is polygons, a point layer is also generated.

Value

A star_database object.

Examples


db <- mrs_db |>
  define_geoattribute(
    dimension = "where",
    attribute = "state",
    from_layer = us_layer_state,
    by = "STUSPS"
  ) |>
  define_geoattribute(
    dimension = "where",
    attribute = "region",
    from_attribute = "state"
  )  |>
  define_geoattribute(
    dimension = "where",
    attribute = "city",
    from_attribute = c("long", "lat")
  )

Define geoattribute from a layer

Description

Define geoattribute from a layer

Usage

define_geoattribute_from_layer(
  db,
  dimension = NULL,
  attribute = NULL,
  geoatt = NULL,
  from_layer = NULL,
  by = NULL
)

Arguments

db

A star_database object.

dimension

A string, dimension name.

attribute

A string, attribute name.

geoatt

A string, geoattribute name.

from_layer

An sf object.

by

A vector of correspondence between the attributes of the dimension and those of the sf layer.

Value

A star_database object.

Delete in stars all operations found

Description

Delete in stars all operations found

Usage

delete_all_operations_found(stars, op)

Arguments

stars

A list of star_database objects.

op

A star_operations object.

Value

A list of star_database objects.

Delete an operation

Description

Delete an operation

Usage

delete_operation(op, op_name, name = NULL, details = NULL, details2 = NULL)

Arguments

op

A star_operation object.

op_name

A string, operation name.

name

A string, element name.

details

A vector of strings, operation details.

details2

A vector of strings, operation additional details.

Value

A star_operation object.

Delete a set of operations

Description

Delete a set of operations

Usage

delete_operation_set(op, op2)

Arguments

op

A star_operation object.

op2

A star_operation object.

Value

A star_operation object.

Deploy a star database in a relational database

Description

To deploy the star database, we must indicate a name for the deployment, a connection function and a disconnection function from the database. If it is the first deployment, we must also indicate the name of a local file where the star database will be stored.

Usage

deploy(db, name, connect, disconnect, file)

## S3 method for class 'star_database'
deploy(db, name, connect, disconnect = NULL, file = NULL)

Arguments

db

A star_database object.

name

A string, name of the deployment.

connect

A function that returns a DBI::DBIConnection object.

disconnect

A function that receives a DBI::DBIConnection object as a parameter and closes the connection.

file

A string, name of the file to store the object.

Details

If the disconnection function consists only of calling DBI::dbDisconnect(con), there is no need to indicate it; that behaviour is used by default.

As a result, it exports the tables from the star database to the connection database and from now on will keep them updated with each periodic refresh. Additionally, it will also keep a copy of the star database updated on file, which can be used when needed.
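The shape of the two functions expected by deploy can be sketched as follows. The names `db_file`, `my_connect` and `my_disconnect` are assumptions for the example; the functions are only defined here, not called, so no database is touched.

```r
# Hypothetical SQLite file used as the deployment target.
db_file <- tempfile("mrs", fileext = ".sqlite")

# Connection function: takes no arguments and returns a DBI connection.
my_connect <- function() {
  DBI::dbConnect(RSQLite::SQLite(), dbname = db_file)
}

# Disconnection function: receives the connection and closes it.
# A function like this one adds nothing to the default behaviour; a custom
# one is only worthwhile when extra clean-up is needed.
my_disconnect <- function(con) {
  DBI::dbDisconnect(con)
}
```

Passing functions rather than an open connection lets the connection be reopened whenever it is needed, for instance at each later refresh.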

Value

A star_database object.

Examples


mrs_rdb_file <- tempfile("mrs", fileext = ".rdb")
mrs_sqlite_file <- tempfile("mrs", fileext = ".sqlite")

mrs_sqlite_connect <- function() {
  DBI::dbConnect(RSQLite::SQLite(),
                 dbname = mrs_sqlite_file)
}

mrs_db <- mrs_db |>
  deploy(
    name = "mrs",
    connect = mrs_sqlite_connect,
    file = mrs_rdb_file
  )

`dimension_schema` S3 class

Description

Creates a dimension_schema object. We have to define its name and the set of attributes that make it up.

Usage

dimension_schema(
  name = NULL,
  attributes = NULL,
  scd_nk = NULL,
  scd_t0 = NULL,
  scd_t1 = NULL,
  scd_t2 = NULL,
  scd_t3 = NULL,
  scd_t6 = NULL,
  is_when = FALSE,
  ...
)

Arguments

name

A string, name of the dimension.

attributes

A vector of attribute names.

scd_nk

A vector of attribute names, scd natural key.

scd_t0

A vector of attribute names, scd T0 attributes.

scd_t1

A vector of attribute names, scd T1 attributes.

scd_t2

A vector of attribute names, scd T2 attributes.

scd_t3

A vector of attribute names, scd T3 attributes.

scd_t6

A vector of attribute names, scd T6 attributes.

is_when

A boolean, is when dimension.

...

When dimension configuration parameters.

Details

A dimension_schema object is part of a star_schema object and defines a dimension of the star schema.

Value

A dimension_schema object.

Examples


d <- dimension_schema(
  name = "when",
  attributes = c(
    "Week Ending Date",
    "WEEK",
    "Year"
  )
)

`dimension_table` S3 class

Description

Creates a dimension_table object. We have to define its surrogate key.

Usage

dimension_table(name = NULL, attributes = NULL, instances = NULL)

Arguments

name

A string, dimension name.

attributes

A vector of strings, attribute names.

instances

A flat table with the dimension instances.

Value

A dimension_table object.

Draw tables

Description

Draw the tables of the ROLAP star diagrams.

Usage

draw_tables(db)

## S3 method for class 'star_database'
draw_tables(db)

Arguments

db

A star_database object.

Value

An object with a print() method.

Examples


db <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()

db |>
  draw_tables()

`fact_schema` S3 class

Description

Creates a fact_schema object. The essential data are a name and a set of measures, which can be empty (facts without explicit measures). It is part of a star_schema object and defines the facts of the star schema.

Usage

fact_schema(
  name = NULL,
  measures = NULL,
  agg_functions = NULL,
  nrow_agg = NULL
)

Arguments

name

A string, name of the fact.

measures

A vector of measure names.

agg_functions

A vector of aggregation function names, each one for its corresponding measure. If none is indicated, the default is SUM. Additionally they can be MAX or MIN.

nrow_agg

A string, name of a new measure that represents the COUNT of rows aggregated for each resulting row.

Details

Associated with each measure there is an aggregation function that can be SUM, MAX or MIN. AVG is not considered among the possible aggregation functions: The reason is that calculating AVG by considering subsets of data does not necessarily yield the AVG of the total data.

An additional measure corresponding to the COUNT of aggregated rows is added which, together with SUM, allows us to obtain the AVG if needed.

Value

A fact_schema object.

Examples


f <- fact_schema(
  name = "mrs_cause",
  measures = c(
    "Pneumonia and Influenza Deaths",
    "Other Deaths"
  )
)

f <- fact_schema(
  name = "mrs_cause",
  measures = c(
    "Pneumonia and Influenza Deaths",
    "Other Deaths"
  ),
  agg_functions = c(
    "MAX",
    "SUM"
  ),
  nrow_agg = "Nrow"
)

`fact_table` S3 class

Description

Creates a fact_table object. We have to get its surrogate keys.

Usage

fact_table(
  name = NULL,
  surrogate_keys = NULL,
  agg = NULL,
  dim_int_names = NULL,
  instances = NULL
)

Arguments

name

A string, fact name.

surrogate_keys

A vector of strings, surrogate key names.

agg

A vector of strings, aggregation functions.

dim_int_names

A vector of strings, internal names of dimensions.

instances

A flat table with the fact instances.

Value

A fact_table object.

Filter dimension

Description

Allows you to define selection conditions for dimension rows.

Usage

filter_dimension(sq, name, ...)

## S3 method for class 'star_query'
filter_dimension(sq, name = NULL, ...)

Arguments

sq

A star_query object.

name

A string, name of the dimension.

...

Conditions, defined in exactly the same way as in dplyr::filter.

Details

Conditions can be defined on any attribute of the dimension (not only on attributes selected in the query for the dimension). The selection is made based on the function dplyr::filter. Conditions are defined in exactly the same way as in that function.

Value

A star_query object.

Examples


sq <- mrs_db |>
  star_query() |>
  filter_dimension(name = "when", week <= " 3") |>
  filter_dimension(name = "where", city == "Cambridge")

From attributes, leave only those contained in dimensions

Description

From attributes, leave only those contained in dimensions

Usage

filter_geo_attributes(db)

Arguments

db

A star_database object.

Value

A list of geodimensions.

From geodimensions, leave only those contained in the vector of names

Description

From geodimensions, leave only those contained in the vector of names

Usage

filter_geo_dimensions(db, dim)

Arguments

db

A star_database object.

dim

A vector of strings, dimension names.

Value

A list of geodimensions.

From rpd dimensions, leave only those contained in the vector of names

Description

From rpd dimensions, leave only those contained in the vector of names.

Usage

filter_rpd_dimensions(db, names)

Arguments

db

A star_database object.

names

A vector of strings, dimension names.

Value

A list of vectors of dimension names.

`flat_table` S3 class

Description

Creates a flat_table object.

Usage

flat_table(name = NULL, instances, unknown_value = NULL)

Arguments

name

A string.

instances

A tibble, table of instances.

unknown_value

A string, value used to replace empty and NA values in attributes.

Details

The objective is to allow the transformation of flat tables.

We indicate the name of the flat table and we can also give the value that will be used to replace NA or empty values.
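The effect of the replacement value can be pictured with a single attribute column in base R. The placeholder `"Unknown"` is an assumption chosen for the example; the package's own default is not stated here.

```r
# Hypothetical attribute column with a missing and an empty value.
city <- c("Boston", NA, "", "Hartford")

# Hypothetical placeholder; the package's own default is not stated here.
unknown_value <- "Unknown"

# Replace NA and empty strings by the placeholder.
city[is.na(city) | city == ""] <- unknown_value
```

Replacing both cases with one explicit value means that rows without a known attribute value still map to a well-defined dimension instance.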

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris)

ft <- flat_table('ft_num', ft_num)

Mortality Reporting System

Description

Selection of 20 rows from the 122 Cities Mortality Reporting System.

Usage

ft

Format

A tibble.

Details

The original dataset covers the period from 1962 to 2016. For each week, in 122 US cities, mortality figures by age group and cause, considered separately, are included. For the cause, a distinction is made only between pneumonia or influenza and others.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system

Mortality Reporting System by Age Group

Description

Selection of data from the 122 Cities Mortality Reporting System by age group.

Usage

ft_age

Format

A tibble.

Details

The original dataset covers from 1962 to 2016. For each week, in 122 US cities, mortality figures by age group and cause, considered separately, are included.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system

Examples

# The operations to obtain it from the `ft` data set are:

if (rlang::is_installed("stringr")) {
  ft_age <- ft |>
    dplyr::select(-`Pneumonia and Influenza Deaths`, -`All Deaths`) |>
    tidyr::gather("Age", "All Deaths", 7:11) |>
    dplyr::mutate(`All Deaths` = as.integer(`All Deaths`)) |>
    dplyr::mutate(Age = stringr::str_replace(Age, " \\(all cause deaths\\)", ""))
}

Mortality Reporting System by Age

Description

Selection of data from the 122 Cities Mortality Reporting System by age group, for the first 9 weeks of 1962 and 4 cities.

Usage

ft_age_rpd

Format

A tibble.

Details

The original dataset begins in 1962. For each week, in 122 US cities, mortality figures by age group and cause, considered separately, are included (i.e., the combination of age group and cause is not included). For the cause, the only distinction made is between pneumonia or influenza and other causes.

Two additional dates have been generated, which were not present in the original dataset.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system

Mortality Reporting System by Cause

Description

Selection of data from the 122 Cities Mortality Reporting System by cause, for the first 9 weeks of 1962 and 4 cities.

Usage

ft_cause_rpd

Format

A tibble.

Details

Two additional dates have been generated, which were not present in the original dataset.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system

Mortality Reporting System with numerical measures

Description

Selection of 20 rows from the 122 Cities Mortality Reporting System. Measures have been defined as integer values.

Usage

ft_num

Format

A tibble.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system

Examples

# The operations to obtain it from the `ft` data set are:

ft_num <- ft |>
  dplyr::mutate(`Pneumonia and Influenza Deaths` = as.integer(`Pneumonia and Influenza Deaths`)) |>
  dplyr::mutate(`All Deaths` = as.integer(`All Deaths`))

Generate refresh sql

Description

Generate SQL code for the first refresh operation.

Usage

generate_refresh_sql(refresh)

Arguments

refresh

A list of operations over tables.

Value

A vector of strings.

Generate table sql delete

Description

Generate SQL code for deleting instances in a table.

Usage

generate_table_sql_delete(table, instances)

Arguments

table

A string, table name.

instances

A tibble.

Value

A vector of strings.

Generate table sql insert

Description

Generate SQL code for inserting instances into a table.

Usage

generate_table_sql_insert(table, instances)

Arguments

table

A string, table name.

instances

A tibble.

Value

A string.
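
As a rough illustration of generating an insert statement from a table name and a table of instances, the following base R sketch builds one multi-row INSERT string. The helper name sql_insert_sketch, the quoting rules and the statement layout are all assumptions; the exact SQL produced by the package function is not documented here and may differ. Identifier quoting is not handled.

```r
# Hypothetical helper: build a single multi-row INSERT statement.
sql_insert_sketch <- function(table, instances) {
  cols <- lapply(instances, function(col) {
    out <- if (is.numeric(col)) {
      as.character(col)
    } else {
      # Quote strings, doubling any embedded single quote.
      paste0("'", gsub("'", "''", as.character(col), fixed = TRUE), "'")
    }
    out[is.na(col)] <- "NULL"
    out
  })
  # Paste the formatted columns together row by row.
  rows <- do.call(paste, c(unname(cols), sep = ", "))
  paste0(
    "INSERT INTO ", table, " (", paste(names(instances), collapse = ", "),
    ") VALUES ", paste0("(", rows, ")", collapse = ", "), ";"
  )
}

dim_rows <- data.frame(
  where_key = 1:2,
  city = c("Boston", "O'Fallon"),
  stringsAsFactors = FALSE
)
sql <- sql_insert_sketch("dim_where", dim_rows)
```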

Generate table sql update

Description

Generate SQL code for updating a table.

Usage

generate_table_sql_update(table, surrogate_keys, instances)

Arguments

table

A string, table name.

surrogate_keys

A string.

instances

A tibble.

Value

A vector of strings.

Get aggregate functions

Description

Get aggregate functions

Usage

## S3 method for class 'fact_schema'
get_agg_functions(schema)

Arguments

schema

A fact_schema object.

Value

A vector of strings.

Gets the operations performed on a dimension in all `star_database` objects

Description

Gets the operations performed on a dimension in all star_database objects

Usage

get_all_dimension_operations(op_name, name, stars)

Arguments

op_name

A string, operation name.

name

A string, element name.

stars

A list of star_database objects.

Value

A star_operations object.

Get the names of the attributes

Description

Obtain the names of the attributes in a flat table or a dimension in a star database.

Usage

## S3 method for class 'flat_table'
get_attribute_names(db, name = NULL, ordered = FALSE, as_definition = FALSE)

get_attribute_names(db, name, ordered, as_definition)

## S3 method for class 'star_database'
get_attribute_names(db, name, ordered = FALSE, as_definition = FALSE)

Arguments

db

A flat_table or star_database object.

name

A string, dimension name.

ordered

A boolean, sort names alphabetically.

as_definition

A boolean, get the names as a vector definition in R.

Details

If indicated, names can be obtained in alphabetical order or as a vector definition in R.

Value

A vector of strings or a string, attribute names.

Examples


names <- star_database(mrs_cause_schema, ft_num) |>
  get_attribute_names(name = "where")

names <- flat_table('iris', iris) |>
  get_attribute_names()
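
The difference between a vector of names and a "vector definition in R" can be sketched in base R as follows. The attribute names are illustrative and the quoting style of the generated string is an assumption; the string returned by the package may be formatted differently.

```r
# Illustrative attribute names (not taken from a real star database).
attribute_names <- c("Year", "WEEK", "City")

# ordered = TRUE corresponds to sorting the names alphabetically.
sorted_names <- sort(attribute_names)

# as_definition = TRUE corresponds to a string that can be pasted into code.
as_definition <- paste0(
  "c(", paste0("'", sorted_names, "'", collapse = ", "), ")"
)

# Evaluating the definition gives back the vector of names.
round_trip <- eval(parse(text = as_definition))
```

A definition string of this kind is convenient for copying the names into a script and then editing the list by hand.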

Get attribute names

Description

Get the attribute names.

Usage

## S3 method for class 'dimension_schema'
get_attribute_names_schema(schema)

Arguments

schema

A dimension_schema object.

Value

A string.

Get attribute names

Description

Get the attribute names.

Usage

## S3 method for class 'star_schema'
get_attribute_names_schema(schema)

Arguments

schema

A star_schema object.

Value

A string.

Get default unknown value

Description

Get the default unknown value.

Usage

get_default_unknown_value()

Value

A string.

Get the names of the deployments of a star database

Description

Obtain the names of the deployments of a star database.

Usage

get_deployment_names(db)

## S3 method for class 'star_database'
get_deployment_names(db)

Arguments

db

A star_database object.

Value

A vector of strings, deployment names.

Examples



mrs_rdb_file <- tempfile("mrs", fileext = ".rdb")
mrs_sqlite_file <- tempfile("mrs", fileext = ".sqlite")

mrs_sqlite_connect <- function() {
  DBI::dbConnect(RSQLite::SQLite(),
                 dbname = mrs_sqlite_file)
}

mrs_db <- mrs_db |>
  deploy(
    name = "mrs",
    connect = mrs_sqlite_connect,
    file = mrs_rdb_file
  )

names <- mrs_db |>
  get_deployment_names()

Get the names of the dimensions of a star database

Description

Obtain the names of the dimensions of a star database.

Usage

get_dimension_names(db, star)

## S3 method for class 'star_database'
get_dimension_names(db, star = NULL)

Arguments

db

A star_database object.

star

A string or integer, star database name or index in constellation.

Value

A vector of strings, dimension names.

Examples


names <- star_database(mrs_cause_schema, ft_num) |>
  get_dimension_names()

Get dimension table

Description

Get the table for the dimension indicated by its name.

Usage

get_dimension_table(db, name)

## S3 method for class 'star_database'
get_dimension_table(db, name = NULL)

Arguments

db

A star_database object.

name

A string, dimension name.

Value

A tibble, dimension table.

Examples


table <- star_database(mrs_cause_schema, ft_num) |>
  get_dimension_table("where")

Get existing fact instances

Description

From the planned update, it obtains the instances of the update facts that are already included in the star database facts to be updated.

Usage

get_existing_fact_instances(sdbu)

## S3 method for class 'star_database_update'
get_existing_fact_instances(sdbu)

Arguments

sdbu

A star_database_update object.

Details

Refresh operations usually include only new instances in fact tables, but repeated instances may appear: they may have different values in the measures but the same values in the dimension foreign keys. When the update occurs, we need to determine what happens to these instances.

Value

A tibble object.

Examples


f1 <-
  flat_table('ft_num', ft_cause_rpd[ft_cause_rpd$City != 'Cambridge' &
                                      ft_cause_rpd$WEEK != '4',]) |>
  as_star_database(mrs_cause_schema_rpd) |>
  role_playing_dimension(rpd = "When",
                         roles = c("When Available", "When Received"))
f2 <- flat_table('ft_num2', ft_cause_rpd[ft_cause_rpd$City != 'Bridgeport' &
                                           ft_cause_rpd$WEEK != '2',])
f2 <- f2 |>
  update_according_to(f1)
fact_instances <- f2 |>
  get_existing_fact_instances()
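
The notion of an "existing" fact instance, one whose dimension foreign keys already appear in the fact table even if the measures differ, can be sketched with a base R semi-join. The table and column names below are hypothetical and the code is not the package's implementation.

```r
# Facts already in the star database (hypothetical keys and measure).
existing <- data.frame(
  when_key = c(1L, 1L, 2L),
  where_key = c(1L, 2L, 1L),
  deaths = c(10L, 7L, 3L)
)
# Facts arriving with the update.
update <- data.frame(
  when_key = c(1L, 3L),
  where_key = c(2L, 1L),
  deaths = c(9L, 5L)
)
keys <- c("when_key", "where_key")

# Keep the update rows whose foreign key combination already exists.
repeated <- merge(update, unique(existing[keys]), by = keys)
```

Here only the row with keys (1, 2) is returned: it carries a different measure value (9 instead of 7) but the same foreign keys as an existing fact.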

Get fact name

Description

Get fact name

Usage

## S3 method for class 'fact_schema'
get_fact_name(schema)

Arguments

schema

A fact_schema object.

Value

A string.

Get the names of the facts of a star database

Description

Obtain the names of the facts of a star database.

Usage

get_fact_names(db)

## S3 method for class 'star_database'
get_fact_names(db)

Arguments

db

A star_database object.

Value

A vector of strings, fact names.

Examples


names <- star_database(mrs_cause_schema, ft_num) |>
  get_fact_names()

Get geoattribute geometries

Description

For each geoattribute, get its geometries.

Usage

get_geoattribute_geometries(db, dimension, attribute)

## S3 method for class 'star_database'
get_geoattribute_geometries(db, dimension = NULL, attribute = NULL)

Arguments

db

A star_database object.

dimension

A string, dimension name.

attribute

A vector, attribute names.

Details

If no dimension name is given, the first dimension that has geoattributes defined is used.

Value

A vector of strings.

Examples


db <- mrs_db |>
  define_geoattribute(
    dimension = "where",
    attribute = "state",
    from_layer = us_layer_state,
    by = "STUSPS"
  )

geometries <- db |>
  get_geoattribute_geometries(
    dimension = "where",
    attribute = "state"
  )

Get geoattribute name

Description

Get the name of the geoattribute from a vector of attribute names.

Usage

get_geoattribute_name(attribute)

Arguments

attribute

A vector, attribute names.

Value

A string.

Get geoattributes

Description

For each dimension, get a list of available geoattributes.

Usage

get_geoattributes(db)

## S3 method for class 'star_database'
get_geoattributes(db)

Arguments

db

A star_database object.

Value

A list of dimension geoattributes.

Examples


db <- mrs_db |>
  define_geoattribute(
    dimension = "where",
    attribute = "state",
    from_layer = us_layer_state,
    by = "STUSPS"
  )

attributes <- db |>
    get_geoattributes()

Get geographic information layer

Description

Get the geographic information layer from a geolayer object.

Usage

get_layer(gl, keep_all_variables_na)

## S3 method for class 'geolayer'
get_layer(gl, keep_all_variables_na = FALSE)

Arguments

gl

A geolayer object.

keep_all_variables_na

A boolean, keep rows with all variables NA.

Details

By default, rows that are NA for all variables are eliminated.

Value

An sf object.

Examples


gl <- mrs_db_geo |>
  as_geolayer()

l <- gl |>
  get_layer()
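
The default behaviour described above, dropping rows in which every variable is NA, can be sketched in base R. A plain data frame stands in for the sf layer, and the column names are hypothetical.

```r
# A plain data frame used as a stand-in for the layer's attribute table.
layer_data <- data.frame(
  state = c("CT", "MA", "RI"),
  deaths = c(12L, NA, 5L),
  cases = c(3L, NA, NA)
)
variables <- c("deaths", "cases")

# A row is dropped only when all of its variables are NA.
all_na <- rowSums(!is.na(layer_data[variables])) == 0
kept <- layer_data[!all_na, , drop = FALSE]
```

The row for "RI" is kept because one of its variables has a value; only "MA", with no values at all, is removed. Setting keep_all_variables_na = TRUE corresponds to skipping this filter.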

Get layer from attribute

Description

Gets the geographic layer associated with the from_attribute at the level of the indicated attributes.

Usage

get_layer_from_attribute(
  db,
  dimension = NULL,
  attribute = NULL,
  from_attribute = NULL
)

Arguments

db

A star_database object.

dimension

A string, dimension name.

attribute

A string, attribute name.

from_attribute

A string, attribute name.

Value

A star_database object.

Get layer geometry

Description

Get the geometry of a layer. The result is only valid if the geometry can be interpreted as one of two types: point or polygon.

Usage

get_layer_geometry(layer)

Arguments

layer

An sf object.

Value

A string.

Examples


geometry <- get_layer_geometry(us_layer_state)

Get lookup tables

Description

From the planned update, it obtains the lookup tables used to define the data.

Usage

get_lookup_tables(sdbu)

## S3 method for class 'star_database_update'
get_lookup_tables(sdbu)

Arguments

sdbu

A star_database_update object.

Value

A list of flat_table objects.

Examples


f1 <- flat_table('ft_num', ft_cause_rpd) |>
  as_star_database(mrs_cause_schema_rpd)
f2 <- flat_table('ft_num2', ft_cause_rpd) |>
  update_according_to(f1)
ft <- f2 |>
  get_lookup_tables()

Get the names of the measures

Description

Obtain the names of the measures in a flat table or in a star database.

Usage

## S3 method for class 'flat_table'
get_measure_names(db, name = NULL, ordered = FALSE, as_definition = FALSE)

get_measure_names(db, name, ordered, as_definition)

## S3 method for class 'star_database'
get_measure_names(db, name = NULL, ordered = FALSE, as_definition = FALSE)

Arguments

db

A flat_table or star_database object.

name

A string, dimension name.

ordered

A boolean, sort names alphabetically.

as_definition

A boolean, get the names as a vector definition in R.

Value

A vector of strings or a string, measure names.

Examples


names <- star_database(mrs_cause_schema, ft_num) |>
  get_measure_names()

names <- flat_table('iris', iris) |>
  get_measure_names()

Get measure names

Description

Get the names of the measures defined in the fact schema.

Usage

## S3 method for class 'fact_schema'
get_measure_names_schema(schema)

Arguments

schema

A fact_schema object.

Value

A vector of strings.

Get measure names

Description

Get the names of the measures defined in the fact schema.

Usage

## S3 method for class 'star_schema'
get_measure_names_schema(schema)

Arguments

schema

A star_schema object.

Value

A vector of strings.

Get new dimension instances

Description

From the planned update, it obtains the instances of the update dimensions that are not included in the star database dimensions to be updated.

Usage

get_new_dimension_instances(sdbu)

## S3 method for class 'star_database_update'
get_new_dimension_instances(sdbu)

Arguments

sdbu

A star_database_update object.

Value

A list of tibble objects.

Examples


f1 <-
  flat_table('ft_num', ft_cause_rpd[ft_cause_rpd$City != 'Cambridge' &
                                      ft_cause_rpd$WEEK != '4',]) |>
  as_star_database(mrs_cause_schema_rpd) |>
  role_playing_dimension(rpd = "When",
                         roles = c("When Available", "When Received"))
f2 <- flat_table('ft_num2', ft_cause_rpd[ft_cause_rpd$City != 'Bridgeport' &
                                           ft_cause_rpd$WEEK != '2',])
f2 <- f2 |>
  update_according_to(f1)
dim_instances <- f2 |>
  get_new_dimension_instances()
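
Finding the dimension instances of the update that are not yet in the star database amounts to an anti-join on the dimension attributes. The following base R sketch uses hypothetical tables and attribute names; it is not the package's implementation.

```r
# Dimension table already in the star database (hypothetical content).
dim_where <- data.frame(
  city = c("Boston", "Hartford"),
  state = c("MA", "CT"),
  stringsAsFactors = FALSE
)
# Dimension instances derived from the update data.
update_where <- data.frame(
  city = c("Hartford", "Cambridge"),
  state = c("CT", "MA"),
  stringsAsFactors = FALSE
)

# Build one comparison string per row; "|" is assumed not to occur in values.
row_id <- function(df) paste(df$city, df$state, sep = "|")
is_known <- row_id(update_where) %in% row_id(dim_where)
new_instances <- update_where[!is_known, , drop = FALSE]
```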

Get the `star_operation` object row that follows the given current one

Description

Returns the row of a star_operation object that follows the current one, given in the actual parameter.

Usage

get_next_operation(op, op_name, name = NULL, actual = NULL)

Arguments

op

A star_operation object.

op_name

A string, operation name.

name

A string, element name.

actual

A star_operation object.

Value

A data frame.

Get number of rows aggregate column

Description

Get number of rows aggregate column

Usage

## S3 method for class 'fact_schema'
get_nrow_agg(schema)

Arguments

schema

A fact_schema object.

Value

A string.

Get the names of the primary key attributes of a flat table

Description

Obtain the names of the attributes that form the primary key of a flat table, if defined.

Usage

get_pk_attribute_names(ft, as_definition)

## S3 method for class 'flat_table'
get_pk_attribute_names(ft, as_definition = FALSE)

Arguments

ft

A flat_table object.

as_definition

A boolean, get the names as a vector definition in R.

Value

A vector of strings or a tibble, attribute names.

Examples


ft <- flat_table('iris', iris) |>
  lookup_table(
    measures = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    measure_agg = c('MAX', 'MIN', 'SUM', 'MEAN')
  )
names <- ft |>
  get_pk_attribute_names()

Get point geometry

Description

Obtain point geometry from polygon geometry.

Usage

get_point_geometry(layer)

Arguments

layer

An sf object.

Value

An sf object.

Examples


layer <-
  get_point_geometry(us_layer_state)

Get the names of the role playing dimensions

Description

Role playing dimensions are defined in star_databases. When integrating several star_databases to form a constellation, role playing dimensions are also integrated. This function allows you to see the result.

Usage

get_role_playing_dimension_names(db)

## S3 method for class 'star_database'
get_role_playing_dimension_names(db)

Arguments

db

A constellation object.

Value

A list of vectors of strings with dimension names.

Examples


db1 <- star_database(mrs_cause_schema_rpd, ft_cause_rpd) |>
  role_playing_dimension(
    rpd = "When",
    roles = c("When Available", "When Received")
  )

db2 <- star_database(mrs_age_schema_rpd, ft_age_rpd) |>
  role_playing_dimension(
    rpd = "When Arrived",
    roles = c("When Available")
  )
rpd <- constellation("MRS", db1, db2) |>
  get_role_playing_dimension_names()

Get rpd dimensions of a dimension

Description

Get rpd dimensions of a dimension

Usage

get_rpd_dimensions(db, name)

Arguments

db

A star_database object.

name

A string, dimension name.

Value

A vector of dimension names.

Get similar attribute values combination

Description

Get sets of attribute values that differ only by tildes, spaces, or punctuation marks, for the combination of the given set of attributes. If no attributes are indicated, they are all considered together.

Usage

## S3 method for class 'flat_table'
get_similar_attribute_values(
  db,
  name = NULL,
  attributes = NULL,
  exclude_numbers = FALSE,
  col_as_vector = NULL
)

get_similar_attribute_values(
  db,
  name,
  attributes,
  exclude_numbers,
  col_as_vector
)

## S3 method for class 'star_database'
get_similar_attribute_values(
  db,
  name = NULL,
  attributes = NULL,
  exclude_numbers = FALSE,
  col_as_vector = NULL
)

Arguments

db

A flat_table or star_database object.

name

A string, dimension name.

attributes

A vector of strings, attribute names.

exclude_numbers

A boolean, exclude numbers from comparison.

col_as_vector

A string, name of the column to include a vector of values.

Details

For star databases, a list of dimensions can be indicated; otherwise, all dimensions are considered. If a dimension is indicated, a list of attributes to be considered in it can also be indicated.

You can indicate that numbers are to be ignored in the comparison.

If a name is indicated in the col_as_vector parameter, it includes a column with the data in vector form to be used in other functions.

Value

A vector of tibble objects with similar instances.

Examples


instances <- star_database(mrs_cause_schema, ft_num) |>
  get_similar_attribute_values(name = "where")

db <- star_database(mrs_cause_schema, ft_num)
db$dimensions$where$table$City[2] <- " BrId  gEport "
instances <- db |>
  get_similar_attribute_values("where")

db <- star_database(mrs_cause_schema, ft_num)
db$dimensions$where$table$City[2] <- " BrId  gEport "
instances <- db |>
  get_similar_attribute_values("where",
    attributes = c("City", "State"),
    col_as_vector = "As a vector")

ft <- flat_table('iris', iris)
ft$table$Species[20] <- "se.Tosa."
ft$table$Species[60] <- "Versicolor"
instances <- ft |>
  get_similar_attribute_values()
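
The idea of values that differ only by spaces or punctuation marks can be sketched in base R by comparing a normalised key. Folding case is an assumption suggested by the " BrId  gEport " example above, and accent handling is left out of the sketch; the package may normalise values differently.

```r
values <- c("Bridgeport", " BrId  gEport ", "Boston", "se.Tosa.", "setosa")

# Normalised key: drop spaces and punctuation, then fold case (assumption).
key <- tolower(gsub("[[:space:][:punct:]]", "", values))

# Group the original values by key and keep groups with several variants.
groups <- split(values, key)
similar <- groups[lengths(groups) > 1]
```

Each element of similar holds the original spellings that collapse to the same key, which is the kind of set one would then review and unify.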

Get similar values for individual attributes

Description

Get sets of attribute values for individual attributes that differ only by tildes, spaces, or punctuation marks. If no attributes are indicated, all are considered.

Usage

## S3 method for class 'flat_table'
get_similar_attribute_values_individually(
  db,
  name = NULL,
  attributes = NULL,
  exclude_numbers = FALSE,
  col_as_vector = NULL
)

get_similar_attribute_values_individually(
  db,
  name,
  attributes,
  exclude_numbers,
  col_as_vector
)

## S3 method for class 'star_database'
get_similar_attribute_values_individually(
  db,
  name = NULL,
  attributes = NULL,
  exclude_numbers = FALSE,
  col_as_vector = NULL
)

Arguments

db

A flat_table or star_database object.

name

A vector of strings, dimension names.

attributes

A vector of strings, attribute names.

exclude_numbers

A boolean, exclude numbers from comparison.

col_as_vector

A string, name of the column to include a vector of values.

Details

For star databases, if no dimension name is indicated, all dimensions are considered.

You can indicate that numbers are to be ignored in the comparison.

If a name is indicated in the col_as_vector parameter, it includes a column with the data in vector form to be used in other functions.

Value

A vector of tibble objects with similar instances.

Examples


instances <- star_database(mrs_cause_schema, ft_num) |>
  get_similar_attribute_values_individually(name = c("where", "when"))

instances <- star_database(mrs_cause_schema, ft_num) |>
  get_similar_attribute_values_individually()

ft <- flat_table('iris', iris)
ft$table$Species[20] <- "se.Tosa."
ft$table$Species[60] <- "Versicolor"
instances <- ft |>
  get_similar_attribute_values_individually()

Get similar values in a table

Description

Get similar values in a table

Usage

get_similar_values_table(table, attributes, exclude_numbers, col_as_vector)

Arguments

table

A tibble object.

attributes

A vector of strings, attribute names.

exclude_numbers

A boolean, exclude numbers from comparison.

col_as_vector

A string, name of the column to include a vector of values.

Value

A vector of tibble objects with similar instances.

Get star database

Description

It obtains the star database: For updates, the one defined from the data; for constellations, the one indicated by the parameter.

Usage

get_star_database(db, name)

## S3 method for class 'star_database_update'
get_star_database(db, name = NULL)

## S3 method for class 'star_database'
get_star_database(db, name)

Arguments

db

A star_database_update or star_database object.

name

A string, star database name (fact name).

Value

A star_database object.

Examples


f1 <- flat_table('ft_num', ft_cause_rpd) |>
  as_star_database(mrs_cause_schema_rpd)
f2 <- flat_table('ft_num2', ft_cause_rpd) |>
  update_according_to(f1)
st <- f2 |>
  get_star_database()

db1 <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()
db2 <- star_database(mrs_age_schema, ft_age) |>
  snake_case()
ct <- constellation("MRS", db1, db2)
names <- ct |>
  get_fact_names()
st <- ct |>
  get_star_database(names[1])

Get star query schema

Description

Obtain the star database schema to perform queries.

Usage

get_star_query_schema(db)

Arguments

db

A star_database object.

Value

A star database schema, list of fact and dimension schemas.

Get star schema

Description

From the planned update, it obtains the star schema used to define the data.

Usage

get_star_schema(sdbu)

## S3 method for class 'star_database_update'
get_star_schema(sdbu)

Arguments

sdbu

A star_database_update object.

Value

A star_schema object.

Examples


f1 <- flat_table('ft_num', ft_cause_rpd) |>
  as_star_database(mrs_cause_schema_rpd)
f2 <- flat_table('ft_num2', ft_cause_rpd) |>
  update_according_to(f1)
st <- f2 |>
  get_star_schema()

Get surrogate key names

Description

Get the names of the surrogate keys defined in the dimension table.

Usage

## S3 method for class 'dimension_table'
get_surrogate_key(dimension_table)

Arguments

dimension_table

A dimension_table object.

Value

A vector of strings.

Get the table of the flat table

Description

Obtain the table of a flat table.

Usage

get_table(ft)

## S3 method for class 'flat_table'
get_table(ft)

Arguments

ft

A flat_table object.

Value

A tibble, the table.

Examples


table <- flat_table('iris', iris) |>
  get_table()

Get the names of the tables of a star database

Description

Obtain the names of the tables of a star database.

Usage

get_table_names(db)

## S3 method for class 'star_database'
get_table_names(db)

Arguments

db

A star_database object.

Value

A vector of strings, table names.

Examples


names <- star_database(mrs_cause_schema, ft_num) |>
  get_table_names()

Get transformation function code

Description

From the planned update, it obtains the function with the source code of the transformations performed on the original data in string vector format.

Usage

get_transformation_code(sdbu)

## S3 method for class 'star_database_update'
get_transformation_code(sdbu)

Arguments

sdbu

A star_database_update object.

Value

A vector of strings.

Examples


f1 <- flat_table('ft_num', ft_cause_rpd) |>
  as_star_database(mrs_cause_schema_rpd) |>
  replace_attribute_values(
    name = "When Available",
    old = c('1962', '11', '1962-03-14'),
    new = c('1962', '3', '1962-01-15')
  ) |>
  group_dimension_instances(name = "When")
f2 <- flat_table('ft_num2', ft_cause_rpd) |>
  update_according_to(f1)
code <- f2 |>
  get_transformation_code()

Get transformation function file

Description

From the planned update, it obtains the function with the source code of the transformations performed on the original data in file format.

Usage

get_transformation_file(sdbu, file)

## S3 method for class 'star_database_update'
get_transformation_file(sdbu, file = NULL)

Arguments

sdbu

A star_database_update object.

file

A string, file name.

Value

A string, file name.

Examples


f1 <- flat_table('ft_num', ft_cause_rpd) |>
  as_star_database(mrs_cause_schema_rpd) |>
  replace_attribute_values(
    name = "When Available",
    old = c('1962', '11', '1962-03-14'),
    new = c('1962', '3', '1962-01-15')
  ) |>
  group_dimension_instances(name = "When")
f2 <- flat_table('ft_num2', ft_cause_rpd) |>
  update_according_to(f1)
file <- f2 |>
  get_transformation_file()

Get unique attribute values

Description

Get unique set of values for the given attributes. If no attributes are indicated, all are considered.

Usage

## S3 method for class 'flat_table'
get_unique_attribute_values(
  db,
  name = NULL,
  attributes = NULL,
  col_as_vector = NULL
)

get_unique_attribute_values(db, name, attributes, col_as_vector)

## S3 method for class 'star_database'
get_unique_attribute_values(
  db,
  name = NULL,
  attributes = NULL,
  col_as_vector = NULL
)

Arguments

db

A flat_table or star_database object.

name

A string, dimension name.

attributes

A vector of strings, attribute names.

col_as_vector

A string, name of the column to include a vector of values.

Details

If we work on a star database, a dimension must be indicated.

Value

A vector of tibble objects with unique instances.

Examples


instances <- star_database(mrs_cause_schema, ft_num) |>
  get_unique_attribute_values()

instances <- star_database(mrs_cause_schema, ft_num) |>
  get_unique_attribute_values(name = "where")

instances <- star_database(mrs_cause_schema, ft_num) |>
  get_unique_attribute_values("where",
    attributes = c("REGION", "State"))

instances <- flat_table('iris', iris) |>
  get_unique_attribute_values()

Get unique values in a table

Description

Get unique values in a table

Usage

get_unique_values_table(table, attributes, col_as_vector)

Arguments

table

A tibble object.

attributes

A vector of strings, attribute names.

col_as_vector

A string, name of the column to include a vector of values.

Value

A vector of tibble objects with unique instances.

Get the unknown value defined

Description

Obtain the unknown value of a flat table.

Usage

get_unknown_value_defined(ft)

## S3 method for class 'flat_table'
get_unknown_value_defined(ft)

Arguments

ft

A flat_table object.

Value

A string.

Examples


table <- flat_table('iris', iris) |>
  get_unknown_value_defined()

Get unknown attribute values

Description

Obtain the instances that have an empty or unknown value in any given attribute. If no attribute is given, all are considered.

Usage

get_unknown_values(ft, attributes, col_as_vector)

## S3 method for class 'flat_table'
get_unknown_values(ft, attributes = NULL, col_as_vector = NULL)

Arguments

ft

A flat_table object.

attributes

A vector of strings, attribute names.

col_as_vector

A string, name of the column to include a vector of values.

Details

If a name is indicated in the col_as_vector parameter, it includes a column with the data in vector form to be used in other functions.

Value

A tibble with unknown values in instances.

Examples


iris2 <- iris
iris2[10, 'Species'] <- NA
instances <- flat_table('iris', iris2) |>
  get_unknown_values()
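
Selecting the instances with an empty or NA value in any of the given attributes can be sketched in base R as follows. The variable names are hypothetical and the code only illustrates the selection rule, not the package's implementation.

```r
iris2 <- iris
iris2$Species <- as.character(iris2$Species)
iris2[10, "Species"] <- NA
iris2[20, "Species"] <- ""

attribute_cols <- "Species"

# TRUE for rows where any of the chosen attributes is NA or empty.
is_unknown <- Reduce(
  `|`,
  lapply(iris2[attribute_cols], function(v) is.na(v) | v == "")
)
unknown_rows <- iris2[is_unknown, , drop = FALSE]
```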

Get variable description

Description

Obtain a description of the variables whose name is indicated. If no name is indicated, all are returned.

Usage

get_variable_description(gl, name, only_values)

## S3 method for class 'geolayer'
get_variable_description(gl, name = NULL, only_values = FALSE)

Arguments

gl

A geolayer object.

name

A string vector.

only_values

A boolean, add names to component values.

Details

Using the parameter only_values, we can obtain only the combination of values or also the combination of names with values.

Value

A string vector.

Examples


gl <- mrs_db_geo |>
  as_geolayer()

vd <- gl |>
  get_variable_description()

Get the variables layer

Description

The variables layer includes the names and descriptions, through various fields, of the variables contained in the geolayer.

Usage

get_variables(gl)

## S3 method for class 'geolayer'
get_variables(gl)

Arguments

gl

A geolayer object.

Details

The way to select the variables we want to work with is to filter this layer and subsequently set it as the object's variables layer using the set_variables() function.

Value

A tibble object.

Examples


gl <- mrs_db_geo |>
  as_geolayer()

v <- gl |>
  get_variables()

Group table instances by keys aggregating the measures using the corresponding aggregation function.

Description

Group table instances by keys aggregating the measures using the corresponding aggregation function.

Usage

group_by_keys(table, keys, measures, agg_functions, nrow_agg)

Arguments

table

A tibble, the instances table.

keys

A vector of strings, key names to group by.

measures

A vector of strings, measures to aggregate.

agg_functions

A vector of strings, aggregate functions.

nrow_agg

A string, name of a new column to count the number of rows aggregated.

Value

A tibble.
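
Grouping by keys with one aggregation function per measure, plus a column counting the aggregated rows, can be sketched in base R. The helper name group_by_keys_sketch is hypothetical, and mapping upper-case function names such as 'SUM' to R functions through tolower() is an assumption made for the illustration.

```r
# Hypothetical helper mirroring the documented parameters.
group_by_keys_sketch <- function(table, keys, measures, agg_functions, nrow_agg) {
  groups <- split(table, table[keys], drop = TRUE)
  rows <- lapply(groups, function(g) {
    out <- g[1, keys, drop = FALSE]
    for (i in seq_along(measures)) {
      # Assumption: 'SUM', 'MAX', ... map to sum(), max(), ...
      f <- match.fun(tolower(agg_functions[i]))
      out[[measures[i]]] <- f(g[[measures[i]]])
    }
    out[[nrow_agg]] <- nrow(g)
    out
  })
  result <- do.call(rbind, rows)
  rownames(result) <- NULL
  result
}

facts <- data.frame(
  city = c("A", "A", "B"),
  week = c("1", "1", "1"),
  deaths = c(2L, 3L, 5L),
  stringsAsFactors = FALSE
)
grouped <- group_by_keys_sketch(
  facts,
  keys = c("city", "week"),
  measures = "deaths",
  agg_functions = "SUM",
  nrow_agg = "nrow_agg"
)
```

The two rows for city "A" are merged into one with deaths summed, and nrow_agg records how many original rows each result row represents.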

Group instances of a dimension

Description

After changes in values in the instances of a dimension, groups the instances and, if necessary, also the related facts.

Usage

group_dimension_instances(db, name)

## S3 method for class 'star_database'
group_dimension_instances(db, name)

Arguments

db

A star_database object.

name

A string, dimension name.

Value

A star_database object.

Examples


db <- star_database(mrs_cause_schema, ft_num) |>
  group_dimension_instances(name = "where")

Group facts

Description

Once the foreign keys have possibly been replaced, group the rows of facts.

Usage

group_facts(db)

Arguments

db

A star_database object.

Refresh a star database in a constellation

Description

Incremental update of a star database from the star database generated with the new data.

Usage

incremental_refresh(db, sdbu, existing_instances, replace_transformations, ...)

## S3 method for class 'star_database'
incremental_refresh(
  db,
  sdbu,
  existing_instances = "ignore",
  replace_transformations = FALSE,
  ...
)

Arguments

db

A star_database object.

sdbu

A star_database_update object.

existing_instances

A string, operation to be carried out on the instances of already existing facts. The possible values are: "ignore", "replace", "group" and "delete".

replace_transformations

A boolean, replace the star_database transformation code with the star_database_update one.

...

Internal test parameters.

Details

The update may contain data that already exists in the facts. The existing_instances parameter indicates what to do with it: replace it, group it, delete it or ignore it.

If obtaining the update data required new transformations (which were not necessary to obtain the star database), we can indicate that these become the new transformation operations for the star database. These operations are not applied to the star database; they will only be applied to new periodic updates.

Value

A star_database object.

Examples


db <-
  flat_table('ft_num', ft_cause_rpd[ft_cause_rpd$City != 'Cambridge' &
                                      ft_cause_rpd$WEEK != '4',]) |>
  as_star_database(mrs_cause_schema_rpd) |>
  role_playing_dimension(rpd = "When",
                         roles = c("When Available", "When Received"))
f2 <- flat_table('ft_num2', ft_cause_rpd[ft_cause_rpd$City != 'Bridgeport' &
                                           ft_cause_rpd$WEEK != '2',])
f2 <- f2 |>
  update_according_to(db)

db <- db |>
  incremental_refresh(f2)

Integrate two geodimensions

Description

Integrate two geodimensions

Usage

integrate_geo_dimensions(gd1, gd2)

Arguments

gd1

A geodimension.

gd2

A geodimension.

Value

A geodimension.

Interpret operation

Description

Interprets an add_custom_column operation. Fields: operation, name, details, details2. Values: "add_custom_column", name, as.character(list(definition)).

Usage

interpret_operation_add_custom_column(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Details

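The function definition is stored as text and rebuilt when the operation is interpreted: given f <- function(...), the call g <- as.character(list(f)) stores it as a string and h <- eval(parse(text = g)) restores the function.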
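The round trip between a function and its text can be checked in base R. This is a minimal sketch: the column definition and the toy data frame are illustrative only, and it does not show how rolap itself stores or runs the operation.

```r
# A column definition, similar in spirit to one passed to add_custom_column().
f <- function(table) paste0(table$City, " - ", table$State)

# Store the function as text (one string per list element).
g <- as.character(list(f))

# Rebuild a function from the stored text.
h <- eval(parse(text = g))

# Toy data to compare the original and the rebuilt function.
cities <- data.frame(City = c("Boston", "Hartford"), State = c("MA", "CT"))
same_result <- identical(f(cities), h(cities))
```

The rebuilt function h is evaluated in the calling environment, so a definition that depends on variables outside its own arguments may behave differently after the round trip.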

Value

A flat table.

Interpret operation

Description

Interpret operation

Usage

interpret_operation_flat_table(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a group_dimension_instances operation. Fields: operation, name. Values: "group_dimension_instances", name.

Usage

interpret_operation_group_dimension_instances(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a join_lookup_table operation. Fields: operation, name, details, details2. Values: "join_lookup_table", fk_attributes, pos.

Usage

interpret_operation_join_lookup_table(ft, op, lookup_tables, file, last_op)

Arguments

ft

flat table

op

operation

lookup_tables

lookup tables

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a lookup_table operation. Fields: operation, name, details, details2. Values: "lookup_table", pk_attributes, c(attributes, '|', attribute_agg), c(measures, '|', measure_agg).

Usage

interpret_operation_lookup_table(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a remove_instances_without_measures operation. Fields: operation. Values: "remove_instances_without_measures".

Usage

interpret_operation_remove_instances_without_measures(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a replace_attribute_values operation. Fields: operation, name, details, details2. Values: either "replace_attribute_values", attributes, old, new or "replace_attribute_values", c(name, "|", attributes), old, new.

Usage

interpret_operation_replace_attribute_values(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a replace_empty_values operation. Fields: operation, name, details, details2. Values: "replace_empty_values", attributes, empty_values.

Usage

interpret_operation_replace_empty_values(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a replace_string operation. Fields: operation, name, details, details2. Values: "replace_string", attributes, string, replacement.

Usage

interpret_operation_replace_string(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a replace_unknown_values operation. Fields: operation, name, details, details2. Values: "replace_unknown_values", attributes, value.

Usage

interpret_operation_replace_unknown_values(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a role_playing_dimension operation. Fields: operation, name, details, details2. Values: "role_playing_dimension", rpd, roles, att_names.

Usage

interpret_operation_role_playing_dimension(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a select_attributes operation. Fields: operation, name. Values: "select_attributes", attributes.

Usage

interpret_operation_select_attributes(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a select_instances operation. Fields: operation, name, details, details2. Values: "select_instances", not, attributes, unlist(values).

Usage

interpret_operation_select_instances(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a select_instances_by_comparison operation. Fields: operation, name, details, details2. Values: "select_instances_by_comparison", c(not, n_ele_set), unlist(attributes), c(unlist(comparisons), unlist(values)).

Usage

interpret_operation_select_instances_by_comparison(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a select_measures operation. Fields: operation, name, details. Values: "select_measures", measures, na_rm.

Usage

interpret_operation_select_measures(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a separate_measures operation. Fields: operation, name, details, details2. Values: "separate_measures", measures, c(name, names), na_rm.

Usage

interpret_operation_separate_measures(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a set_attribute_names operation. Fields: operation, name, details, details2. Values: "set_attribute_names", name, old, new.

Usage

interpret_operation_set_attribute_names(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a set_measure_names operation. Fields: operation, name, details, details2. Values: "set_measure_names", name, old, new.

Usage

interpret_operation_set_measure_names(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a snake_case operation. Fields: operation. Values: "snake_case".

Usage

interpret_operation_snake_case(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a star_database operation. Fields: operation, name, details, details2. Values: "star_database", names(db$schemas), unknown_value.

Usage

interpret_operation_star_database(ft, op, schema, file, last_op)

Arguments

ft

flat table

op

operation

schema

multidimensional schema

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a transform_attribute_format operation. Fields: operation, name, details, details2. Values: "transform_attribute_format", attributes, c(width, decimal_places), c(k_sep, decimal_sep, space_filling).

Usage

interpret_operation_transform_attribute_format(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a transform_from_values operation. Fields: operation, name. Values: "transform_from_values", attribute.

Usage

interpret_operation_transform_from_values(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a transform_to_attribute operation. Fields: operation, name, details, details2. Values: "transform_to_attribute", measures, c(width, decimal_places), c(k_sep, decimal_sep).

Usage

interpret_operation_transform_to_attribute(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a transform_to_measure operation. Fields: operation, name, details, details2. Values: "transform_to_measure", attributes, k_sep, decimal_sep.

Usage

interpret_operation_transform_to_measure(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Interpret operation

Description

Interprets a transform_to_values operation. Fields: operation, name, details, details2. Values: "transform_to_values", attribute, measure, c(id_reverse, na_rm).

Usage

interpret_operation_transform_to_values(ft, op, file, last_op)

Arguments

ft

flat table

op

operation

file

file to write the code

last_op

A boolean, is the last operation?

Value

A flat table.

Check if a string is empty

Description

Check if a string is empty.

Usage

is_empty_string(string)

Arguments

string

A string.

Value

A boolean.

Is a `star_operation` new?

Description

Is a star_operation new?

Usage

is_new_operation(op, op_name, name = NULL, details = NULL, details2 = NULL)

Arguments

op

A star_operation object.

op_name

A string, operation name.

name

A string, element name.

details

A vector of strings, operation details.

details2

A vector of strings, operation additional details.

Value

A boolean.

Is an SCD dimension?

Description

Is the dimension an SCD dimension?

Usage

is_scd(schema)

Arguments

schema

A dimension_schema object.

Value

A boolean.

Join a flat table with a lookup table

Description

To join a flat table with a lookup table, indicate the attributes of the flat table to be used as the foreign key. The primary key of the lookup table must have been defined previously.

Usage

join_lookup_table(ft, fk_attributes, lookup)

## S3 method for class 'flat_table'
join_lookup_table(ft, fk_attributes = NULL, lookup)

Arguments

ft

A flat_table object.

fk_attributes

A vector of strings, attribute names.

lookup

A flat_table object.

Details

If no attributes are indicated, the attributes that form the primary key of the lookup table are looked for in the flat table and used for the join.
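The idea of joining through the primary key of the lookup table can be sketched in base R on plain data frames. This is an illustration, not rolap's implementation: the toy tables, the single-column key and the use of match() are all assumptions made for the example.

```r
# Toy flat table and lookup table (illustrative data only).
ft_data <- data.frame(City = c("Boston", "Hartford", "Boston"),
                      Deaths = c(3, 1, 2))
lookup_data <- data.frame(City = c("Boston", "Hartford"),
                          State = c("MA", "CT"))

pk <- "City"                                      # primary key of the lookup table
stopifnot(anyDuplicated(lookup_data[pk]) == 0)    # a primary key has no duplicates

# No foreign key indicated: use the flat table attribute named like the key.
fk <- pk

# Locate, for each flat table row, the matching lookup row and add its columns.
pos <- match(ft_data[[fk]], lookup_data[[pk]])
extra <- lookup_data[pos, setdiff(names(lookup_data), pk), drop = FALSE]
joined <- cbind(ft_data, extra)
rownames(joined) <- NULL
```

Because the key is unique in the lookup table, the join never multiplies the rows of the flat table.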

Value

A flat_table object.

Examples


lookup <- flat_table('iris', iris) |>
  lookup_table(
    measures = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    measure_agg = c('MAX', 'MIN', 'SUM', 'MEAN')
  )
ft <- flat_table('iris', iris) |>
  join_lookup_table(lookup = lookup)

Get line last operation

Description

Get line last operation

Usage

line_last_op(last_op)

Arguments

last_op

A boolean, is the last operation?

Value

A string

Load star_database (from an RDS file)

Description

Load star_database (from an RDS file)

Usage

load_star_database(file)

Arguments

file

A string, name of the file that stores the object.

Value

A star_database object.

Examples


mrs_rdb_file <- tempfile("mrs", fileext = ".rdb")
mrs_sqlite_file <- tempfile("mrs", fileext = ".sqlite")

mrs_sqlite_connect <- function() {
  DBI::dbConnect(RSQLite::SQLite(),
                 dbname = mrs_sqlite_file)
}

mrs_db <- mrs_db |>
  deploy(
    name = "mrs",
    connect = mrs_sqlite_connect,
    file = mrs_rdb_file
  )

mrs_db2 <- load_star_database(mrs_rdb_file)

Transform a flat table into a lookup table

Description

Checks that the given attributes form a primary key of the table. Otherwise, groups the records so that they do. To carry out the grouping, aggregation functions for attributes and measures must be provided.

Usage

lookup_table(
  ft,
  pk_attributes,
  attributes,
  attribute_agg,
  measures,
  measure_agg
)

## S3 method for class 'flat_table'
lookup_table(
  ft,
  pk_attributes = NULL,
  attributes = NULL,
  attribute_agg = NULL,
  measures = NULL,
  measure_agg = NULL
)

Arguments

ft

A flat_table object.

pk_attributes

A vector of strings, attribute names.

attributes

A vector of strings, rest of attribute names.

attribute_agg

A vector of strings, attribute aggregation functions.

measures

A vector of strings, measure names.

measure_agg

A vector of strings, measure aggregation functions.

Details

If the table does not have measures, attributes with equal values are grouped without the need to indicate a grouping function.

If no attribute is indicated, all the attributes are considered to form the primary key.
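Grouping records so that the chosen attributes become a primary key can be pictured with base R. The sketch below is an assumption-laden illustration rather than the package's code: it takes Species as the key of the iris data and applies one aggregation function per measure, mirroring the MAX, MIN, SUM and MEAN choices used in the examples.

```r
pk <- "Species"

# One aggregation function per measure (mirrors MAX, MIN, SUM, MEAN).
agg <- list(Sepal.Length = max,
            Sepal.Width = min,
            Petal.Length = sum,
            Petal.Width = mean)

# Aggregate each measure by the key, then combine the partial results.
parts <- lapply(names(agg), function(m) aggregate(iris[m], iris[pk], agg[[m]]))
lookup <- Reduce(function(a, b) merge(a, b, by = pk), parts)

# After grouping, the key identifies each row.
is_key <- anyDuplicated(lookup[[pk]]) == 0
```

The result has one row per species, so Species can act as the primary key in a later join.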

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  lookup_table(
    measures = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    measure_agg = c('MAX', 'MIN', 'SUM', 'MEAN')
  )

Star schema for Mortality Reporting System by Age

Description

Definition of schemas for facts and dimensions for the Mortality Reporting System considering the age classification.

Usage

mrs_age_schema

Format

A star_schema object.

Details

Dimension schemas can be defined using variables so that you do not have to repeat the definition in several multidimensional designs.

Examples

# Defined by:

when <- dimension_schema(name = "When",
                         attributes = c("Year"))
where <- dimension_schema(name = "Where",
                          attributes = c("REGION",
                                         "State",
                                         "City"))
mrs_age_schema <- star_schema() |>
  define_facts(name = "MRS Age",
               measures = c("All Deaths")) |>
  define_dimension(when) |>
  define_dimension(where) |>
  define_dimension(name = "Who",
                   attributes = c("Age"))

Star schema for Mortality Reporting System by Age with additional dates

Description

Definition of schemas for facts and dimensions for the Mortality Reporting System considering the age classification with additional dates to be used as role-playing dimensions.

Usage

mrs_age_schema_rpd

Format

A star_schema object.

Examples

# Defined by:

mrs_age_schema_rpd <- star_schema() |>
  define_facts(fact_schema(
    name = "mrs_age",
    measures = c(
      "Deaths"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "When",
    attributes = c(
      "Year",
      "WEEK",
      "Week Ending Date"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "When Available",
    attributes = c(
      "Data Availability Year",
      "Data Availability Week",
      "Data Availability Date"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "When Arrived",
    attributes = c(
      "Arrival Year",
      "Arrival Week",
      "Arrival Date"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "Who",
    attributes = c(
      "Age Range"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "where",
    attributes = c(
      "REGION",
      "State",
      "City"
    )
  ))

Star schema for Mortality Reporting System by Cause

Description

Definition of schemas for facts and dimensions for the Mortality Reporting System considering the cause classification.

Usage

mrs_cause_schema

Format

A star_schema object.

Details

Dimension schemas can be defined using variables so that you do not have to repeat the definition in several multidimensional designs.

Examples

# Defined by:

when <- dimension_schema(name = "When",
                         attributes = c("Year"))
where <- dimension_schema(name = "Where",
                          attributes = c("REGION",
                                         "State",
                                         "City"))
mrs_cause_schema <- star_schema() |>
  define_facts(name = "MRS Cause",
               measures = c("Pneumonia and Influenza Deaths",
                            "All Deaths")) |>
  define_dimension(when) |>
  define_dimension(where)

Star schema for Mortality Reporting System by Cause with additional dates

Description

Definition of schemas for facts and dimensions for the Mortality Reporting System considering the cause classification with additional dates to be used as role-playing dimensions.

Usage

mrs_cause_schema_rpd

Format

A star_schema object.

Examples

# Defined by:

mrs_cause_schema_rpd <- star_schema() |>
  define_facts(fact_schema(
    name = "mrs_cause",
    measures = c(
      "Pneumonia and Influenza Deaths",
      "All Deaths"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "When",
    attributes = c(
      "Year",
      "WEEK",
      "Week Ending Date"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "When Available",
    attributes = c(
      "Data Availability Year",
      "Data Availability Week",
      "Data Availability Date"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "When Received",
    attributes = c(
      "Reception Year",
      "Reception Week",
      "Reception Date"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "where",
    attributes = c(
      "REGION",
      "State",
      "City"
    )
  ))

Constellation generated from MRS file

Description

The original dataset covers the period from 1962 to 2016, with weekly data for 122 US cities. The package stores a file with the same format as the original that includes only 1% of its data, selected at random.

Usage

mrs_db

Format

A star_database.

Details

From these data the constellation in the vignette titled 'Obtaining and transforming flat tables' has been generated. This variable contains the defined constellation.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system

Constellation generated from MRS file through a query and with geographic information

Description

Usage

mrs_db_geo

Format

A star_database.

Details

From these data the constellation in the vignette titled 'Obtaining and transforming flat tables' has been generated. This variable contains the defined constellation.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system

Examples

# Defined by:

sq <- mrs_db |>
  star_query() |>
  select_dimension(name = "where",
                   attributes = "state") |>
  select_dimension(name = "when",
                   attributes = "year") |>
  select_fact(
    name = "mrs_age",
    measures = "all_deaths"
  )  |>
  select_fact(
    name = "mrs_cause",
    measures = "pneumonia_and_influenza_deaths"
  )

db <- mrs_db |>
  run_query(sq)

mrs_db_geo <- db |>
  define_geoattribute(
    dimension = "where",
    attribute = "state",
    from_layer = us_layer_state,
    by = "STUSPS"
  )

Flat table generated from MRS file

Description

Usage

mrs_ft

Format

A flat_table.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system

Flat table generated from MRS file

Description

Usage

mrs_ft_new

Format

A flat_table.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system

Multiple value key

Description

Gets the keys that have multiple values associated with them. The first field in the table is the key, the rest of fields are the values.

Usage

multiple_value_key(tb, col_as_vector = NULL)

Arguments

tb

A tibble.

col_as_vector

A string, name of the column to include a vector of values.

Details

If a name is indicated in the col_as_vector parameter, it includes a column with the data in vector form to be used in other functions.
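What "keys with multiple values" means can be shown with a small base R sketch. It is not the package's implementation: the data frame is toy data, and the first column is taken as the key, as stated in the description.

```r
# Toy table: the first column is the key, the second holds the values.
tb <- unique(data.frame(
  WEEK = c("1", "1", "2", "3", "3"),
  `Week Ending Date` = c("d1", "d2", "d3", "d4", "d4"),
  check.names = FALSE
))

# Count the distinct value rows per key and keep the keys with more than one.
n_values <- table(tb[[1]])
multi_keys <- names(n_values)[n_values > 1]

# Rows of the table that belong to those keys.
mvk_rows <- tb[tb[[1]] %in% multi_keys, , drop = FALSE]
```

Here only week "1" is associated with two different dates; week "3" appears twice in the input but with the same date, so it is not reported.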

Value

A tibble.

Examples


tb <- unique(ft[, c('WEEK', 'Week Ending Date')])
mvk <- multiple_value_key(tb)

Name with nexus

Description

Adds the nexus to a name: if the name ends in "/", the nexus is the empty string; otherwise it is "/".

Usage

name_with_nexus(name)

Arguments

name

A string.

Value

A string.
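The rule can be written in a couple of lines of base R. add_nexus is a hypothetical stand-in used only for illustration; it is not the package's internal function.

```r
# Hypothetical stand-in for the rule described above.
add_nexus <- function(name) {
  nexus <- if (endsWith(name, "/")) "" else "/"
  paste0(name, nexus)
}

without_slash <- add_nexus("data")   # "data/"
with_slash <- add_nexus("data/")     # "data/"
```

Either way the result ends in exactly one "/", so a file name can be appended to it directly.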

`multistar` S3 class

Description

Internal low-level constructor that creates new objects with the correct structure.

Usage

new_multistar(fl = list(), dl = list())

Arguments

fl

A fact_table list.

dl

A dimension_table list.

Details

It only distinguishes between general and conformed dimensions; each dimension has its own data. It can contain multiple fact tables.

Value

A multistar object.

Prepare the instances table implemented by a `tibble` to join

Description

Transform all fields in the instances table to character type and replace the NA values to facilitate the join operation.

Usage

prepare_to_join(table, unknown_value)

Arguments

table

A tibble, the instances table.

unknown_value

A string, value used to replace NA values in dimensions.

Value

A tibble.
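The preparation step can be sketched in base R as follows. to_joinable is a hypothetical helper, a plain data frame is used instead of a tibble to avoid extra dependencies, and "Unknown" is an arbitrary placeholder rather than the package's default.

```r
# Hypothetical helper: make every column character and fill in NA values.
to_joinable <- function(table, unknown_value) {
  table[] <- lapply(table, function(col) {
    col <- as.character(col)
    col[is.na(col)] <- unknown_value
    col
  })
  table
}

instances <- data.frame(Year = c(1962, NA), City = c("Boston", NA))
joinable <- to_joinable(instances, "Unknown")
```

With every field stored as a string and no NA left, two instance tables can be compared or joined column by column without type or NA mismatches.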

Purge instances of a dimension

Description

Delete instances of a dimension that are not referenced in the facts.

Usage

purge_dimension(db, dim)

Arguments

db

A star_database object.

dim

A string, dimension name.

Value

A tibble, dimension table.
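Purging a dimension amounts to keeping only the instances whose key appears in the facts. The base R sketch below uses toy tables and a hypothetical key column named where_key; it illustrates the idea and is not the package's code.

```r
# Toy dimension and fact tables sharing a surrogate key (names are assumptions).
where <- data.frame(where_key = 1:3,
                    City = c("Boston", "Hartford", "Bridgeport"))
facts <- data.frame(where_key = c(1L, 3L, 3L),
                    deaths = c(5, 2, 4))

# Keep only the dimension instances referenced by at least one fact.
purged <- where[where$where_key %in% facts$where_key, , drop = FALSE]
```

Hartford is dropped because no fact row references its key.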

Purge instances of dimensions

Description

Delete instances of dimensions that are not referenced in the facts.

Usage

purge_dimension_instances(db)

Arguments

db

A star_database object.

Value

A star_database object.

Purge instances of dimensions

Description

Delete instances of dimensions that are not referenced in the facts.

Usage

purge_dimension_instances_star_database(db)

Arguments

db

A star_database object.

Value

A star_database object.

Import flat table file

Description

Reads a text file and creates a flat_table object. The file is expected to contain a flat table whose first row contains the name of the columns. All columns are considered to be of type String.

Usage

read_flat_table_file(name, file, sep = ",", page = NULL, unknown_value = NULL)

Arguments

name

A string, flat table name.

file

A string, name of a text file.

sep

Column separator character.

page

A string, name of the new field in which to include the name of the file.

unknown_value

A string, value used to replace empty and NA values in attributes.

Details

When multiple files are handled, the file name may carry information associated with the flat table (for example, the table page). If the page parameter gives the name of a new field, the file name is stored in that field.

We can also indicate the value used to replace undefined (empty or NA) values in the data.

Value

A flat_table object.

Examples


file <-
  system.file("extdata/mrs",
              "mrs_122_us_cities_1962_2016_new.csv",
              package = "rolap")

ft <- read_flat_table_file('mrs_new', file)

Import all flat table files in a folder

Description

Reads all text files in a folder and creates a flat_table object. Each file is expected to contain a flat table, all with the same structure, whose first row contains the name of the columns. All columns are considered to be of type String.

Usage

read_flat_table_folder(
  name,
  folder,
  sep = ",",
  page = NULL,
  unknown_value = NULL,
  same_columns = FALSE,
  snake_case = FALSE
)

Arguments

name

A string, flat table name.

folder

A string, folder name.

sep

Column separator character.

page

A string, name of the new field in which to include the name of the file.

unknown_value

A string, value used to replace empty and NA values in attributes.

same_columns

A boolean, indicates whether all tables have the same columns in the same order.

snake_case

A boolean, indicates if we want to transform the names of the columns to snake case.

Details

We can also indicate the value used to replace undefined (empty or NA) values in the data.

In some situations all the files have the same structure but the column names may change slightly. In these cases it can be useful to transform the names to snake case or consider for all the files the names of the columns of the first one. These operations can be indicated by the corresponding parameters.
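The situation described above can be reproduced with two small files in a temporary folder. The base R sketch below is an illustration under assumptions, not the package's reader: the file names and contents are invented, all columns are read as character, the column names of the first file are reused for the rest, and the file name is kept in a field called page.

```r
# Two toy files with the same structure but slightly different column names.
folder <- tempfile("flat_tables_")
dir.create(folder)
writeLines(c("City,All Deaths", "Boston,10"), file.path(folder, "part1.csv"))
writeLines(c("city,all_deaths", "Hartford,7"), file.path(folder, "part2.csv"))

# Read every file, treating all columns as strings.
files <- sort(list.files(folder, pattern = "\\.csv$", full.names = TRUE))
tables <- lapply(files, read.csv, colClasses = "character", check.names = FALSE)

# Same structure assumed: reuse the column names of the first file.
first_names <- names(tables[[1]])
tables <- lapply(tables, function(tb) {
  names(tb) <- first_names
  tb
})

# Keep the file name in a new field, then stack the tables.
for (i in seq_along(tables)) tables[[i]]$page <- basename(files[i])
all_rows <- do.call(rbind, tables)
```

Reusing the first file's names is only safe when the columns really are in the same order in every file, which is what the same_columns parameter asserts.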

Value

A flat_table object.

Examples


file <- system.file("extdata/mrs", package = "rolap")

ft <- read_flat_table_folder('mrs_new', file)

Get line last operation

Description

Get line last operation

Usage

reformat_file(out_file, function_name)

Arguments

out_file

A string, file name.

function_name

A string, name of the function to generate in the file.

Value

A string

Refresh deployments

Description

Generate SQL code for the first refresh operation.

Usage

refresh_deployments(db, internal)

Arguments

db

A star_database object.

internal

A boolean.

Value

A star_database object.

Remove instances whose measures are all NA

Description

Remove the instances in which all measures are NA.

Usage

remove_all_measures_na(table, measures)

Arguments

table

A tibble object.

measures

A vector of strings, measure names.
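The filter can be sketched in base R on a plain data frame. The column names and data are invented for the example, and a data frame stands in for the tibble the function expects.

```r
# Toy table: the second row has no measure defined at all.
tb <- data.frame(City = c("Boston", "Hartford", "Bridgeport"),
                 deaths_a = c(1, NA, NA),
                 deaths_b = c(NA, NA, 2))
measures <- c("deaths_a", "deaths_b")

# Keep a row when at least one of its measures is not NA.
keep <- rowSums(!is.na(tb[measures])) > 0
with_measures <- tb[keep, , drop = FALSE]
```

Rows with some measures missing are kept; only rows where every measure is NA are removed.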

Remove duplicate dimension rows

Description

After selecting only a few columns of the dimensions, there may be rows with duplicate values. We eliminate duplicates and adapt facts to the new dimensions.

Usage

remove_duplicate_dimension_rows(db)

Arguments

db

A star_database object.
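The effect on one dimension and its facts can be sketched in base R. Everything here is an assumption made for illustration: the toy tables, the key column names, and the way new keys are assigned. It is not the package's implementation.

```r
# Dimension reduced to the state attribute: keys 1 and 2 are now duplicates.
where <- data.frame(key = 1:3, state = c("CT", "CT", "MA"))
facts <- data.frame(key = c(1, 2, 3, 2), deaths = c(1, 2, 3, 4))

# New dimension without duplicates, with freshly assigned keys.
new_where <- unique(where["state"])
new_where$new_key <- seq_len(nrow(new_where))

# Map each old key to the new one and adapt the facts.
key_map <- merge(where, new_where, by = "state")
facts$key <- key_map$new_key[match(facts$key, key_map$key)]
```

After the keys are remapped, several fact rows may share the same combination of keys, so a grouping step on the facts would normally follow.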

Remove instances without measures

Description

Delete instances that have all measures undefined.

Usage

remove_instances_without_measures(ft)

## S3 method for class 'flat_table'
remove_instances_without_measures(ft)

Arguments

ft

A flat_table object.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  remove_instances_without_measures()

Replace instance values

Description

Given the attribute values of a possible instance, replaces that combination of values with the given new values.

Usage

## S3 method for class 'flat_table'
replace_attribute_values(db, name = NULL, attributes = NULL, old, new)

replace_attribute_values(db, name, attributes, old, new)

## S3 method for class 'star_database'
replace_attribute_values(db, name, attributes = NULL, old, new)

Arguments

db

A flat_table or star_database object.

name

A string, dimension name.

attributes

A vector of strings, attribute names.

old

A vector of values.

new

A vector of values.

Value

A flat_table or star_database object.

Examples


db <- star_database(mrs_cause_schema, ft_num) |>
  replace_attribute_values(name = "where",
    old = c('1', 'CT', 'Bridgeport'),
    new = c('1', 'CT', 'Hartford'))

db <- star_database(mrs_cause_schema, ft_num) |>
  replace_attribute_values(name = "where",
                           attributes = c('REGION', 'State'),
                           old = c('1', 'CT'),
                           new = c('2', 'CT'))

ft <- flat_table('iris', iris) |>
  replace_attribute_values(
    attributes = 'Species',
    old = c('setosa'),
    new = c('versicolor')
  )

Replace empty values with the unknown value

Description

Transforms the given attributes by replacing the empty values with the unknown value.

Usage

replace_empty_values(ft, attributes, empty_values)

## S3 method for class 'flat_table'
replace_empty_values(ft, attributes = NULL, empty_values = NULL)

Arguments

ft

A flat_table object.

attributes

A vector of names.

empty_values

A vector of values that correspond to empty values.

Details

In addition to the NA or empty values, those indicated (e.g., "-") can be considered as empty values.
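The rule for deciding which values count as empty can be sketched in base R. replace_empty is a hypothetical helper, and "Unknown" is an arbitrary placeholder for the unknown value, not necessarily the one the package uses.

```r
# Hypothetical helper: NA, "" and any extra markers are treated as empty.
replace_empty <- function(x, empty_values = NULL, unknown_value = "Unknown") {
  x <- as.character(x)
  is_empty <- is.na(x) | x == "" | x %in% empty_values
  x[is_empty] <- unknown_value
  x
}

species <- c("setosa", NA, "", "-")
only_defaults <- replace_empty(species)
with_dash <- replace_empty(species, empty_values = "-")
```

Without the extra marker the "-" value is left untouched; passing it in empty_values makes it be treated like NA or the empty string.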

Value

A flat_table object.

Examples


iris2 <- iris
iris2[10, 'Species'] <- NA
ft <- flat_table('iris', iris2) |>
  replace_empty_values()

Replace empty values with the unknown value

Description

Replace empty values with the unknown value

Usage

replace_empty_values_table(
  table,
  attributes = NULL,
  empty_values = NULL,
  unknown_value
)

Arguments

table

A tibble object.

attributes

A vector of names.

empty_values

A vector of values that correspond to empty values.

unknown_value

A string.

Value

A tibble object.

Replace names

Description

Replace names

Usage

replace_names(original, old, new)

Arguments

original

A string, original names.

old

A vector of names to replace.

new

A vector of names, new names.

Value

A vector of strings, names replaced.

Replace strings

Description

Transforms the given attributes by replacing the string values with the replacement value.

Usage

replace_string(ft, attributes, string, replacement)

## S3 method for class 'flat_table'
replace_string(ft, attributes = NULL, string, replacement = NULL)

Arguments

ft

A flat_table object.

attributes

A vector of strings, attribute names.

string

A character string to replace.

replacement

A replacement for matched string.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  replace_string(
    attributes = 'Species',
    string = c('set'),
    replacement = c('Set')
  )

Replace unknown values with the given value

Description

Transforms the given attributes by replacing unknown values in them with the given value.

Usage

replace_unknown_values(ft, attributes, value)

## S3 method for class 'flat_table'
replace_unknown_values(ft, attributes = NULL, value)

Arguments

ft

A flat_table object.

attributes

A vector of names.

value

A value.

Value

A flat_table object.

Examples


iris2 <- iris
iris2[10, 'Species'] <- NA
ft <- flat_table('iris', iris2) |>
  replace_empty_values() |>
  replace_unknown_values(value = "Not available")

Define a role playing dimension and its associated dimensions

Description

The same dimension can play several roles in relation to the facts. We can define the main dimension and the dimensions that play different roles.

Usage

role_playing_dimension(db, rpd, roles, rpd_att_names, att_names)

## S3 method for class 'star_database'
role_playing_dimension(db, rpd, roles, rpd_att_names = FALSE, att_names = NULL)

Arguments

db

A star_database object.

rpd

A string, dimension name (role playing dimension).

roles

A vector of strings, dimension names (dimension roles).

rpd_att_names

A boolean, common attribute names taken from rpd dimension.

att_names

A vector of strings, common attribute names.

Details

As a result, all the dimensions will have the same instances and, if we deem it necessary, also the same name of their attributes (except the surrogate key).

Value

A star_database object.

Examples


s <- star_schema() |>
  define_facts(fact_schema(
    name = "mrs_cause",
    measures = c(
      "Pneumonia and Influenza Deaths",
      "All Deaths"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "When",
    attributes = c(
      "Year",
      "WEEK",
      "Week Ending Date"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "When Available",
    attributes = c(
      "Data Availability Year",
      "Data Availability Week",
      "Data Availability Date"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "When Received",
    attributes = c(
      "Reception Year",
      "Reception Week",
      "Reception Date"
    )
  )) |>
  define_dimension(dimension_schema(
    name = "where",
    attributes = c(
      "REGION",
      "State",
      "City"
    )
  ))

db <- star_database(s, ft_cause_rpd) |>
  role_playing_dimension(
    rpd = "When",
    roles = c("When Available", "When Received"),
    rpd_att_names = TRUE
  )

db <- star_database(s, ft_cause_rpd) |>
  role_playing_dimension("When",
                         c("When Available", "When Received"),
                         att_names = c("Year", "Week", "Date"))

Transform role playing dimensions in constellation

Description

Transform role playing dimensions in constellation

Usage

rpd_in_constellation(db)

Arguments

db

A constellation object.

Value

A constellation object.

Run query

Description

Once we have selected the facts, dimensions and defined the conditions on the instances, we can execute the query to obtain the result.

Usage

run_query(db, sq)

## S3 method for class 'star_database'
run_query(db, sq)

Arguments

db

A star_database object.

sq

A star_query object.

Details

As an option, we can indicate if we do not want to unify the facts in the case of having the same grain.

Value

A star_database object.

Examples


sq <- mrs_db |>
  star_query() |>
  select_dimension(name = "where",
                   attributes = c("city", "state")) |>
  select_dimension(name = "when",
                   attributes = "year") |>
  select_fact(
    name = "mrs_age",
    measures = "all_deaths",
    agg_functions = "MAX"
  ) |>
  select_fact(
    name = "mrs_cause",
    measures = c("pneumonia_and_influenza_deaths", "all_deaths")
  ) |>
  filter_dimension(name = "when", week <= " 3") |>
  filter_dimension(name = "where", city == "Bridgeport")

mrs_db_2 <- mrs_db |>
  run_query(sq)

Do all fact tables have the same granularity?

Description

Do all fact tables have the same granularity?

Usage

same_granularity_facts(db, names)

Arguments

db

A star_database object.

names

A vector of strings, fact names.

Value

A boolean.

Select attributes of a flat table

Description

Select only the indicated attributes from the flat table.

Usage

select_attributes(ft, attributes)

## S3 method for class 'flat_table'
select_attributes(ft, attributes)

Arguments

ft

A flat_table object.

attributes

A vector of names.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  select_attributes(attributes = c('Species'))

ft <- flat_table('ft_num', ft_num) |>
  select_attributes(attributes = c('Year', 'WEEK', 'Week Ending Date'))

Select dimension

Description

To add a dimension to a star_query object, we have to define its name and a subset of the dimension attributes. If only the name of the dimension is indicated, all its attributes are added.

Usage

select_dimension(sq, name, attributes)

## S3 method for class 'star_query'
select_dimension(sq, name = NULL, attributes = NULL)

Arguments

sq

A star_query object.

name

A string, name of the dimension.

attributes

A vector of attribute names.

Value

A star_query object.

Examples


sq <- mrs_db |>
  star_query() |>
  select_dimension(name = "where",
                   attributes = c("city", "state")) |>
  select_dimension(name = "when")

Select fact

Description

To define the fact to be consulted, we indicate its name and, optionally, a vector of names of selected measures, a vector of aggregation functions and a vector of new names for the measures.

Usage

select_fact(sq, name, measures, agg_functions, new, nrow_agg)

## S3 method for class 'star_query'
select_fact(
  sq,
  name = NULL,
  measures = NULL,
  agg_functions = NULL,
  new = NULL,
  nrow_agg = NULL
)

Arguments

sq

A star_query object.

name

A string, name of the fact.

measures

A vector of measure names.

agg_functions

A vector of aggregation function names, each one for its corresponding measure. They can be SUM, MAX or MIN.

new

A vector of measure new names.

nrow_agg

A string, name of a new measure that represents the COUNT of rows aggregated for each resulting row.

Details

If there is only one fact table, it is the one that is considered if no name is indicated.

If no aggregation function is given, those defined for the measures are considered.

If no new names are given, the original names will be considered. If the aggregation function is different from the one defined by default, it will be included as a prefix to the name.

Value

A star_query object.

Examples


sq <- mrs_db |>
  star_query()

sq_1 <- sq |>
  select_fact(
    name = "mrs_age",
    measures = "all_deaths",
    agg_functions = "MAX"
  )

sq_2 <- sq |>
  select_fact(name = "mrs_age",
              measures = "all_deaths")

sq_3 <- sq |>
  select_fact(name = "mrs_age")
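
As an illustration of what an aggregation function and the nrow_agg parameter mean, the following base R sketch aggregates a toy fact table by one dimension attribute. It does not use the package: the data and the result column names (max_all_deaths, nrow_agg) are assumptions made only for this example.

```r
# Toy fact rows: one dimension attribute and one measure (hypothetical data).
facts <- data.frame(
  city = c("A", "A", "B", "B", "B"),
  all_deaths = c(10, 30, 5, 7, 9),
  stringsAsFactors = FALSE
)

# Group the measure values by city.
groups <- split(facts$all_deaths, facts$city)

# Aggregate with MAX and count the rows behind each resulting row.
result <- data.frame(
  city = names(groups),
  max_all_deaths = unname(vapply(groups, max, numeric(1))),
  nrow_agg = unname(vapply(groups, length, integer(1))),
  stringsAsFactors = FALSE
)
print(result)
```

Each resulting row keeps the aggregated measure together with the number of original rows it summarizes.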

Select instances of a flat table by value

Description

Select only the indicated instances from the flat table.

Usage

select_instances(ft, not, attributes, values)

## S3 method for class 'flat_table'
select_instances(ft, not = FALSE, attributes = NULL, values)

Arguments

ft

A flat_table object.

not

A boolean.

attributes

A vector of names.

values

A list of value vectors.

Details

Either several values can be indicated for one attribute (they are combined with OR), or several attributes with a value for each one (they are combined with AND).

If the parameter not is TRUE, the instances selected are those that do not match the indicated values.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  select_instances(attributes = c('Species'),
                   values = c('versicolor', 'virginica'))

ft <- flat_table('ft_num', ft_num) |>
  select_instances(
    not = TRUE,
    attributes = c('Year', 'WEEK'),
    values = list(c('1962', '2'), c('1964', '2'))
  )
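
The OR and AND behaviour described in Details can be sketched in base R as follows. This is only an approximation of the selection logic under our reading of the description, not the package implementation; the toy table is hypothetical.

```r
# Several values for one attribute: a row is kept if it matches any of them (OR).
keep <- iris$Species %in% c("versicolor", "virginica")
by_value <- iris[keep, ]

# Several attributes with one value each: a row must match all of them (AND).
# A list of such combinations is matched with OR between its elements.
toy <- data.frame(
  Year = c("1962", "1962", "1964", "1964"),
  WEEK = c("2", "3", "2", "5"),
  stringsAsFactors = FALSE
)
combos <- list(c("1962", "2"), c("1964", "2"))
match_any <- Reduce(`|`, lapply(combos, function(v) {
  toy$Year == v[1] & toy$WEEK == v[2]
}))

selected <- toy[match_any, ]   # rows matching some combination
excluded <- toy[!match_any, ]  # what negating the selection would keep
```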

Select instances of a flat table by comparison

Description

Select only the indicated instances from the flat table by comparison.

Usage

select_instances_by_comparison(ft, not, attributes, comparisons, values)

## S3 method for class 'flat_table'
select_instances_by_comparison(
  ft,
  not = FALSE,
  attributes = NULL,
  comparisons,
  values
)

Arguments

ft

A flat_table object.

not

A boolean.

attributes

A list of name vectors.

comparisons

A list of comparison operator vectors.

values

A list of value vectors.

Details

The elements of the three parameter lists correspond (all three must have the same structure and length or be of length 1). AND is performed for each combination of attribute, operator and value within each element of each list and OR between elements of the lists.

If the parameter not is true, the negation operation will be applied to the result.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  select_instances_by_comparison(attributes = 'Species',
                                 comparisons = '>=',
                                 values = 'v')

ft <- flat_table('ft_num', ft_num) |>
  select_instances_by_comparison(
    not = FALSE,
    attributes = c('Year', 'Year', 'WEEK'),
    comparisons = c('>=', '<=', '=='),
    values = c('1962', '1964', '2')
  )

ft <- flat_table('ft_num', ft_num) |>
  select_instances_by_comparison(
    not = FALSE,
    attributes = c('Year', 'Year', 'WEEK'),
    comparisons = c('>=', '<=', '=='),
    values = list(c('1962', '1964', '2'),
                  c('1962', '1964', '4'))
  )
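
The combination rule described in Details (AND within each element, OR between elements) can be sketched in base R as shown below. This is an assumption-based approximation on a hypothetical toy table, not the code of the package; the helper match_element is introduced only for the example.

```r
toy <- data.frame(
  Year = c("1961", "1962", "1963", "1964", "1965"),
  WEEK = c("2", "2", "4", "2", "2"),
  stringsAsFactors = FALSE
)

attrs <- c("Year", "Year", "WEEK")
comparisons <- c(">=", "<=", "==")
values <- list(c("1962", "1964", "2"),
               c("1962", "1964", "4"))

# For one vector of values: apply each operator to its attribute and value,
# then combine all the conditions with AND.
match_element <- function(v) {
  conds <- Map(function(a, op, val) match.fun(op)(toy[[a]], val),
               attrs, comparisons, v)
  Reduce(`&`, conds)
}

# Combine the elements of the list with OR.
keep <- Reduce(`|`, lapply(values, match_element))

selected <- toy[keep, ]
negated <- toy[!keep, ]  # corresponds to negating the result
```

Here the values are compared as strings, as in the examples above, so the comparison follows character ordering.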

Select measures of a flat table

Description

Select only the indicated measures from the flat table.

Usage

select_measures(ft, measures, na_rm)

## S3 method for class 'flat_table'
select_measures(ft, measures = NULL, na_rm = TRUE)

Arguments

ft

A flat_table object.

measures

A vector of names.

na_rm

A boolean, remove rows from output where all measure values are NA.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  select_measures(measures = c('Sepal.Length', 'Sepal.Width'))

Separate measures in flat tables

Description

Separate groups of measures into different flat tables. For each group we must indicate a name. If we indicate more names than groups of measures, the measures not included in other groups are also included in a new group.

Usage

separate_measures(ft, measures, names, na_rm)

## S3 method for class 'flat_table'
separate_measures(ft, measures = NULL, names = NULL, na_rm = TRUE)

Arguments

ft

A flat_table object.

measures

A list of string vectors, groups of measure names.

names

A list of strings, measure group names.

na_rm

A boolean, remove rows from output where all measure values are NA.

Details

A list of flat tables is returned. The given names are assigned to the elements of the result list.

Value

A list of flat_table objects.

Examples


lft <- flat_table('iris', iris) |>
  separate_measures(
    measures = list(
      c('Petal.Length'),
      c('Petal.Width'),
      c('Sepal.Length')
    ),
    names = c('PL', 'PW', 'SL', 'SW')
  )
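
The grouping rule in the Description (an extra name receives the measures not included in the other groups) can be sketched in base R. The sketch only shows how the columns are distributed; it is not the package implementation and it ignores any other processing that separate_measures may perform.

```r
# Named groups of measures; 'SW' is the extra name without an explicit group.
groups <- list(PL = "Petal.Length", PW = "Petal.Width", SL = "Sepal.Length")

# Measures are taken here to be the numeric columns (assumption for the sketch).
measures <- names(iris)[vapply(iris, is.numeric, logical(1))]

# The extra name receives the measures not listed in any other group.
groups$SW <- setdiff(measures, unlist(groups))

# One table per group: the attribute column plus the measures of the group.
tables <- lapply(groups, function(m) iris[, c("Species", m), drop = FALSE])
names(tables)
```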

Rename attributes

Description

Rename attributes in a flat table or a dimension in a star database.

Usage

## S3 method for class 'flat_table'
set_attribute_names(db, name = NULL, old = NULL, new)

set_attribute_names(db, name, old, new)

## S3 method for class 'star_database'
set_attribute_names(db, name, old = NULL, new)

Arguments

db

A flat_table or star_database object.

name

A string, dimension name.

old

A vector of names.

new

A vector of names.

Details

To rename the attributes there are three possibilities: 1) give only one vector with the new names for all the attributes; 2) a vector of old names and another of new names that must correspond; 3) a vector of new names whose names are the old names they replace.

Value

A flat_table or star_database object.

Examples


db <- star_database(mrs_cause_schema, ft_num) |>
  set_attribute_names(
    name = "where",
    new = c(
      "Region",
      "State",
      "City"
    )
  )

db <- star_database(mrs_cause_schema, ft_num) |>
  set_attribute_names(name = "where",
                      old = "REGION",
                      new = "Region")

new <- "Region"
names(new) <- "REGION"
db <- star_database(mrs_cause_schema, ft_num) |>
  set_attribute_names(name = "where",
                      new = new)

ft <- flat_table('iris', iris) |>
  set_attribute_names(
    old = 'Species',
    new = 'species')

new <- "species"
names(new) <- "Species"
ft <- flat_table('iris', iris) |>
  set_attribute_names(
    new = new)

Set geographic layer

Description

If for some reason we modify the geographic layer, for example, to add a new calculated variable, we can set that layer to become the new geographic layer of the geolayer object using this function.

Usage

set_layer(gl, layer)

## S3 method for class 'geolayer'
set_layer(gl, layer)

Arguments

gl

A geolayer object.

layer

A sf object.

Value

A geolayer object.

Examples


gl <- mrs_db_geo |>
  as_geolayer()

l <- gl |>
  get_layer()

l$tpc_001 <- l$var_002 * 100 / l$var_001

gl <- gl |>
  set_layer(l)

Rename measures

Description

Rename measures in a flat table or in facts in a star database.

Usage

## S3 method for class 'flat_table'
set_measure_names(db, name = NULL, old = NULL, new)

set_measure_names(db, name, old, new)

## S3 method for class 'star_database'
set_measure_names(db, name = NULL, old = NULL, new)

Arguments

db

A flat_table or star_database object.

name

A string, fact name.

old

A vector of names.

new

A vector of names.

Details

To rename the measures there are three possibilities: 1) give only one vector with the new names for all the measures; 2) a vector of old names and another of new names that must correspond; 3) a vector of new names whose names are the old names they replace.

Value

A flat_table or star_database object.

Examples


db <- star_database(mrs_cause_schema, ft_num) |>
  set_measure_names(
    new = c(
      "Pneumonia and Influenza",
      "All",
      "Rows Aggregated"
    )
  )

ft <- flat_table('iris', iris) |>
  set_measure_names(
    old = c('Petal.Length', 'Petal.Width', 'Sepal.Length', 'Sepal.Width'),
    new = c('pl', 'pw', 'sl', 'sw'))

new <- c('pl', 'pw', 'sl', 'sw')
names(new) <- c('Petal.Length', 'Petal.Width', 'Sepal.Length', 'Sepal.Width')
ft <- flat_table('iris', iris) |>
  set_measure_names(
    new = new)

Set variables layer

Description

The variables layer includes the names and descriptions, through various fields, of the variables contained in the reports.

Usage

set_variables(gl, variables, keep_all_variables_na)

## S3 method for class 'geolayer'
set_variables(gl, variables, keep_all_variables_na = FALSE)

Arguments

gl

A geolayer object.

variables

A tibble object.

keep_all_variables_na

A boolean, keep rows with all variables NA.

Details

When we set the variables layer, after filtering it, the data layer is also filtered keeping only the variables from the variables layer.

By default, rows that are NA for all variables are eliminated.

Value

A geolayer object.

Examples


gl <- mrs_db_geo |>
  as_geolayer()

v <- gl |>
  get_variables()

v <- v |>
  dplyr::filter(year == '1966' | year == '2016')

gl_sel <- gl |>
  set_variables(v)

Share dimension instance operations between all `star_database` objects

Description

Share dimension instance operations between all star_database objects

Usage

share_dimension_instance_operations(stars, dim_freq)

Arguments

stars

A list of star_database objects.

dim_freq

Dimension frequency table.

Value

A list of star_database objects.

Share the given dimensions in the database

Description

Share the given dimensions in the database

Usage

share_dimensions(db, dims)

Arguments

db

star_database object.

dims

Vector of dimension names.

Value

A star_database object.

From a vector of dimensions, leave only one of each rpd.

Description

From a vector of dimensions, leave only one of each rpd.

Usage

simplify_rpd_dimensions(db, names)

Arguments

db

A star_database object.

names

A vector of strings, dimension names.

Value

A vector of dimension names.

Transform names according to the snake case style

Description

For flat tables, transform attribute and measure names according to the snake case style. For star databases, transform fact, dimension, measures, and attribute names according to the snake case style.

Usage

## S3 method for class 'flat_table'
snake_case(db)

snake_case(db)

## S3 method for class 'star_database'
snake_case(db)

Arguments

db

A flat_table or star_database object.

Details

This style is suitable if we are going to work with databases.

Value

A flat_table or star_database object.

Examples


db <- star_database(mrs_cause_schema, ft_num) |>
  snake_case()

ft <- flat_table('iris', iris) |>
  snake_case()
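
To give an idea of the snake case style, here is a rough base R approximation. The helper to_snake is hypothetical and much simpler than a dedicated conversion library (the package lists snakecase among its imports); for instance, it does not split camel case words.

```r
# Rough approximation: replace runs of non-alphanumeric characters
# with an underscore and convert to lower case.
to_snake <- function(x) {
  tolower(gsub("[^[:alnum:]]+", "_", x))
}

to_snake(c("Week Ending Date", "Sepal.Length", "REGION"))
```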

Transform names according to the snake case style

Description

Transform names according to the snake case style

Usage

## S3 method for class 'dimension_table'
snake_case_table(table)

Arguments

table

A dimension_table object.

Value

A dimension_table object.

Transform names according to the snake case style

Description

Transform names according to the snake case style

Usage

## S3 method for class 'fact_table'
snake_case_table(table)

Arguments

table

A fact_table object.

Value

A fact_table object.

`star_database` S3 class

Description

A star_database object is created from a star_schema object and a flat table that contains the data from which database instances are derived.

Usage

star_database(schema, instances, unknown_value = NULL)

Arguments

schema

A star_schema object.

instances

A flat table to define the database instances according to the schema.

unknown_value

A string, value used to replace NA values in dimensions.

Details

Attributes and measures of the star_schema must correspond to the names of the columns of the flat table.

Since NA values cause problems when doing Join operations between tables, you can indicate the value that will be used to replace them before doing these operations. If none is indicated, a default value is taken.

Value

A star_database object.

Examples


db <- star_database(mrs_cause_schema, ft_num)
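
The idea of replacing NA values before relating facts and dimensions can be sketched in base R. This is only an illustration under stated assumptions: the toy data, the value "Unknown" and the key name where_key are hypothetical, and the default value used by the package is not shown here.

```r
# Toy flat table with a missing attribute value (hypothetical data).
toy <- data.frame(
  City = c("Boston", NA, "Boston"),
  deaths = c(3, 4, 5),
  stringsAsFactors = FALSE
)

# Replace NA in the attribute with an explicit value before building keys.
unknown_value <- "Unknown"  # assumption for the sketch
toy$City[is.na(toy$City)] <- unknown_value

# Dimension table: one row per distinct value, with a surrogate key.
cities <- unique(toy$City)
dim_where <- data.frame(
  where_key = seq_along(cities),
  City = cities,
  stringsAsFactors = FALSE
)

# Fact table: the attribute is replaced by the surrogate key through a join.
facts <- merge(toy, dim_where, by = "City")[, c("where_key", "deaths")]
```

After the replacement every fact row can be related to a dimension row, including the one that originally had no value.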

Creates a `star_database` adding previous operations

Description

Creates a star_database adding previous operations

Usage

star_database_with_previous_operations(
  schema,
  instances,
  unknown_value = NULL,
  operations = NULL,
  lookup_tables = NULL
)

Arguments

schema

A star_schema object.

instances

A flat table to define the database instances according to the schema.

unknown_value

A string, value used to replace NA values in dimensions.

operations

A list of operations.

lookup_tables

A list of lookup tables.

Value

A star_database object.

`star_operation` S3 class

Description

A star_operation object is created.

Usage

star_operation()

Details

A star_operation object is part of a star_schema object and defines the operations of the star schema.

Value

A star_operation object.

`star_query` S3 class

Description

An empty star_query object is created where we can select facts and measures, dimensions, dimension attributes and filter dimension rows.

Usage

star_query(db)

## S3 method for class 'star_database'
star_query(db)

Arguments

db

A star_database object.

Value

A star_query object.

Examples


sq <- mrs_db |>
  star_query()

`star_schema` S3 class

Description

An empty star_schema object is created, to which definitions of facts and dimensions can be added.

Usage

star_schema()

Details

To get a star database (a star_database object) we need a flat table and a star_schema object. The definition of facts and dimensions in the star_schema object is made from the flat table columns.

Value

A star_schema object.

Examples


s <- star_schema()

Get the representation to output

Description

Get the representation to output

Usage

string_or_null(value, last = FALSE)

Arguments

value

A string

last

A boolean

Value

A string

Transforms string into a vector of strings.

Description

Transforms string into a vector of strings.

Usage

string_to_vector(str)

Arguments

str

A string.

Value

A vector of strings.

Summarize geometry of a layer

Description

Groups the geometric elements of a layer according to the values of the indicated attribute.

Usage

summarize_layer(layer, attribute)

Arguments

layer

A sf object.

attribute

A string, attribute name.

Value

A sf object.

Examples


layer <-
  summarize_layer(us_layer_state, "REGION")

Transform attribute format

Description

Transforms numeric attributes adapting their format as indicated.

Usage

transform_attribute_format(
  ft,
  attributes,
  width,
  decimal_places,
  k_sep,
  decimal_sep,
  space_filling
)

## S3 method for class 'flat_table'
transform_attribute_format(
  ft,
  attributes,
  width = 1,
  decimal_places = 0,
  k_sep = NULL,
  decimal_sep = NULL,
  space_filling = TRUE
)

Arguments

ft

A flat_table object.

attributes

A vector of strings, attribute names.

width

An integer, string length.

decimal_places

An integer, number of decimal places.

k_sep

A character, thousands separator used (it cannot be changed).

decimal_sep

A character, decimal separator used (it cannot be changed).

space_filling

A boolean, fill on the left with spaces (with '0' otherwise).

Details

If a number > 1 is specified in the width parameter, at least that length will be obtained in the result, padded with blanks on the left.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  transform_to_attribute(measures = "Sepal.Length", decimal_places = 2) |>
  transform_attribute_format(
    attributes = "Sepal.Length",
    width = 5,
    decimal_places = 1
  )
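
The effect of a minimum width with space or zero filling can be illustrated with formatC from base R. This is an analogy for the width, decimal_places and space_filling parameters, not a description of how the package formats the values internally.

```r
x <- c(5.1, 12.3, 7)

# At least 5 characters, 1 decimal place, padded with spaces on the left.
with_spaces <- formatC(x, width = 5, format = "f", digits = 1)

# Same width and decimals, padded with zeros on the left instead.
with_zeros <- formatC(x, width = 5, format = "f", digits = 1, flag = "0")

print(with_spaces)
print(with_zeros)
```

Fixed-width strings of this kind keep the numeric order when they are sorted as text.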

Transform attribute values into measure names

Description

The values of an attribute will become measure names. There can be only one measure, from which the newly defined measures take their values.

Usage

transform_from_values(ft, attribute)

## S3 method for class 'flat_table'
transform_from_values(ft, attribute = NULL)

Arguments

ft

A flat_table object.

attribute

A string, attribute that stores the measure names.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  transform_to_values(attribute = 'Characteristic',
                      measure = 'Value',
                      id_reverse = 'id')
ft <- ft |>
  transform_from_values(attribute = 'Characteristic')

For each row, add a vector of values

Description

For each row, add a vector of values

Usage

transform_names(names, ordered, as_definition)

Arguments

names

A vector of strings, names of attributes or measures.

ordered

A boolean, sort names alphabetically.

as_definition

A boolean, as the definition of the vector in R.

Value

A vector of strings, attribute or measure names.

Transform to attribute

Description

Transform measures into attributes. We can indicate if we want all the numbers in the result to have the same length and the number of decimal places.

Usage

transform_to_attribute(ft, measures, width, decimal_places, k_sep, decimal_sep)

## S3 method for class 'flat_table'
transform_to_attribute(
  ft,
  measures,
  width = 1,
  decimal_places = 0,
  k_sep = ",",
  decimal_sep = "."
)

Arguments

ft

A flat_table object.

measures

A vector of strings, measure names.

width

An integer, string length.

decimal_places

An integer, number of decimal places.

k_sep

A character, indicates thousands separator.

decimal_sep

A character, indicates decimal separator.

Details

If a number > 1 is specified in the width parameter, at least that length will be obtained in the result, padded with blanks on the left.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  transform_to_attribute(
    measures = "Sepal.Length",
    width = 3,
    decimal_places = 2
  )
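
The role of the thousands and decimal separators when a number becomes text can be illustrated with formatC from base R. This is only an analogy for the k_sep and decimal_sep parameters; it does not claim to reproduce the exact output of transform_to_attribute.

```r
x <- c(1234567.891, 5.1)

# Comma as thousands separator, point as decimal separator.
point_decimal <- formatC(x, format = "f", digits = 2,
                         big.mark = ",", decimal.mark = ".")

# Point as thousands separator, comma as decimal separator.
comma_decimal <- formatC(x, format = "f", digits = 2,
                         big.mark = ".", decimal.mark = ",")

print(point_decimal)
print(comma_decimal)
```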

Transform to measure

Description

Transform attributes into measures.

Usage

transform_to_measure(ft, attributes, k_sep, decimal_sep)

## S3 method for class 'flat_table'
transform_to_measure(ft, attributes, k_sep = NULL, decimal_sep = NULL)

Arguments

ft

A flat_table object.

attributes

A vector of strings, attribute names.

k_sep

A character, thousands separator to remove.

decimal_sep

A character, new decimal separator to use, if necessary.

Details

We can indicate a thousands separator to remove and a decimal separator to use. The only decimal separators considered are "." and ",".

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  transform_to_attribute(measures = "Sepal.Length", decimal_places = 2) |>
  transform_to_measure(attributes = "Sepal.Length", decimal_sep = ".")
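
The general idea of turning formatted text into numbers (remove the thousands separator, then normalize the decimal separator) can be sketched in base R. The helper to_number and its parameters are hypothetical: they describe the separators found in the text and are not the parameters of transform_to_measure.

```r
# Hypothetical helper: 'thousands' and 'decimal' are the separators
# present in the input text.
to_number <- function(x, thousands = NULL, decimal = ".") {
  if (!is.null(thousands)) {
    x <- gsub(thousands, "", x, fixed = TRUE)  # drop thousands separator
  }
  if (decimal != ".") {
    x <- gsub(decimal, ".", x, fixed = TRUE)   # use "." as decimal separator
  }
  as.numeric(x)
}

comma_decimal <- to_number(c("1.234,5", "12,25"), thousands = ".", decimal = ",")
point_decimal <- to_number(c("1,234.5", "12.25"), thousands = ",")
```

Both calls give the same numeric values, since only the notation of the input text differs.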

Transform measure names into attribute values

Description

Transforms the measure names into values of a new attribute. The values of the measures will become values of the new measure that is indicated.

Usage

transform_to_values(ft, attribute, measure, id_reverse, na_rm)

## S3 method for class 'flat_table'
transform_to_values(
  ft,
  attribute = NULL,
  measure = NULL,
  id_reverse = NULL,
  na_rm = TRUE
)

Arguments

ft

A flat_table object.

attribute

A string, new attribute that will store the measure names.

measure

A string, new measure that will store the measure value.

id_reverse

A string, name of a new attribute that will store the row id.

na_rm

A boolean, remove rows from output where the value column is NA.

Details

If we wanted to perform the reverse operation later using the transform_from_values function, we would need to uniquely identify each original row. By indicating a value in the id_reverse parameter, an identifier is added that will allow us to always carry out the inverse operation.

Value

A flat_table object.

Examples


ft <- flat_table('iris', iris) |>
  transform_to_values(attribute = 'Characteristic',
                      measure = 'Value')

ft <- flat_table('iris', iris) |>
  transform_to_values(attribute = 'Characteristic',
                      measure = 'Value',
                      id_reverse = 'id')
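
The reason why a row identifier makes the reverse operation possible can be seen in this base R sketch of both transformations. It works on a hypothetical toy table and only approximates the idea; it is not the implementation of transform_to_values or transform_from_values.

```r
wide <- data.frame(
  Species = c("setosa", "virginica"),
  Petal.Length = c(1.4, 6.0),
  Petal.Width = c(0.2, 2.5),
  stringsAsFactors = FALSE
)
measures <- c("Petal.Length", "Petal.Width")

# Identify each original row, so that it can be rebuilt later.
wide$id <- seq_len(nrow(wide))

# Measure names become values of 'Characteristic'; measure values go to 'Value'.
long <- do.call(rbind, lapply(measures, function(m) {
  data.frame(id = wide$id, Species = wide$Species,
             Characteristic = m, Value = wide[[m]],
             stringsAsFactors = FALSE)
}))

# Reverse operation: one column per value of 'Characteristic', matched by id.
back <- unique(long[, c("id", "Species")])
for (m in measures) {
  rows <- long[long$Characteristic == m, ]
  back[[m]] <- rows$Value[match(back$id, rows$id)]
}
```

Without the id column, two original rows with the same attribute values could not be told apart when rebuilding the measures.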

Unify facts and dimensions in a flat table

Description

Unify facts and dimensions in a flat table

Usage

unify_facts_and_dimensions(db, dimension, include_nrow_agg)

Arguments

db

A star_database object.

dimension

A vector of strings, dimension names.

include_nrow_agg

A boolean.

Value

A tibble.

Unify lists of dimension names if there are any in common

Description

Unify lists of dimension names if there are any in common

Usage

unify_rpd(rpd)

Arguments

rpd

A list of strings (dimension names).

Value

A list of strings (dimension names).

Update a flat table according to another structure

Description

Update a flat table with the operations of another structure based on a flat table.

Usage

update_according_to(ft, sdb, star, sdb_operations)

## S3 method for class 'flat_table'
update_according_to(ft, sdb, star = 1, sdb_operations = NULL)

Arguments

ft

A flat_table object.

sdb

A star_database object with defined modification operations.

star

A string or integer, star database name or index in constellation.

sdb_operations

A star_database object with new defined modification operations.

Value

A star_database_update object.

Examples


f1 <- flat_table('ft_num', ft_cause_rpd) |>
  as_star_database(mrs_cause_schema_rpd) |>
  replace_attribute_values(
    name = "When Available",
    old = c('1962', '11', '1962-03-14'),
    new = c('1962', '3', '1962-01-15')
  ) |>
  group_dimension_instances(name = "When")
f2 <- flat_table('ft_num2', ft_cause_rpd) |>
  update_according_to(f1)

Census of US States, by sex and age

Description

Census of US States, by sex and age, obtained from the United States Census Bureau (USCB), American Community Survey (ACS). Obtained from the variables defined in reports, classifying the concepts according to the defined subjects.

Usage

us_census_state

Format

A tibble.

Details

U.S. Census Bureau. “Government Units: US and State: Census Years 1942 - 2022.” Public Sector, PUB Public Sector Annual Surveys and Census of Governments, Table CG00ORG01, 2022, https://data.census.gov/table/GOVSTIMESERIES.CG00ORG01?q=census+state+year. Accessed on October 25, 2023.

Source

https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-data.2021.html

Geographic layer of US States

Description

Geographic layer with data from the States of the USA in polygon format, with simplified geometry so that it takes up less space.

Usage

us_layer_state

Format

A sf.

Details

It has been obtained from the geographic data included in the US census prepared by the U.S. Census Bureau.

Source

https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-data.2021.html

Validate attribute names

Description

Validate attribute names

Usage

validate_attributes(defined_attributes, attributes, repeated = FALSE)

Arguments

defined_attributes

A vector of strings, defined attribute names.

attributes

A vector of strings, new attribute names.

repeated

A boolean, repeated attributes allowed.

Value

A vector of strings, attribute names.

Validate dimension attributes

Description

Validate dimension attributes

Usage

validate_dimension_attributes(db, dimension, attributes)

Arguments

db

A star_database object.

dimension

A dimension name.

attributes

Attribute names.

Value

A vector of strings, attribute names.

Validate dimension names

Description

Validate dimension names

Usage

validate_dimension_names(db, name)

Arguments

db

A star_database object.

name

A vector of strings, dimension names.

Value

A vector of strings, dimension names.

Validate fact names

Description

Validate fact names

Usage

validate_facts(defined_facts, facts)

Arguments

defined_facts

A vector of strings, defined fact names.

facts

A vector of strings, fact names.

Value

A vector of strings, fact names.

Validate lookup parameters

Description

Validate lookup parameters

Usage

validate_lookup_parameters(ft, fk_attributes, lookup)

Arguments

ft

A flat_table object.

fk_attributes

A vector of strings, attribute names.

lookup

A flat_table object.

Value

A vector of strings, fk attribute names.

Validate measure names

Description

Validate measure names

Usage

validate_measures(defined_measures, measures)

Arguments

defined_measures

A vector of strings, defined measure names.

measures

A vector of strings, measure names.

Value

A vector of strings, measure names.

Validate names

Description

Validate names

Usage

validate_names(defined_names, names, concept = "name", repeated = FALSE)

Arguments

defined_names

A vector of strings, defined attribute names.

names

A vector of strings, new attribute names.

concept

A string, treated concept.

repeated

A boolean, repeated names allowed.

Value

A vector of strings, names.

Vector to string for presentation

Description

Vector to string for presentation

Usage

vector_presentation(vector)

Arguments

vector

A vector

Value

A string

Transforms a vector of strings into a string.

Description

Transforms a vector of strings into a string.

Usage

vector_to_string(vector)

Arguments

vector

A vector of strings.

Value

A string.
