Type: Package
Title: Multidimensional Queries Enriched with Geographic Data
Version: 1.2.2
Description: Multidimensional systems allow complex queries to be carried out in an easy way. The geographical dimension, together with the temporal dimension, plays a fundamental role in multidimensional systems. Through this package, vector geographic data layers can be associated to the attributes of geographic dimensions, so that the results of multidimensional queries can be obtained directly as vector layers. The multidimensional structures on which we can define the queries can be created from a flat table or imported directly using functions from this package.
License: MIT + file LICENSE
URL: https://josesamos.github.io/geomultistar/, https://github.com/josesamos/geomultistar
BugReports: https://github.com/josesamos/geomultistar/issues
Depends: R (≥ 2.10)
Imports: dplyr, generics, purrr, rlang, RSQLite, sf, tibble, tidyr, tidyselect
Suggests: knitr, pander, rmarkdown, testthat
VignetteBuilder: knitr
Encoding: UTF-8
Language: en-GB
LazyData: true
RoxygenNote: 7.3.2
NeedsCompilation: no
Packaged: 2024-07-29 04:49:37 UTC; joses
Author: Jose Samos [aut, cre], Universidad de Granada [cph]
Maintainer: Jose Samos <jsamos@ugr.es>
Repository: CRAN
Date/Publication: 2024-07-29 05:20:02 UTC

Add a dimension table to a multistar

Description

To add a dimension table to a multistar object, we must indicate the name that we give to the dimension, the tibble that contains the data and the name of the attribute corresponding to the table primary key.

Usage

add_dimension(
  ms,
  dimension_name = NULL,
  dimension_table = NULL,
  dimension_key = NULL,
  fact_name = NULL,
  fact_key = NULL,
  key_as_data = FALSE
)

## S3 method for class 'multistar'
add_dimension(
  ms,
  dimension_name = NULL,
  dimension_table = NULL,
  dimension_key = NULL,
  fact_name = NULL,
  fact_key = NULL,
  key_as_data = FALSE
)

Arguments

ms

A multistar object.

dimension_name

A string, name of dimension table.

dimension_table

A tibble, dimension table.

dimension_key

A string, name of the dimension primary key.

fact_name

A string, name of fact table.

fact_key

A string, name of the dimension foreign key.

key_as_data

A boolean, define the primary key as an attribute of the dimension accessible in queries?

Details

We cannot add a dimension without defining a correspondence with one of the multistar's fact tables. We have to define the name of the fact table and the name of its foreign key. The referential integrity of the instances of the facts is checked.

The attribute that is used as the primary key will no longer be accessible for queries (its function is considered to be exclusively related to facts). If you want to use it for queries, it must be explicitly indicated by the boolean parameter key_as_data.
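The referential integrity check mentioned above can be pictured with a small base R sketch. It is not the package's implementation: the tables `facts` and `dimension` and the helper `check_referential_integrity()` are hypothetical names introduced only for illustration.

```r
# Hypothetical fact and dimension tables (not package data).
facts <- data.frame(
  where_fk = c(1L, 2L, 2L, 3L),
  n_deaths = c(10, 20, 5, 7)
)
dimension <- data.frame(
  where_pk = c(1L, 2L, 3L),
  city = c("Boston", "Hartford", "Albany")
)

# Hypothetical helper: TRUE when every foreign key value in the facts
# has a matching primary key value in the dimension.
check_referential_integrity <- function(facts, fact_key, dimension, dimension_key) {
  all(facts[[fact_key]] %in% dimension[[dimension_key]])
}

ok <- check_referential_integrity(facts, "where_fk", dimension, "where_pk")

# A fact row pointing to a missing dimension instance breaks integrity.
bad_facts <- rbind(facts, data.frame(where_fk = 9L, n_deaths = 1))
bad <- check_referential_integrity(bad_facts, "where_fk", dimension, "where_pk")
```

In this sketch `ok` is TRUE for the original facts, while `bad` is FALSE because key 9 has no instance in the dimension.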

Value

A multistar.

See Also

Other multistar functions: add_facts(), multistar(), relate_dimension()

Examples


ms <- multistar() |>
  add_facts(
    fact_name = "mrs_age",
    fact_table = mrs_fact_age,
    measures = "n_deaths",
    nrow_agg = "count"
  ) |>
  add_facts(
    fact_name = "mrs_cause",
    fact_table = mrs_fact_cause,
    measures = c("pneumonia_and_influenza_deaths", "other_deaths"),
    nrow_agg = "nrow_agg"
  ) |>
  add_dimension(
    dimension_name = "where",
    dimension_table = mrs_where,
    dimension_key = "where_pk",
    fact_name = "mrs_age",
    fact_key = "where_fk"
  ) |>
  add_dimension(
    dimension_name = "when",
    dimension_table = mrs_when,
    dimension_key = "when_pk",
    fact_name = "mrs_age",
    fact_key = "when_fk",
    key_as_data = TRUE
  ) |>
  add_dimension(
    dimension_name = "who",
    dimension_table = mrs_who,
    dimension_key = "who_pk",
    fact_name = "mrs_age",
    fact_key = "who_fk"
  )


Add a fact table to a multistar

Description

To add a fact table to a multistar object, we must indicate the name that we give to the facts, the tibble that contains the data and a vector of attribute names corresponding to the measures.

Usage

add_facts(
  ms,
  fact_name = NULL,
  fact_table = NULL,
  measures = NULL,
  agg_functions = NULL,
  nrow_agg = "nrow_agg"
)

## S3 method for class 'multistar'
add_facts(
  ms,
  fact_name = NULL,
  fact_table = NULL,
  measures = NULL,
  agg_functions = NULL,
  nrow_agg = "nrow_agg"
)

Arguments

ms

A multistar object.

fact_name

A string, name of fact table.

fact_table

A tibble, fact table.

measures

A vector of measure names.

agg_functions

A vector of aggregation function names. If none is indicated, the default is SUM. Additionally they can be MAX or MIN.

nrow_agg

A string, measurement name for the number of rows aggregated. If it does not exist, it is added to the table.

Details

An aggregation function is required for each measurement. It can be SUM, MAX or MIN; the default is SUM. Mean is not included among the possible aggregation functions, because calculating the mean over subsets of the data does not necessarily yield the mean of the total data.

An additional measurement, nrow_agg, corresponding to the number of aggregated rows is always added which, together with SUM, allows us to obtain the mean if needed. As the value of this parameter, you can specify an attribute of the table or the name that you want to assign to it (if it does not exist, it is added to the table).
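The reasoning above can be checked with a few lines of base R. This is only an illustrative sketch with made-up data; the table `facts` and its columns are assumptions, not package objects.

```r
# Hypothetical data: one row per report, with the city it belongs to.
facts <- data.frame(
  city = c("A", "A", "A", "B"),
  n_deaths = c(10, 20, 30, 100)
)

# Aggregate per city, keeping SUM and the number of aggregated rows.
agg <- data.frame(
  city = c("A", "B"),
  n_deaths = as.numeric(tapply(facts$n_deaths, facts$city, sum)),
  nrow_agg = as.numeric(tapply(facts$n_deaths, facts$city, length))
)

# Averaging the per-city means does not give the overall mean ...
mean_of_means <- mean(agg$n_deaths / agg$nrow_agg)

# ... but SUM together with nrow_agg does.
overall_mean <- sum(agg$n_deaths) / sum(agg$nrow_agg)
```

Here `mean_of_means` differs from `mean(facts$n_deaths)`, whereas `overall_mean` matches it, which is why keeping the row count next to SUM is enough to recover the mean at any level of aggregation.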

Value

A multistar.

See Also

Other multistar functions: add_dimension(), multistar(), relate_dimension()

Examples


ms <- multistar() |>
  add_facts(
    fact_name = "mrs_age",
    fact_table = mrs_fact_age,
    measures = "n_deaths",
    nrow_agg = "count"
  ) |>
  add_facts(
    fact_name = "mrs_cause",
    fact_table = mrs_fact_cause,
    measures = c("pneumonia_and_influenza_deaths", "other_deaths"),
    nrow_agg = "nrow_agg"
  )


Add geodimension additional attributes

Description

Adds the geodimension attributes not explicitly selected in the query but included in the definition.

Usage

add_geodimension_additional_attributes(dq)

Arguments

dq

A dimensional_query object.

Value

A dimensional_query object.


Default attribute

Description

Get the attribute with the most instances, if there is more than one.

Usage

default_attribute(dq, dimension)

Arguments

dq

A dimensional_query object.

dimension

A string, name of the geographic dimension.

Value

A string, name of the attribute.


Define geographic attributes

Description

Defines geographic attributes in one of two ways: by associating the instances of attributes of the geographic dimension with the instances of a geographic layer, or by deriving them from the geometry of another previously defined geographic attribute. Multiple attributes can be specified in the attribute parameter.

Usage

define_geoattribute(
  gms,
  dimension = NULL,
  attribute = NULL,
  additional_attributes = NULL,
  from_layer = NULL,
  by = NULL,
  from_attribute = NULL
)

## S3 method for class 'geomultistar'
define_geoattribute(
  gms,
  dimension = NULL,
  attribute = NULL,
  additional_attributes = NULL,
  from_layer = NULL,
  by = NULL,
  from_attribute = NULL
)

Arguments

gms

A geomultistar object.

dimension

A string, dimension name.

attribute

A vector, attribute names.

additional_attributes

A vector, attribute names.

from_layer

An sf object.

by

A vector of correspondence between the attributes of the dimension and the attributes of the sf layer.

from_attribute

A string, attribute name.

Details

If defined from a layer (from_layer parameter), additionally the attributes used for the join between the tables (dimension and layer tables) must be indicated (by parameter).

If defined from another attribute, it should have a finer granularity, to obtain the result by grouping its instances.

If no value is indicated in the attribute parameter, the definition applies to all the attributes of the dimension that do not have a previous definition; they are obtained from the attribute indicated in the from_attribute parameter.
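The join between the dimension and the layer tables described above can be sketched in base R. This is not the package's code: the tables are made up, the geometry is replaced by a plain `geom_id` column, and the direction of the `by` vector (names are dimension attributes, values are layer columns) is an assumption borrowed from the usual convention for named join vectors.

```r
# Hypothetical dimension table and layer attribute table (geometry omitted).
where <- data.frame(
  city = c("Boston", "Springfield", "Springfield"),
  state = c("MA", "MA", "IL")
)
layer <- data.frame(
  name = c("Boston", "Springfield"),
  st = c("MA", "IL"),
  geom_id = c(101L, 102L)  # stands in for a geometry column
)

# Assumed convention: names are dimension attributes, values are layer columns.
by <- c("city" = "name", "state" = "st")

# Left join: keep every dimension instance, matched or not.
joined <- merge(where, layer, by.x = names(by), by.y = unname(by), all.x = TRUE)

# Dimension instances left without an associated geometry.
unmatched <- joined[is.na(joined$geom_id), c("city", "state")]
```

In this sketch, joining on both city and state keeps the two Springfield instances apart, and `unmatched` lists the instances that would still need a geometry.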

Value

A geomultistar object.

See Also

Other geo functions: geomultistar(), get_empty_geoinstances(), run_geoquery()

Examples


gms <- geomultistar(ms = ms_mrs, geodimension = "where") |>
  define_geoattribute(
    attribute = "city",
    from_layer = usa_cities,
    by = c("city" = "city", "state" = "state")
  )

gms <- gms |>
  define_geoattribute(attribute = c("region", "all_where"),
                      from_attribute = "city")

gms <- gms |>
  define_geoattribute(from_attribute = "city")

gms <- gms |>
  define_geoattribute(attribute = "all_where",
                      from_layer = usa_nation)


Define a geoattribute from another

Description

Define a geoattribute from another.

Usage

define_geoattribute_from_attribute(
  gms,
  dimension = NULL,
  attribute = NULL,
  from_attribute = NULL,
  additional_attributes = NULL
)

Arguments

gms

A geomultistar object.

dimension

A string, dimension name.

attribute

A string, attribute name.

from_attribute

A string, attribute name.

additional_attributes

A vector, attribute names.

Value

A geomultistar object.


Define an attribute from a layer

Description

Define an attribute from a layer.

Usage

define_geoattribute_from_layer(
  gms,
  dimension = NULL,
  attribute = NULL,
  from_layer = NULL,
  by = NULL
)

Arguments

gms

A geomultistar object.

dimension

A string, dimension name.

attribute

A string, attribute name.

from_layer

An sf object.

by

A vector of correspondence between the attributes of the dimension and the attributes of the sf layer.

Value

A geomultistar object.


Define selected dimensions

Description

Include the selected dimensions and only the selected attributes in them.

Usage

define_selected_dimensions(dq)

Arguments

dq

A dimensional_query object.

Value

A dimensional_query object.


Define selected facts

Description

Measure names are stored as the names of the columns with the aggregation functions.

Usage

define_selected_facts(dq)

Arguments

dq

A dimensional_query object.

Value

A dimensional_query object.


Delete unused foreign keys

Description

In facts, remove foreign keys from dimensions not included in the result.

Usage

delete_unused_foreign_keys(dq)

Arguments

dq

A dimensional_query object.

Value

A dimensional_query object.


Dereference a dimension

Description

Given a dimension, transform the fact table so that the primary key of the dimension (which is a foreign key in the fact table) is replaced by the other attributes of the dimension.

Usage

dereference_dimension(ft, dimension, conversion = TRUE)

Arguments

ft

A fact_table object.

dimension

A dimension_table object.

conversion

A boolean, indicates whether the attributes need to be transformed.

Value

A fact_table object.
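
As a rough picture of what dereferencing a dimension means, the following base R sketch replaces a foreign key in a fact table with the remaining attributes of the dimension. The tables and column names are hypothetical, and plain data frames are used instead of fact_table and dimension_table objects.

```r
# Hypothetical fact and dimension tables.
facts <- data.frame(
  when_fk = c(1L, 2L, 1L),
  n_deaths = c(10, 20, 30)
)
when <- data.frame(
  when_pk = c(1L, 2L),
  year = c("1962", "1962"),
  week = c("01", "02")
)

# Join on the key, then drop the key so that only the other attributes remain.
joined <- merge(facts, when, by.x = "when_fk", by.y = "when_pk")
dereferenced <- joined[, setdiff(names(joined), "when_fk")]
```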


dimensional_query S3 class

Description

An empty dimensional_query object is created where you can select fact measures, dimension attributes and filter dimension rows.

Usage

dimensional_query(ms = NULL)

Arguments

ms

A multistar object.

Value

A dimensional_query object.

See Also

Other query functions: filter_dimension(), run_query(), select_dimension(), select_fact()

Examples


# ms_mrs <- ct_mrs |>
#  constellation_as_multistar()

# dq <- dimensional_query(ms_mrs)


Filter dimension

Description

Allows you to define selection conditions for dimension rows.

Usage

filter_dimension(dq, name = NULL, ...)

## S3 method for class 'dimensional_query'
filter_dimension(dq, name = NULL, ...)

Arguments

dq

A dimensional_query object.

name

A string, name of the dimension.

...

Conditions, defined in exactly the same way as in dplyr::filter.

Details

Conditions can be defined on any attribute of the dimension (not only on attributes selected in the query for the dimension). The selection is made based on the function dplyr::filter. Conditions are defined in exactly the same way as in that function.

Value

A dimensional_query object.

See Also

Other query functions: dimensional_query(), run_query(), select_dimension(), select_fact()

Examples


dq <- dimensional_query(ms_mrs) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", city == "Boston")


Filter selected instances

Description

For some dimensions, the instances to include have already been determined: we have their primary key values. Both the facts and the dimensions are filtered by these values.

Usage

filter_selected_instances(dq)

Arguments

dq

A dimensional_query object.

Value

A dimensional_query object.


geomultistar S3 class

Description

A geomultistar object is created. Dimensions that contain geographic information are indicated.

Usage

geomultistar(ms = NULL, geodimension = NULL)

Arguments

ms

A multistar structure.

geodimension

A vector of dimension names.

Value

A geomultistar object.

See Also

Other geo functions: define_geoattribute(), get_empty_geoinstances(), run_geoquery()

Examples


# gms <- geomultistar(ms = ms_mrs, geodimension = "where")


Get empty instances of a geographic attribute

Description

Gets the instances of the given geographic attribute that do not have a geometry associated with them.

Usage

get_empty_geoinstances(gms, dimension = NULL, attribute = NULL)

## S3 method for class 'geomultistar'
get_empty_geoinstances(gms, dimension = NULL, attribute = NULL)

Arguments

gms

A geomultistar object.

dimension

A string, dimension name.

attribute

A string, attribute name.

Value

A sf object.

See Also

Other geo functions: define_geoattribute(), geomultistar(), run_geoquery()

Examples


gms <- geomultistar(ms = ms_mrs, geodimension = "where") |>
  define_geoattribute(
    attribute = "city",
    from_layer = usa_cities,
    by = c("city" = "city", "state" = "state")
  )

empty <- gms |>
  get_empty_geoinstances(attribute = "city")


get_selected_measure_names

Description

Get the names of the measures selected in the query.

Usage

get_selected_measure_names(dq, ft)

Arguments

dq

A dimensional_query object.

ft

A tibble object.

Value

A vector of names.


Group facts

Description

Once the foreign keys have been replaced where necessary, group the rows of facts.

Usage

group_facts(dq)

Arguments

dq

A dimensional_query object.

Value

A dimensional_query object.


Group the records in the table

Description

Group the records in the table using the aggregation functions for the measurements.

Usage

group_table(ft)

Arguments

ft

A fact_table object.

Value

A fact_table object.
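
Grouping records with one aggregation function per measurement can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the table, the measure names and the lookup list `fun_by_name` are hypothetical.

```r
# Hypothetical fact rows with their measures.
facts <- data.frame(
  city = c("A", "A", "B"),
  n_deaths = c(10, 20, 5),
  max_temp = c(3, 8, 6),
  nrow_agg = c(1, 1, 1)
)
# One aggregation function name per measure.
agg_functions <- c(n_deaths = "SUM", max_temp = "MAX", nrow_agg = "SUM")

# Map the function names used in this manual to base R functions.
fun_by_name <- list(SUM = sum, MAX = max, MIN = min)

# Aggregate each measure with its own function, grouped by city.
groups <- split(facts, facts$city)
grouped <- do.call(rbind, lapply(names(groups), function(g) {
  values <- lapply(names(agg_functions), function(m) {
    fun_by_name[[agg_functions[[m]]]](groups[[g]][[m]])
  })
  names(values) <- names(agg_functions)
  data.frame(city = g, values)
}))
```

Note that the row count measure is itself aggregated with SUM, so it keeps representing the number of original rows behind each grouped row.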


Mortality Reporting System by Age Test

Description

Selection of data from the 122 Cities Mortality Reporting System by age group, for the first 3 weeks of 1962.

Usage

mrs_age_test

Format

A tibble.

Details

The original dataset begins in 1962. For each week, in 122 US cities, mortality figures by age group and cause, considered separately, are included (i.e., the combination of age group and cause is not included). In the cause, only a distinction is made between pneumonia or influenza and others.

Two additional dates have been generated, which were not present in the original dataset.

Source

https://catalog.data.gov/dataset/deaths-in-122-u-s-cities-1962-2016-122-cities-mortality-reporting-system


Fact age

Description

Fact age table of the Mortality Reporting System, defined from ms_mrs. The foreign keys have been renamed, only one when dimension has been considered, and the type for the when dimension has been changed.

Usage

mrs_fact_age

Format

A tibble.

Source

https://CRAN.R-project.org/package=starschemar


Fact cause

Description

Fact cause table of the Mortality Reporting System, defined from ms_mrs. The foreign keys have been renamed, only one when dimension has been considered, and the type for the when dimension has been changed.

Usage

mrs_fact_cause

Format

A tibble.

Source

https://CRAN.R-project.org/package=starschemar


Dimension when

Description

When dimension table of the Mortality Reporting System. Defined from ms_mrs. The primary key has been renamed and its type has been changed. The other attributes have also been renamed.

Usage

mrs_when

Format

A tibble.

Source

https://CRAN.R-project.org/package=starschemar


Dimension where

Description

Where dimension table of the Mortality Reporting System. Defined from ms_mrs. The primary key has been renamed.

Usage

mrs_where

Format

A tibble.

Source

https://CRAN.R-project.org/package=starschemar


Dimension who

Description

Who dimension table of the Mortality Reporting System. Defined from ms_mrs. The primary key has been renamed.

Usage

mrs_who

Format

A tibble.

Source

https://CRAN.R-project.org/package=starschemar


Multistar for Mortality Reporting System

Description

Multistar for the Mortality Reporting System considering age and cause classification.

Usage

ms_mrs

Format

A multistar object.

Source

https://CRAN.R-project.org/package=starschemar


Multistar for Mortality Reporting System Test

Description

Multistar for the Mortality Reporting System considering age and cause classification data test.

Usage

ms_mrs_test

Format

A multistar object.

Source

https://CRAN.R-project.org/package=starschemar


multistar S3 class

Description

Creates an empty multistar object that allows you to import fact and dimension tables.

Usage

multistar()

Value

A multistar object.

See Also

Other multistar functions: add_dimension(), add_facts(), relate_dimension()

Examples


ms <- multistar()


Export a multistar as a flat table

Description

We can obtain a flat table, implemented using a tibble, from a multistar (which can be the result of a query). If it only has one fact table, it is not necessary to provide its name.

Usage

multistar_as_flat_table(ms, fact = NULL)

## S3 method for class 'multistar'
multistar_as_flat_table(ms, fact = NULL)

Arguments

ms

A multistar object.

fact

A string, name of the fact.

Value

A tibble.

Examples


ft <- ms_mrs |>
  multistar_as_flat_table(fact = "mrs_age")

ms <- dimensional_query(ms_mrs) |>
  select_dimension(name = "where",
                   attributes = c("city", "state")) |>
  select_dimension(name = "when",
                   attributes = c("when_happened_year")) |>
  select_fact(name = "mrs_age",
              measures = c("n_deaths")) |>
  select_fact(
    name = "mrs_cause",
    measures = c("pneumonia_and_influenza_deaths", "other_deaths")
  ) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", city == "Boston") |>
  run_query()

ft <- ms |>
  multistar_as_flat_table()


Name with nexus

Description

Given a name, if it ends in "/" the nexus is the empty string, otherwise it is "/". Add the nexus.

Usage

name_with_nexus(name)

Arguments

name

A string.

Value

A string.
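
The behaviour described above can be reproduced with a short base R sketch. The function name `with_nexus()` is hypothetical and chosen so that it does not clash with the package's internal function.

```r
# Hypothetical re-implementation of the behaviour described above.
with_nexus <- function(name) {
  nexus <- if (endsWith(name, "/")) "" else "/"
  paste0(name, nexus)
}

a <- with_nexus("data")    # no trailing slash: one is added
b <- with_nexus("data/")   # already ends in "/": left unchanged
```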


dimensional_query S3 class

Description

Internal low-level constructor that creates new objects with the correct structure.

Usage

new_dimensional_query(ms = NULL)

Value

A dimensional_query object.


fact_table S3 class

Description

Internal low-level constructor that creates new objects with the correct structure.

Usage

new_fact_table(
  ft = tibble::tibble(),
  name = NULL,
  measures = NULL,
  agg_functions = NULL,
  nrow_agg = NULL
)

Arguments

ft

A tibble, contains the fact table.

name

A string, name of the fact.

measures

A vector of measurement names.

agg_functions

A vector of aggregation function names.

nrow_agg

A string, measurement name for the number of rows aggregated.

Value

A fact_table object.


geomultistar S3 class

Description

Internal low-level constructor that creates new objects with the correct structure.

Usage

new_geomultistar(ms = NULL, geodimension = NULL)

Arguments

ms

A multistar structure.

geodimension

A vector of dimension names.

Value

A geomultistar object.


multistar S3 class

Description

Internal low-level constructor that creates new objects with the correct structure.

Usage

new_multistar(fl = list(), dl = list())

Arguments

fl

A fact_table list.

dl

A dimension_table list.

Details

It only distinguishes between general and conformed dimensions; each dimension has its own data. It can contain multiple fact tables.

Value

A multistar object.


multistar S3 class

Description

Internal low-level constructor that creates new objects with the correct structure.

Usage

new_multistar_empty()

Value

A multistar object.


Transform a tibble to join

Description

Transform all fields in a tibble to character type and replace the NA with a specific value.

Usage

prepare_join(tb)

Arguments

tb

A tibble.

Value

A tibble.
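
A minimal base R sketch of this kind of preparation is shown below. It works on a plain data frame, and both the function name `to_join_ready()` and the placeholder value are assumptions: the specific value used by the package is not stated in this manual.

```r
# Hypothetical placeholder; the value used by the package is not documented here.
na_placeholder <- "___UNKNOWN___"

to_join_ready <- function(tb) {
  # Convert every column to character.
  tb[] <- lapply(tb, as.character)
  # Replace missing values with the placeholder.
  tb[is.na(tb)] <- na_placeholder
  tb
}

tb <- data.frame(id = c(1L, NA), city = c("Boston", NA))
res <- to_join_ready(tb)
```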


Reference a dimension

Description

Given a dimension, transform the fact table so that the attributes of the dimension indicated as a parameter, which are in the fact table, are replaced by the other attributes of the dimension.

Usage

reference_dimension(ft, dimension, attributes, conversion = TRUE)

Arguments

ft

A fact_table object.

dimension

A dimension_table object.

attributes

A vector of attribute names, attributes used to reference the dimension.

conversion

A boolean, indicates whether the attributes need to be transformed.

Details

It is used to replace a set of attributes in the fact table with the generated key of the dimension.

If necessary, it is also used for the inverse operation: replace the generated key with the rest of attributes (dereference a dimension).

Value

A fact_table object.


Relate a dimension table to a fact table in a multistar

Description

When a dimension is added to a multistar, it can only be related to one fact table. With this function it can then be related to other fact tables in the multistar. The name of the fact table and its foreign key must be indicated. The referential integrity of the instances of the facts is checked.

Usage

relate_dimension(ms, dimension_name = NULL, fact_name = NULL, fact_key = NULL)

## S3 method for class 'multistar'
relate_dimension(ms, dimension_name = NULL, fact_name = NULL, fact_key = NULL)

Arguments

ms

A multistar object.

dimension_name

A string, name of dimension table.

fact_name

A string, name of fact table.

fact_key

A string, name of the dimension foreign key.

Value

A multistar.

See Also

Other multistar functions: add_dimension(), add_facts(), multistar()

Examples


ms <- multistar() |>
  add_facts(
    fact_name = "mrs_age",
    fact_table = mrs_fact_age,
    measures = "n_deaths",
    nrow_agg = "count"
  ) |>
  add_facts(
    fact_name = "mrs_cause",
    fact_table = mrs_fact_cause,
    measures = c("pneumonia_and_influenza_deaths", "other_deaths"),
    nrow_agg = "nrow_agg"
  ) |>
  add_dimension(
    dimension_name = "where",
    dimension_table = mrs_where,
    dimension_key = "where_pk",
    fact_name = "mrs_age",
    fact_key = "where_fk"
  ) |>
  add_dimension(
    dimension_name = "when",
    dimension_table = mrs_when,
    dimension_key = "when_pk",
    fact_name = "mrs_age",
    fact_key = "when_fk",
    key_as_data = TRUE
  ) |>
  add_dimension(
    dimension_name = "who",
    dimension_table = mrs_who,
    dimension_key = "who_pk",
    fact_name = "mrs_age",
    fact_key = "who_fk"
  ) |>
  relate_dimension(dimension_name = "where",
                   fact_name = "mrs_cause",
                   fact_key = "where_fk") |>
  relate_dimension(dimension_name = "when",
                   fact_name = "mrs_cause",
                   fact_key = "when_fk")


Remove duplicate dimension rows

Description

After selecting only a few columns of the dimensions, there may be rows with duplicate values. We eliminate duplicates and adapt facts to the new dimensions.

Usage

remove_duplicate_dimension_rows(dq)

Arguments

dq

A dimensional_query object.

Value

A dimensional_query object.
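
The idea can be sketched in base R with made-up tables: after keeping only some dimension columns, duplicated rows are collapsed, new keys are generated and the foreign key in the facts is rewritten. None of the names below come from the package.

```r
# Hypothetical dimension after keeping only the 'state' attribute.
where <- data.frame(
  where_pk = 1:3,
  state = c("MA", "MA", "CT")
)
facts <- data.frame(
  where_fk = c(1L, 2L, 3L, 3L),
  n_deaths = c(10, 20, 5, 7)
)

# One row per distinct attribute combination, with a new key.
new_where <- unique(where["state"])
new_where$where_pk <- seq_len(nrow(new_where))

# Map each old key to the new key and rewrite the foreign key in the facts.
key_map <- merge(where, new_where, by = "state", suffixes = c("_old", "_new"))
facts$where_fk <- key_map$where_pk_new[match(facts$where_fk, key_map$where_pk_old)]
```

After this step several fact rows may point to the same dimension row, so they are candidates for grouping.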


Get a geographic vector from a query

Description

After defining a query and geographic dimensions, run the query and select the geographic data associated with it to get a geographic data layer as the result.

Usage

run_geoquery(
  dq,
  unify_by_grain = TRUE,
  fact = NULL,
  dimension = NULL,
  attribute = NULL,
  wider = FALSE
)

## S3 method for class 'dimensional_query'
run_geoquery(
  dq,
  unify_by_grain = TRUE,
  fact = NULL,
  dimension = NULL,
  attribute = NULL,
  wider = FALSE
)

Arguments

dq

A dimensional_query object.

unify_by_grain

A boolean, unify facts with the same grain.

fact

A string, name of the fact.

dimension

A string, name of the geographic dimension.

attribute

A string, name of the geographic attribute to consider.

wider

A boolean, avoid repeating geographic data.

Details

If there are several fact tables, we can optionally indicate that facts with the same grain should not be unified.

If the result only has one fact table, it is not necessary to provide its name. Nor is it necessary to indicate the name of the geographic dimension if there is only one available.

If no attribute is specified, the geographic attribute of the result with finer granularity is selected.

In geographic layers, geographic objects are not repeated. The tables are wide: for each object the rest of the attributes are defined as columns. By means of the parameter wider we can indicate that we want a result of this type.

Value

A sf object.

See Also

Other geo functions: define_geoattribute(), geomultistar(), get_empty_geoinstances()

Examples


gms <- geomultistar(ms = ms_mrs, geodimension = "where") |>
  define_geoattribute(
    attribute = "city",
    from_layer = usa_cities,
    by = c("city" = "city", "state" = "state")
  )  |>
  define_geoattribute(
    attribute = "state",
    from_layer = usa_states,
    by = c("state" = "state")
  ) |>
  define_geoattribute(attribute = "region",
                      from_attribute = "state") |>
  define_geoattribute(attribute = "all_where",
                      from_layer = usa_nation)

gdq <- dimensional_query(gms) |>
  select_dimension(name = "where",
                   attributes = c("state", "city")) |>
  select_dimension(name = "when",
                   attributes = c("when_happened_year", "when_happened_week")) |>
  select_fact(
    name = "mrs_age",
    measures = c("n_deaths")
  ) |>
  select_fact(name = "mrs_cause",
              measures = c("pneumonia_and_influenza_deaths", "other_deaths")) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", state == "MA")

sf <- gdq |>
  run_geoquery()

sfw <- gdq |>
  run_geoquery(wider = TRUE)


Run query

Description

Once we have selected the facts, dimensions and defined the conditions on the instances, we can execute the query to obtain the result.

Usage

run_query(dq, unify_by_grain = TRUE)

## S3 method for class 'dimensional_query'
run_query(dq, unify_by_grain = TRUE)

Arguments

dq

A dimensional_query object.

unify_by_grain

A boolean, unify facts with the same grain.

Details

Optionally, we can indicate that facts with the same grain should not be unified.
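To illustrate what unifying facts with the same grain amounts to, the following base R sketch joins two hypothetical query results that share the same dimension attributes. The data frames are invented for the example and the package may proceed differently internally.

```r
# Hypothetical results for two facts that share the same grain (city, year).
age <- data.frame(city = "Boston", year = "1962", n_deaths = 100)
cause <- data.frame(
  city = "Boston", year = "1962",
  pneumonia_and_influenza_deaths = 12, other_deaths = 88
)

# Same grain: the shared attributes identify rows in both tables,
# so the measures can be placed side by side in a single table.
grain <- intersect(names(age), names(cause))
unified <- merge(age, cause, by = grain)
```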

Value

A dimensional_query object.

See Also

Other query functions: dimensional_query(), filter_dimension(), select_dimension(), select_fact()

Examples


ms <- dimensional_query(ms_mrs) |>
  select_dimension(name = "where",
                   attributes = c("city", "state")) |>
  select_dimension(name = "when",
                   attributes = c("when_happened_year")) |>
  select_fact(
    name = "mrs_age",
    measures = c("n_deaths"),
    agg_functions = c("MAX")
  ) |>
  select_fact(
    name = "mrs_cause",
    measures = c("pneumonia_and_influenza_deaths", "other_deaths")
  ) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", city == "Boston") |>
  run_query()


Save as geopackage

Description

Save the result of a geoquery in a geopackage. The result can be a layer in the form of a flat table or a list consisting of a layer and a description table of the variables.

Usage

save_as_geopackage(sf, layer_name, file_name = NULL, filepath = NULL)

Arguments

sf

A tibble or a list of tibble objects.

layer_name

A string.

file_name

A string.

filepath

A string.

Value

A tibble or a list of tibble objects.

Examples


gms <- geomultistar(ms = ms_mrs, geodimension = "where") |>
  define_geoattribute(
    attribute = "city",
    from_layer = usa_cities,
    by = c("city" = "city", "state" = "state")
  )  |>
  define_geoattribute(
    attribute = "state",
    from_layer = usa_states,
    by = c("state" = "state")
  ) |>
  define_geoattribute(attribute = "region",
                      from_attribute = "state") |>
  define_geoattribute(attribute = "all_where",
                      from_layer = usa_nation)

gdq <- dimensional_query(gms) |>
  select_dimension(name = "where",
                   attributes = c("state", "city")) |>
  select_dimension(name = "when",
                   attributes = c("when_happened_year", "when_happened_week")) |>
  select_fact(
    name = "mrs_age",
    measures = c("n_deaths")
  ) |>
  select_fact(name = "mrs_cause",
              measures = c("pneumonia_and_influenza_deaths", "other_deaths")) |>
  filter_dimension(name = "when", when_happened_week <= "03") |>
  filter_dimension(name = "where", state == "MA")

sf <- gdq |>
  run_geoquery(wider = TRUE)

save_as_geopackage(sf, "city", filepath = tempdir())


Select dimension

Description

To add a dimension in a dimensional_query object, we have to define its name and a subset of the dimension attributes. If only the name of the dimension is indicated, it is considered that all its attributes should be added.

Usage

select_dimension(dq, name = NULL, attributes = NULL)

## S3 method for class 'dimensional_query'
select_dimension(dq, name = NULL, attributes = NULL)

Arguments

dq

A dimensional_query object.

name

A string, name of the dimension.

attributes

A vector of attribute names.

Value

A dimensional_query object.

See Also

Other query functions: dimensional_query(), filter_dimension(), run_query(), select_fact()

Examples


dq <- dimensional_query(ms_mrs) |>
  select_dimension(name = "where",
                   attributes = c("city", "state")) |>
  select_dimension(name = "when")


Select fact

Description

To define the fact to be consulted, we indicate its name and, optionally, a vector of names of selected measures and another of aggregation functions.

Usage

select_fact(dq, name = NULL, measures = NULL, agg_functions = NULL)

## S3 method for class 'dimensional_query'
select_fact(dq, name = NULL, measures = NULL, agg_functions = NULL)

Arguments

dq

A dimensional_query object.

name

A string, name of the fact.

measures

A vector of measure names.

agg_functions

A vector of aggregation function names. If none is indicated, those defined in the fact table are considered.

Details

If no measure name is indicated, only the measure corresponding to the number of aggregated rows is included (it is always included).

If no aggregation function is included, those defined for the measures are considered.

Value

A dimensional_query object.

See Also

Other query functions: dimensional_query(), filter_dimension(), run_query(), select_dimension()

Examples


dq <- dimensional_query(ms_mrs) |>
  select_fact(
    name = "mrs_age",
    measures = c("n_deaths"),
    agg_functions = c("MAX")
  )

dq <- dimensional_query(ms_mrs) |>
  select_fact(name = "mrs_age",
              measures = c("n_deaths"))

dq <- dimensional_query(ms_mrs) |>
  select_fact(name = "mrs_age")


Star Schema for Mortality Reporting System by Age Test

Description

Star Schema for the Mortality Reporting System considering the age classification data test.

Usage

st_mrs_age_test

Format

A star_schema object.

Source

https://CRAN.R-project.org/package=starschemar


UK London Boroughs

Description

From the original dataset, some fields have been selected and renamed.

Usage

uk_london_boroughs

Format

A sf.

Details

Since not so much detail is needed, the geometry has been simplified 20 m.

Source

https://data.london.gov.uk/dataset/statistical-gis-boundary-files-london


Unify facts by grain

Description

Unify facts by grain

Usage

unify_facts_by_grain(dq)

Arguments

dq

A dimensional_query object.

Value

A dimensional_query object.


USA Cities, 2014

Description

From the original dataset, some fields have been selected and renamed, and only the Mortality Reporting System cities are included.

Usage

usa_cities

Format

A sf.

Source

https://earthworks.stanford.edu/catalog/stanford-bx729wr3020


USA Counties, 2018

Description

From the original dataset, some fields have been selected and renamed, and only the Mortality Reporting System counties are included.

Usage

usa_counties

Format

A sf.

Details

Some counties appear with the same name repeated within the same state: Baltimore, MD; Richmond, VA; St. Louis, MO. Since they are accessed by name (county and state), those of the same name within the state have been grouped together.

Source

https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_county_20m.zip


USA Divisions, 2018

Description

From the original dataset, some fields have been selected and renamed.

Usage

usa_divisions

Format

A sf.

Source

https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_division_20m.zip


USA Nation, 2018

Description

From the original dataset, some fields have been selected and renamed.

Usage

usa_nation

Format

A sf.

Source

https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_nation_20m.zip


USA Regions, 2018

Description

From the original dataset, some fields have been selected and renamed.

Usage

usa_regions

Format

A sf.

Source

https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_region_20m.zip


USA States, 2018

Description

From the original dataset, some fields have been selected and renamed, and only the Mortality Reporting System states are included.

Usage

usa_states

Format

A sf.

Source

https://www2.census.gov/geo/tiger/GENZ2018/shp/cb_2018_us_state_20m.zip


Validate names

Description

Validate names

Usage

validate_names(defined_names, names, concept = "name", repeated = FALSE)

Arguments

defined_names

A vector of strings, defined attribute names.

names

A vector of strings, new attribute names.

concept

A string, treated concept.

repeated

A boolean, repeated names allowed.

Value

A vector of strings, names.


widen_flat_table

Description

Makes the pk fields the primary key of the table. The rest of the fields, which prevent them from being the primary key, are extracted to another table. The original table is widened by combining the extracted fields with the measures.

Usage

widen_flat_table(ft, pk, measures)

Arguments

ft

A tibble object.

pk

A vector of names.

measures

A vector of names.

Value

A list of tibble objects.
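
A rough base R picture of this widening, with a single measure and made-up data, is given below. It is a sketch under stated assumptions rather than the package's algorithm: the identifiers `v1`, `v2` and the use of `reshape()` are choices made only for this example.

```r
# Hypothetical long table: one row per city and week.
ft <- data.frame(
  city = c("Boston", "Boston", "Lowell", "Lowell"),
  week = c("01", "02", "01", "02"),
  n_deaths = c(10, 12, 3, 4)
)
pk <- "city"
measures <- "n_deaths"
# Fields that prevent pk from being a primary key.
other <- setdiff(names(ft), c(pk, measures))

# Table describing the extracted field combinations, each with an id.
variables <- unique(ft[other])
variables$id <- paste0("v", seq_len(nrow(variables)))

# Widen: one row per pk value, one column per (measure, combination).
long <- merge(ft, variables, by = other)
wide <- reshape(
  long[c(pk, "id", measures)],
  idvar = pk, timevar = "id", direction = "wide"
)
```

The result is a pair of tables: `wide`, with one row per city and one column per measure and combination, and `variables`, which describes what each generated column stands for.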