Version: | 0.33.0 |
Date: | 2025-06-26 |
Title: | Methods for Statistical Disclosure Control in Tabular Data |
Description: | Methods for statistical disclosure control in tabular data such as primary and secondary cell suppression as described for example in Hundepool et al. (2012) <doi:10.1002/9781118348239> are covered in this package. |
URL: | https://github.com/sdcTools/sdcTable |
BugReports: | https://github.com/sdcTools/userSupport/issues |
Depends: | R (≥ 3.5.0), Rcpp (≥ 0.11.0), sdcHierarchies (≥ 0.19.1) |
Imports: | data.table, knitr, rlang, stringr, methods, slam, progress, utils, Matrix (≥ 1.3-0), SSBtools, highs |
Suggests: | testthat (≥ 0.3), rmarkdown, webshot, digest, RegSDC |
LinkingTo: | Rcpp |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
LazyData: | true |
SystemRequirements: | GLPK library, including -dev or -devel part |
Encoding: | UTF-8 |
VignetteBuilder: | knitr |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | yes |
Packaged: | 2025-06-26 07:30:32 UTC; meindl |
Author: | Bernhard Meindl [aut, cre] |
Maintainer: | Bernhard Meindl <bernhard.meindl@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-06-26 08:50:02 UTC |
argusVersion
Description
Returns the version and build number of a given tau-argus executable specified in argument exe.
Usage
argusVersion(exe, verbose = FALSE)
Arguments
exe |
a path to a tau-argus executable |
verbose |
(logical) if |
Value
a list with two elements being the tau-argus version and the build-number.
Examples
## Not run:
argusVersion(exe="C:\\Tau\\TauArgus.exe", verbose=TRUE)
## End(Not run)
Attacking primary suppressed cells
Description
Function attack() computes lower and upper bounds for cells of a given sdcProblem instance by solving the attacker's problem. All calculations use the current suppression pattern.
Usage
attack(object, to_attack = NULL, verbose = FALSE, ...)
Arguments
object |
an object of class 'sdcProblem' |
to_attack |
if 'NULL' all current primary suppressed cells are attacked; otherwise either an integerish (indices) or character-vector (str-ids) of the cells that should be attacked. |
verbose |
a logical scalar determining if additional output should be displayed |
... |
placeholder for possible additional input, currently unused |
Value
a 'data.frame' with the following columns:
- 'prim_supps': index of primary suppressed cells
- 'status': the original sdc-status code
- 'val': the original value of the cell
- 'low': computed lower bound of the attacker's problem
- 'up': computed upper bound of the attacker's problem
- 'protected': shows if a given cell is adequately protected
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Examples
## Not run:
dims <- list(
v1 = sdcHierarchies::hier_create("tot", letters[1:4]),
v2 = sdcHierarchies::hier_create("tot", letters[5:8])
)
N <- 150
df <- data.frame(
v1 = sample(letters[1:4], N, replace = TRUE),
v2 = sample(letters[5:8], N, replace = TRUE)
)
sdc <- makeProblem(data = df, dimList = dims)
# set primary suppressions
specs <- data.frame(
v1 = c("a", "b", "a"),
v2 = c("e", "e", "f")
)
sdc <- change_cellstatus(sdc, specs = specs, rule = "u")
# attack all primary sensitive cells
# the cells can be recomputed exactly
attack(sdc, to_attack = NULL)
# protect the table and attack again
sdc <- protectTable(sdc, method = "SIMPLEHEURISTIC")
attack(sdc, to_attack = NULL)
# attack only selected cells
attack(sdc, to_attack = c(7, 12))
## End(Not run)
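The comment in the example above notes that unprotected primary suppressions can be recomputed exactly. The following base-R sketch illustrates why, using a hypothetical one-row table instead of an sdcProblem object. The names row_total, published and bounds are made up for illustration, and cell values are assumed to be non-negative.

```r
# hypothetical published row: the total is known, cell "b" is suppressed (NA)
row_total <- 20
published <- c(a = 5, b = NA, c = 7)

# with a single suppressed cell, the attacker recovers it by subtraction
recovered_b <- row_total - sum(published, na.rm = TRUE)
recovered_b

# with a second suppressed cell in the same row (and no other constraints),
# only an interval is known, assuming cell values cannot be negative
published2 <- c(a = NA, b = NA, c = 7)
rest <- row_total - sum(published2, na.rm = TRUE)
bounds <- data.frame(
  cell = c("a", "b"),
  low = 0,
  up = rest
)
bounds
```

In a real table, column totals and nested hierarchies add further constraints, which is why attack() solves a linear problem per cell instead of relying on a single subtraction.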
perform calculations on cutList-objects depending on argument type
Description
perform calculations on cutList-objects depending on argument type
Usage
calc.cutList(object, type, input)
## S4 method for signature 'cutList,character,list'
calc.cutList(object, type, input)
Arguments
object |
an object of class cutList |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are:
- strengthen: strengthen constraints in argument object
- checkViolation: check if a given solution violates any constraints in argument object
- bindTogether: combine two cutList-objects
|
input |
a list depending on argument type:
- type==strengthen: input is not used (empty list)
- type==checkViolation: input is a list of length 2
  - first element: numeric vector specifying a solution to a linear problem
  - second element: numeric vector specifying weights
- type==bindTogether: input is a list of length 1
  - first element: object of class cutList
|
Value
manipulated data based on argument type
- an object of class cutList if argument type matches 'strengthen' or 'bindTogether'
- a logical vector of length 1 if argument type matches 'checkViolation' with TRUE if at least one constraint is violated by the given solution
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
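Examples
Because calc.cutList() is internal, the following base-R sketch only illustrates the idea behind type 'checkViolation': a solution violates a set of constraints if at least one row of the constraint system does not hold. The plain matrix and the helper violates_any() are hypothetical stand-ins and not part of the package; the weights passed as second input element are ignored here.

```r
# hypothetical stand-ins for the contents of a cutList object
con <- matrix(c(1, 1, 0,
                0, 1, 1), nrow = 2, byrow = TRUE)
direction <- c("<=", ">=")
rhs <- c(1, 1)

# hypothetical helper: TRUE if at least one constraint is violated
violates_any <- function(con, direction, rhs, sol, tol = 1e-9) {
  lhs <- as.vector(con %*% sol)
  ok <- mapply(function(l, d, r) {
    switch(d,
      "==" = abs(l - r) <= tol,
      "<=" = l <= r + tol,
      ">=" = l >= r - tol,
      "<"  = l < r,
      ">"  = l > r
    )
  }, lhs, direction, rhs)
  any(!ok)
}

# both constraints hold: 1 <= 1 and 1 >= 1
violates_any(con, direction, rhs, sol = c(1, 0, 1))
# the first constraint is violated: 2 <= 1 does not hold
violates_any(con, direction, rhs, sol = c(1, 1, 0))
```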
modify dimVar-objects depending on argument type
Description
modify dimVar-objects depending on argument type
Usage
calc.dimVar(object, type, input)
## S4 method for signature 'dimVar,character,character'
calc.dimVar(object, type, input)
Arguments
object |
an object of class dimVar |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are:
- hasDefaultCodes: calculates if a vector of codes (specified by argument input) corresponds to default codes in object
- matchCodeOrig: obtain default|standard codes for a vector of original codes specified by argument input
- matchCodeDefault: obtain original codes for a vector of default|standard codes specified by argument input
- standardize: perform standardization of level-codes (temporarily removing duplicates, ...)
- requiredMinimalCodes: calculate a set of minimal codes that are required to calculate a specific (sub)total specified by argument input
|
input |
a character vector |
Value
information from object depending on type
- a character vector if type matches 'matchCodeOrig', 'matchCodeDefault', 'standardize' or 'requiredMinimalCodes'
- a logical vector of length 1 if type matches 'hasDefaultCodes' being TRUE if argument input are default codes and FALSE otherwise
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
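Examples
The code-matching types can be pictured as lookups between two parallel code vectors. The sketch below uses plain character vectors instead of a dimVar object; the vectors, the two-digit default code format and the helper names are assumptions made for illustration only.

```r
# hypothetical code tables for one dimensional variable
codes_orig    <- c("Total", "A", "B", "C")
codes_default <- c("00", "01", "02", "03")

# original -> default codes (cf. type 'matchCodeOrig')
orig_to_default <- function(x) codes_default[match(x, codes_orig)]
# default -> original codes (cf. type 'matchCodeDefault')
default_to_orig <- function(x) codes_orig[match(x, codes_default)]
# are all supplied codes default codes? (cf. type 'hasDefaultCodes')
has_default_codes <- function(x) all(x %in% codes_default)

orig_to_default(c("B", "Total"))
default_to_orig(c("03", "01"))
has_default_codes(c("A", "B"))
```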
perform calculations on linProb-objects depending on argument type
Description
perform calculations on linProb-objects depending on argument type
Usage
calc.linProb(object, type, input)
## S4 method for signature 'linProb,character,list'
calc.linProb(object, type, input)
Arguments
object |
an object of class linProb |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are:
- solveProblem: solve the linear problem (minimize objective function)
- fixVariables: try to fix objective variables to 0|1 based on dual costs depending on input
|
input |
a list depending on argument type:
- type==solveProblem: a list of length 1
  - first element: character vector of length 1 specifying the solver to use
- type==fixVariables: a list of length 3
  - first element: numeric vector specifying lower bounds for the objective variables
  - second element: numeric vector specifying upper bounds for the objective variables
  - third element: numeric vector specifying indices of primary suppressed cells
|
Value
manipulated data based on argument type
- list containing the solution and additional information if argument type matches 'solveProblem'
- a numeric vector of indices if argument type matches 'fixVariables'
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
perform calculations on multiple objects depending on argument type
Description
perform calculations on multiple objects depending on argument type
Usage
calc.multiple(type, input)
## S4 method for signature 'character,list'
calc.multiple(type, input)
Arguments
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are:
|
input |
a list depending on argument
|
Value
manipulated data based on argument type
- list with elements 'groups', 'indices', 'strIDs', 'nrGroups' and 'nrTables' if argument type matches 'makePartitions'
- object of class linProb if argument type matches 'makeAttackerProblem'
- object of class sdcProblem if argument type matches 'calcFullProblem'
Note
internal functions/methods
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
perform calculations on problemInstance-objects depending on argument type
Description
perform calculations on problemInstance-objects depending on argument type
Usage
calc.problemInstance(object, type, input)
## S4 method for signature 'problemInstance,character,list'
calc.problemInstance(object, type, input)
Arguments
object |
an object of class problemInstance |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are:
- makeMasterProblem: create the master problem that is the core of the secondary cell suppression problem
- isProtectedSolution: check if a solution violates any required (upper|lower|sliding) protection levels
|
input |
a list depending on argument type:
- type==makeMasterProblem: input is not used (empty list)
- type==isProtectedSolution: input is a list of length 2 with elements 'input1' and 'input2'
  - element 'input1': numeric vector of calculated known lower cell bounds (from attacker's problem)
  - element 'input2': numeric vector of known upper cell bounds (from attacker's problem)
|
Value
information from objects of class problemInstance depending on argument type
- an object of class linProb if argument type matches 'makeMasterProblem'
- logical vector of length 1 if argument type matches 'isProtectedSolution' with TRUE if all primary suppressed cells are adequately protected, FALSE otherwise
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
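Examples
The check behind type 'isProtectedSolution' compares the bounds an attacker can derive with the required protection levels. The sketch below follows the usual textbook formulation (lower, upper and sliding protection levels) on plain numeric vectors; all values and variable names are hypothetical, and the package's internal implementation may differ in detail.

```r
# hypothetical values for three primary suppressed cells
val <- c(10, 25, 40)   # true cell values
lpl <- c(2, 5, 4)      # required lower protection levels
upl <- c(2, 5, 4)      # required upper protection levels
spl <- c(0, 0, 10)     # required sliding protection levels

# bounds an attacker could derive (cf. 'input1' and 'input2')
low <- c(7, 22, 35)
up  <- c(13, 28, 44)

# a cell is protected if the interval [low, up] is wide enough on both
# sides of the true value and wide enough overall
is_protected <- (val - low >= lpl) & (up - val >= upl) & (up - low >= spl)
is_protected

# the solution as a whole is protected only if every cell is
all(is_protected)
```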
perform calculations on sdcProblem-objects depending on argument type
Description
perform calculations on sdcProblem-objects depending on argument type
Usage
calc.sdcProblem(object, type, input)
## S4 method for signature 'sdcProblem,character,list'
calc.sdcProblem(object, type, input)
Arguments
object |
an object of class sdcProblem |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are:
- rule.freq: modify suppression status within object according to frequency suppression rule
- heuristicSolution: obtain a heuristic (greedy) solution to the problem defined by object
- cutAndBranch: solve a secondary cell suppression problem defined by object using cut and branch
- anonWorker: is used to solve the suppression problem depending on information provided with argument input
- ghmiter: solve a secondary cell suppression problem defined by object using hypercube algorithm
- preprocess: perform a preprocess procedure by trying to identify primary suppressed cells that are already protected due to other primary suppressed cells
- cellID: find index of cell defined by information provided with argument input
- finalize: create an object of class safeObj
- ghmiter.diagObj: calculate codes required to identify diagonal cells given a valid cell code - used for ghmiter-algorithm only
- ghmiter.calcInformation: calculate information for quaders identified by diagonal indices - used for ghmiter-algorithm only
- ghmiter.suppressQuader: suppress a quader based on indices
- ghmiter.selectQuader: select a quader for suppression depending on information provided with argument input - used for ghmiter-algorithm only
- ghmiter.suppressAdditionalQuader: select and suppress an additional quader (if required) based on information provided with argument input - used for ghmiter-algorithm only
- contributingIndices: calculate indices within the current problem that contribute to a given cell
- reduceProblem: reduce the problem given by object using a vector of indices
- genStructuralCuts: calculate cuts that are absolutely necessary for a valid solution of the secondary cell suppression problem
|
input |
a list depending on argument type:
- a list (typically generated using genParaObj()) specifying parameters for primary cell suppression if argument type matches 'rule.freq'
- a list if argument type matches 'heuristicSolution' having the following elements:
  - element 'aProb': an object of class linProb defining the attacker's problem
  - element 'validCuts': an object of class cutList representing a list of constraints
  - element 'solver': a character vector of length 1 specifying a solver to use
  - element 'verbose': a logical vector of length 1 setting if verbose output is desired
- a list (typically generated using genParaObj()) specifying parameters for the secondary cell suppression problem if argument type matches 'cutAndBranch', 'anonWorker', 'ghmiter', 'preprocess'
- a list of length 3 if argument type matches 'cellID' having following elements
  - first element: character vector specifying variable names that need to exist in slot 'dimInfo' of object
  - second element: character vector specifying codes for each variable that define a specific table cell
  - third element: logical vector of length 1 with TRUE setting verbosity and FALSE to turn verbose output off
- a list of length 3 if argument type matches 'ghmiter.diagObj' having following elements
  - first element: numeric vector of length 1
  - second element: a list with as many elements as dimensional variables have been specified and each element being a character vector of dimension-variable specific codes
  - third element: logical vector of length 1 defining if diagonal indices with frequency == 0 should be allowed or not
- a list of length 4 if argument type matches 'ghmiter.calcInformation' having following elements
  - first element: a list object typically generated with method calc.sdcProblem and type=='ghmiter.diagObj'
  - second element: a list with as many elements as dimensional variables have been specified and each element being a character vector of dimension-variable specific codes
  - third element: numeric vector of length 1 specifying a desired protection level
  - fourth element: logical vector of length 1 defining if quader containing empty cells should be allowed or not
- a list of length 1 if argument type matches 'ghmiter.suppressQuader' having following element
  - first element: numeric vector of indices that should be suppressed
- a list of length 2 if argument type matches 'ghmiter.selectQuader' having following elements
  - first element: a list object typically generated with method calc.sdcProblem and type=='ghmiter.calcInformation'
  - second element: a list (typically generated using genParaObj())
- a list of length 4 if argument type matches 'ghmiter.suppressAdditionalQuader' having following elements
  - first element: a list object typically generated with method calc.sdcProblem and type=='ghmiter.diagObj'
  - second element: a list object typically generated with method calc.sdcProblem and type=='ghmiter.calcInformation'
  - third element: a list object typically generated with method calc.sdcProblem and type=='ghmiter.selectQuader'
  - fourth element: a list (typically generated using genParaObj())
- a list of length 1 if argument type matches 'contributingIndices' having following element
  - first element: character vector of length 1 being an ID for which contributing indices should be calculated
- a list of length 1 if argument type matches 'reduceProblem' having following element
  - first element: numeric vector defining indices of cells that should be kept in the reduced problem
- an empty list if argument type matches 'genStructuralCuts'
|
Value
information from objects of class sdcProblem depending on argument type
- an object of class sdcProblem if argument type matches 'rule.freq', 'cutAndBranch', 'anonWorker', 'ghmiter', 'ghmiter.suppressQuader', 'ghmiter.suppressAdditionalQuader' or 'reduceProblem'
- a numeric vector with elements being 0 or 1 if argument type matches 'heuristicSolution'
- a list if argument type matches 'preprocess' having following elements:
  - element 'sdcProblem': an object of class sdcProblem
  - element 'aProb': an object of class linProb
  - element 'validCuts': an object of class cutList
- a numeric vector of length 1 specifying the index of the cell of interest if argument type matches 'cellID'
- an object of class safeObj if argument type matches 'finalize'
- a list if argument type matches 'ghmiter.diagObj' having following elements:
  - element 'cellToProtect': character vector of length 1 defining the ID of the cell to protect
  - element 'indToProtect': numeric vector of length 1 defining the index of the cell to protect
  - element 'diagIndices': numeric vector defining indices of possible cells defining cubes
- a list containing information about each quader that could possibly be suppressed if argument type matches 'ghmiter.calcInformation'
- a list containing information about a single quader that should be suppressed if argument type matches 'ghmiter.selectQuader'
- a numeric vector with indices that contribute to the desired table cell if argument type matches 'contributingIndices'
- an object of class cutList if argument type matches 'genStructuralCuts'
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
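Examples
The idea behind type 'rule.freq' is a threshold rule on cell frequencies. The sketch below applies such a rule to a plain named vector instead of an sdcProblem object. The threshold name max_n, the example frequencies and the treatment of empty cells as publishable are assumptions for illustration; the status codes 'u' (primary suppressed) and 's' (publishable) are the ones accepted by change_cellstatus().

```r
# hypothetical cell frequencies and threshold
freq  <- c(total = 40, a = 2, b = 0, c = 15, d = 23)
max_n <- 5

# cells with a positive frequency of at most max_n become primary
# suppressed ('u'); all other cells stay publishable ('s')
status <- ifelse(freq > 0 & freq <= max_n, "u", "s")
status
```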
modify simpleTriplet-objects depending on argument type
Description
modify simpleTriplet-objects depending on argument type
Usage
calc.simpleTriplet(object, type, input)
## S4 method for signature 'simpleTriplet,character,list'
calc.simpleTriplet(object, type, input)
Arguments
object |
an object of class simpleTriplet |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are:
- removeRow: remove a row with given index from object
- removeCol: remove a column with given index from object
- addRow: add a row to object
- addCol: add a column to object
- modifyRow: change specified row of object
- modifyCol: change specified column of object
- modifyCell: change specified cell of object
- bind: bind two objects of class simpleTriplet together
|
input |
a list depending on argument type:
- type==removeRow: input is a list of length 1
  - first element: numeric vector of length 1 defining the index of the row that should be removed
- type==removeCol: input is a list of length 1
  - first element: numeric vector of length 1 defining the index of the column that should be removed
- type==addRow: input is a list of length 2
  - first element: numeric vector of column-indices
  - second element: numeric vector defining the cell-values of the row that will be added
- type==addCol: input is a list of length 2
  - first element: numeric vector of row-indices
  - second element: numeric vector defining the cell-values of the column that will be added
- type==modifyRow: input is a list of length 3
  - first element: numeric vector of length 1 specifying the row-index of the row that will be modified
  - second element: numeric vector specifying the column-indices that should be modified
  - third element: numeric vector defining values that should be set in the given row
- type==modifyCol: input is a list of length 3
  - first element: numeric vector specifying the row-indices that should be modified
  - second element: numeric vector of length 1 specifying the column-index of the column that will be modified
  - third element: numeric vector defining values that should be set in the given column
- type==modifyCell: input is a list of length 3
  - first element: numeric vector of length 1 defining the column-index
  - second element: numeric vector of length 1 defining the row-index
  - third element: numeric vector of length 1 holding the value that should be set in the given cell
- type==bind: input is a list of length 2
  - first element: an object of class simpleTriplet
  - second element: a logical vector of length 1 being TRUE if a 'rbind' or FALSE if a 'cbind' should be done
|
Value
an object of class simpleTriplet
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
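Examples
A triplet representation stores only the non-zero cells of a sparse matrix as (row, column, value) entries. The base-R sketch below mimics the 'addRow' operation on a plain list; the field names i, j, v, nr and nc and the helpers add_row() and to_dense() are hypothetical and do not refer to the slots or methods of the package's simpleTriplet class.

```r
# hypothetical triplet storage of a 2 x 3 matrix
st <- list(
  i = c(1, 1, 2),    # row indices of non-zero cells
  j = c(1, 3, 2),    # column indices of non-zero cells
  v = c(1, -1, 1),   # cell values
  nr = 2, nc = 3     # matrix dimensions
)

# append a row given column indices and the corresponding values
add_row <- function(st, index, values) {
  stopifnot(length(index) == length(values),
            all(index >= 1), all(index <= st$nc))
  st$nr <- st$nr + 1
  st$i <- c(st$i, rep(st$nr, length(index)))
  st$j <- c(st$j, index)
  st$v <- c(st$v, values)
  st
}

# expand the triplets into an ordinary dense matrix for inspection
to_dense <- function(st) {
  m <- matrix(0, nrow = st$nr, ncol = st$nc)
  m[cbind(st$i, st$j)] <- st$v
  m
}

st2 <- add_row(st, index = c(2, 3), values = c(1, 1))
to_dense(st2)
```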
Get information about specific cells
Description
Function cell_info() can be used to query information about one or more cells
of a sdcProblem object. If the instance has already been protected
using protectTable(), the information is retrieved from the final protected
dataset, otherwise from the current state of the instance.
Usage
cell_info(object, specs, ...)
Arguments
object |
an object of class sdcProblem |
specs |
input that defines which cells to query; the function expects either (see examples below)
|
... |
additional parameters for potential future use, currently unused. |
Value
a data.frame with a row for each of the queried cells; the object
contains the following columns:
- id: numeric vector of length 1 specifying the numerical index of the cell
- a column strID if object has not yet been protected
- one column for each dimensional variable
- a column freq containing the cell-frequencies
- if available, one column for each (possible) numerical value that was tabulated
- a column sdcStatus with the current sdc code
- is_primsupp: is TRUE if the cell is a primary sensitive cell
- is_secondsupp: is TRUE if the cell is a secondary suppressed cell
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Examples
# as in makeProblem() with a single primary suppression
p <- sdc_testproblem(with_supps = TRUE)
sdcProb2df(p)
# vector input
specs_vec <- c(region = "D", gender = "male")
cell_info(p, specs = specs_vec)
# data.frame input
specs_df <- data.frame(
region = c("A", "D", "A"),
gender = c("male", "female", "female")
)
cell_info(p, specs = specs_df)
# protect the table
p_safe <- protectTable(p, method = "SIMPLEHEURISTIC")
# re-apply
cell_info(p_safe, specs = specs_df)
Change anonymization status of a specific cell
Description
Function change_cellstatus() changes the anonymization state
of single table cells for objects of class sdcProblem.
Usage
change_cellstatus(object, specs, rule, verbose = FALSE, ...)
Arguments
object |
an object of class sdcProblem |
specs |
input that defines which cells to query; the function expects either (see examples below)
|
rule |
scalar character vector specifying a valid anonymization code ('u', 'z', 'x', 's') to which all the desired cells under consideration should be set. |
verbose |
scalar logical value defining verbosity, defaults to |
... |
additional parameters for potential future use, currently unused. |
Value
a sdcProblem object
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Examples
# load example-problem
# (same as example from ?makeProblem)
p <- sdc_testproblem(with_supps = FALSE)
# goal: set cells with region = "D" and gender != "total" as primary sensitive
# using a data.frame as input
specs <- data.frame(
region = "D",
gender = c("male", "female", "total")
)
# marking the cells as sensitive
p <- change_cellstatus(
object = p,
specs = specs,
rule = "u"
)
# check
cell_info(p, specs = specs)
# using a named vector for a single cell to revert
# setting D/total as primary-sensitive
specs <- c(gender = "total", region = "D")
p <- change_cellstatus(
object = p,
specs = specs,
rule = "s"
)
# and check again
cell_info(p, specs = specs)
Compute contributing units to table cells
Description
This function computes (with respect to the raw input data) the indices of all
units contributing to the cells identified by ids.
Usage
contributing_indices(prob, ids = NULL)
Arguments
prob |
a sdcProblem object created with |
ids |
a character vector containing default ids (strIDs) that define table
cells. Valid inputs can be extracted by using |
Value
a named list where names correspond to the given ids and the values
to the row numbers within the raw input data.
Examples
# loading test problem
p <- sdc_testproblem(with_supps = FALSE)
dt <- sdcProb2df(p, dimCodes = "original")
# question: which units contribute to cell region = "A" and gender = "female"?
# look up the id of this cell
id <- dt[region == "A" & gender == "female", strID]
# which indices contribute to the cell?
ids <- contributing_indices(prob = p, ids = id)
# check
dataObj <- get.sdcProblem(p, "dataObj")
rawData <- slot(dataObj, "rawData")
rawData[ids[[id]]]
# compute contributing ids for all cells
contributing_indices(p)
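What "contributing units" means can also be shown without the package. The base-R sketch below uses a small hypothetical data.frame and a made-up helper contrib(); it returns the row numbers of the records that fall into a cell, where leaving out a dimension corresponds to its total.

```r
# hypothetical raw microdata with two dimensional variables
raw <- data.frame(
  region = c("A", "B", "A", "C", "A"),
  gender = c("female", "male", "male", "female", "female")
)

# row numbers of all records contributing to a cell;
# a dimension left as NULL is treated as its total
contrib <- function(raw, region = NULL, gender = NULL) {
  keep <- rep(TRUE, nrow(raw))
  if (!is.null(region)) keep <- keep & raw$region == region
  if (!is.null(gender)) keep <- keep & raw$gender == gender
  which(keep)
}

contrib(raw, region = "A", gender = "female")  # inner cell
contrib(raw, region = "A")                     # subtotal over gender
contrib(raw)                                   # grand total: every record
```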
Create input files for tauArgus
Description
create required input-files and batch-file for tau-argus given an sdcProblem object
Usage
createArgusInput(
obj,
typ = "microdata",
verbose = FALSE,
path = getwd(),
solver = "FREE",
method,
primSuppRules = NULL,
responsevar = NULL,
shadowvar = NULL,
costvar = NULL,
requestvar = NULL,
holdingvar = NULL,
...
)
Arguments
obj |
an object of class sdcProblem from |
typ |
(character) either |
verbose |
(logical) if TRUE, the contents of the batch-file are written to the prompt |
path |
path, into which (temporary) files will be written to (amongst them being the batch-files). Each file written to this folder belonging to the same problem contains a random id in its filename. |
solver |
which solver should be used. allowed choices are
In case |
method |
secondary cell suppression algorithm, possible choices include:
|
primSuppRules |
rules for primary suppression, provided as a
|
responsevar |
which variable should be tabulated (defaults to frequencies). For details see tau-argus manual section 4.4.4. |
shadowvar |
if specified, this variable is used to apply the safety rules, defaults to |
costvar |
if specified, this variable describes the costs of suppressing each individual cell. For details see tau-argus manual section 4.4.4. |
requestvar |
if specified, this variable (0/1-coded) contains information about records that request protection. Records with 1 will be protected in case a corresponding request rule matches. It is ignored, if tabular input is used. |
holdingvar |
if specified, this variable contains information about records that should be grouped together. It is ignored, if tabular input is used. |
... |
allows to specify additional parameters for selected suppression-method as described above
as well as |
Value
the filepath to the batch-file
Examples
## Not run:
# loading micro data from sdcTable
utils::data("microdata1", package="sdcTable")
microdata1$num1 <- rnorm(mean = 100, sd = 25, nrow(microdata1))
microdata1$num2 <- round(rnorm(mean = 500, sd=125, nrow(microdata1)),2)
microdata1$weight <- sample(10:100, nrow(microdata1), replace = TRUE)
dim_region <- hier_create(root = "Total", nodes = LETTERS[1:4])
dim_region_dupl <- hier_create(root = "Total", nodes = LETTERS[1:4])
dim_region_dupl <- hier_add(dim_region_dupl, root = "B", nodes = c("b1"))
dim_region_dupl <- hier_add(dim_region_dupl, root = "D", nodes = c("d1"))
dim_gender <- hier_create(root = "Total", nodes = c("male", "female"))
dimList <- list(region = dim_region, gender = dim_gender)
dimList_dupl <- list(region = dim_region_dupl, gender = dim_gender)
dimVarInd <- 1:2
numVarInd <- 3:5
sampWeightInd <- 6
# creating an object of class sdcProblem
obj <- makeProblem(
data = microdata1,
dimList = dimList,
dimVarInd = dimVarInd,
numVarInd = numVarInd,
sampWeightInd = sampWeightInd)
# creating an object of class sdcProblem containing "duplicated" codes
obj_dupl <- makeProblem(
data = microdata1,
dimList = dimList_dupl,
dimVarInd = dimVarInd,
numVarInd = numVarInd,
sampWeightInd = sampWeightInd)
## create primary suppression rules
primSuppRules <- list()
primSuppRules[[1]] <- list(type = "freq", n = 5, rg = 20)
primSuppRules[[2]] <- list(type = "p", n = 5, p = 20)
# other supported formats are:
# list(type = "nk", n=5, k=20)
# list(type = "zero", rg = 5)
# list(type = "mis", val = 1)
# list(type = "wgt", val = 1)
# list(type = "man", val = 20)
## create batchInput object
bO_md1 <- createArgusInput(
obj = obj,
typ = "microdata",
path = tempdir(),
solver = "FREE",
method = "OPT",
primSuppRules = primSuppRules,
responsevar = "num1")
bO_td1 <- createArgusInput(
obj = obj,
typ = "tabular",
path = tempdir(),
solver = "FREE",
method = "OPT")
bO_td2 <- createArgusInput(
obj = obj_dupl,
typ = "tabular",
path = tempdir(),
solver = "FREE",
method = "OPT")
## in case CPLEX should be used, it is required to specify argument licensefile
bO_md2 <- createArgusInput(
obj = obj,
typ = "microdata",
path = tempdir(),
solver = "CPLEX",
method = "OPT",
primSuppRules = primSuppRules,
responsevar = "num1",
licensefile = "/path/to/my/cplexlicense")
## End(Not run)
Create input for jj_format
Description
This function transforms a sdcProblem object into a list that can
be used as input for writeJJFormat() to write a problem in "JJ-format" to disk.
Usage
createJJFormat(x)
Arguments
x |
a sdcProblem object |
Value
an input suitable for writeJJFormat()
Author(s)
Bernhard Meindl (bernhard.meindl@statistik.gv.at) and Sapphire Yu Han (y.han@cbs.nl)
Examples
# setup example problem
# microdata
utils::data("microdata1", package = "sdcTable")
# create hierarchies
dims <- list(
region = sdcHierarchies::hier_create(root = "Total", nodes = LETTERS[1:4]),
gender = sdcHierarchies::hier_create(root = "Total", nodes = c("male", "female")))
# create a problem instance
p <- makeProblem(
data = microdata1,
dimList = dims,
numVarInd = "val")
# create suitable input for `writeJJFormat`
inp <- createJJFormat(p); inp
# write files to disk
# frequency table by default
writeJJFormat(
x = inp,
path = file.path(tempdir(), "prob_freqs.jj"),
overwrite = TRUE
)
# or using the numeric variable `val` previously specified
writeJJFormat(
x = inp,
tabvar = "val",
path = file.path(tempdir(), "prob_val.jj"),
overwrite = TRUE
)
Create input for RegSDC/other Tools
Description
This function transforms a sdcProblem object into an object that can be used as input for RegSDC::SuppressDec (among others).
Usage
createRegSDCInput(x, chk = FALSE)
Arguments
x |
a sdcProblem object |
chk |
a logical value deciding if computed linear relations should be additionally checked for validity |
Value
a list with the following elements:
- mat: linear combinations depending on inner-cells of the given problem instance.
- y: a 1-column matrix containing the frequencies of inner cells
- z: a 1-column matrix containing the frequencies of all cells
- z_supp: a 1-column matrix containing the frequencies of all cells but suppressed cells have a value of NA
- info: a data.frame with the following columns:
  - cell_id: internal cell-id used in sdcTable
  - is_innercell: a binary indicator if the cell is an internal cell (TRUE) or a (sub)total (FALSE)
Author(s)
Bernhard Meindl (bernhard.meindl@gmail.com)
Examples
## Not run:
utils::data("microdata1", package = "sdcTable")
head(microdata1)
# define the problem
dim_region <- hier_create(root = "total", nodes = sort(unique(microdata1$region)))
dim_gender <- hier_create(root = "total", nodes = sort(unique(microdata1$gender)))
prob <- makeProblem(
data = microdata1,
dimList = list(region = dim_region, gender = dim_gender),
freqVarInd = NULL
)
# suppress some cells
prob <- primarySuppression(prob, type = "freq", maxN = 15)
# compute input for RegSDC-package
inp_regsdc <- createRegSDCInput(x = prob, chk = TRUE)
# estimate inner cells based on linear dependencies
res_regsdc <- RegSDC::SuppressDec(
x = as.matrix(inp_regsdc$mat),
z = inp_regsdc$z_supp,
y = inp_regsdc$y)[, 1]
# check if inner cells are all protected
df <- data.frame(
freqs_orig = inp_regsdc$z[inp_regsdc$info$is_innercell == TRUE, ],
freqs_supp = inp_regsdc$z_supp[inp_regsdc$info$is_innercell == TRUE, ],
regsdc = res_regsdc
)
subset(df, df$regsdc == df$freqs_orig & is.na(freqs_supp))
## End(Not run)
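The relationship between the inner cells and all table cells can be sketched in base R for a tiny 2 x 2 table. The matrix layout (one row per inner cell, one column per table cell) and the convention z = t(x) %*% y are assumptions made for this illustration; the objects x, y, z and z_supp below are built by hand and are not output of createRegSDCInput().

```r
# hypothetical 2 x 2 table; inner cells in the order
# A/male, A/female, B/male, B/female
y <- matrix(c(3, 7, 2, 8), ncol = 1)

# one row per inner cell, one column per table cell:
# the four inner cells, then total A, total B and the grand total
x <- cbind(diag(4), c(1, 1, 0, 0), c(0, 0, 1, 1), rep(1, 4))

# all cell values follow from the inner cells
z <- t(x) %*% y

# suppress inner cell A/male by setting it to NA
z_supp <- z
z_supp[1, 1] <- NA

# a column containing a single 1 refers to an inner cell
is_innercell <- colSums(x) == 1
data.frame(z = z[, 1], z_supp = z_supp[, 1], is_innercell)
```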
Create a hierarchy
Description
create_node() is defunct, please use sdcHierarchies::hier_create()
add_nodes() is defunct, please use sdcHierarchies::hier_add()
delete_nodes() is defunct, please use sdcHierarchies::hier_delete()
rename_node() is defunct, please use sdcHierarchies::hier_rename()
cellInfo() is defunct, please use cell_info()
changeCellStatus() is defunct, please use change_cellstatus()
Usage
create_node(...)
add_nodes(...)
delete_nodes(...)
rename_node(...)
cellInfo(...)
changeCellStatus(...)
S4 class describing a cutList-object
Description
An object of class cutList holds constraints that can be extracted and
used for objects of class linProb. An object of class cutList consists
of a constraint matrix (slot con), a vector of directions (slot direction)
and a vector specifying the right hand sides of the constraints (slot rhs).
Details
- slot con: an object of class simpleTriplet specifying the constraint matrix of the problem
- slot direction: a character vector holding the directions of the constraints, allowed values are:
  - ==: equal
  - <: less
  - >: greater
  - <=: less or equal
  - >=: greater or equal
- slot rhs: numeric vector holding right hand side values of the constraints
Note
objects of class cutList are dynamically generated (and removed) during the cut and branch algorithm when solving the secondary cell suppression problem
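The following base-R sketch illustrates the idea of a growing and shrinking set of cuts. It uses a plain list as a hypothetical stand-in for the S4 class; the helper names empty_cuts(), add_cut() and remove_cut() are assumptions made for illustration and are not part of sdcTable.

```r
# Hypothetical stand-in for a cutList: a plain list with the three
# components described above (this is NOT the S4 class itself).
empty_cuts <- function(n_vars) {
  list(
    con = matrix(0, nrow = 0, ncol = n_vars), # constraint matrix
    direction = character(0),                 # one direction per row
    rhs = numeric(0)                          # one right hand side per row
  )
}

# Append a single constraint (one row of the constraint matrix).
add_cut <- function(cuts, row, direction, rhs) {
  stopifnot(direction %in% c("==", "<", ">", "<=", ">="))
  cuts$con <- unname(rbind(cuts$con, row))
  cuts$direction <- c(cuts$direction, direction)
  cuts$rhs <- c(cuts$rhs, rhs)
  cuts
}

# Drop the i-th constraint again.
remove_cut <- function(cuts, i) {
  cuts$con <- cuts$con[-i, , drop = FALSE]
  cuts$direction <- cuts$direction[-i]
  cuts$rhs <- cuts$rhs[-i]
  cuts
}

cuts <- empty_cuts(n_vars = 3)
cuts <- add_cut(cuts, row = c(1, 1, 0), direction = "<=", rhs = 1)
cuts <- add_cut(cuts, row = c(0, 1, 1), direction = ">=", rhs = 1)
cuts <- remove_cut(cuts, 1)
nrow(cuts$con) # number of constraints currently held
```

Keeping the three components aligned row by row is the only invariant this sketch maintains; the package itself stores the matrix in simple triplet form.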
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
S4 class describing a dataObj-object
Description
This class models a data object containing the 'raw' data for a given problem as well as information on the position of the dimensional variables, the count variable, additional numerical variables, weights or sampling weights within the raw data. Slot 'isMicroData' indicates whether slot 'rawData' consists of microdata (multiple observations for each cell are possible, isMicroData==TRUE) or whether the data have already been aggregated (isMicroData==FALSE).
Details
- slot rawData: list with each element being a vector of either codes of dimensional variables, counts, weights that should be used for the secondary cell suppression problem, numerical variables or sampling weights.
- slot dimVarInd: numeric vector (or NULL) defining the indices of the dimensional variables within slot 'rawData'
- slot freqVarInd: numeric vector (or NULL) defining the indices of the frequency variables within slot 'rawData'
- slot numVarInd: numeric vector (or NULL) defining the indices of the numerical variables within slot 'rawData'
- slot weightVarInd: numeric vector (or NULL) defining the indices of the variables holding weights within slot 'rawData'
- slot sampWeightInd: numeric vector (or NULL) defining the indices of the variables holding sampling weights within slot 'rawData'
- slot isMicroData: logical vector of length 1 (or NULL) that is TRUE if slot 'rawData' are microData and FALSE otherwise
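To make the distinction between microdata and aggregated input concrete, here is a small base-R sketch. The data and the column name freq are made up for illustration; the sketch does not use the dataObj class itself.

```r
# Hypothetical microdata: one row per observed unit, so several
# rows can fall into the same table cell.
micro <- data.frame(
  region = c("A", "A", "B", "B", "B"),
  gender = c("m", "f", "m", "m", "f"),
  stringsAsFactors = FALSE
)

# Aggregated form: one row per cell plus a count column.
# (The column name "freq" is an assumption made for this sketch.)
agg <- as.data.frame(
  table(region = micro$region, gender = micro$gender),
  responseName = "freq",
  stringsAsFactors = FALSE
)

agg
```

With microdata as input, the counts are implied by the number of rows per cell; with aggregated input, a frequency variable has to be pointed to explicitly.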
Note
objects of class dataObj are input for slot dataObj in class sdcProblem
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
S4 class describing a dimInfo-object
Description
An object of class dimInfo holds all necessary information about the dimensional variables defining a hierarchical table that needs to be protected.
Details
- slot dimInfo: a list (or NULL) with all list elements being objects of class dimVar
- slot strID: a character vector (or NULL) defining IDs that identify each table cell. The IDs are based on (default) codes of the dimensional variables defining a cell.
- slot strInfo: a list object (or NULL) with each list element being a numeric vector of length 2 defining the start and end-digit that is allocated by the i-th dimensional variable in ID-codes available in slot strID
- slot vNames: a character vector (or NULL) defining the variable names of the dimensional variables defining the table structure
- slot posIndex: a numeric vector (or NULL) holding the position of the dimensional variables within slot rawData of class dataObj
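The relation between slots strID and strInfo can be sketched in base R as follows. The IDs, the digit positions and the helper split_str_id() are invented for illustration and only mirror the description above; they are not taken from a real sdcProblem.

```r
# Hypothetical cell IDs built from two dimensional variables, each
# contributing two digits of default codes.
str_id <- c("0000", "0001", "0100", "0101")

# Start and end digit allocated by each variable (assumed values).
str_info <- list(region = c(1, 2), gender = c(3, 4))

# Cut every ID into the per-variable default codes.
split_str_id <- function(str_id, str_info) {
  as.data.frame(
    lapply(str_info, function(pos) substr(str_id, pos[1], pos[2])),
    stringsAsFactors = FALSE
  )
}

codes <- split_str_id(str_id, str_info)
codes
```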
Note
objects of class dimInfo are input for slots in classes sdcProblem and safeObj
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
S4 class describing a dimVar-object
Description
An object of class dimVar holds all necessary information about a single dimensional variable such as original and standardized codes, the level-structure, the hierarchical structure, codes that may be (temporarily) removed when building the complete hierarchy (dups) and the upper-level codes that correspond to these duplicated codes (dupsUp).
Details
- slot codesOriginal: a character vector (or NULL) holding original variable codes
- slot codesDefault: a character vector (or NULL) holding standardized codes
- slot codesMinimal: a logical vector (or NULL) defining if a code is required to build the complete hierarchy or not (then the code is a (sub)total)
- slot vName: character vector of length 1 (or NULL) defining the variable name of the dimensional variable
- slot levels: a numeric vector (or NULL) defining the level structure. For each code the corresponding level is listed with the grand-total always having level==1
- slot structure: a numeric vector (or NULL) with length of the total number of levels. Each element shows how many digits the i-th level allocates within the standardized codes (note: level 1 always allocates exactly 1 digit in the standardized codes)
- slot dims: a list (or NULL) defining the hierarchical structure of the dimensional variable. Each list-element is a character vector with elements available in slot codesDefault and the first element always being a (sub)total and the remaining elements being the codes that contribute to the (sub)total
- slot dups: character vector (or NULL) showing original codes that are duplicates in the hierarchy and can be temporarily removed when building a table with this dimensional variable
- slot dupsUp: character vector (or NULL) with original codes that are the corresponding upper-levels to the codes that may be removed because they are duplicates and that are listed in slot dups
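As a rough illustration of how a level structure implies the (sub)total groups described for slot dims, consider the base-R sketch below. It works on made-up original codes rather than on standardized codes, and parent_of() is a hypothetical helper, so it should be read as a sketch of the idea and not as the package's algorithm.

```r
# Hypothetical hierarchy, listed top-down; the grand total has level 1.
lev   <- c(1, 2, 3, 3, 2)
codes <- c("Total", "A", "Aa", "Ab", "B")

# The parent of a code is the closest preceding code that sits
# exactly one level higher up.
parent_of <- function(lev, codes) {
  vapply(seq_along(codes), function(i) {
    if (lev[i] == 1) {
      return(NA_character_)
    }
    cand <- which(lev[seq_len(i - 1)] == lev[i] - 1)
    codes[max(cand)]
  }, character(1))
}

parents <- parent_of(lev, codes)

# Group children by parent and put the (sub)total first in each group.
children <- split(codes[!is.na(parents)], parents[!is.na(parents)])
dims <- Map(function(tot, kids) c(tot, kids), names(children), children)
dims
```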
Note
objects of class dimVar form the base for elements in slot dimInfo of class dimInfo.
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
query cutList-objects depending on argument type
Description
query cutList-objects depending on argument type
Usage
get.cutList(object, type)
## S4 method for signature 'cutList,character'
get.cutList(object, type)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
constraints: constraint matrix of object
direction: directions of the constraints
rhs: right hand side of the constraints
nrConstraints: total number of constraints
Value
information from objects of class cutList depending on argument type
object of class simpleTriplet if argument type matches 'constraints'
character vector if argument type matches 'direction'
numeric vector if argument type matches 'objective' or 'rhs'
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
query dataObj-objects depending on argument type
Description
query dataObj-objects depending on argument type
Usage
get.dataObj(object, type)
## S4 method for signature 'dataObj,character'
get.dataObj(object, type)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
rawData: raw input data
dimVarInd: indices of dimensional variables
freqVarInd: index of frequency variable
numVarInd: indices of numerical variables
weightVarInd: index of weight variable
sampWeightInd: index of variable holding sampling weights
isMicroData: does object consist of microdata?
numVarNames: variable names of numerical variables
freqVarName: variable name of frequency variable
varName: variable names of dimensional variables
Value
information from objects of class dataObj depending on argument type
a list if argument type matches 'rawData'
numeric vector if argument type matches 'dimVarInd', 'freqVarInd', 'numVarInd', 'weightVarInd' or 'sampWeightInd'
character vector if argument type matches 'numVarNames', 'freqVarName' or 'varName'
logical vector of length 1 if argument type matches 'isMicroData'
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
query dimInfo-objects depending on argument type
Description
query dimInfo-objects depending on argument type
Usage
get.dimInfo(object, type)
## S4 method for signature 'dimInfo,character'
get.dimInfo(object, type)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
strInfo: info on how many digits in the default codes each dimensional variable allocates
dimInfo: a list object with each slot containing an object of class dimVar
varName: variable names
strID: character vector of IDs defining table cells
posIndex: vector showing the index of the elements of dimInfo in the underlying data
Value
information from objects of class dimInfo depending on argument type
a list (or NULL) if argument type matches 'strInfo', 'dimInfo'
numeric vector (or NULL) if argument type matches 'posIndex'
character vector (or NULL) if argument type matches 'varName' or 'strID'
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
query dimVar-objects depending on argument type
Description
query dimVar-objects depending on argument type
Usage
get.dimVar(object, type)
## S4 method for signature 'dimVar,character'
get.dimVar(object, type)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
varName: variable name of the variable from which object was calculated
codesOriginal: original codes (as specified by the user)
codesDefault: calculated, default codes
codesMinimal: all codes required to calculate the complete hierarchy (no sub-totals)
levels: level-structure of the dimensional variable
structure: vector showing how many digits in the default codes are required for each level
dims: list showing the complete hierarchy of the dimensional variable
dups: vector of duplicated codes
dupsUp: vector of codes that are the 'upper' levels to which the codes in dups correspond
hasDuplicates: does the dimensional variable have codes that can be (temporarily) removed
nrLevels: the total number of levels of a dimensional variable
minimalCodesDefault: the standardized codes of the minimal set of required level-codes
Value
information from objects of class dimVar depending on argument type
a list if argument type matches 'dims'
numeric vector if argument type matches 'levels' or 'nrLevels'
character vector if argument type matches 'codesOriginal', 'codesDefault', 'vName', 'dups', 'dupsUp' or 'minimalCodesDefault'
logical vector of length 1 if argument type matches 'hasDuplicates'
a logical vector if argument type matches 'codesMinimal'
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
query linProb-objects depending on argument type
Description
query linProb-objects depending on argument type
Usage
get.linProb(object, type)
## S4 method for signature 'linProb,character'
get.linProb(object, type)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
constraints: constraint matrix of the linProb object
direction: directions of the constraints
rhs: right hand side of the constraints
objective: objective function
types: types of the objective variables
bounds: bounds of the objective variables
Value
information from objects of class linProb depending on type
an object of class simpleTriplet if type matches 'constraints'
a character vector if type matches 'direction' or 'types'
a numeric vector if type matches 'objective' or 'rhs'
a list with elements 'lower' and 'upper' if type matches 'bounds'
element 'lower': a list with the first element containing indices and the second element containing corresponding lower bounds
element 'upper': a list with the first element containing indices and the second element containing corresponding upper bounds
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
query problemInstance-objects depending on argument type
Description
query problemInstance-objects depending on argument type
Usage
get.problemInstance(object, type)
## S4 method for signature 'problemInstance,character'
get.problemInstance(object, type)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
strID: vector of unique IDs for each table cell
nrVars: total number of table cells
freq: vector of frequencies
w: a vector of weights used in the linear problem (or NULL)
numVars: a list containing numeric vectors containing values for numerical variables for each table cell (or NULL)
sdcStatus: a vector containing the suppression state for each cell (possible values are 'u': primary suppression, 'x': secondary suppression, 'z': forced for publication, 's': publishable cell, 'w': dummy cells that are considered only when applying the simple greedy heuristic to protect the table)
lb: lower bound assumed to be known by attackers for each table cell
ub: upper bound assumed to be known by attackers for each table cell
LPL: lower protection level required to protect table cells
UPL: upper protection level required to protect table cells
SPL: sliding protection level required to protect table cells
primSupps: vector of indices of primary sensitive cells
secondSupps: vector of indices of secondary suppressed cells
forcedCells: vector of indices of cells that must not be suppressed
hasPrimSupps: shows if object has primary suppressions or not
hasSecondSupps: shows if object has secondary suppressions or not
hasForcedCells: shows if object has cells that must not be suppressed
weight: gives the weight that is used in the suppression procedures
suppPattern: gives the current suppression pattern
Value
information from objects of class problemInstance depending on argument type
a list (or NULL) if argument type matches 'numVars'
numeric vector if argument type matches 'freq', 'lb', 'ub', 'LPL', 'UPL', 'SPL', 'weight', 'suppPattern'
numeric vector (or NULL) if argument type matches 'w', 'primSupps', 'secondSupps', 'forcedCells'
character vector if argument type matches 'strID', 'sdcStatus'
logical vector of length 1 if argument type matches 'hasPrimSupps', 'hasSecondSupps', 'hasForcedCells'
numerical vector of length 1 if argument type matches 'nrVars'
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
query sdcProblem-objects depending on argument type
Description
query sdcProblem-objects depending on argument type
Usage
get.sdcProblem(object, type)
## S4 method for signature 'sdcProblem,character'
get.sdcProblem(object, type)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
dataObj: a list containing the (raw) input data
problemInstance: return the current problem instance
partition: a list containing information on the subtables that are required to be protected as well as information on the processing order of the subtables
dimInfo: information on the variables defining the hierarchical table
indicesDealtWith: a set of indices that have already been dealt with during the protection algorithm
startI: current level at which subtables need to be protected (useful when restarting HITAS|HYPERCUBE)
startJ: current number of the subtable within a given level that needs to be protected (useful when restarting HITAS|HYPERCUBE)
innerAndMarginalCellInfo: for a given problem, get indices of inner- and marginal table cells
Value
information from objects of class sdcProblem depending on argument type
an object of class dataObj (or NULL) if type matches 'dataObj'
an object of class problemInstance (or NULL) if type matches 'problemInstance'
a list (or NULL) if argument type matches 'partition' containing the following elements:
element 'groups': list with each list-element being a character vector specifying a specific level-group
element 'indices': list with each list-element being a numeric vector defining indices of a subtable
element 'strIDs': list with each list-element being a character vector defining IDs of a subtable
element 'nrGroups': numeric vector of length 1 defining the total number of groups that have to be considered
element 'nrTables': numeric vector of length 1 defining the total number of subtables that have to be considered
a list (or NULL) if argument type matches 'innerAndMarginalCellInfo' containing the following elements:
element 'innerCells': character vector specifying IDs of inner cells
element 'totCells': character vector specifying IDs of marginal cells
element 'indexInnerCells': numeric vector specifying indices of inner cells
element 'indexTotCells': numeric vector specifying indices of marginal cells
an object of class dimInfo (or NULL) if type matches 'dimInfo'
numeric vector of length 1 if argument type matches 'startI' or 'startJ'
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
query simpleTriplet-objects depending on argument type
Description
query simpleTriplet-objects depending on argument type
Usage
get.simpleTriplet(object, type, input)
## S4 method for signature 'simpleTriplet,character,list'
get.simpleTriplet(object, type, input)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
rowInd: extract all row-indices
colInd: extract all column-indices
values: extract all values
nrRows: return the number of rows of the input object
nrCols: return the number of columns of the input object
nrCells: return the number of cells (different from 0!)
duplicatedRows: return a numeric vector showing indices of duplicated rows
transpose: transpose input object and return the transposed matrix
getRow: return a specific row of input object
getCol: return a specific column of input object
input |
a list depending on argument |
type == 'getRow': input is a list of length 1
first element: numeric vector of length 1 defining index of row that is to be returned
type == 'getCol': input is a list of length 1
first element: numeric vector of length 1 defining index of column that is to be returned
else: input is not used at all (empty list)
Value
information from object depending on type
a numeric vector if type matches 'rowInd', 'colInd', 'values', 'nrRows', 'nrCols', 'nrCells' or 'duplicatedRows'
an object of class simpleTriplet if type matches 'transpose', 'getRow' or 'getCol'
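For readers unfamiliar with the simple triplet layout, the base-R sketch below shows what the queries rowInd, colInd, values and nrCells conceptually refer to. It builds a plain list from a dense matrix; the helpers to_triplet() and from_triplet() are hypothetical and do not use the simpleTriplet class.

```r
# A small dense matrix with two non-zero cells.
m <- matrix(c(1, 0,  0,
              0, 0, -1), nrow = 2, byrow = TRUE)

# Keep only the non-zero cells as (row, column, value) triplets.
to_triplet <- function(m) {
  idx <- which(m != 0, arr.ind = TRUE)
  list(
    rowInd = unname(idx[, "row"]),
    colInd = unname(idx[, "col"]),
    values = m[idx],
    nrRows = nrow(m),
    nrCols = ncol(m)
  )
}

# Rebuild the dense matrix from the triplets.
from_triplet <- function(st) {
  out <- matrix(0, nrow = st$nrRows, ncol = st$nrCols)
  out[cbind(st$rowInd, st$colInd)] <- st$values
  out
}

st <- to_triplet(m)
length(st$values)  # number of cells different from 0
from_triplet(st)   # same content as `m`
```

Storing only non-zero cells pays off for constraint matrices of hierarchical tables, which are typically very sparse.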
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Retrieve information in sdcProblem or problemInstance objects
Description
Function getInfo() is used to extract values from sdcProblem or problemInstance objects.
Usage
getInfo(object, type)
Arguments
object |
an object of class |
type |
a scalar character specifying the information which should be
returned. If
|
Value
manipulated data depending on arguments object and type
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Examples
# define an example problem with two hierarchies
p <- sdc_testproblem(with_supps = FALSE)
# apply primary suppression
p <- primarySuppression(p, type = "freq", maxN = 3)
# `p` is an `sdcProblem` object
print(class(p))
for (slot in c("lb", "ub", "LPL", "SPL", "UPL", "sdcStatus",
"freq", "strID", "numVars", "w")) {
message("slot: ", shQuote(slot))
print(getInfo(p, type = slot))
}
# protect the table and extract results
p_protected <- protectTable(p, method = "SIMPLEHEURISTIC")
for (slot in c("finalData", "nrNonDuplicatedCells", "nrPrimSupps",
"nrSecondSupps", "nrPublishableCells", "suppMethod")) {
message("slot: ", shQuote(slot))
print(getInfo(p_protected, type = slot))
}
Query information from protected problem instances
Description
get_safeobj() extracts information from protected sdcProblem instances.
Usage
get_safeobj(object, type, ...)
Arguments
object |
an object of class sdcProblem |
type |
a character vector defining what should be returned. Possible choices are:
|
... |
additional argument required for choices
|
Value
the required information.
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
initialize cutList-objects depending on argument type
Description
initialize cutList-objects depending on argument type
Usage
init.cutList(type, input)
## S4 method for signature 'character,list'
init.cutList(type, input)
Arguments
type |
a character vector of length 1 defining what|how to initialize. Allowed types are: |
empty: create an empty cutList-object
singleCut: create a cutList-object with exactly one constraint
multipleCuts: create a cutList-object with more than one constraint
input |
a list depending on argument |
type==empty: input is not used (empty list)
type==singleCut: input is a list of length 3
first element: numeric vector specifying the values for the row of the constraint matrix that must be created
second element: character vector of length 1 specifying the direction
third element: numeric vector of length 1 specifying the right hand side of the constraint
type==multipleCuts: input is a list of length 3
first element: object of class matrix
second element: character vector specifying the direction of the constraints
third element: numeric vector specifying the right hand side of the constraints
Value
an object of class cutList
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
initialize dataObj-objects
Description
initialize dataObj-objects
Usage
init.dataObj(input)
## S4 method for signature 'list'
init.dataObj(input)
Arguments
input |
a list with elements described below: |
element 'inputData': a list object holding data
element 'dimVarInd': index (within inputData) of variables that define the table to protect
element 'freqVarInd': index (within inputData) of variable holding frequencies
element 'numVarInd': index (within inputData) of numerical variables (or NULL)
element 'weightInd': index (within inputData) of variable holding weights (or NULL)
element 'sampWeightInd': index (within inputData) of variable holding sampling weights (or NULL)
element 'isMicroData': logical vector of length 1
Value
an object of class dataObj
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
initialize dimVar-object
Description
initialize dimVar-object
Usage
init.dimVar(input)
## S4 method for signature 'list'
init.dimVar(input)
Arguments
input |
a list with 2 elements |
first element: either an object of class 'matrix' or a data.frame or a link to a file. The input data need to be in a specific format (2 columns) with the first column defining the level-structure and the second column defining the level-codes.
second element: a character vector of length 1 specifying a variable name
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
initialize simpleTriplet-objects depending on argument type
Description
init.simpleTriplet should be used to create objects of class simpleTriplet. It is possible to create an object of class simpleTriplet from an existing matrix (using type=='simpleTriplet'). A positive (or negative) identity matrix stored as an object of class simpleTriplet can be created by specifying type=='simpleTripletDiag'.
Usage
init.simpleTriplet(type, input)
## S4 method for signature 'character,list'
init.simpleTriplet(type, input)
Arguments
type |
a character vector of length 1 defining what|how to initialize. Allowed types are: |
simpleTriplet: a simple triplet matrix
simpleTripletDiag: identity matrix
input |
a list depending on argument |
type == 'simpleTriplet': input is a list of length 1
first element: object of class 'matrix'
type == 'simpleTripletDiag': input is a list of length 2
first element: numeric vector of length 1 defining the desired number of rows of the identity matrix
second element: logical vector of length 1 being TRUE if a positive and FALSE if a negative identity matrix should be returned
Value
an object of class simpleTriplet
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
S4 class describing a linProb-object
Description
An object of class linProb defines a linear problem given by the objective coefficients (slot objective), a constraint matrix (slot constraints), the direction (slot direction) and the right hand side (slot rhs) of the constraints. Also, allowed lower (slot boundsLower) and upper (slot boundsUpper) bounds of the variables as well as their types (slot types) are specified.
Details
- slot objective: a numeric vector holding coefficients of the objective function
- slot constraints: an object of class simpleTriplet-class specifying the constraint matrix of the problem
- slot direction: a character vector holding the directions of the constraints, allowed values are:
  - ==: equal
  - <: less
  - >: greater
  - <=: less or equal
  - >=: greater or equal
- slot rhs: numeric vector holding right hand side values of the constraints
- slot boundsLower: a numeric vector holding lower bounds of the objective variables
- slot boundsUpper: a numeric vector holding upper bounds of the objective variables
- slot types: a character vector specifying types of the objective variables, allowed types are:
  - C: continuous
  - B: binary
  - I: integer
Note
when solving the problems in the procedure, minimization of the objective is performed.
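The pieces of such a linear problem can be sketched with a plain list in base R. The example below only checks hand-picked candidate solutions for feasibility and picks the one with the smallest objective value; it is not a solver, and the helpers is_feasible() and objective_value() as well as all numbers are assumptions made for illustration.

```r
# Hypothetical stand-in for a linProb: a plain list, dense matrix.
lp <- list(
  objective   = c(3, 1, 2),
  constraints = rbind(c(1, 1, 0),
                      c(0, 1, 1)),
  direction   = c(">=", ">="),
  rhs         = c(1, 1),
  boundsLower = c(0, 0, 0),
  boundsUpper = c(1, 1, 1)
)

# A candidate is feasible if it fulfils all constraints and bounds.
is_feasible <- function(lp, x) {
  lhs <- as.vector(lp$constraints %*% x)
  ok <- mapply(
    function(l, d, r) do.call(d, list(l, r)),
    lhs, lp$direction, lp$rhs
  )
  all(ok) && all(x >= lp$boundsLower) && all(x <= lp$boundsUpper)
}

objective_value <- function(lp, x) sum(lp$objective * x)

# Enumerate a few candidates and keep the cheapest feasible one,
# mirroring the fact that the objective is minimized.
candidates <- list(c(1, 0, 1), c(0, 1, 0), c(0, 0, 1))
feasible <- Filter(function(x) is_feasible(lp, x), candidates)
values <- vapply(feasible, objective_value, numeric(1), lp = lp)
best <- feasible[[which.min(values)]]
best
```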
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Create a problem instance
Description
Function makeProblem() is used to create sdcProblem objects.
Usage
makeProblem(
data,
dimList,
dimVarInd = NULL,
freqVarInd = NULL,
numVarInd = NULL,
weightInd = NULL,
sampWeightInd = NULL
)
Arguments
data |
a data frame featuring at least one column for each desired dimensional variable. Optionally the input data can feature variables that contain information on cell counts, weights that should be used during the cut and branch algorithm, additional numeric variables or variables that hold information on sampling weights. |
dimList |
a (named) list where the names refer to variable names in
input
|
dimVarInd |
if |
freqVarInd |
if not |
numVarInd |
if not |
weightInd |
if not |
sampWeightInd |
if not |
Value
a sdcProblem object
Author(s)
Bernhard Meindl
Examples
# loading micro data
utils::data("microdata1", package = "sdcTable")
# we can observe that we have a micro data set consisting
# of two spanning variables ('region' and 'gender') and one
# numeric variable ('val')
# specify structure of hierarchical variable 'region'
# levels 'A' to 'D' sum up to a Total
dim.region <- data.frame(
levels=c('@','@@','@@','@@','@@'),
codes=c('Total', 'A','B','C','D'),
stringsAsFactors=FALSE)
# specify structure of hierarchical variable 'gender'
# using hier_create() from sdcHierarchies (create_node() and add_nodes() are defunct)
dim.gender <- hier_create(root = "Total", nodes = c("male", "female"))
hier_display(dim.gender)
# create a named list with each element being a data-frame
# containing information on one dimensional variable and
# the names referring to variables in the input data
dimList <- list(region = dim.region, gender = dim.gender)
# third column contains a numeric variable
numVarInd <- 3
# no variables holding counts, weights or sampling
# weights are available in the input data
# creating a problem instance using numeric indices
p1 <- makeProblem(
data = microdata1,
dimList = dimList,
numVarInd = 3 # third variable in `data`
)
# using variable names is also possible
p2 <- makeProblem(
data = microdata1,
dimList = dimList,
numVarInd = "val"
)
# what do we have?
print(class(p1))
# have a look at the data
df1 <- sdcProb2df(p1, addDups = TRUE,
addNumVars = TRUE, dimCodes = "original")
df2 <- sdcProb2df(p2, addDups=TRUE,
addNumVars = TRUE, dimCodes = "original")
print(df1)
identical(df1, df2)
Synthetic Microdata (1)
Description
A 'data.frame' used to generate problems in various examples.
Usage
data(microdata1)
Format
a 'data.frame' with '100' rows and variables 'region', 'gender' and 'val'.
Examples
utils::data("microdata1", package = "sdcTable")
head(microdata1)
Synthetic Microdata (2)
Description
Example microdata used, for instance, in [protect_linked_tables()].
Usage
data(microdata2)
Format
a 'data.frame' with '100' observations containing variables 'region', 'gender', 'ecoOld', 'ecoNew' and 'numVal'.
Examples
utils::data("microdata2", package = "sdcTable")
head(microdata2)
Apply primary suppression
Description
Function primarySuppression() is used to identify and suppress primary sensitive table cells in sdcProblem objects. Argument type selects the rule that is used to identify primary sensitive cells. At the moment it is possible to identify and suppress sensitive table cells using the frequency-rule, the nk-dominance rule and the p-percent rule.
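To give an intuition for two of these rules, the base-R sketch below applies textbook versions of the frequency rule and the p-percent rule to made-up cell contributions. The helper functions and the data are hypothetical, and the package's own implementation may differ in details (for example in how empty cells or weights are treated).

```r
# Hypothetical contributions of individual units to four cells.
cells <- list(
  "A/female" = c(50, 3),
  "A/male"   = c(20, 18, 15, 12),
  "B/female" = c(90, 5, 2),
  "B/male"   = c(100)
)

# Frequency rule (textbook form): a cell is sensitive if at most
# `maxN` units contribute to it.
is_sensitive_freq <- function(x, maxN) {
  length(x) <= maxN
}

# p-percent rule (textbook form): a cell is sensitive if the total
# without the two largest contributors is below p percent of the
# largest contributor.
is_sensitive_p <- function(x, p) {
  x <- sort(x, decreasing = TRUE)
  largest <- x[1]
  second <- if (length(x) > 1) x[2] else 0
  remainder <- sum(x) - largest - second
  remainder < p / 100 * largest
}

freq_flags <- vapply(cells, is_sensitive_freq, logical(1), maxN = 2)
p_flags <- vapply(cells, is_sensitive_p, logical(1), p = 30)
data.frame(freq_rule = freq_flags, p_rule = p_flags)
```

Cell "B/female" shows why both kinds of rules exist: it has three contributors and passes the frequency rule, but one contributor dominates the total, so the p-percent rule flags it.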
Usage
primarySuppression(object, type, ...)
Arguments
object |
a sdcProblem object |
type |
character vector of length 1 defining the primary suppression rule. Allowed types are:
|
... |
parameters used in the identification of primary sensitive cells. Parameters that can be modified|changed are:
|
Details
Since version 0.29 it is no longer possible to specify the underlying variables for dominance rules ("p", "pq" or "nk") by index; these variables must be set by name using argument numVarName.
Value
a sdcProblem object
Note
the nk-dominance rule, the p-percent rule and the pq-rule can only be applied if micro data have been used as input data to function makeProblem()
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Examples
# load micro data
utils::data("microdata1", package = "sdcTable")
# load problem (as it was created in the example in ?makeProblem)
p <- sdc_testproblem(with_supps = FALSE)
# we have a look at the frequency table by gender and region
xtabs(rep(1, nrow(microdata1)) ~ gender + region, data = microdata1)
# 2 units contribute to cell with region=='A' and gender=='female'
# --> this cell is considered sensitive according to the
# freq-rule with 'maxN' equal to 2!
p1 <- primarySuppression(
object = p,
type = "freq",
maxN = 2
)
# we can also apply a p-percent rule with parameter "p" being 30 as below.
# This is only possible if we are dealing with micro data and we also
# have to specify the name of a numeric variable.
p2 <- primarySuppression(
object = p,
type = "p",
p = 30,
numVarName = "val"
)
# looking at the anonymization states we see that, given the frequency
# rule with parameter "maxN = 2", one cell is primary suppressed
# (sdcStatus == "u"); the remaining cells are possible candidates for
# secondary cell suppression (sdcStatus == "s").
#
# Applying the p-percent rule with parameter 'p = 30' resulted in
# two primary suppressions.
data.frame(
p1_sdc = getInfo(p1, type = "sdcStatus"),
p2_sdc = getInfo(p2, type = "sdcStatus")
)
print dimVar-class objects
Description
print dimVar-class objects in a reasonable way
Usage
## S4 method for signature 'dimVar'
print(x, ...)
Arguments
x |
An object of class |
... |
currently not used |
print objects of class sdcProblem-class.
Description
print some useful information instead of just displaying the entire object (which may be large)
Usage
## S4 method for signature 'sdcProblem'
print(x, ...)
Arguments
x |
an object of class |
... |
currently not used. |
S4 class describing a problemInstance-object
Description
An object of class problemInstance holds the main information that is required to solve the secondary cell suppression problem.
Details
- slot strID: a character vector (or NULL) of IDs identifying table cells
- slot Freq: a numeric vector (or NULL) of counts for each table cell
- slot w: a numeric vector (or NULL) of weights that should be used when solving the secondary cell suppression problem
- slot numVars: a list (or NULL) with each element being a numeric vector holding values of specified numerical variables for each table cell
- slot lb: numeric vector (or NULL) holding assumed lower bounds for each table cell
- slot ub: numeric vector (or NULL) holding assumed upper bounds for each table cell
- slot LPL: numeric vector (or NULL) holding required lower protection levels for each table cell
- slot UPL: numeric vector (or NULL) holding required upper protection levels for each table cell
- slot SPL: numeric vector (or NULL) holding required sliding protection levels for each table cell
- slot sdcStatus: character vector (or NULL) holding the current anonymization state for each cell.
  - z: cell is forced to be published and must not be suppressed
  - u: cell has been primary suppressed
  - x: cell is a secondary suppression
  - s: cell can be published
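The status codes can be read as a simple lookup, as in the base-R sketch below. The status vector and the counts are invented for illustration and do not come from a real problemInstance.

```r
# Hypothetical anonymization states and counts for six cells.
sdc_status <- c("s", "u", "x", "s", "z", "x")
freq <- c(10, 2, 5, 8, 25, 5)

# Human-readable labels for the four status codes listed above.
status_labels <- c(
  u = "primary suppression",
  x = "secondary suppression",
  z = "forced for publication",
  s = "publishable"
)

# How many cells are in which state?
table(status_labels[sdc_status])

# Values as they could appear in a published table: suppressed
# cells (status "u" or "x") are replaced by NA.
published <- ifelse(sdc_status %in% c("u", "x"), NA, freq)
published
```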
Note
objects of class problemInstance are used as input for slot problemInstance in class sdcProblem
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Protect two tables with common cells
Description
protect_linked_tables()
can be used to protect tables that have
common cells. It is of course required that after the anonymization process
has finished, all common cells have the same anonymization state in both
tables.
Usage
protectLinkedTables(
objectA,
objectB,
commonCells,
method = "SIMPLEHEURISTIC",
...
)
protect_linked_tables(x, y, common_cells, method = "SIMPLEHEURISTIC", ...)
Arguments
objectA |
maps to argument |
objectB |
maps to argument |
commonCells |
maps to argument |
method |
which protection algorithm should be used; choices are
|
... |
additional arguments to control the secondary cell suppression
algorithm. For details, see |
x |
a sdcProblem object |
y |
a sdcProblem object |
common_cells |
a list object defining common cells in
|
Value
a list with elements x
and y
containing protected sdcProblem
objects
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
See Also
Examples
## Not run:
# load micro data for further processing
utils::data("microdata2", package = "sdcTable")
# table1: defined by variables 'gender' and 'ecoOld'
md1 <- microdata2[,c(2,3,5)]
# table2: defined by variables 'region', 'gender' and 'ecoNew'
md2 <- microdata2[,c(1,2,4,5)]
# we need to create information on the hierarchies
# variable 'region': exists only in md2
d_region <- hier_create(root = "Tot", nodes = c("R1", "R2"))
# variable 'gender': exists in both datasets
d_gender <- hier_create(root = "Tot", nodes = c("m", "f"))
# variable 'ecoOld': exists only in md1
d_eco1 <- hier_create(root = "Tot", nodes = c("A", "B"))
d_eco1 <- hier_add(d_eco1, root = "A", nodes = c("Aa", "Ab"))
d_eco1 <- hier_add(d_eco1, root = "B", nodes = c("Ba", "Bb"))
# variable 'ecoNew': exists only in md2
d_eco2 <- hier_create(root = "Tot", nodes = c("C", "D"))
d_eco2 <- hier_add(d_eco2, root = "C", nodes = c("Ca", "Cb", "Cc"))
d_eco2 <- hier_add(d_eco2, root = "D", nodes = c("Da", "Db", "Dc"))
# creating objects holding information on dimensions
dl1 <- list(gender = d_gender, ecoOld = d_eco1)
dl2 <- list(region = d_region, gender = d_gender, ecoNew = d_eco2)
# creating input objects for further processing.
# For details, see ?makeProblem.
p1 <- makeProblem(
  data = md1,
  dimList = dl1,
  dimVarInd = 1:2,
  numVarInd = 3)
p2 <- makeProblem(
  data = md2,
  dimList = dl2,
  dimVarInd = 1:3,
  numVarInd = 4)
# the cell specified by gender == "Tot" and ecoOld == "A"
# is one of the common cells! -> we mark it as primary suppression
p1 <- change_cellstatus(
  object = p1,
  specs = data.frame(gender = "Tot", ecoOld = "A"),
  rule = "u",
  verbose = FALSE)
# the cell specified by region == "Tot" and gender == "f" and ecoNew == "C"
# is one of the common cells! -> we mark it as primary suppression
p2 <- change_cellstatus(
  object = p2,
  specs = data.frame(region = "Tot", gender = "f", ecoNew = "C"),
  rule = "u",
  verbose = FALSE)
# specifying input to define common cells
common_cells <- list()
# variable "gender"
common_cells$v.gender <- list()
common_cells$v.gender[[1]] <- "gender" # variable name in "p1"
common_cells$v.gender[[2]] <- "gender" # variable name in "p2"
# "gender" has equal characteristics on both datasets -> keyword "ALL"
common_cells$v.gender[[3]] <- "ALL"
# variables: "ecoOld" and "ecoNew"
common_cells$v.eco <- list()
common_cells$v.eco[[1]] <- "ecoOld" # variable name in "p1"
common_cells$v.eco[[2]] <- "ecoNew" # variable name in "p2"
# vector of common characteristics:
# "A" and "B" in variable "ecoOld" in "p1"
common_cells$v.eco[[3]] <- c("A", "B")
# correspond to codes "C" and "D" in variable "ecoNew" in "p2"
common_cells$v.eco[[4]] <- c("C", "D")
# protect the linked data
result <- protect_linked_tables(
  x = p1,
  y = p2,
  common_cells = common_cells,
  verbose = TRUE)
# having a look at the results
result_tab1 <- result$x
result_tab2 <- result$y
summary(result_tab1)
summary(result_tab2)
## End(Not run)
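The step-by-step construction of `common_cells` in the example can also be written as a single nested list. The sketch below uses base R only and compares both forms; the object names `cc_steps` and `cc_compact` are introduced here for illustration.

```r
# step-by-step construction, following the example
cc_steps <- list()
cc_steps$v.gender <- list()
cc_steps$v.gender[[1]] <- "gender"   # variable name in the first table
cc_steps$v.gender[[2]] <- "gender"   # variable name in the second table
cc_steps$v.gender[[3]] <- "ALL"      # all codes are common
cc_steps$v.eco <- list()
cc_steps$v.eco[[1]] <- "ecoOld"      # variable name in the first table
cc_steps$v.eco[[2]] <- "ecoNew"      # variable name in the second table
cc_steps$v.eco[[3]] <- c("A", "B")   # common codes in the first table
cc_steps$v.eco[[4]] <- c("C", "D")   # corresponding codes in the second table

# the same structure written as one nested list
cc_compact <- list(
  v.gender = list("gender", "gender", "ALL"),
  v.eco = list("ecoOld", "ecoNew", c("A", "B"), c("C", "D"))
)

# check that both constructions yield the same object
same <- identical(cc_steps, cc_compact)
print(same)
```

The compact form keeps the pairing of codes (third and fourth element) visible on one line, which may help when many variables are involved.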
Protecting sdcProblem objects
Description
Function protectTable()
is used to protect primary sensitive table cells
(that usually have been identified and set using
primarySuppression()
). The function protects primary
sensitive table cells according to the method that has been chosen and the
parameters that have been set. Additional parameters that are used to control
the protection algorithm are set using parameter ...
.
Usage
protectTable(object, method, ...)
Arguments
object |
a sdcProblem object that has been created using |
method |
a character vector of length 1 specifying the algorithm that should be used to protect the primary sensitive table cells. Allowed values are:
|
... |
parameters used in the protection algorithm that has been selected. Parameters that can be changed are:
|
Details
The implemented methods may have bugs that result in tables that are not fully protected. In particular,
using "OPT"
, "HITAS"
and "HYPERCUBE"
in production is not
recommended, as these methods may eventually be removed completely. In case you encounter any problems,
please report them or use Tau-Argus (https://research.cbs.nl/casc/tau.htm).
Value
a safeObj object
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Examples
## Not run:
# load example-problem with a single primary suppression
# (same as example from ?primarySuppression)
p <- sdc_testproblem(with_supps = TRUE)
# protect the table using the 'GAUSS' algorithm with verbose output
res1 <- protectTable(p, method = "GAUSS", verbose = TRUE)
res1
# protect the table using the 'HITAS' algorithm with verbose output
res2 <- protectTable(p, method = "HITAS", verbose = TRUE, useC = TRUE)
res2
# protect using the heuristic algorithm
res3 <- protectTable(p, method = "SIMPLEHEURISTIC")
res3
# protect using the old implementation of the heuristic algorithm
# used in sdcTable versions <0.32
res4 <- protectTable(p, method = "SIMPLEHEURISTIC_OLD")
res4
# looking at the final table with result suppression pattern
print(getInfo(res1, type = "finalData"))
## End(Not run)
runArgusBatchFile
Description
Runs batch-files for tau-argus given the path to a tau-argus executable.
The provided batch input files can either be created using function createArgusInput
or
can be created manually. In the latter case, argument obj
should not be specified and no output
is returned; the script is just executed in tau-argus.
Usage
runArgusBatchFile(
obj = NULL,
batchF,
exe = "C:\\Tau\\TauArgus.exe",
batchDataDir = NULL,
verbose = FALSE
)
Arguments
obj |
|
batchF |
a file path to a batch-input file created by e.g. |
exe |
(character) file-path to tau-argus executable |
batchDataDir |
if different from |
verbose |
(logical) if |
Value
a data.table
containing the protected table, or an error in case the batch-file was not solved correctly,
if the batch-file was created using sdcTable (i.e. argument obj
was specified). In
case an arbitrary batch-file has been run, NULL
is returned.
Note
in case a custom batch-file is used as input (i.e. obj
is NULL
), this
function currently does not try to read any tables into the system.
S4 class describing a safeObj-object
Description
Objects of class safeObj
are the final result after protecting a
tabular structure. After a successful run of protectTable
an object of this class is generated and returned. Objects of class
safeObj
contain a final, complete data set (slot finalData
)
that has a column showing the anonymization state of each cell and the
complete information on the dimensional variables that have defined the table
that has been protected (slot dimInfo
). Also, the number of
non-duplicated table cells (slot nrNonDuplicatedCells
) is returned
along with the number of primary (slot nrPrimSupps
) and secondary
(slot nrSecondSupps
) suppressions. Furthermore, the number of cells
that can be published (slot nrPublishableCells
) and the algorithm that
has been used to protect the data (slot suppMethod
) are returned.
Details
- slot
finalData
: a data.frame (or NULL) featuring columns for each variable defining the table (with their original codes), the cell counts and values of any numerical variables and the anonymization status for each cell with
  - s, z: cell can be published
  - u: cell is a primary sensitive cell
  - x: cell was selected as a secondary suppression
- slot
dimInfo
: an object of class
dimInfo-class
holding all information on variables defining the table
- slot
nrNonDuplicatedCells
: numeric vector of length 1 (or NULL) showing the number of non-duplicated table cells. This value is different from 0 if any dimensional variable features duplicated codes. These codes have been re-added to the final dataset.
- slot
nrPrimSupps
: numeric vector of length 1 (or NULL) showing the number of primary suppressed cells
- slot
nrSecondSupps
: numeric vector of length 1 (or NULL) showing the number of secondary suppressions
- slot
nrPublishableCells
: numeric vector of length 1 (or NULL) showing the number of cells that may be published
- slot
suppMethod
: character vector of length 1 holding information on the protection method
Note
objects of class safeObj
are returned after the function protectTable
has finished.
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Transform a problem instance
Description
sdcProb2df()
returns a data.table
given an sdcProblem input object.
Usage
sdcProb2df(obj, addDups = TRUE, addNumVars = FALSE, dimCodes = "both")
Arguments
obj |
an sdcProblem object |
addDups |
(logical), if |
addNumVars |
(logical), if |
dimCodes |
(character) specifies the coding in which the dimensional variables should be returned. Possible choices are:
|
Value
a data.table
containing information about all cells of the given problem
Examples
# loading micro data
utils::data("microdata1", package = "sdcTable")
# we can observe that we have a micro data set consisting
# of two spanning variables ('region' and 'gender') and one
# numeric variable ('val')
# specify structure of hierarchical variable 'region'
# levels 'A' to 'D' sum up to a Total
dim.region <- data.frame(
  levels = c('@', '@@', '@@', '@@', '@@'),
  codes = c('Total', 'A', 'B', 'C', 'D'),
  stringsAsFactors = FALSE)
# specify structure of hierarchical variable 'gender'
# using hier_create() from package sdcHierarchies
dim.gender <- hier_create(root = "Total", nodes = c("male", "female"))
hier_display(dim.gender)
# create a named list with each element being a data-frame
# containing information on one dimensional variable and
# the names referring to variables in the input data
dimList <- list(region = dim.region, gender = dim.gender)
# third column contains a numeric variable
numVarInd <- 3
# no variables holding counts, weights or sampling
# weights are available in the input data
# creating a problem instance using numeric indices
p1 <- makeProblem(
  data = microdata1,
  dimList = dimList,
  numVarInd = 3 # third variable in `data`
)
# using variable names is also possible
p2 <- makeProblem(
  data = microdata1,
  dimList = dimList,
  numVarInd = "val"
)
# what do we have?
print(class(p1))
# have a look at the data
df1 <- sdcProb2df(p1, addDups = TRUE,
  addNumVars = TRUE, dimCodes = "original")
df2 <- sdcProb2df(p2, addDups = TRUE,
  addNumVars = TRUE, dimCodes = "original")
print(df1)
identical(df1, df2)
S4 class describing a sdcProblem-object
Description
An object of class sdcProblem
contains the entire information that is
required to protect the complete table that is given by the dimensional
variables. Such an object holds the data itself (slot dataObj
), the
entire information about the dimensional variables (slot dimInfo
),
information on all table cells (ID's, bounds, values, anonymization state in
slot problemInstance
), the indices on the sub tables that need to be
considered if one wants to protect primary sensitive cells using a heuristic
approach (slot partition
) and the information on which groups or rather
subtables have already been protected while performing a heuristic method
(slots startI
and startJ
).
Details
- slot
dataObj
: an object of class
dataObj
(or NULL) holding information on the underlying data
- slot
dimInfo
: an object of class
dimInfo
(or NULL) containing information on all dimensional variables
- slot
problemInstance
: an object of class
problemInstance
holding information on values, bounds, required protection levels as well as the anonymization state for all table cells
- slot
partition
: a list object (or NULL) that is typically generated with calc.multiple(type='makePartitions',...) specifying information on the subtables and the necessary order that need to be protected when using a heuristic approach to solve the cell suppression problem
- slot
startI
: a numeric vector of length 1 defining the group-level of the subtables in which a heuristic algorithm needs to start. All subtables having a group-index less than
startI
have already been protected
- slot
startJ
: a numeric vector of length 1 defining the number of the table within the group defined by parameter
startI
at which a heuristic algorithm needs to start. All tables in the group having an index j
smaller than startJ
have already been protected
- slot
indicesDealtWith
: a numeric vector holding indices of table cells that have already been protected and whose anonymization state must remain fixed
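The restart semantics described for slots startI and startJ can be illustrated with a small base R loop. This is only a sketch under stated assumptions: the vector `n_tables` is a hypothetical stand-in for the number of subtables per group and does not reflect the real layout of slot partition, and the loop is not the package's implementation.

```r
# hypothetical partition: 3 groups containing 2, 3 and 1 subtables
n_tables <- c(2L, 3L, 1L)

# hypothetical restart position: group 2, table 2
startI <- 2L
startJ <- 2L

# collect the (group, table) pairs that still need to be protected
todo <- list()
for (i in seq_along(n_tables)) {
  if (i < startI) next                    # whole group already protected
  for (j in seq_len(n_tables[i])) {
    if (i == startI && j < startJ) next   # table already protected
    todo[[length(todo) + 1L]] <- c(group = i, table = j)
  }
}
todo <- do.call(rbind, todo)
print(todo)
```

Groups with an index below startI are skipped entirely, while the check on startJ only applies within the group given by startI.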
Note
objects of class sdcProblem
are typically generated by function makeProblem
and are the input of functions primarySuppression
and protectTable
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
A Problem-Instance used for examples/testing
Description
sdc_testproblem()
returns a sdc-problem instance with 2
hierarchies and
optionally with a single suppressed cell that is used in various examples
and tests.
Usage
sdc_testproblem(with_supps = FALSE)
Arguments
with_supps |
if |
Value
a problem instance
Examples
p1 <- sdc_testproblem(); p1
sdcProb2df(p1)
# a single protected cell
p2 <- sdc_testproblem(with_supps = TRUE); p2
sdcProb2df(p2)
# cell status differs in one cell
specs <- c(gender = "female", region = c("A"))
cell_info(p1, specs = specs)
cell_info(p2, specs = specs)
modify cutList
-objects depending on argument type
Description
modify cutList
-objects depending on argument type
Usage
set.cutList(object, type, input)
## S4 method for signature 'cutList,character,list'
set.cutList(object, type, input)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
addCompleteConstraint: add a constraint to argument
object
removeCompleteConstraint: remove a constraint from argument
object
input |
a list depending on argument |
type==addCompleteConstraint: a list of length 1
first element: an object of class
cutList
with exactly one constraint
type==removeCompleteConstraint: a list of length 1
first element: numeric vector of length 1 specifying the index of the constraint that should be removed
Value
an object of class cutList
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
modify dataObj
-objects depending on argument type
Description
modify dataObj
-objects depending on argument type
Usage
set.dataObj(object, type, input)
## S4 method for signature 'dataObj,character,listOrNULL'
set.dataObj(object, type, input)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
rawData: set slot 'rawData' of argument
object
input |
a list depending on argument |
type==rawData: a list containing raw data
Value
an object of class dataObj
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
modify dimInfo
-objects depending on argument type
Description
modify dimInfo
-objects depending on argument type
Usage
set.dimInfo(object, type, input)
## S4 method for signature 'dimInfo,character,character'
set.dimInfo(object, type, input)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
strID: set slot 'strID' of argument
object
input |
a list depending on argument |
type==strID: a character vector containing ID's
Value
an object of class dimInfo
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
change linProb
-objects depending on argument type
Description
change linProb
-objects depending on argument type
Usage
set.linProb(object, type, input)
## S4 method for signature 'linProb,character,list'
set.linProb(object, type, input)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
objective: change coefficients of the objective
direction: change vector of direction of the constraints
rhs: change vector of right hand side of the constraints
types: change vector of types of the objective variables
bounds: change bounds of the objective variables
constraints: change constraint matrix
removeCompleteConstraint: remove a specific constraint from the object
addCompleteConstraint: add a constraint to the object
input |
a list depending on argument |
type==objective: a list of length 1
first element: numeric vector defining coefficients of the objective
type==direction: a list of length 1
first element: character vector defining direction of the constraints
type==rhs: a list of length 1
first element: numeric vector defining right hand side of the constraints
type==types: a list of length 1
first element: character vector defining types of objective variables
type==bounds: a list of length 2
element 'lower': a list with the first element containing indices and the second element containing corresponding lower bounds
element 'upper': a list with the first element containing indices and the second element containing corresponding upper bounds
type==constraints: a list of length 1
first element: an object of class
simpleTriplet
type==removeCompleteConstraint: a list of length 1
first element: numeric vector of length 1 defining the index of the constraint that should be removed
type==addCompleteConstraint: a list of length 1
first element: an object of class
cutList
defining the constraint that should be added
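As an illustration of the input shape described for type==bounds, the following base R sketch builds such a list and checks its structure. The indices and bound values are hypothetical, and the checks are not taken from the package.

```r
# hypothetical input for type == "bounds": a list of length 2 with
# elements 'lower' and 'upper'; each holds indices first, bounds second
input_bounds <- list(
  lower = list(c(1L, 3L), c(0, 0)),
  upper = list(c(1L, 3L), c(10, 5))
)

# basic shape checks before passing such a list on
stopifnot(
  length(input_bounds) == 2L,
  identical(names(input_bounds), c("lower", "upper")),
  length(input_bounds$lower[[1]]) == length(input_bounds$lower[[2]]),
  length(input_bounds$upper[[1]]) == length(input_bounds$upper[[2]])
)
```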
Value
an object of class linProb
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
modify problemInstance
-objects depending on argument type
Description
modify problemInstance
-objects depending on argument type
Usage
set.problemInstance(object, type, input)
## S4 method for signature 'problemInstance,character,list'
set.problemInstance(object, type, input)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
lb: set assumed to be known lower bounds
ub: set assumed to be known upper bounds
LPL: set lower protection levels
UPL: set upper protection levels
SPL: set sliding protection levels
sdcStatus: change anonymization status
input |
a list with elements 'indices' and 'values'. |
element 'indices': numeric vector defining the indices of the cells that should be modified
element 'values': numeric vector whose values are going to replace current values for cells defined by 'indices' depending on argument
type
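The effect of an input list with elements 'indices' and 'values' can be sketched in base R on a plain vector. The vector `lb` below is a hypothetical stand-in for a slot of a problemInstance object; the sketch shows the described replacement semantics only and is not the package's code.

```r
# hypothetical lower bounds for five table cells
lb <- c(0, 0, 0, 0, 0)

# input in the documented shape: which cells to change and the new values
input <- list(indices = c(2L, 5L), values = c(3, 7))

# both elements must have the same length
stopifnot(length(input$indices) == length(input$values))

# replace the values of the selected cells, leaving all others untouched
lb_new <- replace(lb, input$indices, input$values)
print(lb_new)
```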
Value
an object of class problemInstance
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
modify sdcProblem
-objects depending on argument type
Description
modify sdcProblem
-objects depending on argument type
Usage
set.sdcProblem(object, type, input)
## S4 method for signature 'sdcProblem,character,list'
set.sdcProblem(object, type, input)
Arguments
object |
an object of class |
type |
a character vector of length 1 defining what to calculate|return|modify. Allowed types are: |
problemInstance: set|modify slot 'problemInstance' of argument
object
partition: set|modify slot 'partition' of argument
object
startI: set|modify slot 'startI' of argument
object
startJ: set|modify slot 'startJ' of argument
object
indicesDealtWith: set|modify slot 'indicesDealtWith' of argument
object
input |
a list with elements depending on argument |
an object of class
problemInstance
if argument
type
matches 'problemInstance'
a list (derived from calc.multiple(type='makePartition', ...)) if argument
type
matches 'partition'
a numeric vector of length 1 if argument
type
matches 'startI' or 'startJ'
a numeric vector if argument
type
matches 'indicesDealtWith'
Value
an object of class sdcProblem
Note
internal function
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Set/Update information in sdcProblem
or problemInstance
objects
Description
Function setInfo()
is used to update values in
sdcProblem
or problemInstance
objects
Usage
setInfo(object, type, index, input)
Arguments
object |
an object of class |
type |
a scalar character specifying the kind of information that
should be changed or modified; if
|
index |
numeric vector defining cell-indices for which values in a specified slot should be changed|modified |
input |
numeric or character vector depending on argument
|
Value
a sdcProblem
- or problemInstance
object
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
Examples
# load example-problem with suppressions
# (same as example from ?primarySuppression)
p <- sdc_testproblem(with_supps = TRUE)
# which is the overall total?
idx <- which.max(getInfo(p, "freq")); idx
# we see that the cell with idx = 1 is the overall total; its
# anonymization state can be extracted as follows:
print(getInfo(p, type = "sdcStatus")[idx])
# we want this cell to never be suppressed
p <- setInfo(p, type = "sdcStatus", index = idx, input = "z")
# we can verify this:
print(getInfo(p, type = "sdcStatus")[idx])
# changing slot 'UPL' for all cells
inp <- data.frame(
  strID = getInfo(p, "strID"),
  UPL_old = getInfo(p, "UPL")
)
inp$UPL_new <- inp$UPL_old + 1
p <- setInfo(p, type = "UPL", index = 1:nrow(inp), input = inp$UPL_new)
show objects of class sdcProblem-class
.
Description
just calls the corresponding print-method
Usage
## S4 method for signature 'sdcProblem'
show(object)
Arguments
object |
an object of class |
S4 class describing a simpleTriplet-object
Description
Objects of class simpleTriplet
define matrices that are stored in a
sparse format. Only the row- and column indices and the corresponding values
of non-zero cells are stored. Additionally, the dimension of the matrix given
by the total number of rows and columns is stored.
Details
- slot
i
: a numeric vector specifying row-indices with each value being at least 1 and at most the value in
nrRows
- slot
j
: a numeric vector specifying column-indices with each value being at least 1 and at most the value in
nrCols
- slot
v
: a numeric vector specifying the values of the matrix in cells specified by the corresponding row- and column indices
- slot
nrRows
: a numeric vector of length 1 holding the total number of rows of the matrix
- slot
nrCols
: a numeric vector of length 1 holding the total number of columns of the matrix
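The triplet storage described by these slots can be illustrated in base R by expanding hypothetical vectors `i`, `j` and `v` into a dense matrix. This is a sketch of the storage idea only; it does not use or reproduce the simpleTriplet class itself.

```r
# hypothetical triplet representation of a 3 x 4 matrix
i <- c(1, 2, 3, 3)      # row indices, each between 1 and nrRows
j <- c(1, 3, 2, 4)      # column indices, each between 1 and nrCols
v <- c(5, -1, 2, 7)     # values of the non-zero cells
nrRows <- 3
nrCols <- 4

# the index vectors must stay within the matrix dimensions
stopifnot(all(i >= 1 & i <= nrRows), all(j >= 1 & j <= nrCols))

# expand to a dense matrix: all cells not listed are zero
dense <- matrix(0, nrow = nrRows, ncol = nrCols)
dense[cbind(i, j)] <- v
print(dense)
```

Indexing with a two-column matrix `cbind(i, j)` assigns each value to exactly one (row, column) position, which mirrors how the three vectors belong together element by element.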
Note
objects of class simpleTriplet
are input of slot constraints
in class linProb-class
and slot con
in class cutList-class
Author(s)
Bernhard Meindl bernhard.meindl@statistik.gv.at
summarize object of class sdcProblem-class
or safeObj-class
.
Description
extract and show relevant information stored in objects of class sdcProblem-class
or safeObj-class
.
Usage
## S4 method for signature 'sdcProblem'
summary(object, ...)
Arguments
object |
Objects of either class |
... |
currently not used. |
Write a problem in jj-format to a file
Description
This function writes a problem instance in JJ format to a file.
Usage
writeJJFormat(x, tabvar = "freqs", path = "out.jj", overwrite = FALSE)
Arguments
x |
an input produced by |
tabvar |
the name of the variable that will be used when producing the
problem in JJ format. It is possible to specify |
path |
a scalar character defining the name of the file that should be written. This can be an absolute or relative path; however, the file must not exist. |
overwrite |
logical scalar, if |
Value
invisibly the path to the file that was created.
Examples
# setup example problem
# microdata
utils::data("microdata1", package = "sdcTable")
# create hierarchies
dims <- list(
  region = sdcHierarchies::hier_create(root = "Total", nodes = LETTERS[1:4]),
  gender = sdcHierarchies::hier_create(root = "Total", nodes = c("male", "female")))
# create a problem instance
p <- makeProblem(
  data = microdata1,
  dimList = dims,
  numVarInd = "val")
# create suitable input for `writeJJFormat`
inp <- createJJFormat(p); inp
# write files to disk
# frequency table by default
writeJJFormat(
  x = inp,
  path = file.path(tempdir(), "prob_freqs.jj"),
  overwrite = TRUE
)
# or using the numeric variable `val` previously specified
writeJJFormat(
  x = inp,
  tabvar = "val",
  path = file.path(tempdir(), "prob_val.jj"),
  overwrite = TRUE
)