Type: Package
Title: Download and Manage Optional Package Data
Version: 0.1.5
Maintainer: Tim Schäfer <ts+code@rcmd.org>
Description: Manage optional data for your package. The data can be hosted anywhere, and you have to give a Uniform Resource Locator (URL) for each file. File integrity checks are supported. This is useful for package authors who need to ship more than the 5 megabytes of data currently allowed by the Comprehensive R Archive Network (CRAN).
License: MIT + file LICENSE
Encoding: UTF-8
URL: https://github.com/dfsp-spirit/pkgfilecache
BugReports: https://github.com/dfsp-spirit/pkgfilecache/issues
Suggests: knitr, rmarkdown, testthat (>= 2.1.0)
Imports: downloader, rappdirs, curl
VignetteBuilder: knitr
RoxygenNote: 7.2.3
NeedsCompilation: no
Packaged: 2024-02-02 18:17:05 UTC; spirit
Author: Tim Schäfer [aut, cre]
Repository: CRAN
Date/Publication: 2024-02-02 20:30:02 UTC

Check whether the given files exist in the package cache.

Description

Check whether the given files exist in the package cache. You can pass MD5 sums, which will be verified; only files with a correct MD5 hash count as existing.

Usage

are_files_available(pkg_info, relative_filenames, md5sums = NULL)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

relative_filenames

vector of strings. A vector of filenames, relative to the package cache.

md5sums

vector of strings or NULL. A list of MD5 checksums, one for each file in param 'relative_filenames', if not NULL. If given, the files will only be reported as existing if the MD5 sums match.

Value

logical vector. For each file, whether it passed the check.

Examples

    pkg_info = get_pkg_info("mypackage")
    is_available = are_files_available(pkg_info, c("file1.txt", "file2.txt"))


Download files marked as mismatched to the package cache.

Description

Download files marked as mismatched to the package cache. You should check afterwards whether this was successful, e.g., via 'files_exist_md5'.

Usage

download_files_with_md5_mismatch(
  local_files_absolute,
  local_files_md5_ok,
  urls,
  files_are_binary = NULL
)

Arguments

local_files_absolute

vector of strings. A vector of filenames, must already include the package cache part.

local_files_md5_ok

logical vector. For each file, whether the local copy is OK. Only files for which this lists FALSE will be downloaded.

urls

vector of strings. For each file, a remote URL from which to download the file. Will be passed to 'downloader::download', see that function for URL encoding details.

files_are_binary

logical vector. For each file, whether it is binary. Only required on Windows, when files need to be downloaded. See 'downloader::download' docs for details.
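
The selection rule described for 'local_files_md5_ok' can be sketched in base R. This is only an illustration of which entries would be picked, not the package's implementation; the file paths and URLs below are made up.

```r
# Hypothetical inputs: three cached files, the second of which failed its MD5 check.
local_files_absolute <- c("/tmp/cache/a.txt", "/tmp/cache/b.txt", "/tmp/cache/c.txt")
local_files_md5_ok   <- c(TRUE, FALSE, TRUE)
urls <- c("https://example.org/a.txt",
          "https://example.org/b.txt",
          "https://example.org/c.txt")

# Only entries flagged FALSE are candidates for download.
needs_download <- !local_files_md5_ok
to_fetch <- data.frame(
  destfile = local_files_absolute[needs_download],
  url      = urls[needs_download],
  stringsAsFactors = FALSE
)
print(to_fetch)
```

The three input vectors are parallel, so they must have the same length and order.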


Ensure all given files exist in the file cache, downloading them if they do not.

Description

Ensure all given files exist in the file cache, downloading them if they do not.

Usage

ensure_files_available(
  pkg_info,
  relative_filenames,
  urls,
  files_are_binary = NULL,
  md5sums = NULL,
  on_errors = "warn",
  download = TRUE
)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

relative_filenames

vector of strings. A vector of filenames, relative to the package cache.

urls

vector of strings. For each file, a remote URL from which to download the file. Will be passed to 'downloader::download', see that function for URL encoding details.

files_are_binary

logical vector. For each file, whether it is binary. Only required on Windows, when files need to be downloaded. See 'downloader::download' docs for details.

md5sums

vector of strings or NULL. A list of MD5 checksums, one for each file in param 'relative_filenames', if not NULL. If given, the files will only be reported as existing if the MD5 sums match.

on_errors

string. What to do if getting the files failed. One of c("warn", "stop", "ignore"). At the end, files are checked using 'are_files_available' (including MD5 if given). Depending on the check results, the behaviours triggered are: "warn": Print a warning for each file that failed the check. "stop": Stop the script, i.e., the whole application. "ignore": Do nothing. You can still react using the return value.

download

logical. Whether to try downloading missing files. Defaults to TRUE. Existing files (with correct MD5 if available) will never be downloaded.

Value

Named list. The list has entries: "available": vector of strings. The names of the files that are available in the local file cache. You can access them using get_filepath(). "missing": vector of strings. The names of the files that this function was unable to retrieve. "file_status": logical vector indicating whether the files are available. Order is identical to the one in argument 'relative_filenames'.

Examples

    pkg_info = get_pkg_info("mypackage")
    local_relative_filenames = c("local_file1.txt", "local_file2.txt")
    bu = "https://raw.githubusercontent.com/dfsp-spirit/"
    url1 = paste0(bu, "pkgfilecache/master/inst/extdata/file1.txt")
    url2 = paste0(bu, "pkgfilecache/master/inst/extdata/file2.txt")
    urls = c(url1, url2)
    md5sums = c("35261471bcd198583c3805ee2a543b1f", "85ffec2e6efb476f1ee1e3e7fddd86de")
    res = ensure_files_available(pkg_info, local_relative_filenames, urls, md5sums=md5sums)
    erase_file_cache(pkg_info) # clear full cache
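
The shape of the return value and the three 'on_errors' behaviours can be mimicked in base R. The sketch below is an assumption-laden illustration: 'report_missing' is a hypothetical helper, not part of the package, and the status vector is invented.

```r
# Hypothetical check result for three files, in the order of 'relative_filenames'.
relative_filenames <- c("file1.txt", "file2.txt", "file3.txt")
file_status <- c(TRUE, FALSE, TRUE)

# A list with the same entries as the documented return value.
res <- list(
  available   = relative_filenames[file_status],
  missing     = relative_filenames[!file_status],
  file_status = file_status
)

# Hypothetical helper that reacts to missing files as described for 'on_errors'.
report_missing <- function(missing, on_errors = c("warn", "stop", "ignore")) {
  on_errors <- match.arg(on_errors)
  if (length(missing) == 0L || on_errors == "ignore") {
    return(invisible(missing))
  }
  msg <- sprintf("File not available in cache: %s", missing)
  if (on_errors == "stop") {
    stop(paste(msg, collapse = "\n"), call. = FALSE)
  }
  for (m in msg) warning(m, call. = FALSE)  # one warning per failed file
  invisible(missing)
}

# With "ignore", nothing is signalled and the caller inspects the result itself.
report_missing(res$missing, on_errors = "ignore")
print(res$missing)
```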


Delete the full package cache directory for the given package.

Description

Delete the full package cache directory for the given package.

Usage

erase_file_cache(pkg_info)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

Value

integer. The return value of the unlink() call: 0 for success, 1 for failure. See the unlink() documentation for details.
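
The unlink() return code can be tried out on a throwaway directory. This sketch assumes a recursive delete, which is what removing a whole directory requires; the directory name is arbitrary and only stands in for a package cache.

```r
# Create a throwaway directory tree standing in for a package cache.
cache_dir <- file.path(tempdir(), "demo_cache_erase")
dir.create(file.path(cache_dir, "subdir"), recursive = TRUE, showWarnings = FALSE)
writeLines("data", file.path(cache_dir, "subdir", "file1.txt"))

# recursive = TRUE is needed to delete a directory; unlink() returns 0 on success.
status <- unlink(cache_dir, recursive = TRUE)
print(status)
print(dir.exists(cache_dir))
```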


Check whether files exist, optionally with MD5 check.

Description

Check whether files exist. If MD5 hashes are given, they will be verified.

Usage

files_exist_md5(files_absolute, md5sums = NULL)

Arguments

files_absolute

vector of strings. A vector of filenames. Files are checked as given, so they must already include the package cache part of the path.

md5sums

vector of strings or NULL. A list of MD5 checksums, one for each file in param 'files_absolute', if not NULL. If given, the files will only be reported as existing if the MD5 sums match.

Value

logical vector. Whether the files exist. If the md5sums were given, whether the files exist and the MD5 sum matches.
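
An existence check with optional MD5 verification can be sketched with base R and tools::md5sum(). The function 'exists_with_md5' is a hypothetical stand-in written for this illustration; it is not the package's code.

```r
# Write a small file whose content we control.
f <- tempfile(fileext = ".txt")
writeLines("hello", f)
expected <- unname(tools::md5sum(f))  # reference checksum, computed once

# A file counts as present only if it exists and, when checksums are given, they match.
exists_with_md5 <- function(files, md5sums = NULL) {
  ok <- file.exists(files)
  if (!is.null(md5sums)) {
    actual <- unname(tools::md5sum(files))  # NA for files that cannot be read
    ok <- ok & !is.na(actual) & actual == md5sums
  }
  ok
}

missing_file <- tempfile()
print(exists_with_md5(c(f, missing_file)))
print(exists_with_md5(f, md5sums = expected))
print(exists_with_md5(f, md5sums = "00000000000000000000000000000000"))
```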


Turn a filepath into a flat string.

Description

Turn a filepath into a flat string.

Usage

flatten_filepath(filepath)

Arguments

filepath

string or list of strings

Value

string, the flattened filepath


Join all relative filenames to a datadir.

Description

For each file, create a full path by joining the datadir with the filename.

Usage

get_abs_filenames(datadir, relative_filenames)

Arguments

datadir

string, the path to the package cache directory.

relative_filenames

vector of strings. A vector of filenames, relative to the package cache. Can be a list of vectors, which will be interpreted as files with subdirs.

Value

vector of strings, the absolute file names.
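
The joining step, including the list-of-vectors form for files in subdirectories, might look like the following in base R. This is a sketch under the assumption that path elements are combined with file.path(); 'datadir' here is just a temporary directory.

```r
# 'datadir' stands in for the package cache directory.
datadir <- file.path(tempdir(), "demo_cache_join")

# One plain file and one file given as c(subdir, filename).
relative_filenames <- list("file1.txt", c("subdir", "file2.txt"))

# Prepend datadir to the path elements of every entry.
abs_names <- vapply(
  relative_filenames,
  function(parts) do.call(file.path, c(list(datadir), as.list(parts))),
  character(1)
)
print(abs_names)
```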


Construct absolute path for package cache files.

Description

Construct absolute path for package cache files.

Usage

get_absolute_path_for_files(pkg_info, relative_filenames)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

relative_filenames

vector of strings. A vector of filenames, relative to the package cache.

Value

vector of strings. The absolute paths.

Examples

    rel_files = c("file1.txt", "file2.txt")
    pkg_info = get_pkg_info("mypackage")
    abs_paths = get_absolute_path_for_files(pkg_info, rel_files)


Get the absolute path of the package cache.

Description

Get the absolute path of the package cache.

Usage

get_cache_dir(pkg_info)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

Value

string. The absolute path of the package cache. It is constructed by calling 'rappdirs::user_data_dir' with the package, author, and version if available. If the author is NULL, the package name is also used as the author name.

Examples

    pkg_info = get_pkg_info("mypackage")
    opt_data_dir = get_cache_dir(pkg_info)
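
To see which directory such a call resolves to on your system, you can call rappdirs::user_data_dir() directly. The sketch below assumes the 'rappdirs' package is installed and is skipped otherwise; the exact path differs between operating systems.

```r
# Requires the 'rappdirs' package; skipped quietly if it is not installed.
if (requireNamespace("rappdirs", quietly = TRUE)) {
  # With appauthor left unset, rappdirs falls back to the application name.
  dir_default <- rappdirs::user_data_dir(appname = "mypackage")
  # An explicit author and a version path element, as in get_pkg_info().
  dir_versioned <- rappdirs::user_data_dir(
    appname = "mypackage", appauthor = "me", version = "0.3"
  )
  print(dir_default)
  print(dir_versioned)
}
```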



Retrieve the path to a single file from the package cache.

Description

Retrieve the path to a single file from the package cache.

Usage

get_filepath(pkg_info, relative_filename, mustWork = TRUE)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

relative_filename

string. A filename, relative to the package cache.

mustWork

logical. Whether an error should be created if the file does not exist.

Value

string. The path to the file. If mustWork=TRUE, the file is guaranteed to exist if the function returns (an error will occur if it does not). If mustWork=FALSE and the file does not exist, the empty string is returned.

Examples

    pkg_info = get_pkg_info("mypackage")
    full_path_of_file = get_filepath(pkg_info, "file1.txt", mustWork=FALSE)
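
The 'mustWork' contract can be illustrated with a small base R function. 'cached_path' is a hypothetical stand-in that takes a directory instead of a 'pkg_info' object; it mirrors the documented behaviour only.

```r
# cache_dir stands in for the directory that get_cache_dir() would return.
cached_path <- function(cache_dir, relative_filename, mustWork = TRUE) {
  path <- file.path(cache_dir, relative_filename)
  if (file.exists(path)) {
    return(path)
  }
  if (mustWork) {
    stop("File not found in cache: ", path, call. = FALSE)
  }
  ""  # documented result for a missing file when mustWork = FALSE
}

cache_dir <- tempdir()
p <- cached_path(cache_dir, "no_such_file_in_cache.txt", mustWork = FALSE)
if (!nzchar(p)) {
  message("File is not cached yet.")
}
```

Testing the result with nzchar() is a convenient way to branch on the empty-string case.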


Construct a pkg_info object to be used with all other functions.

Description

This function constructs an object that uniquely identifies your package, i.e., the package that wants to use the package cache. This is not a secret.

Usage

get_pkg_info(packagename, author = NULL, version = NULL)

Arguments

packagename

string. The name of the package using the package cache. Must be a valid directory name. Should not contain spaces. Passed as 'appname' to 'rappdirs::user_data_dir'.

author

string. The author of the package using the package cache, or NULL. Must be a valid directory name if given, no need for the real author name. Should not contain spaces. Defaults to NULL. Passed as 'appauthor' to 'rappdirs::user_data_dir'. Leave at NULL if in doubt.

version

string or NULL. An optional version path element to append to the path. You might want to use this if you want multiple versions of your package to be able to have independent data. If used, this would typically be "<major>.<minor>". Must be a valid directory name. Should not contain spaces or special characters.

Value

named list. This can be passed to all functions that require a 'pkg_info' argument. You should not rely on its inner structure; treat it as an opaque identifier.

Examples

    pkg_info = get_pkg_info("mypackage")
    pkg_info = get_pkg_info("mypackage", author="me")
    pkg_info = get_pkg_info("mypackage", author="me", version="0.3")


Given a relative file, determine its subdir in the package cache.

Description

Given a relative file, determine its subdir in the package cache.

Usage

get_relative_file_subdir(pkg_info, relative_file)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

relative_file

string or vector of strings. If a string, this function does nothing. If a vector of strings, a path is created from the elements using file.path, and the directory of it (determined by dirname()) is created.

Value

named list. The entries are: "has_subdir": logical, whether the file has a subdir. "relative_filepath": string. The input relative_file, flattened to a string. For files without subdir, this is identical to string in the parameter 'relative_file'. For others, it is the result of applying file.path() to the elements of the vector 'relative_file'. If "has_subdir" is TRUE, the following 2 fields also exist: "relative_subdir": string, subdir path relative to package cache dir. "absolute_subdir": string, absolute subdir path.
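
The documented fields can be reproduced with file.path() and dirname(). 'describe_relative_file' is a hypothetical helper that takes the cache directory directly instead of a 'pkg_info' object; it is a sketch of the described result, not the package's code.

```r
describe_relative_file <- function(cache_dir, relative_file) {
  # Flatten c("subdir", "file.txt") into "subdir/file.txt"; a single string is unchanged.
  flat <- do.call(file.path, as.list(relative_file))
  info <- list(
    has_subdir        = length(relative_file) > 1L,
    relative_filepath = flat
  )
  if (info$has_subdir) {
    info$relative_subdir <- dirname(flat)
    info$absolute_subdir <- file.path(cache_dir, dirname(flat))
  }
  info
}

with_subdir <- describe_relative_file("/cache", c("subdir", "file1.txt"))
without_subdir <- describe_relative_file("/cache", "file1.txt")
str(with_subdir)
str(without_subdir)
```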


List files that are available locally in the package cache.

Description

List files that are available locally in the package cache.

Usage

list_available(pkg_info)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

Value

vector of strings. The file names available, relative to the package cache. The returned names may include a subdirectory part. The subdirectories are not listed separately.

Examples

    pkg_info = get_pkg_info("mypackage")
    available_files_in_cache = list_available(pkg_info)
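
A listing with the same properties (relative names, subdirectory part included, directories not listed separately) is what list.files() returns with recursive = TRUE. The sketch below uses a temporary directory as a stand-in for a package cache and does not claim to be how the package implements the listing.

```r
# Build a small directory tree standing in for a package cache.
cache_dir <- file.path(tempdir(), "demo_cache_listing")
dir.create(file.path(cache_dir, "subdir"), recursive = TRUE, showWarnings = FALSE)
writeLines("a", file.path(cache_dir, "file1.txt"))
writeLines("b", file.path(cache_dir, "subdir", "file2.txt"))

# recursive = TRUE returns files with their subdirectory part
# and omits the directories themselves.
cached <- list.files(cache_dir, recursive = TRUE)
print(cached)

# Clean up the demo directory.
unlink(cache_dir, recursive = TRUE)
```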


Given relative files, create their subdirs in the package cache if needed.

Description

Given relative files, create their subdirs in the package cache if needed.

Usage

make_pgk_cache_subdir_for_all_relative_files(pkg_info, relative_filenames)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

relative_filenames

vector of strings. A vector of filenames, relative to the package cache. Can be a list of vectors, which will be interpreted as files with subdirs.


Given a relative file, create the subdir in the package cache if needed.

Description

Given a relative file, create the subdir in the package cache if needed.

Usage

make_pgk_cache_subdir_for_relative_file(pkg_info, relative_file)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

relative_file

string or vector of strings. If a string, this function does nothing. If a vector of strings, a path is created from the elements using file.path, and the directory of it (determined by dirname()) is created.


Delete all the given files from the package cache.

Description

Delete all the given files from the package cache.

Usage

remove_cached_files(pkg_info, relative_filenames)

Arguments

pkg_info

named list. Package identifier, see get_pkg_info() on how to get one.

relative_filenames

vector of strings. A vector of filenames, relative to the package cache.

Value

logical vector. For each file, whether it was deleted. Note that files which did not exist are reported as not deleted. You should check the results using 'are_files_available'.

Examples

    pkg_info = get_pkg_info("mypackage")
    deleted = remove_cached_files(pkg_info, "some_file.txt")
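
Why a FALSE entry does not imply that a file is still present can be seen in a base R sketch. The directory and file names are made up, and the code only imitates the documented return value.

```r
# A demo directory with one existing file.
cache_dir <- file.path(tempdir(), "demo_cache_removal")
dir.create(cache_dir, showWarnings = FALSE)
writeLines("data", file.path(cache_dir, "some_file.txt"))

targets <- file.path(cache_dir, c("some_file.txt", "never_existed.txt"))

# file.remove() warns about missing files, so restrict the call to files that exist.
deleted <- rep(FALSE, length(targets))
present <- file.exists(targets)
deleted[present] <- file.remove(targets[present])

# FALSE in 'deleted' does not mean the file is still there; check existence separately.
still_present <- file.exists(targets)
print(deleted)
print(still_present)
```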