Type: Package
Title: Retrieve Data on European Union Law
Version: 0.4.8
Description: Access to data on European Union laws and court decisions made easy with pre-defined 'SPARQL' queries and 'GET' requests. See Ovadek (2021) <doi:10.1080/2474736X.2020.1870150>.
License: GPL-3
Encoding: UTF-8
Language: en-US
Depends: R (≥ 3.5.0)
Imports: magrittr, dplyr, xml2, tidyr, httr, curl, rvest, rlang, stringr, pdftools, antiword
Suggests: knitr, rmarkdown, tidytext, wordcloud, purrr, ggplot2, ggiraph, testthat (≥ 3.0.0)
URL: https://michalovadek.github.io/eurlex/, https://github.com/michalovadek/eurlex
BugReports: https://github.com/michalovadek/eurlex/issues
RoxygenNote: 7.3.1
VignetteBuilder: knitr
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2024-07-03 00:26:56 UTC; uctqova
Author: Michal Ovadek
Maintainer: Michal Ovadek <michal.ovadek@gmail.com>
Repository: CRAN
Date/Publication: 2024-07-03 07:00:02 UTC
Scrape list of court cases from Curia
Description
Harvests data from lists of EU court cases from curia.europa.eu. CELEX identifiers are extracted from hyperlinks where available.
Usage
elx_curia_list(
data = c("all", "ecj_old", "ecj_new", "gc_all", "cst_all"),
parse = TRUE
)
Arguments
data |
Which of the four separate case lists maintained by Curia to scrape. Defaults to "all", which combines cases from the Court of Justice, the General Court and the Civil Service Tribunal. |
parse |
If |
Value
A data frame containing case identifiers and information as character columns. Where the case id contains a hyperlink to Eur-Lex, the CELEX identifier is retrieved as well. More recent cases no longer carry hyperlinks to Eur-Lex, so no CELEX identifier can be extracted for them.
Examples
elx_curia_list(data = "cst_all", parse = FALSE)
Download XML notice associated with a URL
Description
Downloads an XML notice of a given type associated with a Cellar resource.
Usage
elx_download_xml(
url,
file = paste(basename(url), ".xml", sep = ""),
notice = c("tree", "branch", "object"),
language_1 = "en",
language_2 = "fr",
language_3 = "de",
mode = "wb"
)
Arguments
url |
A valid URL, given as a character vector of length one, based on a resource identifier such as a CELEX number or a Cellar URI. |
file |
A character string with the name where the downloaded file is saved. |
notice |
The type of notice requested controls what kind of metadata are returned. |
language_1 |
The preferred language in which retrieval is attempted first, as a two-character ISO 639 code |
language_2 |
If data not available in |
language_3 |
If data not available in |
mode |
A character string specifying the mode with which to write the file. Useful values are "w", "wb" (binary), "a" (append) and "ab". |
Details
To retrieve all identifiers associated with a url, use elx_fetch_data(type = "ids").
Value
Path of the downloaded file (invisibly), provided the server validates the request (the HTTP status code must be 200). For more information about notices, see the Cellar documentation.
Examples
temploc <- file.path(tempdir(), "elxnotice.xml")
elx_download_xml(url = "http://publications.europa.eu/resource/celex/32022D0154",
file = temploc, notice = "object")
unlink(temploc)
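Once a notice has been saved, it can be parsed with xml2, which the package already imports. The following is a minimal sketch rather than part of the package: notice_path() and read_notice() are hypothetical helpers, and the check on file.exists() assumes that no file is written when the server does not validate the request.

```r
# Hypothetical helper (not part of eurlex): temporary destination for a notice,
# named after the last component of the resource URL.
notice_path <- function(url) {
  file.path(tempdir(), paste0(basename(url), ".xml"))
}

# Hypothetical helper: download a notice and return it as a parsed xml2 document.
read_notice <- function(url, notice = "object") {
  dest <- notice_path(url)
  on.exit(unlink(dest), add = TRUE)   # remove the temporary file afterwards
  eurlex::elx_download_xml(url = url, file = dest, notice = notice)
  # Assumption: nothing is written when the request is not validated.
  if (!file.exists(dest)) return(NULL)
  xml2::read_xml(dest)                # parsed in memory before the file is removed
}

# Requires network access, so only run in an interactive session.
if (interactive()) {
  doc <- read_notice("http://publications.europa.eu/resource/celex/32022D0154")
  if (!is.null(doc)) print(xml2::xml_name(xml2::xml_root(doc)))
}
```

Printing the name of the root element is only a quick check that the file parsed. For details of what each notice type contains, see the Cellar documentation.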
Retrieve additional data on EU documents
Description
Get titles, texts, identifiers and XML notices for EU resources.
Usage
elx_fetch_data(
url,
type = c("title", "text", "ids", "notice"),
notice = c("tree", "branch", "object"),
language_1 = "en",
language_2 = "fr",
language_3 = "de",
include_breaks = TRUE,
html_text = c("text2", "text")
)
Arguments
url |
A valid URL, given as a character vector of length one, based on a resource identifier such as a CELEX number or a Cellar URI. |
type |
The type of data to be retrieved. When type = "text", the returned list contains named elements reflecting the source of each text. When type = "notice", the result is an XML notice associated with the url. |
notice |
If type = "notice", controls what kind of metadata are returned by the notice. |
language_1 |
The preferred language in which retrieval is attempted first, as a two-character ISO 639 code |
language_2 |
If data not available in |
language_3 |
If data not available in |
include_breaks |
If TRUE, text includes tags showing where pages ("---pagebreak---", for pdfs) and documents ("---documentbreak---") were concatenated |
html_text |
Choose whether to read text from html using |
Value
A character vector of length one containing the result. When type = "text", a named character vector where the name contains the source of the text.
Examples
elx_fetch_data(url = "http://publications.europa.eu/resource/celex/32014R0001", type = "title")
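Because url must be a character vector of length one, retrieving data for several documents means calling the function once per document. The sketch below shows one way to do that. celex_url() and fetch_titles() are hypothetical helpers, not part of eurlex, and the URL prefix is simply the one used in the examples in this manual.

```r
# Hypothetical helper (not part of eurlex): build a resource URL from a
# CELEX number, following the URL pattern used in the examples.
celex_url <- function(celex) {
  paste0("http://publications.europa.eu/resource/celex/", celex)
}

# Hypothetical helper: fetch one title per CELEX number.
# elx_fetch_data() takes a single URL, so we loop over the identifiers.
fetch_titles <- function(celex_ids) {
  vapply(
    celex_ids,
    function(id) {
      res <- eurlex::elx_fetch_data(url = celex_url(id), type = "title")
      as.character(res)[1]   # coerce defensively to a single string
    },
    character(1)
  )
}

# Requires network access, so only run in an interactive session.
if (interactive()) {
  fetch_titles(c("32014R0001", "32022D0154"))
}
```

The result is a character vector named by CELEX number. For long lists of identifiers, a short pause between requests may be a considerate addition.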
Label EuroVoc concepts
Description
Create a look-up table with labels for EuroVoc concept URIs. Only unique identifiers are returned.
Usage
elx_label_eurovoc(uri_eurovoc = "", alt_labels = FALSE, language = "en")
Arguments
uri_eurovoc |
Character vector with valid EuroVoc URIs |
alt_labels |
If |
language |
Language in which to return the labels, in ISO 639 2-char code |
Value
A tibble containing EuroVoc unique concept identifiers and labels.
Examples
elx_label_eurovoc(uri_eurovoc = "http://eurovoc.europa.eu/5760", language = "fr")
Create SPARQL queries
Description
Generates pre-defined or manual SPARQL queries to retrieve document ids from Cellar. For the list of available resource types, see http://publications.europa.eu/resource/authority/resource-type. Note that not all resource types are compatible with the default parameter values.
Usage
elx_make_query(
resource_type = c("any", "directive", "regulation", "decision", "recommendation",
"intagr", "caselaw", "manual", "proposal", "national_impl"),
manual_type = "",
directory = NULL,
sector = NULL,
include_corrigenda = FALSE,
include_celex = TRUE,
include_lbs = FALSE,
include_date = FALSE,
include_date_force = FALSE,
include_date_endvalid = FALSE,
include_date_transpos = FALSE,
include_date_lodged = FALSE,
include_force = FALSE,
include_eurovoc = FALSE,
include_citations = FALSE,
include_citations_detailed = FALSE,
include_author = FALSE,
include_directory = FALSE,
include_directory_code = FALSE,
include_sector = FALSE,
include_ecli = FALSE,
include_court_procedure = FALSE,
include_judge_rapporteur = FALSE,
include_advocate_general = FALSE,
include_court_formation = FALSE,
include_court_scholarship = FALSE,
include_court_origin = FALSE,
include_original_language = FALSE,
include_proposal = FALSE,
order = FALSE,
limit = NULL
)
Arguments
resource_type |
Type of resource to be retrieved via SPARQL query |
manual_type |
Define manually the type of resource to be retrieved |
directory |
Restrict the results to a given directory code |
sector |
Restrict the results to a given sector code |
include_corrigenda |
If |
include_celex |
If |
include_lbs |
If |
include_date |
If |
include_date_force |
If |
include_date_endvalid |
If |
include_date_transpos |
If |
include_date_lodged |
If |
include_force |
If |
include_eurovoc |
If |
include_citations |
If |
include_citations_detailed |
If |
include_author |
If |
include_directory |
If |
include_directory_code |
If |
include_sector |
If |
include_ecli |
If |
include_court_procedure |
If |
include_judge_rapporteur |
If |
include_advocate_general |
If |
include_court_formation |
If |
include_court_scholarship |
If |
include_court_origin |
If |
include_original_language |
If |
include_proposal |
If |
order |
Order results by ids |
limit |
Limit the number of results, for testing purposes mainly |
Value
A character string containing the SPARQL query
Examples
elx_make_query(resource_type = "directive", include_date = TRUE, include_force = TRUE)
elx_make_query(resource_type = "regulation", include_corrigenda = TRUE, order = TRUE)
elx_make_query(resource_type = "any", sector = 2)
elx_make_query(resource_type = "manual", manual_type = "SWD")
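Since the return value is a plain character string, a query can be inspected before it is executed. The following is a small sketch, assuming the eurlex package is installed. It relies only on the documented return type and makes no claim about the exact text of the generated query.

```r
library(eurlex)

# Build the query string; executing it is a separate step (see elx_run_query).
query <- elx_make_query(
  resource_type = "directive",
  include_date = TRUE,
  limit = 5
)

# The documented return value is a character string containing the query.
stopifnot(is.character(query), length(query) == 1)

# Print the SPARQL text for inspection.
cat(query, "\n")
```

Reading the generated SPARQL first can help confirm that the chosen include_* options request the intended variables.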
Execute SPARQL queries
Description
Executes a cURL request to a pre-defined endpoint of the EU Publications Office. Relies on elx_make_query to generate valid SPARQL queries. Results are capped at 1 million rows.
Usage
elx_run_query(
query = "",
endpoint = "http://publications.europa.eu/webapi/rdf/sparql"
)
Arguments
query |
A valid SPARQL query specified by |
endpoint |
SPARQL endpoint |
Value
A data frame containing the results of the SPARQL query. Column work contains the Cellar URI of the resource.
Examples
elx_run_query(elx_make_query("directive", include_force = TRUE, limit = 10))
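The two steps, building a query and executing it, can be wrapped together for quick exploration. The sketch below is an illustration only: run_limited() is a hypothetical helper, not part of eurlex. Apart from the documented work column, no assumption is made about which columns come back.

```r
# Hypothetical helper (not part of eurlex): build a small query and run it.
# A low default limit keeps test requests light.
run_limited <- function(resource_type, limit = 10, ...) {
  query <- eurlex::elx_make_query(
    resource_type = resource_type,
    limit = limit,
    ...
  )
  eurlex::elx_run_query(query = query)
}

# Requires network access, so only run in an interactive session.
if (interactive()) {
  directives <- run_limited("directive", include_force = TRUE)
  str(directives)          # inspect which columns were returned
  head(directives$work)    # Cellar URIs of the matched resources
}
```

Passing further include_* arguments through ... keeps the helper thin, so the behaviour stays whatever elx_make_query defines.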