Title: | Air Quality Evaluation |
Version: | 0.6.2 |
Date: | 2025-07-12 |
Description: | Developed for use by those tasked with the routine detection, characterisation and quantification of discrete changes in air quality time-series, such as identifying the impacts of air quality policy interventions. The main functions use signal isolation then break-point/segment (BP/S) methods based on 'strucchange' and 'segmented' methods to detect and quantify change events (Ropkins & Tate, 2021, <doi:10.1016/j.scitotenv.2020.142374>). |
Maintainer: | Karl Ropkins <k.ropkins@its.leeds.ac.uk> |
License: | GPL (≥ 3) |
Encoding: | UTF-8 |
LazyData: | true |
Depends: | R (≥ 3.5.0) |
Imports: | openair, dplyr, loa, ggplot2, strucchange, segmented, mgcv, tidyr, lubridate, purrr, ggtext, stats, data.table |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/karlropkins/AQEval, https://karlropkins.github.io/AQEval/ |
BugReports: | https://github.com/karlropkins/AQEval/issues |
Suggests: | tinytest |
NeedsCompilation: | no |
Packaged: | 2025-07-12 14:20:02 UTC; trakradmin |
Author: | Karl Ropkins |
Repository: | CRAN |
Date/Publication: | 2025-07-14 19:10:18 UTC |
Air Quality Evaluation
Description
R AQEval: R code for the analysis of discrete change in Air Quality time-series.
AQEval
AQEval
was developed for use by those tasked with
the routine detection, characterisation and quantification
of discrete changes in air quality time-series.
The main functions, quantBreakPoints
and quantBreakSegments
, use
break-point/segment (BP/S) methods
based on the consecutive use of methods in the
strucchange
and segmented
R
packages
to first detection (as break-points) and then characterise
and quantify (as segments), discrete changes in
air-quality time-series.
AQEval
functions adopt an openair
-friendly
approach using function and data structures that many
in the air quality research community are already familiar
with.
Most notably, most functions expect supplied data
to be time-series, to be supplied as a single
data.frame
(or similar R object), and for
time-series to be identified by column names.
The main functions are typically structured expect
first the data.frame
, then the name of the
pollutant to be used, then other arguments:
function(data, "polluant.name", ...)
output <- function(data, "polluant.name", ...)
Author(s)
Karl Ropkins
References
Ropkins, K. and Tate, J.E., 2021. Early observations on the impact of the COVID-19 lockdown on air quality trends across the UK. Science of the Total Environment, 754, p.142374. https://doi.org/10.1016/j.scitotenv.2020.142374
Ropkins, K., Tate, J.E., Walker, A. and Clark, T., 2022. Measuring the impact of air quality related interventions. Environmental Science: Atmospheres, 2(3), pp.500-516. https://doi.org/10.1039/d1ea00073j
Ropkins, K., Walker, A., Philips, I., Rushton, C., Clark, T. and Tate, J., Change Detection of Air Quality Time-Series Using the R Package AEQval. Available at SSRN 4267722. https://ssrn.com/abstract=4267722 or http://dx.doi.org/10.2139/ssrn.4267722 Also at: https://karlropkins.github.io/AQEval/articles/AQEval_Intro_Preprint.pdf
See Also
For more about data structure and an example data set,
see AQEval.data
For more about the main functions, see
quantBreakPoints
and quantBreakSegments
AQEval Example data
Description
Data packaged with AQEval for use with example code.
Usage
aq.data
Format
(26280x6) 'tbl_df' objects
- date
Time-series of POSIX class date and time records.
- no2
Time-series of nitrogen dioxide measurements from local site.
- bg.no2
Time-series of nitrogen dioxide measurements from nearby background site.
- ws
Time-series of local wind speed measurements.
- wd
Time-series of local wind direction measurements.
- air_temp
Time-series of local air temperature measurements.
Details
Most of functions in AQEval
adopt the
openair
convention of assuming supplied data is
a single data.frame
or similar.
The data frame was initially adopted for two reasons:
Firstly, air quality data collected and archived in numerous formats and keeping the import requirements simple minimises the frustrations associated with data importation.
Secondly, restricting the user to work with a single data format greatly simplifies data management for those less familiar with programming environments.
As part of this work several openair
coding
conventions were adopted, most importantly that data
sets should include a column named date
of
POSIX
class data-and-time-stamps
(DateTimeClasses
).
This and other conventions, such as the use of
ws
and wd
for numeric wind speed and
direction data-series, and site
and code
for character or factor monitoring site name and
identifier code, are now commonplace for many working
with R in the air quality research community, and many
air quality archives provide data in (or support import
functions that convert their own data structures to)
this openair
-friendly structure.
Source
Air quality and meteorological data packaged for use with AQEval Examples.
Time-series sources:
-
date Date-and-time-stamp of POSIX class (
DateTimeClasses
). -
no2 Nitrogen dioxide downloaded from King's College London Archive using
importKCL
function inopenair
. -
bg.no2 Nitrogen dioxide downloaded from the Automatic Urban and Rural Network Archive using
importAURN
function inopenair
. -
ws, wd, air_temp Wind speed, wind direction and air temperature downloaded from NOAA's Integrated Surface Database using
importNOAA
function inworldmet
.
References
Regarding openair
and openair
-friendly
data structuring, see:
Carslaw, D. C. and K. Ropkins (2012), openair — an R package for air quality data analysis. Environmental Modelling & Software. Volume 27-28, 52-61, DOI doi:10.1016/j.envsoft.2011.09.008
Ropkins, K. and D.C. Carslaw (2012), openair-Data Analysis Tools for the Air Quality Community. R Journal, 4(1). URL https://journal.r-project.org/archive/2012/RJ-2012-003/RJ-2012-003.pdf
Regarding worldmet
, see:
David Carslaw (2021), worldmet: Import Surface Meteorological Data from NOAA Integrated Surface Database (ISD). R package version 0.9.5. URL https://CRAN.R-project.org/package=worldmet
See Also
openair
: functions importAURN
and
importKCL
worldmet
: function importNOAA
(See References)
Examples
#data set used in AQEval Examples
dim(aq.data)
head(aq.data)
with(aq.data, plot(date, no2, type="l"))
aqeval.generics
Description
Generic functions for use with aqe
object class for
AQEval
outputs.
Usage
## S3 method for class 'aqe'
print(x, ...)
## S3 method for class 'aqe'
plot(x, ...)
Arguments
x |
the |
... |
additional arguments, typically passed on to next method or ignored. |
Some functions to calculate statistics
Description
Calculate data set statistics for selected time intervals.
Usage
calcDateRangeStat(
data,
from = NULL,
to = NULL,
stat = NULL,
pollutant = NULL,
...,
method = 2
)
calcRollingDateRangeStat(
data,
range = "year",
res = "day",
stat = NULL,
pollutant = NULL,
from = NULL,
to = NULL,
...,
method = 2
)
Arguments
data |
(data.frame, tibble, etc) Data set containing
data statistic to be calculated for, and |
from |
(various) Start date(s) to subsample from when
calculating statistic, by default end of supplied
|
to |
(various) End date(s) to subsample to when
calculating statistic, by default end of supplied
|
stat |
(function) Statistic to be applied to selected
data, by default |
pollutant |
(character) The name(s) of data-series to
analyse in |
... |
extra arguments. |
method |
(numeric) Method to use when calculating statistic. Currently 1 (using base R), 2 (using dplyr), 3 (using data.table), and 4 (using dplyr and purrr) |
range |
(character) For |
res |
(character) For |
Value
These functions return data.frame
s of function
outputs.
Note
These functions are in development and likely to change significantly in future versions, please handle with care.
find and test break-points
Description
Finding and testing break-points in conventionally formatted air quality data sets.
Usage
findBreakPoints(data, pollutant, h = 0.15, ...)
testBreakPoints(data, pollutant, breaks, ...)
Arguments
data |
Data source, typically a |
pollutant |
Name of time-series, assumed to be
a column in |
h |
( |
... |
other parameters |
breaks |
( |
Details
findBreakPoints
uses methods from
strucchange
package (see references) and
modifications as suggested by the main author of
strucchange
to handle missing cases to find
potential breaks-points in a supplied time-series.
testBreakPoints
tests and identifies most likely
break-points using methods proposed for use with
quantBreakPoints
and quantBreakSegments
and conventionally formatted air quality data sets.
Value
findBreakPoints
returns a data.frame
of found break-points.
testBreakPoints
return a likely break-point/segment
report.
References
Regarding strucchange
methods see
breakpoints
, and:
Achim Zeileis, Friedrich Leisch, Kurt Hornik and Christian Kleiber (2002). strucchange: An R Package for Testing for Structural Change in Linear Regression Models. Journal of Statistical Software, 7(2), 1-38. URL https://www.jstatsoft.org/v07/i02/.
Achim Zeileis, Christian Kleiber, Walter Kraemer and Kurt Hornik (2003). Testing and Dating of Structural Changes in Practice. Computational Statistics & Data Analysis, 44, 109-123.
Regarding missing data handling, see:
URL: https://stackoverflow.com/questions/43243548/strucchange-not-reporting-breakdates.
Regarding testBreakPoints
, see:
Ropkins, K., Walker, A., Philips, I., Rushton, C., Clark, T. and Tate, J., Change Detection of Air Quality Time-Series Using the R Package AEQval. Available at SSRN 4267722. https://ssrn.com/abstract=4267722 or http://dx.doi.org/10.2139/ssrn.4267722 Also at: https://karlropkins.github.io/AQEval/articles/AQEval_Intro_Preprint.pdf
See Also
find nearby sites
Description
Function to find nearest locations in a reference by latitude and longitude.
Usage
findNearLatLon(lat, lon = NULL, nmax = 10, ..., ref = NULL, units = "m")
findNearSites(
lat,
lon,
pollutant = "no2",
site.type = "rural background",
nmax = 10,
...,
ref = NULL,
units = "m"
)
Arguments
lat , lon |
(numeric) The supplied latitude and longitude. |
nmax |
(numeric) The maximum number of nearest sites to report, by default 10. |
... |
Other parameters, mostly ignored. |
ref |
( |
units |
(character) The units to use when reporting distances to near locations; current options m. |
pollutant |
(character) For |
site.type |
(character) For |
Details
If investigating air quality in a particular location,
for example a UK Clean Air Zone
(https://www.gov.uk/guidance/driving-in-a-clean-air-zone),
you may wish to locate an appropriate rural background air quality
monitoring station. findNearSites
locates air quality monitoring
sites with openly available data such as that available from the UK AURN
network (https://uk-air.defra.gov.uk/networks/network-info?view=aurn)
Value
find.near
returns data.frame
of near site meta
data.
Note
This function uses haversine formula to account to the Earth's surface curvature, and uses 6371 km as the radius of earth.
Examples
#find rural background NO2 monitoring sites
#near latitude = 50, longitude = -1
#not run: requires internet
## Not run:
findNearSites(lat = 50, lon = -1)
## End(Not run)
isolateContribution
Description
Environmental time-series signal processing: Contribution isolation based on background subtraction, deseasonalisation and/or deweathering.
Usage
isolateContribution(
data,
pollutant,
background = NULL,
deseason = TRUE,
deweather = TRUE,
method = 2,
add.term = NULL,
formula = NULL,
use.bam = FALSE,
output = "mean",
...
)
Arguments
data |
Data source, typically |
pollutant |
The column name of the |
background |
(optional) if supplied, the background time-series to use as a background correction. See below. |
deseason |
logical or character vector, if
|
deweather |
logical or character vector, if
|
method |
numeric, contribution isolation method (default 2). See Note. |
add.term |
extra terms to add to the contribution isolation model; ignore for now (in development). |
formula |
(optional) Signal isolate model formula;
this allows user to set the signal isolation model formula
directly, but means function arguments |
use.bam |
(logical) If TRUE, the |
output |
output options; currently, |
... |
other arguments; ignore for now (in development) |
Details
isolateContribution
estimates and
subtracts pollutant
variance associated with
factors that may hinder break-point/segment analysis:
-
Background Correction If applied, this fits the supplied
background
time-series as a spline term:s(background)
. -
Seasonality If applied, this fits regular frequency terms, e.g.
day.hour
,year.day
, as spline terms, default TRUE is equivalent tos(day.hour)
ands(year.day)
. All terms are calculated fromdate
column indata
. -
Weather If applied, this fits time-series of identified meteorological measurements, e.g. wind speed and direction (
ws
andwd
indata
). If bothws
andwd
are present these are fitted as a tensor termte(ws, wd)
. Otherdeweather
ing terms, if included, are fitted as spline terms(term)
. The defaultTRUE
is equivalent tote(ws, wd)
.
Using the supplied arguments, it builds a signal
(mgcv
) GAM model, calculates,
and returns the mean-centred residuals as an
estimate of the isolated local contribution.
Value
isolateContribution
returns a vector of
predictions of the pollutant
time-series after
the requested signal isolation.
Note
method
was included as part of method
development and testing work, and retained for now.
Please ignore for now.
Author(s)
Karl Ropkins
References
Regarding mgcv
GAM fitting methods, see
Wood (2017) for general introduction and package
documentation regarding coding (mgcv
):
Wood, S.N. (2017) Generalized Additive Models: an introduction with R (2nd edition), Chapman and Hall/CRC.
Regarding isolateContribution
, see:
Ropkins, K., Walker, A., Philips, I., Rushton, C., Clark, T. and Tate, J., Change Detection of Air Quality Time-Series Using the R Package AEQval. Available at SSRN 4267722. https://ssrn.com/abstract=4267722 or http://dx.doi.org/10.2139/ssrn.4267722 Also at: https://karlropkins.github.io/AQEval/articles/AQEval_Intro_Preprint.pdf
See Also
Regarding seasonal terms and frequency
analysis, see also stl
and
spectralFrequency
.
Examples
#fitting a simple deseasonalisation, deweathering
#and background correction (dswb) model to no2:
aq.data$dswb.no2 <- isolateContribution(aq.data,
"no2", background="bg.no2")
#compare at 14 day resolution:
temp <- openair::timeAverage(aq.data, "14 day")
#without dswb
quantBreakPoints(temp, "no2", test=FALSE, h=0.1)
#with dswb
quantBreakPoints(temp, "dswb.no2", test=FALSE, h=0.1)
Other Air Quality Models
Description
Other packaged Air Quality Models.
Usage
fitNearSiteModel(data, pollutant = "no2", y, x = "rest", elements = NULL, ...)
Arguments
data |
|
pollutant |
The name of the |
y |
The name of the monitor site to be modelled,
assumed to be one several names in the |
x |
The other sites to use when building the model, the default 'rest' uses all supplied sites except 'y'. |
elements |
The number of inputs to use in the
site models, can be any number up to length of x or
combination thereof; by default this is set as
|
... |
extra arguments. |
Details
fitNearSiteModel
builds an air quality
model for one location using air quality data from nearby
sites.
Value
data
with model output added as additional
column.
quantify break-point/segments
Description
Quantify either break-points or break-segment methods for pollutant time-series
Usage
quantBreakPoints(
data,
pollutant,
breaks,
ylab = NULL,
xlab = NULL,
pt.col = c("lightgrey", "darkgrey"),
line.col = "red",
break.col = "blue",
event = NULL,
show = c("plot", "report"),
...
)
quantBreakSegments(
data,
pollutant,
breaks,
ylab = NULL,
xlab = NULL,
pt.col = c("lightgrey", "darkgrey"),
line.col = "red",
break.col = "blue",
event = NULL,
seg.method = 2,
seg.seed = 12345,
show = c("plot", "report"),
...
)
Arguments
data |
Data source, typically a data.frame or similar, containing data-series to model and a paired time-stamp data-series, named date. |
pollutant |
The name of the data-series to break-point or break-segment model. |
breaks |
(Optional) The break-points and
confidence intervals to use when building either
break-point or break-segment models. If not supplied
these are build using |
ylab |
Y-label term, by default pollutant. |
xlab |
X-label term, by default date. |
pt.col |
Point fill and line colours for plot, defaults lightgrey and darkgrey. |
line.col |
Line colour for plot, default red. |
break.col |
Break-point/segment colour for plot, default blue. |
event |
An optional list of plot terms for an event
marker, applied to a vertical line and text label. List
items include: |
show |
What to show before returning the break-point quantification mode, by default plot and report. |
... |
other parameters |
seg.method |
( |
seg.seed |
( |
Details
quantBreakPoints
and
quantBreakSegments
both use
strucchange
methods to identify potential
break-points in time-series, and then quantify
these as conventional break-points or break-segments,
respectively:
-
Finding Break-points Using the
strucchange
methods of Zeileis and colleagues and independent change detection model, the functions apply a rolling-window approach, assuming the first window (or data subset) is without change, building a statistical model of that, advancing the window, building a second model and comparing these, and so on, to identify the most likely points of change in a larger data-series. See alsofindBreakPoints
-
Quantifying Break-points Using the supplied break-points to build a break-point model.
-
Quantifying Break-segments Using the confidence regions for the supplied break-points as the starting points to build a break-segment model.
Value
Both functions use the show
argument
to control which elements of the functions outputs
are shown but also invisible return a list
of all outputs which can caught using, e.g.:
brk.mod <- quantBreakPoints(data, pollutant)
Note
AQEval
function quantBreakSegments
is currently running segmented v.1.3-4
while we
evaluate latest version, v.1.4-0
.
Author(s)
Karl Ropkins
References
Regarding strucchange
methods see in-package
documentation, e.g. breakpoints
,
and:
Achim Zeileis, Friedrich Leisch, Kurt Hornik and Christian Kleiber (2002). strucchange: An R Package for Testing for Structural Change in Linear Regression Models. Journal of Statistical Software, 7(2), 1-38. URL https://www.jstatsoft.org/v07/i02/.
Achim Zeileis, Christian Kleiber, Walter Kraemer and Kurt Hornik (2003). Testing and Dating of Structural Changes in Practice. Computational Statistics & Data Analysis, 44, 109-123. DOI doi:10.1016/S0167-9473(03)00030-6.
Regarding segmented
methods see in-package
documentation, e.g.
segmented
, and:
Vito M. R. Muggeo (2003). Estimating regression models with unknown break-points. Statistics in Medicine, 22, 3055-3071. DOI 10.1002/sim.1545.
Vito M. R. Muggeo (2008). segmented: an R Package to Fit Regression Models with Broken-Line Relationships. R News, 8/1, 20-25. URL https://cran.r-project.org/doc/Rnews/.
Vito M. R. Muggeo (2016). Testing with a nuisance parameter present only under the alternative: a score-based approach with application to segmented modelling. J of Statistical Computation and Simulation, 86, 3059-3067. DOI 10.1080/00949655.2016.1149855.
Vito M. R. Muggeo (2017). Interval estimation for the breakpoint in segmented regression: a smoothed score-based approach. Australian & New Zealand Journal of Statistics, 59, 311-322. DOI 10.1111/anzs.12200.
Regarding break-points/segment methods, see:
Ropkins, K., Walker, A., Philips, I., Rushton, C., Clark, T. and Tate, J., Change Detection of Air Quality Time-Series Using the R Package AEQval. Available at SSRN 4267722. https://ssrn.com/abstract=4267722 or http://dx.doi.org/10.2139/ssrn.4267722 Also at: https://karlropkins.github.io/AQEval/articles/AQEval_Intro_Preprint.pdf
See Also
timeAverage
in openair
,
breakpoints
in strucchange
, and
segmented
in segmented
.
Examples
#using openair timeAverage to covert 1-hour data to 1-day averages
temp <- openair::timeAverage(aq.data, "1 day")
#break-points
quantBreakPoints(temp, "no2", h=0.3)
#break-segments
quantBreakSegments(temp, "no2", h=0.3)
#addition examples (not run)
## Not run:
#in-call plot modification
#removing x axis label
#recolouring break line and
#adding an event marker
quantBreakPoints(temp, "no2", h=0.3,
xlab="", break.col = "red",
event=list(label="Event expected here",
x="2002-08-01", col="grey"))
## End(Not run)
Spectral Analysis
Description
Time-series spectral frequency analysis.
Usage
spectralFrequency(data, pollutant, ...)
Arguments
data |
|
pollutant |
The name of the time-series, typically pollutant measurements, to be analysed. |
... |
extra arguments. |
Details
spectralFrequency
producing a
time frequency analysis of the requested
pollutant
.
Value
spectralFrequency
uses the show
argument to control which elements of the functions outputs
are shown but also invisibly returns a list
of all outputs which can caught using, e.g.:
sfa.mod <- spectralFrequency(data, pollutant)
Examples
spectralFrequency(aq.data, "no2")