Title: | Functions to Work with 'incidence2' Objects |
---|---|
Description: | Provides functions to work with 'incidence2' objects, including a simplified interface for trend fitting and peak estimation. This package is part of the RECON (<https://www.repidemicsconsortium.org/>) toolkit for outbreak analysis (<https://www.reconverse.org/>). |
Authors: | Tim Taylor [aut, cre], Thibaut Jombart [aut] |
Maintainer: | Tim Taylor <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.1.9000 |
Built: | 2024-11-05 04:59:00 UTC |
Source: | https://github.com/reconverse/i2extras |
add_rolling_average() adds a rolling average to an <incidence2> object. If multiple groupings or count variables are present, the average is calculated for each.
add_rolling_average( x, n = 3L, complete_dates = TRUE, align = c("right", "center"), colname = "rolling_average", ... )
x |
|
n |
How many date groupings to consider in each window?
|
complete_dates |
Should Defaults to TRUE. |
align |
Character, specifying the "alignment" of the rolling window, defaulting to |
colname |
The name of the column to contain the rolling average. |
... |
Other arguments passed to |
The input object with an additional column for the rolling average.
if (requireNamespace("outbreaks", quietly = TRUE)) {
  data(ebola_sim_clean, package = "outbreaks")
  dat <- ebola_sim_clean$linelist
  dat <- subset(dat, date_of_onset <= as.Date("2014-10-05"))

  inci <- incidence2::incidence(
    dat,
    date_index = "date_of_onset",
    groups = "gender",
    interval = "isoweek"
  )
  add_rolling_average(inci, n = 3L)

  inci2 <- incidence2::regroup(inci)
  add_rolling_average(inci2, n = 7L)
}
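To make the n and align arguments concrete, the following base R sketch computes a trailing ("right") and a centred rolling mean over a plain vector of counts. It is an illustration only and is not the package's implementation; the helper roll_mean() and the sample counts are hypothetical.

```r
# Hypothetical helper: rolling mean over a numeric vector using a base R
# convolution filter. This is NOT the i2extras implementation.
roll_mean <- function(x, n = 3L, align = c("right", "center")) {
  align <- match.arg(align)
  # sides = 1 uses the current value and the previous n - 1 values;
  # sides = 2 centres the window on the current value.
  sides <- if (align == "right") 1L else 2L
  as.numeric(stats::filter(x, rep(1 / n, n), sides = sides))
}

counts <- c(1, 2, 3, 4, 5)  # made-up weekly counts
right_avg <- roll_mean(counts, n = 3L, align = "right")
center_avg <- roll_mean(counts, n = 3L, align = "center")
```

Positions that lack a full window are returned as NA in this sketch; how the package treats incomplete windows is not specified here.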
This function can be used to bootstrap [incidence2] objects. Bootstrapping is done by sampling the original input dates with replacement.
bootstrap(x, randomise_groups = FALSE)
x |
An |
randomise_groups |
Should groups be randomised as well in the resampling procedure; respective group sizes will be preserved, but this can be used to remove any group-specific temporal dynamics. If |
As original data are not stored in incidence2::incidence objects, the bootstrapping is achieved by multinomial sampling of date bins weighted by their relative incidence.
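The multinomial resampling described above can be sketched in base R as follows. This is a minimal illustration under assumed, made-up counts; it does not reproduce the package's code or its handling of groups.

```r
set.seed(1)

# Made-up daily counts; the third bin has zero incidence.
dates <- as.Date("2024-01-01") + 0:4
counts <- c(3L, 10L, 0L, 6L, 1L)

# One bootstrap replicate: redistribute the same total number of cases
# across the date bins, with probabilities proportional to observed incidence.
boot_counts <- as.vector(
  stats::rmultinom(1, size = sum(counts), prob = counts / sum(counts))
)

boot <- data.frame(date = dates, count = boot_counts)
```

The total number of cases is preserved, and a bin with zero observed incidence has zero sampling probability, so it stays empty in every replicate.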
An [incidence2] object.
Thibaut Jombart, Tim Taylor
if (requireNamespace("outbreaks", quietly = TRUE)) {
  data(fluH7N9_china_2013, package = "outbreaks")
  i <- incidence(
    fluH7N9_china_2013,
    date_index = "date_of_onset",
    groups = "gender"
  )
  bootstrap(i)
}
This function can be used to estimate the peak of an epidemic curve using bootstrapped samples of the available data.
estimate_peak(x, n = 100L, alpha = 0.05, first_only = TRUE, progress = TRUE)
x |
An |
n |
The number of bootstrap datasets to be generated; defaults to 100.
|
alpha |
The type 1 error chosen for the confidence interval; defaults to 0.05. |
first_only |
Should only the first peak (by date) be kept. Defaults to |
progress |
Should a progress bar be displayed (default = TRUE) |
Input dates are resampled with replacement to form bootstrapped datasets; the peak is reported for each, resulting in a distribution of peak times. When there are ties for peak incidence, only the first date is reported.
Note that the bootstrapping approach used for estimating the peak time makes the following assumptions:
the total number of events is known (no uncertainty on total incidence)
dates with no events (zero incidence) will never be in bootstrapped datasets
the reporting is assumed to be constant over time, i.e. every case is equally likely to be reported
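The procedure and assumptions above can be illustrated with a short base R sketch. The counts are made up, and the quantile-based interval is an assumption about how the confidence bounds might be derived; it is not taken from the package source.

```r
set.seed(42)

dates <- as.Date("2024-01-01") + 0:9
counts <- c(1L, 3L, 7L, 12L, 15L, 14L, 9L, 4L, 2L, 0L)  # made-up epicurve

n_boot <- 100L
alpha <- 0.05

# Each column is one bootstrapped epicurve with the same total incidence.
boot <- stats::rmultinom(n_boot, size = sum(counts), prob = counts / sum(counts))

# which.max() returns the first maximum, so ties resolve to the earliest date.
peak_dates <- dates[apply(boot, 2, which.max)]

# type = 1 quantiles return observed values, so the results are valid dates.
qs <- stats::quantile(
  as.numeric(peak_dates),
  probs = c(alpha / 2, 0.5, 1 - alpha / 2),
  type = 1,
  names = FALSE
)

peak_summary <- data.frame(
  observed_date = dates[which.max(counts)],
  observed_count = max(counts),
  lower_ci = as.Date(qs[1], origin = "1970-01-01"),
  estimated = as.Date(qs[2], origin = "1970-01-01"),
  upper_ci = as.Date(qs[3], origin = "1970-01-01")
)
```

Because the last date has zero observed incidence, it can never be sampled and so never appears as a bootstrapped peak, which mirrors the second assumption listed above.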
A data frame with the following columns:
observed_date: the date of peak incidence of the original dataset.
observed_count: the peak incidence of the original dataset.
estimated: the median peak time of the bootstrap datasets.
lower_ci/upper_ci: the confidence interval based on bootstrap datasets.
bootstrap_peaks: a nested tibble containing the peak times of the bootstrapped datasets.
Thibaut Jombart and Tim Taylor, with inputs on caveats from Michael Höhle.
bootstrap() for the bootstrapping underlying this approach and find_peak() to find the peak in a single [incidence2] object.
if (requireNamespace("outbreaks", quietly = TRUE)) {
  # load data and create incidence
  data(fluH7N9_china_2013, package = "outbreaks")
  i <- incidence(fluH7N9_china_2013, date_index = "date_of_onset")

  # find 95% CI for peak time using bootstrap
  estimate_peak(i)
}
This function can be used to find the peak of an epidemic curve stored as an [incidence2] object.
find_peak(x, complete_dates = TRUE, ...)
x |
incidence2 object. |
complete_dates |
Should Defaults to TRUE. |
... |
Other arguments passed to |
An [incidence2] object containing the date of the (first) highest incidence in the data along with the count. If x is a grouped object, the output has the peak calculated for each grouping.
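As a rough illustration of "first highest incidence per grouping", the base R sketch below picks the earliest maximum within each group of a made-up data frame. It works on a plain data frame rather than an incidence2 object and is not the package's implementation.

```r
# Made-up grouped incidence; not produced by incidence2.
dat <- data.frame(
  date = rep(as.Date("2024-01-01") + 0:3, times = 2),
  group = rep(c("f", "m"), each = 4),
  count = c(2L, 9L, 9L, 1L, 5L, 4L, 7L, 7L)
)

# For each group, keep the row holding the first maximum count.
# which.max() returns the earliest position when there are ties.
peaks <- do.call(
  rbind,
  lapply(split(dat, dat$group), function(d) d[which.max(d$count), ])
)
```

Both groups contain a tie at the maximum, and in each case the earlier date is the one retained.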
estimate_peak() for bootstrap estimates of the peak time.
if (requireNamespace("outbreaks", quietly = TRUE)) {
  # load data and create incidence
  data(fluH7N9_china_2013, package = "outbreaks")
  i <- incidence(fluH7N9_china_2013, date_index = "date_of_onset")
  find_peak(i)
}
Fit an epi curve
fit_curve(x, model = c("poisson", "negbin"), alpha = 0.05, ...)
x |
An incidence2::incidence object. |
model |
The regression model to fit (can be "poisson" or "negbin"). |
alpha |
Value of alpha used to calculate confidence intervals; defaults to 0.05 which corresponds to a 95% confidence interval. |
... |
Additional arguments to pass to |
An object of class incidence2_fit.
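For orientation, a "poisson" curve fit of this kind is typically a log-linear regression of counts on time. The sketch below uses stats::glm() on made-up data and builds an approximate confidence band from alpha. It is an assumption about the general technique, not a description of what fit_curve() does internally.

```r
# Made-up counts growing at roughly 10% per day.
days <- 0:29
counts <- round(10 * exp(0.1 * days))
dat <- data.frame(day = days, count = counts)

# Log-linear Poisson regression: log(E[count]) = intercept + r * day.
fit <- stats::glm(count ~ day, family = stats::poisson(), data = dat)

# Fitted curve with an approximate (1 - alpha) confidence band, built on the
# link (log) scale and then back-transformed to the count scale.
alpha <- 0.05
pred <- stats::predict(fit, type = "link", se.fit = TRUE)
z <- stats::qnorm(1 - alpha / 2)
dat$estimate <- exp(pred$fit)
dat$lower_ci <- exp(pred$fit - z * pred$se.fit)
dat$upper_ci <- exp(pred$fit + z * pred$se.fit)
```

A negative binomial counterpart could be fitted in the same way with MASS::glm.nb(), which can be more robust when counts are overdispersed.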
Low counts may be genuine, but they can also reflect missing data or strong under-reporting. This function aims to detect the latter by flagging any count below a threshold, expressed as a fraction of the median count. Setting low values to NA can help when fitting temporal trends to the data, as zeros and low counts can throw off some models (e.g. negative binomial GLMs).
flag_low_counts(x, counts = NULL, threshold = 0.001, set_missing = TRUE)
x |
An incidence2::incidence object. |
counts |
A tidyselect compliant indication of the counts to be used. |
threshold |
A numeric multiplier of the median count to be used as threshold. Defaults to 0.001, in which case any count strictly lower than 0.1% of the median count is flagged as low count. |
set_missing |
A |
An incidence2::incidence object.
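The thresholding rule can be sketched on a plain numeric vector as follows. The counts are made up, and details such as how the package treats existing NA values or groups when computing the median are not covered here.

```r
counts <- c(0, 50, 100, 120, 1, 90)  # made-up weekly counts
threshold <- 0.1

# The cut-off is a fraction of the median count (here 0.1 * 70 = 7).
cutoff <- threshold * stats::median(counts, na.rm = TRUE)

# Counts strictly below the cut-off are flagged as low.
is_low <- counts < cutoff

# Analogue of set_missing = TRUE: replace flagged counts with NA.
flagged <- replace(counts, is_low, NA)
```

With threshold = 1 the same rule would flag every count below the median itself, as in the last plot of the example below.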
Tim Taylor and Thibaut Jombart
if (requireNamespace("outbreaks", quietly = TRUE) &&
    requireNamespace("incidence2", quietly = TRUE)) {
  data(covid19_england_nhscalls_2020, package = "outbreaks")
  dat <- covid19_england_nhscalls_2020
  i <- incidence(dat, "date", interval = "isoweek", counts = "count")
  plot(i)
  plot(flag_low_counts(i, threshold = 0.1))
  plot(flag_low_counts(i, threshold = 1), title = "removing counts below the median")
}
Calculate growth/decay rate
growth_rate(x, ...)

## Default S3 method:
growth_rate(x, ...)

## S3 method for class 'incidence2_fit'
growth_rate(
  x,
  alpha = 0.05,
  growth_decay_time = TRUE,
  include_warnings = FALSE,
  ...
)
x |
The output of |
... |
Not currently used. |
alpha |
Value of alpha used to calculate confidence intervals; defaults to 0.05 which corresponds to a 95% confidence interval. |
growth_decay_time |
Should a doubling/halving time and corresponding confidence intervals be added to the output. Default TRUE. |
include_warnings |
Include models in output that triggered warnings but not errors. Defaults to |
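The link between a growth rate and a doubling or halving time can be shown with a small worked sketch. It assumes a log-linear Poisson model fitted with stats::glm() to made-up data and a Wald interval for the slope; both are illustrative choices and are not confirmed to match the package internals.

```r
days <- 0:29
counts <- round(10 * exp(0.1 * days))  # made-up counts, ~10% daily growth
fit <- stats::glm(
  count ~ day,
  family = stats::poisson(),
  data = data.frame(day = days, count = counts)
)

alpha <- 0.05

# The slope on the log scale is the daily growth rate r.
r <- unname(stats::coef(fit)["day"])

# Wald interval for the slope (an assumption; other interval types exist).
r_ci <- stats::confint.default(fit, parm = "day", level = 1 - alpha)

# For r > 0 the doubling time is log(2) / r; for r < 0 the halving time
# would be log(2) / -r.
doubling_time <- log(2) / r

# Dividing by r reverses the order of the bounds, so sort them.
doubling_ci <- sort(log(2) / as.numeric(r_ci))
```

Here the estimated doubling time is close to log(2) / 0.1, i.e. about 7 days, as expected from the simulated growth.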
Tim Taylor
These functions are used to filter successful model fits from those that errored or gave warnings.
is_ok(x, ...)

## Default S3 method:
is_ok(x, ...)

## S3 method for class 'incidence2_fit'
is_ok(x, include_warnings = FALSE, ...)

is_error(x, ...)

## Default S3 method:
is_error(x, ...)

## S3 method for class 'incidence2_fit'
is_error(x, ...)

is_warning(x, ...)

## Default S3 method:
is_warning(x, ...)

## S3 method for class 'incidence2_fit'
is_warning(x, ...)
x |
The output of function |
... |
Not currently used. |
include_warnings |
Include results in output that triggered warnings but not errors. Defaults to |
is_ok(): returns rows from an <incidence2_fit> object that did not error (and, unless include_warnings = TRUE, did not produce a warning).
is_error(): returns rows from an <incidence2_fit> object that errored.
is_warning(): returns rows from an <incidence2_fit> object that produced warnings.
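The idea of separating clean fits from those that warned or errored can be sketched with base R condition handling. The helper capture_fit() and the toy expressions are hypothetical; the structure of a real <incidence2_fit> object is not shown here and may differ.

```r
# Hypothetical helper: evaluate an expression and record any warning or
# error message alongside the result, instead of signalling them.
capture_fit <- function(expr) {
  warn <- NULL
  err <- NULL
  res <- tryCatch(
    withCallingHandlers(
      expr,
      warning = function(w) {
        warn <<- conditionMessage(w)
        invokeRestart("muffleWarning")  # continue after recording the warning
      }
    ),
    error = function(e) {
      err <<- conditionMessage(e)
      NULL
    }
  )
  list(result = res, warning = warn, error = err)
}

# Toy stand-ins for model fits: one clean, one warning, one error.
fits <- list(
  ok = capture_fit(log(10)),
  warned = capture_fit(log(-1)),
  failed = capture_fit(stop("boom"))
)

has_error <- vapply(fits, function(f) !is.null(f$error), logical(1))
has_warning <- vapply(fits, function(f) !is.null(f$warning), logical(1))

ok_strict <- names(fits)[!has_error & !has_warning]  # include_warnings = FALSE
ok_lenient <- names(fits)[!has_error]                # include_warnings = TRUE
```

The two final vectors mirror the two behaviours of is_ok(): by default only fits without warnings or errors are kept, while the lenient variant also keeps fits that warned.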
Tim Taylor
Plot a fitted epicurve
## S3 method for class 'incidence2_fit'
plot(x, include_warnings = TRUE, ci = TRUE, pi = FALSE, ...)
x |
An |
include_warnings |
Include results in plot that triggered warnings but not errors. Defaults to |
ci |
Plot confidence intervals. Defaults to TRUE. |
pi |
Plot prediction intervals. Defaults to FALSE. |
... |
Additional arguments to be passed to |
An incidence plot with the addition of a fitted curve.
Tim Taylor