fit_seir.Rd
This function fits a Stan SEIR model to one or more sets of COVID-19 case
data. See project_seir()
for making predictions/projections.
fit_seir( daily_cases, obs_model = c("NB2", "Poisson"), forecast_days = 0, time_increment = 0.25, samp_frac_fixed = NULL, samp_frac_type = c("fixed", "estimated", "rw", "segmented"), samp_frac_seg = NULL, f_seg = c(0, rep(1, nrow(daily_cases) + forecast_days - 1)), days_back = 45, R0_prior = c(log(2.6), 0.2), phi_prior = 1, f_prior = c(0.4, 0.2), e_prior = c(0.8, 0.05), samp_frac_prior = c(0.4, 0.2), start_decline_prior = c(log(15), 0.05), end_decline_prior = c(log(22), 0.05), use_ramp = TRUE, rw_sigma = 0.1, seed = 42, chains = 4, iter = 2000, N_pop = 5100000, pars = c(D = 5, k1 = 1/5, k2 = 1, q = 0.05, ud = 0.1, ur = 0.02, f0 = 1), i0_prior = c(log(8), 1), state_0 = c(E1_frac = 0.4, E2_frac = 0.1, I_frac = 0.5, Q_num = 0, R_num = 0, E1d_frac = 0.4, E2d_frac = 0.1, Id_frac = 0.5, Qd_num = 0, Rd_num = 0), save_state_predictions = FALSE, delay_scale = 9.85, delay_shape = 1.73, ode_control = c(1e-07, 1e-06, 1e+06), fit_type = c("NUTS", "VB", "optimizing"), init = c("prior_random", "optimizing"), init_list = NULL, X = NULL, ... )
daily_cases | Either a vector of daily new cases if fitting to a single data type or a matrix of case data if fitting to multiple data types. Each data type should be in its own column. Can have NA values (will be ignored in likelihood). A vector will be turned into a one column matrix. |
---|---|
obs_model | Type of observation model. |
forecast_days | Number of days into the future to forecast. The model
will run faster with fewer forecasted days. It is recommended to set this
to 0 here and use |
time_increment | Time increment for ODEs and Weibull delay-model integration in units of days. Larger numbers will run faster, possibly at the expense of accuracy. |
samp_frac_fixed | A vector (or matrix) of sampled fractions. Should be
of dimensions: |
samp_frac_type | How to treat the sample fraction. Fixed, estimated, a constrained random walk, or segmented. Only applies to the first data type column. The other data types must always have a fixed sample fraction currently. |
samp_frac_seg | A vector of sample fraction segment indexes of length
|
f_seg | A vector of segment indexes of length |
days_back | Number of days to go back for the Weibull case-delay integration. Should be sufficiently large that the results do not change. |
R0_prior | Lognormal log mean and SD for the R0 prior. |
phi_prior | SD of |
f_prior | Beta mean and SD for the |
e_prior | Beta mean and SD for the |
samp_frac_prior |
|
start_decline_prior | Lognormal log mean and SD for the parameter
representing the day that social distancing starts ramping in
( |
end_decline_prior | Lognormal log mean and SD for the parameter
representing the day that the social distancing ramp finishes being ramped
in ( |
use_ramp | Logical. Use the ramp? |
rw_sigma | The fixed standard deviation on the optional |
seed | MCMC seed for |
chains | Number of MCMC chains for |
iter | MCMC iterations per chain for |
N_pop | Number of people in population. |
pars | A named numeric vector of fixed parameter values. |
i0_prior | Infected people infected at initial point in time. Lognormal log mean and SD for the parameter. Note that the default is set up for BC. You may need to use a lower prior for other regions. |
state_0 | Initial state: a named numeric vector. |
save_state_predictions | Include the state predictions? |
delay_scale | Weibull scale parameter for the delay in reporting. If there are multiple datatypes then this should be a vector of the same length as the data types. |
delay_shape | Weibull shape parameter for the delay in reporting. Same
format as for |
ode_control | Control options for the Stan ODE solver. First is relative difference, then absolute difference, and then maximum iterations. These can likely be left as is. |
fit_type | Stan sampling/fitting algorithm to use. NUTS =
|
init | Initialization type. Draw randomly from the prior or try to use
the MAP estimate? Can be overridden by the |
init_list | An optional named list foreign initialization. Note that the
|
X | An optional model matrix applied additively to log expected cases. |
... | Other arguments to pass to |
A named list object
# Example daily case data: cases <- c( 0, 0, 1, 3, 1, 8, 0, 6, 5, 0, 7, 7, 18, 9, 22, 38, 53, 45, 40, 77, 76, 48, 67, 78, 42, 66, 67, 92, 16, 70, 43, 53, 55, 53, 29, 26, 37, 25, 45, 34, 40, 35 ) # Example assumed sampling fractions of positive cases: s1 <- c(rep(0.1, 13), rep(0.2, length(cases) - 13)) # To use parallel processing with multiple chains: # options(mc.cores = parallel::detectCores() / 2) # Only using 200 iterations and MAP (optimizing) fitting for example speed: m <- fit_seir( cases, iter = 200, fit_type = "optimizing", samp_frac_fixed = s1 )#> Finding the MAP estimate.#> MAP estimate: #> i0 R0 start_decline end_decline f_s[1] #> 4.05 3.38 14.69 21.39 0.18 #> phi[1] e #> 6.66 0.82 #> Mean in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 4.73 3.40 14.77 21.38 0.19 #> phi[1] e #> 7.31 0.83 #> SD in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 3.08 0.27 0.71 1.00 0.09 #> phi[1] e #> 2.51 0.04#> [1] "fit" "post" "phi_prior" #> [4] "R0_prior" "f_prior" "obs_model" #> [7] "e_prior_trans" "start_decline_prior" "end_decline_prior" #> [10] "samp_frac_fixed" "state_0" "daily_cases" #> [13] "days" "time" "last_day_obs" #> [16] "pars" "f2_prior_beta_shape1" "f2_prior_beta_shape2" #> [19] "stan_data" "days_back" "opt" #> [22] "fit_type"#> [1] "i0" "R0" "start_decline" "end_decline" #> [5] "phi" "ur" "e" "f_s"# Use tidybayes if you'd like: # post_tidy <- tidybayes::spread_draws(m$fit, c(y_rep, mu)[day, data_type]) # Add hospitalization data and estimate 2 sample-fraction blocks # for the reported cases: samp_frac_seg <- c(rep(1, 13), rep(2, length(cases) - 13)) s2 <- rep(0.07, 42) # Assuming 7\% of positive individuals are hospitalized hosp <- c( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 11, 9, 4, 7, 19, 23, 13, 11, 3, 13, 21, 14, 17, 29, 21, 19, 19, 10, 6, 11, 8, 11, 8, 7, 6 ) # Only using 200 iterations and MAP (optimizing) fitting for example speed: m2 <- fit_seir( daily_cases = cbind(cases, hosp), iter = 200, samp_frac_type = "segmented", # `samp_frac_type` affects the first data type only samp_frac_seg = samp_frac_seg, samp_frac_fixed = cbind(s1, s2), # s1 is ignored and could be anything delay_scale = c(9.8, 11.2), delay_shape = c(1.7, 1.9), fit_type = "optimizing" )#> Finding the MAP estimate.#> Warning: first element used of 'length.out' argument#> Warning: first element used of 'length.out' argument#> MAP estimate: #> i0 R0 start_decline end_decline f_s[1] #> 0.71 3.85 14.91 21.59 0.15 #> phi[1] e #> 6.12 0.83 #> Mean in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 0.89 3.85 14.88 21.54 0.17 #> phi[1] e #> 6.61 0.82 #> SD in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 0.65 0.31 0.74 1.16 0.09 #> phi[1] e #> 2.22 0.05# Estimate a second f_s segment: f_seg <- c(rep(0, 14), rep(1, 20), rep(2, length(cases) - 20 - 14)) f_prior <- matrix(c(0.4,0.6, rep(0.2,2)), ncol=2, nrow=2) # nrow corresponds to num f_s segments # Only using 200 iterations and MAP (optimizing) fitting for example speed: m3 <- fit_seir( cases, iter = 200, f_seg = f_seg, f_prior = f_prior, samp_frac_fixed = s1, fit_type = "optimizing" )#> Finding the MAP estimate.#> MAP estimate: #> i0 R0 start_decline end_decline f_s[1] #> 3.96 3.40 14.69 21.39 0.17 #> f_s[2] phi[1] e #> 0.71 6.88 0.82 #> Mean in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 4.63 3.39 14.71 21.35 0.18 #> f_s[2] phi[1] e #> 0.66 7.27 0.82 #> SD in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 2.71 0.24 0.68 0.97 0.09 #> f_s[2] phi[1] e #> 0.21 2.64 0.05# Choose initial values as a named list: m <- fit_seir(cases, fit_type = "optimizing", samp_frac_fixed = s1, init_list = list( R0 = 3, f_s = array(0.2), # note that `f_s` is an array of appropriate length i0 = 5, ur = 0.025, start_decline = 15, end_decline = 22) )#> Finding the MAP estimate.