fit_seir.Rd
This function fits a Stan SEIR model to one or more sets of COVID19 case
data. See project_seir()
for making predictions/projections.
fit_seir( daily_cases, obs_model = c("NB2", "Poisson"), forecast_days = 0, time_increment = 0.25, samp_frac_fixed = NULL, samp_frac_type = c("fixed", "estimated", "rw", "segmented"), samp_frac_seg = NULL, f_seg = c(0, rep(1, nrow(daily_cases) + forecast_days  1)), days_back = 45, R0_prior = c(log(2.6), 0.2), phi_prior = 1, f_prior = c(0.4, 0.2), e_prior = c(0.8, 0.05), samp_frac_prior = c(0.4, 0.2), start_decline_prior = c(log(15), 0.05), end_decline_prior = c(log(22), 0.05), use_ramp = TRUE, rw_sigma = 0.1, seed = 42, chains = 4, iter = 2000, N_pop = 5100000, pars = c(D = 5, k1 = 1/5, k2 = 1, q = 0.05, ud = 0.1, ur = 0.02, f0 = 1), i0_prior = c(log(8), 1), state_0 = c(E1_frac = 0.4, E2_frac = 0.1, I_frac = 0.5, Q_num = 0, R_num = 0, E1d_frac = 0.4, E2d_frac = 0.1, Id_frac = 0.5, Qd_num = 0, Rd_num = 0), save_state_predictions = FALSE, delay_scale = 9.85, delay_shape = 1.73, ode_control = c(1e07, 1e06, 1e+06), fit_type = c("NUTS", "VB", "optimizing"), init = c("prior_random", "optimizing"), init_list = NULL, X = NULL, ... )
daily_cases  Either a vector of daily new cases if fitting to a single data type or a matrix of case data if fitting to multiple data types. Each data type should be in its own column. Can have NA values (will be ignored in likelihood). A vector will be turned into a one column matrix. 

obs_model  Type of observation model. 
forecast_days  Number of days into the future to forecast. The model
will run faster with fewer forecasted days. It is recommended to set this
to 0 here and use 
time_increment  Time increment for ODEs and Weibull delaymodel integration in units of days. Larger numbers will run faster, possibly at the expense of accuracy. 
samp_frac_fixed  A vector (or matrix) of sampled fractions. Should be
of dimensions: 
samp_frac_type  How to treat the sample fraction. Fixed, estimated, a constrained random walk, or segmented. Only applies to the first data type column. The other data types must always have a fixed sample fraction currently. 
samp_frac_seg  A vector of sample fraction segment indexes of length

f_seg  A vector of segment indexes of length 
days_back  Number of days to go back for the Weibull casedelay integration. Should be sufficiently large that the results do not change. 
R0_prior  Lognormal log mean and SD for the R0 prior. 
phi_prior  SD of 
f_prior  Beta mean and SD for the 
e_prior  Beta mean and SD for the 
samp_frac_prior 

start_decline_prior  Lognormal log mean and SD for the parameter
representing the day that social distancing starts ramping in
( 
end_decline_prior  Lognormal log mean and SD for the parameter
representing the day that the social distancing ramp finishes being ramped
in ( 
use_ramp  Logical. Use the ramp? 
rw_sigma  The fixed standard deviation on the optional 
seed  MCMC seed for 
chains  Number of MCMC chains for 
iter  MCMC iterations per chain for 
N_pop  Number of people in population. 
pars  A named numeric vector of fixed parameter values. 
i0_prior  Infected people infected at initial point in time. Lognormal log mean and SD for the parameter. Note that the default is set up for BC. You may need to use a lower prior for other regions. 
state_0  Initial state: a named numeric vector. 
save_state_predictions  Include the state predictions? 
delay_scale  Weibull scale parameter for the delay in reporting. If there are multiple datatypes then this should be a vector of the same length as the data types. 
delay_shape  Weibull shape parameter for the delay in reporting. Same
format as for 
ode_control  Control options for the Stan ODE solver. First is relative difference, then absolute difference, and then maximum iterations. These can likely be left as is. 
fit_type  Stan sampling/fitting algorithm to use. NUTS =

init  Initialization type. Draw randomly from the prior or try to use
the MAP estimate? Can be overridden by the 
init_list  An optional named list foreign initialization. Note that the

X  An optional model matrix applied additively to log expected cases. 
...  Other arguments to pass to 
A named list object
# Example daily case data: cases < c( 0, 0, 1, 3, 1, 8, 0, 6, 5, 0, 7, 7, 18, 9, 22, 38, 53, 45, 40, 77, 76, 48, 67, 78, 42, 66, 67, 92, 16, 70, 43, 53, 55, 53, 29, 26, 37, 25, 45, 34, 40, 35 ) # Example assumed sampling fractions of positive cases: s1 < c(rep(0.1, 13), rep(0.2, length(cases)  13)) # To use parallel processing with multiple chains: # options(mc.cores = parallel::detectCores() / 2) # Only using 200 iterations and MAP (optimizing) fitting for example speed: m < fit_seir( cases, iter = 200, fit_type = "optimizing", samp_frac_fixed = s1 )#> Finding the MAP estimate.#> MAP estimate: #> i0 R0 start_decline end_decline f_s[1] #> 4.05 3.38 14.69 21.39 0.18 #> phi[1] e #> 6.66 0.82 #> Mean in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 4.73 3.40 14.77 21.38 0.19 #> phi[1] e #> 7.31 0.83 #> SD in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 3.08 0.27 0.71 1.00 0.09 #> phi[1] e #> 2.51 0.04#> [1] "fit" "post" "phi_prior" #> [4] "R0_prior" "f_prior" "obs_model" #> [7] "e_prior_trans" "start_decline_prior" "end_decline_prior" #> [10] "samp_frac_fixed" "state_0" "daily_cases" #> [13] "days" "time" "last_day_obs" #> [16] "pars" "f2_prior_beta_shape1" "f2_prior_beta_shape2" #> [19] "stan_data" "days_back" "opt" #> [22] "fit_type"#> [1] "i0" "R0" "start_decline" "end_decline" #> [5] "phi" "ur" "e" "f_s"# Use tidybayes if you'd like: # post_tidy < tidybayes::spread_draws(m$fit, c(y_rep, mu)[day, data_type]) # Add hospitalization data and estimate 2 samplefraction blocks # for the reported cases: samp_frac_seg < c(rep(1, 13), rep(2, length(cases)  13)) s2 < rep(0.07, 42) # Assuming 7\% of positive individuals are hospitalized hosp < c( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 11, 9, 4, 7, 19, 23, 13, 11, 3, 13, 21, 14, 17, 29, 21, 19, 19, 10, 6, 11, 8, 11, 8, 7, 6 ) # Only using 200 iterations and MAP (optimizing) fitting for example speed: m2 < fit_seir( daily_cases = cbind(cases, hosp), iter = 200, samp_frac_type = "segmented", # `samp_frac_type` affects the first data type only samp_frac_seg = samp_frac_seg, samp_frac_fixed = cbind(s1, s2), # s1 is ignored and could be anything delay_scale = c(9.8, 11.2), delay_shape = c(1.7, 1.9), fit_type = "optimizing" )#> Finding the MAP estimate.#> Warning: first element used of 'length.out' argument#> Warning: first element used of 'length.out' argument#> MAP estimate: #> i0 R0 start_decline end_decline f_s[1] #> 0.71 3.85 14.91 21.59 0.15 #> phi[1] e #> 6.12 0.83 #> Mean in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 0.89 3.85 14.88 21.54 0.17 #> phi[1] e #> 6.61 0.82 #> SD in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 0.65 0.31 0.74 1.16 0.09 #> phi[1] e #> 2.22 0.05# Estimate a second f_s segment: f_seg < c(rep(0, 14), rep(1, 20), rep(2, length(cases)  20  14)) f_prior < matrix(c(0.4,0.6, rep(0.2,2)), ncol=2, nrow=2) # nrow corresponds to num f_s segments # Only using 200 iterations and MAP (optimizing) fitting for example speed: m3 < fit_seir( cases, iter = 200, f_seg = f_seg, f_prior = f_prior, samp_frac_fixed = s1, fit_type = "optimizing" )#> Finding the MAP estimate.#> MAP estimate: #> i0 R0 start_decline end_decline f_s[1] #> 3.96 3.40 14.69 21.39 0.17 #> f_s[2] phi[1] e #> 0.71 6.88 0.82 #> Mean in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 4.63 3.39 14.71 21.35 0.18 #> f_s[2] phi[1] e #> 0.66 7.27 0.82 #> SD in constrained space of MVN samples: #> i0 R0 start_decline end_decline f_s[1] #> 2.71 0.24 0.68 0.97 0.09 #> f_s[2] phi[1] e #> 0.21 2.64 0.05# Choose initial values as a named list: m < fit_seir(cases, fit_type = "optimizing", samp_frac_fixed = s1, init_list = list( R0 = 3, f_s = array(0.2), # note that `f_s` is an array of appropriate length i0 = 5, ur = 0.025, start_decline = 15, end_decline = 22) )#> Finding the MAP estimate.