Methods for the design and analysis of stepped wedge trials under model misspecification

Voldal, Emily

Files

Voldal_washington_0250E_24895.pdf (1.02 MB)

Date

2023-01-21

Abstract

This dissertation consists of three projects that address issues in the design and analysis of stepped wedge trials (SWTs). SWTs are a type of cluster-randomized trial commonly used to evaluate health care interventions. They are sometimes preferred over parallel cluster-randomized designs because of practical concerns; however, some elements of the SWT design (e.g., confounding between time and treatment) make analysis more challenging. Although the popularity of the SWT has been increasing, resources and information for designing and analyzing SWTs are limited compared to more traditional cluster-randomized trial designs. The three projects presented here provide (1) resources for understanding and correctly applying existing methods; (2) an analysis of how model misspecification affects existing methods; and (3) novel methods for analyzing SWTs that are robust to common misspecification concerns.

Project 1: Most SWT-related software packages make restrictive assumptions about the study design and the correlation structure of the data. The objective of this project is to present a package and a corresponding web-based graphical user interface (GUI) that give researchers another, more flexible option for SWT design and analysis. We developed an R package, swCRTdesign ('stepped wedge Cluster Randomized Trial design'), which uses a random effects model to account for correlation in the data induced by a SWT design. Possible sources of correlation include clusters, time within clusters, and treatment within clusters. swCRTdesign allows a user to calculate power, simulate SWT data to streamline simulation studies (e.g., to estimate power), and create descriptive summaries and plots. Additionally, a GUI, developed using shiny, is available to calculate power and create power curves and design plots. The swCRTdesign package accommodates a wide variety of SWT designs and makes it easy to account for some sources of correlation that other packages do not handle. The user-friendly web-based GUI makes some swCRTdesign features accessible to researchers not familiar with R. These two resources will make appropriate and flexible SWT calculations more accessible to scientists from a wide variety of backgrounds.

Project 2: Mixed models are commonly used to analyze SWTs to account for clustering and repeated measures on clusters. One critical issue researchers face is whether to include a random time effect or a random treatment effect. When the wrong model is chosen, inference on the treatment effect may be invalid. We explore asymptotic and finite-sample convergence of variance component estimates when the model is misspecified, and how misspecification affects the estimated variance of the treatment effect. For asymptotic results, we rely on analytical solutions rather than simulation studies, which allows us to succinctly describe the convergence of misspecified estimates, even though there are multiple roots for each misspecified model. We found that both the direction and the magnitude of the bias in model-based standard errors depend on the study design and on the magnitude of the true variance components. We identify some scenarios in which choosing the wrong random effect has a large impact on model-based inference. However, many trends depend on trial design and assumptions about the true correlation structure, so we provide tools for researchers to investigate specific scenarios of interest. We use data from a SWT on disinvesting from weekend services in hospital wards to demonstrate how these results can be applied as a sensitivity analysis, which quantifies the impact of misspecification under a variety of settings and directly compares the potential consequences of different modeling choices. Our results will provide guidance for pre-specified model choices and supplement sensitivity analyses to inform confidence in the validity of results.

Project 3: Although mixed models are commonly used to analyze SWTs, they are susceptible to misspecification. This is in part because they use 'horizontal' or within-cluster information in addition to 'vertical' or between-cluster information. To use horizontal information in a mixed model, both the mean model and the correlation structure must be correctly specified or accounted for, since time is confounded with treatment and time periods are likely correlated within clusters. Alternative non-parametric methods have been proposed that use only vertical information; these are more robust because between-cluster comparisons in a SWT preserve randomization, but they are not very efficient. We propose a semi-parametric composite likelihood method that focuses on vertical information but has the flexibility to recover some efficiency by using some horizontal information. We compare the properties and performance of various methods, using simulations based on COVID-19 data. We found that a vertical composite likelihood model that leverages baseline data is more robust than traditional methods and more efficient than methods that use only vertical information. We hope that these results demonstrate the potential value of semi-parametric vertical methods, and that these new tools are useful to researchers analyzing SWTs who are concerned about misspecification of traditional models.

Finally, we discuss the implications of this work and plans for future work.
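
As a rough illustration of the kind of power calculation described under Project 1, the sketch below builds a simple stepped wedge layout and applies the classical closed-form variance of Hussey and Hughes (2007) for a linear mixed model with a random cluster intercept only. It is not the swCRTdesign implementation and does not use that package's API: the function names, the balanced design (one all-control baseline period, equal numbers of clusters crossing over at each step), and the parameter values are all assumptions made for illustration. Random time and random treatment effects, which the abstract names as additional sources of correlation, are deliberately left out.

```python
from statistics import NormalDist


def stepped_wedge_design(n_clusters, n_steps):
    """Return a 0/1 treatment matrix X[i][j] for cluster i in period j.

    Assumption: one baseline period with every cluster on control, then an
    equal number of clusters crossing over to treatment at each step.
    """
    if n_clusters % n_steps != 0:
        raise ValueError("n_clusters must be a multiple of n_steps")
    per_step = n_clusters // n_steps
    n_periods = n_steps + 1
    design = []
    for i in range(n_clusters):
        crossover = i // per_step + 1  # first period spent on treatment
        design.append([1 if j >= crossover else 0 for j in range(n_periods)])
    return design


def treatment_variance(design, n_per_cell, sigma_e2, tau2):
    """Variance of the treatment effect estimate (Hussey & Hughes, 2007).

    Assumes fixed period effects, a random cluster intercept with variance
    tau2, residual variance sigma_e2, and n_per_cell observations in every
    cluster-period. No random time or treatment effects are modeled.
    """
    n_clusters = len(design)
    n_periods = len(design[0])
    sigma2 = sigma_e2 / n_per_cell  # variance of a cluster-period mean
    u = sum(sum(row) for row in design)
    w = sum(sum(row[j] for row in design) ** 2 for j in range(n_periods))
    v = sum(sum(row) ** 2 for row in design)
    numerator = n_clusters * sigma2 * (sigma2 + n_periods * tau2)
    denominator = (n_clusters * u - w) * sigma2 + (
        u ** 2 + n_clusters * n_periods * u - n_periods * w - n_clusters * v
    ) * tau2
    return numerator / denominator


def power(design, effect, n_per_cell, sigma_e2, tau2, alpha=0.05):
    """Approximate power of a two-sided Wald test for the treatment effect."""
    variance = treatment_variance(design, n_per_cell, sigma_e2, tau2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(effect) / variance ** 0.5 - z_crit)


if __name__ == "__main__":
    layout = stepped_wedge_design(n_clusters=6, n_steps=3)
    for row in layout:
        print(row)
    # Hypothetical inputs, chosen only to exercise the functions.
    print(power(layout, effect=0.3, n_per_cell=50, sigma_e2=1.0, tau2=0.05))
```

Printing the layout makes the confounding noted in the abstract visible: the share of clusters on treatment rises with every period, so a calendar-time trend and the treatment effect can only be separated through modeling assumptions or through between-cluster comparisons within a period.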