Latent Continuous Time Markov Chains for Partially-Observed Multistate Disease Processes
Abstract
A disease process refers to a patient's traversal over time through a disease with multiple discrete states. Multistate models are powerful tools used to describe the dynamics of disease processes. Clinical study settings present modeling challenges, as patients' disease trajectories are only partially observed, and patients' disease statuses are only assessed at discrete clinic visit times. Furthermore, imperfect diagnostic tests may introduce misclassification error in disease observations. Observational data, such as those available in electronic medical records (EMR), present additional challenges, since patients initiate visits based on symptoms, and these visit times are informative about patients' disease histories. Many of the flexible modeling methods suited for fully observed trajectories are no longer tractable with partially observed data. A typical approach is to assume a standard continuous time Markov chain for the disease process, due to its computational tractability. This assumption implies that sojourn times in each disease state are exponentially distributed and therefore have constant hazard functions, which is frequently unrealistic. Our approach is to model the disease process via a latent continuous time Markov chain, enabling greater flexibility yet retaining tractability. We devise a novel expectation-maximization (EM) algorithm for fitting these models in a panel data setting in which observation times are non-informative. We then extend the model and the EM algorithm to accommodate observation times that are patient-initiated and informative about the disease process. We apply our model to a study of secondary breast cancer events using an EMR dataset of mammography and biopsy records.
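The contrast between a standard and a latent continuous time Markov chain can be illustrated with a minimal sketch. This is not the model or the EM algorithm described in the abstract. It assumes the simplest hypothetical case, in which one disease state is represented by two latent phases in series, each exited at the same rate, so that the sojourn time in the disease state follows an Erlang-2 distribution. The function names and rate values are illustrative assumptions.

```python
import math


def exponential_hazard(rate, t):
    """Hazard of an exponential sojourn time (standard CTMC state)."""
    density = rate * math.exp(-rate * t)
    survival = math.exp(-rate * t)
    return density / survival  # equals `rate` for every t


def two_phase_hazard(rate, t):
    """Hazard of the sojourn time through two latent phases in series.

    Assumption for illustration: both phases are exited at the same
    rate, giving an Erlang-2 sojourn time in the observed disease state.
    """
    density = rate ** 2 * t * math.exp(-rate * t)
    survival = (1.0 + rate * t) * math.exp(-rate * t)
    return density / survival  # equals rate^2 * t / (1 + rate * t)


# Compare the two hazards at a few illustrative time points.
for t in (0.5, 1.0, 2.0, 5.0):
    print(t, exponential_hazard(1.0, t), two_phase_hazard(1.0, t))
```

Under these assumptions the exponential hazard stays fixed at the rate, while the two-phase hazard starts at zero and rises toward the rate as time in the state increases. The underlying latent process is still a Markov chain, which is why flexibility of this kind can be gained without giving up tractability.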
Collections
- Biostatistics