Latent Class and Latent Profile Analysis in Medical Diagnosis and Prognosis
Evaluating test accuracy is an important topic in medical diagnosis and prognosis. Accuracy information is necessary for care-givers to make well-informed decisions; it also helps researchers to select better diagnostic tools, either from new techniques or based on combinations of current information. This field has recently been re-energized by advances in diagnostic techniques and the discovery of novel biomarkers. However, assessment becomes difficult when the underlying medical condition, or the gold standard, is unknown due to time or cost constraints, lack of biotechnology, or concerns over the invasive nature of a diagnostic procedure. This issue is becoming more common and pressing with the growing interest in, and emphasis on, preclinical diagnosis and prevention. Moreover, with improvements in clinical practice, there is now a need to go beyond a traditional binary disease status approach and incorporate an ordinal gold standard. Additionally, the ability to take into consideration subjects' individual characteristics, which may affect disease prevalence and test performance, will allow care-givers to provide their patients with more accurate and personalized diagnoses. This dissertation views the unobserved gold standard as a latent variable, and proposes models in the latent class and latent profile framework to solve the above-mentioned problems. For categorical tests, a latent class approach is adopted to nonparametrically model the conditional distributions of the tests within different disease groups. Additionally, a random effect method is introduced to relax the classic conditional independence assumption in latent class models, so that the model can be applied to more general situations. A likelihood ratio test of the conditional independence assumption is also discussed.
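To make the latent class idea concrete, the following is a minimal sketch of the classic two-class model for binary tests under conditional independence, fitted by the EM algorithm. It is an illustration only, not the dissertation's model: it omits the random effect and the ordinal gold standard, and the function names `simulate` and `fit_two_class_lca`, the starting values, and the iteration count are all assumptions made for this example.

```python
import random

def simulate(n, prevalence, sens, spec, seed=0):
    """Hypothetical helper: draw binary test results; the true status is discarded."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        diseased = rng.random() < prevalence
        row = []
        for se, sp in zip(sens, spec):
            p_pos = se if diseased else 1.0 - sp
            row.append(1 if rng.random() < p_pos else 0)
        data.append(row)
    return data

def fit_two_class_lca(data, n_iter=200):
    """EM for a two-class latent class model with conditionally independent binary tests."""
    n, k = len(data), len(data[0])
    prev = 0.5
    # Assumed starting values; they label class 1 as the mostly test-positive class.
    sens = [0.8] * k
    spec = [0.8] * k
    for _ in range(n_iter):
        # E-step: posterior probability that each subject belongs to class 1.
        post = []
        for row in data:
            l1, l0 = prev, 1.0 - prev
            for j, y in enumerate(row):
                l1 *= sens[j] if y else 1.0 - sens[j]
                l0 *= 1.0 - spec[j] if y else spec[j]
            post.append(l1 / (l1 + l0))
        # M-step: update parameters as posterior-weighted proportions.
        w1 = sum(post)
        w0 = n - w1
        prev = w1 / n
        for j in range(k):
            sens[j] = sum(p for p, row in zip(post, data) if row[j] == 1) / w1
            spec[j] = sum(1.0 - p for p, row in zip(post, data) if row[j] == 0) / w0
    return prev, sens, spec

# Usage: four simulated tests, with no gold standard passed to the fit.
data = simulate(2000, 0.3, [0.9, 0.85, 0.8, 0.85], [0.9, 0.85, 0.95, 0.9], seed=1)
prev, sens, spec = fit_two_class_lca(data)
```

With at least three conditionally independent tests, prevalence, sensitivities, and specificities can generally be recovered without observing the true disease status, which is the setting the identifiability conditions are meant to justify.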
For continuous tests, a latent profile model is proposed, which allows for the inclusion of a set of covariates that may be associated with disease prevalence, and a set of covariates that may influence test performance. The model therefore also relaxes the conditional independence assumption by explicitly explaining correlations among the tests within each disease category. Moreover, it can provide information about risk factors' impacts and about a test's properties within subpopulations. Additionally, the model proposed here allows for a transformation of the test results to take into account possible skewness in the data. This dissertation also notes that a summary measure and graphical presentation of the results in terms of the commonly used receiver operating characteristic (ROC) curve cannot be directly applied to data with an ordinal gold standard. It therefore extends the concept of the ROC curve into a high-dimensional volume and provides corresponding interpretations. Extensive simulations have been performed to assess the consistency and robustness of the proposed methods. Moreover, this dissertation carefully discusses the local and global identifiability of latent class and latent profile models, and is the first to provide sufficient conditions for establishing local and global identifiability for latent class and latent profile models in the general form. These results provide theoretical justification of the proposed methods and guidance for practical applications of these models. The proposed methods are illustrated using data from a traditional Chinese medicine practice to evaluate doctors' diagnostic accuracy for symptom diagnosis, and using a data set from a study on Alzheimer's disease to select and combine biomarkers that can help with early detection.
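As a rough illustration of how an ROC-type summary can extend to an ordinal gold standard, the sketch below computes an empirical volume under the ROC surface for three ordered disease classes: the fraction of cross-class triples that a continuous test ranks in the correct order. This is a commonly used three-class analogue of the area under the ROC curve and is not necessarily the definition adopted in the dissertation. It also assumes the class labels are known, whereas the dissertation treats them as latent. The function name `empirical_vus` is hypothetical.

```python
from itertools import product

def empirical_vus(class1, class2, class3):
    """Fraction of triples (one value per ordered class) with x1 < x2 < x3."""
    total = len(class1) * len(class2) * len(class3)
    correct = sum(1 for a, b, c in product(class1, class2, class3) if a < b < c)
    return correct / total

# Usage: perfectly separated classes versus overlapping classes.
perfect = empirical_vus([1, 2], [3, 4], [5, 6])   # 1.0
overlap = empirical_vus([1, 4], [2, 5], [3, 6])   # 0.5
```

For a test with no discriminating ability, a randomly chosen triple is correctly ordered with probability 1/6, so 1/6 plays the role that 0.5 plays for the ordinary area under the ROC curve.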
- Biostatistics