Comparing Machine Learning Classification Methods for Biological Tracking Data
Authors
Shackelford, David
Abstract
Multiple particle tracking (MPT) is increasingly used to characterize and probe biological environments. Machine learning (ML) methods have previously been used successfully to classify both single particle tracking and MPT data for many biological systems. To make further use of collected MPT data in studies of nanoparticle diffusion within biological environments, a new predictive package, referred to as diff_predictor, was developed in this study. The package uses feature and trajectory datasets, along with prediction methods such as XGBoost, recurrent neural networks, and random forests, to make predictions and classifications about the biological environment and nanoparticle behavior. I applied diff_predictor in three ways. First, an extreme gradient boosting decision tree model (XGBoost) was applied to a study of diffusion in a developing region of the brain to predict age based on spatial features. The features used in classification were analyzed for importance using Shapley Additive Explanations (SHAP) values. Second, a long short-term memory (LSTM) recurrent neural network was applied to the temporal data from the same dataset, and the results were compared with those of the XGBoost and random forest models. Finally, to demonstrate the versatility of diff_predictor, the package was used to classify regional differences in diffusion in the brain using a separate experimental dataset. The models produced by diff_predictor reached accuracies of up to 54.56% on the age-based data and up to 91.35% on the region-based data. Furthermore, SHAP values produced by the decision tree methods in diff_predictor provided useful information about the nanoparticle environment and interactions.
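As a rough illustration of what a Shapley-based feature attribution measures, the sketch below computes exact Shapley values for a single sample by enumerating every feature coalition. It is not the thesis code and does not use diff_predictor or the SHAP library. The toy model, the feature names (`alpha`, `d_eff`, `asymmetry`), and the all-zero baseline are assumptions chosen only for the example, and brute-force enumeration is practical only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample `x` relative to `baseline`.

    Features left out of a coalition are replaced by their baseline value.
    The cost grows as 2**n, so this is only usable for very few features.
    """
    n = len(x)
    features = list(range(n))

    def value(coalition):
        # Model output when only the coalition's features take their real values.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Standard Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(f):
    # Hypothetical stand-in for a trained classifier's score.
    # The feature names are illustrative and not taken from the thesis.
    alpha, d_eff, asymmetry = f
    return 2.0 * alpha + 1.0 * d_eff + 0.5 * alpha * d_eff + 0.0 * asymmetry


x = [1.0, 2.0, 3.0]         # one hypothetical trajectory's features
baseline = [0.0, 0.0, 0.0]  # assumed reference point
phi = shapley_values(toy_model, x, baseline)

for name, contribution in zip(["alpha", "d_eff", "asymmetry"], phi):
    print(f"{name}: {contribution:.3f}")
```

The attributions sum to the difference between the model output for the sample and for the baseline, and a feature the model ignores receives zero. Tree-specific methods obtain comparable attributions without enumerating all coalitions.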
Description
Thesis (Master's)--University of Washington, 2020
