Prediction of CYP3A4 metabolic activity from whole genome RNA-seq data with feature selection machine learning methods
CYP3A4, one of the isozymes of the cytochromes P450 (CYPs), contributes significantly to drug clearance and drug-drug interactions. The goals of this project are to identify hepatically expressed genes that are associated with CYP3A4 metabolic activity in human liver tissue and to predict CYP3A4 activity using gene expression data from whole-genome RNA-seq. Because the dataset is high-dimensional, we applied lasso and elastic net, two machine learning methods that perform feature selection, for prediction, and we used graphical lasso to construct gene network graphs. A simulation study was performed to assess the performance of the prediction algorithms and to evaluate the efficiency of gene selection using the machine learning methods. We assessed prediction performance using the correlation between measured and predicted CYP3A4 activity, which was approximately 0.4 when reductase was excluded and approximately 0.5 when it was included, for both lasso and elastic net. In addition to the CYP3A4 gene, we also identified the GZMA gene as a strong candidate for prediction of CYP3A4 activity that should be investigated in future studies.
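The prediction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it fits lasso and elastic net by plain coordinate descent on simulated data and scores each model by the correlation between measured and predicted values on held-out samples. The simulated "genes", the sample sizes, the penalty values (`alpha`, `l1_ratio`) and all function names are assumptions chosen for illustration only.

```python
import math
import random

def soft_threshold(z, gamma):
    """Soft-thresholding operator used by the L1 penalty."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def elastic_net(X, y, alpha=0.3, l1_ratio=1.0, n_iter=200):
    """Coordinate descent for
    (1/2n)||y - b0 - Xb||^2 + alpha*(l1_ratio*|b|_1 + 0.5*(1-l1_ratio)*|b|_2^2).
    l1_ratio=1.0 gives the lasso."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    intercept = sum(y) / n
    resid = [yi - intercept for yi in y]
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the partial residual.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1 - l1_ratio))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
        # Re-fit the unpenalized intercept.
        shift = sum(resid) / n
        intercept += shift
        resid = [r - shift for r in resid]
    return intercept, beta

def predict(X, intercept, beta):
    return [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Simulated data (hypothetical): 60 "genes", only the first 3 affect activity.
random.seed(0)
n, p = 120, 60
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2.0 * r[0] - 1.5 * r[1] + 1.0 * r[2] + random.gauss(0, 1) for r in X]
X_train, y_train, X_test, y_test = X[:80], y[:80], X[80:], y[80:]

# Lasso (l1_ratio=1.0) and elastic net (l1_ratio=0.5) with the same alpha.
b0_lasso, beta_lasso = elastic_net(X_train, y_train, alpha=0.3, l1_ratio=1.0)
b0_enet, beta_enet = elastic_net(X_train, y_train, alpha=0.3, l1_ratio=0.5)

# Features with nonzero coefficients are the "selected genes".
selected_lasso = [j for j, b in enumerate(beta_lasso) if b != 0.0]
selected_enet = [j for j, b in enumerate(beta_enet) if b != 0.0]

# Prediction performance: correlation of measured vs. predicted on held-out samples.
r_lasso = pearson(y_test, predict(X_test, b0_lasso, beta_lasso))
r_enet = pearson(y_test, predict(X_test, b0_enet, beta_enet))
print("lasso:", len(selected_lasso), "features, r =", round(r_lasso, 3))
print("elastic net:", len(selected_enet), "features, r =", round(r_enet, 3))
```

In practice the penalty strength would be chosen by cross-validation rather than fixed, and an established library implementation would normally be preferred over hand-written coordinate descent.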
- Biostatistics