Estimation and Comparison of HIV-Specific Substitution Matrices
Abstract
Amino acid substitution matrices are commonly used for sequence alignment, phylogenetic inference, and sequence comparison. Empirical organism-specific substitution matrices, constructed using only sequence data from a particular organism, are thought to lead to more accurate analyses. In HIV research, the standard substitution matrices are the between-host and within-host matrices that Nickle et al. (2007) estimated from HIV sequences. This thesis focuses on constructing more granular HIV-specific matrices (clade-specific and gene-specific matrices) and on comparing the matrices in a way that accounts for error in matrix estimation. Using standard errors of parameter estimates predicted from a two-part linear model, the analyses indicate statistically significant differences between the HIV clade B and clade C matrices, as well as between the HIV Env gene and Gag gene matrices, based on Bonferroni-corrected comparisons of 189 estimates of amino acid exchangeability parameters.
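As a rough illustration of what a Bonferroni-corrected comparison of paired exchangeability estimates can look like, the sketch below applies a Wald-type z-test to each pair and compares the two-sided p-value to alpha divided by the number of comparisons (with 189 comparisons and alpha = 0.05, a per-test threshold of about 2.6e-4). The abstract does not state the test statistic used in the thesis, so the z-test, the assumption that the two estimates are independent, and the function name `bonferroni_compare` are illustrative assumptions rather than the thesis's method.

```python
import math


def bonferroni_compare(est_a, se_a, est_b, se_b, alpha=0.05):
    """Compare paired parameter estimates from two matrices.

    Hypothetical helper (not from the thesis). Assumes each estimate is
    approximately normal and that the two matrices were estimated
    independently, so the variance of a difference is the sum of variances.
    Returns one (z, p_value, significant) tuple per parameter.
    """
    m = len(est_a)            # number of comparisons, e.g. 189
    threshold = alpha / m     # Bonferroni-adjusted per-test level
    results = []
    for a, sa, b, sb in zip(est_a, se_a, est_b, se_b):
        z = (a - b) / math.sqrt(sa ** 2 + sb ** 2)
        # Two-sided p-value under the standard normal distribution.
        p = math.erfc(abs(z) / math.sqrt(2))
        results.append((z, p, p < threshold))
    return results
```

Under this sketch, two matrices would be called significantly different if at least one parameter is flagged, which controls the family-wise error rate at alpha across all comparisons.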