Application and Comparison of Clustering Methods to Educational Process Data

dc.contributor.advisor: Li, Min
dc.contributor.author: Luo, Yu
dc.date.accessioned: 2022-07-14T22:09:05Z
dc.date.available: 2022-07-14T22:09:05Z
dc.date.issued: 2022-07-14
dc.date.submitted: 2022
dc.description: Thesis (Master's)--University of Washington, 2022
dc.description.abstract: Cluster analysis has great potential for analyzing the vast amounts of process data that record students' online learning behaviors. It can be used to develop profiles of student groups that help instructors understand students' online learning patterns. However, a major challenge in employing cluster analysis is selecting a suitable algorithm from the many available. This methodological paper introduces and compares three clustering algorithms: two popular non-hierarchical methods, k-means and k-medoids, and one hierarchical method, agglomerative hierarchical clustering analysis (HCA). The demonstration uses a publicly available dataset, Open University Learning Analytics (OULA), which contains information on online modules, student demographics, and students' clicks in the virtual learning environment (VLE). To examine the utility of process features and the performance of the selected clustering algorithms in predicting students' module outcomes (i.e., pass or fail), one module was selected (N = 1299) and 18 process features were developed. After the clustering results were obtained from each algorithm, logistic regression was used to compare and validate the cluster memberships against students' module outcomes. Multiple logistic regression was then employed to explore the demographic and process-feature composition of the most predictive clustering results. The results showed that k-means and k-medoids generated comparable results, whereas agglomerative HCA produced results that were the most dissimilar from those of the other two methods yet the most predictive. The multiple logistic regression results showed that students who engaged in certain VLE activities, such as taking quizzes or joining discussion forums, had a higher chance of being in the high-performance group (i.e., the group with a higher probability of passing the module). Limitations and future research directions are discussed.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Luo_washington_0250O_24185.pdf
dc.identifier.uri: http://hdl.handle.net/1773/48921
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Educational tests & measurements
dc.subject.other: Education - Seattle
dc.title: Application and Comparison of Clustering Methods to Educational Process Data
dc.type: Thesis
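
The comparison described in the abstract can be illustrated with a minimal sketch. This is not the thesis code. The data are synthetic (two made-up process features and a made-up pass/fail flag), not the OULA dataset, and all function names are hypothetical. The k-medoids routine is a simple alternating variant and may differ from the algorithm the author used. Agglomerative HCA is omitted for brevity. Per-cluster pass rates stand in for the logistic regression step: with cluster membership as the only categorical predictor, the fitted pass probabilities of a logistic regression equal these rates.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to `point` (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans(points, k, iters=100, seed=0):
    """Lloyd-style k-means: each center is the mean of its members."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            updated.append(tuple(sum(c) / len(members) for c in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return [nearest(p, centers) for p in points]


def kmedoids(points, k, iters=100, seed=0):
    """Alternating k-medoids: each center is an actual data point."""
    medoids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, medoids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # New medoid: the member with the smallest total distance to the rest.
            updated.append(min(members, key=lambda m: sum(math.dist(m, q) for q in members))
                           if members else medoids[j])
        if updated == medoids:
            break
        medoids = updated
    return [nearest(p, medoids) for p in points]


def rand_index(a, b):
    """Share of point pairs on which two partitions agree (same vs. different cluster)."""
    n = len(a)
    agree = sum((a[i] == a[j]) == (b[i] == b[j])
                for i in range(n) for j in range(i + 1, n))
    return agree / (n * (n - 1) / 2)


def pass_rate_by_cluster(labels, passed):
    """Observed pass rate within each cluster."""
    rates = {}
    for lab in sorted(set(labels)):
        outcomes = [y for l, y in zip(labels, passed) if l == lab]
        rates[lab] = sum(outcomes) / len(outcomes)
    return rates


# Synthetic stand-in data (assumption): two groups of 30 "students" each.
rng = random.Random(42)
low = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
high = [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(30)]
points = low + high
passed = ([int(rng.random() < 0.3) for _ in low]
          + [int(rng.random() < 0.8) for _ in high])

km_labels = kmeans(points, k=2)
kmed_labels = kmedoids(points, k=2)
print("Agreement (Rand index):", rand_index(km_labels, kmed_labels))
print("k-means pass rates:", pass_rate_by_cluster(km_labels, passed))
print("k-medoids pass rates:", pass_rate_by_cluster(kmed_labels, passed))
```

A Rand index near 1 would indicate that the two algorithms produce nearly the same grouping, which is the kind of comparability the abstract reports for k-means and k-medoids.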

Files

Original bundle

Name: Luo_washington_0250O_24185.pdf
Size: 922.27 KB
Format: Adobe Portable Document Format