Application and Comparison of Clustering Methods to Educational Process Data

Luo, Yu

dc.contributor.advisor	Li, Min
dc.contributor.author	Luo, Yu
dc.date.accessioned	2022-07-14T22:09:05Z
dc.date.available	2022-07-14T22:09:05Z
dc.date.issued	2022-07-14
dc.date.submitted	2022
dc.description	Thesis (Master's)--University of Washington, 2022
dc.description.abstract	Cluster analysis has great potential for analyzing the vast amounts of process data that record students' online learning behaviors. It can be used to develop profiles of student groups that help instructors understand students' online learning patterns. However, a major challenge in applying cluster analysis is selecting a suitable algorithm from the many available. This methodological paper introduces and compares three clustering algorithms: two popular non-hierarchical methods, k-means and k-medoids, and one hierarchical method, agglomerative hierarchical clustering analysis (HCA). The dataset used for demonstration is the publicly available Open University Learning Analytics (OULA) dataset, which contains information on online modules, student demographics, and students' clicks in the virtual learning environment (VLE). To examine the utility of process features and the performance of the selected clustering algorithms in predicting students' module outcomes (i.e., pass or fail), one module was selected (N = 1299) and 18 process features were developed. After the clustering results were obtained from each algorithm, logistic regression was used to compare the cluster memberships and validate them against students' module outcomes. Multiple logistic regression was then used to explore the demographic and process-feature composition of the most predictive clustering results. The results showed that k-means and k-medoids generated comparable results, while agglomerative HCA produced results that differed most from those of the other two algorithms yet were the most predictive. The multiple logistic regression results showed that students who engaged in certain VLE activities, such as taking quizzes or joining discussion forums, had a higher chance of being in the high-performance group (i.e., the group with a higher probability of passing the module). Limitations and future research directions are discussed.
dc.embargo.terms	Open Access
dc.format.mimetype	application/pdf
dc.identifier.other	Luo_washington_0250O_24185.pdf
dc.identifier.uri	http://hdl.handle.net/1773/48921
dc.language.iso	en_US
dc.rights	CC BY
dc.subject	Educational tests & measurements
dc.subject.other	Education - Seattle
dc.title	Application and Comparison of Clustering Methods to Educational Process Data
dc.type	Thesis

Files

Original bundle

Name: Luo_washington_0250O_24185.pdf
Size: 922.27 KB
Format: Adobe Portable Document Format

Collections

Education - Seattle