Automatic Detection Of Language Levels in L2 English Learners

Podgornik, Stella M.

dc.contributor.advisor	Levow, Gina-Anne	en_US
dc.contributor.author	Podgornik, Stella M.	en_US
dc.date.accessioned	2012-08-10T20:37:34Z
dc.date.available	2012-08-10T20:37:34Z
dc.date.issued	2012-08-10
dc.date.submitted	2012	en_US
dc.description	Thesis (Master's)--University of Washington, 2012	en_US
dc.description.abstract	This study analyzes features that would enable classifiers to detect language levels in adult second language (L2) English learners. Approximately 46 speech samples from speakers of 15 different native (L1) languages were selected from the Learning Prosody in a Foreign Language (LeaP) corpus (Gut 2004), collected in Germany. Using a variety of features selected from the speakers' spoken L2 English, a Support Vector Machine (SVM) was trained and the speakers were classified into three categories: c1, c2, and s1. These categories correspond to beginner, intermediate, and advanced levels of the target L2 language, English. The chosen features are grouped into four categories: sentence, syllable, duration, and pitch. Count features such as sentence word count, sentence article count, etc. had the most influence on the system, while the sentence features had the second most influence. The duration features pushed the accuracy numbers into the 60% range. Surprisingly, most of the pitch features used had no effect on the accuracy. A small list of common stop words was also used, which proved to be very helpful. The edit distance measures of the sentences with common words removed showed a measurable effect, and the spoken duration of those same words in the sentence helped push the accuracy for the test configuration above 60%. The test configuration was selected because its accuracy was close to the mean of a set of 50 randomly generated configurations. Due to the small size of the training and testing sets, the L1 of the speaker was found to have a significant effect on the accuracy of the classification predictions. The classification predictions have a variance of as much as 40%.	en_US
dc.embargo.terms	No embargo	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.other	Podgornik_washington_0250O_10158.pdf	en_US
dc.identifier.uri	http://hdl.handle.net/1773/20283
dc.language.iso	en_US	en_US
dc.rights	Copyright is held by the individual authors.	en_US
dc.subject	computational linguistics; second language learning	en_US
dc.subject.other	Linguistics	en_US
dc.subject.other	Language	en_US
dc.title	Automatic Detection Of Language Levels in L2 English Learners	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Podgornik_washington_0250O_10158.pdf
Size: 754.37 KB
Format: Adobe Portable Document Format

Collections

Linguistics