Emotionally Intelligent Voice Language Models For Mental Health Therapy

dc.contributor.advisor: Si, Dong
dc.contributor.author: Tanikella, Raviteja
dc.date.accessioned: 2025-10-02T16:07:31Z
dc.date.issued: 2025-10-02
dc.date.submitted: 2025
dc.description: Thesis (Master's)--University of Washington, 2025
dc.description.abstract: Conversational assistants for mental health therapy primarily rely on text-based models or cascaded architectures that first transcribe speech to text, a process that discards crucial paralinguistic information. This information bottleneck limits the AI's ability to perceive critical emotional cues and broader psychological states, hindering its capacity to reason over human voice and provide effective therapy. This thesis details the development of an emotionally intelligent voice language model designed to overcome these limitations. The process began with a systematic evaluation of heuristic-based approaches and end-to-end model architectures, where automated benchmarking and human study confirmed that large audio language models provided the most effective foundation for voice understanding and reasoning. Building on these findings, I propose and implement policy optimization methods to fine-tune Qwen2.5-Omni on therapy data. The resulting aligned model demonstrated improved performance in benchmark evaluations, exhibiting emotional intelligence and generating therapeutically relevant responses. By presenting a complete development framework, from architectural validation to targeted alignment, this research establishes a clear and proven roadmap for creating the next generation of efficient and adaptive voice language models for mental health therapy and multimodal conversational AI.
dc.embargo.lift: 2026-10-02T16:07:31Z
dc.embargo.terms: Restrict to UW for 1 year -- then make Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Tanikella_washington_0250O_28882.pdf
dc.identifier.uri: https://hdl.handle.net/1773/53980
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Conversational AI
dc.subject: Direct Preference Optimization
dc.subject: Emotional Intelligence
dc.subject: Mental Health Therapy
dc.subject: Voice Language Models
dc.subject: Artificial intelligence
dc.subject: Computer science
dc.subject.other: Computer science and engineering
dc.title: Emotionally Intelligent Voice Language Models For Mental Health Therapy
dc.type: Thesis