Towards Robust and Effective Human Pose Estimation and Generation

Jiang, Zhongyu

dc.contributor.advisor	Hwang, Jenq-Neng
dc.contributor.author	Jiang, Zhongyu
dc.date.accessioned	2025-05-12T22:47:47Z
dc.date.available	2025-05-12T22:47:47Z
dc.date.issued	2025-05-12
dc.date.submitted	2025
dc.description	Thesis (Ph.D.)--University of Washington, 2025
dc.description.abstract	Human pose estimation (HPE) in both 2D and 3D remains a fundamental yet challenging problem in computer vision, with broad applications in action recognition, human-computer interaction, motion analysis, and object tracking. Despite recent advances, achieving robustness and efficiency in real-world and edge-device scenarios remains difficult. This dissertation presents a series of contributions toward making HPE more effective and robust. Specifically, we propose (1) a temporal-based 2D HPE method for golf swing analysis, (2) an optimization-driven pipeline for 3D HPE, and (3) a unified contrastive learning-based framework for 2D-3D pose representation. Furthermore, building upon HPE, we explore its potential in human motion generation. In particular, we introduce PackDiT, a novel diffusion-based framework for joint motion and text generation via mutual prompting. PackDiT effectively integrates text and motion generation by leveraging a unique training strategy with two DiT models (Text-DiT and Motion-DiT) with shared latent spaces, enabling text-to-motion, motion-to-text, and joint motion-text synthesis. Evaluated on the HumanML3D dataset, PackDiT outperforms state-of-the-art generative models across multiple tasks, demonstrating its capability as a unified framework for motion understanding and generation. The dissertation discusses challenges, limitations, and potential directions for advancing HPE and human motion generation in future research.
dc.embargo.terms	Open Access
dc.format.mimetype	application/pdf
dc.identifier.other	Jiang_washington_0250E_27962.pdf
dc.identifier.uri	https://hdl.handle.net/1773/52980
dc.language.iso	en_US
dc.rights	CC BY-NC
dc.subject	Artificial Intelligence
dc.subject	Human Pose Estimation
dc.subject	Machine Learning
dc.subject	Motion Generation
dc.subject	Computer science
dc.subject.other	Electrical and computer engineering
dc.title	Towards Robust and Effective Human Pose Estimation and Generation
dc.type	Thesis

Files

Original bundle

Name: Jiang_washington_0250E_27962.pdf
Size: 28.49 MB
Format: Adobe Portable Document Format

Collections

Electrical and computer engineering