Towards Robust and Effective Human Pose Estimation and Generation

dc.contributor.advisor: Hwang, Jenq-Neng
dc.contributor.author: Jiang, Zhongyu
dc.date.accessioned: 2025-05-12T22:47:47Z
dc.date.available: 2025-05-12T22:47:47Z
dc.date.issued: 2025-05-12
dc.date.submitted: 2025
dc.description: Thesis (Ph.D.)--University of Washington, 2025
dc.description.abstract: Human pose estimation (HPE) in both 2D and 3D remains a fundamental yet challenging problem in computer vision, with broad applications in action recognition, human-computer interaction, motion analysis, and object tracking. Despite recent advances, achieving robustness and efficiency in real-world and edge-device scenarios remains difficult. This dissertation presents a series of contributions toward making HPE more effective and robust. Specifically, we propose (1) a temporal-based 2D HPE method for golf swing analysis, (2) an optimization-driven pipeline for 3D HPE, and (3) a unified contrastive learning-based framework for 2D-3D pose representation. Furthermore, building upon HPE, we explore its potential in human motion generation. In particular, we introduce PackDiT, a novel diffusion-based framework for joint motion and text generation via mutual prompting. PackDiT effectively integrates text and motion generation by leveraging a unique training strategy with two DiT models (Text-DiT and Motion-DiT) with shared latent spaces, enabling text-to-motion, motion-to-text, and joint motion-text synthesis. Evaluated on the HumanML3D dataset, PackDiT outperforms state-of-the-art generative models across multiple tasks, demonstrating its capability as a unified framework for motion understanding and generation. The dissertation discusses challenges, limitations, and potential directions for advancing HPE and human motion generation in future research.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Jiang_washington_0250E_27962.pdf
dc.identifier.uri: https://hdl.handle.net/1773/52980
dc.language.iso: en_US
dc.rights: CC BY-NC
dc.subject: Artificial Intelligence
dc.subject: Human Pose Estimation
dc.subject: Machine Learning
dc.subject: Motion Generation
dc.subject: Computer science
dc.subject.other: Electrical and computer engineering
dc.title: Towards Robust and Effective Human Pose Estimation and Generation
dc.type: Thesis

Files

Original bundle

Name: Jiang_washington_0250E_27962.pdf
Size: 28.49 MB
Format: Adobe Portable Document Format