Separability and Systematic Design of Gestural Human-Robot Interaction
Authors
Hendrix, Rose
Abstract
This work aims to facilitate robust, systematic design of human-robot interactions that are mediated by gesture. In many domains, communication modalities such as voice or physical input are not feasible because of domain restrictions. In these cases, gesture interaction is necessary for explicit communication between a human and a robotic assistant or observer. Because the only input to the robotic platform's gesture recognition system is a continuously tracked hand, there is potential for confusion between the human motions of work and the human motions intended to communicate control. Interaction design up to this point has largely been human-centered or ad-hoc in nature, rather than systematically focused on performance. The design choices of interest to this work are the choice of control gestures (what motions to perform to communicate preset meanings) and feature representations (what collections of measurements, or combinations thereof, should represent a gesture). In the absence of constraints on time and resources, the designer could try out all possible combinations of control gestures and feature representations in the final application context. However, that type of exhaustive testing consumes a great deal of the researcher's time, may require the time of skilled workers, and needs to be repeated whenever processes or environmental factors change. The selection of control gestures is based on separability, the notion that a gesture set should be measured against its worst cases; this is explored in more detail in Chapter 3. The main contribution of this research is the formalization and validation of tools for the systematic design of functional human-robot interaction without exhaustive testing. This is based on a selection method for maximally separable control gestures and feature representations for any given work context. There are several challenges involved in this goal.
Firstly, it is not necessarily obvious what it means to compute the separability of a gesture set, even with very simple class representations. How does separability as calculated correlate with overall performance? Secondly, the choice of how to measure "distance" between gesture classes such that correlation with performance measures is preserved is not trivial, particularly in the presence of outliers or small population sizes. Thirdly, there are many different ways to represent "gestures" numerically. How do these different representations interact with separability? How would a more separable representation be chosen? Finally, does this systematic design process generalize to other architectures for sensing human motion? These challenges are addressed in order. First, a separability metric for systematically selecting a set of control gestures that can be easily distinguished from a given set of work gestures is proposed and examined in Chapter 3. A correlation between the proposed separability metric and test accuracy of gesture sets is found (R=0.59), as well as a negative correlation (R=-0.35) between separability and the sensitivity of performance to individuals' performances. Results are also presented for a comparative online classification implementation on a hand-following robot of the most- and least-separable gesture sets, where the most-separable gesture set performs an average of 4 times better across all metrics. Second, a review of the considered measures of distance between gesture classes is presented, along with numeric and analytic justifications for the final selection of a modified version of the Support Vector Machine (SVM) margin between the gesture classes. Third, the problem of reduced-order gesture representation is considered, and a modification of a method from the literature is presented along with results.
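The idea of using an SVM margin as a between-class distance can be illustrated with a minimal sketch. This is not the thesis's modified margin (those details are in Chapter 3); it simply shows the standard quantity — for a linear SVM, the geometric margin 2/||w|| — computed between two synthetic "gesture" feature populations, with all data and parameters here being placeholders.

```python
# Illustrative sketch only: linear-SVM margin as a separability score
# between two synthetic gesture-feature classes. The thesis uses a
# modified margin; this shows the standard 2 / ||w|| quantity.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder feature vectors: one "work" class, one "control" class
work = rng.normal(loc=0.0, scale=1.0, size=(50, 6))
control = rng.normal(loc=4.0, scale=1.0, size=(50, 6))

X = np.vstack([work, control])
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
# Geometric margin of a linear SVM: 2 / ||w||
margin = 2.0 / np.linalg.norm(clf.coef_)
print(f"separability score (margin): {margin:.3f}")
```

A larger margin indicates that the two classes are farther apart in feature space, which is the intuition behind using it as a separability measure between candidate control gestures and the existing work gestures.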
A >99% reduction in dimensionality is reliably achieved without loss in test accuracy, and on average 17.5% fewer features are required for the same accuracy retention compared to the baseline method of PCA-based selection. Finally, the tools developed for systematic design are applied to a substantially different sensing architecture for human motion to validate the generalization ability of the process. The main contribution of this research is a method to select gestures that best enable communication between a human and a robotic platform, and a reduced-order representation method to speed up computations without significant loss of accuracy.
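The PCA-based selection baseline mentioned above can be sketched as follows. This is a generic illustration with synthetic data, not the thesis pipeline or its proposed modification: it projects high-dimensional features onto a few principal components and checks that test accuracy is retained.

```python
# Illustrative sketch of the PCA baseline: reduce feature dimensionality,
# then verify test accuracy is retained. Data, component count, and the
# classifier are placeholders, not the thesis's actual method.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=100,
                           n_informative=5, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Accuracy with the full 100-dimensional representation
full_acc = SVC(kernel="linear").fit(Xtr, ytr).score(Xte, yte)

# 95% reduction in dimensionality (100 -> 5 components)
pca = PCA(n_components=5).fit(Xtr)
red_acc = SVC(kernel="linear").fit(pca.transform(Xtr), ytr).score(
    pca.transform(Xte), yte)

print(f"full: {full_acc:.3f}  reduced: {red_acc:.3f}")
```

When the informative structure of the data lies in a low-dimensional subspace, as in this synthetic example, accuracy with the reduced representation stays close to the full-dimensional result while the per-classification cost drops substantially.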
Description
Thesis (Ph.D.)--University of Washington, 2020
