Temporal fine structure and applications to cochlear implants
Complex broadband sounds are decomposed by the auditory filters into a series of relatively narrowband signals, each of which conveys information about the sound through time-varying features. The slow changes in overall amplitude constitute the envelope, while the more rapid events, such as zero crossings, constitute the temporal fine structure (TFS). Although envelope cues from a small number of channels can support robust speech recognition in quiet, TFS seems to play a significant role in speech perception in noise, especially in fluctuating backgrounds. Many studies have addressed fundamental questions about the relative importance of envelope and TFS, but the definition of TFS itself poses a critical issue. Because envelope and phase are coupled, it is problematic to isolate the TFS from the envelope for any signal that is not extremely narrowband. Conventionally, a Hilbert transform is used to represent each band as the product of the Hilbert envelope and a frequency-modulated (FM) sinusoidal carrier, and the FM component is then taken as the TFS of the band. We show in this dissertation that the Hilbert FM is a distorted representation. To address this concern, we proposed a new distortion-free, additive decomposition of the signal into a slow envelope and a fast envelope, obtained by half-wave rectification followed by filters that reflect an engineering interpretation of neural physiology. The slow envelope represents temporal cues that can be coded in the average firing rate of auditory nerve fibers, while the fast envelope captures the temporal cues conveyed in neural phase-locking patterns. Using this new decomposition and the conventional Hilbert decomposition, we investigated the relative contribution of neural envelope and TFS coding to speech intelligibility in different noise conditions. The neural representation was generated by a simplified peripheral auditory model (Shamma and Lorenzi, 2013).
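The two decompositions described above can be sketched numerically. The following is a minimal illustration, not the dissertation's implementation: the function names, the one-pole low-pass filter, the 50 Hz cutoff, and the choice to define the fast envelope as the residual of the rectified signal are all assumptions made here so that the second decomposition is exactly additive. The physiologically motivated filters used in the dissertation are not specified in this abstract.

```python
import numpy as np


def analytic_signal(x):
    """Return x + j*H[x], built in the frequency domain with NumPy's FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0                     # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0            # keep the Nyquist bin once
        weights[1:n // 2] = 2.0          # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def hilbert_decompose(band):
    """Multiplicative view: band = Hilbert envelope * unit-amplitude FM carrier."""
    z = analytic_signal(band)
    envelope = np.abs(z)
    carrier = np.cos(np.angle(z))        # conventionally taken as the TFS
    return envelope, carrier


def one_pole_lowpass(x, cutoff_hz, fs):
    """Simple one-pole low-pass filter (an assumed stand-in filter)."""
    alpha = 1.0 - np.exp(-2.0 * np.pi * cutoff_hz / fs)
    y = np.empty(len(x))
    state = 0.0
    for i, sample in enumerate(x):
        state += alpha * (sample - state)
        y[i] = state
    return y


def rectified_envelopes(band, fs, cutoff_hz=50.0):
    """Additive view: half-wave rectified band = slow envelope + fast envelope.

    The cutoff and the residual definition of `fast` are assumptions.
    """
    rectified = np.maximum(np.asarray(band, dtype=float), 0.0)
    slow = one_pole_lowpass(rectified, cutoff_hz, fs)
    fast = rectified - slow
    return slow, fast


# Example band: a 1 kHz tone with 4 Hz amplitude modulation.
fs = 16000
t = np.arange(fs) / fs
modulation = 1.0 + 0.5 * np.cos(2 * np.pi * 4 * t)
band = modulation * np.cos(2 * np.pi * 1000 * t)

envelope, carrier = hilbert_decompose(band)
slow, fast = rectified_envelopes(band, fs)
```

For this narrowband test signal the Hilbert envelope recovers the imposed modulation almost exactly. The coupling problem described above arises for wider bands, where the envelope and carrier cannot be manipulated independently without one leaking into the other.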
We observed that the distortions in the Hilbert FM likely obscured the importance of TFS, making it seem insignificant. In contrast, the trends observed with the fast envelope were in line with previous perception studies, suggesting that TFS plays a significant role in masking release. Due to the inherently coarse spectral and temporal resolution of electric hearing, conventional cochlear implant (CI) coding strategies transmit only envelope cues in a small number of channels. The lack of TFS potentially contributes to CI users' difficulties in understanding speech in noise and perceiving music. To encode fine structure information for CI users, we proposed a harmonic-single-sideband-encoder (HSSE) strategy that explicitly tracks the harmonics in complex sounds and transforms them into modulators conveying both envelope and TFS cues. A key distinction of HSSE is that it keeps the envelope and TFS cues together during the transformation to avoid distortions. The effectiveness of HSSE for speech and music perception was tested using three approaches: acoustic simulation in normal-hearing listeners, neural response simulation using a population auditory nerve model (Imennov and Rubinstein, 2009), and acute tests in CI patients. Significant effects of HSSE on speech perception in noise and on music perception were observed, illustrating the potentially large benefit of providing fine structure information in a cochlear implant.
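As a rough illustration of how a single-sideband transformation can keep envelope and TFS together, the sketch below shifts a band containing one harmonic down in frequency without first splitting it into envelope and carrier. This is an assumption-laden guess at the general idea suggested by the strategy's name, not the HSSE algorithm itself: the function names are hypothetical, harmonic tracking is omitted (the harmonic frequency is taken as known), and the 900 Hz shift is an arbitrary illustrative value.

```python
import numpy as np


def analytic_signal(x):
    """Return x + j*H[x], built in the frequency domain with NumPy's FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def ssb_modulator(band, shift_hz, fs):
    """Shift a band down by shift_hz via single-sideband demodulation.

    The band is never separated into envelope and carrier, so both its
    amplitude and phase variations survive in the lower-rate output.
    """
    z = analytic_signal(band)
    t = np.arange(len(band)) / fs
    return np.real(z * np.exp(-2j * np.pi * shift_hz * t))


# Assumed example: a 1 kHz harmonic with 4 Hz amplitude modulation,
# shifted down by 900 Hz so that it lands at 100 Hz.
fs = 16000
t = np.arange(fs) / fs
am = 1.0 + 0.5 * np.cos(2 * np.pi * 4 * t)
harmonic = am * np.cos(2 * np.pi * 1000 * t)

modulator = ssb_modulator(harmonic, shift_hz=900.0, fs=fs)
expected = am * np.cos(2 * np.pi * 100 * t)
```

In this example the shifted signal carries the same 4 Hz amplitude modulation as the original harmonic, now riding on a 100 Hz oscillation, a rate far lower than the original 1 kHz.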
- Electrical engineering