Vocal Accords

Carla DeMarco

A large dataset that provides a unique resource for research on emotional vocal cues has proven to be a main attraction on TSpace, the U of T’s institutional academic repository.

UTM Library’s Digital Scholarship Technician Mary Beth Atkinson discovered by chance that the Toronto Emotional Speech Set (TESS) collection is, by a significant margin, the most sought-after dataset in the repository.

“When I did a search, I found that 87% of U of T Mississauga’s TSpace traffic is going to TESS,” says Atkinson. “It is by far the most popular dataset in our collection.”

“We receive requests to access TESS from all over the world – sometimes three to five times a week, which far exceeds requests for other datasets – including requests for the data coming in from notable industry leaders, such as Samsung.”

The lead researchers responsible for generating the data are former UTM graduate student Kate Dupuis and UTM Professor Kathy Pichora-Fuller, both from the Department of Psychology. The dataset they assembled for TESS consists of 2800 recorded sound files. The stimuli comprise 200 target words, each spoken in the same carrier phrase (“Say the word ---”) by two Toronto actors, a younger woman and an older woman, to convey seven different emotions: anger, disgust, fear, happiness, pleasant surprise, sadness, and neutral (200 words × 2 actors × 7 emotions = 2800 files). One aim of the research was to determine how well younger and older adults can identify vocal emotions; another was to determine how vocal emotion affects the ability to understand a phrase uttered in a noisy environment.

“Vision-based research on emotion and cognition is well known, and research using sound is nothing new, but research employing emotional speech stimuli is rare,” says Pichora-Fuller. “Our studies also take into account how age-related differences factor into the way in which people produce and perceive vocal emotion.”

Notably, their findings indicate that older adults and people with hearing loss have a harder time distinguishing the different vocal emotions, but for both younger and older adults it was easiest to understand speech spoken to portray fear and hardest to understand speech spoken to portray sadness.

Though the sound files have been archived on TSpace since 2010, Pichora-Fuller says that the recent spike in interest can likely be attributed to various industries becoming more focused on vocal emotions as the topic relates to areas like Artificial Intelligence (AI) and Human-Computer Interaction (HCI): machines will need to recognize the emotions of the humans talking to them, and to respond with speech that conveys the intended emotions. There are many applications for this kind of information; for example, a company might want a happy-sounding, computer-generated voice for a call system, or might use different messages depending on whether callers speak in happy or angry voices. Not all humans are young adults with normal hearing, so it is important to learn how people of different ages and with different abilities listen to emotional speech, says Pichora-Fuller. She adds that the hearing aid industry is interested in this area of research because, until recently, the focus has been on developing devices to improve speech understanding, with little attention paid to improving vocal-emotion perception.

Atkinson was surprised by the recent bump in traffic to TESS, but she says that’s the nature of the institutional repository: just like research, sometimes you cannot gauge where the data will lead or what the uptake will be.

“When you put things up on TSpace, you don’t know how useful it is, or how it’s going to be used, or who will ultimately be interested in the material,” says Atkinson.

Both Atkinson and Pichora-Fuller agree that recently retired Digital Research and Scholarly Communications Librarian Pam King helped pave the way, and that she was proactive in her archiving practices to ensure that as much UTM research output as possible is preserved on TSpace. Atkinson also says that two new units at the Library, Digital Scholarship and Research & Data Services, aim to help UTM researchers manage their data and publications as effectively as possible.

Pichora-Fuller further affirms the library’s collaborative involvement in the research environment on campus. She also points to a couple of other interesting outcomes that have emerged related to the renewed interest in their work.

“There is the knowledge mobilization piece because we are creating knowledge in our lab and other researchers and people in industry are using it,” says Pichora-Fuller. “There’s also an important translational factor with companies like Sonova, a leading international hearing aid company, deciding to open a new Canadian Innovation Centre near UTM because they are interested in and have ties to our work.”

Pichora-Fuller also points to a significant graduate-student connection that has been integral to shaping an outstanding track record for auditory research at UTM.

“Three of our former graduate students all received Mitacs funding to support their postdoctoral research, including Kate Dupuis, who is now a neuropsychologist working as an Innovation Leader at the Sheridan Elder Research Centre, as well as Gurjit Singh, who is currently a Senior Research Audiologist and Program Manager at Phonak, a Sonova company, and Huiwen Goy, who recently completed a postdoctoral fellowship that involved research using TESS to investigate how older listeners with hearing loss perceived emotional speech using different types of hearing aid signal processing,” says Pichora-Fuller. “Dr. Goy’s postdoctoral research was done at Ryerson in the SMART lab of Professor Frank Russo, who did his postdoctoral fellowship at UTM over 10 years ago. We have created a strong network of researchers who are leaders in the area of emotional speech processing.”

“I think this really speaks to the work we have done in the past and what we continue to do here in our Psychology labs on this campus, but also that our researchers, including our graduate student cohort, are contributing to this area of auditory and vocal emotion research in a profound way at the international level.”

Researchers interested in collaborating with the library or archiving their material in TSpace can get in touch with Chris Young, Coordinator, Digital Scholarship.