Converting One Accent To Another

Speech Technology

I read this article about a company, Sanas, that has developed a product that will detect a speaker’s accent and then change it to a different accent (either one you choose or one the product detects). The article says, “Accents supported will include American, Spanish, British, Indian, Filipino and Australian by the end of the year.” They call it “real-time accent matching.” What do you think about that idea?

You can hear a sample in the video in the article, also linked here.

The first thing I noticed is how non-human the created accented voice is. If an accent is a noticeable signal that there is a difference between me and another person, a computer takes that to a whole other level. Talking through a machine accent interpreter makes me feel even more distanced from the human producing the speech.

If I were only listening and not interacting with the machine accent interpreter, it might not feel as weird. I hear computer-generated speech a lot in YouTube and TikTok videos, and it’s only growing in popularity. On November 11, 2021, Instagram started using the same text-to-speech feature on Reels that TikTok uses. On November 12, 2021, Disney announced its partnership with TikTok to use Disney character voices such as C-3PO, Chewbacca, and Rocket Raccoon. Maybe interacting with computer-generated voices will become more common and I’ll get used to it.

The article quotes the company’s CEO, Maxim Serebryakov: “Sanas is striving to make communication easy and free from friction, so people can speak confidently and understand each other, wherever they are and whoever they are trying to communicate with.”

Based on my own tiny experience with the demonstration, I can see how it could reduce “friction” in communication based on understanding accented speech, but I doubt it would remove it completely. There’s more to communication than accents.

“Ladder Truck”

I remember scoring for the TOEIC (Test of English for International Communication) and having a supervisor call me to review an answer that I had given a high score. She wanted me to lower the score based on language use because the speaker had said, “The ladder truck moved the boxes.” She said that a “ladder truck” was a truck used by the fire department, not one used for moving into a new apartment. I could tell that the speaker on the test had a Korean accent, and I had just been living in Korea; otherwise, I would have immediately thought the same thing as my supervisor. However, I knew this speaker was referring to a totally different kind of truck, one that I had seen used for moving boxes into apartments from the outside so that people didn’t have to carry each box up and down stairwells. They are called “ladder trucks” or “ladder-lift trucks” and look like this:

My supervisor was mentally picturing an American fire department truck:

It didn’t matter how perfectly that test taker pronounced the term “ladder truck” or how grammatically correct his sentence was. The completely different meaning that the supervisor had for that term caused her to misunderstand his intended meaning. You can change the sound of the words and the overall intonation, but when people have accents while speaking another language, it’s because they are from different places geographically, economically, or socially. Any such separation creates differences not only in sound but also in meaning and perspective.

Upsides

I won’t even go into detail about how much I disagree that speaking into an app that converts your accent will increase your speaking confidence. Ultimately, though, if this product will help people do their job better, or even get a job that once had speech requirements they didn’t meet, that’s a good thing. If it helps reduce the initial resistance that someone has when they are connected with a customer service rep who has an accent different from their own, that’s a good thing. That initial resistance combined with a frustrating situation that created the need for a call to customer service is a bad combination. If this product helps reduce the amount of effort that goes into understanding a speaker with a different accent and helps a conversation go more smoothly, that’s a positive result.