An App That Lets You Converse with the Deaf, No Sign Language Necessary

brayton · on Oct 14, 2014

Amazing to see why they are going after this problem. One founder grew up in a fully deaf family and another founder is deaf himself.

tsm · on Oct 14, 2014

I watched the video...is this anything more than a mashup of groupchat and speech-to-text? Couldn't something similar be achieved with Google's speech to text API and IRC? I would've been impressed if the transcription was amazing, but there are errors in the video ("foreclosure" instead of "surfer culture", for one).

siglesias · on Oct 15, 2014

A very dismissive and short-sighted comment. The ambition here seems to go beyond the current implementation, to have a "magic" chat view that tracks voices and transcribes them in differing colors automatically and with minimal setup. I laud the effort and encourage the team to play this out for the sake of the hearing impaired. I hope that in the next decade deaf people--and their interlocutors--won't have to cobble together a slew of disparate technologies just to enable a group conversation.

kbenson · on Oct 15, 2014

It's not dismissive, it's inquisitive, and it lays out their reasoning why they don't understand what's special about this. In asking what's special, it gives the opportunity for proponents to address those questions specifically as to why they think it's special and different, so other readers that may have shared the original opinion get more information.

Your explanation about why you think it's special is useful and a good example of a positive outcome of the original comment. The way you initially denigrate the question is not.

siglesias · on Oct 15, 2014

It's undeniably dismissive. Yes, the first sentence is a question, but "is it anything more than X?" is a rhetorical flourish meant to imply that the product is trivially replicated. The next sentence goes on to say that he's not impressed.

As pg put it:

Maybe you think you're making some sort of important point here. Or maybe you realize your comment is inane and you think it's witty. But (perhaps without realizing it) you and the people upvoting you represent one of the worst forces at work in the world. The people who ridicule new things when they first appear in incomplete form are one of the worst drags on innovation. [1]

[1] https://news.ycombinator.com/item?id=4356562

kbenson · on Oct 15, 2014

I don't think it is, and I think the fact the person begins with a question asking if their assessment of the technology is correct is integral to that point.

As a real life example, I was leaning towards the original poster's interpretation of the product. The reply it spurred helped me see the product in a different light, and I think it has more merit than I originally thought, even if I'm not sure the technologies in use, or even how they are combined, are especially new and noteworthy. As is all too often the case, it's the implementation that matters.

I think my initial opinion was dismissive, the original comment was inquisitive (if a bit critical, but I see nothing wrong with some light criticism), the reply it spurred was illuminating, and my resulting opinion was hopeful. I view that as part of HN's success, not something that needs to be overly policed.

tsm · on Oct 15, 2014

GP here; you've used much better words to mirror my opinion. This being on Hacker News, my initial thought was that it would be an amazing technical display. Frankly, it isn't, but the discussion here has helped me realize that the reason we care about it is because it's an incredibly useful application of existing tech. That's still great—there's a ton of value outside of technical wizardry—but it just wasn't immediately clear to me after reading the article.

tduchemin1 · on Oct 20, 2014

Hi tsm, founder of Transcense here. Beyond the impact we want to have, we all at some point build on top of others' existing technologies. Defining innovation is tricky. Is it in the technical implementation (Instagram is not that complicated, after all) or in the productization/distribution to market? We're humbled to have been posted on HN, and not by us. But stay tuned; what's coming next will be even more interesting.

mkal_tsr · on Oct 15, 2014

> It's undeniably dismissive.

Looks to me like at least one person has already countered that claim, thus it is no longer undeniably dismissive. I would say your response to his question is even more dismissive than any "dismissal" that may have been inferred from the OP.

sdrothrock · on Oct 15, 2014

> The ambition here seems to go beyond the current implementation, to have a "magic" chat view that tracks voices and transcribes them in differing colors automatically and with minimal setup.

To put it another way: real-time subtitles. Imagine having something like this in Google Glass. As a person with profound hearing loss, that blows me away.

This technology could easily be repurposed for subtitling videos.

Freeboots · on Oct 15, 2014

Google Glass would be a great implementation.

qq66 · on Oct 15, 2014

I just drove the new Tesla, it's really nothing more than a battery, a motor, and wheels. Really nothing that couldn't have been done 100 years ago :)

nicolasehrhardt · on Oct 14, 2014

I don't think a small team of people can be better than Google or Apple at building speech-to-text technology. However, leveraging the tools available to help the deaf is the main idea - at least at the beginning - I think. Going forward, they will probably focus on the "Leap Motion" part of the project, from signs to text/voice, and let big companies improve the speech-to-text algorithms that they would just use. Because that's where big improvements can be made.

cerrelio · on Oct 15, 2014

This technology (speaker identification) is 10 years old, and [HMM/neural net] speech recognition is slightly older. So a small team could likely pull it off today just by implementing or using code published by researchers. As long as Transcense have control over the microphone(s), then it might work. Single mic/multispeaker speech recognition is still practically impossible unless the speakers take turns (not always the case).

abecedarius · on Oct 15, 2014

The Leap Motion thing is unrelated, from some other group, at least according to the story.

The problem of supporting realtime conversation among multiple people is different enough from voice search that there's scope to differentiate.

mcherm · on Oct 15, 2014

You are completely correct -- this is, indeed, nothing more than a simple mash up of existing commoditized capabilities. Yet there appears to be a market opportunity to sell such a service for as much as $100-300/year. You should build a competing product (a simple mash up) and sell it.

tmyrden · on Oct 14, 2014

Seems to leave the hearing impaired party as an outsider that is observing the conversation. Great step forward to giving the hearing impaired a foot in the door, so to speak. I'm curious to see where tech like this continues to develop to create an equal playing field for the hearing impaired within the conversation.

TezzellEnt · on Oct 15, 2014

My Grandma is slowly losing her hearing - one ear is 100% deaf while her other remaining one is at about 50%. She uses a CapTel speech to text phone with a huge display to understand what is said when people call her. It generally works well, but she only has one in her family room. She struggles to hear especially when there are multiple people speaking and there is background noise. I've learned techniques to improve her comprehension, but it can only go so far (If interested, here's a few: make sure you're looking at them when talking, speak in a 'deeper tone', don't rush your words, continually repeat what was said until they understand, etc).

Almost everyone in the United States has a phone. If I could download an app that runs this program along with my cousins, and have my Grandma use her 'iPad' (Nook tablet) to understand, with the assistance of something like Transcense, that would be amazing. By linking several microphones, they may be able to cancel out background noise and only highlight the specific speaker, and that would be a fantastic advance.

I'm wondering what the current state of Transcense's speech recognition is, however. From the video, it did seem like there were some errors. I'm sure a deaf user can understand what was meant to be said using the context of the conversation, but in a business meeting a misunderstood word can change the whole meaning of the sentence or message. I've used Siri, Dragon NaturallySpeaking, et al. and while they're good, they're not perfect. Dragon in particular supposedly can be taught and learn the user's unique style of speech, so I'm also curious if Transcense will be going the route of machine learning and NLP.

Sven7 · on Oct 15, 2014

That's a good set of rules. I'll add one more - keep the sentences short. Some people say a whole lot without having anything much to say.

machineghost · on Oct 16, 2014

Initially I was impressed, but now I worry about the logistic issues in regards to the fine details.

The main problem with general purpose real-time voice recognition is that current hardware is simply way too underpowered to accomplish the task. For instance, running the Dragon 11 SDK on an Intel Atom Z3770 leaves it up to a minute behind in transcribing the conversation! So I fear Transcense's approach is using the inferior Google Speech API which, plainly put, sucks donkey balls, and is in no way comparable to the latest Dragon engine. Apple uses the Dragon engine to implement Siri.

There's also the social burden of needing speakers to install an app on their smartphones and also actually have a smartphone in the first place. Will this be a free "remote mic" app such as Dragon 13 provides or does Transcense expect speakers to pony up the monthly cost as well?

I too think businesses and institutions will not allow this because they need overpriced ADA compliant solutions due to regulations. An example would be Interact-AS which is $800 and is essentially a fancy overlay for the Dragon engine (or Microsoft Speech in the low-end $150 version). Dragon itself only costs a one-time $99 to $199!

I'm also skeptical there's a viable business model in this. The vast majority of the deaf are on fixed incomes and not employed, so what is a relatively expensive $30 a month for app access buying them exactly? It had better be a superior remote client-to-server transcription experience! What's to stop Dragon from enabling multiple "remote mic" apps to work all at once with the mothership PC in their next version, etc.? And if not a client-server model, what are the minimum hardware specifications to get "one second" transcriptions? A $599 smartphone is a ridiculous and overpriced luxury for the deaf.

As for Google Glass, it is a non-starter. No one wants to look like an idiot constantly staring off into their peripheral vision to read text instead of looking at whoever is speaking -- which is why Google Glass has been such a massive failure. What is truly needed is spatially aware augmented reality where the transcriptions are placed over whoever is speaking by beaming text onto normal glasses or directly onto the retina. This technology already exists in various forms; it is just a matter of a real-world implementation in a "killer app". Transcense, are you paying attention?

Nonetheless, this is a very important step forward that no one else is really doing, so I'm in for $250... and holding my breath.

appreneur · on Oct 29, 2014

Wow, nice insight. I am going to try a prototype where a deaf person looks at someone and can see, on the retina or on glasses, what that person is saying... that's beautiful and amazing. Yes, I love Transcense, if they can do good, and I believe a retina display is a must for the deaf and mute. Imagine their world suddenly coming alive, where they can experience sound on the retina or on glasses; it's beautiful, real work.

Keep it up, guys at Transcense.

post_break · on Oct 14, 2014

So does it go up to a server to transcribe the audio? Is that why it costs $360 a year?

jschwartzi · on Oct 15, 2014

I hope that's an option. I could see this being used to transcribe conversations between groups of hearing people, as in meetings and the like.

Freeboots · on Oct 15, 2014

It's probably running on the Google Speech API. I'm not very familiar with it, but the calls may be a limited / paid service.

BHSPitMonkey · on Oct 14, 2014

That's also why most businesses won't be able to use it.

batbomb · on Oct 14, 2014

Businesses can already communicate with a deaf person the same way using humans instead of servers, it's called VPS.

BHSPitMonkey · on Oct 15, 2014

What does this have to do with the confidentiality issues that come with sending your full meeting audio and transcripts into the hands of a third party?

megablast · on Oct 15, 2014

I don't get this point, and the point after?

Why can businesses not use this, and how does a VPS replace this?

ObviousScience · on Oct 15, 2014

Businesses (particularly larger ones) are incredibly leery of having internal business communications handled by external parties.

podjackel · on Oct 14, 2014

You mean a text message?

chadgeidel · on Oct 14, 2014

I thought the same thing - until I read this: "It works by catching conversations from the voices of different individuals and assigning them a color bubble so the deaf person knows who said what. It works with a distributed microphone system on all the devices using the app so that it can distinguish each person from another."

That's pretty damn amazing IMHO.
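
As a rough illustration of the "color bubble" idea quoted above, here is a minimal sketch. It assumes each phone reports its own transcribed text along with a device ID; the palette, the `label_utterances` function, and the `phone-a`/`phone-b` IDs are all hypothetical and not taken from the article or the app.

```python
from itertools import cycle

# Hypothetical palette; the article does not say which colors the app uses.
PALETTE = ["blue", "green", "orange", "purple"]


def label_utterances(utterances):
    """Tag each (device_id, text) pair with a stable per-device color.

    Assumes one microphone (device) per speaker, as the quoted
    description suggests; this is not the app's actual algorithm.
    """
    colors = {}               # device_id -> color assigned on first appearance
    palette = cycle(PALETTE)  # reuse colors if there are more devices than colors
    labeled = []
    for device_id, text in utterances:
        if device_id not in colors:
            colors[device_id] = next(palette)
        labeled.append((colors[device_id], text))
    return labeled


conversation = [
    ("phone-a", "Hi there"),
    ("phone-b", "Hello"),
    ("phone-a", "How are you?"),
]

for color, text in label_utterances(conversation):
    print(f"[{color}] {text}")
```

With one device per speaker the bookkeeping is trivial; the hard part would be the transcription itself and separating speakers who share a microphone.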

chaf · on Oct 16, 2014

What makes this better than just assigning every microphone to an individual? Is it possible to have 4 individuals and 2 microphones? Then I would be impressed, but if there have to be as many microphones as individuals then it just seems over-engineered.

rcthompson · on Oct 14, 2014

You wouldn't have made this comment if you actually looked at the article and saw what the app does.

YokoZar · on Oct 15, 2014

Still, snarkiness aside, it's worth asking every once in a while if the problem you're solving is something that actually merits an app at all (or, for that matter, a computer).

That said, there does seem to be promise here, as there's a reason we video conference (or talk in person) rather than text-message for everything.

spiritplumber · on Oct 15, 2014

I swear I read "converse with the dead"...