I made it a policy to visit news sites only once a day (usually in the morning). Once I've caught up and start seeing stories I already saw yesterday, I stop and don't come back until tomorrow.
YouTube captioning in English works surprisingly well; the improvement over the last few years is huge. It still chokes on proper nouns, but in general it works.
I think it's a bit like self-driving cars in the sense that it's good enough to be impressive but not good enough to be actually usable everywhere. Of course self-driving is worse because people seldom die of bad captions.
Google's captioning works well when people speak clearly and in English. Google Translate works well when you translate well-written, straightforward text into English. It's impressive, but it's got a long way to go to reach human-grade transcription and translation.
I think when evaluating these things people underestimate how long the tail of these problems is. It's always those pesky diminishing returns. I think it's true for many AI problems today. For instance, it looks like current self-driving car tech manages to handle, say, 95% of situations just fine. Thing is, for something that critical to be actually usable you want a success rate of something like 99.999%, and bridging those last few percent might prove very difficult, maybe even impossible with current tech.
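To put rough numbers on that gap, here is a minimal sketch. The 95% and 99.999% figures are just the illustrative ones from the comment above, not measurements, and counting "situations" as equally weighted trials is my own simplifying assumption.

```python
from fractions import Fraction

def failures_per_million(success_rate: Fraction) -> Fraction:
    """Expected failures per one million situations at a given success rate."""
    return (1 - success_rate) * 1_000_000

# Illustrative rates taken from the comment, not real-world data.
current = Fraction(95, 100)        # "say, 95% of situations"
target = Fraction(99999, 100000)   # "something like 99.999%"

current_failures = failures_per_million(current)
target_failures = failures_per_million(target)

# How many times rarer failures must become to get from current to target.
improvement_factor = current_failures / target_failures

print(f"current: {current_failures} failures per million")
print(f"target:  {target_failures} failures per million")
print(f"failures must become {improvement_factor}x rarer")
```

So "the last few percent" is less than 5 percentage points of success rate, but under these assumptions it means making failures thousands of times rarer, which is why the tail looks so long.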
What's important to remember, I think, is that we should not compare YouTube auto captions to human-made captions, because auto captions were not created as a substitute for human-made captions - if it weren't for auto captioning, all these videos wouldn't get any captions at all. They may never be perfect, but they're not designed to be; they're creating new value on their own. And IMO they've crossed the threshold of being usable, at least for English.
Mh, no, it does not. It is just a source of hilarity, apart from a few very specific cases (political speeches mostly, because of their slow pace, good English and pronunciation, I guess).
Every time I turn it on, I am in for a good laugh rather than anything actually useful.
It works for general-purpose videos. Transcripts of any kind appear to stop working whenever there's domain knowledge involved. That doesn't matter for most YouTube videos, but it is crucial if you want to have a multi-purpose translator/encoder.
A. Cooper had a nice example of this kind: a dancing bear. Sure, the fact that the bear dances is very amusing, but let it not distract us from the fact that it dances very, very badly.
In Polish there are different words for free as in free speech and for free as in free beer. But, interestingly, the words for "free" as in free speech and for "slow" are the same, so "wolne oprogramowanie" can mean both "free software" and "slow software" :-) If I remember correctly, both senses come from "wola", which means "will" - if you were free, you could act on your own will, but you could also do things at your own pace. Slowly, that is. But I'm not completely certain I remember the etymology right, so take it with a grain of salt :-)