I think it’s a cultural difference. I’m also from a non-dubbing country (Netherlands) and I can’t stand dubbed content either. On the other hand people tell me they can’t stand subtitles because it “reveals” what they’re going to say before they say it.
"people tell me they can’t stand subtitles because it “reveals” what they’re going to say before they say it."
I love watching movies in the original language, and I hate this as well, but it is something that can be avoided.
Some movies get it right, though. The timing, just the words that are spoken and even different colors for different persons speaking (very rare, cannot even remember where I have seen it). That should be standard, but with most movies you can be lucky if the subs even match the plot and do not reveal too much.
Some of the best subtitles I've ever seen were on Tom Scott's YouTube channel. They use different colours, indicators for jokes and sarcasm, while also staying relatively close to what's actually been said. They're better than many big-budget movies and TV shows I've seen.
He talked about subtitling at some point, and I was surprised how cheap subtitling services are. I think he went beyond the price he mentioned, but it really made me question why big, profitable YouTube channels aren't spending the small change to do at least native-language subtitles that Google can translate, instead of relying on YouTube's terrible algorithm.
That said, Whisper seems to generate quite good subtitles that take short pauses for timing into account, but they're obviously never going to be as good as a human who actually understands the context of what's being said.
Yes. But Whisper's word-level timings are actually quite inaccurate out of the box. There are some Python libraries that mitigate that. I tested several of them. whisper-timestamped seems to be the best one. [0]
That's a great use case for LLMs, actually. Translate the sentence only up to what has been said so far. Basically, a balance between translating word-for-word (perfect timing, but terrible grammar) and translating the whole sentence and/or thought (perfect grammar and meaning, but potentially terrible timing).
With the SRT file format for subtitles, I think, there's no reason why one couldn't make groups of words appear as they are spoken.
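To make that concrete, here is a minimal sketch of what I mean, assuming you already have word-level timings as (text, start, end) tuples in seconds. That input shape, the `progressive_srt` name and the `group_size` parameter are my own assumptions for illustration, not the output format of any particular library. Each cue repeats the line so far plus the next group of words, and stays on screen until the next group starts.

```python
from typing import List, Tuple

# Hypothetical input shape: (text, start_seconds, end_seconds) per word.
Word = Tuple[str, float, float]


def fmt_ts(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def progressive_srt(words: List[Word], group_size: int = 2) -> str:
    """Build SRT cues in which the line grows by `group_size` words at a time."""
    cues = []
    for i in range(0, len(words), group_size):
        nxt = i + group_size
        # Show everything spoken so far, including the current group.
        text = " ".join(w[0] for w in words[:nxt])
        start = words[i][1]
        # Keep the cue up until the next group starts (or the last word ends).
        end = words[nxt][1] if nxt < len(words) else words[-1][2]
        cues.append(f"{len(cues) + 1}\n{fmt_ts(start)} --> {fmt_ts(end)}\n{text}\n")
    # Cues are separated by a blank line, as SRT expects.
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [
        ("I", 0.0, 0.2), ("will", 0.2, 0.5),
        ("go", 1.5, 1.7), ("to", 1.7, 1.9),
        ("the", 3.0, 3.1), ("cinema", 3.1, 3.8),
    ]
    print(progressive_srt(demo))
```

This only handles a single subtitle line. A real version would also need to reset at sentence boundaries and cap the line length.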
Actually, I have to do the same thing when generating the dubbed voices. Otherwise it feels as though the AI voice is saying something different than the person in the video, especially when the AI finishes speaking and you still hear some of the last words from the original speaker.
Unfortunately not all languages follow the same sentence structure, so translating "up to what has been said so far" is not possible.
Assume 2 dramatic stops in an English sentence, and observe the Turkish version:
"I will.. go to.... the cinema"
"Ben... sinemaya... gidecegim" (I .. to the cinema.. go)
I'm Norwegian, and Norway used to be near-universally non-dubbing other than for TV for the very youngest children, and even then almost exclusively cartoons or stop motion etc. where it wasn't so jarring. But the target age of material being dubbed has crept up as it has become relatively-speaking cheaper to do compared to revenues generated in what is a tiny market.
The thing that annoys me the most about it is that it often alters the feel of the material. E.g. I watched Valiant (2005) with my son in Norwegian first, because he got it on DVD from his grandparents. He doesn't understand much Norwegian, but when he first got the DVD he was so little that it didn't matter. A few years later we watched the English language version.
It comes across as much darker in the English version. The voice acting is much more somber than the relatively cheerful way the Norwegian dub was done, and while it's still a comedy, in comparison it feels like the Norwegian version obscures a lot of the tension, which makes it feel almost like a different movie.
I guess that could go both ways, but it does often feel like the people dubbing something are likely to have less time and opportunity to get direction on how to play the part, and you can often hear the consequences.
I prefer subs over dubbing for foreign languages, but I cannot stand closed captions (for people who can’t hear at all), because having my eye drawn to the bottom of the screen for a description of something I don’t need to know about is horrible!
Sometimes it's hilarious when they're trying to describe the dramatic tension from sounds or music, and "reveal" all the cliches, though. "Music swells to a tear-jerking crescendo"