steinvakt's comments

steinvakt · on Oct 13, 2024

How does the accuracy compare to Whisper?

Etheryte · on Oct 13, 2024

This uses SenseVoice [0] under the hood, which claims to have better accuracy than Whisper. Not sure how accurate that statement is though, since I haven't seen a third-party comparison; in this space it's very easy to toot your own horn.

[0] https://github.com/FunAudioLLM/SenseVoice

jmward01 · on Oct 13, 2024

This uses SenseVoice small under the hood. They claim their large model is better than Whisper large v3, not the small version. This small version is definitely worse than Whisper large v3 but still usable and the extra annotation it does is interesting.

khimaros · on Oct 13, 2024

this claims to have speaker diarization, which is a potentially killer feature missing from most Whisper implementations.

pferdone · on Oct 13, 2024

I mean, they make a bold statement up top, only to backpedal a little further down with: "[…] In terms of Chinese and Cantonese recognition, the SenseVoice-Small model has advantages." [0]

It feels dishonest to me.

[0] https://github.com/FunAudioLLM/SenseVoice?tab=readme-ov-file...

ks2048 · on Oct 13, 2024

I've been doing some things with Whisper and find the accuracy very good, BUT I've found the timestamps to be pretty bad. For example, using the timestamps directly to clip words or phrases often clips off the end of a word (even in simple cases where it is followed by silence). Since this emphasizes word timestamps, I may give it a try.

steinvakt · on Sept 25, 2024

So "What did Ilya see" might just be "Ilya actually saw Sam"

steinvakt · on Sept 25, 2024

People have been saying that we reached the limits of AI/LLMs since GPT4. Using o1-preview (which is barely a few weeks old) for coding, which is definitely an improvement, suggests there are still solid improvements going on, don't you think?

samatman · on Sept 25, 2024

Continued improvement is still returns, which makes it inherently compatible with a diminishing-returns scenario. I also suspect that's where we are now: there's no comparing the jump from GPT3.5 to GPT4 with the jump from GPT4 to any of the subsequent releases.

Whether or not we're leveling out, only time will tell. That's definitely what it looks like, but it might just be a plateau.