
Very cool! Curious about the technical details of the LLM side. Can't imagine this is cheap to run.

Tiny suggestion: the timestamp in the generated text block could link to the corresponding timestamp in the video.
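For what it's worth, YouTube already supports jumping to an offset via the t query parameter, so each generated timestamp would only need an anchor built something like the sketch below (the video id is a placeholder):

    # Sketch: turn a timestamp in seconds into a YouTube deep link.
    # The video id here is a placeholder.
    def timestamp_link(video_id: str, seconds: int) -> str:
        return f"https://www.youtube.com/watch?v={video_id}&t={seconds}s"

    print(timestamp_link("dQw4w9WgXcQ", 754))  # -> ...&t=754s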



I have a similar (less polished) side project I threw together in a weekend [0][1]. I use ChatGPT's API and it costs ~0.5c / article generated. So IMO very reasonable.

[0] https://gitea.va.reichard.io/evan/VReader/src/branch/master/...

[1] https://vreader.va.reichard.io/
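For a sense of what that looks like, the core of it is basically one chat-completion call per article. A minimal sketch, assuming the official openai Python SDK; the model name, prompt, and transcript variable are placeholders rather than what the project actually runs:

    # Minimal sketch of summarizing a transcript with the OpenAI API.
    # Model name, prompt, and `transcript` are placeholders.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    transcript = "..."  # full transcript text of the video

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any cheap chat model works
        messages=[
            {"role": "system", "content": "Rewrite this transcript as a short article."},
            {"role": "user", "content": transcript},
        ],
    )
    print(resp.choices[0].message.content)

The cost is dominated by transcript length (input tokens), which is how it stays in the fraction-of-a-cent range mentioned above.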


It depends. If it's using YouTube's transcript API, it shouldn't be that bad. The catch is that those transcripts tend to get garbled when there are multiple speakers in the video.
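For context, pulling those captions is nearly a one-liner; a sketch assuming the youtube-transcript-api Python package (the video id is a placeholder):

    # Sketch: fetch YouTube's existing caption track.
    # Assumes the `youtube-transcript-api` package; the video id is a placeholder.
    from youtube_transcript_api import YouTubeTranscriptApi

    segments = YouTubeTranscriptApi.get_transcript("dQw4w9WgXcQ")
    # Each segment looks like {'text': ..., 'start': ..., 'duration': ...}.
    # There are no speaker labels, which is why multi-speaker videos come out
    # as one undifferentiated stream of text.
    text = " ".join(s["text"] for s in segments)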


Wouldn't the expensive part of this be the LLM summary, not the video transcription? I can do transcription pretty easily with my low-end graphics card and a fast version of Whisper.

I was wondering, though, whether it uses the YouTube transcript when available and falls back to doing the transcription itself if not.
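A rough sketch of that fallback, assuming the faster-whisper package; model size, device, and the audio path are placeholders:

    # Sketch: transcribe locally when no YouTube transcript is available.
    # Assumes the `faster-whisper` package; model size, device, and path are placeholders.
    from faster_whisper import WhisperModel

    model = WhisperModel("small", device="cuda", compute_type="int8")  # fits a low-end GPU
    segments, _info = model.transcribe("audio_from_video.mp3")
    text = " ".join(seg.text for seg in segments)

Something like the small or base model with int8 quantization keeps both VRAM use and runtime low enough for a cheap card.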



