Hacker News | AnthonyR's comments

hard agree, there's already "voice ai" companies that use the normal models and have this "interaction" engine on top of them to produce better results than I've seen in these demos. idk why people are impressed

Have you tried this task using an actual OCR model like Google Cloud Vision AI? I am not sure if this is what Gemini uses under the hood but multi-modal LLMs are not designed to extract text like this so it should be no surprise it's not good at it?

Google Cloud Vision AI is a specialized model built on CNN frameworks, which is part of the Interfaze architecture, which is a hybrid, so you get the best of both worlds. Google Cloud Vision was pretty far behind other specialized models like PaddleOCR etc. anyway, so if you're looking for a pure CNN, check them out.

You can find the explanation and the comparison in the article, in which we benchmarked pure CNN models, pure LLM models and a hybrid architecture like ours.


I don't think I've tried Google Cloud Vision on that particular image, no. In my experience, based on some tests from a year ago or so, Azure Document Intelligence impressed me the most in terms of OCR - out of the big three players: GCP, AWS and Azure.

I should retry the experiment because there has been a lot of progress since then, and I could imagine that GCP has improved their vision models.


This is great!

I'd argue a good use of vibeslop -- a non security critical, fun, UI-centric data presentation website, don't be so cynical :)

I mean it's not as offensive as a lot of other vibe-coded "products", but it's just kind of a waste of everyone's time; there's no ingenuity in the presentation (it suffers from the same emoji-feature-box-small-text combo as every single one-shot vibecoded site in existence) and judging by said presentation I highly doubt the author put the effort into researching this list or writing the comments by themselves either.

So while it's not gonna be the next Moltbook in terms of security breaches, it's basically just the 2026 version of your middle manager copy-pasting a paragraph from ChatGPT web into Slack. It's content from nobody for nobody. It's definitely not what I want to see on the HN front page, but I guess if people get a kick out of it then you do you.


Super impressive for a solo project! How does this compare to capacities.io ?


Attributing value to something just because it's harder makes no sense. Sure, a big moat is important for value, but "difficult to do" is just one dimension.


Showing naked butt on the internet seems easy.

Earning millions that way is much more complicated.


Yeah I also don't quite understand the example on the homepage... they agreed to 50/50 and then she wanted 70/30 so now they settle on 60/40? Like this doesn't seem like a "fair" mediation it's kind of weird (obviously oversimplifying the situation a bit but nonetheless I'm not sure real world conflicts are this simple in practice)


You raise a good point. The issue is presentation - leading with the 60/40 reads like midpoint arbitration, whereas the interesting part is Daniel's path back to 50/50, the management salary, the mutual waiver on the first 18 months (which is what settles his rent contribution), and the shotgun buy-sell.

I've made some changes that should help with this.


They wanted 50/50, but from the vignette Daniel didn’t continue to do 50% of the work.


Sure, he just continued to take sole responsibility for the production of the product, quality and quantity, while also holding down an additional job, which paid the rent.

These characters have both been putting the work in.

I’d be looking for a serpent at his partner’s ear, planting poisonous suggestions that she deserves more of the company they started equally. If this were real.


> While also holding down an additional job

That's the problem: the story is saying he stopped focusing full-time on the business in order to make his own ends meet. It looks like the main innovation of the mediator-generated deal is that it attempts to reconcile by drafting a way back to 50/50 if he recommits. The starting 60/40 split is not that important.


Her ends, too. They share an apartment, in the story.

This is certainly an example of what I would expect from a product designed to optimize a prenup. You know, they say money ruins people, but sometimes you just have to acknowledge there was never really anything decent there to begin with.


Yeah after re-reading the scenario it is pretty weird. The AI doesn't have enough data. There should be concrete numbers for the rent. Why wouldn't Daniel tell the LLM exactly how much it was?


Well, I don't know, I'm sure. Totally unrelated, I hear a common piece of advice for the aspiring con artist is to avoid overcomplicating the legend.


He paid her rent

