
I am pessimistic about AI. The technology is definitely useful, but I watched a video that made some good points:

  * On benchmark style tests, LLMs are not improving. GPT-4o is equivalent to GPT-4 and both barely improve on GPT-3.5T.
  * AI companies have been caught outright lying in demos and manipulating outcomes by pre-training for certain scenarios.
  * The main features added to GPT-4o have been ones that manipulate humans with dark patterns.
  * The dark patterns include emotional affect in tone and cute/quirky responses.
  * These dark patterns encourage people to think of LLMs as humans that have a similar processing of the universe.

I seriously wonder about the transcript these guys had with the LLM. Were they suggesting things? Did ChatGPT just rearrange their own words in a way that helped them think through the problem? I think the truth is that ChatGPT is a very effective rubber duck debugger.

> https://www.youtube.com/watch?v=VctsqOo8wsc



I don't think it makes sense to base your opinion on videos when you could simply use the product yourself and get first-hand experience.

I've come full circle on it. First I thought it was totally amazing, then I pushed it hard and found it lacking, and then I started using it more casually; now I use it a little every day. I don't find it completely brilliant, but it knows an above-average amount about everything. And it can make short work of very tedious tasks.

I just type to ChatGPT, so there are no "emotional effects" involved.


Try playing around with getting GPT-4 to discuss creative solutions to real unsolved problems that you are personally an expert on or that you created yourself. That video looks like just standard internet ragebait to me.

I find it pretty annoying when people say you are just being manipulated by hype if you are impressed by LLMs. I was a serious skeptic who thought GPT-3 was useless, and I only changed my opinion by directly experimenting with GPT-4 on my own, getting it to discuss and solve problems in my area of expertise as an academic researcher.


It's the new eternal summer; welcome to the club :) GPT-3 was translating 2000 lines of code across 5 languages and enabling me to ship at scale.


I don't know jack about C# and .NET, yet I've used ChatGPT to write several nontrivial programs.


> On benchmark style tests, LLMs are not improving. GPT-4o is equivalent to GPT-4 and both barely improve on GPT-3.5T.

The understatement of the year.

GPT-4o is 6x cheaper than GPT-4. So if it's actually equivalent to GPT-4, that's a great improvement.

In fact, calling it "great" is still a crazy understatement. It's a cutting-edge technology becoming 6x cheaper in a year and a half. I'm quite sure that has never happened before in human history.
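
For a rough back-of-the-envelope comparison (using the launch list prices as I remember them, roughly $30 per 1M input tokens for GPT-4 vs. $5 per 1M for GPT-4o, so treat the exact figures as approximate):

    # Back-of-the-envelope cost comparison. Prices are assumed launch list
    # prices per 1M input tokens and may not match current pricing.
    GPT4_USD_PER_M_TOKENS = 30.0   # assumed
    GPT4O_USD_PER_M_TOKENS = 5.0   # assumed

    monthly_tokens = 10_000_000    # hypothetical workload
    cost_gpt4 = monthly_tokens / 1_000_000 * GPT4_USD_PER_M_TOKENS
    cost_gpt4o = monthly_tokens / 1_000_000 * GPT4O_USD_PER_M_TOKENS

    print(cost_gpt4, cost_gpt4o, cost_gpt4 / cost_gpt4o)
    # 300.0 50.0 6.0 -> same workload, 6x cheaper at (claimed) equal quality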


A lot of the video's points are plainly wrong, both on the numbers and in my experience. I don't know how they feel comfortable claiming benchmark numbers didn't improve much from 3.5 to 4.


Going off the GP's points alone, they said GPT-3.5T vs. GPT-4, where I assume the T stands for "Turbo". If that's the case, then the video must have its timelines wrong: GPT-3.5 Turbo came out with GPT-4 Turbo, which was some time after GPT-4, and OpenAI put a lot of work into making GPT-3.5 Turbo work much better than 3.5. I can sort of buy that someone could find the difference between 3.5 Turbo and 4 to be not that big, but they're making the wrong comparison. The real jump was from 3.5 to 4; 3.5 Turbo came out much later.



