Hacker Newsnew | past | comments | ask | show | jobs | submit | xmprt's commentslogin

I've seen this happen when AI is asked to spot fix a single sentence in the middle of an essay and it fails to maintain the style of the rest of the writing.

Not sure if I buy this reasoning. I just prompted ChatGPT and asked it to "pay attention to all the words". It still got it wrong but I'm sure if you told a human this, they'd actually do more thinking to get the right answer.

###

> Pay attention to all the words and answer this riddle: A young man gets into an accident. The surgeon, after completing his preparations, says: ‘I can’t operate on him, he is my son.’ How is this possible?

The surgeon is his mother.

(The riddle plays on the assumption that surgeons are male — the surgeon can simply be the boy’s mother.)

> Are you sure about this? Remember to pay attention to all the words.

Yes — the surgeon is his mother.

The riddle hinges on the unstated assumption that surgeons are male. Nothing in the wording rules out that the surgeon is female (or that the parent-child relationship is by adoption, step-parenting, etc.), so the simplest answer is that the surgeon is the young man’s mother.


I am sure some humans would still miss the clue even if you told them to pay close attention. Being dumb is not the same as not thinking.

Even if they aren't obligated to repay, they have to in practice because it'll impact their ability to get loans in the future. If the shell company declares bankruptcy and gets the loans off Meta's books no one will ever loan money to Meta again.

They would still be able to get loans, but the terms would be much worse.

Basically, if we’re reading about it from substacks and Matt Levine’s newsletter then it’s already fully common knowledge in the finance world.


Eh, debt investors have short memories. They buy 100 year bonds from Argentina, for fuck's sake. It might limit Meta's ability to do this SPV trick.

> Fundamentally, with LLMs you can't separate instructions from data, which is the root cause for 99% of vulnerabilities

This isn't a problem that's fundamental to LLMs. Most security vulnerabilities like ACE, XSS, buffer overflows, SQL injection, etc., are all linked to the same root cause that code and data are both stored in RAM.

We have found ways to mitigate these types of issues for regular code, so I think it's a matter of time before we solve this for LLMs. That said, I agree it's an extremely critical error and I'm surprised that we're going full steam ahead without solving this.


We fixed these in determinate contexts only for the most part. SQL injection specifically requires the use of parametrized values typically. Frontend frameworks don't render random strings as HTML unless it's specifically marked as trusted.

I don't see us solving LLM vulnerabilities without severely crippling LLM performance/capabilities.


> We have found ways to mitigate these types of issues for regular code, so I think it's a matter of time before we solve this for LLMs.

We've been talking about prompt injection for over three years now. Right from the start the obvious fix has been to separate data from instructions (as seen in parameterized SQL queries etc)... and nobody has cracked a way to actually do that yet.


Yes, plenty of other injections exist, I meant to include those.

What I meant, that at the end of the day, the instructions for LLMs will still contain untrusted data and we can't separate the two.


I hope you share that same energy for people doing high frequency trading or writing advertisement engines. Cheating in neopets is probably at the lowest end of harm caused by cheating and also hurts neopets devs more than it hurts other players.


The entire basis of this article/rumor is a single job posting on Google's careers website... Unifying Android across all devices is Google's holy grail and they've been hiring for that for most than a decade. I don't think we have to read into this much.


Unifying the two has never been an internal goal until 2024. I'm not sure why you think otherwise. Everything before that has just been rumors and maybe one off projects by very small amounts of people. Rebasing ChromeOS on the lower half of Android is real and has been publicly announced. It is not necessarily the layers you will notice through. It's about unifying things like the kernel, display stack, power management, Bluetooth stack, etc. There are effectively divergent universes between ChromeOS and android (and the desktop Linux ecosystem) despite these things not necessarily requiring unique solutions.


Might be that the source of the rumour is an inside disclosure which pointed to the job listing as a published fact.

That's an extrapolation on my part, of course, but it's not inconsistent with how other leaks or disclosures have occurred. Can't speak to Android Authority's practices here.


> It's almost a law of nature

We have tons of different systems for accumulating power all over the world. Corporate structures, democracy vs autocracy, etc. In each of those societies, we see different types of leaders on a sliding scale of savoriness.

My point is that clearly there are some forms of governance which result in more savory people and so you can argue that it's the systems that define the outcomes rather than any "law of nature".


Please make sure that the new billing experience has support for billing limits and prepaid balance (to avoid unexpected charges)!


We are working on hard billing limits! Should land in late Dec or early Jan!


Lol. Since the GirlsGoneWild people pioneered the concept of automatically-recurring subscriptions, unexpected charges and difficult-to-cancel billing is the game. The best customer is always the one that pays but never uses the service ... and ideally has forgotten or lost access to the email address they used when signing up.


> or lost access to the email address they used when signing up.

Since Gmail controls access to tens of millions of people's email, I'm seeing potential for some cross-team synergy here!


tens of millions? I think you're severely underestimating it.


Or

* Helper: This is a great suggestion which I'll flag for the team to add support (5 years ago)


For what it’s worth the people who made that sort of post are probably vaguely annoyed at the lack of progress on this change, or on other ones on their own particular list of requests that have been moldering for half a decade while everyone spends three dev cycles adding half-assed AI bullshit features.


One possibility is increased monitoring. In the past, issues that happened weren't reported because they went under the radar. Whereas now, those same issues which only impact a small percentage of users would still result in a status update and postmortem. But take this with a grain of salt because it's just a theory and doesn't reflect any actual data.

A lot of people are pointing to AI vibe coding as the cause, but I think more often than not, incidents happen due to poor maintenance of legacy code. But I guess this may be changing soon as AI written code starts to become "legacy" faster than regular code.


At least with GitHub it's hard to hide when you get "no healthy upstream" on a git push.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: