Hacker News | Imnimo's comments

I'm not sure I understand why this company is talking about "frontier artificial intelligence".

This is exactly what Dario asked for in his last blog post. So even though this is clearly stupid, I just can't bring myself to feel sorry for Anthropic.

He asked for an independent body.

No, he asked for the government to make the decision in light of 3rd party analysis. Which is what happened here - an independent company demonstrated a jailbreak, and the government issued a restriction on deployment based on that finding.

"The government should have the power to block or deter deployment of the model if it is determined, in light of third-party assessment, to present unacceptable risks. This power must be scoped to the above four specific risks and there must be protective measures against political favoritism or arbitrary decisions."

You are wrong.

Read it in full here: https://darioamodei.com/post/policy-on-the-ai-exponential

I know you won't though. haha.


I am having trouble understanding which ingredient you feel is missing here.

Can you be more specific? It seems to me that there was a third-party assessment, they identified risks associated with the specific risk groups, and the government therefore chose to block the model's deployment.


You have to be precise: the gov blocked “export” of the model (search for ITAR for a lengthy history on this), and Anthropic picked up its ball and went home.

I’m willing to bet internally they thought this was a good plan from the beginning - from engagement, requests for reg oversight, Mythos PR, silently nerfing AI engineering quality, and now this “pulling the model” stunt. It’s frustrating, I generally like using the Claude models, but I don’t think I’ve ever been a customer of such a user-hostile company before.


So if I can demonstrate a jailbreak in ChatGPT, the government will immediately slap a "no foreign nationals" ban on GPT-5.5?

that's cute

This is explicitly not what Dario asked for in his blog. Care to quote his post for me where you feel that he asked for this?

Please tell me how this is what he “asked for.”

"The government should have the power to block or deter deployment of the model if it is determined, in light of third-party assessment, to present unacceptable risks."

>The government should have the power to block or deter deployment of the model if it is determined, in light of third-party assessment, to present unacceptable risks. This power must be scoped to the above four specific risks and there must be protective measures against political favoritism or arbitrary decisions.

I feel significantly less sympathy for Anthropic's Supply Chain Risk designation if they believe the government should have this power over them. You get what you sign up for.


>Craftsmanship will always be in our hands, it's one thing we can never outsource to a machine.

Current AI coding is certainly very lacking in the craftsmanship department. But it is not obvious to me that that will always be the case. I don't think there's some fundamental reason AI could never produce code that matches or exceeds the craftsmanship of human experts.


The fact that LLMs fundamentally do not, and cannot, have understanding or reasoning. I don't think you can produce craftsmanship through sheer volume of throwing stuff at the wall and seeing what sticks.

This reminds me also of this paper: https://www.pnas.org/doi/pdf/10.1073/pnas.1115585109

"The allocation of all metabolic resources to maintenance purposes limits the size of the smallest prokaryotes and largest unicellular eukaryotes, whereas an inability to meet the ever-increasing biosynthesis rates limits the largest prokaryotes and smallest unicellular eukaryotes. Metabolic constraints for larger eukaryotes are relieved by alternative reproductive strategies and multicellularity."


That framing makes the article feel even more interesting, because it's not just "cells are small because diffusion gets slow". There's also an energy budget behind it

I direct a lot of questions to LLMs, but I want to ask a high-quality model, not the crappy one that Google uses to answer queries. If I'm typing something into Google, it's because I want a search result, not an LLM answer.


I've actually changed that. When I type something into Google it's because I want an LLM answer - their search results have been useless for a while now. But that's only because I rarely use Google these days. I'm mostly using DDG to search (I might try Kagi at some point). Google is relegated to my phone when I want a quick answer where accuracy isn't critical without needing to scroll through a bunch of search results/open and read websites on a small screen.


Kagi, unfortunately, is getting worse too. I think mainly because they don’t get access. But I’m not sure. I had to fall back more and more to Google, because Kagi couldn’t find exact matches, while Google could. For example, text that I copied from a webpage (say, from Android’s source), and it can’t find it.

Its search results ordering is quite good, but the accessible information for them seems to be shrinking. And quickly.

I’m at the point where I don’t search for complex things anymore. I use Kagi for things which can be found with any search engines. Not because I chose it, but because I was forced. This was not the case a few years back, when I started to use it.

Btw, there was one thing with which Google was superior all along: define <word>. And they fucking killed it in the past months, for a far, far worse solution. Nothing comes even close.


I do have to say, and this is from recent observations, not outdated ones, that their AI summaries get things wrong a lot, and these are things that Gemini (proper), Claude, or ChatGPT subscription AIs don't get wrong.


I don't know why their AI summary model is so bad, yet when you just click through to AI mode it's miles better...


I've noticed that too. I am sure it's because the AI summary runs on almost every query.


It also has to be very fast = small


If I'm typing something into Google it's usually so I can be hit with a Captcha on my home internet connection and then get search results that aren't even any better than DDG. And DDG has an LLM as well.


You've got the captcha issue as well? Seems like it's happening constantly now. I suspect Apple Private Relay has something to do with it, but I'm not sure.


Nope, not exclusively an Apple thing, since I don't use any Apple products at home, and have had an uptick in captcha requests.


If Claude starts sending queries to Google, then Google makes you solve captchas from that IP. Likely you have been using such a bot and it sent queries without you knowing.


only time i ever have to deal with this anywhere (and i move physical sites a lot) is when using a commercial VPN provider. odd.


There is an interesting old article by Magic's creator about what the game environment was like during the early playtesting days - when card packs were handed out to a community of playtesters at UPenn, and they traded in a closed ecosystem, occasionally getting an influx of additional cards. It seems like this was a pretty successful recreation of that feeling:

https://magic.wizards.com/en/news/making-magic/creation-magi...


I could imagine cases where prediction markets could offer some actual insight, but in practice they seem few and far between. Most markets I've seen devolve into one or more of: betting on unimportant events (e.g. sports games), insider trading, or poorly written ambiguous resolution criteria. It's just hard for me to imagine that, on net, these markets will offer more societal good than the harm we've seen from sports betting.


> I could imagine cases where prediction markets could offer some actual insight

I could imagine that there's very little insight to be gained when some 23 year old has so little hope for his financial future that he's spending some of what he earned last night delivering food for uber eats on making a completely uninformed bet on a geopolitical situation involving a country he couldn't even find on a map just in case it pays out well enough that he can afford some of the needed medical care he's been putting off. That's the type of user I imagine most people participating in prediction markets are. It's telling that the vast majority of the people making bets lose their money. Are their predictions really valuable data?


That's one of the interesting(?) aspects of prediction markets. Some meaningful percentage of people in the market need to be bringing some useful insight. If everyone is just making random guesses, the market isn't very useful (other than, perhaps, financially to the lucky).


What insights are being gained? Every time this topic comes up, I see someone vaguely mentioning insights from prediction markets. But no one ever has concrete examples - real or even hypothetical.

Can you give an example of insights derived from prediction markets? Who benefited from the insights, and how did they act on the insights?


Hypothetically, someone who doesn’t live in the United States and is considering accepting a job there might appreciate the insight from prediction markets on the upcoming U.S. presidential election. If the likely candidate promises to enact policies that would make that person’s life worse, that would give them reason to reject the job offer.

Other hypothetical prediction markets whose insights would be useful include those used in futarchy, a proposed government system in which decisions are made based on betting markets. The proposal: https://mason.gmu.edu/~rhanson/futarchy.html; some analysis: https://www.lesswrong.com/w/futarchy. In futarchy, prediction markets would be set up for, for example, “average happiness of citizens (as measured by regular survey) will increase in 1 year if Bill ABC passes” and “average happiness of citizens will increase in 1 year if Bill ABC does not pass”. The government would pass or reject proposed bills according to whichever market predicts higher happiness, and the market describing the event that did not happen would be closed and its money refunded.
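
To make that decision rule concrete, here is a rough sketch in Python. It is only an illustration of the rule as described above, not anything taken from Hanson's proposal: the names `ConditionalMarket` and `futarchy_decision` are made up, each market's forecast is reduced to a single hypothetical number, and breaking ties in favor of rejecting the bill is my own assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ConditionalMarket:
    """One conditional market, e.g. 'happiness in 1 year if Bill ABC passes'."""
    condition: str
    predicted_happiness: float  # hypothetical market-implied estimate
    stakes: dict = field(default_factory=dict)  # trader name -> money staked


def futarchy_decision(if_pass: ConditionalMarket, if_reject: ConditionalMarket):
    """Return (bill_passes, refunds) for a pair of conditional markets."""
    # Assumption: a tie defaults to rejecting the bill.
    bill_passes = if_pass.predicted_happiness > if_reject.predicted_happiness
    # The market describing the event that did not happen is closed and refunded.
    voided = if_reject if bill_passes else if_pass
    return bill_passes, dict(voided.stakes)


if_pass = ConditionalMarket("Bill ABC passes", 7.2, {"alice": 50.0})
if_reject = ConditionalMarket("Bill ABC does not pass", 6.8, {"bob": 30.0})
passes, refunds = futarchy_decision(if_pass, if_reject)
print(passes, refunds)  # True {'bob': 30.0}
```

The surviving market would then be settled later against the measured outcome (the regular survey, in this example), which the sketch leaves out.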


I would agree that the insights provided are little to none. I think this is some sort of fallacy that is being PR'd by tech bros with a lot of skin in the game, attempting to come up with some kind of positive reason to sell to people. It's fairly easy to see right through it though.

WSJ had an insightful article that claimed 0.1% of the accounts take 67% of the profit, along with some other facts: https://www.wsj.com/finance/investing/polymarket-kalshi-bett...

Full disclosure, I've written performant market-making algorithms for Polymarket. I'm actually a fan of these markets and enjoy the statistics and engineering challenges they present, but I see them as a net negative for society. I'd gladly give up my PnL if it was a net positive for the American psyche.


It looks like the markets aren't very useful except for the small percentage of people with inside information, the exceptionally lucky, and the platforms themselves that get to take a cut of the action and/or collect personal data to sell.


Getting inside traders to publicize their knowledge for a price is like half of the point of prediction markets


I asked this elsewhere, but do you have an example of this knowledge? Have prediction markets ever revealed genuine knowledge? If so, has anyone ever acted on the newly revealed knowledge in a meaningful way?


Does polymarket then have a full profile of its users to be able to distinguish insider trades vs noise?

Genuine question, as I haven’t used the site and everything I know about them comes from comments and some articles.


Insider trading does offer actual insight


>But this is where the line slightly blurs in my head. Did we possibly just build the first human biocomputer and immediately put it in a simulated hell, playing the same game on loop, forever? Using the same reward mechanisms we use for LLMs?

This description does not seem to really match what was done in the Doom demo, and makes me skeptical that the author has actually looked into the details.


Author clearly doesn’t know the field well at all. The first few paragraphs reveal this. Opening sentence: I’ve been in the AI space since ChatGPT first dropped.

Everyone is allowed to have an opinion, but that doesn’t mean they’re all worth listening to. Unfortunately, right now, all of those opinions are about ai.


> since ChatGPT first dropped.

That'd be in November of 2022.

https://openai.com/index/chatgpt/


> skeptical that the author has actually looked into the details.

Never mind the experiment... same deal for a lot of people who are only interested enough to offer opinions about consciousness and theory of mind without doing any of the boring background reading.

The bottom line in TFA is maybe just about unapologetic carbon-chauvinism. But although OP has "been in the AI space since ChatGPT first dropped" and "bothered by this for months", they don't seem aware of the terms or the usual problems with this position. Your average non-technical scifi reader has a more nuanced take than AI bros puffing up blogs for LinkedIn traffic.


The bad boy of science!

