More

golfer · 2026-02-05T22:41:13 1770331273

There's lots of websites that list the spells. It's well documented. Could Claude simply be regurgitating knowledge from the web? Example:

https://harrypotter.fandom.com/wiki/List_of_spells

qwertytyyuu · 2026-02-06T03:03:50 1770347030

Hmm… maybe he could switch out all the spells names slightly different ones and see how that goes

ck_one · 2026-02-05T22:44:33 1770331473

It didn't use web search. But for sure it has some internal knowledge already. It's not a perfect needle in the hay stack problem but gemini flash was much worse when I tested it last time.

viraptor · 2026-02-05T22:57:55 1770332275

If you want to really test this, search/replace the names with your own random ones and see if it lists those.

Otherwise, LLMs have most of the books memorised anyway: https://arstechnica.com/features/2025/06/study-metas-llama-3...

jazzyjackson · 2026-02-06T03:45:50 1770349550

Being that it has the books memorized (huh, just learned another US/UK spelling quirk), I would suppose feeding it the books with altered spells would get you a confused mishmash of data in the context and data in the weights.

ribosometronome · 2026-02-06T00:01:45 1770336105

Couldn't you just ask the LLM which 50 (or 49) spells appear in the first four Harry Potter books without the data for comparison?

viraptor · 2026-02-06T00:10:33 1770336633

It's not going to be as consistent. It may get bored of listing them (you know how you can ask for many examples and get 10 in response?), or omit some minor ones for other reasons.

By replacing the names with something unique, you'll get much more certainty.

Grimblewald · 2026-02-06T00:29:27 1770337767

might not work well, but by navigating to a very harry potter dominant part of latent space by preconditioning on the books you make it more likely to get good results. An example would be taking a base model and prompting "what follows is the book 'X'" it may or may not regurgitate the book correctly. Give it a chunk of the first chapter and let it regurgitate from there and you tend to get fairly faithful recovery, especially for things on gutenberg.

So it might be there, by predcondiditioning latent space to the area of harry potter world, you make it so much more probable that the full spell list is regurgitated from online resources that were also read, while asking naive might get it sometimes, and sometimes not.

the books act like a hypnotic trigger, and may not represent a generalized skill. Hence why replacing with random words would help clarify. if you still get the origional spells, regurgitation confirmed, if it finds the spells, it could be doing what we think. An even better test would be to replace all spell references AND jumble chapters around. This way it cant even "know" where to "look" for the spell names from training.

angst · 2026-02-06T02:29:37 1770344977

btw it recalls 42 when i asked. (without web search)

full transcript: pastebin.com/sMcVkuwd

f33d5173 · 2026-02-06T02:44:56 1770345896

Not sure how they're being counted, but that adds up to 46 with the pair spells counted separately. But then nox is counted twice, so maybe 45.

heavyset_go · 2026-02-06T02:13:16 1770343996

No, because you don't know the magic spell (forgive me) of context that can be used to "unlock" that information if it's stored in the NN.

I mean, you can try, but it won't be a definitive answer as to whether that knowledge truly exists or doesn't exist as it is encoded into the NN. It could take a lot of context from the books themselves to get to it.

joshmlewis · 2026-02-05T22:57:41 1770332261

I think the OP was implying that it's probably already baked into its training data. No need to search the web for that.

obirunda · 2026-02-06T01:17:03 1770340623

This underestimates how much of the Internet is actually compressed into and is an integral part of the model's weights. Gemini 2.5 can recite the first Harry Potter book verbatim for over 75% of the book.

NiloCK · 2026-02-06T02:42:10 1770345730

I'm getting astrology when I search for this. Any links on this?

f33d5173 · 2026-02-06T02:48:28 1770346108

Iirc it's not quite true. 75% of the book is more likely to appear than you would expect by chance if prompted with the prior tokens. This suggests that it has the book encoded in its weights, but you can't actually recover it by saying "recite harry potter for me".

jdminhbg · 2026-02-06T02:54:11 1770346451

Do you happen to know, is that because it can’t recite Harry Potter, or because it’s been instructed not to recite Harry Potter?

jazzyjackson · 2026-02-06T03:47:36 1770349656

It's a matter of token likelihood... as a continuation, the rest of chapter one is highly likely to follow the first paragraph.

The full text of Chapter One is not the only/likeliest possible response to "recite chapter one of harry potter for me"

jamesfinlayson · 2026-02-06T04:53:45 1770353625

Instructed not to was my understanding.

obirunda · 2026-02-06T03:22:43 1770348163

https://arxiv.org/abs/2601.02671?hl=en-US

IAmGraydon · 2026-02-06T03:39:07 1770349147

I'm not sure what your knowledge level of the inner workings of LLMs is, but a model doesn't need search or even an internet connection to "know" the information if it's in its training dataset. In your example, it's almost guaranteed that the LLM isn't searching books - it's just referencing one of the hundreds of lists of those spells in it's training data.

This is the LLM's magic trick that has everyone fooled into thinking they're intelligent - it can very convincingly cosplay an intelligent being by parroting an intelligent being's output. This is equivalent to making a recording of Elvis, playing it back, and believing that Elvis is actually alive inside of the playback device. And let's face it, if a time traveler brought a modern music playback device back hundreds of years and showed it to everyone, they WOULD think that. Why? Because they have not become accustomed to the technology and have no concept of how it could work. The same is true of LLMs - the technology was thrust on society so quickly that there was no time for people to adjust and understand its inner workings, so most people think it's actually doing something akin to intelligence. The truth is it's just as far from intelligence your music playback device is from having Elvis inside of it.

kgeist · 2026-02-06T15:57:41 1770393461

>The truth is it's just as far from intelligence your music playback device is from having Elvis inside of it.

A music playback device's purpose is to allow you hear Elvis' voice. A good device does it well: you hear Elvis' voice (maybe with some imperfections). Whether a real Elvis is inside of it or not, doesn't matter - its purpose is fulfilled regardless. By your analogy, an LLM simply reproduces what an intelligent person would say on the matter. If it does its job more-less, it doesn't matter either, whether it's "truly intelligent" or not, its output is already useful. I think it's completely irrelevant in both cases to the question "how well does it do X?" If you think about it, 95% we know we learned from school/environment/parents, we didn't discover it ourselves via some kind of scientific method, we just parrot what other intelligent people said before us, mostly. Maybe human "intelligence" itself is 95% parroting/basic pattern matching from training data? (18 years of training during childhood!)

Trasmatta · 2026-02-06T00:54:01 1770339241

Do the same experiment in the Claude web UI. And explicitly turn web searches off. It got almost all of them for me over a couple of prompts. That stuff is already in its training data.

soulofmischief · 2026-02-05T23:23:30 1770333810

The only worthwhile version of this test involves previously unseen data that could not have been in the training set. Otherwise the results could be inaccurate to the point of harmful.

altmanaltman · 2026-02-06T05:58:49 1770357529

> But for sure it has some internal knowledge already.

Pretty sure the books had to be included in its training material in full text. It's one of the most popular book series ever created, of course they would train on it. So "some" is an understatement in this case.

eek2121 · 2026-02-05T22:58:03 1770332283

Honestly? My advice would be to cook something custom up! You don't need to do all the text yourself. Maybe have AI spew out a bunch of text, or take obscure existing text and insert hidden phrases here or there.

Shoot, I'd even go so far as to write a script that takes in a bunch of text, reorganizes sentences, and outputs them in a random order with the secrets. Kind of like a "Where's Waldo?", but for text

Just a few casual thoughts.

I'm actually thinking about coming up with some interesting coding exercises that I can run across all models. I know we already have benchmarks, however some of the recent work I've done has really shown huge weak points in every model I've run them on.

clhodapp · 2026-02-05T23:43:57 1770335037

Having AI spew it might suffer from the fact that the spew itself is influenced by AI's weights. I think your best bet would be to use a new human-authored work that was released after the model's context cutoff.

golfer · 2026-02-05T00:36:12 1770251772

As a site owner, how does one opt out of this, since it obviously ignores robots.txt?

chadwebscraper · 2026-02-05T00:51:01 1770252661

Shoot me your site and I can blacklist it

xnx · 2026-02-05T18:55:18 1770317718

That's what robots.txt is for.

golfer · 2026-02-05T00:30:34 1770251434

Microsoft has access to all the OpenAI/ChatGPT tech. How is their chatbot so awful? Seems like they are trying their hardest to screw this up.

golfer · 2026-02-03T15:59:30 1770134370

Thanks for sharing.

I never got to the actual product/website itself. There were 4 popup dialogs that I had to wade through first, and I bailed. My advice is, don't antagonize or put roadblocks on your potential users before they can even begin to absorb your product.

baljeet_ · 2026-02-03T19:07:53 1770145673

This is actually my first real feedback since launching, and I can't tell you how much I appreciate you taking the time to tell me this instead of just closing the tab.

You're 100% right — 4 popups before seeing the product is ridiculous. I got so caught up building features that I forgot what the first visit actually feels like.

Good news: I just pushed a fix. Disabled the welcome tour completely and the install prompt now only shows after you create an account. Should just be a small cookie notice at the bottom now.

If you get a chance to revisit, I'd genuinely love to hear what you think of the actual product. And honestly, any other feedback — good or bad — is welcome. Solo founder here trying to build something useful, and this kind of input is exactly what I need.

Thanks again

golfer · 2026-02-03T15:26:11 1770132371

Stock is down 22% today after their earnings report.

Title is in reference to this popular blog post: https://dx.tips/gartner

Which was also a topic on HN: https://news.ycombinator.com/item?id=44890012

golfer · 2026-02-03T04:57:56 1770094676

This is definitely interesting and HN-worthy. If nothing else, archive.today links are posted on tons of HN submissions, so it's topical.

dang · 2026-02-03T05:02:21 1770094941

I agree - it's clear that archive.is / archive.ph / archive.today / who-knows-what-else has been a lubricant in many HN threads, letting people read things they otherwise couldn't, and that increases the interest of the topic.

I suppose I should add that we prefer archive.org links when they're available, but often they aren't.

Edit: I suppose I should also re-add that we have no knowledge of or opinion about what's going on in the dispute at hand.

chrisjj · 2026-02-03T11:16:29 1770117389

> we prefer archive.org links when they're available

Interesting. May we know why?

dannyw · 2026-02-03T12:52:54 1770123174

Archive.org is run by a registered nonprofit instead of what’s likely a sole maintainer, who while I personally appreciate, does seem to go a little unhinged sometimes (like the dispute with Cloudflare DNS).

chrisjj · 2026-02-03T14:01:34 1770127294

I assume that answer is not official, since there's nothing more unhinged than archive.org facilitating the page's originator to make alterations after the snapshot.

direwolf20 · 2026-02-03T13:42:22 1770126142

This also makes it susceptible to government pressure. It's easy to get a page taken down from archive.org and it won't archive anything paywalled.

chrisjj · 2026-02-03T14:04:03 1770127443

Government pressure is the least of the problem. Anyone gaining control of a domain can delete all archives of it.

direwolf20 · 2026-02-03T17:52:00 1770141120

Because the government pressures them to obey this

russelg · 2026-02-03T12:10:41 1770120641

Perhaps because the admins of archive.org don't go around DDoSing random blogs I'd reckon.

chrisjj · 2026-02-03T12:39:27 1770122367

Instead they execute source page JS and allow it to doctor the archive copy.

golfer · 2026-01-12T16:47:24 1768236444

The original iPhone came pre-loaded with Google search, Maps, and Youtube. Jobs competed with Google but he also knew Google had best-in-class products too.

golfer · 2026-01-12T16:31:27 1768235487

Is the era of Apple exceptionalism over? Has it been over for a while now?

golfer · 2025-12-30T01:26:58 1767058018

GoT got so bad that I don't really have any desire to watch any of the seasons ever again. Killed rewatchability.

prawn · 2025-12-30T04:37:59 1767069479

I rewatched it in recent weeks and enjoyed all the bits that I enjoyed years ago during the first watch. The stories I found a bit tedious first time (High Sparrow plotline, Arya and faceless men) weren't as miserable; I think I was expecting them to drag on even more. My biggest grievance on the rewatch was just how poorly it's all tied up. I again enjoyed The Long Night through the lens of 'spectacle over military documentary'. The last season just felt like they wrote themselves into a corner and didn't have time and patience to see it through. By that point, actors were ready to move on, etc.

kankerlijer · 2025-12-30T13:22:22 1767100942

I don't really view this as the show runners fault. GRRM was unable to complete his own work. The show worked best when it drew from the authors own material (GRRM was a screenwriter himself and knew how to write great dialog/scenes).

lesuorac · 2025-12-30T14:28:57 1767104937

It's absolutely the producer's fault. They actively choose to release the product they did instead of making more episodes, taking long, bringing other people in to help, etc.

Martin has claimed he flew to HBO to convince them to do 10 seasons of 10 episodes instead of the 8 seasons with just 8 episodes in the final one [1]. It was straight up just D.B. Weiss and David Benioff call how the series ended.

[1]: https://variety.com/2022/tv/news/george-rr-martin-shut-out-g...

machomaster · 2025-12-30T11:36:19 1767094579

Not the actors, the showrunners.

golfer · 2025-12-04T05:12:41 1764825161

Honest question: What could a hardware device do that your phone can't do already?