Hacker News | jellyotsiro's comments

sick


RAG is not the core! We use semantic search, but combine it with FTS, grep, direct reads, etc.


Open source models with minimal safety fine tuning or Grok


Grok is arguably not uncensored, it’s re-aligned to a specific narrative lane.

“Uncensored” is simply a branding trick that a lot of seemingly intelligent people seem to fall for.


Wait, is abliteration actually just a branding trick? That doesn't sound correct.


My post was sociological - yours is technical.

Are you deliberately not having the same conversation, or are you confused as to whether or not I think abliteration itself is marketing?


Saying Grok is uncensored is like saying that DeepSeek is uncensored. If anything, DeepSeek is probably less censored than Grok. The Dolphin family has given me the best results, though mostly in niche cases.


Thanks for testing this. The Bannon email from June 30, 2019 is in there (HOUSE_OVERSIGHT_029622). Good stress test idea.

Couple things happening:

- Semantic search limitation: less-famous names don't have strong embeddings, so it defaults to general connections rather than specific mentions.
- Keyword search gap: you're right, raw grep can catch exact names I'm missing.


I saw a similar problem. Roger Schank had some conversations with Epstein, and the emails can be seen on Epsteinvisualizer.com, but your site claimed there were no emails or connection. To be fair to Roger, who was an AI legend of his time and someone I knew personally before his untimely death, he really was not a pedo, and most likely never got involved with the girls. I think he and Epstein just talked about AI and education mostly.


Yes! Once more files come out, I will add them right away.


Shareable conversations would definitely make the tool more useful, yeah. I really like the query parameter approach over UUIDs, since it would make links human-readable.
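
A rough sketch of what the query parameter approach could look like, assuming a link only needs to carry the initial question. The base URL and the `q` parameter name are made up for illustration, not what the site actually uses:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; the real site URL and parameter name may differ.
BASE_URL = "https://example.com/search"


def make_share_link(query: str) -> str:
    """Encode the question into the URL so the link stays human-readable."""
    return f"{BASE_URL}?{urlencode({'q': query})}"


def read_share_link(url: str) -> str:
    """Recover the question from a shared link (empty string if missing)."""
    params = parse_qs(urlsplit(url).query)
    return params.get("q", [""])[0]


link = make_share_link("Bannon email June 2019")
print(link)
print(read_share_link(link))
```

The tradeoff versus UUIDs is that the link re-runs the question rather than pointing at a stored conversation, so the answer may change as the index changes.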


On the limited dataset: Completely agree - the public files are a fraction of what exists and I should have mentioned that it is not all files but all publicly available ones. But that's exactly why making even this subset searchable matters. The bar right now is people manually ctrl+F-ing through PDFs or relying on secondhand claims. This at least lets anyone verify what is public.

On LLMs vs traditional NLP: I hear you, and I've seen similar issues with LLM hallucination on structured data. That's why the architecture here is hybrid:

- Traditional exact regex/grep search for names, dates, identifiers
- Vector search for semantic queries
- LLM orchestration layer that must cite sources and can't generate answers without grounding
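
To give a feel for the first two layers, here is a toy sketch, not the production code. The document IDs and text are invented, and a bag-of-words cosine score stands in for real embeddings so the example runs without any external model:

```python
import math
import re
from collections import Counter

# Invented sample corpus; IDs and contents are placeholders.
DOCS = {
    "doc-001": "Email from Alice Example about the flight schedule on 2019-06-30.",
    "doc-002": "Notes on an education and AI research meeting.",
    "doc-003": "Travel itinerary and plane booking details.",
}


def exact_search(pattern, docs):
    """Grep-style pass: return IDs of docs matching the regex exactly."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [doc_id for doc_id, text in docs.items() if rx.search(text)]


def _vec(text):
    # Stand-in for an embedding: a bag-of-words term count.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def vector_search(query, docs, k=2):
    """Similarity pass: top-k docs with a non-zero score against the query."""
    q = _vec(query)
    scored = sorted(((_cosine(q, _vec(t)), d) for d, t in docs.items()), reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def hybrid_search(query, docs, exact_pattern=None):
    """Exact hits first, then similarity hits that are not already included."""
    hits = exact_search(exact_pattern, docs) if exact_pattern else []
    for doc_id in vector_search(query, docs):
        if doc_id not in hits:
            hits.append(doc_id)
    return hits


print(hybrid_search("AI education meeting", DOCS, exact_pattern=r"Alice Example"))
```

Putting exact matches ahead of similarity matches is one possible ordering; it means a rarely mentioned name still surfaces even when the similarity score for it is weak.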


> can't generate answers without grounding

"can't" seems like quite a strong claim. Would you care to elaborate?

I can see how one might use a JSON schema that enforces source references in the output, but there is no technique I'm aware of to constrain a model to only come up with data based on the grounding docs, vs. making up a response based on pretrained data (or hallucinating one) and still listing the provided RAG results as attached reference.

It feels like your "can't" would be tantamount to having single-handedly solved the problem of hallucinations, which if you did, would be a billion-dollar-plus unlock for you, so I'm unsure you should show that level of certainty.
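
To make the distinction concrete, here is a minimal sketch of the kind of check I mean. The answer shape (`claims`, `source_id`, `quote`) and the sample source are hypothetical, not taken from the tool:

```python
# Hypothetical retrieved source; ID and text are placeholders.
SOURCES = {"doc-001": "The meeting was moved to Tuesday at noon."}


def check_answer(answer: dict, sources: dict) -> list:
    """Return a list of problems; an empty list means every citation resolves."""
    problems = []
    claims = answer.get("claims", [])
    if not claims:
        problems.append("no claims")
    for i, claim in enumerate(claims):
        src = claim.get("source_id")
        quote = claim.get("quote", "")
        if src not in sources:
            problems.append(f"claim {i}: unknown source {src!r}")
        elif not quote or quote not in sources[src]:
            # The quoted span must appear verbatim in the cited document.
            problems.append(f"claim {i}: quote not found in {src}")
    return problems


fabricated = {"claims": [{"text": "The meeting was cancelled.",
                          "source_id": "doc-001", "quote": "was cancelled"}]}
mismatched = {"claims": [{"text": "The meeting was cancelled.",
                          "source_id": "doc-001", "quote": "moved to Tuesday"}]}

print(check_answer(fabricated, SOURCES))  # flagged: the quote is not in the source
print(check_answer(mismatched, SOURCES))  # passes, although the claim is unsupported
```

The second case is the whole problem: the structure is valid and the quote is real, yet the claim itself is made up. A check like this can enforce that citations exist, not that the answer follows from them.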


sorry all publicly available files *


it uses semantic search so yes


Trump famously told New York Magazine in 2002: "I've known Jeff for 15 years. Terrific guy. He's a lot of fun to be with. It is even said that he likes beautiful women as much as I do, and many of them are on the younger side."

Trump and Epstein were social acquaintances in Palm Beach and New York circles during the 1990s and early 2000s. They socialized together at Mar-a-Lago and other venues.


Interesting. It is my impression that almost everyone globally already knew this. What else did you learn?


I'll take like 1 hour in the evening to dive deeper. I was never familiar with the Epstein stuff until I built the agent to simplify things for me.


It's peak HN to whip out an LLM instead of just reading a newspaper article or two.


Better not choose the wrong two.


If you want to understand these people, watch the Daily Beast podcast "Inside Trump's Head" with Michael Wolff. It's a little slow but will paint the picture of their motivations, friendship, falling out, etc etc.


This is one of the most widely quoted phrases by Trump on the topic of Epstein.

