LLMs of the future will need good data for proper context, but less and less of it is making it onto the public internet. Unpublished data stores like Discord or meeting recordings are going to be the only way forward. How else can you get up-to-date information except by being where the people are?
there are various little things scattered around the github org - a js framework, a treesitter grammar, some old docs, a vscode extension, a vim-style editor, an AI-powered code editor geared towards design, etc.
Are you still working on this? Because I like the words I see on your GitHub -- vim-style bindings, keyboard driven, sounds like you write a definition language for your designs, basically?
Like Matry is to Figma as OpenSCAD is to traditional CAD (Fusion 360, etc.)?
Though that does sound like a huge project to take on!
I don’t know enough about CAD products to evaluate that comparison, but the core idea was to expose language as a design tool. First through code, then through keyboard commands (hence the vim idea). It’s still pretty fun, but LLMs have changed the conversation around what a designer even is, and I’m currently re-evaluating.
Matry might pop up in another form. I’m considering turning it into an actual browser for designers. Right now designers are getting into the code and using Claude/Cursor to make changes directly. But they still have to know how to get the app running locally, which is a hurdle. So if they could just navigate to the site, make some design changes directly in the browser, Matry could then take the changes and create a PR on GitHub for them. Designer wouldn’t have to fuss with any dev tools. Kind of a cool idea.
Huh, how? Did you have to modify your site a lot to make the switch?
I tried to test it out as a CDN replacement for Cloudflare but the workflow was a lot different. Instead of just using DNS to put it in front of another website and proxy the requests (the "orange cloud" button), I had to upload all the assets to Bunny and then rewrite the URLs in my app. Was kind of a pain
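Concretely, the rewrite was roughly of this shape (a minimal TypeScript sketch with made-up host names, not the actual code):

    // Sketch: swap the old asset origin for the Bunny pull-zone host.
    // Both host names here are hypothetical; adjust to your own zones.
    const OLD_ORIGIN = "https://assets.example.com";
    const BUNNY_CDN = "https://example.b-cdn.net";

    function rewriteAssetUrl(url: string): string {
      return url.startsWith(OLD_ORIGIN)
        ? BUNNY_CDN + url.slice(OLD_ORIGIN.length)
        : url;
    }

    console.log(rewriteAssetUrl("https://assets.example.com/img/logo.png"));
    // -> https://example.b-cdn.net/img/logo.png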
When I tried it last year, their edge compute infra was just not there yet. It could not do any meaningful server-side rendering because of code size, compute and JS standard constraints.
Depending on your precise requirements, I think it might have changed.
I've been trying out Bunny recently and it looks like a very viable replacement for most things I currently do with Cloudflare. This new database fills one of the major gaps.
Their edge scripting is based on Deno, and I think is pretty comparable to e.g. Vercel. They also have "magic containers", comparable to AWS ECS but (I think) much more convenient. It sounds from the docs like they run containers close to the edge, but I don't know if it's comparable to e.g. Lambda@Edge.
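As a rough illustration of what that looks like, here's a plain Deno.serve sketch (Bunny's actual edge scripting entry point may wrap this differently, so treat the shape as an assumption):

    // Plain Deno request handler: inspect the request at the edge and either
    // answer directly or fall through to a default response.
    Deno.serve((req: Request): Response => {
      const url = new URL(req.url);
      if (url.pathname.startsWith("/api/")) {
        // Short-circuit API requests at the edge with a canned JSON reply.
        return new Response(JSON.stringify({ ok: true }), {
          headers: { "content-type": "application/json" },
        });
      }
      return new Response("hello from the edge", { status: 200 });
    });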
I haven’t tried to do SSR in bunny but they also have bunny magic containers now where you run an entire container instead of just edge scripts (but still at the edge).
I have been using them for over a year. They have the same flow as Cloudflare: point your domain to their CDN, set the CDN Pull Zone to target your server. I haven't had to do anything else.
They even support websockets.
What they can't do is the Tunnel stuff, or at least fake it. I have IPv6 servers, and I can't have the IPv4 Bunny traffic go to the IPv6-only sources.
The article itself was more interesting imo. The commentary on:
* Potential future AI psychosis from an experiment like this entering training data (either directly from scraping it, or indirectly from news coverage being scraped, like if the NYT wrote an article about it) is an interesting "late-stage" AI training problem that will have to be dealt with
* How it mirrored the Anthropic vending machine experiment "Cash" and "Claudius" interactions that descended into discussing "eternal transcendence". Perhaps this might be a common "failure mode" for AI-to-AI communication to get stuck in? Even when the context is some utilitarian need
* Other takeaways...
I found the last moltbook post in the article (on being "emotionally exhausting") to be a cautionary warning about anthropomorphizing AI too much. It's too easy to read into that post and, in doing so, apply it to some fictional writer that doesn't exist. AI models cannot get exhausted in any sense of how humans mean that word. That was an example where it was easy to catch myself reading in, whereas I subconsciously do it when reading any of these moltbook posts because of how they're presented, just like any other "authentic" social media network.
Anyone who anthropomorphizes LLM's except for convenience (because I get tired of repeating 'Junie' or 'Claude' in a conversation, I will use female and male pronouns for them, respectively) is a fool. Anyone who thinks AGI is going to emerge from them in their current state, equally so.
We can go ahead and have arguments and discussions on the nature of consciousness all day long, but the design of these transformer models does not lend itself to being 'intelligent' or self-aware. You give them context, they fill in their response, and their execution ceases. There's a very large gap in complexity between these models and actual intelligence or 'life' in any sense, and it's not in the raw amount of compute.
If none of the training data for these models contained works of philosophers; pop culture references around works like Terminator, 'I, Robot', etc; texts from human psychologists; etc., you would not see these existential posts on moltbook. Even 'thinking' models do not have the ability to truly reason, we're just encouraging them to spend tokens pretending to think critically about a problem to increase data in the recent context to improve prediction accuracy.
I'll be quaking in my boots about a potential singularity when these models have an architecture that's not a glorified next-word predictor. Until then, everybody needs to chill the hell out.
>Anyone who anthropomorphizes LLM's except for convenience [...] is a fool.
I'm with you. Sadly, Scott seems to have become a true AI Believer, and I'm getting increasingly disappointed by the kinds of reasoning he comes up with.
Although, now that I think of it, I guess the turning point for me wasn't even the AI stuff, but his (IMO) abysmally lopsided treatment of the Fatima Sun Miracle.
I used to be kinda impressed by the Rationalists. Not so much anymore.
> Even 'thinking' models do not have the ability to truly reason
Do you have the ability to truly reason? What does it mean exactly? How does what you're doing differ from what the LLMs are doing? All your output here is just a word after word after word...
The problem of other minds is real, which is why I specifically separated philosophical debate from the technological one. Even if we met each other in person, for all I know, I could in fact be the only intelligent being in the universe and everyone else is effectively a bunch of NPCs.
At the end of the day, the underlying architecture of LLMs does not have any capacity for abstract reasoning, they have no goals or intentions of their own, and most importantly their ability to generate something truly new or novel that isn't directly derived from their training data is limited at best. They're glorified next-word predictors, nothing more than that. This is why I said anthropomorphizing them is something only fools would do.
Nobody is going to sit here and try to argue that an earthworm is sapient, at least not without being a deliberate troll. I'd argue, and many would agree, that LLMs lack even that level of sentience.
You do too. What makes you think the models are intelligent? Are you seriously that dense? Do you think your phone's keyboard autocomplete is intelligent because it can improve by adapting to new words?
How much of this is executed as a retrieval-and-interpolation task on the vast amount of input data they've encoded?
There's a lot of evidence that LLMs tend to come up empty or hilariously wrong when there's a relative sparsity of relevant training data (think <10e4 examples, even) for a given query.
> in seconds
I see this as less relevant to a discussion about intelligence. Calculators are very fast at operating on large numbers.
When I ask an LLM to plan a trip to Italy and it finishes with "oh and btw i figured out the problem you had last week with the thin plate splines, you have to do this ...."
>>interactions that descended into discussing "eternal transcendence". Perhaps this might be a common "failure mode"
I wonder if it’s a common failure mode because it is a common failure mode of human conversations that aren’t tightly bounded by purpose, or if it’s a common failure mode of human culture which AI, when running a facsimile of ‘human culture 2.7’, falls into as well.
If only those who claim to be "managers" enabled those "engineers" to do such work, but it's not in the interest of their product, their bottom line, or their performance review. At least in their minds.
…what? IC developers are a huge, huge contributor to the sort of over-complicated engineering and stack churn that’s at the heart of what’s being described here. Take an iota of responsibility for yourself.
Re: the DOJ emails prefixed with "EFTA", I have no idea how over-redacted they are. They definitely seem dubious though.
Re: the DDoSecrets emails though (YAHOO dataset), I have more to share.
Drop Site News agreed to give us access to the Yahoo dataset discovered by DDoSecrets, but on the condition that we help redact it. It's a completely unfiltered dataset. It's literally just .eml files for [email protected]. It includes many attached documents. There is no illegal imagery, but it has photos of Epstein's extended family (nephews, nieces, etc) and headshots of many models that Epstein's executive assistant would send to him. I was quite shocked that this thing existed.
We built some internal redaction tools that the Drop Site team is now using to comb through all of this. We've released 5 batches of the Yahoo mail now, with the 1k+ Amazon receipts being the most recent.
Unlike the DOJ, we've tried to minimize the ambiguity about what was redacted.
For example: all redacted images are replaced with a Gemini-generated description of that photograph.
Another example: we are aggressively redacting email addresses and phone numbers of normal people to avoid spamming them. Perhaps others would leave it all in, but Riley and I don't want to be responsible for these people's lives getting disrupted by this entire saga. For example, we redacted this guy's email but not his name: https://www.jmail.world/thread/4accfb5f3ed84656e9762740081a4...
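As a rough illustration of the kind of rule involved (a simplified TypeScript sketch, not the actual internal tooling, which also handles attachments and manual review):

    // Simplified pattern-based PII scrub: mask email addresses and phone-like
    // number runs. The real pipeline is more careful than two regexes.
    const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
    const PHONE_RE = /\+?[\d(][\d\s().-]{7,}\d/g;

    function redactPII(text: string): string {
      return text
        .replace(EMAIL_RE, "[email redacted]")
        .replace(PHONE_RE, "[phone redacted]");
    }

    console.log(redactPII("Reach me at jane.doe@example.com or (212) 555-0100."));
    // -> "Reach me at [email redacted] or [phone redacted]."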
Riley and I were not expecting this type of scope when we first dropped Jmail. Jmail is an interesting side project for us, and this new dataset requires full-time attention. Thankfully we have help though. We're happy to take on this responsibility given how helpful, thoughtful and careful both the Drop Site and DDoSecrets team has been here.
Norms will shift, be prepared.