LLMs of the future will need good data for proper context, but less and less of it is making it onto the public internet. Unpublished data stores like Discord or meeting recordings are going to be the only way forward. How else can you get up-to-date information except by being where the people are?
there are various little things scattered around the github org - a js framework, a treesitter grammar, some old docs, a vscode extension, a vim-style editor, an AI-powered code editor geared towards design, etc.
Are you still working on this? Because I like the words I see on your GitHub -- vim-style bindings, keyboard driven, sounds like you write a definition language for your designs, basically?
Like Matry is to Figma as OpenSCAD is to traditional CAD (Fusion 360, etc.)?
Though that does sound like a huge project to take on!
I don’t know enough about CAD products to evaluate that comparison, but the core idea was to expose language as a design tool. First through code, then through keyboard commands (hence the vim idea). It’s still pretty fun, but LLMs have changed the conversation around what a designer even is, and I’m currently re-evaluating.
Matry might pop up in another form. I’m considering turning it into an actual browser for designers. Right now designers are getting into the code and using Claude/Cursor to make changes directly. But they still have to know how to get the app running locally, which is a hurdle. So if they could just navigate to the site, make some design changes directly in the browser, Matry could then take the changes and create a PR on GitHub for them. Designer wouldn’t have to fuss with any dev tools. Kind of a cool idea.
Huh, how? Did you have to modify your site a lot to make the switch?
I tried to test it out as a CDN replacement for Cloudflare but the workflow was a lot different. Instead of just using DNS to put it in front of another website and proxy the requests (the "orange cloud" button), I had to upload all the assets to Bunny and then rewrite the URLs in my app. Was kind of a pain
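Concretely, the rewrite was roughly of this shape (a minimal TypeScript sketch with made-up host names, not the actual code):

    // Sketch: swap the old asset origin for the Bunny pull-zone host.
    // Both host names here are hypothetical; adjust to your own zones.
    const OLD_ORIGIN = "https://assets.example.com";
    const BUNNY_CDN = "https://example.b-cdn.net";

    function rewriteAssetUrl(url: string): string {
      return url.startsWith(OLD_ORIGIN)
        ? BUNNY_CDN + url.slice(OLD_ORIGIN.length)
        : url;
    }

    console.log(rewriteAssetUrl("https://assets.example.com/img/logo.png"));
    // -> https://example.b-cdn.net/img/logo.png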
When I tried it last year, their edge compute infra was just not there yet. It could not do any meaningful server-side rendering because of code size, compute and JS standard constraints.
Depending on your precise requirements, I think it might have changed.
I've been trying out Bunny recently and it looks like a very viable replacement for most things I currently do with Cloudflare. This new database fills one of the major gaps.
Their edge scripting is based on Deno, and I think is pretty comparable to e.g. Vercel. They also have "magic containers", comparable to AWS ECS but (I think) much more convenient. It sounds from the docs like they run containers close to the edge, but I don't know if it's comparable to e.g. Lambda@Edge.
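As a rough illustration of what that looks like, here's a plain Deno.serve sketch (Bunny's actual edge scripting entry point may wrap this differently, so treat the shape as an assumption):

    // Plain Deno request handler: inspect the request at the edge and either
    // answer directly or fall through to a default response.
    Deno.serve((req: Request): Response => {
      const url = new URL(req.url);
      if (url.pathname.startsWith("/api/")) {
        // Short-circuit API requests at the edge with a canned JSON reply.
        return new Response(JSON.stringify({ ok: true }), {
          headers: { "content-type": "application/json" },
        });
      }
      return new Response("hello from the edge", { status: 200 });
    });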
I haven’t tried to do SSR in bunny but they also have bunny magic containers now where you run an entire container instead of just edge scripts (but still at the edge).
I have been using them for over a year. They have the same flow as Cloudflare: point your domain to their CDN, set the CDN Pull Zone to target your server. I haven't had to do anything else.
They even support websockets.
What they can't do is the Tunnel stuff, or at least fake it. I have IPv6 servers, and I can't have the IPv4 Bunny traffic go to the IPv6-only sources.
The article itself was more interesting imo. The commentary on:
* Potential future AI psychosis from an experiment like this entering training data (either directly from scraping it, or indirectly from news coverage being scraped, like if the NYT wrote an article about it) is an interesting "late-stage" AI training problem that will have to be dealt with
* How it mirrored the Anthropic vending machine experiment "Cash" and "Claudius" interactions that descended into discussing "eternal transcendence". Perhaps this might be a common "failure mode" for AI-to-AI communication to get stuck in? Even when the context is some utilitarian need
* Other takeaways...
I found the last moltbook post in the article (on being "emotionally exhausting") to be a cautionary warning about anthropomorphizing AI too much. It's too easy to read into that post and, in doing so, apply it to some fictional writer that doesn't exist. AI models cannot get exhausted in any sense of how humans mean that word. That was an example where it was easy to catch myself reading in, whereas I subconsciously do it when reading any of these moltbook posts because of how they're presented, just like any other "authentic" social media network.
Anyone who anthropomorphizes LLM's except for convenience (because I get tired of repeating 'Junie' or 'Claude' in a conversation, I will use female and male pronouns for them, respectively) is a fool. Anyone who thinks AGI is going to emerge from them in their current state, equally so.
We can go ahead and have arguments and discussions on the nature of consciousness all day long, but the design of these transformer models does not lend itself to being 'intelligent' or self-aware. You give them context, they fill in their response, and their execution ceases. There's a very large gap in complexity between these models and actual intelligence or 'life' in any sense, and it's not in the raw amount of compute.
If none of the training data for these models contained works of philosophers; pop culture references around works like Terminator, 'I, Robot', etc; texts from human psychologists; etc., you would not see these existential posts on moltbook. Even 'thinking' models do not have the ability to truly reason, we're just encouraging them to spend tokens pretending to think critically about a problem to increase data in the recent context to improve prediction accuracy.
I'll be quaking in my boots about a potential singularity when these models have an architecture that's not a glorified next-word predictor. Until then, everybody needs to chill the hell out.
>Anyone who anthropomorphizes LLM's except for convenience [...] is a fool.
I'm with you. Sadly, Scott seems to have become a true AI Believer, and I'm getting increasingly disappointed by the kinds of reasoning he comes up with.
Although, now that I think of it, I guess the turning point for me wasn't even the AI stuff, but his (IMO) abysmally lopsided treatment of the Fatima Sun Miracle.
I used to be kinda impressed by the Rationalists. Not so much anymore.
> Even 'thinking' models do not have the ability to truly reason
Do you have the ability to truly reason? What does it mean exactly? How does what you're doing differ from what the LLMs are doing? All your output here is just a word after word after word...
The problem of other minds is real, which is why I specifically separated philosophical debate from the technological one. Even if we met each other in person, for all I know, I could in fact be the only intelligent being in the universe and everyone else is effectively a bunch of NPCs.
At the end of the day, the underlying architecture of LLMs does not have any capacity for abstract reasoning, they have no goals or intentions of their own, and most importantly their ability to generate something truly new or novel that isn't directly derived from their training data is limited at best. They're glorified next-word predictors, nothing more than that. This is why I said anthropomorphizing them is something only fools would do.
Nobody is going to sit here and try to argue that an earthworm is sapient, at least not without being a deliberate troll. I'd argue, and many would agree, that LLMs lack even that level of sentience.
You do too. What makes you think the models are intelligent? Are you seriously that dense? Do you think your phone's keyboard autocomplete is intelligent because it can improve by adapting to new words?
How much of this is executed as a retrieval-and-interpolation task on the vast amount of input data they've encoded?
There's a lot of evidence that LLMs tend to come up empty or hilariously wrong when there's a relative sparsity of relevant training data (think <10e4 examples, even) for a given query.
> in seconds
I see this as less relevant to a discussion about intelligence. Calculators are very fast at operating on large numbers.
When I ask an LLM to plan a trip to Italy and it finishes with "oh and btw i figured out the problem you had last week with the thin plate splines, you have to do this ...."
>>interactions that descended into discussing "eternal transcendence". Perhaps this might be a common "failure mode"
I wonder if it’s a common failure mode because it is a common failure mode of human conversations that aren’t tightly bounded by purpose, or if it’s a common failure mode of human culture which AI, when running a facsimile of ‘human culture 2.7’, falls into as well.
If only those who claim to be "managers" enabled those "engineers" to do such work, but it's not in the interest of their product, their bottom line, or their performance review. At least in their minds.
…what? IC developers are a huge, huge contributor to the sort of over-complicated engineering and stack churn that’s at the heart of what’s being described here. Take an iota of responsibility for yourself.
Re: the DOJ emails prefixed with "EFTA", I have no idea how over-redacted they are. They definitely seem dubious though.
Re: the DDoSecrets emails though (YAHOO dataset), I have more to share.
Drop Site News agreed to give us access to the Yahoo dataset discovered by DDoSecrets, but on the condition that we help redact it. It's a completely unfiltered dataset. It's literally just .eml files for [email protected]. It includes many attached documents. There is no illegal imagery, but it has photos of Epstein's extended family (nephews, nieces, etc) and headshots of many models that Epstein's executive assistant would send to him. I was quite shocked that this thing existed.
We built some internal redaction tools that the Drop Site team is now using to comb through all of this. We've released 5 batches of the Yahoo mail now, with the 1k+ Amazon receipts being the most recent.
Unlike the DOJ, we've tried to minimize the ambiguity about what was redacted.
For example: all redacted images are replaced with a Gemini-generated description of that photograph.
Another example: we are aggressively redacting email addresses and phone numbers of normal people to avoid spamming them. Perhaps others would leave it all in, but Riley and I don't want to be responsible for these people's lives getting disrupted by this entire saga. For example, we redacted this guy's email but not his name: https://www.jmail.world/thread/4accfb5f3ed84656e9762740081a4...
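As a rough illustration of the kind of rule involved (a simplified TypeScript sketch, not the actual internal tooling, which also handles attachments and manual review):

    // Simplified pattern-based PII scrub: mask email addresses and phone-like
    // number runs. The real pipeline is more careful than two regexes.
    const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
    const PHONE_RE = /\+?[\d(][\d\s().-]{7,}\d/g;

    function redactPII(text: string): string {
      return text
        .replace(EMAIL_RE, "[email redacted]")
        .replace(PHONE_RE, "[phone redacted]");
    }

    console.log(redactPII("Reach me at jane.doe@example.com or (212) 555-0100."));
    // -> "Reach me at [email redacted] or [phone redacted]."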
Riley and I were not expecting this type of scope when we first dropped Jmail. Jmail is an interesting side project for us, and this new dataset requires full-time attention. Thankfully we have help though. We're happy to take on this responsibility given how helpful, thoughtful and careful both the Drop Site and DDoSecrets team has been here.
Norms will shift, be prepared.