darknoon's comments | Hacker News

I haven't seen one that worked properly—can you list a couple examples? Some of the ones that say they're "AI" are just VTracer / Potrace and don't give nice control points.

I liked the results of vectorizer.ai and recraft.ai

Input image is important too. When working with a generalist LLM on the raster art, give it context that you're making a logo: direct it to use strokes and fills and a minimal color palette, keep it readable at small sizes, etc.


vectorizer.ai is amazing. It's worked great for over 10 years (back when it was called Vector Magic or something like that). I'm super curious how it's implemented.

I think you'd find that it's far from "any human" who can do this without looking anything up. I have 15 years of dev experience and couldn't do this from memory on the CLI. Maybe in C, but that's less helpful for getting stuff done!

  # curl -s https://upload.wikimedia.org/wikipedia/commons/6/61/Sun.png | file -
  /dev/stdin: PNG image data, 256 x 256, 8-bit/color RGBA, non-interlaced
That's it, two utilities almost everybody has installed.
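
For the curious, file is just matching magic bytes at the start of the stream against its signature database. A minimal sketch of the same check done by hand, using the same URL plus two more common utilities (head, xxd):

  $ curl -s https://upload.wikimedia.org/wikipedia/commons/6/61/Sun.png | head -c 8 | xxd
  00000000: 8950 4e47 0d0a 1a0a                      .PNG....

Every PNG starts with those eight signature bytes (89 50 4E 47 0D 0A 1A 0A); JPEG starts with FF D8 FF, GIF with "GIF87a" or "GIF89a", and so on.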

ChatGPT has 800 million monthly users. The fraction of those who are comfortable opening a terminal and running those commands is pretty tiny.

If 800m people think delegating thinking to a slop generator is fine, that's not my loss. It's bad for humanity, but who even cares anymore in 2026, right?

"Delegating thinking" and "figuring out how to determine an image format from the first few bytes of a file" are not the same thing.

I disagree; in my opinion it's the exact same process, just on a much smaller scale. It's a problem, and we humans are good at solving problems. That is, until LLMs arrived; now we're supposed to become good at prompting, or something.

I used ffmpeg and yt-dlp to make an animated GIF of a kākāpō in her nest from a livestream on YouTube the other day. https://simonwillison.net/2026/Jan/25/kakapo-cam/

Much as I love kākāpō there is no way I was going to invest more than a few minutes in figuring out how to do that.

I love this new world where I can "delegate my thinking" to a computer and get a GIF of a dumpy New Zealand flightless parrot where I would otherwise be unable to do so because I didn't have the time to figure it out.

(I published it as a looping MP4 because that was smaller than the GIF, another thing I didn't have to figure out how to do myself.)
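
These aren't Simon's exact commands, just a rough sketch of the shape of that workflow; the stream URL, timestamps, and filenames are placeholders:

  # record (a window of) the livestream; $STREAM_URL is a placeholder
  yt-dlp -o kakapo-source.mp4 "$STREAM_URL"
  # cut an 8-second clip and turn it into a GIF
  ffmpeg -ss 00:01:00 -t 8 -i kakapo-source.mp4 -vf "fps=12,scale=480:-1" kakapo.gif
  # or a smaller looping-friendly MP4 (the looping itself comes from the
  # page's video loop attribute, not from the file)
  ffmpeg -ss 00:01:00 -t 8 -i kakapo-source.mp4 -an -pix_fmt yuv420p -movflags +faststart kakapo.mp4

The GIF vs. MP4 size difference falls out of the formats: GIF is palette-based (256 colors) and stores every frame intra-only, so an H.264 MP4 of the same clip is almost always much smaller.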


I agree that your project is cool; I just don't think the numerous downsides are worth the occasional cool thing like this.

Yes, but now do the same for every bit of programming tooling, every sysadmin configuration or debugging problem, and every concept out there, with just a few seconds to answer each reply.

It's called learning, and it used to be the hacker mindset to continuously improve. But I guess that died with slop generators.

Really weird graph: they're comparing to 3x H100 PCIe, which is a config I don't think anyone is using.

they're trying to compare at iso-power? I just want to see their box vs. a box of 8 H100s, because that's what people would buy instead; they can divide tokens by watts if that's the pitch.


> they're trying to compare at iso-power?

Yeah, they're defining a "rack" as 15 kW, though 3x H100 PCIe is only a bit over 1 kW (3 × 350 W TDP). So they're assuming GPUs are less than 10% of rack power usage, which sounds suspiciously low.


It would also depend on the purchase cost and cooling infrastructure cost. If this costs what a 3x H100 box costs then it’s a fair comparison even if not a direct comparison to what customers currently buy.


What's a more realistic config?


8x GPUs per box; this has been the data-center standard for roughly the last 8 years.

Furthermore, they're usually NVLink-connected within the box (SXM modules instead of PCIe cards, although the link to the host is still PCIe).

This is important because the daughterboard provides PCIe switches, which usually connect NVMe drives, NICs, and GPUs together such that within that subcomplex there isn't any PCIe oversubscription.

Since last year, I'd argue, the standard for a lot of providers is the GB200.


Fascinating! So each GPU is partnered with disk and NICs such that there's no bandwidth oversubscription within its "slice" (I don't know what the right word is)? And each of these 8 slices wires up over NVLink back to the host?

Feels like there's some amount of (software) orchestration for making data sit on the right drives or traverse the right NICs; I guess I never really thought about the complexity at this kind of scale.

I googled GB200; it's cool that Nvidia sells you a unit rather than expecting you to DIY a PC yourself.


Usually it's 2-2-2 (2 GPUs, 2 NICs, and 2 NVMe drives on a PCIe complex). No NVLink here, this is just PCIe: under the PCIe switch chip there is full bandwidth; above it, bandwidth is usually limited. So, for example, going GPU-to-GPU over PCIe will walk

GPU -> PCIe switch -> PCIe switch (most likely the CPU, with limited bw) -> PCIe switch -> GPU

NVLink comes into the picture as a separate, 2nd link between the GPUs: if you need to do GPU-to-GPU, you can use NVLink.

You never needed to DIY your stuff, at least not for the last 10 years: most hardware vendors (Supermicro, Dell, ...) will sell you a complete system with 8 GPUs.

What's nice about GH200/GBx00/VR systems is that you can use chip-to-chip NVLink between the CPU and GPU, so the CPU can access GPU memory coherently and vice versa.
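
If you want to see this topology on an actual box, "nvidia-smi topo -m" prints the interconnect matrix. The output below is an illustrative sketch (two GPUs and two NICs, values made up), not a capture from a specific system:

  $ nvidia-smi topo -m
         GPU0   GPU1   NIC0   NIC1
  GPU0    X     NV18   PIX    SYS
  GPU1   NV18    X     SYS    PIX
  NIC0   PIX    SYS     X     SYS
  NIC1   SYS    PIX    SYS     X
  # NV#  = connected via a bonded set of # NVLinks
  # PIX  = at most one PCIe bridge in the path (same switch complex)
  # PHB  = path crosses the PCIe host bridge, i.e. goes through the CPU
  # SYS  = path crosses the inter-socket / NUMA interconnect

A GPU and the NIC or NVMe under the same switch show up as PIX (full bandwidth); anything that has to go through the CPU or across sockets shows up as PHB/SYS, which is where the limited bandwidth above the switch comes in.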


Here's the problem: you're still going to get scraped, and the LLM will understand it anyway. Maybe at best you'll get filtered out of the dataset because it's high-perplexity text?


Do training scrapers really feed all their input through an LLM to decode it? That sounds expensive and most content probably doesn't need that. If they don't, then this method probably works to keep your stuff out of the training datasets.


They don't need to decode it first; it can be passed in directly.


Is the problem with scraping the bandwidth usage or the stealing of the content? The point here doesn’t seem to be obfuscation from direct LLM inference (I mean, I use Shottr on my MacBook to immediately OCR my screenshots) but rather stopping you from ending up in the dataset.

Is there a reason you believe getting filtered out is only a "maybe"? Not getting filtered out would seem to imply that LLM training can naturally extract meaning from obfuscated tokens. If that's the case, LLMs are more impressive than I thought.


The developers also gave a talk about Helion on GPU Mode: https://www.youtube.com/watch?v=1zKvCLuvUYc


Here's the thing: they've completely given up and started making their (inferior to AMD) CPUs on TSMC. For example, Arrow Lake is on TSMC N3B. So it's not getting amortized over anything at all, and their valuation is going to 0.


It's OK: somewhere between Qwen 2.5 VL and the frontier models (o3 / Opus 4) on visual reasoning.


In ML, it often does work to a degree even if it's not 100% correct. So getting it working at all is mostly about hacking, because most ideas are bad and don't work. Then you find wins by incrementally correcting issues with the math, data, floating-point precision, etc.


If you were doing a lot of scraping, you could just solve this on a GPU in 1/10 or less of the time it takes a human's phone to do it. Generally you need a decent computer to render a webpage while scraping it these days, so I don't see what this is solving.


Scrapers usually don't render the webpage; otherwise their scraping wouldn't be efficient at all.


Is that still true? There are so many SPAs out there now that if I were to create a web spider today, I would plan to just render a lot of the pages in a browser rather than fight the status quo. Efficiency wouldn't be my top concern.
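
For what it's worth, headless rendering is a one-liner these days; a minimal sketch assuming a local Chromium install (the URL is a placeholder):

  # load the page, execute its JavaScript, and dump the resulting DOM
  chromium --headless --dump-dom "https://example.com/some-spa" > page.html

The parent's efficiency point still stands, though: doing this at scale costs far more CPU and memory per page than fetching raw HTML.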


That’s bullshit


Anyone know why they mix in the 3 previous tokens? They could have just as easily done 5 or 2, right?

