Cool idea. Just some back-of-the-envelope math here (not trusting what's on thei...

mavamaarten · 2026-04-16T05:24:21 1776317061

Well. Running your machine to do inference will draw more than 50W under sustained load, I'd say more than double that. Plus electricity is more expensive here (but granted, I do have solar panels). Plus don't forget to factor in that your hardware will age faster.

I'd say it's not worth it. But the idea is cool.

jorvi · 2026-04-16T08:19:08 1776327548

Your hardware will age slower if you have consistent load.

Thermal stress from bursty workloads is much more of a wearing problem than electromigration. If you can consistently keep the SoC at a specific temperature, it'll last much longer.

This is also why it was very ironic that crypto miner GPUs would get sold at massive discounts. Everyone assumed that they had been run ragged, but a proper miner would have undervolted the card and run it at consistent utilization, meaning the card would be in better condition than a secondhand gamer GPU that would have constantly been shifting between 1% and 80% utilization, or rather, 30°C and 75°C.

kennywinker · 2026-04-16T05:42:40 1776318160

Their estimate is based on significantly lower consumption when under load. E.g. 25W for an M4 Pro mac mini. I have no idea if that’s realistic - but the m4s are supposedly pretty efficient (https://www.jeffgeerling.com/blog/2024/m4-mac-minis-efficien...)

kennywinker · 2026-04-16T05:35:36 1776317736

Their example big earner models are FLUX.2 Klein 4B and FLUX.2 Klein 9B, which i imagine could generate a lot more tokens/s than a 26B model on your machine.

For Gemma 4 26B their math is:

single_tok/s = (307 GB/s / 4 GB) * 0.60 = 46.0 tok/s

batched_tok/s = 46.0 * 10 * 0.9 = 414.4 tok/s

tok/hr = 414.4 * 3600 = 1,492,020

revenue/hr = (1,492,020 / 1M) * $0.200000 = $0.2984

I have no idea if that is a good estimate of how much an M5 Pro can generate - but that’s what it says on their site.
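For anyone who wants to poke at those numbers, here is a minimal sketch that just replays the arithmetic quoted above. Every input is a figure from the quoted lines, not a measurement, and the constant names are mine. Reading the "4 GB" divisor as data read per token is my assumption. Note that the quoted lines show rounded intermediates: carried at full precision, the single-stream figure is 46.05 tok/s and the batched figure is 414.45 tok/s, which is what makes the 1,492,020 tok/hr total come out.

```python
# Replays the quoted estimate. Inputs are the figures from the comment,
# not benchmarks; names are illustrative.
BANDWIDTH_GB_S = 307.0     # memory bandwidth used in the quoted formula
GB_PER_TOKEN = 4.0         # the "4 GB" divisor (ASSUMED meaning: data read per token)
SINGLE_EFFICIENCY = 0.60   # single-stream efficiency factor
BATCH_STREAMS = 10         # number of batched streams
BATCH_EFFICIENCY = 0.9     # batching efficiency factor
USD_PER_M_TOKENS = 0.20    # price per million tokens


def single_tok_s() -> float:
    """Single-stream tokens per second, bandwidth-bound."""
    return BANDWIDTH_GB_S / GB_PER_TOKEN * SINGLE_EFFICIENCY


def batched_tok_s() -> float:
    """Aggregate tokens per second across all batched streams."""
    return single_tok_s() * BATCH_STREAMS * BATCH_EFFICIENCY


def tokens_per_hour() -> float:
    return batched_tok_s() * 3600


def revenue_per_hour() -> float:
    """Gross revenue in USD per hour at full utilization."""
    return tokens_per_hour() / 1_000_000 * USD_PER_M_TOKENS


if __name__ == "__main__":
    print(f"single  : {single_tok_s():.2f} tok/s")
    print(f"batched : {batched_tok_s():.2f} tok/s")
    print(f"per hour: {tokens_per_hour():,.0f} tok")
    print(f"revenue : ${revenue_per_hour():.4f}/hr")
```

Swapping in a measured throughput (for example the benchmark numbers mentioned elsewhere in this thread) for `batched_tok_s()` is the quickest way to see how sensitive the hourly figure is to that one input.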

They do a bit of a sneaky thing with the power calculation: they subtract 12W of idle power, because they assume your machine is idling 24/7, so the only cost is the extra 18W they estimate you’ll use doing inference. Idk about you, but i do turn my machine off when i am not using it.
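To see how much the idle-power subtraction matters, here is a rough sketch comparing three draws: the 18W marginal figure, the 30W total that 12W idle plus 18W extra implies, and a 100W draw along the lines of what was suggested elsewhere in this thread. The electricity price is a made-up placeholder, so substitute your own; the revenue figure is the one from the quoted estimate.

```python
# Hourly electricity cost vs. the quoted hourly revenue.
REVENUE_USD_PER_HOUR = 0.2984   # from the quoted estimate
IDLE_W = 12.0                   # idle draw the site subtracts
EXTRA_W = 18.0                  # extra draw the site attributes to inference
PRICE_USD_PER_KWH = 0.30        # ASSUMPTION: hypothetical price, not from the thread


def power_cost_per_hour(watts: float, price_per_kwh: float = PRICE_USD_PER_KWH) -> float:
    """Cost in USD of drawing `watts` for one hour."""
    return watts / 1000.0 * price_per_kwh


def net_per_hour(watts: float, price_per_kwh: float = PRICE_USD_PER_KWH) -> float:
    """Quoted revenue minus electricity cost for one hour."""
    return REVENUE_USD_PER_HOUR - power_cost_per_hour(watts, price_per_kwh)


if __name__ == "__main__":
    scenarios = {
        "marginal only (18W)": EXTRA_W,
        "idle + extra (30W)": IDLE_W + EXTRA_W,
        "pessimistic (100W)": 100.0,  # hypothetical higher draw
    }
    for label, watts in scenarios.items():
        cost = power_cost_per_hour(watts)
        print(f"{label:22s} cost ${cost:.4f}/hr  net ${net_per_hour(watts):.4f}/hr")
```

With the placeholder price, the gap between counting 18W and 30W is a fraction of a cent per hour, so under these assumptions the result depends far more on whether the throughput and utilization figures hold up than on the idle subtraction.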

pants2 · 2026-04-16T10:42:33 1776336153

Interesting token numbers they're using, because I've benchmarked it at 69 tok/s single stream and 130 multi stream.

nnx · 2026-04-16T06:10:48 1776319848

> My M5 Pro can generate 130 tok/s (4 streams) on Gemma 4 26B.

This seems high. At which quantization? Using LM Studio or something else?

Note: Darkbloom seems to run everything on Q8 MLX.

pants2 · 2026-04-16T10:43:22 1776336202

Ah good point, this is using Q4, benchmarked total throughput serving with llama.cpp.

todotask2 · 2026-04-16T05:05:40 1776315940

OpenAI has only about 5% paying customers, how does it generate revenue?

I don’t think this is a sustainable business model. For example, Cubbit tried to build decentralised storage, but I backed out because better alternatives now exist, and hardware continues to improve and become cheaper over time.

Your electricity and hardware ownership are going to yield a lower return, and this does not actually reduce CO2.

chaoz_ · 2026-04-16T05:05:54 1776315954

Genuinely curious, is there any way to estimate the amortization of a Mac?

I’d imagine 1 year of heavy usage would somehow affect its quality.

pants2 · 2026-04-16T05:22:25 1776316945

Yeah, only way to get there is assuming they're not giving prompt caching discounts while my laptop is getting prompt caching benefits, with very many large prompts. So yes I am skeptical of their numbers.

torginus · 2026-04-16T09:49:47 1776332987

Also this assumes hardware never fails. I learned this the hard way when I started mining crypto on my 5700XT way back when.

I figured since I already used it a lot, and I've never had a GPU fail on me, it would be fine.

The fans on it died within a month of constant use, and replacing them cost more than what I made on mining.

xendo · 2026-04-16T05:06:43 1776316003

Any idea what makes for such a diff between your numbers and theirs? Batching? Or could they be doing some crazy prefix caching across all nodes to reduce the actual processing?

znnajdla · 2026-04-16T05:37:24 1776317844

Maybe lunch money for you, but there are people in some parts of the world who live on $200/month. Like Ukraine.

sethherr · 2026-04-16T05:45:20 1776318320

But they probably don’t have M5 MacBook Pros idling.

tonyedgecombe · 2026-04-16T06:38:06 1776321486

Or reliable energy or internet.

znnajdla · 2026-04-16T09:38:58 1776332338

They can acquire one if it offers real opportunities like this.

MrDrMcCoy · 2026-04-16T05:04:38 1776315878

Don't forget to factor in cooling costs.

pants2 · 2026-04-16T05:05:47 1776315947

Or saved heating costs in the winter!