It does. You can use it directly on iOS 26 beta - without writing a line of code...

bigyabai · 2025-07-17T20:21:32 1752783692

It would be interesting to see the tok/s comparison between the ANE and GPU for inference. I bet these small models are a lot friendlier than the 7B/12B models that technically fit on a phone but won't accelerate well without a GPU.

gleenn · 2025-07-17T20:29:17 1752784157

I thought the big difference between the GPU and ANE was that you couldn't use the ANE to train. Does the GPU actually perform faster during inference as well? Is that because the ANE are designed more for efficiency or is there another bigger reason?

wmf · 2025-07-17T20:39:20 1752784760

GPUs are usually faster for inference simply because they have more ALUs/FPUs but they are also less efficient.

mrheosuper · 2025-07-18T02:34:29 1752806069

fitting 7B model on phone with 8gb ram for the whole system is impressive.

JKCalhoun · 2025-07-17T20:33:51 1752784431

Wild to see what improvements might come if there is additional hardware support in future Apple Silicon chips.

ivape · 2025-07-17T21:38:22 1752788302

What’s the cost of pointing it to Private Cloud Compute? It can’t be free, can it?

floam · 2025-07-18T00:21:20 1752798080

It’s “free”, as in it doesn’t charge you anything or require a subscription: it’s a part of Apple Intelligence which is basically something bought with the device. It’s in the cloud so theoretically one shouldn’t need a quite new iPhone or Mac but - one does.