It will be a beautiful day when I can finally drop all my Adobe accounts and software. Kdenlive is definitely on the right track, BUT the real risk of losing my project after days or weeks of work is not something I can afford. I am following this with great interest and waiting for the right time to jump on board.
Where did you hear about losing your work? Did you experience it? Did you report it? Kdenlive has a very robust project recovery system; even if it crashes, you are able to recover your lost work. Also, in any software you should save continuously.
These are thoughts of someone who's very good at putting words together, but sadly has little experience with the subject matter.
> I’ve thought about this a lot over the last few years, and I think the best response is to stop.
This is exactly where it shows.
LLMs, agents and whatever comes next are not only the future of tech, but they are going to be national resilience drivers for the countries that will be able to support them with power, water and science.
Who is supposed to stop? The US? China? Russia? Everyone? Of course this won't happen. This is an arms race.
But even if it weren't, stopping is the wrong answer. You don't have to outsource your thinking, writing or reading. How you use LLMs is entirely up to you.
There is a way to use LLMs that is beneficial: I treat them as a private tutor available to me for questions. This resolved a lot of the friction in my relationship with LLMs.
More telling is that the author mainly thinks about their relationship with LLMs while in reality the space has moved on to automation with agents. You don't interact with LLMs as much as before, and if you still do, then soon you won't.
Agents are not really ML. It's harnesses and parsing and memory and metrics. It's software. Should we stop this as well?
Ollama is the worst engine you could use for this. Since you are already running on an Nvidia stack for the dense model, you should serve this with vLLM. With 128 GB you could try the original safetensors weights, though you might need to be careful with KV cache and context length.
Strangely, I haven't had a lot of luck with vLLM; I finally ended up ditching Ollama and going straight to the tap with llama-server in llama.cpp. No regrets.
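For reference, both servers are one command to stand up (the model name, file path, and ports below are placeholders, not recommendations):

```shell
# vLLM: serve the original safetensors weights directly.
# --max-model-len caps context to keep the KV cache within memory.
vllm serve some-org/some-model \
    --max-model-len 8192 \
    --gpu-memory-utilization 0.90

# llama.cpp: llama-server exposes an OpenAI-compatible endpoint for a GGUF file.
llama-server -m ./model.gguf -c 8192 --port 8080
```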
> End of the PC era, there's nothing to tinker with anymore. And certainly no gradient for entrepreneurship for once-skilled labor capital.
This one seems too far-fetched. Training models is widespread. There will always be open-weight models in some form, and if we assume there will be some advancements in architecture, I bet you could also run them on much leaner devices. Even today you can run models on Raspberry Pis. I don't see a reason this will stop being a thing; there will be plenty of ways to tinker.
However, keep in mind the masses don't care about tinkering and never have. People want a ChatGPT experience, not a pytorch experience. In essence this is true for all tech products, not just AI.
It works fine on Mac (that's what we developed it on) and it's not nearly as much overhead as I was initially expecting. There's probably some added latency from VirtualBox, but it hasn't been noticeable in our usage.
The top Mac Studio has six Thunderbolt 5 ports, each of which is a PCIe 4.0 x4 link. Each is an 8 GB/s link in each direction, which is a lot. Going from x16 down to x4 has less than a 10% hit on games: https://www.reddit.com/r/buildapc/comments/sbegpb/gpu_in_pci...
“In the more common situations of reducing PCI-e bandwidth to PCI-e 4.0 x8 from 4.0 x16, there was little change in content creation performance: There was only an average decrease in scores of 3% for Video Editing and motion graphics. In more extreme situations (such as running at 4.0 x4 / 3.0 x8), this changed to an average performance reduction of 10%.”
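As a sanity check of those numbers, the per-direction figure follows directly from the standard PCIe rates (16 GT/s per lane and 128b/130b encoding for PCIe 4.0); this is a sketch that ignores protocol overhead:

```python
def pcie_bandwidth_gb_s(gt_per_s: float, lanes: int) -> float:
    """Usable per-direction bandwidth in GB/s, ignoring protocol overhead."""
    encoding = 128 / 130  # 128b/130b line encoding used by PCIe 3.0 and later
    return gt_per_s * lanes * encoding / 8  # gigabits -> gigabytes

print(f"PCIe 4.0 x4:  {pcie_bandwidth_gb_s(16, 4):.2f} GB/s")   # ~7.88, i.e. the ~8 GB/s above
print(f"PCIe 4.0 x16: {pcie_bandwidth_gb_s(16, 16):.2f} GB/s")
```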
Oculink is generally faster than TB5 despite them both using PCIe 4.0, because Oculink provides direct PCIe access whereas Thunderbolt has to route all PCIe traffic through its controller. The benchmarks show that the overhead introduced by the TB5 controller slows down GPU performance.
It's not just the controllers; the Thunderbolt protocol itself imposes different speed limits. The bit rates used by Thunderbolt aren't the same as PCIe, and PCIe traffic gets encapsulated in Thunderbolt packets.
Maybe; I'm unable to find any benchmarks that specifically compare PCs with TB to Macs to test this. But there is certainly still overhead with TB no matter what, and therefore it'll never be as fast as Oculink.
Sure, but how big of a difference is there? Even inside a desktop PC, you typically have PCIe ports directly off the CPU and ones off the chipset, and the latency for the latter is double. But the difference is immaterial in practice.
I think latency is the wrong focal point (more important for gaming, plus Macs don't support eGPUs anymore). There aren't a lot of general workloads that require high sustained throughput, but the ones that do can benefit from TB5 scaling.
For instance, if you cluster Mac Studios over TB5 with RDMA, the performance can be pretty stellar. It may not be more cost effective than renting compute for the same tasks, but if you've got (up to) four M3 Ultras with a ton of RAM, you'll be hard pressed to find something similar.
That's still less ideal than native alternatives like OCuLink, or something that can be networked like QSFP, but it's a fair way to highlight the current design's strengths.
That's just blatantly wrong; the performance loss for GPUs is very well documented and gets worse as you go toward higher-end models. We're talking 30%+ loss of performance here.
Sure. And lots of people need all that I/O. But my point is that it's not like the Mac Studio has no I/O. The outgoing Mac Pro only has 24 total lanes of PCIe 4.0 going to the switch chip that's connected to all the PCIe slots. The advent of externally routed PCIe is a development of the last few years that may have factored into the change in form factor.
When people talk about 100-gigabit networks for Macs, I'm really curious what kind of network you run at home and how much money you spent on it. Even at work I generally see 10-gigabit network ports, with 100-gigabit+ only in data centers, where Macs don't have a presence.
Local AI is probably the most common application these days.
Apple recently added support for InfiniBand over Thunderbolt. And now almost all decent Mac Studio configurations have sold out. Those two may be connected.
100 Gb/s Ethernet is likely to be expensive, but dual-port 25 Gb/s Ethernet NICs are not much more expensive than dual-port 10 Gb/s NICs, so whenever you are not using the Ethernet ports already included by a motherboard it may be worthwhile to go to a higher speed than 10 Gb/s.
If you use dual-port NICs, you do not need a high-speed switch, which may be expensive; instead you can connect the computers to each other directly and configure them as either Ethernet bridges or IP routers.
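A sketch of the switchless setup on Linux, using the bridge option (interface names and addresses are illustrative):

```shell
# On each host: bridge the NIC's two ports so frames can pass through to
# the neighbor on the far side, forming a switchless chain or ring.
ip link add name br0 type bridge
ip link set enp1s0f0 master br0   # first 25 GbE port
ip link set enp1s0f1 master br0   # second 25 GbE port
ip link set br0 up
ip addr add 10.0.0.1/24 dev br0   # give each host a unique address

# If the topology is a closed ring, enable STP to break the forwarding loop:
ip link set br0 type bridge stp_state 1
```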
I work in media production and I have the same thought constantly. Hell, I'm cursing in church as far as my industry is concerned, but I find 2.5 Gb/s to be fine for most of us. 10, absolutely.
100 Gb/s is going to be for mesh networks supporting clusters (say, four Mac Studios), not for LAN-type networks (unless it's in an actual datacenter).
I suppose the throughput is not the key; latency is. When you split an operation that normally runs within one machine between two machines, anything that crosses the boundary becomes orders of magnitude slower. Even with careful structuring, there are limits to how little and how rarely you can send data between nodes.
I suppose that splitting an LLM workload is pretty sensitive to that.
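A toy model of that boundary cost for a pipeline-split LLM: per generated token, the activations at the split point have to cross the link, so each hop costs link latency plus transfer time (all numbers below are illustrative assumptions, not measurements):

```python
def hop_time_us(hidden_dim: int, bytes_per_val: int,
                link_latency_us: float, bandwidth_gb_s: float) -> float:
    """Per-token cost of shipping one activation vector across the link."""
    payload_bytes = hidden_dim * bytes_per_val
    transfer_us = payload_bytes / (bandwidth_gb_s * 1e9) * 1e6
    return link_latency_us + transfer_us

# 8192-dim fp16 activations over an assumed ~10 us, 5 GB/s link:
# the fixed latency dominates the ~3.3 us transfer time.
print(f"{hop_time_us(8192, 2, 10.0, 5.0):.1f} us per token per hop")
```

The point the numbers make: for per-token payloads this small, the link's fixed latency matters more than its headline bandwidth.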
Things that aren’t graphics cards, such as very high-bandwidth video capture cards and any other equipment that needs a lot of lanes of PCIe data at low latency.
Multiple GPUs was tried, by the whole industry including Apple (most notably with the trash can Mac Pro). Despite significant investment, it was ultimately a failure for consumer workloads like gaming, and was relegated to the datacenter and some very high-end workstations depending on the workload.
Multi-GPU has recently experienced a resurgence due to the discovery of new workloads with broader appeal (LLMs), but that's too new to have significantly influenced hardware architectures, and LLM inference isn't the most natural thing to scale across many GPUs. Everybody's still competing with more or less the architectures they had on hand when LLMs arrived, with new low-precision matrix math units squeezed in wherever room can be made. It's not at all clear yet what the long-term outcome will be in terms of the balance between local vs cloud compute for inference, whether there will be any local training/fine-tuning at all, and which use cases are ultimately profitable in the long run. All of that influences whether it would be worthwhile for Apple to abandon their current client-first architecture that standardizes on a single integrated GPU and omits/rejects the complexity of multi-GPU setups.
I see AI as a new, unreliable resource that I can try and tame with good software practices. It's an incredibly fun challenge and there's a lot to learn.
As long as there's internal documentation, which virtually every serious shop has, it can be processed and combined with AI. There are startups selling this product already. I've seen first hand some very narrowly focused domain knowledge becoming more accessible when you can ask the chatbot and the thing is right. It works.
Come to think of it, domain knowledge should be an LLMs strong suit as long as you can provide the right documentation, which is working pretty well already.
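A minimal sketch of the retrieval half of such a system (document names and contents are made up; a real deployment would use embeddings and then paste the top-ranked chunks into the LLM prompt):

```python
import re

# Toy internal-docs corpus (contents invented for illustration).
docs = {
    "billing.md": "Invoices are generated nightly by the billing cron job.",
    "deploy.md": "Deploys go through CI and require two approvals.",
    "oncall.md": "The on-call rotation is weekly; handover is on Mondays.",
}

def tokens(s: str) -> set[str]:
    """Lowercase word set, stripping punctuation."""
    return set(re.findall(r"[a-z0-9-]+", s.lower()))

def top_docs(question: str, k: int = 1) -> list[str]:
    """Rank documents by keyword overlap with the question."""
    q = tokens(question)
    scores = {name: len(q & tokens(text)) for name, text in docs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(top_docs("How are invoices generated?"))  # ['billing.md']
```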
Right now the main issue I see with AI is that it doesn't do well with scaling. It's great for building demos and examples but you have to fix its code for real production work. But for how long?
In ERP software there are MLOCs without any technical documentation. And nobody would spend a dime to create one. So, the deep expert knowledge on how business processes are supposed to work (in full detail) and how they are implemented is mostly in the heads of a couple of people.
AI is most excellent at reading and understanding large codebases and, with some guidance docs, can easily reproduce accurate technical documentation. Divide and conquer.
Reading a large codebase...perhaps if it is not too large. Understanding... why a tool exists, what is the motivation for its design, what the external human systems requirements for successful utilization of the internal facing tools... especially when that knowledge exists only in the memories of a few developers and PMs... not so much.
Deep domain expertise is a long way from AI capability for effective replacement.
Again, nobody would spend a dime to create the technical documentation, even if it could be done somewhat faster with AI support. Also, in my experience AI is not so great at explaining the consequences for business processes when documenting code.
Accuracy/faithfulness to the code as written isn't necessarily what you care about though, it's an understanding of the underlying problem. Just translating code doesn't actually help you do that.
No, current LLMs are already good enough to read the subtexts from documents, email, call transcripts where available. They're extremely good at identifying unwritten business practices, relationships, data flows, etc.
> human suffering, unfortunately, does not motivate Americans like gas prices do.
Absolutely right. It also makes sense that most people will care more about something tangible like gas prices than about the lives of other people halfway across the world.
But this doesn't mean that halfway across the world there isn't something truly urgent that needs dealing with.
I honestly don't know what will come of this war, but I do know with a fair bit of certainty that a nuclear Iran would have caused the US far more damage than a few weeks of higher gas prices, and it wouldn't even need to use the weapon.
But to truly and fully understand this people need to put a real effort and research the region.
> But to truly and fully understand this[,] people need to put a real effort and research the region.
The US defense apparatus has been doing just this for quite some time. And Netanyahu has been saying Iran is 'weeks away' from having nukes for 30 years now.
Opinion? Israel has some real juicy stuff on Trump, and he's doing his best to keep the information from being released by doing Netanyahu's bidding.
I am thoroughly appalled that Trump's general officers allowed him to get into such a mess.
I think it’s less blackmail and more the Israeli government has learned it can simply act and expect the US to back them up to save face. Rubio admitted as much with his initial statements after we attacked Iran. Netanyahu can currently reliably expect the US military to back up its foreign policy with or without buy in ahead of time.
This might be true, but the current conflict was a long-planned joint operation. CENTCOM was not just prompted to join in but was the lead planner, and execution of the attack was entirely in sync with Israel.
I do not think it was as coordinated with US leadership as that statement implies even if they had discussed it prior. Rubio’s initial statement was very telling. Now Trump says it was all Hegseth’s idea, so the story isn’t even consistent now.
Iran literally hit a preschool in Israel today with an MRV, which is solely designed to terrorize the population (and is a war crime, btw). A 12-year-old is in critical condition, alongside 40 civilians, from a single Iranian missile hitting a residential building later in the day. And in June Iran hit a hospital in Israel with a ballistic missile.
> Its a mystery...
Not a mystery, though, is it? Israel has excellent air defense, which is why the damage isn't 10x worse. But Iran is definitely making a huge effort to hit the civilian population for maximum damage.
Unlike Iran, which is literally aiming statistical weapons at population centers, the US has high-accuracy weapons; the school was hit because intelligence wasn't up to date (it used to be an IRGC building).
Your comment is absolutely misinformed, or worse, spreading disinformation on purpose.
No, everything I said was true. The entire world knows who deliberately targets and murders children, by the tens of thousands. "Disinformation" is one of the Zionist colony's biggest exports, but its effect (like all drugs) has waned over time.
People who have unyoked from Zionist mental-control have dozens, if not 100s of independent journalistic outlets, mostly online, where they can (and ARE) following to get some sense of what's really happening. Hence your frustration.
It's not for nothing that "every accusation is a confession" is now a phrase that has spread across the globe in relation to the Zionist entity and its hasbara. So your "spreading disinformation on purpose" accusation is really your confession.
What's tinfoil hat about it? The antisemitism card has been overused, it's a common tactic by the Israeli government and its agents. People who have been able to pull themselves out of being affected by these false claims can think more clearly on the matter.