I'm building a Java HFT engine and the amount of things AI gets wrong is eye-opening. If I didn't benchmark everything I'd end up with a much less optimized solution.
Examples: AI really wants to use Project Panama (FFM), and while that can be significantly faster than traditional OO approaches, it is almost never the best choice. And I'm not talking about using deprecated Unsafe calls; I'm talking about primitive arrays being better for Vector/SIMD operations on large sets of data, and NIO being better than FFM + mmap for file reading.
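To sketch the primitive-array point (class and method names are mine, not from the engine being discussed): a plain counted loop over `double[]` is exactly the shape HotSpot's auto-vectorizer recognizes, with contiguous data and no object headers or pointer chasing in the way.

```java
// Sketch: primitive arrays give the JIT a straight shot at
// auto-vectorization (SIMD). Data is contiguous and cache-friendly.
public class DotProduct {
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        // A simple counted loop over primitive arrays is the pattern
        // HotSpot's auto-vectorizer handles best.
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0};
        double[] b = {4.0, 5.0, 6.0};
        System.out.println(dot(a, b)); // 32.0
    }
}
```

Whether this beats an FFM-based layout for a given workload is exactly the kind of thing the parent is saying you have to benchmark rather than trust.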
You can use AI to build something that is sometimes better than what someone without domain-specific knowledge would develop, but the gap between that and the industry-expected solution is much more than 100 hours.
AI is extremely good at the things that it has many examples for. If what you are doing is novel then it is much less of a help, and it is far more likely to start hallucinating because 'I don't know' is not in the vocabulary of any AI.
I haven't had that at all, not even a single time. What I have had is endless round trips with me saying 'no, that can't work' and the bot then turning around and explaining to me why it is obvious that it can't work... that's quite annoying.
> Please carefully review (whatever it is) and list out the parts that have the most risk and uncertainty. Also, for each major claim or assumption can you list a few questions that come to mind? Rank those questions and ambiguities as: minor, moderate, or critical.
> Afterwards, review the (plan / design / document / implementation) again thoroughly under this new light and present your analysis as well as your confidence about each aspect.
There's a million variations on patterns like this. It can work surprisingly well.
You can also inject 1-2 key insights to guide the process. E.g. "I don't think X is completely correct because of A and B. We need to look into that and also see how it affects the rest of (whatever you are working on)."
Of course! I get pretty lazy so my follow-up is often something like:
"Ok let's look at these issues 1 at a time. Can you walk me through each one and help me think through how to address it"
And then it will usually give a few options for what to do for each one as well as a recommendation. The recommendation is often fairly decent, in which case I can just say "sounds good". Or maybe provide a small bit of color like: "sounds good but make sure to consider X".
Often we will have a side discussion about that particular issue until I'm satisfied. This happens more when I'm doing design / architectural / planning sessions with the AI. It can be as short or as long as it needs. And then we move on to the next one.
My main goal with these strategies is to help the AI get the relevant knowledge and expertise from my brain with as little effort as possible on my part. :D
A few other tactics:
- You can address multiple at once: "Items 3, 4, and 7 sound good, but let's work through the others together."
- Defer a discussion or issue until later: "Let's come back to item 2, or possibly save that for a later session."
- Save the review notes / analysis / design sketch to a markdown doc to use in a future session. Or just as a reference to remember why something was done a certain way when I'm coming back to it. Can be useful to give to the AI for future related work as well.
- Send the content to a sub-agent for a detailed review and then discuss with the main agent.
I think the main issue is treating the LLM as an unrestrained black box; there's a reason nobody outside tech should trust LLMs so blindly.
The only way to make LLMs useful for now is to restrain their hallucinations as much as possible with evals, and those evals need to be very clear about the goals you're optimizing for.
See Karpathy's work on the autoresearch agent and how it carries out experiments; it might be useful for what you're doing.
We were working on translations for Arabic and in the spec it said to use "Arabic numerals" for numbers. Our PM said that "according to ChatGPT that means we need to use Arabic script numbers, not Arabic numerals".
It took a lot of back-and-forths with her to convince her that the numbers she uses every day are "Arabic numerals". Even the author of the spec could barely convince her -- it took a meeting with the Arabic translators (several different ones) to finally do it. Think about that for a minute. People won't believe subject matter experts over an LLM.
Honestly I think we're just becoming more aware of this way of thinking. It's certainly exacerbated now that everyone has "an expert" in their pocket.
It's no different than conspiracy theorists. We saw a lot more with the rise in access to the internet. Not because they didn't put in work to find answers to their questions, but because they don't know how to properly evaluate things and because they think that if they're wrong then it's a (very) bad thing.
But the same thing happens with tons of topics, and it's way more socially acceptable. Look how everyone has strong opinions on topics like climate, rockets, nuclear, immigration, and all that. The problem isn't having opinions or thoughts, but the strength of them compared to the level of expertise. How many people think they're experts after a few YouTube videos or just reading the intro to the wiki page?
Your PM is no different. The only difference is the things they believed in, not the way they formed beliefs. But they still had strong feelings about something they didn't know much about. It became "their expert" vs "your expert" rather than "oh, thanks for letting me know". And that's the underlying problem. It's terrifying to see how common it is. But I think it also leads to a (partial) solution, or at least a first step. Then again, domain experts typically have strong self-doubt. It's a feature, not a bug, but I'm not sure how many people are willing to be comfortable with being uncomfortable.
In my experience, people outside of tech have nearly limitless faith in AI, to the point that when it clashes with traditional sources of truth, people start to question them rather than the LLM.
It would help if you briefly specified the AI you are using here. There are wildly different results between using, say, an 8B open-weights LLM and Claude Opus 4.6.
I've been using several. LM Studio and any of the open-weight models that can fit in my GPU's RAM (24GB) are not great in this area. The Claude models are slightly better, but not worth the extra cost most of the time since I typically have to spend almost the same amount of time reworking and re-prompting, plus it's very easy to exhaust credits/tokens. I mostly bounce back and forth between the Codex and Gemini models right now, and this includes using pro models with high reasoning.
Not necessarily. Java can be insanely performant, far more than I ever gave it credit for in the first decade of its existence. There has been a ton of optimization and you can now saturate your links even if you do fairly heavy processing. I'm still not a fan of the language but performance issues seem to be 'mostly solved'.
You can achieve optimized C/C++ speeds; you just can't program the same way you always have. Step 1: switch your data layout from Array of Structures to Structure of Arrays. Step 2: after initial startup, switch to (near) zero object creation. It's a very different way to program Java.
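The AoS-to-SoA step can be sketched like this (the `Quotes` class and its fields are illustrative, not from any real codebase): instead of an `Object[]` of small heap objects, you keep one flat primitive array per field, so scanning a field touches contiguous, cache-friendly memory.

```java
// Sketch of Array-of-Structures vs Structure-of-Arrays in Java.
public class Quotes {
    // AoS: an array of these means one heap object (and one pointer
    // dereference) per element when scanning.
    static final class Quote { double bid; double ask; }

    // SoA: one flat primitive array per field; scans are contiguous
    // and friendly to both the CPU cache and auto-vectorization.
    final double[] bids;
    final double[] asks;

    Quotes(int capacity) {
        bids = new double[capacity];
        asks = new double[capacity];
    }

    double maxSpread() {
        double max = 0.0;
        for (int i = 0; i < bids.length; i++) {
            double spread = asks[i] - bids[i];
            if (spread > max) max = spread;
        }
        return max;
    }

    public static void main(String[] args) {
        Quotes q = new Quotes(2);
        q.bids[0] = 99.0; q.asks[0] = 101.0;
        q.bids[1] = 50.0; q.asks[1] = 50.5;
        System.out.println(q.maxSpread()); // 2.0
    }
}
```

The trade-off is ergonomics: you lose the convenience of passing a `Quote` around, which is part of why this feels like "a very different way to program Java."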
You have to optimize your memory usage patterns to fit in CPU cache as much as possible, which is something typical Java developers don't consider. I have a background in assembly and C.
I'd say it's slightly harder since there is a little bit of abstraction, but most of the time the JIT will produce code as good as C compilers. It's also a niche that often considers any application running on a general-purpose CPU to be slow. If you want industry-leading speed you start building custom FPGAs.
Java has significant overhead: most/every object is allocated on the heap, carries a monitor for synchronization, and adds extra memory and performance overhead so it can be tracked by the GC. It's very hard, if not impossible, to tune this part away.
You program differently for this niche in any language. The hot path (number crunching) thread doesn't share objects with the gateway (IO) threads. Passing data between them is off-heap, and you avoid object creation after warm-up. There is no synchronization; even volatile is something you avoid.
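One way the "avoid volatile" point is often realized (this sketch and its names are mine, in the style of Disruptor-like single-writer sequence counters): with exactly one writer thread, a release store plus an acquire load gives the ordering you need, and a release store is cheaper than a volatile store because it omits the StoreLoad fence on x86.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Hypothetical single-writer sequence counter. The single-writer rule
// makes release/acquire ordering sufficient; no volatile, no locks.
public class Sequence {
    private long value; // written by exactly one thread

    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(Sequence.class, "value", long.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void publish(long v) { VALUE.setRelease(this, v); }            // writer side
    long read()          { return (long) VALUE.getAcquire(this); } // reader side

    public static void main(String[] args) {
        Sequence seq = new Sequence();
        seq.publish(42L);
        System.out.println(seq.read()); // 42
    }
}
```

Whether this matters in practice depends on the platform's memory model; on x86 the win is avoiding the full fence a volatile write implies.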
How exactly are you passing data? You can pass some primitives without allocating them on the heap. You can use some tiny subset of Java plus the standard library to write high-performance code, but why would you do this instead of using Rust or C++?
Strangely this is one of the areas where I want to use project panama so I might re-implement some of the ring buffers constructs.
You allocate off heap memory and dump data into it. With modern Java classes like Arena, MemoryLayout, and VarHandle it's honestly a lot like C structs.
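A minimal sketch of that style, assuming JDK 22+ where the FFM API is final (the `OffHeapTick` layout and field names are my own invention): you declare the layout explicitly, derive `VarHandle`s for each field, and allocate from an arena, much like `struct { double px; long qty; }` in C.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

// Sketch: an off-heap "struct" with an explicit memory layout.
public class OffHeapTick {
    static final MemoryLayout TICK = MemoryLayout.structLayout(
            ValueLayout.JAVA_DOUBLE.withName("px"),
            ValueLayout.JAVA_LONG.withName("qty"));

    static final VarHandle PX =
            TICK.varHandle(MemoryLayout.PathElement.groupElement("px"));
    static final VarHandle QTY =
            TICK.varHandle(MemoryLayout.PathElement.groupElement("qty"));

    public static void main(String[] args) {
        // Confined arena: the memory is freed deterministically on close,
        // with no GC involvement for the data itself.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment tick = arena.allocate(TICK);
            PX.set(tick, 0L, 101.25);  // trailing 0L is the base offset (JDK 22+)
            QTY.set(tick, 0L, 500L);
            System.out.println((double) PX.get(tick, 0L)); // 101.25
        }
    }
}
```

Whether this is "a lot like C structs" or not is exactly the disagreement in the replies below; the mechanics, at least, look like the above.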
> You allocate off heap memory and dump data into it. With modern Java classes like Arena, MemoryLayout, and VarHandle it's honestly a lot like C structs.
My opinion is that no, it is not: declaring and using a C struct is 20x more transparent, cost-efficient, and predictable. And that's just raw C structs; C++ and Rust add lots of ergonomics/safety/expressiveness improvements on top of that.
Depends. Many reasons, but one is that Java has a much richer set of 3rd party libraries to do things versus rolling your own. And often (not always) third party libraries that have been extensively optimized, real world proven, etc.
Then things like the jit, by default, doing run time profiling and adaptation.
There are actually cases when Java (the HotSpot JVM) runs faster than the same logic written in C/C++ because the JVM is doing dynamic analysis and selective JIT compilation to machine code.
I personally know of an HFT firm that used Java approximately a decade ago. My guess would be they're still using it today given Java performance has only improved since then.
Optimal in what sense? In the java shops I've worked at it's usually viewed as a pretty optimal situation to have everything in one language. This makes code reuse, packaging, deployment, etc much simpler.
In terms of speed, memory usage, runtime characteristics... sure there are better options. But if java is good enough, or can be made good enough by writing the code correctly, why add another toolchain?
> But if java is good enough, or can be made good enough by writing the code correctly,
"Writing code correctly" here means stripping out 95% of the language's capabilities and writing in something that looks like C without structs (because structs would be heap-allocated, with cross-thread synchronization and GC overhead) and without the standard library.
It's good enough for some tiny algo, but not good enough for anything serious.
It's good enough for the folks who choose to do it that way. Many of them do things that are quite "serious"... Databases, Kafka, the LMAX Disruptor, and reams of performance-critical proprietary code have been and continue to be written in Java. It's not low effort: you have to be careful, get intimate with the garbage collector, and spend a lot of time profiling. It's a totally reasonable choice to make if your team has that expertise, you're already a Java shop, etc. I no longer make the choice to use Java for new code; I prefer Rust. But neither choice is correct or incorrect.
> Databases, Kafka, the LMAX Disruptor, and reams of performance-critical proprietary code have been and continue to be written in Java.
Those have a low bar for performance. Also, they mostly became popular because of investment during the Java hype, and Rust either didn't exist or had a weak ecosystem at that time.
I would say that if AI has to make decisions about picking between framework or constructs irrelevant to the domain at hand, it feels to me like you are not using the AI correctly.
I am curious about what causes some to choose Java for HFT. From what I remember the amount of virgin sacrifices and dances with the wolves one must do to approach native speed in this particular area is just way too much of development time overhead.
Probably the same thing that makes most developers choose a language for a project: it's the language they know best.
It wasn't a matter of choosing Java for HFT, it was a matter of selecting a project that was a good fit for Java and my personal knowledge. I was a Java instructor for Sun for over a decade, I authored a chunk of their Java curriculum. I wrote many of the concurrency questions in the certification exams. It's in my wheelhouse :)
My C and assembly is rusty at this point so I believe I can hit my performance goals with Java sooner than if I developed in more bare metal languages.
The one person who understands HFT yeah. "True" HFT is FPGA now and also those trades are basically dead because nobody has such stupid order execution anymore, either via getting better themselves or by using former HFTs (Virtu) new order execution services.
So yeah, there's really no HFT anymore; it's just order execution, and some algo trades want more or less latency, which merits varying levels of technical effort to squeeze latency out of systems.
Software HFT? I see people call Python code HFT sometimes so I understand what you mean. It's more in-line with low latency trading than today's true HFT.
I don't work for a firm so don't get to play with FPGAs. I'm also not co-located in an exchange and using microwave towers for networking. I might never even have access to kernel networking bypass hardware (still hopeful about this one). Hardware optimization in my case will likely top out at CPU isolation for the hot path thread and a hosting provider in close proximity to the exchanges.
The real goal is a combination of eliminating as much slippage as possible, making some lower-timeframe strategies possible, and having best-in-class backtesting performance for parameter grid searching and strategy discovery. I expect to sit between industry-leading firms and typical retail systematic traders.
Then you list all of the things you want it not to do and construct a prompt to audit the codebase for the presence of those things. LLMs are much better at reviewing code than writing it so getting what you want requires focusing more on feedback than creation instructions.
Is the article trying to discuss a thermal issue? It spends the entire time discussing reduced watt consumption over time which would sound like a good thing to most people and then at the very end it has one sentence about needing improved cooling.
I think it's an entire article about thermal throttling that never once mentions it.
it doesn't seem to actually identify thermal throttling as the issue with any evidence -- what if it's 'just' power throttling?
that would be even more disappointing of course, but all that's given in the article is the wattage chart
actually, in the article linked at the top they say this: "The situation is much worse on the smaller MacBook Pro 14 with the M5 Max, where the maximum power input is capped at 97 Watts (even when you use Apple's 140W PSU or an even more powerful 180W USB-C PSU), which results in a battery loss of 15 % during our one-hour stress test."
Lower wattage can mean higher efficiency, but the evidence in the fine article suggests it is thermal throttling and the laptop is not doing more with less.
If I remember correctly a previous MacBook air could have improved thermal dissipation by adding some thermal tape and turning the case into a heatsink.
It's not just a hobby you need, it's purpose. For some that is a hobby. If you go the hobby route, try to look for one that has in person meetups. Others going through this use self-improvement as their purpose (gym, suit up, etc). Church works for some. Consider some continuing education courses. Would charity work suit you? There are places like habitat for humanity that you can volunteer at. Maker spaces can be fun. You might also want to try out working from a co-working space.
I still host one of those 20+ year old forums. The Fediverse is different. With forums (and HN/Reddit) you immediately had a good sense of whether they were for you or not. With the Fediverse you have to commit to servers, and even then you don't know if they are right for you unless you try and spend time customizing your feeds/follows. It's a lot of work and you don't know if it will pay off. I tried again today and so much of it has no focus at all. It reminds me of this exchange from the TV show The Good Place:
Chidi Anagonye: So, making decisions isn't necessarily my strong suit.
Michael: I know that, buddy. You-you once had a panic attack at a make-your-own sundae bar.
Chidi Anagonye: There were too many toppings, and very early in the process, you had to commit to a chocolate palette or a fruit palette. And if you couldn't decide, you wound up with kiwi-Junior Mint-raisin, and it just ruins everyone's night.
How did you get that 'good sense' with forums back in the day? I did that by reading into the community. Just look around on the server, see the posts on there, and see if there's a connection.
You can do that with Mastodon servers too.
And it's not that high stakes. You are not stuck on that server, if you find out that it's not quite the right fit, you can move to another server.
In fact, if I were to look at my following list, I think I see more people from outside the mastodon server I joined than people from the same server.
For forums you can look at them for a minute or less and figure it out. It's the same with HN and subreddits. They are information-dense and don't require signing up or customization to figure out if you can curate them into something you might like.
Compare that to Mastodon, where step 1 is picking a server: https://joinmastodon.org/servers
Very few are actually topic-focused. Even the ones with themes have a lot of deviation. Even picking the popular safe choice, https://mastodon.social/, I can't tell if I would ever like it. I don't like what I normally see, and considering I can see two posts at a time and a significant portion are animals or other topics I wouldn't visit a forum about, it doesn't feel like I would.
And the animal thing is common across most of the servers. I understand it, I have dogs. But it's a side effect of the medium not having a coherent focus. It feels like I'd have to spend so much time to turn it into something that I'd like that I'd be better off staying with forums.
Mastodon seems great if you want to follow people and be social, which is kind of the point of social media. I want to follow areas of interest, not people.
The real question will be: do we need to pay juniors to write code so they can become seniors?
If coding is an art then all the juniors will end up in the same places as other struggling artists and only the breakout artists will land paying coding gigs.
I'm sitting here on a weekend coding a passion project for no pay so I have to wonder.
I think most big tech companies are like this and it's just going to get worse as AI adoption increases internally.
2 days ago I tried to create a new Gmail account and Google insisted that my phone number had been used too many times. Fine, I'll pay for a new Workspace account... I submit my billing information, the same that I use on other accounts, but now there is an extra validation step that requires my ID and copies of my bank statement. I wasn't happy about that but tried it anyway. They expected my entire checking account number to appear on my bank statement. My bank account number is for my entire account and not just the checking portion, but even if they were the same, my bank redacts parts of the number so that if anyone gets my statement they can't just start drafting money.
The additional information section where I explain things is obviously ignored, because the auto-generated responses are sent pretty fast.
I can't decide what the saddest part is: the fact that their "give us a moment" emails, sent immediately after submitting, still say they need extra time to process the request due to limited staffing because of covid, or the fact that Gemini was brutal in criticizing them when I asked if it was normal to expect complete account numbers on the statement.
Similar to OP, the embedded help chat got into a loop of telling me I needed to speak to a rep to fix the issue, and when attempting to connect me it would deny the request because I wasn't a paying customer yet.
The sad part is that you gave your information and kept trying to give them money despite all of this hassle. What exactly are you looking for? If it's email, there are tons of providers out there who will be happy to take your money.
The only thing I can think of that would make it imperative to get a Gmail account instead of another email provider is that Google provides the ecosystem of apps and Google Apps Script to automate them. If you have built up a lot of tooling for automating things, it can be a pretty sweet platform to do things in.
> I think most big tech companies are like this and it's just going to get worse as AI adoption increases internally.
Welcome to UB, at scale, in every language.
Everyone loves to complain about C (and C++) UB; well, now, you have that in every language.
We're at the point now where my manually written (non-trivial) C projects hit fewer instances of undefined behaviour than even trivial projects constructed with an LLM and human "review".
The arch nemesis of software engineering. The exceptionally exceptional exception. It doesn’t throw, it glides. It festers. It waits until production day. It rears its head from the dead. The demon with 1000 names…
That's bad for you since you don't have the account yet, but at Google, once you're a paying customer --and hence no longer the product-- there are actual people helping you on the phone when you call them. Of course the catch-22 here is that you don't have the Workspace account yet, so you may be out of luck.
Big tech might be ahead of the rest of the economy in this experiment. Microsoft grew headcount by ~3% from June 2022 to June 2025 while revenue grew by >40%. This is admittedly weak anecdata, but my subjective experience is that their products seem to be crumbling (GitHub problems around the Azure migration, for instance), and worse than they were even before. We'll see how they handle hiring over the next few years and if that reveals anything.
Ease of recycling is not prioritized during design or manufacturing because there is no monetary incentive (for the manufacturer) to do so in most cases. It would eat into profits. Simple as that.
Unless a component is expensive to manufacture and recycling/reuse could save the manufacturer money, it won't happen. The only real solution is laws requiring it.
At some point, sockets add enough failure modes that making components swappable increases the amount of waste. And it's not a far-off, theoretical point; it's one we often meet in practice.
Any regulation about that has to be detail-focused and conservative.
What do you see as the alternative here? Conductive epoxy is way less repairable than solder. Sockets are… components; and tend to be more expensive and higher failure rate than what’s socketed in them, except for extreme cases of very large ICs. Press fit requires special tooling, so repairability is much worse… what’s left?
The full cost of recycling things should be part of the cost of the product at the time of sale.
What you would find quickly is that there is little to no profit in the manufacturing and sale of new devices, and the value of repairs and reuse would skyrocket.
Right now companies are allowed to steal money from the future by ignoring the problem of what happens to these devices once they leave the factory. The truth is that they become hazardous waste, and lock away valuable resources inside of trash.
The reality is that there is no real economic benefit to the current model of ever increasing sales of new goods. But the capitalists, as ever, have been extracting money out of it by making the unpleasant, expensive parts someone else's liability. Namely ours.
Riches built from value extraction and arbitrage against the future. And most of us cannot conceive of it being any other way.
I'd argue the question was wrong; it's not that big companies can copy you more easily now. They could always have invaded your space and destroyed your business. As others pointed out, it was always a matter of picking up the pennies they didn't want, until those pennies became dollars.
The concern now is that other small teams or solo developers can rebuild what you have very quickly. This happened in the mobile space with all the decompiled and repacked (with minimal changes) apps that showed up in the stores.
The moat for SaaS startups was that the code is hidden. Now that matters less because people use AI to try and reverse engineer your backend from the API or even UI screenshots.
You have to pick up the pace to keep ahead of them and make sure you don't cause problems for customers while doing it.