I’d argue that this was a limitation of the GA fitness function, not of the concept.
Now that we have vastly faster compute, open FPGA bitstream access, on-chip monitoring, cheap and dense temperature/voltage sensing, and reinforcement-learning + evolution hybrids, it becomes possible to select explicitly for robustness and generality, not just for functional correctness.
The fact that human engineers could not understand how the evolved circuit worked made researchers incredibly uncomfortable back in 1996, and it still does today; the difference is that we now have vastly better tooling than we did then.
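To make that concrete, here's a rough sketch of what "selecting for robustness" could look like as a fitness function. Everything in it is hypothetical: `measure_circuit` stands in for whatever harness programs the FPGA and scores its behaviour, and the corner values are placeholders, not real operating specs.

```python
import statistics

# Hypothetical measurement harness: programs the FPGA with the candidate
# bitstream at the given temperature (deg C) and supply voltage (V), and
# returns a functional score in [0, 1]. The real harness is assumed, not shown.
def measure_circuit(bitstream, temp_c, vdd):
    raise NotImplementedError

# Operating corners the evolved circuit should survive (placeholder values).
CORNERS = [(t, v) for t in (0, 25, 70) for v in (1.10, 1.20, 1.30)]

def robust_fitness(bitstream, variance_penalty=2.0):
    """Score a candidate across all corners: reward the mean and worst-case
    behaviour, and penalise spread, so a design that exploits one chip's
    quirks under one condition loses to a more general one."""
    scores = [measure_circuit(bitstream, t, v) for t, v in CORNERS]
    mean = statistics.mean(scores)
    worst = min(scores)
    spread = statistics.pstdev(scores)
    return 0.5 * mean + 0.5 * worst - variance_penalty * spread
```

The point is that the selection pressure itself spans conditions, instead of rewarding whatever happens to exploit one chip at one temperature.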
I don't think that's true; for me, it's the concept itself that's wrong. The second-order effects you mention:
- Parasitic capacitive coupling
- Propagation delay differences
- Analogue behaviours of the silicon substrate
...are not just influenced by the chip design; they're also influenced by substrate purity and doping uniformity -- exactly the parts of the production process that we don't control. Or rather: we shrink the technology node right to the edge where these uncontrolled factors become too big to ignore. You can't design a circuit around the uncontrolled properties of your production process and still expect to produce large volumes of working circuits.
Yes, we have better tooling today. If you use today's 14A machinery to produce a 1 µm chip like the 80386, you will get amazingly high yields, and it will probably be accurate enough that even these analog circuits are reproducible. But the analog effects become more unpredictable as the node size decreases, and so does the variance in your analog circuits.
Also, contrary to what you said: the GA fitness process does not design for robustness and generality. It designs for the specific chip you're measuring, and you're measuring post-production. The fact that it works for reprogrammable FPGAs does not mean it translates well to mass production of integrated circuits. The reason we use digital circuitry instead of analog is not because we don't understand analog: it's because digital designs are much less sensitive to production variance.
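As a toy illustration of that sensitivity argument (all thresholds and the sigma below are invented, purely for intuition): an analog circuit whose output must track a process-dependent parameter closely fails far sooner under variation than a digital gate that only has to land on the right side of a wide noise margin.

```python
import random

# Toy Monte Carlo for intuition only; every number here is made up.
def simulate_yield(trials=100_000, sigma=0.05):
    analog_ok = digital_ok = 0
    for _ in range(trials):
        # One normalised device parameter per die, varying with the process.
        param = random.gauss(1.0, sigma)
        # "Analog" circuit: its output must track the parameter within 2%.
        if abs(param - 1.0) < 0.02:
            analog_ok += 1
        # "Digital" gate: it only has to resolve to the correct logic level,
        # so anything inside a wide noise margin still works.
        if abs(param - 1.0) < 0.30:
            digital_ok += 1
    return analog_ok / trials, digital_ok / trials

if __name__ == "__main__":
    analog_yield, digital_yield = simulate_yield()
    print(f"analog: {analog_yield:.2%}, digital: {digital_yield:.2%}")
```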
Possibly, but maybe the real difference is the subtle distinction between a planned deterministic (logical) result and a deterministic (black-box) outcome?
We’re seeing this shift already in software testing around GenAI. Trying to write a test around non-deterministic outcomes comes with its own set of challenges, so we need to plan for deterministic variances, which seems like an oxymoron but is not in this context.
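That "planning for variance" idea shows up in test code roughly like this; a rough sketch, assuming hypothetical `generate()` and `embed()` helpers and an arbitrary similarity threshold, not any particular framework's API.

```python
import math

# Hypothetical stand-ins, not a real API: generate() calls the
# non-deterministic model, embed() returns an embedding vector for a text.
def generate(prompt: str) -> str:
    raise NotImplementedError

def embed(text: str) -> list[float]:
    raise NotImplementedError

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def test_refund_answer_stays_within_tolerance():
    """Rather than asserting an exact string, sample the model a few times and
    require every sample to stay within a planned tolerance of a reference."""
    reference = embed("You can request a refund within 30 days of purchase.")
    for _ in range(5):
        answer = generate("What is the refund policy?")
        # The deterministic part is the bound, not the output itself.
        assert cosine_similarity(embed(answer), reference) >= 0.80
```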