No worries about pointing that out! There are lots of things that could cut the RAM requirements, and especially the runtime. I didn't chase them because the overall runtime was already pretty quick for getting a single answer (3 hours) compared with spending more time optimizing, but there is absolutely a lot left on the table.
I think with some good optimization you could reduce the runtime significantly, especially on modern hardware.
As for RAM limitations: you are correct. The only limitation is that with the RAM I have, I would need to do more iterations. It would be possible to do this in much less RAM with more iterations, or the reverse, of course.
I was fond of solving the problem in RAM just as a way to limit the scope of the problem... but SSDs are indeed pretty fast at streaming data like this.
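To make the RAM-versus-iterations trade-off concrete, here is a minimal sketch in Python. It is not the code from the post: the task (finding k-digit windows that repeat in a digit string) and the names `windows` and `repeated_windows` are assumptions picked purely for illustration. Each pass keeps only the keys that fall in one partition, so the hash table holds roughly 1/passes of the keys at the cost of rescanning the data once per pass.

```python
from typing import Iterator


def windows(digits: str, k: int) -> Iterator[str]:
    """Yield every k-digit sliding window of the digit string."""
    for i in range(len(digits) - k + 1):
        yield digits[i:i + k]


def repeated_windows(digits: str, k: int, passes: int) -> set[str]:
    """Return the k-digit windows that occur more than once.

    The data is scanned `passes` times. On pass p only windows whose
    numeric value is congruent to p modulo `passes` are stored, so the
    in-memory set is roughly `passes` times smaller than in a single scan.
    """
    repeated: set[str] = set()
    for p in range(passes):
        seen: set[str] = set()  # holds only this pass's partition of keys
        for w in windows(digits, k):
            if int(w) % passes != p:
                continue  # belongs to another pass
            if w in seen:
                repeated.add(w)
            else:
                seen.add(w)
    return repeated


if __name__ == "__main__":
    sample = "1415926535897932384626433832795"
    # One pass with a big table, or four passes with smaller tables:
    # the answer is the same either way.
    print(repeated_windows(sample, 2, 1) == repeated_windows(sample, 2, 4))
```

With a real file the inner loop would read from disk on every pass instead of from a string, which is where fast sequential reads from an SSD help.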
How odd. You have to read and copy the data from the web to your hard disk, then to the SSD. Why not just run the program on the web copy, streaming? That needs only a tiny amount of RAM for all the hash tables, and you do not need to move the data around. Multithreading would improve the runtime, but reading the table is the bottleneck.
He makes a great optimization by using hash tables, but I was wondering about optimizing the evaluation of the Chi-Square Test for Equal Proportions.
If the Chi-Square test diverges from the Chi-Square test for e, then there is little likelihood of pi+e ever converging on rationality.
My hypothesis is that because Pi is cyclical and e is exponentially transcendental, their sum and product are not rational, and neither are their respective powers (Pi^e and e^Pi), but those will take a much larger homelab than I have access to.
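For the Chi-Square Test for Equal Proportions mentioned above, here is a rough sketch of how the statistic could be evaluated on digit frequencies without holding the digits in memory. It assumes the test is applied to the ten decimal digits with equal expected counts. The function names are made up for illustration, and this is not necessarily how the original post evaluates it. Counts are accumulated chunk by chunk, so it works on a stream.

```python
from collections import Counter
from typing import Iterable

DIGITS = "0123456789"

# Standard chi-square critical value for 9 degrees of freedom at alpha = 0.05.
CRITICAL_DF9_ALPHA_05 = 16.919


def digit_counts(chunks: Iterable[str]) -> Counter:
    """Accumulate digit frequencies from an iterable of text chunks."""
    counts: Counter = Counter()
    for chunk in chunks:
        # Skip anything that is not a decimal digit (e.g. the decimal point).
        counts.update(ch for ch in chunk if ch in DIGITS)
    return counts


def chi_square_equal_proportions(counts: Counter) -> float:
    """Chi-square statistic against equal expected counts for digits 0-9."""
    total = sum(counts[d] for d in DIGITS)
    if total == 0:
        raise ValueError("no digits counted")
    expected = total / 10
    return sum((counts[d] - expected) ** 2 / expected for d in DIGITS)


if __name__ == "__main__":
    # Toy input only; real use would feed chunks read from a file or stream.
    counts = digit_counts(["3.14159265358979", "32384626433832795"])
    stat = chi_square_equal_proportions(counts)
    print(stat, stat > CRITICAL_DF9_ALPHA_05)
```

Since only ten counters are kept, memory use is constant no matter how many digits are streamed through. A statistic above the critical value would reject equal proportions at the 5% level.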