> For me, it was simply a gut feeling. I’ve been talking to founders and doing deep dives into technology companies for decades. It’s been my entire professional life as a writer. And because of that experience, there must be a pattern-matching algorithm churning away somewhere in my subconscious. I don’t know how I know, I just do. SBF is a winner.
This is a fantastic insight into journalists. They need to fit their story into a narrative: if you match their stereotype, they'll spin whatever tale they can imagine to convince themselves it's really true. The level of self-deception required is hard for me to comprehend.
I've just tried making a loop in a jit-compiled function and it just worked:
>>> import jax
>>> def a(y):
...     x = 0
...     for i in range(5):
...         x += y
...     return x
...
>>> a(5)
25
>>> a_jit = jax.jit(a)
>>> a_jit(5)
DeviceArray(25, dtype=int32, weak_type=True)
It definitely works; JAX only sees the unrolled loop:
x = 0
x += y
x += y
x += y
x += y
x += y
return x
The reason you might need `jax.lax.fori_loop` or some such is if you have a long loop with a complex body. Replicating a complex body many times means you end up with a huge computation graph and slow compilation.
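For comparison, here is the same function written with `jax.lax.fori_loop` (a sketch; the lambda body and names are my own): the loop body is traced once, regardless of the trip count, so the computation graph stays small even for long loops.

```python
import jax

def a_fori(y):
    # body_fun takes (loop index, carry) and returns the new carry;
    # JAX traces it a single time instead of replicating it per iteration
    return jax.lax.fori_loop(0, 5, lambda i, x: x + y, 0)

a_jit = jax.jit(a_fori)
print(a_jit(5))  # same result as the unrolled version
```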
Fused into one operation since the Tensor isn't resolved until I call .numpy()
kafka@tubby:/tmp$ cat fuse.py
from tinygrad.tensor import Tensor
x = Tensor.zeros(1)
for i in range(5):
    x += i
print(x.numpy())
kafka@tubby:/tmp$ OPT=2 GPU=1 DEBUG=2 python3 fuse.py
using [<pyopencl.Device 'Apple M1 Max' on 'Apple' at 0x1027f00>]
**CL** 0 elementwise_0 args 1 kernels [1, 1, 1] None OPs 0.0M/ 0.00G mem 0.00 GB tm 0.15us/ 0.00ms ( 0.03 GFLOPS)
**CL** copy OUT (1,)
[10.]
What happens when your model exhibits a discriminatory bias? How do you find out what is going wrong? Knowing what the model pays attention to can be pretty helpful.
Didn't all recommendation engines move to two-tower-style models? I remember that they "solved" the freshness problem (i.e., when you add a new item to your catalog, how do you recommend it to users when there are no ratings/interactions yet?). Of course, that only works as long as you have a good model that creates item embeddings.
Regarding time series, hasn't everyone moved to attention-based models?
Not challenging your answer, just curious. I work mostly with Graph NNs and am quite a bit out of touch with the rest of the field.
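The freshness argument can be sketched in plain NumPy (the feature dimensions are made up, and linear towers stand in for the real networks): a brand-new item with zero interactions still gets an embedding from its content features alone, so it can be scored against users immediately.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dims: each tower maps its features into a shared 8-d space.
W_user = rng.normal(size=(16, 8))  # user features -> embedding
W_item = rng.normal(size=(32, 8))  # item content  -> embedding

def user_tower(u):
    return W_user.T @ u  # (8,)

def item_tower(v):
    return W_item.T @ v  # (8,)

# A new catalog item has no ratings/interactions, but its content
# features alone yield an embedding it can be scored with:
user = rng.normal(size=16)
new_item = rng.normal(size=32)
score = user_tower(user) @ item_tower(new_item)  # dot-product relevance
```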
The black-box nature of a neural net is a problem. For model-based design, a bit more accuracy out of a black box doesn't really help when you need, for example, state-space matrices in a control design.
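One hedged sketch of a middle ground (my own example, not something the comment proposes): if the learned dynamics model is differentiable, local state-space matrices can be recovered by linearizing it around an operating point with autodiff, e.g. in JAX. The dynamics function and dimensions below are made up for illustration.

```python
import jax
import jax.numpy as jnp

# Stand-in for a learned dynamics model x_next = f(x, u);
# in practice this would be the trained network.
W = 0.9 * jnp.eye(3)        # state coupling (made up)
G = 0.1 * jnp.ones((3, 2))  # input coupling (made up)

def f(x, u):
    return jnp.tanh(W @ x + G @ u)

x0 = jnp.zeros(3)  # operating point: state
u0 = jnp.zeros(2)  # operating point: input

# Linearize: A = df/dx, B = df/du evaluated at (x0, u0)
A = jax.jacobian(f, argnums=0)(x0, u0)  # 3x3 state matrix
B = jax.jacobian(f, argnums=1)(x0, u0)  # 3x2 input matrix
```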