Interesting. Llama was trained on 16K GPUs, so it probably took them around a quarter. An hour of GPU use costs $2-$3, so training a custom solution on top of Llama should cost at least $15K and could run up to $1M. I am trying to get started with this. A few people suggested 2 GPUs as a good start, but I think that would only be enough for about 10K training samples.
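
For anyone who wants to redo the back-of-the-envelope math, here is a rough sketch in Python. It only uses the figures from this comment ($2-$3 per GPU-hour, 16K GPUs, about a quarter). Treating "16K" as 16,000 GPUs and "a quarter" as 90 days of continuous use are my assumptions, and the function names are made up for illustration.

```python
HOURS_PER_DAY = 24


def gpu_hours(num_gpus, days):
    """Total GPU-hours if num_gpus run continuously for the given number of days."""
    return num_gpus * days * HOURS_PER_DAY


def cost_range(total_gpu_hours, low_rate=2.0, high_rate=3.0):
    """(min, max) cost in USD at the assumed $2-$3 per GPU-hour."""
    return total_gpu_hours * low_rate, total_gpu_hours * high_rate


def affordable_gpu_hours(budget_usd, low_rate=2.0, high_rate=3.0):
    """(min, max) GPU-hours a budget buys; the higher rate buys fewer hours."""
    return budget_usd / high_rate, budget_usd / low_rate


# Assumption: 16,000 GPUs running for 90 days (roughly one quarter).
pretrain_hours = gpu_hours(16_000, 90)
pretrain_low, pretrain_high = cost_range(pretrain_hours)
print(f"Pretraining: {pretrain_hours:,.0f} GPU-hours, "
      f"${pretrain_low:,.0f} to ${pretrain_high:,.0f}")

# What the $15K and $1M budgets mentioned above would buy.
for budget in (15_000, 1_000_000):
    fewest, most = affordable_gpu_hours(budget)
    print(f"${budget:,} buys {fewest:,.0f} to {most:,.0f} GPU-hours")
```

Under those assumptions, $15K works out to roughly 5,000 to 7,500 GPU-hours, which is a tiny fraction of the full pretraining run. This says nothing about how many training samples fit into that time, since that depends on the model size and setup.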