As opposed to inference (like generating text and images), training requires higher-precision math (fp16 or bf16, not the heavily quantized formats you can often get away with for inference) plus gradient computation, so a single CPU generally won't cut it.
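For a rough sense of what the precision bit means in practice, here's a minimal PyTorch sketch using bf16 autocast (the layer and shapes are made up, not from the linked repo):

    # Minimal sketch: mixed-precision compute with bf16 autocast.
    # Matmuls run in bf16; this is the kind of math a GPU (or a CPU
    # with bf16 support) does during training.
    import torch

    layer = torch.nn.Linear(128, 128)   # placeholder layer
    x = torch.randn(8, 128)

    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        out = layer(x)

    print(out.dtype)  # torch.bfloat16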
The prepare/train/generate instructions in the github linked are pretty much it for the 'how' of training a model. You give it a dataset and a task, it grinds through however many epochs you configure (feels like a billion trillion), and it saves checkpoints incrementally along the way (or not, if you don't ask it to). Roughly like the sketch below.
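A minimal PyTorch sketch of that loop, with dummy data standing in for whatever the prepare step produces (none of this is the repo's actual code):

    # Toy training loop with incremental checkpoints.
    import torch

    model = torch.nn.Linear(128, 2)     # stand-in for the real model
    loader = [(torch.randn(8, 128), torch.randint(0, 2, (8,)))
              for _ in range(10)]       # stand-in for the prepared dataset
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.CrossEntropyLoss()

    for epoch in range(3):              # not quite a billion trillion
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()             # the gradient math that needs fp16/bf16 + GPU at scale
            opt.step()
        # the "saves the changes incrementally" part:
        torch.save(model.state_dict(), f"ckpt_epoch{epoch}.pt")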
Training a LoRA for an image model may be more approachable; there are more blog posts etc. on it, and the process is largely similar, except you're only training small low-rank adapter matrices bolted onto the frozen base weights instead of updating the whole network.
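The core LoRA idea in a few lines of PyTorch (my own sketch of the technique, not any particular library's API): freeze the base weight, learn only a low-rank update B @ A on top of it.

    # Minimal LoRA sketch: base layer frozen, only A and B train.
    import torch

    class LoRALinear(torch.nn.Module):
        def __init__(self, base: torch.nn.Linear, rank: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False   # the "whole network" stays frozen
            self.A = torch.nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
            self.B = torch.nn.Parameter(torch.zeros(base.out_features, rank))
            self.scale = alpha / rank

        def forward(self, x):
            # base output + low-rank correction
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

    layer = LoRALinear(torch.nn.Linear(512, 512))
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(trainable)  # only the tiny adapter trains, not the 512x512 base

Starting B at zero means the adapter initially changes nothing, so training starts from the base model's behavior.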
[edit] I'm also learning so correct me if I'm off, hn!