sailingparrot 10 days ago | on: Big GPUs don't need big PCs

Just for training and for processing the existing context (the prefill phase). But when generating at inference time, token t has to be sampled before token t+1 can be, so it's still sequential.
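
To make the distinction concrete, here is a rough sketch in Python. The "model" is a made-up deterministic function (`next_token`, `prefill`, `decode`, and `VOCAB_SIZE` are all hypothetical names, not any real library's API); it only illustrates the data dependency, not how an actual transformer computes anything.

```python
# Toy stand-in for a language model (hypothetical, for illustration only).
VOCAB_SIZE = 50

def next_token(context):
    """Deterministically pick the next token from the whole context."""
    return (sum((i + 1) * tok for i, tok in enumerate(context)) + 7) % VOCAB_SIZE

def prefill(prompt):
    """All prompt tokens are known up front, so each position's prediction
    depends only on given inputs; these calls are independent of each other
    and could run in parallel."""
    return [next_token(prompt[: i + 1]) for i in range(len(prompt))]

def decode(prompt, n_new):
    """Each step consumes the token produced by the previous step, so the
    steps of this loop cannot run in parallel."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(next_token(tokens))  # token t+1 waits on token t
    return tokens[len(prompt):]

prompt = [3, 14, 15]
prefill_preds = prefill(prompt)   # one prediction per prompt position
generated = decode(prompt, 4)     # four tokens, produced one at a time
```

In `prefill` every list element can be computed without waiting on any other, which is the part that batches well. In `decode` the input to each call only exists after the previous call returns.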