With only a modicum of trolling here, I wonder what percentage of that training expense was used to identify and avoid "true things that must be muted because they offend someone".
Ignoring the subtext of "true things that must be muted because they offend someone", there's a whole section in the paper on how they didn't filter and the problems that causes. TL;DR:
> We observe that toxicity increases with the size of the model, especially for Respectful prompts.
It does slightly outperform GPT-3 in terms of observed bias against protected groups (as in, it is slightly less biased), but not substantially so.