The EU has a copyright exemption for noncommercial model training, and the UK, at least, is moving to extend that to commercial model training as well, with no opt-out required. So it appears they have the legal right to use anything.
Why should you want a model designed to know all human knowledge to know only public-domain knowledge?