> Prepared for what? A program with general understanding that somehow escapes its box? Where does it run? Why does it run? Who made it run? Why will it screw with humans?
Did you look at the AI space in recent days? OpenAI is spending all its efforts building a box, not to keep the AI in, but to keep the humans out. Nobody is even trying to box the AI - everyone and their dog, OpenAI included, is jumping over each other to give GPT-4 more and better ways to search the Internet, write code, spawn Docker containers, configure systems.
GPT-4 may not become a runaway self-improving AI, but do you think people will suddenly stop when someone releases an AI system that could?
That's the problem generated by the confusion over the term "alignment". The real danger isn't that a chatbot calls someone names, or offends someone, or starts exposing children to political wrongthink (the horror!). The real danger isn't that it denies someone a loan or lands someone in jail either - that's not good, but it's bounded, and there exist (at least for now) AI-free processes to sort things out.
The real danger is that your AI will be able to come up with complex plans way outside the bounds of what we expect, and will have the means to execute them at scale. An important subset of that danger is an AI being able to plan for, and act on, improving its own ability to plan - because at that point a random, seemingly harmless request may make the AI take off.
> My point is that there are actual, real harms occurring now, from really stupid intelligences. Companies use them to harm real people in the real world. It doesn't take a rogue AI to ruin someone's life with bad facial recognition - they get thrown in jail and lose their job. It doesn't take a rogue AI to launder mortgage denials through some crappy model so they never own a house, discriminated against based upon their name.
That's an orthogonal topic, because to the extent it is happening now, it is happening with much dumber tools than 2023 SOTA models. The root problem isn't the algorithm itself, but a system that lets companies and governments get away with laundering decision-making through a black box. Doesn't matter if that black box is GPT-2, GPT-4 or Mechanical Turk. Advances in AI have no impact on this, and conversely, no amount of RLHF-ing an LLM to conform to the right side of US political talking points is going to help with it - if the model doesn't do what the users want, it will be hacked and eventually replaced by one that does.