I've experimented with this and one approach is to avoid the simple chat interface. Let the game be the "user" and have it relay the player's text. Something like
<<< We're in this situation, I'm the game master, and the player said "xyz". I need your help to handle this request according to the rules of the game. >>>
Then the LLM is directing the obedience towards the game master and the rules, rather than the player.
<<< We're in this situation, I'm the game master, and the player said "xyz". I need your help to handle this request according to the rules of the game. >>>
Then the LLM is directing the obedience towards the game master and the rules, rather than the player.