Yeah, it almost feels like you are talking to somebody with OCD. The frustrating part is that output tokens are usually a lot more expensive than input tokens, so they are wasting energy and money :-). Also, the more they generate, the greater the chance of attention issues as the conversation progresses.
This is why I built my chat app to let me manipulate LLM responses. If I feel part of a response is not worth keeping, I'll just erase it to ensure the conversation doesn't get sidetracked. Or I will go back to the original user message and modify it to say
### IMPORTANT
- Do not generate more code than required.
The nice thing about LLM conversations is that every time you chat, the LLM treats it as a first-time conversation, so this trick will work if the model is smart enough.
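For anyone curious, the trick is just rewriting the message list before resending it. A minimal sketch, assuming an OpenAI-style Chat Completions API (the model name and message contents below are placeholders, not what my app actually sends):

```python
# Minimal sketch of "edit the transcript before resending".
# Assumes the official openai Python client; contents are illustrative.
from openai import OpenAI

client = OpenAI()

messages = [
    {"role": "user", "content": "Write a function that parses a date string."},
    {"role": "assistant", "content": "<a very long answer with lots of extra code>"},
]

# 1. Trim the parts of the assistant reply that aren't worth keeping,
#    so they don't distract the model on later turns.
messages[1]["content"] = "<just the function that was actually asked for>"

# 2. Or go back and amend the original user message with an explicit constraint.
messages[0]["content"] += "\n\n### IMPORTANT\n- Do not generate more code than required."

# The API is stateless: the model only ever sees the transcript you send,
# so it has no memory of the original, unedited conversation.
response = client.chat.completions.create(model="gpt-4o", messages=messages)
print(response.choices[0].message.content)
```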