Hacker News

Why do the three models have different system prompts? And why is Sonnet's longer than Opus's?


They're currently on the previous generation for Opus (3); it's kind of forgetful and has a worse accuracy curve, so it can handle fewer instructions than Sonnet 3.5. Although I feel they may have cheated with Sonnet 3.5 a bit by adding a hidden temperature multiplier set to < 1, which made the model punch above its weight in accuracy, improved the lost-in-the-middle issue, and made instruction adherence much better, but also made generation variety and multi-turn repetition way worse. (Or maybe I'm entirely wrong about the cause.)


Wow, this is the first time I've heard of such a method. Anywhere I can read up on how the temperature multiplier works and what the implications/effects are? Is it just changing the temperature based on how many tokens have already been processed (i.e. the temperature is variable over the course of a completion spanning many tokens)?


Just a fixed multiplier (say, 0.5) that makes you use half of the range. As I said, I'm just speculating. But Sonnet 3.5's temperature definitely feels like it doesn't affect much. The model could be overfit, and that might be the cause instead.
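To make the speculation concrete: a hidden multiplier would just rescale the user-supplied temperature before the softmax, so even "temperature 1.0" behaves like 0.5. A minimal sketch (the `hidden_multiplier` value and function name are hypothetical, taken from the guess above, not a confirmed Anthropic setting):

```python
import numpy as np

def sample_with_hidden_multiplier(logits, user_temperature,
                                  hidden_multiplier=0.5, rng=None):
    """Sample a token id, silently rescaling the user's temperature.

    hidden_multiplier < 1 sharpens the distribution: probability mass
    concentrates on the top tokens, so outputs get more accurate and
    consistent but less varied, as described in the comment above.
    """
    rng = rng or np.random.default_rng()
    effective_t = max(user_temperature * hidden_multiplier, 1e-6)
    scaled = np.asarray(logits, dtype=float) / effective_t
    scaled -= scaled.max()          # subtract max for numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

With `hidden_multiplier=0.5`, a user asking for temperature 1.0 effectively samples at 0.5, which would explain why turning the temperature knob "doesn't feel like it affects much": the usable range is compressed.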


Prompts tend not to be transferable across different language models.



