I built a few multi-agent systems and went down a rabbit hole before reaching an important conclusion: from the perspective of the LLM, the prompt/context is the only thing that ever matters. Everything about how your agent behaves ultimately boils down to it.
I had a bunch of fancy stuff like agents collaborating by passing messages and interpreting them with their own prompts and function calls. Then I realized I could collapse all of my "agents" into one dynamic prompt that tracks state in a stupid simple text region. Passing messages around amounted to playing in very expensive traffic at the end of the day.
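A minimal sketch of what that collapse can look like, assuming a generic complete() call standing in for whatever chat-completion API you use (the STATE block and the <state> tag convention are my own, not from the original setup):

    # Minimal sketch: one dynamic prompt with a plain-text state region,
    # instead of separate agents passing messages to each other.
    # `complete(prompt)` is a placeholder for your chat-completion call.

    TEMPLATE = """You are a single worker handling all roles.

    STATE (rewrite this block in full in your reply, inside <state> tags):
    {state}

    TASK:
    {task}

    Reply with <state>...</state> followed by your next action."""

    def step(complete, task: str, state: str) -> str:
        reply = complete(TEMPLATE.format(state=state, task=task))
        # Pull the rewritten state block back out; everything else is the action.
        if "<state>" in reply and "</state>" in reply:
            state = reply.split("<state>", 1)[1].split("</state>", 1)[0].strip()
        return state

    # Loop step() until the state says the task is done; no message bus,
    # no per-agent histories -- the state text *is* the system's memory.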
This is ultimately about moving information, and spinning up an entire matrix of "agents" to process a stream of info from A to B seems quite wasteful when many clear alternatives exist.
If we are seeking emergence, then perhaps the multi-agent mental model still fits better. But for practical, targeted solutions, I think it's a huge distraction.
The best analogy I can think of is Leonard's method in the film "Memento". If you want your agent to accomplish something without persisting a long chat history, and instead use the agent to reorganize and rewrite the prompt itself, that is essentially his approach. Due to his condition, Leonard cannot form new memories and struggles to recall anything that happens after his injury. He knows this, so he uses tattoos to record facts "between sessions". Each chat completion of the LLM is like one of Leonard's sessions, and the prompt, maintained with the help of the agent, persists across completions the way his tattoos, notes, and photos do.
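Concretely, the "tattoo" version of that loop might look like the sketch below; the NOTES/DONE markers and the rewrite instruction are assumptions for illustration, not a fixed recipe:

    # Memento-style loop: no chat history is kept. Each completion sees only
    # the current notes and must hand back an updated set of notes, the same
    # way Leonard's tattoos carry facts between sessions.

    def memento_loop(complete, goal: str, notes: str = "", max_steps: int = 10) -> str:
        for _ in range(max_steps):
            prompt = (
                f"Goal: {goal}\n\n"
                f"Your notes from previous sessions (you remember nothing else):\n{notes}\n\n"
                "Do one step of work. Then output NOTES: followed by a rewritten,\n"
                "condensed version of the notes for your next session.\n"
                "If the goal is complete, output DONE: followed by the result."
            )
            reply = complete(prompt)
            if "DONE:" in reply:
                return reply.split("DONE:", 1)[1].strip()
            if "NOTES:" in reply:
                notes = reply.split("NOTES:", 1)[1].strip()
        return notes  # ran out of steps; return whatever was recorded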
That's true if you're passing messages between identical models. There's still a question of whether different models trained for different tasks would beat a single multipurpose model, though. My gut feeling is that eventually multipurpose models will win, because you don't pay the embedded cost of relearning what syntactic structure is in every specialist, but for a given training time and number of weights it's not clear that holds today.
Yeah, same principle. If you're passing messages between things that will react exactly the same to the same prompt, there's not a lot of point (unless the parallelism is important). If you've got fine-tunes, the whole point is that they will be better at some questions than the baseline.
Mind you, there's another idea in there: mixture of experts implemented by deciding which fine-tune to load based on the prompt itself... I'm sure that's been looked at.
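As a rough sketch of that routing idea (the classifier prompt, the category labels, the model names, and the complete(model=..., prompt=...) call signature are all made up for illustration):

    # Sketch of "mixture of experts by fine-tune selection": a cheap
    # classification pass decides which fine-tuned model handles the prompt.
    # The model names and categories here are placeholders, not real endpoints.

    FINETUNES = {
        "code": "base-model-code-ft",
        "math": "base-model-math-ft",
        "general": "base-model",
    }

    def route(complete, prompt: str) -> str:
        labels = ", ".join(FINETUNES)
        label = complete(
            model="base-model",
            prompt=f"Classify this request as one of [{labels}]. Answer with one word:\n{prompt}",
        ).strip().lower()
        model = FINETUNES.get(label, FINETUNES["general"])
        return complete(model=model, prompt=prompt)

    # Unlike a trained router inside a real MoE layer, this routing happens
    # once per request at the model level, not per token inside the network.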
Maybe the LLM needs something like Dante's Divina Commedia, where previous instances describe in condensed form why a previous prompt's conclusions failed, so it can successfully navigate around the trained-in local minima. A diary of failures to keep track of how to reach success.
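A tiny sketch of that "diary of failures" idea; the summarization prompt and the external check() verifier are my own guesses at a format, not anything established:

    # Sketch of a failure diary: after each failed attempt, ask the model to
    # condense *why* it failed, and prepend that diary to the next attempt so
    # later instances can steer around the same dead ends.

    def attempt_with_diary(complete, check, task: str, max_attempts: int = 5) -> str:
        diary = []  # condensed post-mortems of earlier attempts
        for _ in range(max_attempts):
            history = "\n".join(f"- {entry}" for entry in diary) or "- (none yet)"
            answer = complete(
                f"Task: {task}\n"
                f"Lessons from failed attempts:\n{history}\n"
                "Avoid repeating those mistakes. Give your answer."
            )
            if check(answer):  # external verifier: tests, a grader, a human, etc.
                return answer
            diary.append(complete(
                f"This answer to '{task}' was wrong:\n{answer}\n"
                "In one sentence, state why the approach failed."
            ).strip())
        raise RuntimeError("no successful attempt; diary:\n" + "\n".join(diary))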