> Well, yes, but not all models need to be "super large." Smaller models, specialized in specific tasks, working together and then reporting to a slightly larger model, are the way to go.
I want to believe, but I have yet to see this kind of setup come anywhere near GPT-4 level.
The weather example seems quite contrived. Why not just display the alerts for your area? Why is a complex system of smaller models reporting up to a slightly larger model necessary?
Because "Flood warning on roads" is very different than "hey there is a possible tornado/aftershocks/*" and I'd like the 2nd one to take up the entire home-assistant dashboard.
I can either code it myself or let the model figure it out by passing it the YAML file for the dashboard.
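As a rough illustration of that second option, here's a minimal sketch. It assumes a small model served locally through Ollama's `/api/generate` endpoint; the model name, prompt wording, and `classify_alert` helper are illustrative assumptions, not part of any actual Home Assistant integration.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumes a local Ollama instance


def classify_alert(alert_text: str, dashboard_yaml: str) -> str:
    """Ask a small local model whether an alert should take over the dashboard.

    Returns "takeover" for severe alerts (tornado etc.) or "card" for
    routine ones (road flooding), matching the distinction above.
    """
    prompt = (
        "You manage a Home Assistant dashboard defined by this YAML:\n"
        f"{dashboard_yaml}\n\n"
        f"A weather alert just arrived:\n{alert_text}\n\n"
        "Answer with exactly one word: 'takeover' if the alert is severe "
        "enough to fill the entire dashboard, otherwise 'card'."
    )
    resp = requests.post(
        OLLAMA_URL,
        json={"model": "llama3.2:3b", "prompt": prompt, "stream": False},
        timeout=30,
    )
    resp.raise_for_status()
    answer = resp.json()["response"].strip().lower()
    return "takeover" if answer.startswith("takeover") else "card"
```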
> Why is a complex system of smaller models reporting up to a slightly larger model necessary?
Because, as the other poster said, cost and speed. Thousands of queries every day (potentially) at 30 cents per million tokens is very different from 15 dollars per million tokens.
Because cost and speed. Smaller models can run on your phone for free, or in the cloud for pennies. An API call to a large LLM with a lot of context can cost orders of magnitude more and incur network latency.
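To put those per-token prices in perspective, a quick back-of-envelope calculation. The two prices are the ones quoted above; the query volume and tokens-per-query figures are assumptions for illustration:

```python
# Monthly cost comparison. The per-million-token prices come from the
# thread above; the volume figures below are assumed.
TOKENS_PER_QUERY = 2_000   # prompt + response, assumed
QUERIES_PER_DAY = 3_000    # "thousands of queries every day", assumed

tokens_per_month = TOKENS_PER_QUERY * QUERIES_PER_DAY * 30  # 180M tokens

small = tokens_per_month / 1e6 * 0.30   # $0.30 per million tokens
large = tokens_per_month / 1e6 * 15.00  # $15.00 per million tokens

print(f"small model: ${small:,.2f}/month")  # small model: $54.00/month
print(f"large model: ${large:,.2f}/month")  # large model: $2,700.00/month
```

At those prices the gap is a flat 50x, so whatever volume you assume, the large model costs fifty times as much for the same traffic.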