A lot of companies are already using projects like chatbot-ui with Azure's OpenAI for similar local deployments. Given this is about as close to a local ChatGPT as any project currently gets, it's a huge deal for all those enterprises looking to maintain control over their data.
Shameless plug: Given the sensitivity of the data involved, we believe most companies prefer locally installed solutions to cloud-based ones, at least in the initial days. To this end, we just open sourced LLMStack (https://github.com/TryPromptly/LLMStack), which we have been working on for a few months now. LLMStack is a platform for building LLM apps and chatbots by chaining multiple LLMs and connecting them to the user's data. A quick demo is at https://www.youtube.com/watch?v=-JeSavSy7GI. It's still early days for the project and there are a few kinks to iron out, but we are very excited about it.
Quality and depth of particular types of training data is one difference. Another is the inference-tracking mechanisms within and across single-turn interactions (e.g., what does the human user "mean" by their prompt, what is the "correct" response, and how best can I return the "correct" response for this context; how much information do I cache from previous turns, and how much of it, if any, is relevant to the current turn).
With Louie.ai, there is a lot of work on specialization for the job, and I expect the same for others. We help with data analysis, so connecting enterprise & common data sources & DBs, hooking up data tools (GPU visuals, integrated code interpreter, ...), security controls, and the like, which is different from, say, a ChatGPT for lawyers or a straight-up ChatGPT UI clone.
As soon as the goal moves beyond just text2gpt2screen, e.g., multistep data wrangling & viz in the middle of a conversation, most tools technically struggle. Query quality also comes up, whether it's the quality of the RAG, the fine-tune, the prompts, etc.: each solves different problems.
I see this as more of a 'Migration problem'. Why is this offered as a SaaS as opposed to a consulting service?
The code to organize and vectorize the documentation and endpoints, and to run it through a variety of models with prompting techniques like two-shot examples, etc., is going to be highly customized. The base code there is not exactly trivial, but anyone who reads through the LlamaIndex docs can do it.
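For reference, a minimal sketch of that base code, assuming LlamaIndex's document loaders and vector index; the directory path and query string are placeholders:

    # A minimal sketch of the "base code" described above, assuming LlamaIndex
    # is the indexing library; the directory path and query are placeholders.
    from llama_index import SimpleDirectoryReader, VectorStoreIndex

    # Load the documentation and build a vector index over it
    documents = SimpleDirectoryReader("./docs").load_data()
    index = VectorStoreIndex.from_documents(documents)

    # Query the index with whatever LLM is configured (OpenAI by default)
    query_engine = index.as_query_engine()
    print(query_engine.query("How do I authenticate against the API?"))

The customization is mostly in what sits around this: chunking strategy, prompt templates, and which models you route queries through.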
Then it's just run-of-the-mill, analyst-level integration that you provide to the client on a T&M or fixed-price basis.
I agree there's room for consulting, but as a new field, there's a lot of software currently missing for each vertical. Today, that's manual labor by consultants, but as the field matures... consultants should be doing things specialized to the specific customer, not what can be amortized across adjacent verticals. Top software engineers investing in software over time deliver substantially more in substantially less time, and consultants should be integrating that, not competing head-on.
> we believe most companies prefer locally installed solutions to cloud based ones
We've also seen a strong desire from businesses to manage models and compute on their own machines or in their own cloud accounts. This is often one half of a hybrid strategy, with hosted API products like OpenAI used for rapid prototyping.
The majority of (though not all) businesses we've seen tend to be quite comfortable using hosted API products for rapid prototyping and for proving out an initial version of their AI functionality. But in many cases, they want to complement that with the ability to manage models and compute themselves. The motivation here is often to reduce costs by using smaller / faster / cheaper fine-tuned open models.
When we started Anyscale, customer demand led us to run training & inference workloads in our customers' cloud accounts. That way your data and code stay inside your own cloud account.
Now with all the progress in open models and the desire to rapidly prototype, we're complementing that with a fully-managed inference API where you can do inference with the Llama-2 models [1] (like the OpenAI API but for open models).
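For example, since the API is OpenAI-compatible, a call might look roughly like the following; the base URL and model identifier below are assumptions for illustration, not documented values:

    # A rough sketch assuming an OpenAI-compatible endpoint; the base URL and
    # model name here are assumptions for illustration, not documented values.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.endpoints.anyscale.com/v1",  # assumed endpoint URL
        api_key="YOUR_API_KEY",
    )

    resp = client.chat.completions.create(
        model="meta-llama/Llama-2-70b-chat-hf",  # assumed model identifier
        messages=[{"role": "user", "content": "Summarize RAG in one sentence."}],
    )
    print(resp.choices[0].message.content)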
There is a generic HTTP API processor that can be used to call APIs as part of the app flow, which should help with invoking tools. We're currently working on improving the documentation so it's easier to get started with the project. We also have some features planned around function calling that should make it easy to natively integrate tools into the app flows. A rough sketch of the general pattern is below.
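To illustrate the kind of tool invocation described above (this is not LLMStack's actual processor configuration), here's a hypothetical sketch that combines OpenAI function calling with a plain HTTP request; the tool name, endpoint URL, and model are made up for the example:

    # Hypothetical sketch of invoking an HTTP API as a tool from an app flow.
    # This is NOT LLMStack's processor API; the tool name, endpoint URL, and
    # model are made up to illustrate the pattern.
    import json
    import requests
    from openai import OpenAI

    client = OpenAI()

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Fetch current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    messages = [{"role": "user", "content": "What's the weather in Berlin?"}]
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo", messages=messages, tools=tools
    )

    # If the model decided to call the tool, make the actual HTTP request
    call = resp.choices[0].message.tool_calls[0]
    args = json.loads(call.function.arguments)
    weather = requests.get("https://example.com/weather", params=args).json()

    # Feed the tool result back so the model can produce the final answer
    messages.append(resp.choices[0].message)
    messages.append({"role": "tool", "tool_call_id": call.id, "content": json.dumps(weather)})
    final = client.chat.completions.create(
        model="gpt-3.5-turbo", messages=messages, tools=tools
    )
    print(final.choices[0].message.content)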
Interesting project - was trying it out and found an issue when building the image - I've opened an issue on GitHub, please take a look. Also, do you have plans to support Llama in addition to the OpenAI models?