What is Anthropic doing and who is their customer? AFAIK they over-censor their models, have subpar quality even compared to Llama 2 and Falcon 180B, and have generally been even more closed than OpenAI. Their pricing isn't competitive, there's no function calling, and their long context window isn't that accurate.
There's enough room in the market, and the tech is early enough, that little paying fandoms can pop up anywhere. As with crypto, all the hopefuls believe their chosen AI will dominate the others and become the AI Godhead (loosely).
I haven’t used their service, so I don’t know what you mean by “over-censor,” but there is a very large commercial opportunity for a model that can handle business tasks. Such models would benefit from being prudish and carefully asking questions before inferring anything. If you explicitly market those as replacements for human assistant work, pricing isn’t that much of a problem.
The lack of function calling is a concern that I’m assuming they are addressing. Why go on Google Sheets otherwise?
The context window is a concern, as business cases typically rely on long prefixes to any prompt containing the business context.
I asked Claude to help me research the fictional world of an established IP, and it refused, saying it was a fact-based tool and wasn't made to work with fictional content.
¯\_(ツ)_/¯
An established IP's world is fact from my point of view. I got very little traction there compared to ChatGPT, which gave a no-friction response that I could build upon.
Do we need some way to grade these services based on vertical or use-case?
I actually tried the same tech questions on multiple services when I first started playing around with these commercial LLMs. I would copy and paste the same question into GPT4, MS Bing (I soon stopped using that since I already have a sub to GPT4), Claude, Bard, and recently You (https://you.com). While Claude.ai was rarely as good as GPT4, it wasn't too far off for tech questions.
I'm not very creative, so maybe using it for writing fiction or roleplay would help me; I haven't tried that yet.
Did you try Claude with non-fictional tasks, and if so, how does that compare to GPT4?
I did not try Claude on a research task based on non-fictional content.
I think it's good that LLMs become specialized tools that can go deep into their expertise. I just think 'a fact engine' -- if that's what Claude is aiming to be -- needs to have correctly rigid controls on what defines fact. From that POV, I think I agree with the 'over-censored' label for Claude earlier in the thread... The intention may not be censorship, but if the LLM is so gun-shy about what is fact vs. not, it's going to have a really narrow (and therefore potentially unreliable) lens.
I've had good results with Claude. Sometimes better than chatGPT. Never been censored, but that's probably just cause I ask it to teach me about stuff.
Claude is great, you can upload PDFs of research papers, slides and very long documents and it will spit out decent attempts at implementation code and summaries.
It's curious that Anthropic is entering the LLMOps tooling space -- this definitely comes as a surprise to me, as both OpenAI and HuggingFace seem to avoid building prompt engineering tooling themselves. Is this a business strategy of Anthropic's? An experiment? Regardless, it's cool to see a company like them throw their hat into the LLMOps space beyond being a model provider. Interested to see what comes next.
I read somewhere that people would buy an electric motor with various attachments, instead of buying home appliances that contain electric motors like we do now.
If the motor is the expensive part, that makes sense.
In a way, that’s still what some cheap crates of DIY equipment do: they have a drill, a sander, a vibrating polisher, etc. that are all extensions around a handle/motor core, and a couple of batteries.
Models are infrastructure on which applications are built, but if you don't have both the clearly best models and a moat that prevents competitors from catching up, the people making money aren't going to be you selling commodity models, but the people selling applications which can even be model agnostic.
And Claude, while it may be the nearest competitor among paid models to OpenAI GPT-4, isn't both clearly best and moat-protected.
Spreadsheet people are the world's most common developers. A spreadsheet is just a programming environment where every memory address has a visual representation on-screen.
I'm doing spreadsheets right now. If it needs to be dynamic and you don't want to use scripts but rather plain Excel formulas w/o the fancy functions introduced with Excel 2010, it's pure programming with a limited syntax :)
Why are they even in business yet?