I haven't evaluated the judge benchmark. You have everything you need in the repo to do so, though, so be my guest. It took me a bit of time to put all this together, and I won't have much more time to dedicate to it for a couple of weeks.
BTW, if you explore the repo, sorry for all the French files...
I have a very nice grinder: a Solis Caffissima digital coffee grinder. I think it is available under a different brand name in the US.
I make filter coffee with a very basic earthenware filter holder and Melitta filters, which are high quality yet very ordinary. Sometimes I mix it up with an AeroPress, which gives a different taste because of its low-acidity way of making coffee. I just drip the coffee into a nice thermos so I can make 4 cups in one go and pour from the thermos.
My coffee is much nicer than what I get in most places, both professional and in people’s homes, and it doesn’t cost me a lot in effort, money and, very importantly, workspace footprint.
Espresso machines require a lot of space and maintenance, and espresso is a lot of trouble to make.
Having said all this, I am quite intrigued by all the stories about the negative effects of coffee. I thought it was only about affecting sleep; I had never thought about the memory and mood effects. I will study this some more in the coming months.
Does this have a unified API? In playing around with some of these, including unified libraries for working with various providers, I've found that you are, at some point, still forced to do provider-specific work for things such as setting temperatures, setting reasoning effort, setting tool choice modes, etc.
What I'd like is for a proxy or library to provide a truly unified API where it will really let me integrate once and then never have to bother with provider quirks myself.
Also, are you planning on doing an open-source rug pull like so many projects out there, including litellm?
You should be concerned about a government issuing these ridiculous and dangerous controls on what you can do in society, not about whether, within that dystopia, it is fair to submit in one way or another.
Also, kids understand perfectly well that different parents have different rules.
I don’t think the government or Apple should be responsible for protecting you from mopey teenagers by blocking free internet access for everyone just so that it “is fair”. Are you even hearing yourself?
Yeah, Claude/Cursor already have tools to access the browser. What I’m missing is a tool to inspect the iOS simulator the same way. Is there a tool for that yet? The Xcode MCP wasn’t really helpful.
I find that incredibly buggy. Literally 50%+ of the time CC complains it can’t connect to the MCP. When it does work it can be magical, but my success rate is tiny. I’m not going to restart all my Chrome windows every time I turn around because CC can’t talk to it for some unknown reason, especially since I’ve restarted Chrome before and CC still couldn’t connect.
There should be a better way to restart just the MCP.
Are these comments from 2018? 'Pro' models of iPhones have been $999 or more, not adjusted for inflation, at their lowest tier since 'Pro' has been a thing. I would expect the same of a Samsung 'Ultra' flagship?
No one sane buys it at list price. At launch there are always various discounts. I got the S25U for 800 (without sending in my old phone, just some coupons) with a 5 EUR/month contract last year at launch. If it really lasts 7 years, it's not even that expensive.
It's market segmentation. Instead of charging what it's worth, maybe 700 EUR, they charge 1700 so they can get away with charging you 800 EUR on "discount", dragging up the prices of the entire lineup below it.
I really like this benchmarking. Have you evaluated the judge benchmark somehow? I'd love to set up my own similar benchmark.