There’s literally nothing pointing towards this direction, rather the opposite.
Also the problem is not only judges, but the scope of detective work being done and when the crime was discovered etc. Judging comes last, but I guess we could deploy the infallible robot police squad which will do a flawless crime scene analysis etc.
Thank you for writing this. It’s such a classic example of “do what I say, not what I do”, but in reverse. Why would you ever judge a CEO or company by their statements and not their actions?
Scaremongering is incredibly efficient for marketing. The fact that both players are using it to drive monetary gain is kind of a tell.
They believe there's a non-zero chance of doom, so they would rather have an org that prioritises safety be the one at the frontier, on the assumption (I presume) that there will be a frontier regardless.
I feel it helps for the personality aspect, how it handles answers and general vocabulary, but it doesn’t in any way improve skill level, at least that’s my take from building an assistant.
CSS keeps improving and models still train on legacy. So yes, knowing what’s possible and how is very much needed if you want to do something scalable and maintainable.
Random blog or landing page, not so much.
This post is an extremely good example of how unsuitable agents are for a lot of tasks. Doing all that for a CSS fix is insanity.
It also makes you wonder if Anthropic is actively making their models eat tokens by favoring complexity.
The analogy falters in scope. It should be more like “do you put your entire family and all your friends in different cars, on different highways, and try to remote control them all at the same time, while also driving yourself, facing backwards?”
I think all three of you are quibbling over the risk/reward ratio, and you have different estimates. It's not unreasonable that you're all correct, given your estimates. My estimate is that Tesla FSD is safer in aggregate than human drivers, so I believe it is safer for me to use that than drive. It doesn't get tired, have medical emergencies, get impatient and frustrated, speed, or lose focus because a child shouts. It thinks at the speed of light and can see from eight cameras all around the car, all at the same time. I only have two eyes.
You would also be correct if your risk estimate concluded that Tesla FSD has arguably killed people, makes mistakes humans would not, can glitch, and has no one to hold accountable. For these reasons, you choose not to use it.
I’ve done a few different things, some different presentations, web mockups etc. They all look more or less identical out of the box, and in every single case of a presentation, Claude Design has zero concept of layout boundaries, happily making slides that expand 200% or more out of the visible viewport (I have a lot of content and it just can’t figure out nice ways of presenting it).
There’s still value in it for me, I get decent enough output to convey my intent and I sometimes manually tweak the HTML.
Defining fonts is a good way to at least not get the same typography as everyone else.
Gamers Nexus has a good video on this, but if NVIDIA exits the consumer market (and honestly, why would they stay when they can charge up to 100x for the same wafer space for enterprise?), AMD would likely do the same.
Only Apple really makes consumer hardware suitable for running things locally then, and maybe some weird Qualcomm ARM chip for Windows.
It will be hard running things locally if nobody is supplying the hardware.