Hacker News | jmalicki's comments

I have sometimes found "LARPing job roles" to be useful for expectations for the codebase.

Claude is kind of decent at doing "when in Rome" sort of stuff with your codebase, but it's nice to reinforce that, and remind it how to deploy, what testing should be done before a PR, etc.


If you build up and save some of those scripts, skills help Claude remember how and when to use them.

Skills are crazy useful to tell Claude how to debug your particular project, especially when you have a library of useful scripts for doing so.
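As a rough illustration (the format here follows Claude Code's SKILL.md convention as I understand it, and every path and script name below is hypothetical), a debugging skill might look something like:

```markdown
---
name: debug-ingest-pipeline
description: How to debug the ingest pipeline. Use when tests fail under ingest/ or when investigating data-pipeline bugs.
---

# Debugging the ingest pipeline

1. Reproduce locally first: `scripts/repro_ingest.sh <job-id>` replays a failed job against a local fixture.
2. Tail the structured logs with `scripts/tail_ingest_logs.sh` and look for the `stage=` field to find where the job died.
3. Before opening a PR, run `scripts/preflight.sh`, which runs lint plus the ingest integration tests.
---
```

The point is less the exact format than that the scripts get named and tied to concrete situations, so the model reaches for them instead of improvising.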


Mercenary also excludes people who do it for funsies and aren't getting paid.

Does it also exclude researchers?

Only if they keep refusing to pay bug bounties!

Regardless, published papers aren't an authoritative source of truth. Just a note to your friends "hey I did some cool stuff I want to tell you about!"

Sure, it's slightly more reviewed than a GitHub repo, but it's not the end-all be-all.


I can forgive vibe code... it just needs to execute; if it works, it's fine.

Unedited vibe documentation is unforgivable.


Still, jq is run a whole lot more than it used to be due to coding agents, so every bit helps.

The vast majority of Linux kernel performance improvement patches probably have way less of a real world impact than this.


> The vast majority of Linux kernel performance improvement patches probably have way less of a real world impact than this.

Unlikely, given that the number being multiplied by every improvement is far higher than "times jq is run in some pipeline". Even a 0.1% improvement in the kernel probably has a far, far higher impact than this.


Jq is run a ton by AIs, and that is only increasing.

I can't take seriously any talk about performance if the tools are going to shell out. It's just not a bottleneck.

It's not just that - anything working with analog signals benefits hugely from not living inside the complete EM interference nightmare of the computer case.

I see AI pass the Turing test all the time, since humans are constantly being falsely accused of being an AI.

It doesn't mean that AI got good, just that humans are thinking other humans are AI, which is a form of passing the test.

The adversarial version with humans involved is actually easier to pass because of this - real actual humans wouldn't pass your non-adversarial version.


I've seen a fair number of cases where someone swears up and down not to be using AI to generate responses, but there's no good reason to believe it (except perhaps specifically for the messages where that claim is made).

This includes times that someone basically disappeared from e.g. Stack Overflow at some point before the release of ChatGPT, having written a bunch of posts that barely demonstrate functional literacy or comprehension of English; and then came back afterward posting long messages with impeccable grammar and spelling in textbook "LLM house style".


There are a ton of people like that, but the LLM house style also exists because a ton of people write that way too.

The people falsely accused because they've used em-dashes for 20 years aren't the ones that were functionally illiterate before.


It's not just patterns like "not just X, but Y", but also deeper patterns and a kind of narrative cadence. Sure it's also mimicking something real, but usually it's a mismatch between the insightfulness of the content and the quality of the delivery. It feels like chewing on empty calories, it's missing the intentionality and the edge of being human. I guess you need to read a lot of LLM output to get a feel for this beyond the surface level pattern matching.

I wonder whether AI house style is the result of the people training it having no sense of writing style or some kind of technical limitation.

With AI, there is no sense of the level of emphasis matching the meaning of the text, or a long-range dramatic arc - everything is a revelation, like somebody who can only speak in TED talks. Everything is extremely earnest, very important, and presented using the same five flashy language hacks.


> “It’s not just patterns… but also cadence…”

Nice try, ChatGPT.


Thank you!

Your exact post here claiming you can identify AI has all of the hallmarks of the AI detection algorithm you are proposing, in spades.

Hence my claim "actual people actually write like that"


It was a joke. But also, my use of "not X but Y" is not rhetorical but declarative. The whole point is that what many of us are talking about is not simply these surface patterns, but how they are used and how the narrative rhythm of the sentences and paragraphs goes.

> There are a ton of people like that, but the LLM house style also exists because a ton of people write that way too.

Everyone keeps saying this, and I keep asking for a link to this type of writing that is dated pre-2022, and I keep getting nothing.

If it was that common, I'd have gotten at least a few examples by now.


I believe that, much the same way as a fighter jet designed for the average pilot doesn't fit any of them, the 'average' of written text ends up reading like an LLM without being able to find a 100% matching sample.

I certainly used em dashes before 2022, and so did anyone who cares about proper typography.

> I certainly used em dashes before 2022, and so did anyone who cares about proper typography.

Who said anything about em-dashes? There's an entire Wikipedia page documenting the tells, and only one of the 15 or so items is "extended usage of em-dashes".

It's not fluff, it's actual tells.

No nonsense, no BS, just tells.

The key insight is...


I have seen people accuse someone of using AI solely because the comment contained em dashes.

> I have seen people accuse someone of using AI solely because the comment contained em dashes.

Yeah, but you haven't seen me do that :-)

(You also haven't seen anyone on HN recently (since this year) do that either).


I think em-dashes were uncommon mainly because they're not always convenient to type.

I don't think there's any definitive way to check, but for me one of the biggest tells that a long piece of writing was LLM generated is that it will hardly say anything given how many words are in it.

(well that and the "it's not just x, it's y!" pattern they seem to love)


"It's not just X, it's Y" is also something people do!

That is possibly one of my personal writing weaknesses that leads my own writing to get flagged as AI.

I can admit "it's not just x, it's y" is mediocre writing - but it's also something mediocre writers do - it's how AI learned to do it!


But it's also often a shoehorned, artificial contrast that doesn't really make sense. The Y is often not such a different thing from the X as to make it worthy of an actual "not just X but Y" claim. Or the Y is a vague subjective term, or some kind of fancy-word-dropping. It's strong styling with little content, similar to politician CYA talk. I don't think it's necessarily a tech limitation; it's more an effect of deliberate post-training to be middle-of-the-road, nonoffensive, and nonopinionated.

> I can admit "it's not just x, it's y" is mediocre writing.

Eh, there are times when it's entirely justifiable and even good.

The problem has more to do with making it a cliche. (Also, sometimes the X-Y pair is just uncanny in one way or another.)


In one study, GPT-4.5 was judged to be human 73% of the time, which means that the actual human was judged to be human only 27% of the time. More human than human, as Tyrell would say.

Edit: folks, the standard Turing test involves a computer and a human, and then a judge communicating with both and giving a verdict about which one is the human. The percentages for the two entities being judged will add up to exactly 100%. That's how this test was conducted. Please don't assume I'm a moron.


The implication would be that GPT-4.5 was not judged to be human 27% of the time. You can't determine how often humans were judged correctly as humans from that data point.

The structure of the test was that there was one human and one AI conversation partner, and the rater had to choose which one was which.

Given that structure, you can judge from that data point.
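The forced-choice arithmetic being described can be sketched in a couple of lines (the 73% figure is the one quoted upthread; the rest is just taking the complement):

```python
# Two-alternative forced-choice Turing test: each judge sees one human and
# one AI at the same time and must pick exactly one of them as the human.
# Every trial yields exactly one "human" verdict, so the two rates are
# complements and sum to 100%.
ai_judged_human = 0.73  # rate quoted upthread for GPT-4.5

# The human partner wins the verdict exactly when the AI does not.
human_judged_human = round(1.0 - ai_judged_human, 2)

print(human_judged_human)  # 0.27
```

This is why the single 73% data point does pin down the human's 27% under this design, whereas with independently judged transcripts it would not.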


That was also before the crazy AI hysteria we have today with the em-dash police everywhere.

For the test to be free of bias, we’ll have to ensure all the humans are from Nigeria.

Those stats don't necessarily line up that way. Do you have a link?

Given the way the test was structured it does line up.

https://arxiv.org/abs/2503.23674


Surprisingly good. I wonder how they would have done without the 5-minute limit on conversations (an average of 8 messages per conversation, per the study).

People have been killing each other with weapons for as long as they've been around, nuclear weapons shouldn't be anything new.

No one should have nuclear weapons; we ought to have robust policy, institutions, and vigilance to prevent their proliferation and use.

Computerized vehicles ought to be strictly regulated in terms of how computers may affect the physical operation of the car, such that a reasonable standard of safety can be ensured beyond the usual risk one takes when hopping in a motor vehicle. The fact that a hacker can potentially kill people by rooting an infotainment system is a symptom of a general disregard for security in design, which we continue to ignore for engineering expediency.


This is a great real world example of where leetcode is useful.
