Hacker News | a_bonobo's comments

There has been a bit of a 'trend' to rewrite common bioinformatics/comp-bio into faster languages (Rust) via LLMs, OP's repo seems to be an early example.

Seqera Labs has a bit of a manifesto: https://rewrites.bio/

Heng Li has an overview here too: https://lh3.github.io/2026/04/17/the-ai-rewrite-dilemma

IMHO it's... OK? Bioinformatics code quality is generally poor, untrained biologists writing functioning code that is poor in scoping, but works. (Unguided) LLMs write on that level, too, so not much harm done.


How well tested would you say these libraries are? It doesn't sound promising, sadly. If there are comprehensive test suites, that would go a long way to ensuring new, faster tools aren't producing subtly wrong answers. That's a pretty big deal: just because the code compiles or no exception is thrown doesn't mean the analysis was correct.
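One way to check for "subtly wrong answers" is differential testing: feed the original tool and the rewrite the same randomized inputs and require identical output. A minimal sketch, assuming two hypothetical functions, `revcomp_reference` (standing in for the trusted original) and `revcomp_fast` (standing in for the rewrite); neither comes from any tool mentioned in this thread:

```python
import random

# Complement table for the four DNA bases plus N (unknown base).
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
_TABLE = str.maketrans("ACGTN", "TGCAN")


def revcomp_reference(seq: str) -> str:
    """Hypothetical 'trusted original': slow but easy to verify by eye."""
    out = []
    for base in reversed(seq):
        out.append(_COMPLEMENT[base])
    return "".join(out)


def revcomp_fast(seq: str) -> str:
    """Hypothetical 'rewrite': the implementation we want to validate."""
    return seq.translate(_TABLE)[::-1]


def differential_test(n_cases: int = 1000, seed: int = 0) -> int:
    """Compare both implementations on random inputs; return cases checked."""
    rng = random.Random(seed)
    for _ in range(n_cases):
        length = rng.randint(0, 200)  # includes the empty sequence
        seq = "".join(rng.choice("ACGTN") for _ in range(length))
        expected = revcomp_reference(seq)
        actual = revcomp_fast(seq)
        assert actual == expected, f"mismatch on {seq!r}"
    return n_cases


if __name__ == "__main__":
    print(differential_test(), "random cases agree")
```

For real tools the same idea applies at the file level: run both binaries on the same inputs and diff the outputs, ideally including edge cases the random generator is unlikely to hit.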

It's very context-dependent - the seqera rewrites so far seem to be pretty reliable, most of the work was spent merging the functions of multiple data QC tools into a single program (previously, there was a lot of redundancy that wasted compute). The success of other rewrites that I've seen tends to depend on the author's care/experience and usefulness. In my experience, bioinformaticians are fairly slow on the uptake of new software which might actually be an advantage here :-)

In defense of a lot of these bioinformatics-specific rewrites, there are some really dodgy coding practices and bugs that exist in well used tools, so there is scope for genuine improvement. The most recent release of minimap2 fixed some bugs identified in a rewrite, for example: https://github.com/lh3/minimap2/releases/tag/v2.31


I really like this pattern and use it often, this 'not showing my cards'. The second I hint towards the LLM what I prefer it will become sycophantic and invent nonsense why my preferred solution is better.

I'm sure there's an interesting study on how users 'leak' their preference unintentionally to the LLM; perhaps when users list their options, they often put their preferred option first; but not showing the cards in my hand has been very useful when thinking through a problem with LLMs.


LLMs flip positions when users push back ~70% of the time even when they were right. RLHF optimizes for approval, not correctness.

> LLMs flip positions when users push back

Same experience. Claude rarely pushes back once you give a plausible/logical reason for your initial decision, even if it flagged concerns at first.


I have noticed this as well, but I think it's somewhat a good thing. I know what I want for my application more than Claude does for example, especially when it comes to what's in production.

An example from earlier, Claude strongly suggested a migration that would run a full vacuum on postgres. However, in production this would lock tables which would grind the application to a halt. After I informed Claude that there were millions of rows in production, it accepted that and helped me get to the right thing.

Another example, I'm developing a TOTP authentication app because I'm dissatisfied with all those that I've tried. I want something strictly local, and with a very easy use case when you have dozens or even a hundred or more accounts on there, that is also efficient when left open for long periods of time. Claude strongly suggested that we force users to encrypt their vault with a passphrase all the time. However this makes the CLI extremely painful to use if you are using a strong passphrase. I told Claude about the user experience impacts and that I wanted to allow users to optionally use a vault with no passphrase encryption, and it accepted that and suggested as a medium that we have a checkbox for the user to explicitly acknowledge that they're creating an unencrypted vault on disc. This is the right thing IMHO.


It's a good thing except when it's not. The problem is the AI does not understand when to use which approach.

Contrast this with a human. We generally understand when the other person knows what they're doing and we should just listen, and when the other person is asking for an honest opinion and wants a push back if necessary.


Skills help there.

I have a linus-reviewer skill that focuses on architectural integrity, no bs, etc., modeled on Torvalds' code preferences.

And I have an enrico-reviewer one (I'm Enrico), that focuses on correct design, strict typing, simplification.

They have different prios, but they both push back on feedback, till you convince them.


I agree completely. Skills definitely keep it in line and sticking to the script. Thanks for sharing the skills you use, I’ll definitely take a look.

Care to share the skill behind the Linus reviewer? I tend to ask it to do that but leave it up to the LLM to decide what that means. Interested to see any specifics you might have included there if it’s ok to share.

Sure.

Would be interested in the experience others may have, took me weeks of iterations to get reviews in a format and utility I liked.

https://gist.github.com/enricopolanski/2bde8619f53307c9bcd5e...


I almost always end with something like: “, but I am not sure, evaluate.” Or other things and avoid ever stating a preference.

I don't think that "fixes" the problem, but it does seem to help. I also have found adding "please feel free to ask questions" seems to help it stop from making an assumption and spinning merrily onward for tens of thousands of tokens based on a bad idea rather than asking you something. I theorize this is because the training and refinement data overprioritize one-shot solutions, both because that's easier to evaluate at training time and improves their benchmarks. But I emphasize the italicized words because that's all gut feel and I can't prove any of it.

They do still attenuate their latent space on prior conversation turns as authority. That is why I like pure design/review sessions and pure coding sessions, often at the same time. I can often keep design and review in the critic and review role without becoming a sycophant. Coding agent just picks up dispatches and works with very little opinion at all.

Interesting thing about sycophancy is it’s asymmetric. If an LLM is used to train an LLM it may not have the same level of aggressiveness that humans do when pushing back on a trainee. Human pushback has specific patterns which we might be able to compensate for due to that asymmetry.

Obviously this is just my experience. Claude code pushes back much harder than Codex.

I have totally opposite experience.

Tangentially related but I’ve been using Claude to practice interviewing on system design problems, and it’s actually pretty great. But even when it likes my answers it always finds something, however small, to push on. Once it actually was completely wrong and admitted it after I had it realize. So maybe you have to prime it to be contrary and not agree with everything you say, putting it in the role of a tough interviewer seems to do this implicitly.

Take a look at hellointerview.com; their model is very stubborn, similar to some interviewers who refuse to acknowledge even valid solutions that differ from the canon.

No affiliation.


It's actually a reasonable way to think about alignment. Sometimes you want the agent to just listen to you and sometimes you want the agent to think critically.

I think about this line a lot. For example, as it happens sometimes you'll have a typo in something you want the agent to do. LLMs typically will correct that typo silently and implement the actually intended thing. But if you said, "no, I want the thing I typed," I think everyone's expectation is that it says, "ok done."

I've found that leaving clues in the system prompt / exchange that are open to critique largely mitigates sycophancy with most recent models.

As engineers we're trained to represent our positions strongly. Strong opinions loosely held, etc. When you speak authoritatively to a person, "I think we should do x...", the person understands that that's just your opinion and has the autonomy to push back.

An LLM imo _shouldn't_ have that same kind of autonomy by default and it should be RLHF'ed out.


Same. Alternatively (or in addition), I sometimes present my preferred idea as being a "bad/naive/stupid option" (or a suggestion from someone who can't be trusted) to see how it stands up to sycophancy to it being bad. As expected the LLM will usually say "yeah it's bad!" and give plausible-sounding reasons for it, but if these reasons are nonsensical it's a good sign that I'm not missing anything

Yes, outside of coding too, it’s a good idea to ask open ended questions rather than ask for confirmation, to avoid this sycophantic bias

LLMs are very prone to priming in my experience. That is the human psychology name for what you are describing; whether it should be applied to LLMs I don't know, but it describes the phenomenon perfectly.

Makes sense as priming is at the core of how an LLM is trained.

“Given these words, predict the next word.”


It's not limited to arguing with LLMs but if you want an honest opinion you should remember to push back even when it agrees with your hidden preference at first. Sometimes it is only being contrarian or supporting the underdog. Steelman the opposition.

There's an easy workaround that helps: instead of listing options, just describe the problem constraints and ask it to propose approaches independently.
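A rough sketch of what that could look like as a prompt builder. Everything here is an assumption for illustration: `build_neutral_prompt` is a made-up helper, not part of any LLM SDK, and the wording of the prompt is just one possibility. If you do want to include candidate options, it shuffles them so the favourite isn't reliably listed first:

```python
import random
from typing import Optional, Sequence


def build_neutral_prompt(
    problem: str,
    constraints: Sequence[str],
    options: Optional[Sequence[str]] = None,
    seed: Optional[int] = None,
) -> str:
    """Build a prompt that states the problem without revealing a preference.

    With no options, the model is asked to propose approaches on its own.
    With options, they are shuffled so list order does not hint at a favourite.
    """
    lines = [f"Problem: {problem}", "", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    if options:
        shuffled = list(options)
        random.Random(seed).shuffle(shuffled)
        lines += ["", "Candidate approaches (in no particular order):"]
        lines += [f"- {o}" for o in shuffled]
        lines += ["", "Evaluate each approach against the constraints "
                      "and propose any better alternative."]
    else:
        lines += ["", "Propose two or three approaches independently "
                      "and compare their trade-offs."]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_neutral_prompt(
        "Store user sessions for a small web app",
        ["single server", "must survive restarts"],
    ))
```

The returned string can be pasted into whatever chat interface or API client you already use; the point is only that no sentence in it says which approach the author prefers.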

Some evidence as to why Brown did not originally win the Pulitzer, instead this citation a few years too late:

>Brown’s “Perversion of Justice” series won a prestigious George Polk award. The Herald entered the Epstein series for a Pulitzer Prize that year, but it was not a finalist. Alan Dershowitz, the attorney and television personality who helped broker Epstein’s original deal, wrote a letter to the Pulitzer committee that year, urging them not to honor Brown’s work.

https://www.inquirer.com/news/pennsylvania/julie-brown-pulit...

The rot runs deep


> For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation about the meaning of the prompt. 2. NOT have any representation about what they were going to say. In 2025 finally almost everybody stopped saying so.

Man, Antirez and I walk in very different circles! I still feel like LLMs fall over backwards once you give them an 'unusual' or 'rare' task that isn't likely to be presented in the training data.


LLMs certainly struggle with tasks that require knowledge that is not provided to them (at significant enough volume/variance to retain it). But this is to be expected of any intelligent agent, it is certainly true of humans. It is not a good argument to support the claim that they are Chinese Rooms (unthinking imitators). Indeed, the whole point of the Chinese Room thought experiment was to consider if that distinction even mattered.

When it comes to being able to do novel tasks on known knowledge, they seem to be quite good. One also needs to consider that problem-solving patterns are also a kind of (meta-)knowledge that needs to be taught, either through imitation/memorisation (Supervised Learning) or through practice (Reinforcement Learning). They can be logically derived from other techniques to an extent, just like new knowledge can be derived from known knowledge in general, and again LLMs seem to be pretty decent at this, but only to a point. Regardless, all of this is definitely true of humans too.


In most cases, LLMs have the knowledge (data). They just can't generalize it like humans do. They can only reflect explicit things that are already there.


I don't think that's true. Consider that the "reasoning" behaviour trained with Reinforcement Learning in the last generation of "thinking" LLMs is trained on quite narrow datasets of olympiad math / programming problems and various science exams, since exact unambiguous answers are needed to have a good reward signal, and you want to exercise it on problems that require non-trivial logical derivation or calculation. Then this reasoning behaviour gets generalised very effectively to a myriad of contexts the user asks about that have nothing to do with that training data. That's just one recent example.

Generally, I use LLMs routinely on queries definitely no-one has written about. Are there similar texts out there that the LLM can put together and get the answer by analogy? Sure, to a degree, but at what point are we gonna start calling that intelligent? If that's not generalisation I'm not sure what is.

To what degree can you claim as a human that you are not just imitating knowledge patterns or problem-solving patterns, abstract or concrete, that you (or your ancestors) have seen before? Either via general observation or through intentional trial-and-error. It may be a conscious or unconscious process, many such patterns get baked into what we call intuition.

Are LLMs as good as humans at this? No, of course, sometimes they get close. But that's a question of degree, it's no argument to claim that they are somehow qualitatively lesser.


Late to this, but my interpretation of the parent's point was eg: LLMs still often produce bad code, despite "reading" every book about programming ever written. Simplistically, they aren't taking the knowledge from those books, and applying them to the knowledge of the code they've scraped, they are just using the scraped output. You can then separately ask them about knowledge from those books, but then if you go back and get them to code again, they still won't follow the advice they just gave you.


"In 2025 finally almost everybody stopped saying so."

I haven't.


Some people are slower to understand things.


That is why they need artificial intelligence


Well exactly ;)


I don’t think this is quite true.

I’ve seen them do fine on tasks that are clearly not in the training data, and it seems to me that they struggle when some particular type of task or solution or approach might be something they haven’t been exposed to, rather than the exact task.

In the context of the paragraph you quoted, that’s an important distinction.

It seems quite clear to me that they are getting at the meaning of the prompt and are able, at least somewhat, to generalise and connect aspects of their training to “plan” and output a meaningful response.

This certainly doesn’t seem all that deep (at times frustratingly shallow) and I can see how at first glance it might look like everything was just regurgitated training data, but my repeated experience (especially over the last ~6-9 months) is that there’s something more than that happening, which feels like what Antirez was getting at.


Give me an example of one of those rare or unusual tasks.


I work on a few HPC systems with unusual, kinda custom-rolled architectures. A whole bunch of Python and R packages fail to compile on these systems. There's no publicly accessible documentation for these HPC systems, nor for these custom architectures. ChatGPT and Claude so far have given me only wrong advice on how to get around these compilation errors and there's not much on Google for these errors, but HPC staff usually knew what to do.


Set the font size of a simple field in openxml. Doesn't even seem that rare. It said to add a run inside and set the font there. Didn't do anything. I ended up reverse engineering the output out of ms word. This happened yesterday.


> to accurately prepopulate tax returns for around 45% of Americans. (Those other countries have much simpler tax codes than we do.)

One should note that the cited study quotes the 45% from a 1992 study. These days, with gig economy and quasi-self-employment, that number is probably higher since you don't have an employer who reports your income for you.

Still, here in Australia, where we have the return-free tax system, adding what you earned from your various gig jobs isn't too hard: you add that as items to the web form: 'I made 15,123 from Uber Eats'. That just gets added to your overall return. I don't see how that's so hard compared to the US?


Income reporting is not the problem: Anyone paying you any significant amount of money is required to file with the IRS, including if you’re paying yourself.

The issue is the broad range of deductions and credits that depend on things like the composition of your household and your primary residence. Contra some expectations, the IRS does not keep a database of who’s shacking up with whom, where, or if kids are in the picture.


In the states if you are a contractor there are tons of things that you can deduct from your taxable income. So “figuring out how much you should be taxed” is after those deductions.

If uber paid you $15123 but you:

Just bought a new bike bc your other was stolen

You paid $1200 for insurance

You bought a helmet and cold weather clothes etc etc.

Those things reduce your taxable income.
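As back-of-the-envelope arithmetic, that looks like the sketch below. Only the $15,123 and the $1,200 insurance come from the list above; the bike and gear prices are made up for illustration, and this ignores every real-world rule about what is actually deductible and by how much:

```python
from decimal import Decimal

gross_income = Decimal("15123")  # from the comment: what Uber paid

# Deductible expenses. Only the insurance figure comes from the comment;
# the bike and gear amounts are invented placeholders.
deductions = {
    "insurance": Decimal("1200"),
    "replacement bike (hypothetical price)": Decimal("800"),
    "helmet and cold-weather clothes (hypothetical price)": Decimal("250"),
}


def taxable_income(gross: Decimal, expenses: dict) -> Decimal:
    """Gross income minus total deductible expenses, never below zero."""
    total_expenses = sum(expenses.values(), Decimal("0"))
    return max(gross - total_expenses, Decimal("0"))


print(taxable_income(gross_income, deductions))  # 12873
```

So with these placeholder numbers the tax is computed on $12,873 rather than $15,123, which is why the payer's income report alone isn't enough to prefill the return.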


I think that's common in most places. What's different in the US is that the IRS forces you to proactively provide a lot more information about it, though. I have a rental property and need to enter the same information about the same income and expenses on three different forms, breaking it down in different ways. It's tedious and error-prone, and I guess the philosophy is that it's easier to spot fraud if the numbers on all the different forms don't add up to a coherent story.

Other countries presumably rely on other fraud signals. They might have more visibility into your day-to-day financial transactions, or there might be more of a culture of leaving an anonymous tip if you suspect your neighbor isn't paying a fair share.


What three forms are you talking about?


4562, 8825, 1065


Yes, same in Australia. Keep receipts and add the cost to the web form.

They have simplified it nicely, though: if you work from home you can claim a per-hour deduction so you don't have to do the math of wear-and-tear, electricity, internet etc. I think it was $0.6 per hour?


Finland did that even simpler: work from home more than 50% of work days and you get 750€. Ofc, the hard part is to calculate 50% of your internet bill. And then any technology you buy for remote work. Not chair, desk or lamps though, those are in the room part...

Thankfully (/s), they are simplifying it even further next year and removing the whole thing. Now you only get to deduct money if you actually rent an office...


If you can, read Robert Caro's The Path To Power (Caro's The Power Broker has been a HN favorite ever since Aaron Swartz recommended it). It's the story of the first ~30 years of Lyndon B Johnson's life.

I forget which chapter it is, but Caro takes a detour where he describes the life of women during Johnson's childhood in the dirt-poor valley he was from: no electricity, no waterpower, everything in the house was done by women's hands, 24/7. There's a passage that stuck with me about how women in their 30s in that area looked like women in their 70s from other areas, just a brutal life.


Chapter 4 - The Father and Mother

> Transplanted, moreover, to a world in which women had to work, and work hard. On washdays, clothes had to be lifted out of the big soaking vats of boiling water on the ends of long poles, the clothes dripping and heavy; the farm filth had to be scrubbed out in hours of kneeling over rough rub-boards, hours in which the lye in homemade soap burned the skin off women’s hands; the heavy flatirons had to be continually carried back and forth to the stove for reheating, and the stove had to be continually fed with new supplies of wood—decades later, even strong, sturdy farm wives would remember how their backs had ached on washday.


And what he left out of this book (and included in the memoir or in some interview) was that there was a scientific study of women in the area at the time which discovered that a very high percentage of women had birthing complications serious enough for hospitalization that went untreated as they had to go back to their chores next day and there was no hospital anywhere close.


Exactly what I thought of reading this; that chapter is genuinely one of the most affecting things I've ever read. The horror of it keeps growing as he continues to describe one awful manual task after another.


Related, I think people have stopped.... reacting on the internet? I've been part of the X/Twitter to Bluesky migration and people often mention how 'quiet' Bluesky is.

I think that's not due to algorithmic intervention or product design etc., I think people are just tired. The novelty of shouting at strangers on the internet has worn off - how many internet fights have we gotten into that did nothing in the end except waste time? It's only worse with a coin flip's chance of the other person being an LLM. We're all tired.


This is relatable. I often find myself starting a reply on here, really thinking it through as I type it out, and then hitting delete on what I just wrote. Sometimes I even hit submit, and then delete a few moments later.

It's just hard to justify engaging. Worst case, I get a fight on my hands with someone who's as dogmatic as they are wrong, which is both frequent and also a complete waste of my time. (A tech readership is always going to veer hard into the well, akshually...) Most likely case, I get fictitious internet points. Which - I won't lie - tickle my lizard brain, just as they do everyone else's. But they don't actually achieve anything meaningful.

Best case is that I learn something. Realistically, this happens vanishingly infrequently, and the signal-noise ratio is much, much worse than if I just pulled a book off my shelf.

I suppose this is all an artifact of time and experience. Maybe I've just picked all the low-hanging fruit, and so I no longer have the patience to watch people endlessly repost the same xkcd strips from fifteen years ago, navel-gaze about tabs or spaces, share thrilling new facts that I have in fact known for many decades, etc. And while I'm very excited for them to discover all these things anew (and anew... and anew...), it's just not a good use of my time and patience to participate.


> It's just hard to justify engaging. Worst case, I get a fight on my hands with someone who's as dogmatic as they are wrong, which is both frequent and also a complete waste of my time.

The three mindset changes I found that really help with this are understanding that:

* You don't have to try and get the last word in.

* Other people are not entitled to your time, especially if they're engaging in bad faith.

* Outside of small and curated communities, there's pretty good odds that you're not interacting with a real and honest person.

So whenever I click into the comment box, I always ask myself "Can I really be bothered with this? Is this really what I want to be spending my free time doing?"

And then I often close the comment box and get on with my life.


> It's just hard to justify engaging.

Well, if you try and force yourself to engage with multiple people, the site won't let you post that many comments in such a short time period. Which, overall, is a good thing I believe.


I wish we got karma points (or maybe "zen points") for every time we refrained from commenting on someone who is wrong on the internet.


I wonder if it's just creeping apathy, post-covid, current-AI boom. That we're just tired in life. There's a psych study, Dimensional Apathy Scale (DAS)[0] and one of the questions is basically "How much do I contact my friends?" I think it argues that the more apathy we feel, the less likely we are to reach out to others, and I imagine, the less likely we are to react or reply to comments (or even post).

I'm curious if the decline in reacting is matched by a decline in replying and posting in general.

Anyways, I worry that apathy is on the rise as we get overwhelmed with the rate of change and uncertainty in the 2020s and I'm working pretty hard to fight that apathy and bring more empathy, so if you're interested, please reach out to me via the contact info in my bio.

[0]: https://das.psy.ed.ac.uk/wp-content/uploads/2018/04/SelfDAS....


I feel this, but also, I am... anxious about reactions? I rarely / never go back on comments I've written on HN. I know it's actually a really bad thing to do because it means I won't allow my views to be challenged, don't engage in debate, just want to get my side out without actively defending it.

Years ago I had a blog and one time I wrote a post in response to another blog post about education vs experience, arguing in favor of formal education. And that one got a link back from the original article, leading people back to my blog. I got engagement, comments, feedback, etc... and it was very uh. Overwhelming? Like suddenly I had to defend my arguments. It made me very uncomfortable, even though it was probably a good thing, all in all.

I don't know how to break that trend. I think I'd rather have realtime communications / chat, but that's another thing that seems to have died, at least in the space I've been at for a long time now.


The simple solution is that whenever you start to write a comment, ask yourself: do I want to have a discussion about this?

If the answer is "yes", then make your comment, check back and interact with the responses (assuming they seem to be in good faith). If it's "no" then just close the comment box and get on with your life.

But then I realise that it's fairly pointless writing this in the first place...


Spot on. Ten or fifteen years ago, participating in the internet was something I got excited about, now I just get excited about getting away from it.


I think the aggressive bots/AI, and bad moderation policy, have poisoned online discourse in popular channels.

You can still find real people in niche communities (like here), where good moderators can maintain a grip on quality. Though perhaps HN has some secret moderator sauce I’m not aware of.

Humans are just migrating off the old, big platforms that no longer feel real.


Probably more related to progressive culture, people worried about saying the wrong thing. From the outside, it looks exhausting to try and keep up with the latest dogma of the left.


Participating? Or reacting? The internet I look at seems plenty full of reactions despite the migrations you mention.

Maybe to YT or Threads instead.

I like Bsky but I don't think the userbase supports much large-scale communication (not a bad thing, frankly)


In my niche, bioinformatics, linkedin has become somewhat of a force ever since many people left Twitter/X during the 'rebranding'? It's quite weird.

They're mostly posts announcing new packages etc. but there seems to be more bioinformatics-y activity than, say, mastodon or bluesky. The posts definitely have a different tone than what OP decries.


Yes there are a bunch of weird niches that got a lot of Twitter traffic but found a home on LinkedIn when there's an overlap with professions. Another niche example that I see is applications for AI powered architectural visualization, many folks posting actually useful stuff there on a regular basis.


Honestly I wish people stuck with good old forums. There's forums for everything out there, in every niche, gaming, modding, hardware, cars, boats.

Every single community you can think of has likely a great forum out there, easily readable and searchable, where discussions on single topics last _years_ and go in extreme informative depth, the kind of depth that no platform like HN/Lobsters/LinkedIn can ever dream of.

The closest surrogate we have are issue trackers (like GitHub) or mailing lists, but even those offer such a poor UX that I can't but wonder..


Bioinformatics has biostars :) https://www.biostars.org/

The difference to linkedin is that biostars has 'in-domain experts' only; the postdocs, the staff bioinformaticians, etc. those are not the people who will hire you. The people who will hire you are on linkedin.


Yeah it's definitely a combination of posting for peers but also creating material that is helpful for finding the next gig


>I find there is usually also some file juggling, parsing, [...]

I'd say I'm 50/50 Python/R for exactly this reason: I write Python code on HPC or a server to parse many, many files, then I get some kind of MB-scale summary data I analyse locally in R.

R is not good at looping over hundreds of files in the gigabytes, Python is not good at making pretty insights from the summary. A tool for every task.
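The Python half of that split usually looks something like the sketch below: stream each big file once, keep only a few numbers per file, and write one small CSV for R. The FASTA format, the `*.fa*` glob pattern and the function names are assumptions chosen for illustration, not a description of any particular pipeline:

```python
import csv
import gzip
from pathlib import Path


def open_text(path: Path):
    """Open plain or gzip-compressed text files transparently."""
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def summarise_fasta(path: Path) -> dict:
    """Stream one FASTA file line by line; never hold it all in memory."""
    n_records = 0
    n_bases = 0
    with open_text(path) as handle:
        for line in handle:
            if line.startswith(">"):
                n_records += 1  # header line starts a new record
            else:
                n_bases += len(line.strip())
    return {"file": path.name, "records": n_records, "bases": n_bases}


def write_summary(input_dir: Path, out_csv: Path, pattern: str = "*.fa*") -> int:
    """Summarise every matching file into one small CSV; return file count."""
    rows = [summarise_fasta(p) for p in sorted(input_dir.glob(pattern))]
    with open(out_csv, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", "records", "bases"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    count = write_summary(Path("."), Path("summary.csv"))
    print(f"summarised {count} files")
```

The resulting `summary.csv` is small enough to copy off the cluster and load locally with `read.csv()` in R for plotting.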


I think that's also because Claude Code (and LLMs) is built by engineers who think of their target audience as engineers; they can only think of the world through their own lenses.

Kind of how for the longest time, Google used to be best at finding solutions to programming problems and programming documentation: say, a Google built by librarians would have a totally different slant.

Perhaps that's why designers don't see it yet, no designers have built Claude's 'world-view'.

