drgiggles's comments | Hacker News

Easily 99% of comments generated by LLMs are useless.


That's how I detect who is using LLMs at work.

  # loop over the images
  for filename in images_filenames:

    # download the image
    image = download_image(filename)
  
    # resize the image
    resize_image(image)

    # upload the image
    upload_image(image)


They're often repetitive if you're reading the code, but they're useful context that feeds back into the LLM. Often once the code is clear enough I'll delete them before pushing to production.


do you have proof of this being useful for the LLM? wouldn't you rather it re-read the actual code it generated, instead of risking that a potentially wishful or stale comment leads it astray?


it reads both, so with the comments it more or less parrots the desired outcome I explained... and it sometimes catches the mismatch between code and comment itself before I even mention it

I read and understand 100% of the code it outputs, so I'm not so worried about falling too far astray...

being too prescriptive about it (like prompting "don't write comments") makes the output worse in my experience


I've noticed this too. They are often restatements of the line in verbal form, or intended for me, the reader of the LLM's output against the prompt, rather than for a code maintainer.


99% of comments are not needed as they just re-express what the code below does.

I prefer to push for self-documenting code anyway; I never saw the need for docs other than for an API when I'm calling something like a black box.
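
A rough before/after of what I mean, with made-up names:

  from dataclasses import dataclass

  @dataclass
  class User:
      name: str
      age: int

  def send_offer(user: User) -> None:
      print(f"offer sent to {user.name}")

  # Before: the comment just restates the condition.
  #   # check that the user is an adult
  #   if user.age >= 18: ...

  # After: the name carries the intent, so no comment is needed.
  def is_adult(user: User) -> bool:
      return user.age >= 18

  user = User("alice", 30)
  if is_adult(user):
      send_offer(user)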


I think it's because LLMs are often trained on data from code tutorial sites and forums like Stack Overflow, and not always on production code.


Not what I have found with Gemini.

What is particularly useful are the comments explaining the reasoning behind new code added at my request.


Very often comments generated by humans are also useless. The reason for this is mandated comment policies, e.g., 'every public method should have a comment'. An utterly disgusting practice. One should only have a comment if one has something interesting to say. In a not-overly-complex code base there should be a comment perhaps every 100 lines or so. In many cases it makes more sense to comment the unit tests than the code.


I think the rule of commenting every public method exists so that something like doxygen can extract a reference from the comments. Most IDEs can also display them on hover. And comments can remind the caller of pre- and post-conditions.
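
Rough sketch of the idea in Python docstring form (a doxygen comment on a C++ method would play the same role; names here are made up):

  def withdraw(balance: float, amount: float) -> float:
      """Return the balance left after withdrawing `amount`.

      Pre-condition: 0 < amount <= balance.
      Post-condition: the result is non-negative.

      Tools like pydoc or Sphinx can pull this text into a reference,
      and most IDEs show it on hover at the call site.
      """
      assert 0 < amount <= balance, "pre-condition violated"
      return balance - amount

  print(withdraw(100.0, 30.0))  # 70.0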


I am pretty far to one end of the spectrum on need for comments. Very rarely is a comment useful to help you/another developer decipher the intent and function of a piece of code.


Then tell it to write better comments...


Ah, so it's good enough to write code on its own without time-consuming, excessive hand-holding. But it's not good enough to write comments on its own.


If you put in the work to write rules and give good prompts you get good results just like every other tool created by mankind.

How often do you use coding LLMs?


I can't speak to comment rules specifically, but I am a heavy user of "agentic" coding and use rules files, and while they help, they are simply not that reliable. For something like comments that's probably not that big of a deal, because some extra bad comments aren't the end of the world.

But I have rules that are quite important for successfully completing a task by my standards, and it's very frustrating when the LLM randomly ignores them. In a previous comment I explained my experiences in more detail, but depending on the circumstances instruction compliance is 9/10 at best, with some instructions/tasks as poor as 6/10 in the most "demanding" scenarios, particularly as the context window fills up during a longer agentic run.


They comment on the how, not the why.


There are mountains of data that show it actually has long term benefits beyond weight loss (beyond even the obvious health markers that improve due to losing weight). I wouldn’t be surprised at all if the majority of the population ends up taking next gen drugs in this space, most of them purely for longevity.


Reminds me of the alleged neurological benefits from use of hallucinogenics - but they're still banned.


Anyone that’s worth anything at all in this field is “self taught”, some of them just went to school first.


Agreed - my CS degree exposed me to a bunch of low-level 'algorithms & data structures' type stuff + a bit of assembly, Prolog, etc. + some math stuff, all of which I likely wouldn't have gone out of my way to learn on my own unless it came up in a very obvious way in a problem I was trying to solve. But like 95% of the actual large-scale software engineering type work (read: actually building useful software) I learned on my own by building unnecessarily over-engineered side projects, or by building stuff while working.

(as an aside: "don't overengineer things" is great advice when your goal is to actually finish creating something useful, but imo if you're coding to learn then horrendously overengineering everything is super valuable to the learning process - you should totally set up a full custom CI pipeline, design your own networking protocol, write a parser for a DSL etc etc in service of your dumb little tic tac toe game or whatever you're making - you will learn things)


I have a CS degree myself but most of the code I wrote/read during college wasn’t a part of any class.

I treated GLSL how others treat Civilization VI. “Just one more shader” - “Oh no it’s 4am”.


I haven't dared even looking at any other Civilization game after the first one. I kept deleting it and it somehow found a way to sneak back when I wasn't looking through some sort of sentient magic undelete function.


Even a degree is self taught. The assessment is typically pretty decoupled from the content. You could say "people that have taught themselves a degree course have a better understanding than those that learnt just enough to pass the assessment".


Agreed. The effective difference between a degree and learning on your own really comes down to structure. A college course gives you a decent enough structure to know that you do addition, then subtraction, before you go trying to learn multiplication. I often find that when trying to learn things on my own, I start from differential calculus, in this example, and try to work backwards.


> I often find that when trying to learn things on my own, I start from differential calculus, in this example, and try to work backwards.

Doesn't everyone who has learned how to learn? Learning is way more efficient when you already have forward context for why you are learning something. Nebulously studying addition and subtraction without being able to see that it leads to multiplication (to stay with your example) is an absolutely horrid situation to find yourself in.

In fact, I suspect that's exactly what the article is really trying to get at: Those who "learn backwards", which happens to be a trait commonly associated with self-teaching, outperform.


I work in quantitative finance and have wanted to start using OCaml at work for years. I just find that unless you are at a shop like Jane Street, with a well-developed proprietary code base, internally developed tooling, etc., there just isn't the ecosystem available for me to be nearly as productive as I can be in other well-accepted languages in the quant dev space...which is a bummer. It's been a little while since the last time I investigated this, though.


If you use Jane Street's Base, Core, and Async libraries, you already have most of the tooling you need.


It's brilliant if you think about it as an employer that wants to limit employee turnover. Great, you're an ace algo dev in OCaml, where you gonna jump ship to?


Also in quant finance. Have you given F# a shot? We use it and are very happy.


Any tips for getting a first quant job?

Is learning C++ a must?


When we hire a junior person we are interested in math background, ability to communicate real world value of various models to our investment process and familiarity with computer science and software engineering concepts more than we care about experience with specific languages or technologies. That being said, C++ does still dominate this space so having exposure to it certainly would not hurt.


Who is "we" here? What is your approach for more seasoned folks?


I am a strategist at a smallish boutique quant investment firm. This is how we think about hiring a junior person. It's not all that different for a more senior person, but actual development experience would likely be more important; we would expect more contribution sooner from a more experienced person. More senior jobs also might have more specific responsibilities and therefore require more specific knowledge of technologies, etc. Many junior analyst roles support the team as a whole and there is less concern around experience with specific technologies, typically.


Does fintech experience matter at all? I'm still in the finance space, but I'm very much just an average software engineer.

I've been meaning to learn C++, but I always find it intimidating.


Doesn't matter. It's the same as any tech: you need to understand mathematics and algos and have great fluency in code. Can't speak for the other guy but I suspect he's the same: no amount of skill "aligning stakeholders" and "managing cross-functional teams" is of much use in the business if you're at a prop shop. You have to be able to write code. No "blocked on other team", "Kernel Timestamping API is undocumented", etc.


Hang out in bars around the office you want to work at.


Thanks!

Actually, quite a good idea, regardless of which sector!


More generally it doesn't have to be bars. Go to where the people are that you want to meet.


No, C++ is useful if you want to work in HFT (where you're paid $$$ if you're a programmer).

For quants (progression towards a trader / portfolio manager), Python / Matlab / R is enough.


Get a Bloomberg Terminal - just for the social-networking features.


$24k per year to get a job?


My reply was somewhat facetious; though I have heard some pretty crazy personal anecdotes about what happens in the Bloomberg chat service.

But even taken at face value, Bloomberg's annual fee is comparable to the yearly cost of an undergraduate degree - so it might very well be a fair deal.


Good intuition in probabilities is a must.


If you were to buy a property today in any major market, the average unlevered cap rate on leasing residential property is nowhere near 20%. In many cases it’s not even profitable without a significant housing price return assumption.
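
Back-of-the-envelope sketch with made-up numbers, just to show what an unlevered cap rate works out to:

  # Hypothetical figures for illustration only.
  purchase_price = 500_000        # price of the unit today
  gross_annual_rent = 30_000      # ~$2,500/month
  operating_costs = 12_000        # taxes, insurance, maintenance, vacancy

  net_operating_income = gross_annual_rent - operating_costs
  cap_rate = net_operating_income / purchase_price   # unlevered: no mortgage in the math

  print(f"{cap_rate:.1%}")  # 3.6% -- nowhere near 20%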


This is only part of what drives inflation. Yes, an increase in the money supply creates upward pressure on inflation. Higher interest rates reduce borrowing, which decreases demand for goods and services, which puts downward pressure on inflation. This explains Fed activity over the last couple of years…the goal is to reduce inflation, not to create higher interest payments on the national debt…


Raising interest rates decreases inflation in the short term, but that debt continues to increase every year.

In 2024 the interest expense surpassed defense spending for the first time in history.

The federal government already can't meet its obligations without borrowing.

So what that means is that the Federal Reserve has to print more and more money every year to compensate for these debt obligations of the federal government, which is ultimately inflation.

You're right that there are other factors at play, but this is the gigantic looming elephant in the room.

https://fred.stlouisfed.org/series/CURRCIR


Unfortunately it seemed pretty clear from the start that this is what data science would turn into. Data science effectively rebranded statistics but removed the requirement of deep statistical knowledge to allow people to get by with a cursory understanding of how to get some python library to spit out a result. For research and analysis, data scientists must have a strong understanding of the underlying statistical theory and at least a decent ability to write passable code. With regard to engineering ability, people with both skill sets certainly exist, but it's an awfully high bar. It is similar in my field (quant finance): the number of people who understand financial theory, valuation, etc. and can also design and implement robust production systems is small, and you need to pay them. I don't see data science openings paying anywhere near what you would need to pay a "unicorn", so you can't really expect the folks who fill those roles to perform at that level.


I worked adjacent to the data science field when it was in its infancy. As in I remember people who are now household names in the field debating what it should be called.

At the time I considered going down that path, but decided I did not have anywhere near the statistics & math knowledge to get very far. So I stuck with the path I had been on. Over time I saw a lot of acquaintances jumping into the data science game. I couldn't figure out how they were learning this stuff so fast. At some point I realized that most of them knew less than I did when I decided I didn't know enough to even begin that journey.

Of course, I was comparing myself against the giants of the field and not the long tail of foot soldiers. But it made for a great example to me of how with just about everything there's a small handful of people who are the primary movers, and then everybody else.


> Data science effectively rebranded statistics but removed the requirement of deep statistical knowledge to allow people to get by with a cursory understanding of how to get some python library to spit out a result.

I don't know anything about Data Science, but as a bystander with a mathematical background that's what I assumed was going on, so it's kind of interesting to see it spelt out like that. Like you've put words to a preconception that I didn't even know I had.


That's because businesses don't require a deep level of math knowledge.


>Data science effectively rebranded statistics but removed the requirement of deep statistical knowledge

An important thing people miss is that shallow statistical knowledge can cause subtle failures, but shallow software engineering knowledge can cause subtle failures too.

A junior frontend developer will write buggy code, notice that the UI is glitched, and fix the bug. A junior data analyst will write buggy code, fix any bugs which cause the results to be obviously way off, but bugs which cause subtler problems will go unfixed.

Writing correct code without the benefit of knowing when there is a bug is challenging enough for senior developers. I don't trust newbie devs to do it at all.

Context here is I used to work in email marketing and at one point I was reading some SQL that one of the data scientists wrote and observed that it was triple-counting our conversions from marketing email. Triple-counting conversions means the numbers were way off, but not so far off as to be utterly absurd. If I hadn't happened to do a careful read of that code, we would've just kept believing that our email marketing was 3x as effective as it actually was.

So, it's impossible to know how much of a problem this is. But there is every reason to believe it is a significant problem, and that lots of code written by data scientists is plagued by bugs which undermine the analysis. (When's the last time you wrote a program which ran correctly on the first try?) Any serious data science effort would enforce strict practices around code review, assertions, TDD, etc. to make the analysis as correct as possible -- but my impression is that it is much more common for data analysis to be low-quality throwaway code.
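
To make that failure mode concrete, here's a hypothetical sketch (pandas instead of SQL, invented column names) of how joining against a table with several rows per order quietly triple-counts a conversion:

  import pandas as pd

  # One real conversion...
  conversions = pd.DataFrame({"order_id": [1], "revenue": [100.0]})

  # ...but three email-touch rows attributed to that same order.
  email_touches = pd.DataFrame({"order_id": [1, 1, 1],
                                "campaign": ["a", "b", "c"]})

  # Joining before deduplicating fans the conversion out to three rows.
  joined = conversions.merge(email_touches, on="order_id")
  print(joined["revenue"].sum())                              # 300.0 -- triple-counted
  print(joined.drop_duplicates("order_id")["revenue"].sum())  # 100.0 -- correct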


This is an important point. I used to work in adtech. It's amazing how terrible the modeling is in that space. You can generate a model that identifies a given target audience and simply assert that it works without any real validation.


Surely adtech companies like Google and FB do OK though?


On the flip side, you used to have statisticians writing code that is frankly unusable in a production environment. You would weep at the R code I've seen and had to turn into something that actually produces business value.


There is a bit of a joke that a data scientist is someone who can do better stats than the average SWE and can write better code than the average statistician. Both of those are relatively low bars to clear, though.


The way I heard the joke was "a data scientist is someone who's not good enough at math to be a statistician, and not good enough at programming to be a software engineer."

Maybe a little harsh...


That's much better. Consider that stolen.


Harsh, but funnier than how I phrased it.


This is exactly my point. Let subject matter experts in their respective disciplines handle what they know and communicate through the lingua franca of R. Most data scientists/statisticians probably shouldn't be writing production code, I think that's ok. It's a failing of management to think that coding is coding and not understand the value of true engineering ability.


My first job basically consisted of taking code in FORTRAN and translating it into C++ with robust testing and engineering, and then frontending that code into a ton of spreadsheet packages. So you had quants doing quant work, software engineers doing software engineering, and analysts and traders being analysts and traders, instead of having quants fail at all three, which is more or less what data science is.


Yeah but in the end it’s just code. And even better, just R.

The business value comes from the stats guy.


When the R/stats guy quits and you have to figure out which of his 7 notebooks to run in which order, which local files need to be in which local directories for them to run correctly, which versions of each package are now broken, and which code you need to rewrite to fix it, you start to realize that the value he produced was clicking a lot of buttons in the right order, and that overall this doesn't scale at all.


Yeah, but I meant that because the business value is in the stats, and there is such low quality of stats in the field to begin with, it’s borked no matter what.

There’s no point in fixing it. You can just pretend like you did. But if the stat work is quality, then it’s worth the effort to optimize.


That sounds more like a Jupyter notebook/Python problem than an R problem.

But otherwise, yes, I see the problem.


The hours I have spent debugging package problems in R would disagree.


I know that pain. That’s why I’m saying avoid it if you can do so.


> Data science effectively rebranded statistics but removed the requirement of deep statistical knowledge to allow people to get by with a cursory understanding of how to get some python library to spit out a result.

That's a good way of putting it. I remember in my first calculus-based probability+statistics class in college, I felt incredibly challenged by the theory. I wondered why there are so many probability distributions out there, why the standard stats formulas look like they do, what "kernel density estimation" even is, etc.

On the other hand, my data science course did include some theory, but a big part of it was also learning how to type the right commands in R to perform the "featured analysis of the week" on a sample data set. Something about these lab exercises felt off, because it felt more like training than education. The professor expressed something along the lines of: if we wanted to go far with this in the future, he would expect us to design the algorithms behind the function calls. I think the analogy he used was "baking a cake from scratch rather than buying a ready made one at the store."
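
As a small example of the "from scratch" version, here's roughly what a Gaussian kernel density estimate boils down to (sketched in Python rather than R; compare against a packaged routine like scipy.stats.gaussian_kde):

  import numpy as np

  def kde_from_scratch(samples, xs, bandwidth=0.5):
      # Average a Gaussian bump centered on each sample point.
      samples = np.asarray(samples, dtype=float)
      out = np.zeros_like(xs)
      for s in samples:
          out += np.exp(-0.5 * ((xs - s) / bandwidth) ** 2)
      return out / (len(samples) * bandwidth * np.sqrt(2 * np.pi))

  xs = np.linspace(-4, 4, 81)
  density = kde_from_scratch([-1.0, -0.2, 0.1, 1.4], xs)
  print(density.sum() * (xs[1] - xs[0]))  # ~1.0 -- integrates to one, as a density should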


That answer somehow reminds me of an article in logicmag: An Interview with an Anonymous Data Scientist [1].

[1]: https://logicmag.io/intelligence/interview-with-an-anonymous...


I don't know many software engineers who have the ability to design and implement robust production systems.


The point of a grill is to block dust and debris, which presumably I want. Otherwise I would just not use one and my fans would be the quietest and would cool most effectively. What is the trade-off between the two metrics cited and the effectiveness of each type of grill at its intended purpose?


I think the purpose of the grill on a computer is not so much about dust and smaller debris as it is about (a) protecting user fingers against fan bites, and (b) protecting the fan against bigger pokey things that could damage the fan blades or jam the fan.


In super small form factor PCs, sometimes you use a fan grille on the inside, to make sure that wires, etc on the interior don't get caught in the fan.


hmm, I always assumed there was some other benefit. I'm taking all mine off. Seems pretty easy to not ram my finger in it.


No roaming pets or children I take it :p.

Debris is usually covered by a foam filter in front of the fans (if at all). I'll usually take that off though and just clean it every once in a while.


No, I comment on posts about pc fan grills at 6am...I'm single :)


That's exactly the kind of thing I do in the morning over coffee in those quiet few minutes before the children are up. Rest assured that you can still continue to comment on obscure tech minutiae at odd hours of the morning.


I have a large fan at home that I remove the front protector from, because it gets dirty often and the fan is quieter without it.

My young nephew shoves his hand in it all the time for fun with zero harm.


On the flip side, a few of my hobbies include devices where the fans are usually exposed - and I've cut myself a handful of times because of it.

It's very size/shape/rpm dependent.


"It doesn't affect me, so it cannot be an issue"


Big fans are at a relatively low RPM, and household fans might be made with kids sticking their fingers in them in mind. I've got a little noisy desk fan with metal blades in it that I would prefer not to take my chances with. I've also cut myself on my RC Plane's props a few times.


zero harm to him or to the blade?


I really love Scala for all of these reasons, it also has the benefit of being used frequently in industry (big data, etc).

