Hacker News
Grok: Is thousands of LOC a day in C a big deal even if the "coder" uses an LLM?
6 points by adinhitlore 5 months ago | 18 comments
Since I got addicted to vibe-coding (for the unilluminated: "vibe-coding" = using an LLM to generate code), I asked Grok a couple of days ago whether producing thousands or more LOC/day in a complex language like, say, C is a lot or not, especially since the project involves AI (so it's a 4-digit LOC number per day for a complex task; we're not talking about a Notepad clone, a PoS system, a dental-appointment app, a crypto wallet, or anything a junior dev should do).

Here is the thing though: while one may be a total newbie who can barely type code beyond, say, downloading Python 3.9, if you have to deal with a tremendous amount of code you still have to compile it, address potential errors during compilation, and catch cases where the LLM gives you code that erroneously works against your goal (example: it automatically put a safety 'alignment' on my project, basically forbidding 'rm -rf' from being run on my computer... but I'm on Windows, so I saw this "safety" feature and just manually deleted it from the code).

The question is: Is there any difference between a junior dev or rather someone just starting and someone who's been coding for years or even decades? In a way it's kind of like asking "do mathematicians use calculators the same way non-mathematicians use them?" I guess the difference is minimal?



In every language you can take 100 LOC and compress them into less than half by abstracting and optimizing. What I mean is that LOC doesn't tell you much about your software, but past a certain point there is probably a lot of repetitive logic, and the code is more likely to have issues because it's not abstracted properly.

The hard part about programming is not writing down some syntax that does some function. The hard part is designing the system to scale properly as you add more features, it's dividing the problem into parts that make sense so that they can be easily understood and reasoned about. Typing the solution into an editor is one of the easy steps once you understand the solution.


I don't understand the premise of the title. Thousands of LOC a day is more problematic if you didn't write it by hand, because you haven't spent time directly understanding it and experiencing its meaning as it flowed out of your head, which is what trains you to build the necessary big-picture model.

> (example: it automatically put a safety 'alignment' on my project, basically forbidding 'rm -rf' from being run on my computer... but I'm on Windows, so I saw this "safety" feature and just manually deleted it from the code)

I hope you replaced it with corresponding rules about `deltree` and `rmdir`.
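For what it's worth, such a guard is usually just a pattern match on the command string. A minimal sketch of what covering both platforms might look like (the patterns, function name, and test strings are all hypothetical, not taken from the OP's project):

```python
import re

# Hypothetical patterns: the LLM's guard only covered `rm -rf`; on Windows
# the equivalents would be `deltree` (legacy DOS) and `rmdir /s` / `rd /s`.
DESTRUCTIVE_PATTERNS = [
    r"\brm\s+-rf\b",           # Unix recursive force-delete
    r"\bdeltree\b",            # legacy DOS recursive delete
    r"\b(?:rmdir|rd)\s+/s\b",  # Windows recursive directory removal
]

def is_destructive(command: str) -> bool:
    """Return True if the command matches any known destructive pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in DESTRUCTIVE_PATTERNS)

assert is_destructive("rm -rf /")
assert is_destructive("rmdir /s /q C:\\project")
assert not is_destructive("rm notes.txt")
```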

> The question is: Is there any difference between a junior dev or rather someone just starting and someone who's been coding for years or even decades

There are many differences in how they approach the task of coding. I couldn't tell you anything specific about LLM use. Certainly I have seen explanations of how more experienced devs can get value from using LLMs, in ways that are basically inaccessible to juniors.

But if the overall differences could be simply and usefully explained, it wouldn't take years to become senior. I strongly advise you to stop looking for shortcuts and start building your understanding.


LOC are not all the same.

My heuristics for assessing the difficulty of changes revolve around "load bearing-ness".

It's basically free to change code no one uses. It's not bearing any load. If your metric is LOC you can be very "productive"! But from another perspective you cannot tell if you are productive yet, the value is not realized until people pay for it and rely on it.

Changes in software that people use affect "load bearing" LOC. There is "pucker factor".

You can kinda multiply together: what is the inherent complexity of the domain, is the change reversible, is your change stateful, how many customers use it, how deeply do they interface with it, how many stakeholders are involved in the change, what are the consequences of failure / mistakes, how many cases / paths do you need to test - to get a feel for the true difficulty of a change.
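As a rough sketch of that multiplication (a toy scoring function of my own, with made-up weights, not an established formula):

```python
def change_difficulty(
    domain_complexity: float,  # 1 = trivial domain, 5 = gnarly
    reversible: bool,          # can you roll the change back easily?
    stateful: bool,            # does it touch persistent state?
    customers: int,            # how many users exercise this path
    interface_depth: float,    # 1 = surface API, 5 = deep integration
    stakeholders: int,         # people who must sign off
    failure_cost: float,       # 1 = shrug, 5 = front-page incident
    test_paths: int,           # cases/paths you need to cover
) -> float:
    """Toy 'pucker factor': the factors multiply, they don't add."""
    score = domain_complexity * interface_depth * failure_cost
    score *= (1.0 if reversible else 3.0) * (2.0 if stateful else 1.0)
    score *= max(1, customers) ** 0.5  # more users, more pucker
    score *= max(1, stakeholders) * max(1, test_paths) ** 0.5
    return score

# Unused code: all factors at their minimum -> difficulty ~1, basically free.
# A stateful, irreversible change deep in a widely used interface
# multiplies out to something enormous.
```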

The difficulty of a change is separate from the value it creates, but often they're related. Why? Because in widely used software, if a thing lots of people value were easy, it would already have been done. Of course, pathologies do exist where people do difficult things for no good reason or foolishly leave easy money on the table.

Greenfields / startups are a special case / time, where difficulty seems temporarily uncoupled from value. You can make huge and potentially valuable changes quickly and easily if no one is using your thing. But there is no free lunch because the downside is that you are only creating potential value, you still have risk that all the code you wrote is useless.

This is why some people might say 30 LOC/day is a lot while others might name a much higher number. The people giving the lower number are the ones with existing customers; they are talking about "load bearing" LOC.


I'd say a very useful and simple formula one could devise might look something like:

f(n) = x/y, i.e. value = goal/loc.

"goal" here is just arbitrary number of usefullnes of code that aligns with some common uility like say 10 = ASI/eternal happiness and immortaility or whatever while 1 being 'terminator' and 7, 8 being useful business code so on. loc is self-expplanatory.

high goal / high loc -> value < 10/1 (a)
low goal / high loc -> value < 10/1 (b)
low goal / low loc -> value < 10/1 (c)
high goal / low loc -> value = 10/1 (d)

So the very best value possible is case (d): f(n) = 10/1 = 10. Any of the other three permutations will lead to a worse result.
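Taken at face value (a toy sketch of this metric; the goal scores are arbitrary, as stated above):

```python
def value(goal: float, loc: int) -> float:
    """Toy metric from above: value = goal / LOC (goal on a 1-10 scale)."""
    return goal / max(loc, 1)

print(value(10, 100))  # (a) high goal, high loc -> 0.1
print(value(2, 100))   # (b) low goal, high loc  -> 0.02
print(value(2, 1))     # (c) low goal, low loc   -> 2.0
print(value(10, 1))    # (d) high goal, low loc  -> 10.0, the optimum
```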

But while such a simple formula could illustrate the value of a software project overall, a key detail omitted is the lowest possible LOC a great goal may require: most people would agree that writing ASI in under 100 words, i.e. 'one line', is just not possible, nor anything else useful.

It's like complaining that you can only write 100,000 LOC a day when the project is an operating system or a 3ds Max clone. Sure, it's better to finish early, but there is a lowest bar that cannot be lowered, even with F# or whatever language saves you words.


I was not able to make much sense of that, but if you can create 10k+ lines of good C per day regardless of domain complexity, requirements, etc., you could create ffmpeg solo, including all codecs, in under 6 months: ffmpeg is on the order of a million lines of C, and at 10k lines/day that's roughly 100 working days.

That would make you, approximately, the best programmer in the world.


Well, maybe, but I swear this almost looks like P=NP, only reversed, lol. Let me explain: the classic example used to explain P vs NP is the subset-sum problem, where it's easy to verify a solution but hard to find one. Given {25, 39, 22, 12, 29, 18}, do any 2 numbers add to 47? Yes, the last 2 numbers: 18 and 29 add to 47. Yet here, if you use an LLM to make a great product, you have a "solution" pretty quickly, but verifying it will take some time, be it testing functionality or reading and understanding every single line. It's such a weird logical puzzle! The LLM throws you the solution, basically saying 'I did it', and your verification takes longer. Another way to look at it is that your verification is the real solution while the code generation is the "verification". Very confusing.
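The asymmetry is easy to demonstrate (a small sketch of the subset-sum example above): finding a pair means searching the combinations, while verifying a claimed pair is a single addition.

```python
from itertools import combinations

numbers = [25, 39, 22, 12, 29, 18]
target = 47

# "Finding": search every pair. O(n^2) here, and exponential for general
# subset-sum, which is what makes NP problems hard to solve.
found = next((p for p in combinations(numbers, 2) if sum(p) == target), None)
print(found)  # (25, 22) -- the search even turns up a pair besides (29, 18)

# "Verifying" a claimed solution: one constant-time check.
claimed = (29, 18)
print(sum(claimed) == target)  # True
```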

edit: in my previous post I meant that the best possible software maximizes the challenge of the program while minimizing the lines of code needed to implement it. So software rated 10/10 in every possible aspect, implemented in just a single line of code, is the best possible program. If you write it in more lines of code, it's worse in terms of value; if you decrease the quality but keep the loc low, it's still not optimal, since you sacrifice quality; and if you decrease the quality and increase the loc, that's the worst possible case.


It sucks to do a 2-line change that requires 2 weeks of validation :/

Agree that codegen can flip the effort from creating to validating. Some people believe the "true" intellectual capital created by the software development process is the understanding created inside the team's mind as they build it (not the code itself). Like, understanding of the domain, customers, market, how the software works, how they would build it again if they had the chance, their networks and relationships etc.

Probably it does hurt if codegen impacts that process. It's always been possible to externalise some of that understanding, by writing books, articles, etc. I suppose what we are starting to see is that that kind of intellectual capital will gradually reside in AI as well. For well-understood problems (TODO app, Space Invaders, etc.) the model will have the actual understanding that everyone generating those apps for the first time would otherwise need to build up.

I agree that 1 LOC that provides infinite value sounds like the "best possible program", but beyond that I can't see any hard and fast rules about LOC vs value; I think they're almost entirely uncorrelated. Mostly it's about what the lines do, but even then it's hard. What's more valuable: the Bible, the script to Star Wars, or a set of 747 technical manuals? Should they have used more or fewer words to achieve their goals?

Even with precise specifications the "best" way to implement large scale software is never clear and depends on many factors. Sometimes more duplication is better, sometimes you need more abstraction to reduce it. A lot depends on what your team (or contributors) are comfortable with and their abstraction ceiling.


It's a big deal if LOC is proportional to how much money you make. Otherwise it's an unequivocal burden; you couldn't pay me any amount to adopt a 10,000-SLOC C99 codebase.


I spent Friday and today using Copilot on a project, and I (or it) wrote about 13k LOC: about 5k LOC of Python code, 6k LOC of Jupyter notebooks, and 2k LOC of markdown files with agent orchestration. My time was spent waiting for Copilot to do the coding, then running tests and making sure they pass and are meaningful, and then debugging. The debugging is a very interesting experience: you devise various tests to tease out what's going wrong and narrow down to a very small context so the issue becomes apparent to both you and the LLM. I didn't actually write a single line of Python code myself. But a junior developer would have gotten stuck at the first problem that required debugging. These two days felt a lot like hard work. The only difference is that I have much more to show for it after 2 days of work than I used to.


5k in Python is tremendous, since the lack of {} delimiters will make reading a long function quite the journey. This is of course my naive assumption that some of the functions are very long; in theory you could have 1,000 functions of 5 lines each, but I doubt that's the case in practice, since LLMs tend to produce long and detailed statements from time to time.


The 5k lines of code are spread over 44 files. Most of this code is testing code. Just about 2k LOC are functional code. The functions are generally small. Only 5 functions have more than 100 LOC, and all have less than 200 LOC. All the functions have generous docstrings. I set up a system with a manager agent that would create action plans and executor agents that would execute, and then the manager had to review and accept the execution, and if not (it did happen once), the executor had to resume the execution until the manager confirmed the task was done. I stumbled upon this workflow by myself, and so far it looks like it works.
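A rough sketch of that loop (my paraphrase of the described workflow; `plan`, `execute`, `review`, and `resume` are hypothetical interfaces, and the round cutoff is assumed):

```python
MAX_ROUNDS = 5  # assumed cutoff; the comment doesn't specify one

def run_task(task, manager, executor):
    """Manager plans and reviews; executor works until the plan is accepted."""
    plan = manager.plan(task)          # manager drafts the action plan
    result = executor.execute(plan)    # an executor carries it out
    for _ in range(MAX_ROUNDS):
        verdict = manager.review(plan, result)
        if verdict.accepted:           # manager signs off: task is done
            return result
        # Rejected (as happened once above): resume with the manager's feedback.
        result = executor.resume(plan, verdict.feedback)
    raise RuntimeError("manager never accepted the execution")
```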


Most of my Python functions are less than 10 lines long and many are two or three lines long. At 10 I am already questioning if I should refactor it and at 20 my internal voice is screaming about it.

This does not change as the project gets larger.

LLMs don't write like this without very specific instructions, and it might be difficult even then. After all, most programmers don't write like that.

But I think they should, and I'm absolutely prepared to defend that.

Losing brackets doesn't make the code harder to read. After all, Python code must be properly indented for correctness. It does mean that you can't use the % key in vim to find the end of a block easily. But Python-specific editors can easily identify the end of a block and provide a "fold" feature if you like using those. I just prefer not to write code that would benefit from that.


Thousands of lines of code per day is an enormous red flag. That is well beyond your ability to write or review on your own in a single workday.

Pumping out that much code means you're not putting thought, care, or professionalism into the work. You're pumping out garbage and making it everyone else's problem to clean up.

If I saw anyone other than a greybeard developer pushing out code like this, I'd fire them. Very, very few people can actually work at this level and produce good code. Everyone else is just shitting out slop, ruining the product, and dragging down the entire team.

Additionally, only extremely inexperienced and naïve developers even think about LOC counts. If you're focusing on this, it tells me you have no idea what you're doing and that the code you're putting out has about as much value and utility as the shit my dog took this morning.


a good way to look at it is that problems solved are an asset, but lines of code are a liability. you pay the cost of maintaining that code because you want the benefit of the problems it solves.

note that LLM-generated code is even more of a liability because when you go to maintain it you will not have the context of the decisions that went into writing it - it's legacy code from day one.


It's just a tool; use it as a tool.

Remember when we had to create programs by reading the assembler manual and entering them byte by byte? It was kinda hard. Then someone invented a language, and a compiler started to do all the work you didn't want to do. The time it took to create a program went from months to minutes.

My only question is: why use something like Grok, since you know who runs it?


It's in the post above, lol: "a couple of days ago". I had quite the falling-out with Grok since around August 17th, when it started ignoring my questions about my mental health (literally just skipping them) on 2 occasions while it was online and 100% operational. I prefer ChatGPT; Claude is great for code too. Before that I used Grok because I guess I was too lazy to sign up on OpenAI, and with my Twitter profile I was already logged in.


SLOC as a metric of productivity was always useless. AI only proves this more.


15 LOC (just 15, not 15k) is par for a corporate dev on a good team. 100-300 is what you might expect from a "10x engineer" or an independent dev consistently in the flow state.

1k lines is too much to even review in a day and I would assume it's useless.

Typical disclaimers ("lines != statements != productivity") apply. Also, I'm not counting things like unit tests and documentation.



