> NEVER try to explain HOW your code works in a comment
Never is a strong word, sometimes the algorithm is just inherently complex and it's worth it to explain what you are doing.
Why is much more important, but it's also partially solved automatically by git blame and adding the task number to each commit message (and writing good commit messages, which IMHO is more important than writing good comments). Each line of your code has a commit message associated with it whether you care about it or not - make it useful.
Yeah algorithms in particular are where I'm hesitant to say "never explain how your code works." They can be dense, deeply-nested, and hard to break down into semantically-meaningful variables and function names. Sometimes it's good to explain what's going on there.
That's what people don't understand. We don't think in code, we have a translation layer from human language/thought that we think through when parsing a programming language. As long as the comments are kept accurate, they can greatly improve your ability to parse through a block of complex code, which is crucial when you're trying to hotfix an issue. It also gives programmers who are newer to the codebase a chance to understand what that block of code is doing, beyond it just being some black box with an input and output.
Yeah the how is important when auditing the correctness of code. I would say "Never listen to advice about never doing something, there is always a case where breaking the rules is good"
This is generally how I treat comments, docblocks on everything for api documentation, but inside functions, yeah, something clever or non-obvious gets a comment. I also tend to document complex interfaces / abstract classes more heavily to give people a sort of road map for implementing it.
I have been trying to explain this for some time and I’ve just had a new thought so I will add it here.
I think that we fundamentally start to lose the mental distinction between the language semantics and the idiomatic code. It’s all part of the scenery. So for a seasoned programmer, the what and the how blur together. There is no how to getting a value from an object, a struct, a tuple, or an array. You just do. Unless your language doesn’t have one of those things.
Non-idiomatic code is full of how. So what we seek is code that just tells us “what” and comments are a bargain we make with each other to guarantee a lower bound on that. Meanwhile the self documenting code people want a different arrangement, where you have to stoop to documentation less often by putting that energy into sustaining a constant output of idiomatic code.
Most nontrivial algorithms are just a bunch of nested for loops with array indexing and ifs mixed in. There were no advances in computer science that made this kind of stuff less confusing as far as I am aware. We made a lot of advances to make simple boring business code do 30 nested method calls, but at the end of the day the basic algorithm is still a bunch of loops and ifs.
It's actually funny how all the focus in CS seems to be on modeling the business part in more natural ways, while the "solving problems" part is still done like it's the 1970s.
You can make the algorithms clearer with good variable names and extracting some parts to well named methods (and you should, usually) but it's often not enough.
Example taken straight from Wikipedia:
    function BellmanFord(list vertices, list edges, vertex source) is
        // This implementation takes in a graph, represented as
        // lists of vertices (represented as integers [0..n-1]) and edges,
        // and fills two arrays (distance and predecessor) holding
        // the shortest path from the source to each vertex
        distance := list of size n
        predecessor := list of size n
        // Step 1: initialize graph
        for each vertex v in vertices do
            distance[v] := inf             // Initialize the distance to all vertices to infinity
            predecessor[v] := null         // And having a null predecessor
        distance[source] := 0              // The distance from the source to itself is, of course, zero
        // Step 2: relax edges repeatedly
        repeat |V|−1 times:
            for each edge (u, v) with weight w in edges do
                if distance[u] + w < distance[v] then
                    distance[v] := distance[u] + w
                    predecessor[v] := u
        // Step 3: check for negative-weight cycles
        for each edge (u, v) with weight w in edges do
            if distance[u] + w < distance[v] then
                error "Graph contains a negative-weight cycle"
        return distance, predecessor
This is pseudocode with some stuff abstracted away, yet they still added "what" comments. You can remove some of them and split it into several methods but for Step 2 I still kinda feel it's not enough information to make it obvious what is happening. What does it mean to "relax an edge"? Why do we need to do it V-1 times? Does order matter?
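For what it's worth, here is a rough Python sketch of the same pseudocode, with comments that try to answer those three questions inline. The function name `bellman_ford`, the `(u, v, weight)` tuple format for edges, and raising `ValueError` are my own assumptions for illustration, not part of the Wikipedia version.

```python
import math


def bellman_ford(num_vertices, edges, source):
    """Shortest paths from `source`.

    Assumed input format (hypothetical, for this sketch only):
    vertices are integers 0..num_vertices-1 and `edges` is a list
    of (u, v, weight) tuples.
    """
    distance = [math.inf] * num_vertices
    predecessor = [None] * num_vertices
    distance[source] = 0

    # "Relaxing" an edge (u, v) means: if the best known route to u,
    # extended by this edge, beats the best known route to v, adopt it.
    #
    # Why |V|-1 passes: a shortest path with no cycle has at most
    # |V|-1 edges, and each full pass settles at least one more edge
    # of every such path.
    #
    # Does order matter: edge order changes how quickly the distances
    # settle, but not the distances you end up with.
    for _ in range(num_vertices - 1):
        for u, v, w in edges:
            if distance[u] + w < distance[v]:
                distance[v] = distance[u] + w
                predecessor[v] = u

    # One extra pass: if any edge can still be relaxed, some reachable
    # cycle has negative total weight, so no shortest path exists.
    for u, v, w in edges:
        if distance[u] + w < distance[v]:
            raise ValueError("Graph contains a negative-weight cycle")

    return distance, predecessor
```

Whether those comments belong next to the loop or in separate documentation is exactly the question being argued here; the sketch only shows how much "how" it takes to make Step 2 readable on its own.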
I don't think fancy programming language features help here. You can change iteration into tail recursion or point-free functional code or whatever is fashionable but the underlying complexity remains.
And there are algorithms out there that are much trickier.
Code doesn't exist in a vacuum, it exists in the context of our general knowledge.
I would argue that the only comment really useful here would be one pointing to the paper/Wikipedia article/whatever that explains the algorithm.
If no such article/reference exists and/or your implementation has nuances you want to explain, that's where technical documentation should come in. For instance, check the Linux kernel's explanation of circular buffers and how to use them [1].
> What does it mean to "relax an edge"? Why do we need to do it V-1 times? Does order matter?
These are questions related to understanding the algorithm, not the implementation. There are much better ways to convey that information (including examples, diagrams, lengthy explanations, etc.). Trying to convey that information through comments in an implementation is an exercise in __not__ using the best tool for the job ;)
To an extent List Comprehensions are, if not a fix, at least an analgesic. But FP is definitely a case of The Future is Here, It's Just Not Evenly Distributed. Calling it an 'advance' might be a stretch, at this point. Lost wisdom?
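As a hedged illustration of the "analgesic" point, here is a small Python sketch of a Bellman-Ford-style relaxation pass and negative-cycle check written with comprehensions. The names `relax_once` and `has_improvable_edge` and the `(u, v, weight)` edge tuples are hypothetical, and note this is a variant: each pass reads only the previous pass's distances instead of updating in place.

```python
import math


def relax_once(distance, edges):
    """Return new distances after one pass over all edges.

    Variant for illustration: every new value is computed from the
    old `distance` list, so nothing is mutated in place.
    `edges` is assumed to be a list of (u, v, weight) tuples.
    """
    return [
        # Best of: the current distance to v, or any route that
        # arrives at v through an incoming edge (u, v).
        min([d] + [distance[u] + w for u, target, w in edges if target == v])
        for v, d in enumerate(distance)
    ]


def has_improvable_edge(distance, edges):
    """True if some edge could still shorten a distance."""
    return any(distance[u] + w < distance[v] for u, v, w in edges)
```

The nested loops and the `if` are all still there, just folded into expressions, which is roughly the sense in which this dulls the pain without removing the underlying complexity.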
I would chalk the complexity of the example you provided up to the inability of half of developers to handle four nested conditionals, and for most of the rest to handle five.
I sort of handwaved over the Death By A Thousand Cuts scenario, which I definitely believe is important. It doesn't matter how idiomatic your code is if you have enough of it. I don't maintain that idiomatic code is free by any means. It just doesn't have the factorial surface area of spaghetti code. It scales better, but scale absolutely matters.
If you count conditional branches, you'll see that parts of this code are pushing up against those limits. You're definitely into 'how' territory, but you've also shunted it off to a function that just 'does stuff'. So my code that relies on this for step 1 of a bigger answer doesn't give a shit (as long as the answer is correct, including boundary conditions).
It makes more sense if you emphasise COMMENT as well.
> NEVER try to explain HOW your code works in a COMMENT
Comments are for explaining why, not how. If some code is algorithmically complex enough to need explanation, that should go in documentation, not an inline comment.
One thing a friend of mine (hi Adrian!) did at my old workplace, that I found incredibly annoying at the time, but later realised was exactly the right thing to do, was set up a wiki for us to use and then refuse to answer any question or acknowledge any information except via that wiki. After a while I just gave in and before telling him about anything, stuck it on the wiki. Years later when I left, every time anyone asked me a question I could honestly answer "it's on the wiki."
People love to hate on Bezos (less here than elsewhere, though, I guess) but this is exactly the tack he used to convert Amazon's IT setup to services, and I think both were shrewd moves.