Hacker News | el_pollo_diablo's comments

Capturing invariants in the type system is a two-edged sword.

At one end of the spectrum, the weakest type systems limit the ability of an IDE to do basic maintenance tasks (e.g. refactoring).

At the other end of the spectrum, dependent types, and especially sigma types, capture arbitrary properties that can be expressed in the logic. But then constructing values of such types requires providing proofs of these properties, and the code and proofs end up inextricably mixed in an unmaintainable mess. This does not scale well: you cannot easily add a new proof on top of existing self-sufficient code without temporarily breaking it.

Like other engineering domains, proof engineering has tradeoffs that require expertise to navigate.


Years ago, the research team behind OCaml released Chamelle, a version of the language localized into French, as an April Fools' joke:

https://gallium.inria.fr/blog/ocaml-5/


I would put the emphasis on a different word:

> This article is made and published by Anna Hartz, which may have used AI in the preparation

Which, not who. They're not even sure the author is human!


Might be a language issue, as English is not the primary language of the newspaper's staff.


Not to mention that they insist on calling every entry of the list a "list head", which makes no sense (hysterical raisins, maybe?). The structure is made of a uniform loop of entries, one of which is used as the actual head & tail, or entry point into the structure.


In general, there is no “actual” head and tail - you could have multiple references to different parts of the list, and each of them would have a different head. If you’re recursing through a list, at some point every node will be used as the head. This is a common pattern in recursive data structures, particularly in functional languages.

Disclaimer: I haven’t looked at this author’s code, just pointing out that list nodes that consist of (head, tail) are a common pattern with a clear rationale.


Head and tail make sense for persistent lists in functional languages with value semantics, yes.

The intrusive, mutable, doubly-linked loops with reference semantics under discussion are quite different. Although all entries behave identically in the structure itself, one of them is _used_ differently, as a standalone anchor point, while the others are embedded in the list's "elements".
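
A minimal sketch to make that concrete (illustrative names, not the actual code under discussion): every entry has the same two-pointer shape, but the anchor stands alone while the other entries live inside the elements.

  #include <stddef.h>

  /* Every entry in the circular loop has exactly this shape. */
  struct list_entry {
      struct list_entry *prev, *next;
  };

  /* The elements embed an entry... */
  struct item {
      int value;
      struct list_entry link;
  };

  /* ...while the anchor is a bare, standalone entry that only serves as
     the entry point into the loop. */
  static struct list_entry anchor = { &anchor, &anchor };

  /* Recover an element from its embedded entry; the anchor must never be
     passed here, since it is not embedded in any element. */
  #define entry_to_item(e) \
      ((struct item *)((char *)(e) - offsetof(struct item, link)))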


Yes, it's terrible, and on top of that their list_add takes its parameters in the reverse order from what one might expect, with no types to catch mistakes!

See https://github.com/rustyrussell/ccan/blob/master/ccan/list/_...


Absolutely. Wrapping the distinguished entry point in a new structure type equipped with a thin type-safe wrapper API that covers the most common use case is the way to go.
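
For instance, a rough sketch of that approach (illustrative names and signatures, not ccan's actual API): the generic primitive takes two entries of the same type, so swapped arguments compile silently, whereas the thin typed facade turns them into a compile-time error.

  struct list_entry { struct list_entry *prev, *next; };

  /* Generic primitive: both parameters have the same type, so the
     compiler cannot tell the list apart from the element. */
  static void list_add(struct list_entry *entry, struct list_entry *head)
  {
      entry->prev = head;
      entry->next = head->next;
      head->next->prev = entry;
      head->next = entry;
  }

  /* Thin type-safe facade: the distinguished entry point gets its own
     type, so a swapped call no longer type-checks. */
  struct item      { int value; struct list_entry link; };
  struct item_list { struct list_entry head; };

  static void item_list_init(struct item_list *l)
  {
      l->head.prev = l->head.next = &l->head;
  }

  static void item_list_add(struct item_list *l, struct item *it)
  {
      list_add(&it->link, &l->head);
  }

A call with the arguments swapped then fails to compile instead of silently corrupting the list.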


Sure, but it is also very common for C programs to contain data structures that have one use in the program, and could still be instances of a generic type. You mentioned red black trees, which are a perfect example of that.


If, by "situation", you mean the development of a small program with so many constraints that using existing libraries is out of the question, then yes.

Otherwise, that seems unwise to me. Not every user of a generic type has to be generic. A major selling point of generic types is that you write a library once, then everyone can instantiate it. Even if that is the only instance they need in their use case, you have saved them the trouble of reinventing the wheel.

No colleague of mine may need 10 different instances of any of my generic libraries, but I bet that all of them combined do, and that our bosses are happy that we don't have to debug and maintain 10+ different implementations.
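
Purely as an illustration of "write once, instantiate per use" in plain C (a tiny growable array rather than a red black tree, with hypothetical names, to keep the sketch short):

  #include <stddef.h>
  #include <stdlib.h>

  /* One generic implementation, instantiated once per element type by
     each user that needs it. */
  #define DEFINE_VEC(name, T)                                          \
      struct name { T *data; size_t len, cap; };                       \
      static void name##_push(struct name *v, T x)                     \
      {                                                                 \
          if (v->len == v->cap) {                                       \
              v->cap = v->cap ? 2 * v->cap : 8;                         \
              v->data = realloc(v->data, v->cap * sizeof *v->data);     \
              if (!v->data) abort();                                    \
          }                                                             \
          v->data[v->len++] = x;                                        \
      }

  DEFINE_VEC(int_vec, int)          /* one colleague's instance */
  DEFINE_VEC(str_vec, const char *) /* another colleague's instance */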


OK, you’re telling me the upside, which we already know. Now what’s the downside?


You mean the downside that we also already know, i.e. that there are some situations where a custom data structure would be superior for various reasons (e.g. smaller footprint)?

Experienced programmers know when to reuse a generic library and when to roll their own. We all know that.

Yet you dismiss generic red black trees because there is no realistic single program that uses 10 key/value combinations. Not only is this false, but as I illustrated in my reply, a single program is not the relevant scope to decide whether to use a generic implementation. And as someone who has written a red black tree library, I am definitely in favor of reusing an implementation unless you have an excellent reason, which "I do not need 10 different instances in my program" or "my favorite language and its standard library only have arrays built in" most definitely are not.


> Yet you dismiss generic red black trees

I use generic data structures all the time. That’s the default in programming. We know their advantages.

I am trying to say there is a world out there that doesn’t look like ours and it works just fine and has other advantages.

Are you saying the only real way to program is with generic data structures? I’m saying no because nobody did prior to the late 80s.

> I am definitely in favor of reusing an implementation unless you have an excellent reason

Let me try one more time. If you examine a program you won’t find 10 different red black trees. Data tends to follow a particular path and there is probably 1 red black tree and it’s a core feature of the application.

If that’s true then it’s not a big deal to write that core feature of your application.

It avoids premature optimization and code bloat because you tend to use complex data structures when they have a need.

Array is a good default. It’s how computer memory works, not just what happens to be lying around.


> Are you saying the only real way to program is with generic data structures?

Certainly not. As I said, the experienced programmer knows when (not) to use them. Some programs are better off without them, such as... most of the low-level software I write (security-critical operating systems).

> Data tends to follow a particular path and there is probably 1 red black tree and it’s a core feature of the application. If that’s true then it’s not a big deal to write that core feature of your application.

"It's not a big deal" are famous last words. Maybe it is not a big deal, but unless you have hard constraints or measurements that confirm that you would not be served well enough by an existing solution, it may very well be the reason why your competitors make progress while you are busy reinventing, debugging, maintaining, and obtaining certification for a wheel whose impact you overestimated.

> It avoids premature optimization and code bloat because you tend to use complex data structures when they have a need.

Avoiding code bloat? Agreed, a custom solution cannot be worse than a generic one. Avoiding premature optimization? Quite the contrary: going straight for the custom solution without first confirming, with actual measurements, that the generic solution is the unacceptable bottleneck, is textbook premature optimization.

I am sorry but I do not understand what you are getting at.


Sure, but your alternative code incorrectly assigns to (list)->payload. You have many other options. Without typeof, you can if(0) the assignment, or check type compatibility with a ternary operator like 1 ? (item) : (list)->payload and pass that to _list_prepend, etc. With typeof, you can store item in a temporary variable with the same type as (list)->payload, or build a compound literal (typeof(*(list))){.payload=(item)}, etc.
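
For concreteness, a self-contained sketch of those variants (the node layout and the _list_prepend signature below are assumptions for illustration, not the actual library under discussion):

  /* Assumed setup, for illustration only: a node with a typed payload
     and an untyped prepend primitive. */
  struct str_node { const char *payload; struct str_node *next; };
  void _list_prepend(void *list, const void *item);

  /* Without typeof: if(0) turns the assignment into a pure compile-time
     type check that never executes. */
  #define list_prepend_if0(list, item) do {                      \
          if (0) (list)->payload = (item);                       \
          _list_prepend((list), (item));                         \
      } while (0)

  /* Without typeof: both branches of the ternary must have compatible
     types; its value is (item), which is what gets passed on. */
  #define list_prepend_ternary(list, item) \
          _list_prepend((list), (1 ? (item) : (list)->payload))

  /* With typeof: the temporary has exactly the payload's type, so an
     incompatible (item) is rejected at its initialization. */
  #define list_prepend_typeof(list, item) do {                   \
          typeof((list)->payload) _it = (item);                  \
          _list_prepend((list), _it);                            \
      } while (0)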


The assignment is intentional. The union was changed to a struct.


This is about a function definition, not a random function declarator. C23 does not change anything in that case.


C23 does not change anything in this situation, because we are talking about the definition of main(), not a forward declaration. More details here:

https://news.ycombinator.com/item?id=38729278#38732366


In what situation does fn() not mean fn(void) under C23?


None, but that is not my point. Before C23, fn() already meant the same thing as fn(void) in function definitions, which is the situation under discussion here.

C23 changed what fn() means outside a function definition.
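
A small illustration of the difference, as I read the standards (compile with -std=c17 or -std=c23 and -pedantic to check):

  /* Standalone declaration: before C23 this leaves the parameters
     unspecified (old K&R style); since C23 it declares g(void). */
  int g();

  /* Definition: in every revision of the language, the function defined
     here takes no parameters, so the code generated for it is the same;
     what changes in C23 is that this form now also provides a prototype,
     exactly as f(void) would. */
  int f() { return 42; }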


Oh, yeah, the codegen for fn() itself would likely be the same, but that definition is still K&R-style and provides no prototype. https://godbolt.org/z/Psvae55Pr


> In fact, even state-of-art compilers will break language specifications (Clang assumes that all loops without side effects will terminate).

I don't doubt that compilers occasionally break language specs, but in that case Clang is correct, at least for C11 and later. From C11:

> An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.
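
A small sketch of how that clause plays out, as I read it (actual behavior depends on the compiler and optimization level):

  /* The controlling expression is not a constant expression and the body
     has no side effects, so the implementation may assume the loop
     terminates and is allowed to remove it entirely. */
  void spin_maybe(unsigned x)
  {
      while (x != 0) { }
  }

  /* The controlling expression is a constant expression, so the quoted
     permission does not apply: this must remain an infinite loop. */
  void spin_forever(void)
  {
      while (1) { }
  }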


C++ (until the future C++26 is published) says this about all loops, but as you noted, C itself does not: the permission only covers loops "whose controlling expression is not a constant expression".

Thus in C the trivial infinite loop for (;;); is supposed to actually compile to an infinite loop, as it should with Rust's less opaque loop {} -- however LLVM is built by people who don't always remember they're not writing a C++ compiler, so Rust ran into places where they're like "infinite loop please" and LLVM says "Aha, C++ says those never happen, optimising accordingly" but er... that's the wrong language.


> Rust ran into places where they're like "infinite loop please" and LLVM says "Aha, C++ says those never happen, optimising accordingly" but er... that's the wrong language

Worth mentioning that LLVM 12 added first-class support for infinite loops without guaranteed forward progress, allowing this to be fixed: https://github.com/rust-lang/rust/issues/28728


For some context, 12 was released in April 2021. LLVM is now on 20 -- the versions have really accelerated in recent years.


At least it's not just clownish version acceleration. They decided they wanted versions to increase faster somewhere around 2017-2018 (4.xx), and the version increase is more or less linear before and after that time, just at different slopes.


And it's now a yearly major release, is it not?

Same as with GCC.


I can't say I'm happy with "yearly broken backward compatibility," but at least it's predictable.


Looks like two major versions per year for LLVM.


Sure, that sort of language-specific idiosyncrasy must be dealt with in the compiler's front-end. In TFA's C example, consider that their loop

  while (i <= x) {
      // ...
  }
just needs a slight transformation to

  while (1) {
      if (i > x)
          break;
      // ...
  }
and C11's special permission does not apply any more since the controlling expression has become constant.

Analyses and optimizations in compiler backends often normalize those two loops to a common representation (e.g. control-flow graph) at some point, so whatever treatment that sees them differently must happen early on.


In theory; in practice, it depends on the compiler.

It is no accident that there is ongoing discussion about Clang getting its own IR, as the other frontends have, instead of emitting LLVM IR directly into the next phase.

