Personally I’m not a fan of Go’s default zero-initialisation. I’ve seen many bugs caused by adding a new field and forgetting to update constructors to initialise it to a “non-zero” value. I prefer Rust’s approach, where one has to be explicit.
That being said, it’s way less complex than C++’s rules, and that’s welcome.
I spent a year and a half writing Go code, and I found that it promised simplicity, but there’s an endless number of these kinds of issues where it boils down to "well, don't make that mistake".
It turns out that a lot of the complexity of modern programming languages comes from the language designers trying to make mistakes harder.
If you want to simplify by synthesising decades of accumulated knowledge into a coherent language, or to remove deprecated ideas (instead of the evolved spaghetti you get from decades of updating a language), then fine. If your approach to simplicity is to just not include the complexity, you will soon discover that the complexity was there for a reason.
The problem you are describing in Go is rarely a problem in C++. In my experience, a mature code base rarely has things with default constructors, so adding a new field will cause the compiler to complain that there's no default constructor for what you added, therefore avoiding this bug. Primitive types like `int` usually have a wrapper around them to clarify what kind of integer they hold, and the same goes for standard library containers like vector.
However I can't help but think that maybe I'm just so fortunate to be able to work in a nice code base optimized for developer productivity like this. C++ is really a nice language for experts.
Compare `int albumId, songId;` versus `AlbumId albumId; SongId songId;`. The former two variables can be assigned to each other, causing potential bugs and confusion. The latter two cannot. Once you have a basic wrapper for integers, further wrappers are just a one-liner, so why not? In practice, making the type more meaningful also leads you to shorter variable names, because the information is already expressed in the types.
Wouldn’t it just be considered bad practice to add a field and not initialize it? That feels strongly like something a code review is intended to catch.
It’s easy to miss this in large codebases. Having to check every single struct initialisation whenever a field is added is not practical. Some folks have mentioned that linters exist to catch implicit initialisation, but I would argue this shouldn’t require a third-party project that is completely opt-in to install and run.
I just finished this. To each their own of course, but I found the writing too padded and tonally off-putting at times. Some of the stories felt dated, both technologically and culturally. I prefer Azure's Cloud Pattern docs myself (though "Release It!" was really good if you prefer a storytelling approach).
This looks incredibly comprehensive, thanks for sharing!
Should have added that I read this book in 2016, and the first edition is even older, so there’s naturally been lots of new (and exciting) developments in this area!
Overall, I am for frame pointers, but after some years working in this space, I thought I would share some thoughts:
* Many frame pointer unwinders don't account for a problem they have that DWARF unwind info doesn't have: the fact that the frame set-up is not atomic. It's done in two instructions, `push %rbp` and `mov %rsp, %rbp`, and if a snapshot is taken during the `push`, we'll miss the parent frame. I think this might be fixable by inspecting the code, but that would only be a heuristic, as there could be other `push %rbp` instructions unrelated to the stack frame. I would love to hear if there's a better approach!
* I developed the solution Brendan mentions, which allows faster, in-kernel unwinding without frame pointers using BPF [0]. This doesn't use DWARF CFI (the unwind info) as-is but converts it into a random-access format that we can use in BPF. He mentions it not supporting JVM languages, and while it's true that right now it only supports JIT sections that have frame pointers, I planned to implement a full JVM interpreter unwinder. I have since left Polar Signals and shifted priorities, but it's feasible to get a JVM unwinder to work in lockstep with the native unwinder.
* In an ideal world, enabling frame pointers should be decided on a case-by-case basis. Benchmarking is key, and the tradeoffs you make might change a lot depending on the industry you are in and what your software is doing. In the past I have seen large projects enable or disable frame pointers without doing an in-depth assessment of the losses/gains in performance and observability, and how they connect to business metrics. The Fedora folks have done a superb and rigorous job here.
* Related to the previous point, having a build system that lets you change this system-wide, including the libraries your software depends on, can be awesome, not only to test these changes but also to put them in production.
* Lastly, I am quite excited about SFrame that Indu is working on. It's going to solve a lot of the problems we are facing right now while letting users decide whether they use frame pointers. I can't wait for it, but I am afraid it might take several years until all the infrastructure is in place and everybody upgrades to it.
On the third point, you have to do frame pointers across the whole Linux distro in order to be able to get good flamegraphs. You have to do whole-system analysis to really understand what's going on. The way that current binary Linux distros (like Fedora and Debian) work makes any alternative impossible.
It could be one instruction: ENTER N,0 (where N is the amount of stack space to reserve for locals)---this is the same as:
PUSH EBP
MOV EBP,ESP
SUB ESP,N
(I don't recall if ENTER is x86-64 or not). But even with this, the frame setup isn't atomic with respect to CALL, and if the snapshot is taken after the CALL but before the ENTER, we still don't get the frame setup.
As for the reason why ENTER isn't used, it was deemed too slow. LEAVE (MOV ESP,EBP; POP EBP) is used as it's just as fast as, if not faster than, the sequence it replaces. If ENTER were just the PUSH/MOV/SUB sequence, it probably would be used, but it's that other operand (which is 0 above in my example) that kills it performance-wise (it's for nested functions to gain access to outer stack frames and is very expensive to use).
Great comments, thanks for sharing. The non-atomic frame setup is indeed problematic for CPU profilers, but it's not an issue for allocation profiling, off-CPU profiling, or other types of non-interrupt-driven profiling. But as you mentioned, there might be ways to solve that problem.
There's always room for improvement, for example, Samply [0] is a wonderful profiler that uses the same APIs that `perf` uses, but unwinds the stacks as they come rather than dumping them all to disk and then having to process them in bulk.
Samply unwinds significantly faster than `perf` because it caches unwind information.
That being said, this approach still has some limitations, such as that very deep stacks won't be unwound, as the size of the process stack the kernel sends is quite limited.
Inlined functions can be symbolized using DWARF line information [0], while unwinding requires DWARF unwind information (CFI), which the x86_64 ABI mandates in every single ELF in the `.eh_frame` section.
So not a perf issue there, but they don't think the workflow is suitable for whole-system profiling. Perf issues were in the context of `perf` using DWARF:
Once it’s loaded in memory, if Kernel Samepage Merging is enabled it might not be as bad, but I would love to hear if somebody has any thoughts.
https://docs.kernel.org/admin-guide/mm/ksm.html
> KSM only merges anonymous (private) pages, never pagecache (file) pages.
So it wouldn't be able to help with static libraries loaded from different executables. (At any rate, they'd have to be at the same alignment within the page, which is unlikely without some special linker configuration.)
It uses eBPF to provide instrumentation of kernel calls, as well as hooking into networking for HTTP/2, pgsql, etc. Since the instrumentation runs in eBPF it’s essentially sandboxed, and all memory, kernel function calls, and even profiling are an option. They have an agent that collects this information and sends it to the server over RPC (protobuf/gRPC). You should check it out (however, some of the docs are in Chinese).
> High-performance storage engines. There are a number of storage engines and key-value stores optimized for flash. RocksDB [36] is based on an LSM-Tree that is optimized for low write amplification (at the cost of higher read amplification). RocksDB was designed for flash storage, but at the time of SATA SSDs, and therefore cannot saturate large NVMe arrays.
From this slightly tangent mention, I am guessing not.
Curious what the overhead of dealing with files in Go would be if finalisers were cheaper / not in use.
Sorry, there’s no link to the source (AFK right now), but files opened with os.Open will be automatically closed once all references to them have been collected.
Found out about this behaviour some months ago while debugging some code at work.
But you'd be forgiven for not knowing that: To my own surprise, I could not find the need for this idiom explained in the package documentation for os.Open, though you can see it in action throughout the std implementation. For example: https://cs.opensource.google/go/go/+/refs/tags/go1.20.4:src/...
FWIW, I never advocated for not explicitly calling Close(). I brought up the finalizer in File because it seems to have performance implications and it’s called for every file handle.
> Fd returns the integer Unix file descriptor referencing the open file. If f is closed, the file descriptor becomes invalid. If f is garbage collected, a finalizer may close the file descriptor, making it invalid; see runtime.SetFinalizer for more information on when a finalizer might be run.
To give more context on how I found this out: we were missing a Close for a couple of files in our codebase. As soon as I realised, I added them, but I checked production for file descriptor leaks and there were none. Checking Go’s source code led me to the finalizer, and this doc confirmed what I saw in the code.
Strong disagree. If a long-running program opens a lot of files, has no references to them and they are not closed, those files will stay open, using system resources. There's no way for the Go code to close them. Yes, it's a bug to open a file and lose all references before closing it, but the runtime is doing the right thing by closing them for the program.
You could go the other way: the program is incorrect if it doesn't explicitly close files. Allowing incorrect programs to somehow work correctly only encourages programmers to create more advanced incorrect programs, until they exceed the ability of the language/runtime to fix.
I tend more towards this line of thought. But I'd still put a finalizer on file handles. I just would have it yell very loudly that there's a bug. Maybe even close the file, but there's no way I wouldn't at the very least be generating a lot of diagnostics saying things were broken.
I agree with your take on correctness. The problem that you run into is that GC and thus finalization is triggered by memory pressure. But closing a file doesn't relieve memory pressure, it relieves FD pressure. So what you'd want is some sort of native thing not related to GC that closes unused files.
This isn't the only resource like this. If you write to a lot of temporary files, you'd want to start cleaning those up before the OS returns "no space left on device". If you listen on a lot of TCP ports, you want to clean up stale listeners before you've listened 65535 times. If you spawn a subprocess, you want to start reaping the children when PID pressure prevents creating future subprocesses.
This all ends up being rather esoteric so many programming languages just say "tell me as soon as you're done" and you pay the cost right then. This isn't optimal (especially "x := Open(thing); defer x.Close(); do something with x; do a bunch of slow stuff without x"). A big downside is that it's possible to have an object in memory that refers to something in an invalid state; you can write "x := Open(thing); x.Close(); x.Read()" just fine in many languages; the code compiles and it runs, but it returns incorrect results. This could be an area of focus for future languages, but I doubt much of this would be general. A lot of nitty gritty specifics depending on what resources you want to track.
It would also be weird to not close files just because someone turned GC off, right? If you tie finalizers to memory management, if the memory management isn't under memory pressure, then you start leaking FDs. Weird! All in all, the memory management subsystem is a weird place to manage non-memory resources. So to get this right, you really need something GC-like for everything that's not memory.
I think people are looking for something like, or are already very comfortable with, C++'s RAII. "I am certain this variable is allocated on the stack, so the compiler should fail to compile if that's untrue, and a Close method should be called when the function returns." But sometimes you will want the resource to be allocated beyond the current stack frame, and now you need special syntax for that. (Newer languages like to avoid exposing stack vs. heap to the programmer and let The Algorithm pick the best place. C++ never considered that an option so you do get some extra flexibility when you need it.)
Anyway this is a long post to say "all options are bad".
Sure. But if you think that a file will be closed by the runtime "when there are no more references to it", then you're prone to assume that the file must be closed when you later reopen it or pass that filename to some other process. And most of the time the file will indeed be closed, until the GC just happens to run the finalizer a little bit later than usual.
It's very hard to debug such bugs.
IMO it's much easier to make the leak more prominent, so that people are incentivised to properly close the files.
It's possible to use Go's pprof to find leaks. Also, if it's important enough then runtime.SetFinalizer(fd, any) will allow the programmer to do something appropriate.
Granted, but that only works for a subset of all cases. Remember that FDs can also refer to sockets and other things that aren't "files". A careful programmer will ensure that all FDs get cleaned up, but even careful programmers can fall victim to corner cases.
I don't see why 'defer' wouldn't work for FDs or any other resource. There's nothing about 'defer' that says it will work only with 'files'. In fact I've been using it for general-purpose resource management for a while now.