Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This is one reason why I explored model checking machine code output, since at this point, the behavior is very much defined, even if it differs from implied source behavior. But, this gets complicated for other reasons.


This is actually one of the reasons I've been tinkering with CBMC https://www.philipzucker.com/pcode2c/ The idea is to use Ghidra to lift a specialized C interpreter of the machine code and then compare to source or ask other questions via CBMC


This kind of analysis is excellent.

I do recommend standardizing on hardware and toolchain for projects like these, as it can ensure that the machine code is matched.

The third phase of my work will round the corner with machine code analysis. Currently, I'm working on constructive proofs and equivalence proofs of imported C code to handle the pieces that CBMC doesn't do so well, as part of my second phase. So far, I can extract efficient C from Lean 4, but I want to import C directly into Lean 4 to prove it equivalent to extracted code. Hand-written C is easier to read than the line noise I can extract.

In particular, model checking doesn't fare well with data structures, algorithms, and cryptography. These can be torn down quite a bit, and at least we can verify that the algorithms don't leak memory or make bad integer assumptions. But, if we want to verify other properties, this requires constructive proofs.

Between 80 and 95 percent of the time, depending on the type of code being written, CBMC is enough. I'm now solving for that 5 - 20 percent of code that CBMC can't easily tackle.


> model checking doesn't fare well with data structures, algorithms, and cryptography

To the contrary, that's what I'm using it for in most of my projects. It found interesting algorithmic bugs in my ctl find function (3-way comparison callback missing cases), is used to crack poor hashes. Also my tiny-regex matcher profited from it.

Also a lot of baremetal firmware.


ooooooh. link?

What do you mean by standardizing on hardware and toolchain? I am currently choosing ghidra and cbmc, and it seems like the approach is applicable to any compiler or arch that ghidra supports without too much change. I have only been doing x86-64 and gcc so far though


Sadly, I don't have a link for this yet. My goal is to write this up once I have a compelling example.

In the meantime, check out my github. https://github.com/nanolith

Currently, I'm feeding commits into libcparse, which is the start of this effort. That stream is currently about 180 days out of phase with what's in my local repositories, but it is slowly trickling in. The first step of this second phase will be to use libcparse to verify itself via constructive proofs in Lean. Currently, and by that I mean what hits in March or April, libcparse has a mostly C18 compliant lexical scanner for the C preprocessor. The goal is to have a SAX-like C18 parser that can detect all C declarations and definitions. Tooling will include a utility that imports C declarations and definitions into both Lean 4 and Coq. This is a moonlighting effort for an upcoming book I intend to write.


As for standardizing on hardware and toolchain, what I mean is that I try to get projects to commit to using specific hardware and a specific toolchain. The goal is to limit variables for automated testing and analysis.

OpenBSD, in particular, is adamant about testing their OS on real hardware for platforms that they support. Their definition of "working" is on real hardware. I think that's a reasonable goal.

It helps with analysis like this, as you don't have to worry about different compilers or different hardware. GCC, in particular, can produce different output depending on which hardware it was compiled on, unless a specific configuration is locked down. If there is a flaw in how it does vector operations on a particular Xeon processor, this won't be caught if testing locally on a Core i5 processor that produces different code.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: