There's a hybrid approach of C -> WASM -> C compilation, which ends up controlling every OS interaction and sandboxing memory access like WASM, while technically remaining C code:
WASM sandboxes don't do much to guarantee the soundness of your program. It can hose your memory all it wants; it just can only do so within the confines of the sandbox.
Using a sandbox also limits what you can do with a system. With stuff like SECCOMP you have to methodically define policies for all its interactions. Like you're dealing with two systems. It's very bureaucratic, and the reason we do it is that we don't trust our programs to behave.
With Fil-C you get a different approach. The language and runtime offer a stronger level of assurance that your program can only behave, so you can trust it more to have unfettered access to the actual system. You also have the choice to use Fil-C with a sandbox like SECCOMP as described in the blog post, since your Fil-C binaries are just normal executables that can access powerful Linux APIs like prctl. It took Linux twenty years to invent that interface, so you'll probably have to wait ten years to get something comparable from WASI.
> It can hose your memory all it wants; it just can only do so within the confines of the sandbox.
True, although as I understand it the WASI component model at least allows multiple fine-grained sandboxes, so it's somewhere in-between per-object capabilities and one big sandbox for your entire program. I haven't actually used it yet so I might be wrong about that.
> so you'll probably have to wait ten years to get something comparable from WASI
I think for many WASI use cases the capability control would be done by the host program itself, so you don't need OS-level support for it. E.g. with Wasmtime I grant capabilities from the host side when building the WASI context.
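Roughly like this (a sketch from memory, so treat the exact WasiCtxBuilder method names and signatures as approximate; they have moved around between Wasmtime releases, and this assumes the anyhow crate for errors):

use wasmtime_wasi::{DirPerms, FilePerms, WasiCtx, WasiCtxBuilder};

// The host decides exactly what the guest can see: stdio plus one preopened
// directory mapped to /data inside the guest. No ambient filesystem or
// network access exists unless it is granted here.
fn build_wasi_ctx() -> anyhow::Result<WasiCtx> {
    let mut builder = WasiCtxBuilder::new();
    builder.inherit_stdio();
    builder.preopened_dir("./sandbox", "/data", DirPerms::all(), FilePerms::all())?;
    Ok(builder.build())
}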
WASI is basically CORBA, and DCOM, PDO for newer generations.
Or, if you prefer the bytecode-based evolution of them, RMI and .NET Remoting.
I don't see it going that far.
The WebAssembly development experience in the browser mostly still sucks, especially the debugging part, and on the server it is yet another bytecode.
Finally, there is hardly any benefit over OS processes talking over JSON-RPC (aka how REST mostly gets used), GraphQL, gRPC, or plain traditional OS IPC.
You've named half of the weasel security technologies of the last three decades. The nice thing about SECCOMP BPF is it's so difficult to use that you get the comfort of knowing that only a very enlightened person could have written your security policy. Hell is being subjected to restrictions defined by people with less imagination than you.
WASM sacrifices guest security & performance in order to provide mediocre host sandboxing, though. It might be a useful tradeoff sometimes, but proper process-based sandboxing is so much stronger and lets the guest also have full security & performance.
How is process-based sandboxing stronger? Also the performance penalty is not only due to sandboxing (I doubt it's even mostly due to it). Likely more significant is the portability.
Because the guarantees themselves are stronger. Process isolation is something we have decades of experience with; it goes wrong every now and then, but those are rare instances, whereas what amounts to application-level isolation is much weaker both in the guarantees it provides and in the maturity of the code. That suggests that if you base your isolation scheme on processes rather than 'just' sandboxing you will come out ahead, and even with all other things being equal you'd have one more layer in your stack of swiss cheese slices. A VM would offer another layer of protection on top of that, one with yet stronger guarantees.
No, because the host can present the same interface on every platform. I do think that WASI is waaay too much "let's just copy POSIX and screw other platforms", but it does work pretty well even on Windows.
Running ffmpeg compiled for wasm and watching as most codec selections lead to runtime crashes due to invalid memory accesses is fun. But, yeah, it’s runtime safety, so going to wasm as a middle step doesn’t do much.
> Running ffmpeg compiled for wasm and watching as most codec selections lead to runtime crashes due to invalid memory accesses is fun.
For all you know that’s a bug in the wasm port of the codec.
> it’s runtime safety
So is Fil-C
The problem with wasm is that an OOBA (out-of-bounds access) in one C allocation in the wasm guest can still give the attacker the power to clobber any memory in the guest. All that’s protected is the host. That’s enough to achieve weird machine execution.
Hence why I say that wasm is a sandbox. It’s not memory safety.
Finally reality is catching up with the WASM sales pitch about security and how much better it supposedly is than every other bytecode format introduced since 1958.
Wasm was great because it was lightweight, easy to target from any language, and let you create any custom interaction API with the host. That's becoming less true as they bolt on features no one needed (GC) and popularize standardized interfaces that contain the kitchen sink (WASI), but these things can still be treated as optional, so it can still be used for much more flexible use cases than Java or .NET.
There’s no guarantee the toolchains will support WASM “preview” forever and make the bloat optional, and even if they do you could still end up in an ecosystem where it would be unviable. At some point you’re probably better off just compiling to RISC-V and using an emulator library instead.
Wasm now supports multiple modules and multiple linear memories per module, so it ought to be quite possible to compile C to Wasm in a way that enforces C's object access rules, much like CHERI if perhaps not Fil-C itself.
You wouldn't be able to get quite as fine-grained. One memory per object is probably horrifically slow. And I don't know about Fil-C, but CHERI at least allows capabilities (pointers with bounds) to overlap and subset each other. I.e. you could allocate an arena and get a capability for that, then allocate an object inside that arena and get a smaller capability for that, and then take a pointer to a field in that object and get a capability just for that field.
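To make the subsetting idea concrete, here's a toy model (purely illustrative; real CHERI capabilities are compressed, hardware-checked encodings, nothing like this): a bounds-carrying "pointer" that can only ever be narrowed.

// Toy bounds-carrying capability: it can be narrowed, never widened.
#[derive(Clone, Copy, Debug)]
struct Cap {
    base: usize,
    len: usize,
}

impl Cap {
    // Derive a smaller capability covering [offset, offset + len) of this one;
    // None if the requested range escapes the parent's bounds.
    fn subset(self, offset: usize, len: usize) -> Option<Cap> {
        if offset.checked_add(len)? <= self.len {
            Some(Cap { base: self.base + offset, len })
        } else {
            None
        }
    }
}

// arena capability -> object capability -> single-field capability
let arena = Cap { base: 0x1000, len: 4096 };
let object = arena.subset(128, 64).unwrap();
let field = object.subset(8, 4).unwrap();
assert_eq!((field.base, field.len), (0x1000 + 128 + 8, 4));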
Fil-C has like one "linear memory" per object and each capability gives read/write access to the whole object.
But Fil-C has its own compiler, which does analysis passes to elide bounds checks where they are not needed, and I think it could theoretically do a better job at that than a WASM compiler with multi-memories, because the C source code could contain more information.
Unlike WASM, but like CHERI, every pointer in memory is also tagged, and would lose its pointer status if overwritten by an integer, so it is still more memory-safe in that way.
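A crude way to picture the tagging (just a mental model, not how Fil-C or CHERI actually lay out memory): every word is either a capability or plain data, and storing an integer demotes the slot, so a forged pointer can never be dereferenced later.

// Toy tagged memory word: either a capability or plain integer data.
enum Word {
    Cap { base: usize, len: usize },
    Int(u64),
}

// Overwriting any slot with an integer clears its capability tag.
fn store_int(slot: &mut Word, value: u64) {
    *slot = Word::Int(value);
}

// A load only yields a usable pointer if the tag survived.
fn load_cap(slot: &Word) -> Option<(usize, usize)> {
    match slot {
        Word::Cap { base, len } => Some((*base, *len)),
        Word::Int(_) => None,
    }
}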
One would probably just need to define WASM extensions that allow for such subsetting. Performance will probably be competitive with software implementations of CHERI (perhaps with varying levels of hardware acceleration down the road) which isn't that bad.
There's a need for some portable and composable way to do sandboxing.
Library authors can't configure seccomp themselves, because the allowlist must be coordinated with everything else in the whole process, and there's no established convention for negotiating that.
Seccomp has its own pain points, like being sensitive to libc implementation details and kernel versions/architectures (it's hard to know which syscalls you really need). It can't filter on arguments passed behind pointers; most notably it can't look at any file paths, which is very limiting and needs even more out-of-process setup.
This makes seccomp sandboxing something you add yourself to your application, for your specific deployment environment, not something that's a language built-in or an ecosystem-wide feature.
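For a sense of what "add it yourself" looks like, here is about the smallest possible sketch (Linux only, and it assumes the libc crate; real deployments use SECCOMP_MODE_FILTER with a hand-rolled BPF allowlist instead, which is exactly where the libc/kernel-version sensitivity bites):

// Strict-mode seccomp: after this call only read/write/_exit/sigreturn are
// allowed, and any other syscall kills the process with SIGKILL.
fn enter_strict_seccomp() -> std::io::Result<()> {
    let rc = unsafe {
        libc::prctl(
            libc::PR_SET_SECCOMP,
            libc::SECCOMP_MODE_STRICT as libc::c_ulong,
        )
    };
    if rc != 0 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(())
}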
If a codebase is being maintained and extended, it's not all code with 20 years of testing.
Every change you make could be violating some assumption made elsewhere, maybe even 20 years ago, and subtly break code at a distance. C's type system doesn't carry much information, and is hostile to static analysis, which makes changes in large codebases difficult, laborious, and risky.
Rust is a linter and static analyzer cranked up to the maximum. The whole language has been designed around having the necessary information for static analysis easily available and reliable. Rust is built around disallowing or containing coding patterns that create dead-ends for static analysis. C never had this focus, so even trivial checks devolve into whole-program analysis and quickly hit undecidability (e.g. in Rust, whenever you have a &mut reference, you know for sure that it's valid, non-null, initialized, and that no other code anywhere can mutate it at the same time, and no other thread will even look at it. In C, when you have a pointer to an object, eh, good luck!)
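A tiny example of the kind of invariant the borrow checker hands to analysis for free (nothing exotic, just what &mut means):

// Taking `&mut i32` guarantees the reference is non-null, aligned, points to
// initialized memory, and is not aliased by anything else while it is held.
fn bump(x: &mut i32) {
    *x += 1;
}

let mut n = 0;
let r = &mut n;
bump(r);
// Reading `n` directly here, while `r` is still live below, is rejected at
// compile time; the equivalent aliasing in C is invisible to the compiler.
bump(r);
assert_eq!(n, 2);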
Many of these mistakes weren't even made by any committee, but were stuff shipped in a rush by Netscape or Microsoft to win the browser wars.
There was some (academic) research behind the early CSS concept, but the original vision for it didn't pan out ("cascading" was meant to blend the style preferences of users, browsers, and page authors, but all we got is selector-specificity footguns).
Netscape was planning to release their own imperative styling language, and ended up shipping a buggy CSS hackjob instead.
Once IE was dominant, Microsoft didn't think they had to listen to anybody, so for a while the W3C was writing CSS specs that nobody implemented. It's hard to do user research when nothing works and 90% of CSS devs' work is fighting browser bugs.
there's table-layout:fixed that makes rendering of large tables much faster.
I'd argue that if you have so many rows that the DOM can't handle them, humans won't be able to either. Then you need search, filtering, and data exports, not JS attaching a fake scrollbar to millions of rows.
Yeah, Rust closures that capture data are fat pointers { fn*, data* }, so you need an awkward dance to make them thin pointers for C.
let mut state = 1;
let mut fat_closure = || state += 1;
// Keep the `&mut` to the closure in a named binding so the raw pointer handed
// to C doesn't dangle once this statement's temporaries are dropped.
let mut closure_ref = &mut fat_closure;
let (fnptr, userdata) = make_trampoline(&mut closure_ref);
unsafe {
    fnptr(userdata);
}
assert_eq!(state, 2);
use std::ffi::c_void;

// Monomorphized trampoline with the C ABI: it reinterprets `userdata` as the
// `&mut C` stashed there and calls the closure it refers to.
unsafe extern "C" fn trampoline<C: FnMut()>(userdata: *mut c_void) {
    let closure: *mut &mut C = userdata.cast();
    (unsafe { &mut *closure })()
}

fn make_trampoline<C: FnMut()>(
    closure: &mut &mut C,
) -> (unsafe extern "C" fn(*mut c_void), *mut c_void) {
    (trampoline::<C>, closure as *mut _ as *mut c_void)
}
It requires a userdata arg for the C function, since there's no allocation or executable-stack magic to give a unique function pointer to each data instance. OTOH it's zero-cost: the trampoline is monomorphized per closure type and inlines the closure's code, so there's no extra indirection.
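For completeness, handing the pair over to C then looks something like this (register_callback is a made-up C API, just for illustration):

// Hypothetical C side: void register_callback(void (*cb)(void *), void *userdata);
extern "C" {
    fn register_callback(cb: unsafe extern "C" fn(*mut c_void), userdata: *mut c_void);
}

unsafe {
    // C may stash the pair and call cb(userdata) later, so the closure (and the
    // `&mut` chain to it) has to outlive every callback invocation.
    register_callback(fnptr, userdata);
}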
> Rust closures that capture data are fat pointers { fn, data }
This isn’t fully accurate. In your example, `&mut C` actually has the same layout as usize. It’s not a fat pointer. `C` is a concrete type and essentially just an anonymous struct with FnMut implemented for it.
You’re probably thinking of `&mut dyn FnMut` which is a fat pointer that pairs a pointer to the data with a pointer to a VTable.
So in your specific example, the double indirection is unnecessary.
This is a problem for all capturing closures though, not just Rust's. A pure fn-ptr arg can't have state, and if there's no user data arg then there's no way to make a trampoline. If C++ was calling a C API with the same constraint it would have the same problem.
Well, capturing closures that are implemented like C++ lambdas or Rust closures anyway. The executable stack crimes do make a thin fn-ptr with state.
If Rust had a stable ABI specifying where the data pointer goes in the function arguments (presumably first?), you wouldn't need to do anything, as long as it matched the C code's expected function signature, including the user-context arg.
Unfortunately a lot of existing C APIs won't have the user arg in the place you need it, it's a mix of first, last, and sometimes even middle.
Yes, but my problem wasn't with the user data pointer but with the fact that I needed a STATIC generic lambda. Static because the C library then forks and continues to call the lambda in the new process, but I also do type-based conversions in it.
https://rlbox.dev/