Hacker Newsnew | past | comments | ask | show | jobs | submit | pornel's commentslogin

There's a hybrid approach of C -> WASM -> C compilation, which ends up controlling every OS interaction and sandboxing memory access like WASM, while technically remaining C code:

https://rlbox.dev/


WASM sandboxes don't do much to guarantee the soundness of your program. It can hose your memory all it wants, it can just only do so within the confines of the sandbox.

Using a sandbox also limits what you can do with a system. With stuff like SECCOMP you have to methodically define policies for all its interactions. Like you're dealing with two systems. It's very bureaucratic and the reason we do it, is because we don't trust our programs to behave.

With Fil-C you get a different approach. The language and runtime offer a stronger level of assurance your program can only behave, so you can trust it more to have unfettered access to the actual system. You also have the choice to use Fil-C with a sandbox like SECCOMP as described in the blog post, since your Fil-C binaries are just normal executables that can access powerful Linux APIs like prctl. It took Linux twenty years to invent that interface, so you'll probably have to wait ten years to get something comparable from WASI.


> It can hose your memory all it wants, it can just only do so within the confines of the sandbox.

True, although as I understand it the WASI component model at least allows multiple fine-grained sandboxes, so it's somewhere in-between per-object capabilities and one big sandbox for your entire program. I haven't actually used it yet so I might be wrong about that.

> so you'll probably have to wait ten years to get something comparable from WASI

I think for many WASI use cases the capability control would be done by the host program itself, so you don't need OS-level support for it. E.g. with Wasmtime I do

  WasiCtxBuilder::new()
        .allow_tcp(false)
        .allow_udp(false)
        .allow_ip_name_lookup(false)
But yeah a standard WASI program can't itself decide to give up capabilities.

WASI is basically CORBA, and DCOM, PDO for newer generations.

Or if you prefer the bytecode based evolution of them, RMI and .NET Remoting.

I don't see it going that far.

The WebAssembly development experience on the browser mostly still sucks, especially the debugging part, and on the server it is another yet another bytecode.

Finally, there is hardly any benefit over OS processes, talking over JSON-RPC (aka how REST gets mostly used), GraphQL, gRPC, or plain traditional OS IPC.


You've named half of the weasel security technologies of the last three decades. The nice thing about SECCOMP BPF is it's so difficult to use that you get the comfort of knowing that only a very enlightened person could have written your security policy. Hell is being subjected to restrictions defined by people with less imagination than you.

> hardly any benefit over OS processes, talking over JSON-RPC

Hardly any benefit except portability and sandboxing, the main reasons WASM exists?


WASM sacrifices guest security & performance in order to provide mediocre host sandboxing, though. It might be a useful tradeoff sometimes, but proper process-based sandboxing is so much stronger and lets the guest also have full security & performance.

How is process-based sandboxing stronger? Also the performance penalty is not only due to sandboxing (I doubt it's even mostly due to it). Likely more significant is the portability.

> How is process-based sandboxing stronger?

Because the guarantees themselves are stronger, process isolation is something we have decades of experience with, it goes wrong every now and then but those are rare instances whereas what amounts to application level isolation is much weaker in terms the guarantees it that it provides and the maturity level of the code. That suggests that if you base your isolation scheme on processes rather than 'just' sandboxing that you will come out ahead and even with all other things the same you'd have one more layer in your stack of swiss cheese slices. A VM would offer another layer of protection on top of that, one with yet stronger guarantees.


WASM is only portable if the only thing it does is heating up CPU, given that everything else depends on the host.

No because the host can present the same interface on every platform. I do think that WASI is waaay to much "let's just copy POSIX and screw other platforms", but it does work pretty well even on Windows.

That's a sandboxing technology but not a memory safety technology.

You can totally achieve weird execution inside the rlbox.


Running ffmpeg compiled for wasm and watching as most codec selections lead to runtime crashes due to invalid memory accesses is fun. But, yeah, it’s runtime safety, so going to wasm as a middle step doesn’t do much.

> Running ffmpeg compiled for wasm and watching as most codec selections lead to runtime crashes due to invalid memory accesses is fun.

For all you know that’s a bug in the wasm port of the codec.

> it’s runtime safety

So is Fil-C

The problem with wasm is that an OOBA in one C allocation in the wasm guest can still give the attacker the power to clobber any memory in the guest. All that’s protected is the host. That’s enough to achieve weird execution.

Hence why I say that wasm is a sandbox. It’s not memory safety.


I’m not disagreeing with anything you said about wasm?

Finally reality is catching up with the WASM sales pitch against other bytecode formats introduced since 1958, regarding security and how great it is over anything else.

Warm was great because it was lightweight and easy to target from any language and create any custom interaction API with the host. That's becoming less true as they bolt on features no one needed (GC) and popularize standardized interfaces that contain the kitchen sink (WASI) but these things can still be treated as optional so it can still be used for much more flexible use cases than java or .net

> features no one needed (GC)

WasmGC is absolutely necessary for languages like Dart, Kotlin, and Java, which are all using it right now successfully.

But I get that if you're compiling C or Rust then it might seem like it isn't useful.


There’s no guarantee the toolchains will support WASM “preview” forever and make the bloat optional, and even if they do you could still end up in an ecosystem where it would be unviable. At some point you’re probably better off just compiling to RISCV and using an emulator library instead.

Fortunately core wasm is simple enough for a single person to implement an interpreter or even compiler for.

Even if the major engines continue to pile on complexity we have a pretty good escape hatch I think.


Since 1958 (UNCOL) there have been more options than only Java or CLR MSIL.

Wasm now supports multiple modules and multiple linear memories per module, so it ought to be quite possible to compile C to Wasm in a way that enforces C's object access rules, much like CHERI if perhaps not Fil-C itself.

The multiple linear memory is supported in wasi preview 3? I thought it was not supported as of preview 2.

Some WebAssembly runtimes now do support those parts of the specification.

You wouldn't be able to get quite as fine-grained. One memory per object is probably horrifically slow. And I don't know about Fil-C, but CHERI at least allows capabilities (pointers with bounds) to overlap and subset each other. I.e. you could allocate an arena and get a capability for that, and then allocate an object inside that arena and get a smaller capability for that, and then get a pointer to a field in that object and get capability just for that field.

Fil-C has like one "linear memory" per object and each capability gives read/write access to the whole object.

But Fil-C has its compiler which does analysis passes for eliding bounds-checks where they are not needed, and I think it could theoretically do a better job at that than a WASM compiler with multi-memories, because C source code could contain more information. Unlike WASM, but like CHERI, every pointer in memory is also tagged, and would lose its pointer status if overwritten by an integer, so it is still more memory-safe in that way.


It has a separate address space for each object? That seems unlikely. Is it not pretty much a software implementation of CHERI?

One would probably just need to define WASM extensions that allow for such subsetting. Performance will probably be competitive with software implementations of CHERI (perhaps with varying levels of hardware acceleration down the road) which isn't that bad.

There's a need for some portable and composable way to do sandboxing.

Library authors you can't configure seccomp themselves, because the allowlist must be coordinated with everything else in the whole process, and there's no established convention for negotiating that.

Seccomp has its own pain points, like being sensitive to libc implementation details and kernel versions/architectures (it's hard to know what syscalls you really need). It can't filter by inputs behind pointers, most notably can't look at any file paths, which is very limiting and needs even more out-of-process setup.

This makes seccomp sandboxing something you add yourself to your application, for your specific deployment environment, not something that's a language built-in or an ecosystem-wide feature.


To be properly useful as a sandbox, it would be nice to have a tool that would run another process/executable in a sandboxed environment.

Basically a tool that would allow to run flatpaks/AppImages/ etc...

Maybe firejail already does all that can be done without using a VM.


If a codebase is being maintained and extended, it's not all code with 20 years of testing.

Every change you make could be violating a some assumption made elsewhere, maybe even 20 years ago, and subtly break code at distance. C's type system doesn't carry much information, and is hostile to static analysis, which makes changes in large codebases difficult, laborious, and risky.

Rust is a linter and static analyzer cranked up to maximum. The whole language has been designed around having necessary information for static analysis easily available and reliable. Rust is built around disallowing or containing coding patterns that create dead-ends for static analysis. C never had this focus, so even trivial checks devolve into whole-program analysis and quickly hit undecidability (e.g. in Rust whenever you have &mut reference, you know for sure that it's valid, non-null, initialized, and that no other code anywhere can mutate it at the same time, and no other thread will even look at it. In C when you have a pointer to an object, eh, good luck!)


Many of these mistakes weren't even made by any committee, but were stuff shipped in a rush by Netscape or Microsoft to win the browser wars.

There was some (academic) reaserch behind early CSS concept, but the original vision for it didn't pan out ("cascading" was meant to blend style preferences of users, browsers and page authors, but all we got is selector specificity footguns).

Netscape was planning to release their own imperative styling language, and ended up shipping a buggy CSS hackjob instead.

Once IE was dominant, Microsoft didn't think they have to listen to anybody, so for a while W3C was writing CSS specs that nobody implemented. It's hard to do user research when nothing works and 90% of CSS devs' work is fighting browser bugs.


They are wrong, and didn't get the point of separating semantics and presentation.

there's table-layout:fixed that makes rendering of large tables much faster.

I'd argue that if you have so many rows that DOM can't handle, humans won't either. Then you need search, filtering, data exports, not JS attaching a faked scrollbar to millions of rows.


Yeah, Rust closures that capture data are fat pointers { fn*, data* }, so you need an awkward dance to make them thin pointers for C.

    let mut state = 1;
    let mut fat_closure = || state += 1;
    let (fnptr, userdata) = make_trampoline(&mut &mut fat_closure);

    unsafe {
        fnptr(userdata);
    }

    assert_eq!(state, 2);

    use std::ffi::c_void;
    fn make_trampoline<C: FnMut()>(closure: &mut &mut C) -> (unsafe fn(*mut c_void), *mut c_void) {
        let fnptr = |userdata: *mut c_void| {
            let closure: *mut &mut C = userdata.cast();
            (unsafe { &mut *closure })()
        };
        (fnptr, closure as *mut _ as *mut c_void)
    }
    
It requires a userdata arg for the C function, since there's no allocation or executable-stack magic to give a unique function pointer to each data instance. OTOH it's zero-cost. The generic make_trampoline inlines code of the closure, so there's no extra indirection.

> Rust closures that capture data are fat pointers { fn, data }

This isn’t fully accurate. In your example, `&mut C` actually has the same layout as usize. It’s not a fat pointer. `C` is a concrete type and essentially just an anonymous struct with FnMut implemented for it.

You’re probably thinking of `&mut dyn FnMut` which is a fat pointer that pairs a pointer to the data with a pointer to a VTable.

So in your specific example, the double indirection is unnecessary.

The following passes miri: https://play.rust-lang.org/?version=nightly&mode=debug&editi...

(did this on mobile, so please excuse any messiness).


This is a problem for all capturing closures though, not just Rust's. A pure fn-ptr arg can't have state, and if there's no user data arg then there's no way to make a trampoline. If C++ was calling a C API with the same constraint it would have the same problem.

Well, capturing closures that are implemented like C++ lambdas or Rust closures anyway. The executable stack crimes do make a thin fn-ptr with state.


If Rust has a stable ABI on where the data* is in the function arguments (presumably first?), you don't need to do anything if it matches the C code's expected function signature including the user context arg.

Unfortunately a lot of existing C APIs won't have the user arg in the place you need it, it's a mix of first, last, and sometimes even middle.


I know about this technique but it uses too much unsafe for my taste. Not that it's bad or anything, just a personal preference.

It can be done in 100% safe code as far as Rust is concerned (if you use `dyn Fn` type instead of c_void).

The only unsafe here is to demonstrate it works with C/C++ FFI (where void* userdata is actually not type safe)


Yes but my problem wasn’t with the user data pointer but the fact that I needed a STATIC generic lambda. Static because the C library then forks and continues to call the lambda in the new process but I also type based conversions in it.

There's Rust for Dreamcast (https://dreamcast.rs) via Rust's GCC backend.

If you create wrappers that provide additional type information, you do get extra safety and nicer interfaces to work with.

Apple is extending Swift specifically for kernel development.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: