SQLite builds for WASI since 3.41.0 (wasmlabs.dev)
111 points by rapnie on May 24, 2023 | 47 comments


> For legal reasons, we could not contribute directly, but we could discuss the required changes with someone from the SQLite team and then iterate with them by testing or commenting.

Ah, here's the catch! I knew SQLite didn't accept code contributions, so I was surprised they managed to contribute anyway, and then this sentence clarified the situation. So they did not actually contribute the code itself; instead, the SQLite team agreed to write it themselves. I wonder what the SQLite team's motivation was for this cooperative work.


https://www.sqlite.org/copyright.html

To summarize, instead of using one of the OSS licenses, the copyright holders simply declare the source to be in the public domain. In order to preserve that status they don't accept patches unless you submit a signed document agreeing to that.

To make things more complicated, they also use a relatively niche version management system instead of git. Which would complicate making contributions (if they accepted them).

There's a popular fork that fixes all of these issues: https://github.com/libsql/libsql It is MIT licensed, on GitHub, and open for contributions.

Kind of a weird legal situation for a popular project that so many people depend on. Not judging, but it is odd. Seems like a lot of wasted effort between users, would-be contributors, and the people who forked this thing to address all that.


Wait, when did libsql become popular? As I recall, the libsql startup burned some goodwill when they tried to publicly shame SQLite, after forking SQLite and barely changing its code[1]. Looking at the libsql repo today, it looks like the contributors are still on the payroll of that startup.

Beyond that, the insinuation that good OSS must conform to GitHub social norms is silly given SQLite's track record of success. The maintainers are smart to direct their limited time and resources into development and not into community management or clout-chasing.

[1]https://news.ycombinator.com/item?id=33099222 and https://news.ycombinator.com/item?id=33081159


> To make things more complicated, they also use a relatively niche version management system instead of git. Which would complicate making contributions (if they accepted them).

The Fossil VCS actually has a page explaining why it was created, instead of just using Git: https://sqlite.org/whynotgit.html

Honestly, a lot of those points make sense, especially how Git is perhaps a little bit more complex and tricky to wrap one's head around than it should be, making you think more about the VCS than about what you want to do. To that end, I'd actually suggest that people have a brief look at Fossil. It even comes with a built-in web interface and some common functionality out of the box (vs most folks needing to set up Gitea/GitLab or use GitHub/Bitbucket etc. separately): https://fossil-scm.org/home/doc/trunk/www/index.wiki

Of course, realistically, Fossil will be dead in the water for most, given that it's still niche and won't have integrations with any graphical software that some might want to use (e.g. SourceTree, GitKraken, Git Cola exist, but I'm not aware of a rich ecosystem of solutions for Fossil, something like Fuel seems dead https://fuel-scm.org/fossil/brlist), or even with any CI/CD server solutions.


> they also use a relatively niche version management system instead of git

They use Fossil for source control: https://www2.fossil-scm.org/home/doc/trunk/www/index.wiki

It is a git-like system, but instead of storing objects on the filesystem, it uses a SQLite database.

Has some interesting features:

- Project Management

- Built-in Web Interface

- Single executable


Aren't all those potential limitations, in fact their strengths? They are eating their own dogfood, which works for them. That sounds great to me.


From an administrative stance, I’m sure any project would happily consider a patch if you emailed it to them.

For local work, you can easily dump a codebase into your favourite version control system, work on it, then generate the final patch using your VCS’s tooling. Iterating towards a working change is as easy (or as hard) as exchanging patches between the feature developer and the committer of the upstream project.

It’s a bit old school, but it’s a good workflow to become comfortable with. I regularly patch my own private projects from my work account by emailing patches to my personal address. Once you compress and base64 a change it’s also humbling how little content goes into a piece of work!

(This isn’t a rebuttal to the parent post — more a call to arms that patch juggling isn’t as awful as others might think it is.)
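To give a feel for the "compress and base64" step, here is a minimal Python sketch. The file name and contents are made up for illustration, and `difflib` only stands in for whatever patch tooling your own VCS provides.

```python
import base64
import difflib
import zlib


def make_patch(old: str, new: str, name: str = "example.c") -> str:
    """Build a unified diff between two versions of a file (the name is hypothetical)."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{name}",
        tofile=f"b/{name}",
    )
    return "".join(diff)


def encode_patch(patch: str) -> str:
    """Compress the patch, then base64-encode it so it can be pasted into an email."""
    return base64.b64encode(zlib.compress(patch.encode("utf-8"))).decode("ascii")


def decode_patch(blob: str) -> str:
    """Reverse encode_patch on the receiving side."""
    return zlib.decompress(base64.b64decode(blob)).decode("utf-8")


# Two versions of a tiny, made-up source file.
old = "int answer(void) {\n    return 41;\n}\n"
new = "int answer(void) {\n    return 42;\n}\n"

patch = make_patch(old, new)
blob = encode_patch(patch)

print(patch)
print(len(blob), "base64 characters")
```

The receiver runs `decode_patch` on the blob and applies the result with their usual patch tool.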


> To summarize, instead of using one of the OSS licenses, the copyright holders simply declare the source to be in the public domain. In order to preserve that status they don't accept patches unless you submit a signed document agreeing to that.

Yep, see also this discussion about it in this thread https://news.ycombinator.com/item?id=36054521#36055014


What would you use this for? I don't get the whole picture and the article doesn't talk about it.


(Wasm Labs team member here) SQLite is a pretty popular database and it's a critical dependency for many different applications. By compiling it to Wasm32-wasi, you can add it to any WebAssembly module.

This enables a new set of possibilities for Wasm and SQLite. For example, now you can run a full WordPress application in the browser [1][2] / server [3] using the same Wasm module. Note that for the browser these projects use Emscripten [4], but in the future the same Wasm32-wasi module will work. A teammate gave a lightning talk about SQLite and WASI at KubeCon EU [5].

In general, any environment that includes a Wasm runtime can potentially run applications that use SQLite under the hood. Before, it wasn't possible.

- [1] https://wordpress.wasmlabs.dev/

- [2] https://developer.wordpress.org/playground/demo/

- [3] https://wasmlabs.dev/articles/running-wordpress-with-mod-was...

- [4] https://emscripten.org/

- [5] https://www.youtube.com/watch?v=E7tWtgf9V2s


> these projects use Emscripten, but in the future the same Wasm32-wasi module will work

I'm curious to know more about this. Does it mean that when browsers support WASI, these projects will not need to use Emscripten? Or maybe use it for compiling to/with WASI but without the need for a runtime anymore.

Is there anywhere I can keep track of when/how this might happen?


Those are great questions! I believe Emscripten will still be required for some cases, as it provides more features for targeting a web browser. If WASI is the only requirement for a Wasm module, then there are two possible solutions:

- Use a library that provides the WASI bindings in a browser environment: there are some OSS projects that provide WASI bindings on top of browser technologies. For example, workers-wasi from Cloudflare [1]. It could even be another Wasm module that provides the implementation for the main one. I know the people from Loophole Labs are experimenting with virtual filesystems (VFS) [2].

- Browsers provide a WASI implementation: server-oriented runtimes like NodeJS are already providing these bindings (under an experimental flag). I shouldn't have stated that as a fact, as browsers may provide it or not. However, I saw in the past the Google Chrome team experimenting with WASI and the browser FileSystem API [3]. So, I think it may happen :)

- [1] https://github.com/cloudflare/workers-wasi

- [2] https://www.youtube.com/watch?v=46jZSXVxYPw

- [3] https://github.com/GoogleChromeLabs/wasi-fs-access


I appreciate the response, thank you! I've really enjoyed following the explorations of Wasm Labs.


I guess that's one way to prevent malicious plugins from destroying your site.


Yes, exactly. We wrote an article here regarding PHP on Wasm that covers that: https://wasmlabs.dev/articles/mitigating-php-vulnerabilities...


It does what WASM is meant to do: it allows extra functionality, such as a database, without exposing more surface area for exploits.


They can still destroy your site, they just can't destroy anything else on the same server.


The goal of WASI is to add a standardized system API to WebAssembly, allowing you to access underlying systems (filesystem, networking, etc) in a homomorphic and secure way.

In practice, this means that you can compile your Rust/C++/etc code to WASM + WASI, and that code will run anywhere you have a WebAssembly VM. So the same code that accesses the "filesystem" in a browser context can also access the filesystem in a desktop/mobile/edge/etc environment.

You could also think of it as an attempt to turn WebAssembly into a fully-fledged JVM alternative. Compile once, run anywhere (including browser).

For this specifically, it means that SQLite can integrate with WASI APIs (instead of only Web APIs as before) so that an SQLite Wasm build can run in other contexts besides browsers.


> homomorphic and secure way

A lot of the sandboxing and security focus in Wasm makes sense. But the "capability-based security" mentioned in the WASI spec [0] confuses me. For instance, looking at what Cosmonic (an early adopter in the Wasm field) documents as capabilities [1] (e.g. 'messaging', 'http client', 'key-value store'), it seems wholly different from the concepts of the E programming language [2] by Mark S. Miller, which are finding their way into Cap'n Proto [3] and early work on the OCapN [4] unification effort. Is there any relationship to these, or are we looking at totally different approaches? Or is Cosmonic overloading the terminology of "capability"?

[0] https://github.com/WebAssembly/WASI#capability-based-securit...

[1] https://cosmonic.com/docs/category/capabilities

[2] https://en.wikipedia.org/wiki/E_(programming_language)

[3] https://capnproto.org

[4] https://github.com/ocapn/ocapn


On a brief look, I agree that your link [1] does not appear to be thinking about capabilities correctly. It seems to make the classic mistake of orienting around verbs (actions you can do) instead of nouns (specific resources you can operate on). That said I only took a brief look and could be misunderstanding something.

However, my understanding is that WASI itself is actually capability-based, and I think your link [0] is thinking about capabilities in the right way. In fact I'm pleased to see "Interposition" seems to have been added here, IIRC a few years ago that was missing and I feel it's an essential piece of capability-based security.

As an example, in WASI, there is no singleton filesystem, instead the application receives a set of file descriptors for specific directories which grant access only to those directories and their children, not their parents. Though last I looked, the libc wanted to reconstruct those into a single virtual filesystem, assigning each one a mount point and matching paths against those, in the name of compatibility, which felt unfortunate to me.

As always, a lot of people struggle to "get" capabilities, even when they are building on top of a platform designed around the idea. :/
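To make the "nouns, not verbs" point concrete, here is a rough Python sketch of a directory handle that only reaches its own children. It is not the WASI API: `DirCapability` and its methods are hypothetical names, the check is done by ordinary path resolution in library code (a real runtime enforces this below the application), and it assumes a POSIX-style filesystem.

```python
import os
import tempfile


class DirCapability:
    """Hypothetical stand-in for a preopened directory handle."""

    def __init__(self, root: str) -> None:
        self._root = os.path.realpath(root)

    def _resolve(self, relative: str) -> str:
        # Resolve symlinks and "..", then refuse anything outside the granted root.
        target = os.path.realpath(os.path.join(self._root, relative))
        if os.path.commonpath([self._root, target]) != self._root:
            raise PermissionError(f"{relative!r} escapes the granted directory")
        return target

    def read_text(self, relative: str) -> str:
        with open(self._resolve(relative), encoding="utf-8") as f:
            return f.read()

    def subdir(self, relative: str) -> "DirCapability":
        """Hand out a narrower capability for a child directory."""
        return DirCapability(self._resolve(relative))


with tempfile.TemporaryDirectory() as base:
    os.makedirs(os.path.join(base, "site", "uploads"))
    with open(os.path.join(base, "site", "index.txt"), "w", encoding="utf-8") as f:
        f.write("hello")
    with open(os.path.join(base, "secret.txt"), "w", encoding="utf-8") as f:
        f.write("do not leak")

    site = DirCapability(os.path.join(base, "site"))
    content = site.read_text("index.txt")  # inside the grant: allowed

    try:
        site.read_text("../secret.txt")  # parent of the grant: refused
        parent_reachable = True
    except PermissionError:
        parent_reachable = False

    uploads = site.subdir("uploads")  # narrower grant derived from the first
    try:
        uploads.read_text("../index.txt")
        sibling_reachable = True
    except PermissionError:
        sibling_reachable = False

print(content, parent_reachable, sibling_reachable)
```

The code holding `uploads` never sees a global filesystem. It can only name things relative to the specific directory it was handed.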


Well, I think it is a different approach. But in another submission I posted, "WebAssembly for the Server Side: A New Way to Nginx" [0], sponsored by Nginx, they mention things feeding the confusion again:

> Perhaps most importantly, role-based access control and attribute-based access control, and other authorization and access control technologies, can introduce complex external systems that must be synchronized with the plugin as well as the underlying server-side technology. In contrast, Wasm access control capabilities are often built directly into the runtime engines, reducing the complexities and simplifying the development process.

[0] https://news.ycombinator.com/item?id=36057066


What I don't get specifically is what's being implemented here with SQLite. Assuming WASI is an interface an application can call into for things like filesystem access and sockets, does this add SQLite as an interface (analogous to sqlite.h) next to these as part of WASI? So SQLite will be "under" the interface, i.e. part of a standard runtime?

How much "batteries included" is this runtime (planned to be)? If I understood this correctly, then there would be a tradeoff with having more batteries included which would imply more functionality for applications but make porting to new targets more work.


The main work behind this patch is to ensure that SQLite can compile to Wasm32-wasi. WASI doesn't offer all syscalls that SQLite may require, so it may require some conditional definitions and code to ensure it works properly. It doesn't mean that SQLite is part of WASI, but you can embed SQLite in a Wasm32-wasi module.

In the future, the WASI standard will include more and more features. This will allow SQLite to enable more internal features that are currently skipped due to the limitations on WASI.


Ah, so it's SQLite running on top of WASI, as a user (application, or rather library). Thanks for clearing that up for me.


Exactly! :D


So far the official WASM support from the SQLite project has only targeted WASM in the browser with a JS API. If I follow correctly this is about working with the SQLite team to upstream WASI support, which will eventually enable official support for it, and hence WASM on the server/cloud/edge and other WASI-compliant runners.


> If I follow correctly this is about working with the SQLite team to upstream WASI support, which will eventually enable official support for it, and hence WASM on the server/cloud/edge and other WASI-compliant runners.

(sqlite team's Wasm/JS guy here)

That's the long and the short of it. We currently _actively_ target browsers only because our team has exceedingly limited bandwidth and has to be choosy with regards to where our dev time goes. Long-term, we intend to target all wasm platforms, but getting there is an ongoing process.


Correct!


The file of my desktop app is a SQLite db in disguise, I could use this to open these files in a web version of my app, I suppose. I know there is a javascript implementation of SQLite out there but maybe this has better performance.
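For anyone wondering what "a SQLite db in disguise" looks like in practice: a SQLite 3 database file starts with a fixed 16-byte header string, whatever extension the app gives it. A small Python sketch, where the `.myapp` extension and the table are invented for the example:

```python
import os
import sqlite3
import tempfile

# Every SQLite 3 database file begins with these 16 bytes.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path: str) -> bool:
    """Return True if the file starts with the SQLite 3 header string."""
    with open(path, "rb") as f:
        return f.read(16) == SQLITE_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    # A made-up "document" format that is really just a SQLite database.
    doc = os.path.join(tmp, "document.myapp")
    conn = sqlite3.connect(doc)
    conn.execute("CREATE TABLE notes (body TEXT)")
    conn.commit()
    conn.close()

    # An ordinary text file for comparison.
    other = os.path.join(tmp, "plain.txt")
    with open(other, "w", encoding="utf-8") as f:
        f.write("just text")

    is_db = looks_like_sqlite(doc)
    is_other_db = looks_like_sqlite(other)

print(is_db, is_other_db)
```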


Well, you can use it to access SQLite files in WASM runners, of course. So you get a portable, platform-independent piece of code that can run anywhere.

Although it was a bit of a surprise to see that they started out with the toy CLI tool that Nicholas and I hacked on for a weekend (I wanted to be able to run SQLite commands in a-Shell, which is an excellent iPad CLI environment).


Apparently they use it for their tests.

> We wanted to contribute upstream our patches for SQLite, as we were heavily relying on libsqlite for our tests of PHP.wasm with WordPress and Drupal.


I think this should come with the disclaimer that this completely eschews synchronization mechanisms used by SQLite.

Using this to read from a DB concurrently with other writers will produce garbage results. Using this to write to a DB concurrently with other writers will corrupt your DB.

WASI does not expose file locking primitives, and the unix VFS simply stubs them out in WASI builds: https://github.com/sqlite/sqlite/blob/4e8e33ba84e253878797d7...
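For contrast, this is roughly what that coordination looks like when locking works. The Python sketch below runs against an ordinary native SQLite build, not a WASI one: a second writer is refused while the first holds the write lock. The table and file name are arbitrary. On a build where the locking calls are stubbed out, nothing would produce this refusal.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")

    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    # timeout=0: fail immediately instead of waiting for the lock.
    writer_a = sqlite3.connect(path, isolation_level=None, timeout=0)
    writer_b = sqlite3.connect(path, isolation_level=None, timeout=0)

    writer_a.execute("CREATE TABLE t (x INTEGER)")
    writer_a.execute("BEGIN IMMEDIATE")  # first writer takes the write lock
    writer_a.execute("INSERT INTO t VALUES (1)")

    try:
        writer_b.execute("BEGIN IMMEDIATE")  # second writer asks for the same lock
        second_writer_blocked = False
    except sqlite3.OperationalError as exc:
        second_writer_blocked = "locked" in str(exc)

    writer_a.execute("COMMIT")  # lock released

    writer_b.execute("BEGIN IMMEDIATE")  # now the second writer can proceed
    writer_b.execute("INSERT INTO t VALUES (2)")
    writer_b.execute("COMMIT")
    rows = writer_b.execute("SELECT x FROM t ORDER BY x").fetchall()

    writer_a.close()
    writer_b.close()

print(second_writer_blocked, rows)
```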


> For legal reasons, we could not contribute directly (their patches)

What were the legal reasons?


Most likely what is documented at https://sqlite.org/copyright.html

> SQLite is open-source, meaning that you can make as many copies of it as you want and do whatever you want with those copies, without limitation. But SQLite is not open-contribution. In order to keep SQLite in the public domain and ensure that the code does not become contaminated with proprietary or licensed content, the project does not accept patches from people who have not submitted an affidavit dedicating their contribution into the public domain.

> All of the code in SQLite is original, having been written specifically for use by SQLite. No code has been copied from unknown sources on the internet.

They want to be freaking sure they are the original authors / they can release the code in the public domain.

Public domain is tricky and does not exist in the same way everywhere. Where I live, you just can't decide to put something in the public domain, you need to use something like CC0 [1]

[1] https://creativecommons.org/share-your-work/public-domain/cc...


This is the one case where NIH syndrome is warranted: Entangled legal issues & legal dependencies are no fun for anyone.


Why can they not just license the thing under CC0? Won't someone contributing implicitly understand that their contribution will also be licensed under CC0? This feels like it's more complicated than it needs to be.


Well for starters, SQLite is almost a decade older than CC0...


So? No reason they can't use it now.


Looks like this: https://www.sqlite.org/copyright.html

Kind of weird, couldn't they use a CLA?


It's not enough to guarantee the code has not been copied from somewhere / can be released in the public domain. The only way to have this guarantee is to write it yourself.


I suppose that's true, but that's no different from any other project, no? If I write a contribution to React that wasn't actually mine, it doesn't matter that it's just open source and not being released into the public domain, it's still a violation of the real author's copyright.

So if they've chosen a different risk model than essentially every other company doing open source software then that's up to them, but I'm not sure the need for it follows from it being public domain?


> I'm not sure the need for it follows from it being public domain?

There's something specific about the public domain: in many places, you can't just decide to release something into the public domain. I guess that's why they actually accept contributions but require an affidavit dedicating the contribution to the public domain. IIUC an affidavit specifically needs to be witnessed by a taker of oaths, which probably ensures that the work can indeed be put in the public domain.

Not sure it's the whole story, or even the main reason, or the correct analysis, though.


Ah, it sounds like there are a variety of laws throughout the world. It does sound like the US recently moved to allow digital signatures for this and related purposes though. See https://scarincihollenbeck.com/law-firm-insights/copyright-o...


> It's not enough to guarantee the code has not been copied from somewhere / can be released in the public domain.

FWIW, within the sqlite project we have a very strict policy against copy/paste of anything from outside sources. If we don't write it ourselves, it doesn't get added to the canonical source tree, no exceptions.


A CLA only covers the things you wrote, and while you ensure in the CLA that you only contribute things you wrote and are allowed to license to them, it is possible that you lie, so there is the danger (although a small one), that contributors might upload proprietary code written by others. IANAL


This is awesome. What I would really love to see is pandoc compiled to wasm and usable in a browser context. I tried before but since I'm unfamiliar with Haskell I didn't get very far.


I didn't know about pandoc. Really useful!



