For example, say you want to keep a set of objects -- then the objects themselves would be the keys, and the values would be true/nil. There is also a good example in one of ldump's recent issues: https://github.com/girvel/ldump/issues/44, where the loaded packages are stored in a table as keys to easily detect external module usage.
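A minimal sketch of the set idiom described above, in plain Lua (the object names are made up for illustration):

```lua
-- A set of objects: the objects are the keys, the values are just `true`.
local a, b, c = {name = "a"}, {name = "b"}, {name = "c"}

local set = {}
set[a] = true
set[b] = true

-- Membership is a plain table lookup; a missing key gives nil.
local has_a = set[a] == true
local has_c = set[c] == true

-- Removing an element means assigning nil to its key.
set[a] = nil
```

Tables are compared by identity when used as keys, so two different tables with the same contents count as two different set members.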
It is intended to be used in cases where you need to store data on disk or transfer it to another machine -- like in a video game save or a network data exchange.
On my machine it produces an equivalent string, although differently formatted. It seems that ldump preserves all special characters (`"\a\b\f\n\r\t\v\\\"\'"`), although I will need to test it on all supported versions.
Ah, you know what, you're right. It's an equivalent string for me too:
"hi\
"
I didn't know Lua treated \ before newlines like that. That's cool! I made a similar Lua serialization library for myself and was using a chain of `string.match` calls to escape my strings. Now I can make it way simpler. Lol. Thanks
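For anyone else surprised by this, here is a small sketch in plain Lua (this is not ldump's code; `load_text` is just a local alias introduced here to cover both 5.1 and 5.2+):

```lua
-- A backslash followed by a real line break inside a quoted string
-- is read as a newline character.
local a = "hi\
"
local b = "hi\n"
assert(a == b)
assert(#a == 3)

-- string.format("%q", ...) writes newlines in the same backslash-newline
-- form, and the result can be read back by the Lua parser.
local load_text = loadstring or load  -- loadstring on 5.1/LuaJIT, load on 5.2+
local quoted = string.format("%q", "hi\n")
local restored = load_text("return " .. quoted)()
assert(restored == "hi\n")
```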
Thank you, it is really nice to hear. Though, I have to give credit to Lua's standard library -- the basic function serialization (without upvalues) is implemented there as `string.dump`.
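A minimal sketch of what `string.dump` gives you on its own, assuming a function without upvalues (this shows the standard library call only, not how ldump wraps it):

```lua
-- loadstring on 5.1/LuaJIT, load on 5.2+; both accept a bytecode string here.
local load_chunk = loadstring or load

local function add(x, y)
  return x + y
end

-- string.dump returns the function's bytecode as a string.
local bytecode = string.dump(add)

-- Loading the bytecode produces a new function with the same behaviour.
local restored = load_chunk(bytecode)
local sum = restored(2, 3)
```

The restored function is a different object from `add`, and the bytecode string is only meaningful to a compatible interpreter.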
Be aware that you're gonna have a bad time in scenarios where code is serialized using one Lua version and deserialized using another. Bytecode compatibility is not guaranteed between different versions of Lua(JIT).
I've shipped Love2D games as bytecode that wouldn't run on many Linux boxes because their LuaJIT installation (which is not part of Love2D but part of the system) was too old, or they stopped working after the user updated their system. There's a plethora of situations where something like that can happen.
I'm also wary of the "upvalues are preserved" feature, which sounds like a huge footgun, but I haven't looked into the details of your implementation.
This is an interesting thought. Currently, it is unsafe and intended to load only the files you trust. I should definitely include a warning in the README.
Overall, it would be nice to make it safer. I don't think switching to a non-Lua format would make it safer, because it is intended to serialize functions too, which can contain arbitrary code even if everything else is stored as data. Maybe it is possible to make a function like `ldump.safe_load` that restricts `load`'s environment, so it wouldn't have access to the debug/os/io modules.
You could take a look at SELÖVE, a (severely out of date) fork of LÖVE that is intended to make it safe to run arbitrary .love games. (It used to be on bitbucket, but it looks like it's gone? I'm not sure if I have the repo locally :/)
Running arbitrary code was such a problem that I just completely ruled it out for bitser. Instead of serializing functions, you can register safe functions as resources. This doesn't solve the upvalue problem, though.
I looked into it, and Lua allows limiting the environment when `load`ing -- through the `env` argument since 5.2, or through `setfenv` before that. I will add a helper function that produces a minimal environment for safe loading, and a documentation page about safety.
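Roughly, the two mechanisms mentioned above could be wrapped like this. This is only a sketch: `load_in_env` is a hypothetical helper name, not ldump's API, and the environment contents are an arbitrary choice for the example.

```lua
-- Hypothetical helper: compile a source string so that its global
-- lookups go to `env` instead of the real global table.
local function load_in_env(source, env)
  if setfenv then
    -- Lua 5.1 / LuaJIT: compile first, then swap the environment.
    local chunk, err = loadstring(source, "=sandboxed")
    if not chunk then return nil, err end
    return setfenv(chunk, env)
  end
  -- Lua 5.2+: the fourth argument of load is the environment;
  -- mode "t" accepts text chunks only.
  return load(source, "=sandboxed", "t", env)
end

-- An environment without os/io/debug/load.
local env = {tostring = tostring}

local probe = load_in_env("return os", env)
local os_seen = probe()          -- the chunk cannot see the os library

local calc = load_in_env("return tostring(1 + 1)", env)
local result = calc()
```

This is not a complete sandbox: string methods stay reachable through the string metatable, and nothing here limits CPU or memory use.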
Note that loading (maliciously crafted) bytecode is generally not safe in Lua; sandboxing can be escaped in more ways than what's possible when loading plaintext source code, and there are no full mitigations for this currently as far as I know (and they would probably be highly interpreter/version sensitive anyway) -- the only "real" mitigation strategy is to just not `load` bytecode at all.
But this is probably a non-issue for a lot of use cases.
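A sketch of the "just don't load bytecode" approach, assuming the input arrives as a string. `load_text_only` is a hypothetical name; the check relies on precompiled chunks (both PUC Lua and LuaJIT) starting with the escape byte:

```lua
-- Hypothetical helper: refuse anything that looks like precompiled bytecode.
local function load_text_only(source, chunkname)
  if type(source) ~= "string" then
    return nil, "expected a string"
  end
  -- Binary chunks start with the escape character ("\27").
  if source:sub(1, 1) == "\27" then
    return nil, "refusing to load precompiled bytecode"
  end
  if loadstring then
    return loadstring(source, chunkname)  -- Lua 5.1 / LuaJIT
  end
  return load(source, chunkname, "t")     -- Lua 5.2+: text chunks only
end

local dumped = string.dump(function() return 1 end)
local rejected, reason = load_text_only(dumped, "=dumped")

local accepted = load_text_only("return 41 + 1", "=text")
local answer = accepted()
```

Note that this only guards the outermost chunk. If the loaded code is given access to `load` or `loadstring`, it can still load bytecode itself, which matters for any format that embeds dumped functions.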
Yep, that is correct. I think ldump is able to preserve all upvalues, even on edge cases such as "_ENV" and joined upvalues (multiple functions referencing one upvalue). A closure is basically an object with a single method and upvalues as fields -- serialization is straightforward. I think I got it covered, but I would be glad to hear ideas about where the serialization can be unstable.
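To make the "closure as an object with fields" picture concrete, here is a sketch using the standard `debug` library (plain Lua, not ldump internals; `make_counter` is a made-up example):

```lua
local function make_counter()
  local count = 0
  local function increment()
    count = count + 1
    return count
  end
  local function get()
    return count
  end
  return increment, get
end

local increment, get = make_counter()
increment()

-- debug.getupvalue returns the upvalue's name and its current value.
local name, value = debug.getupvalue(increment, 1)

-- Joined upvalue: both closures refer to the same variable,
-- so a change made through one is visible through the other.
increment()
local seen_by_get = get()

-- debug.upvalueid (Lua 5.2+ and LuaJIT) can confirm the sharing directly.
local shared = true
if debug.upvalueid then
  shared = debug.upvalueid(increment, 1) == debug.upvalueid(get, 1)
end
```

A serializer that wants to keep the two functions linked after loading has to notice this sharing and rejoin the upvalues, rather than copying the value twice.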
The function (even a closure) would be fully recreated on deserialization, so it is fully safe to save it to disk. It wouldn't preserve reference equality -- it would be a new function -- but the behaviour and the state (if using closures) would be equivalent.
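A hand-rolled sketch of that idea, using only `string.dump` and the `debug` library. This illustrates the principle under the assumption of a single upvalue; it is not how ldump is implemented:

```lua
local load_chunk = loadstring or load  -- loadstring on 5.1/LuaJIT

local function make_counter()
  local count = 0
  return function()
    count = count + 1
    return count
  end
end

local counter = make_counter()
counter()
counter()  -- the captured count is now 2

-- "Serialize": the code as bytecode, the state as the upvalue's value.
local bytecode = string.dump(counter)
local _, saved_count = debug.getupvalue(counter, 1)

-- "Deserialize": a new function, with its own upvalue set to the saved state.
local copy = load_chunk(bytecode)
debug.setupvalue(copy, 1, saved_count)

local next_from_copy = copy()        -- continues from the saved state
local next_from_original = counter() -- the original is unaffected by the copy
```

The copy is a different function object, but it picks up where the original left off, and the two evolve independently afterwards.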
I didn't include asserts in the linked case, because I thought it would be too verbose. You can see the asserts in the test that is linked below the example. Maybe it was the wrong call; I will think about including asserts in the example itself.
I think you could make it clearer. Try reading the readme as someone with the preconceived notion that this is Yet Another Lua Serializer that translates functions, userdata and threads to their tostring() output. There are hundreds of those projects.