> closures over lexical environments possibly shared by other functions The same...

reikonomusha · on Oct 13, 2021

Functions with compiled code referring to address offsets in the closure environment are not printable and I’d hazard to say they can’t be without compromising something else. The function being purportedly serialized is already in a representation very far removed from its source code, and has mutable state that isn’t just from its closure environment. LOAD-TIME-VALUE (in Common Lisp) is another problem that allocates something akin to a C ‘static’ variable. Compiler macros (in Common Lisp) also make things hairier. The lack of a run-time representation of reader macros (in Common Lisp) is the cherry on top, though this is only an issue if you want your serialized representation to mimic the source code closely.

Lists do have issues of sharing data with other lists, but most times the lists that are sharing data are somehow adjacent to one another allowing the printer to collect shared references.

If we allow ourselves the freedom to ignore shared closure environments, L-T-V and other Common Lisp complications, then serializing a function would still be a very expensive recursive procedure that has to serialize all transitive callees to capture any possible shared state, and I think that is too much to ask for, especially in the context we are speaking of (making functions appropriate objects to be printed as a part of an S-expression).
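To make the "all transitive callees" cost concrete, here is a minimal sketch in Common Lisp. It is not from the thread: the call graph, its function names, and `transitive-callees` are all hypothetical, and a real system would have to recover the graph from compiled code rather than from a hand-written alist.

```lisp
;; Hypothetical call graph: an alist of (NAME . DIRECT-CALLEES).
(defparameter *example-call-graph*
  '((render format-row escape)
    (format-row escape pad)
    (escape)
    (pad)
    (unrelated pad)))

(defun transitive-callees (root call-graph)
  "Return every function name reachable from ROOT in CALL-GRAPH, ROOT first."
  (let ((seen '()))
    (labels ((visit (name)
               ;; Skip names already collected, so shared callees and
               ;; cycles are visited only once.
               (unless (member name seen)
                 (push name seen)
                 (dolist (callee (cdr (assoc name call-graph)))
                   (visit callee)))))
      (visit root))
    (nreverse seen)))

;; (transitive-callees 'render *example-call-graph*)
;; => (RENDER FORMAT-ROW ESCAPE PAD)
```

Serializing `render` with full fidelity would mean emitting every function in that result, plus whatever state each one closes over.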

It’s all doable if everything is slow, interpreted, and first-class though. Or maybe you’re serializing FORTH words. :)

waterhouse · on Oct 13, 2021

> Functions with compiled code referring to address offsets in the closure environment are not printable and I’d hazard to say they can’t be without compromising something else.

Surely these offsets refer to items saved in the lexical or global environment, which were originally named by variables? Then you serialize the variable references and the values they refer to. A shared lexical environment would get serialized like this:

  (let ((x 10) (y '(1 2)))
    (let ((f (lambda (z) (set! x z))))
      (let ((g (lambda (a) (list x y)))
            (h (lambda (b) (f b))))
        (list g h))))
  ; serializes to the following,
  ; with the notation #closure(args body env)
  ; where env is ((var1 val1) (var2 val2) ...)
  (#closure( (a)
             ((list x y))
             (#1=(f #closure( (z)
                              ((set! x z))
                              (#2=(x 10) #3=(y (1 2)))))
              #2#
              #3#))
   #closure( (b)
             ((f b))
             (#1#
              #2#
              #3#)))

The runtime system is presumably competent at converting these into machine code, and can do that either during the read, or JIT at execution.
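As a rough illustration of the labelled notation above, Common Lisp's own printer can already produce `#n=` and `#n#` labels when `*print-circle*` is true, and the reader restores the sharing. The sketch below assumes a stand-in representation that is not part of any real implementation: a closure record is the plain list `(closure args body env)`, and all helper names are hypothetical.

```lisp
;; Hypothetical stand-in for #closure(args body env): a closure record is the
;; list (closure ARGS BODY ENV), and ENV is a list of mutable (VAR VAL) bindings.
(defun make-closure-record (args body env)
  (list 'closure args body env))

(defun closure-record-env (record)
  (fourth record))

(defun binding-value (var record)
  "Look up VAR in the environment of RECORD."
  (second (assoc var (closure-record-env record))))

(defun set-binding-value (var record new-value)
  "Destructively update VAR's binding in the environment of RECORD."
  (setf (second (assoc var (closure-record-env record))) new-value))

(defun example-records ()
  "Records for G and H from the example; both share the bindings of F, X and Y."
  (let* ((x-binding (list 'x 10))
         (y-binding (list 'y (list 1 2)))
         (f-binding (list 'f (make-closure-record
                              '(z) '((set! x z))
                              (list x-binding y-binding)))))
    (list (make-closure-record '(a) '((list x y))
                               (list f-binding x-binding y-binding))
          (make-closure-record '(b) '((f b))
                               (list f-binding x-binding y-binding)))))

(defun print-records (records)
  "Print RECORDS to a string, labelling shared conses with #n= and #n#."
  (let ((*print-circle* t))
    (prin1-to-string records)))

(defun read-records (text)
  "Read the records back; each #n# comes back EQ to its #n= object."
  (let ((*read-eval* nil))
    (values (read-from-string text))))

(defun round-trip-demo ()
  "Set X through H's copy of the environment, then read it through G's."
  (destructuring-bind (g h) (read-records (print-records (example-records)))
    (set-binding-value 'x h 42)
    (binding-value 'x g)))   ; 42, because the X binding is still shared
```

This only covers the data side of the argument. Turning a record back into executable code is a separate step that the sketch leaves out.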

Yes, most of the time, when you print lists, either there is no shared structure, or the program doesn't try to modify it, so it doesn't matter if you read a portion back in and bifurcate the identity. Likewise, for the majority of functions, either they don't share their lexical environment with others, or they do but not in a way where it matters. (This thread's original use case was printing globally defined functions, which usually have no lexical environment.) The common case should work fine like this:

  (let ((n 0)) (lambda (x) (incf n x)))
  ; serializing to
  #closure( (x) ((incf n x)) ((n 0)))
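For this common case, a sketch of the "convert it back into machine code" step could look like the following in Common Lisp. It assumes the serialized form is read as the plain list `(closure args body env)` rather than through a real `#closure` reader syntax, and `instantiate-closure` is a hypothetical name. It deliberately ignores sharing between closures: every call builds fresh bindings.

```lisp
(defun instantiate-closure (record)
  "Turn a record (closure ARGS BODY ENV) into a compiled function.
ENV is a list of (VAR VAL) pairs; each VAL is installed as a quoted constant."
  (destructuring-bind (tag args body env) record
    (assert (eq tag 'closure))
    (let ((bindings (loop for (var val) in env
                          collect (list var (list 'quote val)))))
      ;; COMPILE returns a function of no arguments; calling it creates the
      ;; fresh bindings and returns the closure over them.
      (funcall (compile nil `(lambda ()
                               (let ,bindings
                                 (lambda ,args ,@body))))))))

(defun counter-demo ()
  "Read the same text twice and show that the two copies have separate state."
  (let* ((text "(closure (x) ((incf n x)) ((n 0)))")
         (first-copy (instantiate-closure (read-from-string text)))
         (second-copy (instantiate-closure (read-from-string text))))
    (funcall first-copy 5)
    (list (funcall first-copy 2)      ; 7: N persists between calls
          (funcall second-copy 1))))  ; 1: the second copy has its own N
```

Reading the text twice gives two independent counters, which is the "bifurcated identity" that is harmless when nothing else shares the environment.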

If you do need to print a list and read it back in, where it's important that this list shares structure with existing objects, and you're not printing and reloading those objects, then either (a b . #<Object ID 43827>), or, if your runtime doesn't relocate objects, (a b . #<Object at address 0xabcde>) would be the way to go. (Note that the GC would have to know not to kill the object; you're effectively persisting a pointer to it outside the runtime.) Likewise, if you do need to print a lambda sharing a lexenv with another one you're not printing, you'd go #closure(arglist body (#<binding ID 1234> #<binding ID 1235> ...)). Of course, this only works if you reify the object in the same running Lisp session from which you serialized it, but that applies equally to shared lists and shared closures.
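A minimal sketch of the object-ID idea in Common Lisp follows. Since the standard reader signals an error on `#<`, the sketch substitutes a readable placeholder, the vector `#(:object-id n)`; that placeholder, the registry, and the names `externalize` and `internalize` are all assumptions made for illustration. Holding the object in the table is also what keeps the GC from reclaiming it.

```lisp
(defvar *object-registry* (make-hash-table)
  "Maps integer IDs to live objects; being stored here keeps them alive.")

(defvar *next-object-id* 0)

(defun register-object (object)
  "Store OBJECT under a fresh ID and return that ID."
  (let ((id (incf *next-object-id*)))
    (setf (gethash id *object-registry*) object)
    id))

(defun externalize (items shared-tail)
  "Copy ITEMS up to the cons EQ to SHARED-TAIL, ending in an ID placeholder."
  (cond ((eq items shared-tail)
         (vector :object-id (register-object shared-tail)))
        ((atom items) items)   ; SHARED-TAIL was not found; copy as-is
        (t (cons (car items) (externalize (cdr items) shared-tail)))))

(defun internalize (form)
  "Rebuild FORM, replacing ID placeholders with the registered objects."
  (cond ((and (vectorp form)
              (= (length form) 2)
              (eq (aref form 0) :object-id))
         (values (gethash (aref form 1) *object-registry*)))
        ((consp form)
         (cons (internalize (car form)) (internalize (cdr form))))
        (t form)))

(defun shared-tail-demo ()
  "Print a list whose tail is shared with a live object, then read it back."
  (let* ((tail (list 'c 'd))
         (original (list* 'a 'b tail))
         (text (prin1-to-string (externalize original tail)))
         (restored (internalize (read-from-string text))))
    ;; First value is T: the restored list ends in the very same conses.
    (values (eq (cddr restored) tail) restored)))
```

As the paragraph above says, the printed text is only meaningful inside the session that owns the registry.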

The question upthread seemed to be whether it was possible to have a language that could serialize functions. I've argued yes. Now the counterargument seems to be "well, it's inconvenient and expensive to get full fidelity". This brings us to the question of what use cases we're talking about, and how often you want full fidelity (and what assumptions you can make).

I think one of the use cases was printing macroexpansions in the REPL? And then maybe the readability-requiring use case would be selecting a subexpression and saying (macroexpand-1 '([paste])). The literal functions in the expansion would generally be globally defined—the result of using ",+" instead of "+". Well, I think that's covered reasonably well by having the globally-defined function "foo" get printed as #f:foo or something like that. (Maybe we could use the syntax #'foo to mean that? Hah.) It's less nice than seeing the bare name, but macroexpansions already often contain gensyms and package prefixes.
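Here is a hedged sketch of the "print global functions by name" idea in Common Lisp, using the existing `#'foo` syntax as the readable form. Finding the name by scanning a package is an assumption made only to keep the example portable; it is slow, and a real runtime would more likely store the name in the function object. The helper names are hypothetical, and the example uses `(funcall ,#'+ ...)` because Common Lisp does not accept a function object in the operator position of a form.

```lisp
(defun global-function-name (fn &optional (pkg :common-lisp))
  "Return a symbol in PKG whose global definition is EQ to FN, or NIL."
  (do-symbols (symbol pkg nil)
    (when (and (fboundp symbol)
               (not (special-operator-p symbol))
               (not (macro-function symbol))
               (eq (fdefinition symbol) fn))
      (return symbol))))

(defun name-global-functions (form)
  "Copy FORM, replacing each nameable function object with (FUNCTION name).
Functions with no name in the searched package are left as they are and
will still print unreadably."
  (cond ((functionp form)
         (let ((name (global-function-name form)))
           (if name (list 'function name) form)))
        ((consp form)
         (cons (name-global-functions (car form))
               (name-global-functions (cdr form))))
        (t form)))

(defun expansion-demo ()
  "Print an expansion containing a literal function, read it back, and run it."
  (let* ((expansion (list 'funcall #'+ 1 2))   ; like `(funcall ,#'+ 1 2)
         (text (prin1-to-string (name-global-functions expansion)))
         (form (read-from-string text)))
    (values form (eval form))))
```

The form that comes back is `(funcall #'+ 1 2)`, which can be pasted into the REPL again, unlike a form containing a raw `#<FUNCTION ...>` object.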