I'm wondering why they decided to air-gap the models inside Docker containers. IMO, this would have been a better comparison if the models had been allowed to make tool calls.
A small suggestion that immediately came to mind: why not serialize the data as JSON and base64-encode it, the way JWT does? That way it could be shared and loaded easily.
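
To make that concrete, here is a minimal sketch in Python of the JSON-plus-base64 idea. The function names and the sample payload are made up for illustration, and it only mirrors how a JWT encodes its payload segment (URL-safe base64 with the padding stripped). There is no signature here, so it gives portability, not integrity.

```python
import base64
import json


def encode_payload(data: dict) -> str:
    """Serialize a dict to compact JSON, then base64url-encode it without padding."""
    raw = json.dumps(data, separators=(",", ":"), sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_payload(token: str) -> dict:
    """Restore the stripped padding, then decode and parse the JSON."""
    padding = "=" * (-len(token) % 4)
    raw = base64.urlsafe_b64decode(token + padding)
    return json.loads(raw)


if __name__ == "__main__":
    # Hypothetical payload, only to show the round trip.
    sample = {"name": "example", "version": 1, "tags": ["a", "b"]}
    token = encode_payload(sample)
    print(token)
    print(decode_payload(token) == sample)
```

The resulting string is a single line that is safe to paste into URLs, chat messages, or config files, which is the main appeal. If tamper detection mattered, a signature segment would have to be added on top, as real JWTs do.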