At some point autoencoding during training will take care of that, it seems. We kind of already do it with 'tutor' training and GPT-generated training data, as well as synthetic logic and math problem sets.
Oh, having language models sit on top of autoencoders is 100% the right way to go, even if we weren't moving towards multi-modal models, just because right now LLMs can be brilliant in one language and hopeless in another depending on the training data. Putting an autoencoder in front and working with a "language-agnostic" encoding would solve that problem.
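To make the idea concrete, here's a minimal sketch of what "LLM on top of an autoencoder" could look like, written in PyTorch purely as an assumption; the class names (LatentAutoencoder, LatentLM), layer sizes, and vocab size are all illustrative, not anyone's actual design. The point is just the split: a per-language encoder/decoder maps tokens into a shared latent space, and the language model reasons over those latents instead of over tokens.

```python
import torch
import torch.nn as nn

class LatentAutoencoder(nn.Module):
    """Hypothetical: maps token sequences into a shared latent space and back."""
    def __init__(self, vocab_size: int, d_latent: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_latent)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d_latent, nhead=8, batch_first=True),
            num_layers=4,
        )
        self.decoder_head = nn.Linear(d_latent, vocab_size)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Language-specific tokens -> (ideally) language-agnostic latent vectors
        return self.encoder(self.embed(token_ids))

    def decode(self, latents: torch.Tensor) -> torch.Tensor:
        # Latent vectors -> logits over the target language's vocabulary
        return self.decoder_head(latents)


class LatentLM(nn.Module):
    """Hypothetical: a language model that operates on latents, not tokens."""
    def __init__(self, d_latent: int = 512):
        super().__init__()
        self.core = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=d_latent, nhead=8, batch_first=True),
            num_layers=12,
        )

    def forward(self, latents: torch.Tensor) -> torch.Tensor:
        return self.core(latents)


# Usage: encode in the source language, reason in latent space, decode anywhere.
ae = LatentAutoencoder(vocab_size=32000)
lm = LatentLM()
tokens = torch.randint(0, 32000, (1, 16))   # some tokenized input sentence
latents = ae.encode(tokens)                  # shared, language-agnostic representation
next_latents = lm(latents)                   # the LLM "thinks" in latent space
logits = ae.decode(next_latents)             # project back out to any vocabulary
```

Under this split, the per-language knowledge lives in the autoencoder, so the LLM's capability wouldn't depend on how much training data existed in any particular language.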