this is super cool! I wish there was an easy to understand and follow guide on h...

infecto · on Oct 26, 2023

I will butcher this, so if any experts see this please don't flame me. I think you might be conflating ideas? You could definitely fine-tune an existing embedding model or train your own from scratch, but the goals of an embedding model are different from those of an LLM conversation. Embedding models are used for things like classification, search, image captioning... maybe at a high level anything where you have high-dimensional data that you need to condense?

What you are asking for sounds like fine-tuning an existing LLM... where the data will still be tokenized but the outcomes are different? There are a lot of writeups on how people have done it. You should especially follow some of the work on Hugging Face. To replicate talking to your friend, though, I would think you will need a very large dataset to train on, and it's unclear to me whether you can just fine-tune or would need to train a model from scratch. So, a dataset with tens of thousands of examples, and then you need to train it on a GPU.

https://www.anyscale.com/blog/fine-tuning-llama-2-a-comprehe...
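
For anyone wondering what "a dataset of examples" might look like in practice, here is a rough sketch, not a definitive recipe. It assumes the chat history has already been exported as a list of messages with `sender` and `text` fields (that shape, the `build_pairs` and `to_jsonl` helpers, and the `prompt`/`response` keys are all made up for illustration; a real fine-tuning script will expect its own format). It pairs each of your messages with the friend's immediate reply and writes the pairs as JSON Lines.

```python
import json


def build_pairs(messages, me="me", friend="friend"):
    """Pair each of my messages with the friend's reply that directly follows it."""
    pairs = []
    for prev, nxt in zip(messages, messages[1:]):
        if prev["sender"] == me and nxt["sender"] == friend:
            pairs.append({"prompt": prev["text"], "response": nxt["text"]})
    return pairs


def to_jsonl(pairs):
    """Serialize the pairs as JSON Lines, one training example per line."""
    return "\n".join(json.dumps(p) for p in pairs)


# Hypothetical exported chat log; real exports will look different.
chat_log = [
    {"sender": "me", "text": "Did you watch the game last night?"},
    {"sender": "friend", "text": "Obviously. Worst refereeing I have ever seen."},
    {"sender": "friend", "text": "Still mad about it."},
    {"sender": "me", "text": "Pizza later?"},
    {"sender": "friend", "text": "Always."},
]

pairs = build_pairs(chat_log)
print(to_jsonl(pairs))
```

Only direct me-then-friend turns are kept here; follow-up messages from the friend are dropped, which is a simplification you may not want for real data.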

Kutsuya · on Oct 27, 2023

Thank you for sending this. It's still quite puzzling to me whether it's actually possible or not. Maybe what I want to train is a style? But then again, it should also remember other important things related to the friend.

sainez · on Oct 29, 2023

Parent comment is on the right track. It sounds like you want to fine-tune an LLM to mimic the conversation style between you and your friend. Then you can use a general embedding model to implement RAG (retrieval-augmented generation) so that the application can "recall" pieces of your conversation.
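
To make the "recall" part concrete, here is a toy sketch of the retrieval step only. Big assumption: `embed` below is a bag-of-words counter standing in for a real embedding model, so it matches on shared words rather than meaning. The `recall` and `build_prompt` helpers and the sample messages are hypothetical too. The idea is to score stored messages against the new question, take the closest ones, and paste them into the prompt sent to the fine-tuned model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall(query, memories, k=2):
    """Return the k stored messages most similar to the query."""
    q = embed(query)
    ranked = sorted(memories, key=lambda m: cosine(q, embed(m)), reverse=True)
    return ranked[:k]


def build_prompt(query, memories, k=2):
    """Prepend the recalled messages to the question as context for the LLM."""
    context = "\n".join(f"- {m}" for m in recall(query, memories, k))
    return f"Relevant past messages:\n{context}\n\nUser: {query}\nFriend:"


# Hypothetical snippets from past conversations.
memories = [
    "My sister's wedding is in June in Lisbon.",
    "I finally adopted a dog, her name is Miso.",
    "I hate cilantro, it tastes like soap.",
]

question = "What is the name of your dog?"
top = recall(question, memories, k=1)
print(build_prompt(question, memories, k=1))
```

With a real embedding model you would swap out `embed` and probably keep the vectors in a vector store instead of re-embedding every message per query.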