
I am curious whether an LLM trained with a tokenizer that renders words into the International Phonetic Alphabet (IPA) would make a better bot for creative writing, especially for sound-based wordplay such as rhyme, assonance, and puns. It might also do better on "fringe" languages, where the corpus in the language itself is small but many words have cognates in more widely known languages.
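
To make the idea concrete, here is a rough sketch of why phonetic tokens might help with rhyme. Everything in it is an assumption for illustration: the `TOY_IPA` table is a tiny hand-written lookup with approximate transcriptions, and the helper names are made up. A real system would need a proper pronunciation dictionary or grapheme-to-phoneme model.

```python
# Toy illustration only: a hand-written word -> IPA table (hypothetical,
# approximate transcriptions), not a real pronunciation dictionary.
TOY_IPA = {
    "though": "ðoʊ",
    "tough": "tʌf",
    "cuff": "kʌf",
    "blue": "bluː",
    "through": "θruː",
}

# Vowel symbols used by the toy table above (not a complete IPA vowel set).
VOWELS = set("aeiouæɑɒɔəɛɪʊʌ")


def to_ipa_tokens(word):
    """Return one token per IPA character for a word in the toy table."""
    return list(TOY_IPA[word.lower()])


def rhyme_part(word):
    """Return the IPA substring from the last vowel group to the end."""
    ipa = TOY_IPA[word.lower()]
    i = len(ipa) - 1
    # Walk back over trailing consonants and length marks.
    while i >= 0 and ipa[i] not in VOWELS:
        i -= 1
    if i < 0:
        # No vowel found: fall back to the whole transcription.
        return ipa
    # Extend left across a contiguous vowel group (e.g. a diphthong).
    while i > 0 and ipa[i - 1] in VOWELS:
        i -= 1
    return ipa[i:]


def rhymes(a, b):
    """Two words rhyme here if their final vowel-plus-coda segments match."""
    return rhyme_part(a) == rhyme_part(b)
```

With spelling-based tokens, "tough" and "though" look nearly identical while "tough" and "cuff" share little; in the phonetic form the relationship flips, so `rhymes("tough", "cuff")` holds and `rhymes("tough", "though")` does not. Whether that actually translates into better creative writing from a trained model is the open question.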

