
UTF-8 is an extremely simple and lightweight text encoding. Check out Plan 9's man page on UTF; it would fit on a t-shirt: https://plan9.io/magic/man2html/6/utf
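To give a sense of how little there is to it, here is a rough sketch of an encoder, assuming the modern four-byte form of UTF-8 (RFC 3629). The name rune_encode is made up for illustration and is not taken from Plan 9 or libutf.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point into buf (room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp is not encodable. */
size_t rune_encode(uint32_t cp, unsigned char *buf)
{
    if (cp < 0x80) {                      /* 0xxxxxxx */
        buf[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 110xxxxx 10xxxxxx */
        buf[0] = (unsigned char)(0xC0 | (cp >> 6));
        buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not characters */
        buf[0] = (unsigned char)(0xE0 | (cp >> 12));
        buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        buf[0] = (unsigned char)(0xF0 | (cp >> 18));
        buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

ASCII passes through unchanged as a single byte, which is why existing byte-oriented code keeps working on plain English text.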

Unicode is also just a representation for text plus a handful of common operations: you work with arrays of characters rather than arrays of bytes. It was worth its cost on 1992 hardware, and the Nintendo DS is over a decade more recent.
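As a sketch of what "arrays of characters" means in practice, the following turns a byte buffer into an array of code points, again assuming RFC 3629 UTF-8. The names rune_decode and utf8_to_runes are hypothetical, not libutf's API, and the error handling (give up on the first bad byte) is just one possible choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one code point from s[0..n).
 * Returns the number of bytes consumed, or 0 on malformed input. */
size_t rune_decode(const unsigned char *s, size_t n, uint32_t *out)
{
    /* Smallest code point allowed for each sequence length. */
    static const uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t len, i;
    uint32_t cp;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {
        *out = s[0];
        return 1;
    }
    if ((s[0] & 0xE0) == 0xC0)      { len = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; }
    else
        return 0;                   /* stray continuation or invalid lead byte */
    if (n < len)
        return 0;                   /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;               /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and out-of-range values. */
    if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    *out = cp;
    return len;
}

/* Decode a whole buffer into runes[0..cap). Returns the number of code
 * points, or (size_t)-1 if the input is malformed or does not fit. */
size_t utf8_to_runes(const unsigned char *s, size_t n, uint32_t *runes, size_t cap)
{
    size_t i = 0, count = 0;

    while (i < n) {
        uint32_t cp;
        size_t used = rune_decode(s + i, n - i, &cp);
        if (used == 0 || count == cap)
            return (size_t)-1;
        runes[count++] = cp;
        i += used;
    }
    return count;
}
```

Once the text is in the runes array, indexing gives you the nth character directly, and a (size_t)-1 result tells you the bytes were never valid UTF-8 text to begin with.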

I recommend studying libutf in sbase[0]. It's not a single-header solution (although utf.h[1] is an excellent place to start reading), but it does provide a fairly comprehensive implementation. There's also a good introduction to Unicode in Plan 9's C programming guide[2]. Even if you choose to only support runes that fit in a single byte, you gain the ability to tell byte blobs apart from text, which is useful both for reasoning about your program and for future-proofing it, in case you ever need to put places like Łódź or Πάτρα on your map.

[0]: http://git.suckless.org/sbase

[1]: http://git.suckless.org/sbase/file/utf.h.html

[2]: https://plan9.io/sys/doc/comp.html
