
UTF-16*, not UCS-2. Although there are probably many programs that assume UCS-2.


When Windows adopted Unicode, I think the only encoding available was UCS-2. They converted pretty quickly to UTF-16 though, and I think the same is true of everybody else who started with UCS-2. Unfortunately UTF-16 has its own set of hassles.
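To make one of those hassles concrete, here is a minimal Python sketch of where UCS-2 assumptions break: a character outside the Basic Multilingual Plane takes two 16-bit code units (a surrogate pair) in UTF-16, so "number of code units" and "number of characters" diverge. The helper names `utf16_units` and `utf16_length` are made up for illustration and are not from any library.

```python
import struct


def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units (as ints) that encode `text`."""
    raw = text.encode("utf-16-le")  # little-endian, no BOM
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))


def utf16_length(text: str) -> int:
    """Length in 16-bit code units, which is what UCS-2-era APIs count."""
    return len(utf16_units(text))


sample = "a\U0001F600"  # 'a' followed by U+1F600, which lies outside the BMP
code_points = len(sample)          # Python counts code points
code_units = utf16_length(sample)  # UTF-16 needs a surrogate pair for U+1F600
pair = utf16_units("\U0001F600")   # the high and low surrogate
```

Code that indexes or truncates by code unit, as if every unit were a whole character, can split that pair in half.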


Technically, they converted to WTF-16 [0] since many places, including filenames, allow you to use unpaired surrogates.

[0] https://simonsapin.github.io/wtf-8/
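As a rough illustration of what "unpaired surrogate" means, here is a small Python sketch that scans a sequence of 16-bit code units and reports whether any surrogate lacks its partner. Strict UTF-16 rejects such a sequence, while systems that pass 16-bit units through unchecked accept it. The function name `has_unpaired_surrogate` is hypothetical and is not taken from the WTF-8 spec.

```python
def has_unpaired_surrogate(units: list[int]) -> bool:
    """Return True if the 16-bit code units are not well-formed UTF-16."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed immediately by a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return True
        if 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate with no preceding high surrogate.
            return True
        i += 1
    return False


well_formed = has_unpaired_surrogate([0x61, 0xD83D, 0xDE00])  # 'a' + valid pair
lone_high = has_unpaired_surrogate([0x61, 0xD83D])            # pair cut in half
swapped = has_unpaired_surrogate([0xDE00, 0xD83D])            # low before high
```

A filename containing a sequence like the second or third case cannot be converted to strict UTF-8 or UTF-16 without loss, which is the gap WTF-8 is designed to cover.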


Note that the asterisk in `UTF-16*` is a really big asterisk. I fixed a UCS-2 bug last week at my day job.


Yeah, in practice there are sometimes a lot more hacks like WTF-8 and WTF-16 than is healthy on systems that were originally UCS-2 (including Windows and JS): https://simonsapin.github.io/wtf-8/



