My team managed a system that read user data in chunks and ran input validation on each chunk. One day we got a smart quote character that happened to be above U+FFFF, so it needed two code units (a surrogate pair, in UTF-16 terms). Because validation happened per chunk and the chunk boundary fell in the middle of that character, we only got half of it. Half a character is invalid on its own, so input validation failed.
In UTF-8, partial characters happen so often (every non-ASCII character is multi-byte) that the code handling them is likely to get tested. In UTF-16, they are much rarer, so things work until someone pastes in an emoji, and then it all falls apart.
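To make the failure concrete, here is a minimal sketch in Python. It is not our actual system; the function names and the 8-byte chunk split are made up for illustration. It assumes the input is UTF-16LE and compares decoding each chunk independently with using the standard library's incremental decoder, which holds back a trailing partial character until the next chunk arrives.

```python
import codecs


def naive_validate(chunks, encoding="utf-16-le"):
    """Hypothetical per-chunk validation: decodes every chunk in isolation."""
    return "".join(chunk.decode(encoding) for chunk in chunks)


def streaming_validate(chunks, encoding="utf-16-le"):
    """Hypothetical fix: carry partial characters across chunk boundaries."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    parts = [decoder.decode(chunk) for chunk in chunks]
    # final=True makes the decoder raise if the input ends mid-character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


# "hi " is three 2-byte code units; U+1F600 is a surrogate pair (2 + 2 bytes).
data = "hi \U0001F600".encode("utf-16-le")
# Split after byte 8, i.e. between the high and the low surrogate.
chunks = [data[:8], data[8:]]

try:
    naive_validate(chunks)
    naive_ok = True
except UnicodeDecodeError:
    # The first chunk ends in a lone high surrogate, which is invalid alone.
    naive_ok = False

text = streaming_validate(chunks)
```

The same `streaming_validate` works for UTF-8 by passing `encoding="utf-8"`, where a split can land inside any non-ASCII character rather than only inside characters above U+FFFF.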