Chardetng: A More Compact Character Encoding Detector for the Legacy Web (hsivonen.fi)
63 points by hsivonen on June 8, 2020 | 4 comments


About a year ago I had a webpage that was interpreted in the wrong encoding, and I was taken aback to find that Chrome no longer lets you override a page's encoding.

I think it's interesting how far we have come with UTF-8 adoption: that was the first time I had reached for that menu in probably nearly a decade.


Fantastic write-up.

I regularly use the Python port of the original chardet (https://pypi.org/project/chardet/). In fact, most Python devs do, since it comes with requests.

This post is full of gems. E.g., I learned that it's important for your meta charset to be in the first 1024 bytes of your HTML :)


FWIW, Firefox issues a warning if it finds your charset declaration late, outside the first 1024 bytes. Long copyright or license headers can cause this problem, annoyingly.
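
For anyone who wants to check their own pages, here is a rough Python sketch of the idea. It is only a regex approximation, not the browser's actual prescan algorithm, and the names `charset_in_prescan_window` and `PRESCAN_LIMIT` are made up for illustration. It assumes the whole meta tag needs to end within the first 1024 bytes.

```python
import re

# Simplified pattern (an assumption, not the HTML spec's prescan):
# matches <meta charset="..."> and the http-equiv form where
# "charset=" appears inside the content attribute.
META_CHARSET = re.compile(
    rb'<meta\s+[^>]*?charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)[^>]*>',
    re.IGNORECASE,
)

PRESCAN_LIMIT = 1024  # byte window discussed in this thread


def charset_in_prescan_window(html: bytes, limit: int = PRESCAN_LIMIT):
    """Return (label, within_window) for the first meta charset found.

    Returns (None, False) when no declaration is found at all.
    """
    match = META_CHARSET.search(html)
    if match is None:
        return None, False
    label = match.group(1).decode("ascii").lower()
    # The declaration counts as "early" only if the tag ends inside the window.
    return label, match.end() <= limit


if __name__ == "__main__":
    page = b'<!DOCTYPE html><html><head><meta charset="utf-8"><title>x</title>'
    # Simulate a long license header pushing the declaration past the window.
    late_page = b"<!--" + b" " * 2000 + b"-->" + page
    print(charset_in_prescan_window(page))
    print(charset_in_prescan_window(late_page))
```

Feeding it the raw bytes of a file (for example, read with `open(path, "rb")`) should tell you whether a long header comment has pushed the declaration too far down.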


This is super cool and interesting. Great write-up, thanks.



