I snuck in an edit about readability before I saw your reply. The quality of that one in particular is very meh, especially for most new sites and then you lose all of the dom structure in case you want to do more with the page. Though now I'm curious how it works on the weather.com page the author tried. pupeteer -> screenshot -> ocr (or even multi-modal which many do OCR first) -> LLM pipeline might work better there.