The "official" version used in that blog post decodes the JPG all in one go, so it's pretty memory hungry. With JPEG decoders that decode one section of the image at a time, you can minimise the amount of RAM that needs to be allocated. It's also possible to stream the display data out to the screen using DMA while the next chunk of image data is being decoded.
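To make the "decode a chunk while the previous one is going out over DMA" idea concrete, here's a rough sketch of the usual ping-pong buffer pattern. This is not the library's actual API: `decode_strip`, `dma_start` and `dma_wait` are hypothetical stand-ins (the "DMA" here is just a `memcpy` into a fake framebuffer so it runs on a desktop), and the 240x320 panel size and 16-row strip height are assumptions picked for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IMG_W   240   /* assumed panel width in pixels */
#define IMG_H   320   /* assumed panel height in pixels */
#define STRIP_H 16    /* assumed strip height (one row of 16-pixel MCUs) */

/* Two strip buffers: 2 * 240 * 16 * 2 bytes = 15,360 bytes,
   versus 153,600 bytes for a full RGB565 frame. */
static uint16_t strip_buf[2][IMG_W * STRIP_H];

/* Stand-in for the display's own memory; not needed on real hardware. */
static uint16_t framebuffer[IMG_W * IMG_H];

/* Hypothetical decoder hook: fills `out` with one strip of RGB565 pixels.
   Here it just writes a per-strip marker value instead of decoding JPEG. */
static void decode_strip(int strip, uint16_t *out) {
    for (int i = 0; i < IMG_W * STRIP_H; i++) {
        out[i] = (uint16_t)(strip + 1);
    }
}

/* Hypothetical DMA hooks. On an MCU, dma_start would queue an SPI
   transaction and return immediately; here it copies synchronously. */
static void dma_start(int y, const uint16_t *src) {
    memcpy(&framebuffer[y * IMG_W], src, sizeof(uint16_t) * IMG_W * STRIP_H);
}

static void dma_wait(void) {
    /* Real code would block until the in-flight transfer completes. */
}

/* Decode and display the image strip by strip; returns the strip count. */
int draw_image(void) {
    assert(IMG_H % STRIP_H == 0);
    int strips = IMG_H / STRIP_H;
    int cur = 0;
    for (int s = 0; s < strips; s++) {
        /* The CPU decodes into `cur` while the other buffer may still
           be streaming out to the display. */
        decode_strip(s, strip_buf[cur]);
        /* Make sure the previous transfer is done before starting a new
           one, so the other buffer is free for the next decode. */
        dma_wait();
        dma_start(s * STRIP_H, strip_buf[cur]);
        cur ^= 1; /* swap buffers */
    }
    dma_wait(); /* let the final strip finish */
    return strips;
}
```

Calling `draw_image()` once pushes the whole image through just the two strip buffers. On real hardware the win comes from `dma_start` returning immediately, so the decode of strip N+1 overlaps with the transfer of strip N.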
It's very easy to forget what a range of MCUs there are, from very puny to very capable. For example, the Espressif range of MCUs, which you'll find in all sorts of consumer products, is very powerful. Couple that with a lot of cheap SPI-based display modules and you very quickly start wanting to show images.
It's explained in a bit more detail in this original blog post, written before the library was optimised: https://atomic14.substack.com/p/the-fastest-esp32-jpeg-decod....