There is no standard and widely recognized long-term archival format for HTML pages (with all the extras). Web ARChive (WARC) provides a method for bundling all the stuff in one file, but that's not enough. Plus the files will be quite large.
You just don't know how your HTML and JavaScript will render 10-15 years from now. If you look at old Web Archive files, you start to see how they become crap over time.
HTML is the format. You pack it with images, CSS and whatever else, and you have the distribution format.
> Web ARChive (WARC) provides a method for bundling all the stuff in one file, but that's not enough. Plus the files will be quite large.
Not enough how? What is there that you need besides what the server hands to you, if that's what was rendered in the first place? What magical compression methods do you have in PDF that are better than the ZIP compression used in MAFF, for example?
> You just don't know how your HTML and JavaScript will render 10-15 years from now. If you look at old Web Archive files, you start to see how they become crap over time.
Have a static HTML version that's rendered the same in the future. You know, the same way that you have a static PDF standard.
How do you render Javascript in PDFs in a standard way? You don't use Javascript, that's how. Javascript is not for publication of static semantic text, so you don't use Javascript for papers, it's a no-brainer.
> HTML is the format. You pack it with images, CSS and whatever else, and you have the distribution format.
HTML is not a good format and standard for that purpose. It's loose, best-effort markup with no good consensus on semantics. HTML with images is not a good option for papers which have many equations.
EPUB3 is an emerging standard for what you want, but it's not really a good, complete solution that can replace PDF/A or TeX/LaTeX.
> Have a static HTML version that's rendered the same in the future
> It's loose, best-effort markup with no good consensus on semantics.
And PDF has good semantics? Are we still on the topic of how HTML is better than PDF, or…? We're in the comments for a page that says that PDF tables are characters just floating in space, and people are saying most PDFs out there don't have semantic markup. Meanwhile HTML has had semantics efforts for decades now, just choose your flavor.
Blind people read HTML, you know. Do they read PDFs?
> HTML with images is not a good option for papers which have many equations.
There's MathML for that, and IIRC other formats too. You could even have embedded TeX like Anki has. Use SVG for fallback.
>> Have a static HTML version that's rendered the same in the future
> We don't have that.
Ooh, chicken-and-egg again? Freeze any of the versions from the past decade with the rendering standards, and you'll have it.
But actually, it doesn't even matter, just like HTML 2.0 can be rendered fine on modern devices (aside from the different text size). Treat your paper as a paper instead of a webzine, don't use crazy layouts, just do “text, image, text” which you'll want anyway for the different displays—and your document will render fine in the future when it will be delivered straight to the retina, instead of making me scroll the PDF back and forth because no reflow.
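To make the "static HTML plus MathML, zipped" idea concrete, here's a minimal sketch in Python. The page content and the `bundle` helper are made up for illustration, and this is a plain ZIP, not the actual MAFF layout (which has its own conventions); it only shows that a script-free page with an equation fits in one compressed file and comes back out byte for byte.

```python
import io
import zipfile

# Hypothetical paper: static HTML, no JavaScript, one MathML equation.
PAGE = """<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>Example paper</title></head>
<body>
<p>Text before the equation.</p>
<math display="block">
  <mi>E</mi><mo>=</mo><mi>m</mi><msup><mi>c</mi><mn>2</mn></msup>
</math>
<p>Text after the equation.</p>
</body>
</html>
"""


def bundle(files: dict[str, bytes]) -> bytes:
    """Pack the given files into one DEFLATE-compressed ZIP archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


# Images and CSS would go in the same dict under their relative paths.
archive = bundle({"index.html": PAGE.encode("utf-8")})

# Reading it back: the archive lists one file and returns the page unchanged.
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    names = zf.namelist()
    restored = zf.read("index.html").decode("utf-8")
```

Whether a future browser still renders the `<math>` element the same way is exactly the open question in this thread; the sketch only covers the packaging side.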