
The official W3C validator site seems to agree with me. Did you misunderstand what I said, or can you show that I'm wrong? Just feed it any widely used webpage, for example https://news.ycombinator.com/, and it will not pass.

Just to be clear, the fact that there are still uses for SGML does not make it relevant in the big picture. Your use case seems to be the exception.



I don't know what you did exactly, but the official W3C validator site uses 20-year-old DTDs for DTD-based validation, while HN's markup uses presentational elements and attributes from the HTML4 transitional/loose era, which were intended to ease migration to CSS back then. The errors show exactly what's wrong with HN's markup, e.g. missing "alt" attributes on images where required, use of long-obsolete elements, a missing DOCTYPE, etc. So I'd say it's working as expected in suggesting improvements to your site's markup, isn't it?

FYI: if you want to parse modern HTML 5 using SGML (with my HTML5 "mini"-DTD), see [1]. For example, to check the HN homepage, download it using curl, add a DOCTYPE to it ('<!DOCTYPE html SYSTEM "about:legacy-compat">'), then invoke "sgmlproc" on it, and it'll just work and parse without errors (see downloads and instructions on the linked page).

[1]: http://sgmljs.net/docs/parsing-html-tutorial/parsing-html-tu...
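
A minimal sketch of the "add a DOCTYPE" step in Python, in case it helps. The function name `with_doctype` is hypothetical and not part of sgmljs; only the DOCTYPE string is taken from the comment above. Running "sgmlproc" on the result is not shown here; see the linked page for how to invoke it.

```python
# The DOCTYPE quoted in the comment above.
LEGACY_DOCTYPE = '<!DOCTYPE html SYSTEM "about:legacy-compat">'


def with_doctype(html: str, doctype: str = LEGACY_DOCTYPE) -> str:
    """Return `html` with `doctype` prepended, unless it already starts with one.

    Hypothetical helper for illustration only.
    """
    if html.lstrip().lower().startswith("<!doctype"):
        return html  # leave pages that already declare a DOCTYPE untouched
    return doctype + "\n" + html


# Example with an inline string standing in for the downloaded page.
page = with_doctype("<html><body>hi</body></html>")
print(page.splitlines()[0])  # prints the DOCTYPE line
```

In practice you would save the page first (for example with `curl -s https://news.ycombinator.com/ > hn.html`), read the file, pass its contents through the function, and write the result back before handing it to the parser.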


Yes, but that is not relevant to my argument. Validator validating is irrelevant. HN's markup is not wrong because it works. You use sgmljs to deal with the unnecessary mess that SGML/HTML/XML started.

PS: Since you seem to know this stuff, where can I find the standard DTD for DTDs from before XML? DTD was defined using a DTD, right?


Not sure what you're after exactly, but DTDs were introduced with SGML (ISO 8879:1986 [0]) and then used in simplified form with XML (which is specified as a simplified profile of SGML [1]).

The (historic) SGML DTDs for HTML, including those used by W3C's validator and the early IETF DTDs for HTML 2.0, can be found on W3C's site, e.g. [2], [3].

[0]: https://www.iso.org/standard/16387.html

[1]: https://www.w3.org/TR/REC-xml/

[2]: https://www.w3.org/TR/html4/sgml/dtd.html

[3]: https://www.w3.org/TR/2018/SPSD-html32-20180315/


My question is this: is there a standard SGML DTD for DTDs themselves? I have no access to ISO 8879:1986, so I can't check it.


Not really. SGML (and XML) are "meta-markup languages", meaning you declare your vocabulary yourself or use a ready-made one. There is in fact a simple general-purpose vocabulary declared in an appendix of ISO 8879:1986, consisting of generic paragraph and heading elements, but it's not widely used in that form.


This gets close to my point.

Even people working with the standard don't want or don't need to use SGML. Similarly for CSS.


BTW, a DTD-valid HTML document can still violate the specification.

We have a situation where

1. non-valid HTML is just fine, because HTML parsers recognize an informal superset, and
2. HTML validated against the corresponding DTD can still violate the specification.

So we have a situation where the parser/validator is at the same time not enough and too much.
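
To illustrate point 1 only, here is a small sketch using Python's standard-library `html.parser`. It is a lenient tokenizer, not a browser parser and not a validator, so treat it as a stand-in for "parsers accept an informal superset". The class name `TagCollector`, the helper `collect_tags`, and the sample markup are made up for this example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record start and end tags in the order the tokenizer reports them."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.end_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_endtag(self, tag):
        self.end_tags.append(tag)


def collect_tags(markup):
    """Feed markup to the lenient tokenizer and return (start_tags, end_tags)."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.start_tags, parser.end_tags


# No DOCTYPE, obsolete <center> and <font>, unquoted attributes,
# an <img> without "alt", and <font>/<b> never closed.
sloppy = "<center><font color=red><b>hi<img src=x.gif></center>"
starts, ends = collect_tags(sloppy)
print(starts)  # ['center', 'font', 'b', 'img']
print(ends)    # ['center']
```

No exception is raised and every tag is reported, even though a DTD-based validator would flag several problems in the same markup.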


The validator existed, sure, but finding a page that validated was like finding a unicorn.


Well, the HN home page doesn't validate in the experimental HTML 5 validator either ;). The validator's point isn't to cover the largest set of documents out there on the web (you could use my "mini"-DTDs for that) but to inform authors about less-than-ideal markup (as in "HTML recommendation").



