


Pretty much, discussion: https://github.com/ndjson/ndjson.github.io/issues/1

Also when you're using this with GeoJSON there's https://stevage.github.io/ndgeojson/ which has an actual RFC (https://datatracker.ietf.org/doc/html/rfc8142)


Which itself is based on an earlier RFC for not-specifically-Geo JSON, RFC 7464. Both do things a little differently than the others: they put the "record separator" character (0x1E) at the start of each record and actually split on that separator when parsing.
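To make the record-separator framing concrete, here is a minimal Python sketch of a reader that splits on RS. The function name `parse_json_seq` and the sample data are made up for illustration, and this is a simplified, lenient reading of RFC 7464: it does not implement the RFC's rules for recovering from malformed or truncated texts.

```python
import json

RS = "\x1e"  # ASCII record separator (0x1E), placed before each JSON text


def parse_json_seq(data: str) -> list:
    """Split on RS and parse each piece (hypothetical helper, simplified)."""
    values = []
    for chunk in data.split(RS):
        if not chunk.strip():
            continue  # empty piece before the first RS or between repeated RS
        values.append(json.loads(chunk))  # json.loads tolerates the trailing "\n"
    return values


sample = '\x1e{"a": 1}\n\x1e[1, 2]\n\x1e"x"\n'
print(parse_json_seq(sample))
```

Because the reader splits on RS rather than on newlines, a JSON text may contain newlines of its own (pretty-printed output, for instance) and still be framed unambiguously.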

The GeoJSON one pretty much seems to exist just to hang an "application/geo+json-seq" media type registration off of. Part of me wants to say this really should have been more of an "all JSON subtypes are also json-seq subtypes" situation, but maybe that's not really feasible with the standards/registration processes.


Yes -

> Two terms for equivalent formats of line-delimited JSON are:

> Newline delimited (NDJSON)[4] - The old name was Line delimited JSON (LDJSON).[5]

> JSON lines (JSONL)[6]

https://en.wikipedia.org/wiki/JSON_streaming


It's strictly worse, actually.

ndjson specifies sane newline handling, since it works with terminators.

jsonlines works with separators and thus fails to detect truncated values. As a result, it can silently produce incorrect numeric values.
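A small Python sketch of the failure mode described above, assuming a reader that treats the newline purely as a separator (the name `read_lenient` and the sample stream are made up for illustration):

```python
import json


def read_lenient(data: str) -> list:
    """Treat newline as a separator: a last line with no newline is accepted."""
    return [json.loads(line) for line in data.split("\n") if line]


full = "10\n20\n12345\n"
cut = full[:-3]  # transmission cut mid-record, leaving "10\n20\n123"

print(read_lenient(full))
print(read_lenient(cut))
```

The cut stream still parses without any error, but its last value comes out as 123 instead of 12345, because a prefix of a number is itself a valid number.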


This is only an issue with plain numbers, however. If the number were inside an object or array, you'd detect the truncation just as well. Since using jsonl for a plain list of numbers is... overkill, I'd say it's not an issue.

On the other hand, requiring line terminators in the standard would inevitably lead to incompatibility issues. Most software would accept unterminated files, because text libraries do; and so some files will not be terminated. Some applications do not line-terminate files even on Linux (hello VSCode), and it would be even more problematic on Windows.


It makes it incompatible with concatenative streaming, and it requires O(n^2) reparsing on every new chunk instead of just scanning for \n.

And if you have to parse values to detect end of record anyway, there’s no point in having jsonl standard at all, since you can just try to parse until the matching brace and repeat on success.
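For reference, the "parse one value, then repeat" approach mentioned above can be sketched in Python with the standard library's `json.JSONDecoder.raw_decode`, which returns the parsed value together with the index where it ended. The function name `parse_concatenated` is made up for illustration, and this sketch works on a complete in-memory string rather than a live stream.

```python
import json


def parse_concatenated(data: str) -> list:
    """Parse back-to-back JSON values with no required delimiter (sketch)."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here
        while pos < len(data) and data[pos].isspace():
            pos += 1
        if pos >= len(data):
            break
        value, pos = decoder.raw_decode(data, pos)
        values.append(value)
    return values


print(parse_concatenated('{"a": 1}{"b": 2}\n[3]'))
```

Note that no newline is needed between values here, which is the point being made: if the parser already has to find the end of each value, the line framing adds nothing.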


VSCode has the ‘files.insertFinalNewline’ setting to configure this.


What’s the difference between terminators and separators here? The ndjson spec [0] doesn’t say anything like that, and it seems that ndjson and jsonlines are identical in what documents they accept.

[0]: https://github.com/ndjson/ndjson-spec


A separator separates two records, whereas a terminator terminates a record.

You can detect and error out if you see an unterminated record at end of transmission. With separators, the producer might not put a separator after the last record, because there's nothing to separate it from there.
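As a rough Python sketch of the terminator approach: an incremental reader only has to scan each chunk for "\n", and at end of transmission it can reject whatever is left in the buffer. The class `StrictLineReader` and its `feed`/`close` methods are hypothetical names for illustration, not part of either spec.

```python
import json


class StrictLineReader:
    """Hypothetical reader that requires every record to end with a newline."""

    def __init__(self) -> None:
        self._buf = ""  # bytes received after the last newline seen so far

    def feed(self, chunk: str) -> list:
        """Return the records completed by this chunk; keep the remainder."""
        self._buf += chunk
        *complete, self._buf = self._buf.split("\n")
        return [json.loads(line) for line in complete if line.strip()]

    def close(self) -> None:
        """Call at end of transmission; leftover data means a cut-off record."""
        if self._buf.strip():
            raise ValueError(f"unterminated record at end of input: {self._buf!r}")


reader = StrictLineReader()
print(reader.feed("10\n20\n12"))  # the partial "12" stays buffered
print(reader.feed("345\n"))       # the record completes as 12345
reader.close()                    # nothing left over, so no error
```

With this framing, a stream that ends in "123" without a newline raises in `close()` instead of silently yielding 123.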

There's no justification for not using terminators, it's just a bad spec. Unsurprising story: the variant with better marketing has less technical chops.


I'm guessing: all values must always end with a terminator, but a separator doesn't need to be present after the last value

i.e. a document without a newline at the end is valid jsonl, but invalid ndjson



