Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Yeah, this basically boils down to "a potential pitfall, but consistent with documentation, and working as designed". Whether this actually matters depends on whether users are writing transaction functions which are intended to preserve some invariant, but would only do so if executed sequentially, rather than concurrently.

Datomic's position (and Datomic, please chime in here!) is that users simply do not write transaction functions like this very often. This is defensible: the docs did explicitly state that transaction functions observe the start-of-transaction state, not one another! On the other hand, there was also language in the docs that suggested transaction functions could be used to preserve invariants: "[txn fns] can atomically analyze and transform database values. You can use them to ensure atomic read-modify-update processing, and integrity constraints...". That language, combined with the fact that basically every other Serializable DB uses sequential intra-transaction semantics, is why I devoted so much attention to this issue in the report.

It's a complex question and I don't have a clear-cut answer! I'd love to hear what the general DB community and Datomic users in particular make of these semantics.



I feel like “enough rope to shoot yourself” is kind of baked into any high power, low ceremony tool.


As a proponent of just such tools I would say also that "enough rope to shoot(?) yourself" is inherent in tools powerful enough to get anything done, and is not a tradeoff encountered only when reaching for high power or low ceremony.


I always loved the broken phrase because it implies something really went terribly wrong ;)


I don't know whether it was intentional or not, but IIRC DataScript opted for sequential intra-transaction semantics instead.


It is worth noting here that Datomic's intra-transaction semantics are not a decision made in isolation, they emerge naturally from the information model.

Everything in a Datomic transaction happens atomically at a single point in time. Datomic transactions are totally ordered, and this ordering is visible via the time t shared by every datom in the transaction. These properties vastly simplify reasoning about time.

With this information model intermediate database states are inexpressible. Intermediate states cannot all have the same t, because they did not happen at the same time. And they cannot have different ts, as they are part the same transaction.


Thank you for the explanations. Do you happen to know why transactions ("transaction requests") are represented as lists and not sets?


When we designed Datomic (circa 2010), we were concerned that many languages had better support for lists than for sets, in particular list literals and no set literals.

Clojure of course had set literals from the beginning...


An advantage of using lists is that tx data tends to be built up serially in code. Having to look at your tx data in a different (set) order would make proofreading alongside the code more difficult.


Correct. I don't know about DataScript's intention, but it is intentional for Datalevin, as we have tests for sequential intra-transaction semantics.


The idea was that it allows you to add something first and then build on top of it, all in the same transaction


Yes. Perhaps this is a performance choice for DataScript since DataScript does not keep a complete transaction history the way Datomic does? I would guess this helps DataScript process transactions faster. There is a github issue about it here: https://github.com/tonsky/datascript/issues/366




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: