How RethinkDB Works – 7pm tonight at SF Data Engineering

cperciva · on Oct 16, 2013

> He will also talk about how RethinkDB fits into the CAP theorem (they chose CA)

There's no such thing as "choosing CA". You can't "choose" to not have partitions any more than you can "choose" to never have hard drives fail. The question is: when things fail (yes, when, not if), how does your system respond?

Alas, I'm in Vancouver, not San Francisco, so I can't attend and rant about this in person.

Shamanmuni · on Oct 16, 2013

CAP is about choosing a system's main guarantees. The P in CAP means "If there's a network partition which affects n machines, the rest of the system will take over that load and continue working just fine", for some number n of machines. Saying "they chose CA" means they don't provide that guarantee, so a network partition can cause problems (which, by the way, is how SQL databases usually operate). In RethinkDB's case, they chose to sacrifice availability by default when that happens (but you can tune it).

Take Riak, for example: they chose "AP", which means the system will continue working just fine through a network partition at the expense of not giving a strong consistency guarantee (eventual consistency). That doesn't mean the data isn't consistent at all, just that consistency isn't its main concern.
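
To make "eventual consistency" concrete, here is a minimal sketch of two replicas that keep accepting writes during a partition and converge afterwards. It assumes a simple last-write-wins merge on a caller-supplied timestamp; the `Replica` class and its methods are made up for illustration and are not Riak's actual mechanism or API.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy AP replica (hypothetical): always accepts writes, reconciles later."""
    name: str
    # Maps key -> (timestamp, value).
    data: dict = field(default_factory=dict)

    def write(self, key, value, timestamp):
        # Always succeeds, even when peers are unreachable (availability).
        self.data[key] = (timestamp, value)

    def read(self, key):
        # May return a stale value while replicas have diverged.
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def merge_from(self, other):
        # Last-write-wins: keep whichever entry has the newer timestamp.
        for key, (ts, value) in other.data.items():
            if key not in self.data or ts > self.data[key][0]:
                self.data[key] = (ts, value)


a, b = Replica("a"), Replica("b")

# During a partition each side accepts writes independently.
a.write("cart", ["book"], timestamp=1)
b.write("cart", ["pen"], timestamp=2)
diverged = (a.read("cart"), b.read("cart"))

# When the partition heals, the replicas exchange state and converge.
a.merge_from(b)
b.merge_from(a)
converged = (a.read("cart"), b.read("cart"))

print(diverged)
print(converged)
```

Both replicas end up agreeing, but the earlier write is silently dropped. That is the price of this particular merge rule, and it is why real systems often use something smarter than a bare timestamp comparison.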

jbp · on Oct 16, 2013

More information on what cperciva is saying (from Daniel Abadi):

http://dbmsmusings.blogspot.com/2012/10/ieee-computer-issue-...

Quoting from the post:

     "my past criticism of CAP not actually being about picking two of three out 
      of C (consistency), A (availability), and P (partition tolerance) due to the 
      fact that it does not make sense to reason about a system that is ‘CA’. (If
      there is no partition, any system can be both consistent and available ---  
      the only question is what happens when there is a partition --- does 
      consistency or availability get sacrificed?)"

rdtsc · on Oct 16, 2013

> You can't "choose" to not have partitions any more than you can "choose" to never have hard drives fail.

That also struck me as a little shady. I'll certainly wait for them to explain that one.

BTW, there is a fill-in-the-blanks form for this, created by Fred Hebert. I'm afraid it might apply here:

http://ferd.ca/beating-the-cap-theorem-checklist.html

seiji · on Oct 16, 2013

To nail down some definitions: C.A.P. = Consistency, Availability, Partition tolerance.

If their goal is hardline 'C', that means (pedantically) that if a partition is detected, the database reports back to the application: "database partition exists. denying all reads and writes until resolved."

Here, "tolerating" a partition requires merging structured data cleverly with CRDT-like things ("eventually"), or last write wins, or even better, random write wins (which is basically a traditional RDBMS approach). If they can't tolerate a partition that way but still want to claim 'A', then they can be read-available and maybe report metadata back to the client saying the database is in read-only mode due to partition crappage and data isn't going to update until the partition crappage resolves itself. Look! It's available! I can read from it!

(notably: in Amazon's original Dynamo case, write availability was more important than read availability, which is where ESCAPE comes in ("Eventually Somewhat Consistently Available Partition tolerant Engine"))
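
The two behaviours described above (refuse everything, or stay readable but frozen) can be sketched roughly as follows. This is a toy model under stated assumptions: `Node`, `PartitionError`, and the `policy` values `"strict"` and `"read_only"` are invented names for illustration, not settings from any real database.

```python
class PartitionError(Exception):
    """Raised when a request is refused because of a partition."""


class Node:
    """Toy database node; `policy` is a hypothetical knob, not a real setting."""

    def __init__(self, policy):
        if policy not in ("strict", "read_only"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.partitioned = False
        self.data = {}

    def write(self, key, value):
        # Both policies refuse writes during a partition to stay consistent.
        if self.partitioned:
            raise PartitionError("partition exists: writes denied")
        self.data[key] = value

    def read(self, key):
        if self.partitioned and self.policy == "strict":
            # Hardline 'C': refuse reads too until the partition resolves.
            raise PartitionError("partition exists: reads denied")
        # Otherwise answer from local data and flag it as possibly stale.
        return {"value": self.data.get(key), "maybe_stale": self.partitioned}


def try_read(node, key):
    """Return the read result, or the refusal message if the node refuses."""
    try:
        return node.read(key)
    except PartitionError as exc:
        return str(exc)


strict, relaxed = Node("strict"), Node("read_only")
for node in (strict, relaxed):
    node.write("k", 1)
    node.partitioned = True  # simulate the partition being detected

strict_result = try_read(strict, "k")
relaxed_result = try_read(relaxed, "k")
print(strict_result)
print(relaxed_result)
```

The `maybe_stale` flag plays the role of the metadata reported back to the client: the read succeeds, but the caller is told the value may no longer be current.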

vjoel · on Oct 16, 2013

According to http://www.rethinkdb.com/docs/architecture, you have a choice, for each read query, between C and A. The summary on meetup.com is apparently wrong.
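
A per-query choice between C and A might look something like the sketch below. To be clear, this is a toy model and not RethinkDB's driver API: `Table`, `allow_outdated`, `primary_reachable`, and `Unavailable` are names invented for illustration, so check the linked architecture page for the real option.

```python
class Unavailable(Exception):
    """Raised when a consistent read or write cannot reach the primary."""


class Table:
    """Toy table with one primary copy and one local replica (hypothetical)."""

    def __init__(self):
        self.primary = {}
        self.local_replica = {}
        self.primary_reachable = True

    def write(self, key, value):
        # Writes always go through the primary.
        if not self.primary_reachable:
            raise Unavailable("cannot reach primary")
        self.primary[key] = value

    def replicate(self):
        # Copy the primary's state to the replica when the link is up.
        if self.primary_reachable:
            self.local_replica = dict(self.primary)

    def read(self, key, allow_outdated=False):
        if allow_outdated:
            # Choose A: answer from the local replica, possibly stale.
            return self.local_replica.get(key)
        # Choose C: only the primary may answer.
        if not self.primary_reachable:
            raise Unavailable("cannot reach primary")
        return self.primary.get(key)


t = Table()
t.write("x", 1)
t.replicate()
t.write("x", 2)               # reaches the primary, not yet the replica
t.primary_reachable = False   # simulate a partition

stale = t.read("x", allow_outdated=True)
try:
    t.read("x")
    refused = False
except Unavailable:
    refused = True

print(stale, refused)
```

The same table gives an old answer or no answer depending on what the individual query asks for, which is the trade-off being made per read rather than once for the whole system.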

I am going to the meetup (but am not otherwise associated with rethinkdb).

cperciva · on Oct 16, 2013

Ah good, RethinkDB is doing this right. Shame that the meetup summary was so confused.

finnh · on Oct 16, 2013

Great doc reference, thanks. I've always been impressed by RethinkDB and so was surprised to see the blurb above about being "CA". Good to see that's not actually the party line.

Obligatory Coda Hale reference:

http://codahale.com/you-cant-sacrifice-partition-tolerance/

aaronblohowiak · on Oct 16, 2013

To extend what cperciva is saying: under partition, you MUST sacrifice either consistency or availability.

A good explanation is here: https://foundationdb.com/white-papers/the-cap-theorem/

petesoder · on Oct 16, 2013

Alas! Incidentally, the vid of the talk will be posted later on g33ktalk.com if you're interested.

perryh2 · on Oct 16, 2013

Will this be recorded?

dhruvkaran · on Oct 16, 2013

Why is RSVP gated on a person?

petesoder · on Oct 16, 2013

to keep out recruiters

dman · on Oct 16, 2013

Can this be recorded please?