How RethinkDB Works – 7pm tonight at SF Data Engineering (meetup.com)
27 points by petesoder on Oct 16, 2013 | 14 comments


He will also talk about how RethinkDB fits into the CAP theorem (they chose CA)

There's no such thing as "choosing CA". You can't "choose" to not have partitions any more than you can "choose" to never have hard drives fail. The question is, when things fail (yes, when, not if) how does your system respond?

Alas, I'm in Vancouver, not San Francisco, so I can't attend and rant about this in person.


CAP is about choosing a system's main guarantees. The P in CAP means "if there's a network partition which affects n machines, the rest of the system will take charge of that load and continue working just fine", for a variable number n. Saying "they chose CA" means they don't provide that guarantee, so a network partition could cause problems (which, by the way, is how SQL databases usually operate). In RethinkDB's case, they chose to sacrifice availability by default in those situations (but you can tune it).

Take Riak, for example: they chose "AP", which means the system will continue working just fine through a network partition, at the expense of strong consistency (it offers eventual consistency instead). That doesn't mean the data isn't consistent at all, just that consistency isn't its main concern.


More information on what cperciva is saying (from Daniel Abadi)

http://dbmsmusings.blogspot.com/2012/10/ieee-computer-issue-...

Quoting from the post:

     "my past criticism of CAP not actually being about picking two of three out
      of C (consistency), A (availability), and P (partition tolerance) due to the
      fact that it does not make sense to reason about a system that is ‘CA’. (If
      there is no partition, any system can be both consistent and available ---
      the only question is what happens when there is a partition --- does
      consistency or availability get sacrificed?)"


> You can't "choose" to not have partitions any more than you can "choose" to never have hard drives fail.

That also struck me as a little shady. I'll surely wait for them to explain that one.

BTW, there is a fill-in-the-blanks form for this, created by Fred Hebert. It might apply here, I am afraid:

http://ferd.ca/beating-the-cap-theorem-checklist.html


To nail down some definitions: C.A.P. = Consistency, Availability, Partition tolerance.

If their goal is hardline 'C,' that means (pedantically) if a partition is detected, the database reports back to the application "database partition exists. denying all reads and writes until resolved."

If you can't tolerate a partition but still want to claim 'A', you can stay read-available and maybe report metadata back to the client saying you're in read-only mode due to partition crappage and that data isn't going to update until the partition crappage resolves itself. Look! It's available! I can read from it! (Here "tolerating" a partition requires merging structured data cleverly with CRDT-like things ("eventually"), or last write wins, or even better, random write wins, which is basically a traditional RDBMS approach.)

(notably: in Amazon's original Dynamo case, write availability was more important than read availability, which is where ESCAPE comes in ("Eventually Somewhat Consistently Available Partition tolerant Engine"))
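
The two behaviors described above can be put side by side in a rough toy model. This is only a sketch under stated assumptions: `ToyNode`, `Mode`, `PartitionError` and the `maybe_stale` flag are hypothetical names invented for illustration, and none of it is RethinkDB's (or any real database's) API.

```python
from enum import Enum


class Mode(Enum):
    STRICT_C = "strict_c"    # hardline C: deny reads and writes while partitioned
    READ_ONLY = "read_only"  # claim read availability: serve reads, deny writes


class PartitionError(Exception):
    """Raised when the toy node refuses an operation because of a partition."""


class ToyNode:
    """Hypothetical single node; 'partitioned' is flipped by hand for the demo."""

    def __init__(self, mode):
        self.mode = mode
        self.data = {}
        self.partitioned = False

    def write(self, key, value):
        # Neither mode accepts writes during a partition in this sketch.
        if self.partitioned:
            raise PartitionError("partition exists: denying writes until resolved")
        self.data[key] = value

    def read(self, key):
        if self.partitioned and self.mode is Mode.STRICT_C:
            raise PartitionError("partition exists: denying reads until resolved")
        # In READ_ONLY mode the read succeeds, with metadata telling the
        # client that the value may not reflect writes on the other side.
        return {"value": self.data.get(key), "maybe_stale": self.partitioned}


if __name__ == "__main__":
    node = ToyNode(Mode.READ_ONLY)
    node.write("k", 1)
    node.partitioned = True
    print(node.read("k"))  # the read still works, flagged as possibly stale
```

With `Mode.STRICT_C` the same read raises `PartitionError` instead, which is the pedantic "deny everything until resolved" behavior.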


According to http://www.rethinkdb.com/docs/architecture, you have a choice, for each read query, between C and A. The summary on meetup.com is apparently wrong.
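
To make "a choice, for each read query, between C and A" concrete, here is a toy sketch. It assumes a single primary plus one lagging replica; `ToyTable`, `allow_stale`, `replicate` and `Unavailable` are made-up names for illustration and are not taken from the RethinkDB docs or driver.

```python
class Unavailable(Exception):
    """Raised when a consistent operation cannot reach the primary."""


class ToyTable:
    """Hypothetical table with one primary copy and one lagging replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.primary_reachable = True  # flipped by hand to simulate a partition

    def write(self, key, value):
        # Writes always go through the primary in this sketch.
        if not self.primary_reachable:
            raise Unavailable("primary unreachable: write refused")
        self.primary[key] = value

    def replicate(self):
        # Stand-in for asynchronous replication catching up.
        self.replica = dict(self.primary)

    def read(self, key, allow_stale=False):
        if allow_stale:
            # Choose A: answer from the local replica, possibly out of date.
            return self.replica.get(key)
        # Choose C: only the primary may answer, or the read fails.
        if not self.primary_reachable:
            raise Unavailable("primary unreachable: consistent read refused")
        return self.primary.get(key)


if __name__ == "__main__":
    t = ToyTable()
    t.write("a", 1)
    t.replicate()
    t.write("a", 2)              # replica has not seen this write yet
    t.primary_reachable = False  # partition cuts us off from the primary
    print(t.read("a", allow_stale=True))  # answers from the replica
    try:
        t.read("a")
    except Unavailable as exc:
        print("consistent read failed:", exc)
```

The point of the sketch is that the trade-off is made per call: the same table can serve one client a possibly outdated answer and refuse another client that insisted on consistency.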

I am going to the meetup (but am not otherwise associated with rethinkdb).


Ah good, RethinkDB is doing this right. Shame that the meetup summary was so confused.


Great doc reference, thanks. I've always been impressed by RethinkDB and so was surprised to see the blurb above about being "CA". Good to see that's not actually the party line.

Obligatory Coda Hale reference:

http://codahale.com/you-cant-sacrifice-partition-tolerance/


To extend what cperciva is saying: under partition, you MUST sacrifice either consistency or availability.

A good explanation is here: https://foundationdb.com/white-papers/the-cap-theorem/


Alas! Incidentally, the vid of the talk will be posted later on g33ktalk.com if you're interested.


Will this be recorded?


Why is RSVP gated on a person?


to keep out recruiters


Can this be recorded please?



