DocDB is built atop a very well battle-tested distributed systems framework with...

hendzen · on Aug 21, 2014

Another way to look at this is that it is based on Microsoft's internal distributed systems infrastructure and thus will never be open-sourced for the same reason Google will never open-source Spanner, Megastore or Colossus. Having battle-tested internal systems to build on is nice, but it means the DocumentDB code is probably nearly impossible to run outside of Microsoft.

dougws · on Aug 21, 2014

Your objections to MongoDB's model seem reasonable, but I don't see any evidence in either this comment or the linked blog post that DocumentDB is better (especially in the absence of benchmarks). What is this "battle-tested distributed systems framework"? Several of your complaints about MongoDB have to do with the interaction between persistence to disk and replication to the network; as the Multi-Paxos algorithm does not specify when data should be written to disk (much less what the format should be), what reason is there to believe that DocumentDB does this any better?

I'm totally willing to believe that DocumentDB beats the pants off MongoDB on just about every axis (in fact, that seems pretty likely), but it's going to take some actual numbers and a better description of the internals.

reubenbond · on Aug 21, 2014

I agree with you - we need numbers before making that kind of conclusion and I haven't run any benchmarks on the public version of DocDB. I'd like to see someone measure MongoDB on Azure vs DocDB on Azure - even then it might not be a fair measurement of db vs. db, since we don't know what machines DocDB is hosted on.

All I can really say is that the replication model provides a significant performance boost over MongoDB in the multiple replica (i.e., production) scenario.

We were using MongoDB at Microsoft for a while (I left MS almost a year ago). I was developing a real-time metrics system with it. It was very unstable at our target load (500k increments per minute, high percentage of tomorrow's documents preallocated the day before). We only managed maybe 10% of that with MongoDB, IIRC. Sometimes it would choke and not come back until I restarted the cluster (~30 machines total, I believe. 3 replicas * 10 shards).
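To put those load figures in per-second terms, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above (500k increments per minute, roughly 10% achieved, 10 shards) and assumes the load was spread evenly across shards, which may not have been the case.

```python
# Figures quoted in the comment above; the even spread across shards is an assumption.
TARGET_PER_MINUTE = 500_000
ACHIEVED_FRACTION = 0.10   # "maybe 10% of that", a recollection rather than a measurement
SHARDS = 10

target_per_second = TARGET_PER_MINUTE / 60
achieved_per_second = target_per_second * ACHIEVED_FRACTION
target_per_shard_per_second = target_per_second / SHARDS

print(f"target:           {target_per_second:,.0f} increments/s")
print(f"achieved (approx): {achieved_per_second:,.0f} increments/s")
print(f"target per shard: {target_per_shard_per_second:,.0f} increments/s")
```

So the target works out to a bit over 8,000 increments per second cluster-wide, and the achieved rate to under a thousand.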

We were so sure that MongoDB should be able to handle this scenario, since they talk about it in their documentation. After talking with the MongoDB devs, we came to the conclusion that even though we were issuing increment operations on preallocated documents, MongoDB was:

a) using a global lock on the "local" db used for replication, and

b) "replicating via disk" instead of via the network. In other words, replication requires writing to the journal before other members of the replica set have a chance to apply the change and ack back. This results in a loss of concurrency.
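A toy model of point (b) may make the concurrency loss concrete. This is only a sketch: the function names and the latency numbers are made up for illustration, and it does not describe MongoDB's or DocDB's actual internals. It just contrasts "journal first, then replicate" with "journal and replicate at the same time".

```python
def serial_commit_latency(journal_ms, replicate_ms):
    """Journal locally first, then ship the change to replicas and wait for acks."""
    return journal_ms + replicate_ms


def overlapped_commit_latency(journal_ms, replicate_ms):
    """Journal locally and ship to replicas concurrently; done when both finish."""
    return max(journal_ms, replicate_ms)


def max_writes_per_second(latency_ms, in_flight=1):
    """Upper bound on throughput when at most `in_flight` writes are outstanding."""
    return in_flight * 1000.0 / latency_ms


# Hypothetical latencies, chosen only to show the shape of the difference.
journal_ms, replicate_ms = 5.0, 2.0
serial = serial_commit_latency(journal_ms, replicate_ms)
overlapped = overlapped_commit_latency(journal_ms, replicate_ms)
print(serial, max_writes_per_second(serial))
print(overlapped, max_writes_per_second(overlapped))
```

Under a global lock (point a), the number of writes in flight is effectively one, so the per-write latency directly caps throughput; that is why the two issues compound.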

The lack of async query support in the C# driver didn't help either.

Eventually we used a replicated, write-back cache which sits atop the framework DocDB uses. Not a fair comparison, but the goal was achieved easily with 1/3rd the hardware. We just backed it onto Azure Table Storage. Our queries were all range queries, which table storage supports.
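For readers unfamiliar with the idea, here is a minimal sketch of a write-back cache for counters. It is not the system described above: the class name is hypothetical, a plain dict stands in for the backing store (Azure Table Storage in our case), and replication of the cache itself is left out entirely. It only shows why coalescing increments in memory cuts the number of trips to storage.

```python
from collections import defaultdict


class WriteBackCounterCache:
    """Accumulates increments in memory and writes them back in batches.

    Hypothetical illustration: `backing_store` is any dict-like object,
    standing in for a real table or document store.
    """

    def __init__(self, backing_store):
        self.backing_store = backing_store
        self.pending = defaultdict(int)   # key -> increments not yet written back
        self.backing_writes = 0           # how many writes reached the store

    def increment(self, key, delta=1):
        # No trip to the backing store here; just update the in-memory delta.
        self.pending[key] += delta

    def flush(self):
        # One write per dirty key, regardless of how many increments it absorbed.
        for key, delta in self.pending.items():
            self.backing_store[key] = self.backing_store.get(key, 0) + delta
            self.backing_writes += 1
        self.pending.clear()


store = {}
cache = WriteBackCounterCache(store)
for _ in range(1000):
    cache.increment("page_views:2014-08-21T10:00")
cache.flush()
print(store, cache.backing_writes)
```

The trade-off is that anything still pending is lost if the process dies before a flush, which is the reason the cache needs to be replicated in a real deployment.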

I can't talk about the framework, unfortunately.

ddorian43 · on Aug 22, 2014

Next time you need fast counters, try Hypertable (non-reading increments).

reubenbond · on Aug 22, 2014

It would still make sense to use the replicated write-back cache to avoid trips to disk. We were considering replacing MongoDB with Cassandra, though.

I wanted to avoid having to deploy and maintain a database system, so using table storage was a solid choice.