Symas Lightning Memory-Mapped Database (LMDB)

justin66 · on May 11, 2014

Recently deleted from Wikipedia as being not notable by a former Oracle employee.

http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion...

Edit: that page refers to a deletion that happened a year ago, but LMDB had a page much more recently than that (even a month or two ago, I think), so apparently deleting the LMDB page is a hobby for someone. I'm not familiar enough with the tools to dig in and figure out what's going on there.

sitkack · on May 12, 2014

http://en.wikipedia.org/wiki/User:ThurnerRupert/Lightning_Me...

This is a problem with wikipedia in general. I think archive.org is/should be archiving dumps.

sitkack · on May 11, 2014

I'm on mobile; maybe check Deletionpedia.

mitchellh · on May 11, 2014

As a data point, this is the underlying data storage mechanism we use for Consul (https://github.com/hashicorp/consul).

We also forked and improved some Go bindings to lmdb: https://github.com/armon/gomdb (which is the lib we use in Consul).

When building Consul, we were specifically looking for an in-process DB that supports MVCC. The reason is that while Consul is taking a snapshot, we wanted to be able to INSERT/UPDATE without affecting the integrity of the snapshot. LMDB fit this role nicely and the performance has been fantastic for our use case.
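
A minimal sketch of the snapshot property described above, assuming a toy copy-on-write store rather than LMDB's or gomdb's actual API (the class `ToyMVCCStore` and its methods are hypothetical): a reader pins the version that was current when it started, and later writes publish a new version without touching the pinned one.

```python
class ToyMVCCStore:
    """Toy copy-on-write key-value store (hypothetical, not the LMDB API)."""

    def __init__(self):
        # Each committed version is a dict that is never mutated after commit.
        self._versions = [{}]

    def snapshot(self):
        """Return the latest committed version for a reader to hold on to."""
        return self._versions[-1]

    def write(self, updates):
        """Commit updates by copying the latest version and publishing the copy."""
        new_version = dict(self._versions[-1])
        new_version.update(updates)
        self._versions.append(new_version)


store = ToyMVCCStore()
store.write({"node1": "alive"})

snap = store.snapshot()  # a long-running snapshot starts here
store.write({"node1": "failed", "node2": "alive"})  # the writer keeps going

print(snap)              # the snapshot still sees only the old state
print(store.snapshot())  # new readers see the update
```

A real engine would copy only the changed pages rather than the whole data set; the full copy here just keeps the sketch short.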

joshu · on May 11, 2014

What is the difference between consul and serf? The top-level explanation is very similar.

jdf · on May 12, 2014

http://www.consul.io/intro/vs/serf.html

pkieltyka · on May 12, 2014

That's interesting! Since Consul and Serf are written in Go, have you considered using boltdb? https://github.com/boltdb/bolt It's an LMDB implementation in Go, with really clean code.

hyc_symas · on May 12, 2014

Bolt is still a lot slower than LMDB/GoLMDB. http://eagain.net/talks/go-nuts-and-bolts/slides.html#33

And still very immature.

ddorian43 · on May 11, 2014

Hustle is a distributed, column oriented, relational OLAP Database that uses lmdb.

http://chango.github.io/hustle/

pdq · on May 11, 2014

Howard Chu, the author of LMDB, gave a great presentation on LMDB: http://parleys.com/play/517f58f9e4b0c6dcd95464ae/chapter0/ab...

He can also rock the violin.

rdtsc · on May 11, 2014

Benchmarks are very impressive:

http://symas.com/mdb/microbench/

Multi-threaded, vs memcache

http://symas.com/mdb/memcache/

lafar6502 · on May 11, 2014

Please remember that this database has no write-ahead log and therefore is not too fast for write-intensive applications. But I really like it and regret that I learned about it only recently.

PS: read the benchmark descriptions carefully. Some of the benchmarks are performed in memory, without actual disk I/O.

_wmd · on May 11, 2014

This is only part of the story - LMDB lacks a WAL because it uses shadow paging, which is a complementary technique with totally different characteristics more suited to read-optimized loads.

Similarly "not too fast for write-intensive apps" is only half the story - LMDB write transactions have a fixed cost related to the tree depth, so while many tiny updates may incur a noticeable fixed penalty in some cases, that cost becomes less noticeable e.g. with larger transactions performing reasonably localized updates (say, to partially sequential key ranges).

Also note that even for huge databases, the write overhead of shadow paging is usually somewhere south of 64 KB per transaction. To reiterate, the fixed cost is essentially per-transaction rather than per-update.
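
A rough way to see the per-transaction cost described above, as a sketch under stated assumptions: a perfectly balanced tree, a fanout of 100 entries per page, a depth of 4 levels and 4096-byte pages are all illustrative numbers, not measurements, and meta pages and free-list bookkeeping are ignored. The function `pages_copied` is hypothetical; it counts the distinct pages on the root-to-leaf paths of the updated keys, which is what a copy-on-write commit has to rewrite in this model.

```python
PAGE_SIZE = 4096  # assumed page size in bytes
FANOUT = 100      # assumed entries per page
DEPTH = 4         # assumed tree depth (root to leaf), room for 100**4 keys


def pages_copied(keys, fanout=FANOUT, depth=DEPTH):
    """Count distinct pages on the root-to-leaf paths of `keys` in a perfect tree."""
    touched = set()
    for key in keys:
        for level in range(depth):
            # Level 0 is the leaf; the page index shrinks by `fanout` per level up.
            touched.add((level, key // fanout ** (level + 1)))
    return len(touched)


single = pages_copied([42])
batched = pages_copied(range(1000))                      # one transaction, 1000 adjacent keys
separate = sum(pages_copied([k]) for k in range(1000))   # 1000 one-key transactions

print(single, single * PAGE_SIZE)      # pages and bytes for one tiny update
print(batched, batched * PAGE_SIZE)    # pages and bytes for the batched transaction
print(separate, separate * PAGE_SIZE)  # pages and bytes for the unbatched updates
```

In this model a lone update rewrites one page per tree level, while a batch of localized updates shares almost all of its upper-level pages, so the cost per update drops sharply as the transaction grows.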

hyc_symas · on May 12, 2014

For some write-intensive applications it is faster than everything else. E.g. http://symas.com/mdb/hyperdex/

Depends a lot on your key and value sizes - the larger the values, the faster LMDB is vs any other solution, due to the zero-copy behavior.
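
To illustrate what zero-copy means here, a small sketch using Python's standard `mmap` module (not the LMDB API; the temporary file and the 1 MB value are made up for the example): a reader that is handed a view into a memory-mapped file can look at a large value without the bytes being copied into a separate buffer first.

```python
import mmap
import os
import tempfile

# Write a small file that stands in for a database file holding one large value.
value = b"x" * 1_000_000
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(value)

with open(path, "rb") as f:
    mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mapped)  # a view over the mapping, no bytes copied
    chunk = view[1000:2000]    # slicing a memoryview does not copy either
    first_byte = chunk[0]      # read straight from the mapped pages
    copied = bytes(chunk)      # an explicit copy, made only when requested
    # Views must be released before the mapping can be closed.
    chunk.release()
    view.release()
    mapped.close()

os.remove(path)
```

The larger the value, the more a copy-based read path has to move per lookup, whereas the cost of handing out a view like `chunk` does not grow with the value size.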

lafar6502 · on May 21, 2014

_wmd, hyc_symas: I understand the principles of LMDB operation. The worst-case scenario is lots of random writes that hit random pages in the db file. In such a case the I/O will also be heavy and random, which leads to bad disk performance (not so bad with an SSD, though). But you're right, if you avoid the worst case, LMDB shines.

hyc_symas · on May 22, 2014

I'm pretty sure the worst case is purely size-dependent.

The write pattern in the HyperDex benchmark is pure random but it still obliterates HyperLevelDB (and don't even think of vanilla LevelDB there). Keep in mind that LMDB uses free pages in sorted order, so even for a random write workload, pages are allocated in ascending order, which generally translates to unidirectional seeks on an HDD. This is why even for a purely random write load, LMDB is still faster than all of the traditional update-in-place B-trees out there - they really have to do random seeks, LMDB doesn't. (And you pay the highest in seek time when the drive head has to reverse direction.)

That's why LMDB write performance remains uniform under load, while all LSMs suffer from GC/compaction pauses.

But ultimately, storage is moving to solid state, and seek time will be irrelevant.
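
The point about free pages being used in sorted order can be sketched with a toy model (the page numbers, the seed and the helper `direction_reversals` are all hypothetical, and LMDB's real free-list handling is more involved): when every new page is taken from the lowest free slot, the sequence of written page numbers only ascends, even if the keys being updated are random, whereas an update-in-place tree writes wherever each key already lives.

```python
import heapq
import random


def direction_reversals(positions):
    """Count how often consecutive moves switch between forward and backward."""
    reversals = 0
    previous_step = 0
    for a, b in zip(positions, positions[1:]):
        step = b - a
        if step * previous_step < 0:
            reversals += 1
        if step != 0:
            previous_step = step
    return reversals


rng = random.Random(0)  # fixed seed so the run is repeatable
free_pages = rng.sample(range(100_000), 1_000)   # hypothetical free page numbers
random_keys = rng.sample(range(100_000), 1_000)  # pages an in-place tree would hit

# Copy-on-write with a sorted free list: each new page is the lowest free slot.
heap = list(free_pages)
heapq.heapify(heap)
cow_writes = [heapq.heappop(heap) for _ in random_keys]

# Update-in-place: the page written is wherever the key already lives.
in_place_writes = random_keys

print(direction_reversals(cow_writes))       # 0: page numbers only ascend
print(direction_reversals(in_place_writes))  # many reversals for random targets
```

On a spinning disk each reversal in the second sequence corresponds to the head changing direction, which is the expensive case mentioned above.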

apendleton · on May 11, 2014

I'd be curious to see how it compares to LSM, Sqlite4's new underlying KVS.

hyc_symas · on May 12, 2014

The sqlite4 authors tested it. Never published the results, AFAIK. Probably because sqlite4 LSM is a pig. http://www.sqlite.org/src4/info/51816384756d6c620e991bb4ef81...