
If any Cloudflare employees end up here who helped decide on Cap'n Proto over other stuff (e.g. protobuf), what considerations went into that choice? I'm curious if the reasons will be things important to me, or things that you don't need to worry about unless you deal with huge scale.


Here's a blog post about Cloudflare's use of Cap'n Proto in 2014, three years before I joined: https://blog.cloudflare.com/introducing-lua-capnproto-better...

To this day, Cloudflare's data pipeline (which produces logs and analytics from the edge) is largely based on Cap'n Proto serialization. I haven't personally been much involved with that project.

As for Cloudflare Workers, of course, I started the project, so I used my stuff. Probably not the justification you're looking for. :)

That said, I would argue the extreme expressiveness of Cap'n Proto's RPC protocol compared to alternatives has been a big help in implementing sandboxing in the Workers Runtime, as well as distributed systems features like Durable Objects. https://blog.cloudflare.com/introducing-workers-durable-obje...


I don't work at Cloudflare but follow their work and occasionally work on performance sensitive projects.

If I had to guess, they looked at the landscape a bit like I do and regarded Cap'n Proto, FlatBuffers, SBE, etc. as being in one category, apart from other data formats like Avro, protobuf, and the like.

So once you're committed to record'ish shaped (rather than columnar like Parquet) data that has an upfront parse time of zero (nominally, there could be marshalling if you transmogrify the field values on read), the list gets pretty short.

https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-... goes into some of the trade-offs here.
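
To make the "upfront parse time of zero" idea above concrete, here is a minimal sketch. The record layout, field names, and the RecordView class are all made up for illustration; this is not Cap'n Proto's actual wire format or API. It only shows the general pattern of reading fields at fixed offsets on access instead of decoding the whole message first.

```python
import struct

# Hypothetical fixed layout (NOT Cap'n Proto's real wire format):
# little-endian uint64 timestamp, uint32 status, uint32 bytes_sent.
RECORD = struct.Struct("<QII")


def encode(timestamp, status, bytes_sent):
    """Serialize one record into its fixed 16-byte layout."""
    return RECORD.pack(timestamp, status, bytes_sent)


class RecordView:
    """Wraps a buffer; each field is decoded only when it is accessed."""

    def __init__(self, buf, offset=0):
        self._buf = memoryview(buf)  # no copy of the underlying bytes
        self._offset = offset

    @property
    def timestamp(self):
        return struct.unpack_from("<Q", self._buf, self._offset)[0]

    @property
    def status(self):
        return struct.unpack_from("<I", self._buf, self._offset + 8)[0]

    @property
    def bytes_sent(self):
        return struct.unpack_from("<I", self._buf, self._offset + 12)[0]


# Two records back to back, as they might arrive off the wire.
buf = encode(1_700_000_000, 200, 5120) + encode(1_700_000_001, 404, 312)

# Constructing the view does no decoding; only the field we touch is read.
second = RecordView(buf, offset=RECORD.size)
print(second.status)  # prints 404
```

A consumer that only needs one or two fields per record never pays to decode the rest, which is roughly why this family of formats is attractive when the reading side is the bottleneck.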

Cap'n Proto was originally made for https://sandstorm.io/. That work (which Kenton has presumably done at Cloudflare since he's been employed there) eventually turned into Cloudflare Workers.

Another consideration: https://github.com/google/flatbuffers/issues/2#issuecomment-...


Aside from CF Workers using Cap'n Proto, how is it related to Cap'n Proto or Sandstorm?


They are all projects I started.

But other than who worked on them, and sharing some technology choices under the hood, there's mostly no relationship between Workers and Sandstorm.


To summarize something from a little over a year after I joined there: Cloudflare was building out a way to ship logs from its edge to a central point for customer analytics and serving logs to enterprise customers. As I understood it, the primary engineer who built all of that out, Albert Strasheim, benchmarked the most likely serialization options available and found Cap'n Proto to be appreciably faster than protobuf. It had a great C++ implementation (which we could use from nginx, IIRC with some Lua involved), and while the Go implementation, which we used on the consuming side, had its warts, folks were able to fix the key parts that needed attention.

Anyway. Cloudflare's always been pretty cost-efficient machine-wise, so it was a natural choice given the performance needs we had. In my time on the data team there, Cap'n Proto was always pretty easy to work with, and sharing proto definitions from a central schema repo worked pretty well, too. Thanks for your work, Kenton!


Albert here :) The decision basically came down to the following: we needed to build a 1 million events/sec log processing pipeline, and had only 5 servers (more were on the way, but would take many months to arrive at the data center). So v1 of this pipeline was 5 servers, each running Kafka 0.8, a service to receive events from the edge and put them into Kafka, and a consumer to aggregate the data. To squeeze all of this onto this hardware footprint, I spent about a week looking for a format that optimized for deserialization speed, since we had a few thousand edge servers serializing data, but only 5 deserializing. Cap'n Proto was a good fit :)


And that's how I first learned about Cloudflare. Which I eventually joined, and built Cloudflare Workers. :)


The lead dev of Cloudflare Workers is the creator of Cap'n Proto, so that likely made it an easy choice.


The article says they were using it before hiring him though, so there must have been some prior motivation:

> In fact, you are using Cap’n Proto right now, to view this site, which is served by Cloudflare, which uses Cap’n Proto extensively (and is also my employer, although they used Cap’n Proto before they hired me)


The post states "they used Cap’n Proto before they hired me".


He helped build the Workers platform after they hired him.



