ICMP packets pretty much always carry some data (even though it's not _strictly_ required). This data is what gets padded out when the user asks for a ping with a specific packet size (e.g., ping -s 1472 to exercise a 1500-byte MTU path when debugging MTU issues).
In some applications, the ICMP payload, and the quote of the original IP header plus the first 8 bytes of the original packet that comes back in ICMP error messages, is part of how the application works. For example, traceroute utilises the fact that it gets part of its probe back in an ICMP TTL-exceeded message to identify _which_ traceroute probe is being responded to.
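To make that concrete, here's a minimal sketch (mine, not from the comment above; it assumes classic UDP-based traceroute over IPv4, and a buffer that starts at the ICMP header) of how those quoted bytes identify the probe:

```python
import struct

def match_probe(icmp_msg: bytes) -> int | None:
    """Return the UDP destination port of the probe that triggered this
    ICMP Time Exceeded message, or None if we can't match it."""
    if icmp_msg[0] != 11:             # ICMP type 11 = Time Exceeded
        return None
    quoted = icmp_msg[8:]             # the ICMP header is 8 bytes; the quote
                                      # (original IP header + 8 bytes) follows
    ihl = (quoted[0] & 0x0F) * 4      # length of the quoted IPv4 header
    if quoted[9] != 17:               # IP protocol 17 = UDP
        return None
    # The 8 quoted payload bytes are the original UDP header:
    # source port, destination port, length, checksum.
    _sport, dport, _len, _csum = struct.unpack("!HHHH", quoted[ihl:ihl + 8])
    return dport                      # traceroute varies this per probe
```

Classic traceroute sends each probe to a different destination port (starting at 33434), so the quoted port is enough to tell which probe, and hence which TTL, the error corresponds to.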
Oh, it definitely is (re: SF being into neighbourhoods)! Cool project.
Kudos on recognising “Fairmount” rather than lumping it in with Glen Park or Noe Valley. (I don’t /really/ have a complex about this - but the neighbourhood does have some interesting history!)
As always, Bruce raises a good discussion here, but I’m disappointed in the lack of depth of this analysis. The article, to me, characterises this as a religious discussion, choosing between simple ECMP/multipath and MPLS-TE. I think this ignores the business reality of why one looks at deploying traffic engineering within a network, and the available approaches. I’m also a little disappointed that it characterises Google’s B4 and Microsoft’s SWAN as networks that rely on RSVP-TE (to my knowledge they do not, see the B4 paper). To my mind, these are demonstrations of the utility of traffic engineering independently of the mess that distributed traffic engineering with RSVP-TE creates.
(My background: I’ve inherited RSVP-TE deployments in a number of continent-wide and global networks — which has involved driving standards to improve its scalability, and subsequently driving segment routing in the industry and in production deployments.)
The issue one has at any kind of scale is that it is non-trivial to acquire capacity that is equivalent across the different optimisation dimensions of your network. For example, a network I worked on could acquire only limited capacity on Europe-to-India cable systems; the alternate routes were of significantly different latency, but there was significant EU-India demand in the network. What were the options for placing this traffic on the network? IGP weights - sure - but this means there is no selective placement of that traffic (i.e., everyone has to take the same route), which one might not be able to support commercially. Looking beyond that, there were limited options _other_ than MPLS-TE based on RSVP-TE. Path Computation Element (PCE) support, even when it emerged, was RSVP-TE-centric.

So, commercially, those networks didn’t have a choice about whether to deploy traffic engineering — and it wasn’t for want of trying. Significant cable system deployments have been driven on routes such as the one that I mention above, so there was capital to be deployed to fix the problem; it’s just that such systems take years to build. Did their architects want to deploy RSVP-TE? Pre-SDN (and pre-SDN-in-the-WAN, like B4 and SWAN), what option did they have to meet that business requirement? I would postulate very little (at least at the time that I was engaged in these discussions there were no clear alternatives). In fact, I would postulate that the existence of TE in B4 and SWAN shows that traffic engineering has practical value: greenfield, ground-up systems still implemented it.
RSVP-TE itself, though, was not well thought through. The systems-design discussion that I think is very interesting here is what lessons we can learn from such a technology. Distributed state in the network that causes large amounts of signalling following failures, and that requires midpoints to be aware of (and admit) every demand that traverses them, is fragile by its very nature. The scaling analysis that was done during the architectural work (RFC 5439, for example) did not think about the RSVP-TE distributed system’s different points of dynamism — it concentrated on steady-state cost. But we’ve demonstrated time and again in production (over many, many years) that, practically, the system’s scaling was dominated by the cost of dynamic resignalling following events rather than by steady-state utilisation (I’ve presented extensively on this, see https://research.google/pubs/pub45800/ and https://youtu.be/NtED7CUHLNE).
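A rough back-of-envelope makes the difference visible (every figure below is an illustrative assumption of mine, not a measurement from any of these networks):

```python
# Assumed figures for a single busy midpoint; tweak to taste.
lsps_through_midpoint = 20_000   # transit LSPs whose state this node holds
refresh_interval_s = 30          # classic RSVP soft-state refresh timer
msgs_per_resignal = 4            # rough Path/Resv/PathTear exchanges per re-route
convergence_target_s = 5         # window in which re-routing should complete

# Steady state: refreshes are spread across the refresh interval.
steady_state = lsps_through_midpoint / refresh_interval_s
# Failure of a shared resource: every affected LSP resignals in a burst.
failure_burst = lsps_through_midpoint * msgs_per_resignal / convergence_target_s

print(f"steady-state refresh load: ~{steady_state:,.0f} msgs/s")  # ~667
print(f"post-failure burst load:  ~{failure_burst:,.0f} msgs/s")  # ~16,000
```

Even with generous assumptions, the transient burst is more than an order of magnitude above the steady-state load - and the burst is what you actually have to size the control plane for.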
Rather than raising the question, from a systems-design perspective, of whether MPLS-TE was a religious mistake — let’s raise the one about how complex distributed systems with multiple equipment vendors (i.e., not ecosystems that are controlled by one party) can iterate on solving business problems without religiously filling in the gaps of protocols that don’t work. In my view, answering that would be a huge step forward for the networking industry.
Finally, let’s not fall into the trap of over-generalising. Higher-scale networks within a limited geography (terrestrial UK, US, etc.) may not have the same business considerations — and therefore may not need the same approach to traffic placement. There is, as always, a set of trade-offs here.
> I’m also a little disappointed that it characterises Google’s B4 and Microsoft’s SWAN as networks that rely on RSVP-TE (to my knowledge they do not, see the B4 paper).
I work for one of the vendors that makes the boxes that run B4 and SWAN. You are absolutely right.
I am coeliac (en-US: celiac), a condition that I (along with a number of other things) developed after an infection in 2019. One of the symptoms that many coeliac disease sufferers experience is brain fog after having ingested gluten (colloquially, "getting glutened"). My specialist predicted that there would be a significant uptick in many other post-infectious conditions following the COVID pandemic, and I'm sorry to see that, going by articles like this, that prediction seems to be coming true.
It is almost impossible to describe the feeling of not being able to think in this way. I'm a senior software engineer at a large company; I spend much of my time diving into different code bases, and in meetings where I am often unfamiliar with the specifics of a situation and need to quickly reload context. When I have brain fog, I absolutely cannot do these things - I need to sit and prepare for a meeting, and even then I can't think quickly enough in it to comment on anything in a meaningful way. Creativity is not possible; I can't think around a problem at all. Understanding unfamiliar code becomes extremely taxing (if not impossible). Whilst I'm not as badly affected as some of the folks described in the article, coherently forming sentences the way I normally would (I'm someone that "thinks out loud" often) is just not possible. It's debilitating.
For folks that don't have it, it can be hard to explain. Like someone in this article, I tend to just cancel things when I'm in this state (which can last for days, or weeks -- luckily I get to emerge from it as my body recovers and I'm able to eat again). I just need to sleep - partially because of the emotional toll of the frustration of being cognitively impaired, but also because of the physical toll a flare-up takes on my body. I'm not really making this post (and sharing things that I probably usually wouldn't) for any reason other than to say to folks: hey, be understanding with your coworkers; these conditions are poorly understood and difficult to deal with. We all tend to feel guilty for not pulling our weight while we're sick, so sometimes just knowing that our coworkers are cutting us some slack really helps :-) Thanks!
100% agreed. And maybe keep this in mind when people are taking more precautions than you think are necessary.
It kind of reminds me of “having the wind leave my sails.” Like, I know I could do this thing previously, but today there just isn’t a way to get going.
So, what does one do if one needs an off-road-capable vehicle with large storage? I do not drive to commute. I have a large dog, bicycles, camping gear, ski gear, etc. that accompany me and my passengers on trips. By this logic, these consumers (who might have more disposable income) should not move earlier to make a climate difference (when they may, and probably do, drive thousands of kilometres per year)!
Nuance. These clickbait articles really miss this. I wish we had better editors.
SRv6 is not going to transform the quality of experience/quality of service that you see from Internet applications. Traffic engineering technologies like this (MPLS-based, IP-based, emulated-circuit-based...) are used inside networks to select paths through them; this has been done for many years, and the segment routing data plane - whether it be MPLS or IPv6 - is just a different realisation of how to achieve that path selection through a network. There are networks that have done this traffic engineering using IP encapsulation for many years.
The whole "IP 2.0" presumption that appears to be being made here is that suddenly some external traffic source will be able to select a route through someone else's network -- but this just isn't the commercial reality. The more performant paths are going to have costs associated with them (even if it's just the cost of building more capacity), so there is going to be a cost to choosing such a route through the network. That cost is going to need to be covered somewhere - so you are very unlikely to be able to choose a path without some commercial contract. Guess what? We've already had those -- they just tend to use the DSCP bits to indicate the traffic class, and hence the requested SLO, rather than an explicitly chosen path.
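For illustration, a minimal sketch of that existing mechanism (the address and code point below are just examples, and socket.IP_TOS is the Linux spelling; whether the network honours the marking is down to your contract):

```python
import socket

DSCP_EF = 46  # Expedited Forwarding: a common "premium" traffic class

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# DSCP occupies the top 6 bits of the old IPv4 TOS byte (the low 2 bits are ECN).
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, DSCP_EF << 2)
sock.sendto(b"marked traffic", ("192.0.2.1", 5000))  # RFC 5737 example address
```

No topology knowledge is required: the sender asks for a class of service, and the operator decides how (and whether) to deliver it.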
Equally, let's think about how this would even work - if you are going to choose a path through the network to get better QoS, you're going to need to know something about what IDs to use, which implies knowledge of the topology. Inter-domain topology exposure is going to /significantly/ increase the complexity and fragility of inter-domain routing -- there are reasons that we don't run a global link-state protocol :-)
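To make the topology point concrete, here's a minimal sketch (mine, with illustrative placeholder addresses) of building an SRv6 segment routing header per RFC 8754. The "chosen path" is literally a list of IPv6 segment IDs that the sender has to know in advance:

```python
import socket
import struct

def build_srh(segments: list[str]) -> bytes:
    """Pack an SRv6 Segment Routing Header (RFC 8754) for the given path."""
    # Segment List[0] carries the *last* segment of the path, so reverse.
    seg_bytes = b"".join(socket.inet_pton(socket.AF_INET6, s)
                         for s in reversed(segments))
    return struct.pack(
        "!BBBBBBH",
        59,                  # Next Header: 59 = no next header (illustrative)
        2 * len(segments),   # Hdr Ext Len: 8-octet units beyond the first 8
        4,                   # Routing Type: 4 = Segment Routing
        len(segments) - 1,   # Segments Left: segments still to be visited
        len(segments) - 1,   # Last Entry: index of the last segment
        0,                   # Flags
        0,                   # Tag
    ) + seg_bytes

# Every one of these segment IDs belongs to someone else's network:
srh = build_srh(["2001:db8::1", "2001:db8::2", "2001:db8::3"])
```

Getting those IDs out of another operator's domain is exactly the inter-domain topology exposure that doesn't happen today.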
In conclusion - I think this is hype with little technical justification, and is unlikely to have any different impact than other intra-domain traffic engineering that the industry has been running for many years.
Sure, MPLS works, just as combustion engines have worked for as long as we can remember, but that isn't stopping EVs from emerging as the vehicle platform of the near future, whether we like it or not. Now we have hybrid MPLS, i.e. SR-MPLS, as a stopgap measure, similar to what we have now with PHEVs, etc. Personally I think the networking industry has already learned its lessons from the ATM days, and hopefully SRv6 will take off for better and more effective IP-based networking.
I don't understand what this means. This was not a defence of MPLS, but rather an observation that the fundamental dynamics and business logic around externally-selected TE paths do not change because we change which header instructs the network to steer packets. It's still not going to make sense to have external users try and use the "premium" paths in the network without some recompense to keep scaling them; equally, there needs to be some incentive for users to choose a "less than best" path.
Quite honestly -- this kind of marketing hype and hyperbole is what is wrong with the whole area of SR today (and I say this as someone that was _very_ involved). We've completely lost the ability to say what it is we're solving, and why we're doing it. We're driving disparate architectures into silicon where there's opportunity cost for the functionality. We're having political disagreements within the IETF based on folks trying to keep political control of technologies, not worrying about the efficacy for the industry. It's all pretty broken. YMMV.
Let us step back and look at the current mess of existing IP networks, with all the duct tape involved (looking at you, NAT). Any QoS that end users might want only makes dollar sense for the big companies; that's why we have the fine print about best-effort service written into almost every Telco's EULA. What IPv6-based technology such as SRv6 is doing is trying to democratize QoS so that it's hopefully affordable to the average Joe. Perhaps it's still a pipe dream at the moment, but given the current situation I will take SRv6 over MPLS, or any expensive TE, any day. My networking utopia is local-first software, independent of cloud vendors and Telcos, that can utilize the network based on its required and necessary QoS. I foresee that the best way forward is this new, promising, and more affordable technology offering what the incumbent technology cannot provide.
The most important thing is to identify the key characteristics of the person you're trying to hire. Can you teach them the technology if they don't know it, so long as they're able to interact with your customers in a helpful manner? Or maybe it'd be OK for them to be a bit gruff if they could handle all your DevOps tasks and take those off your plate. Figure that out - and then bias for those characteristics.
Even then, you absolutely cannot distill how successful an employee someone will be down to a single document; but equally, you can't interview everyone, so there's some level of pragmatism that you have to exercise to determine who to interview. My experience here is that the filter criteria don't have to be the typical screening that a bunch of us are probably frustrated by (e.g., "does the resumé mention Python? No. Route it to /dev/null").
I've tried the following.
1. I'll filter for folks that I think have demonstrated some interesting commitment on their journey to where they are. For example, interesting side projects alongside their professional experience, or a non-traditional route to their role (e.g., didn't attend college for the "expected" subject). My experience has been that these folks often have different insights, which has correlated with being easy to work with and fruitful for the teams that I've engaged with.
2. I'll try to identify folks who are able to communicate in their resumé some particular outcome that shows they had a wider perspective than their immediate role. To me, this tended to show that the person was able to think a bit wider than the task on their plate, which again has correlated with being a good teammate.
An industry mentor of mine who runs R&D for a large and successful software/hardware business unit once explained their approach to building teams to me in a really interesting way. They wanted experienced folks on the team, so that the team didn't re-learn lessons it should already have learnt, and to instil the culture and best practices of good developers -- but they also wanted junior engineers on the team to drive a healthy disregard for what was asserted to be impossible, or written off as folly based on prior experience. Their experience was that this balance helped grow great engineers from the junior folks, whilst keeping the senior folks on their toes -- seeing that their bias against an idea wasn't always right.
Something that stuck with me from this conversation: We naturally become more conservative in terms of what we view as possible as we become more experienced -- after all, we tried X before, and it really didn't work out -- let's not waste our time doing X again...!
I interviewed tens of candidates at a FAANG last year. A significant number of them were older than me, and a non-trivial proportion of those were interviewing for positions more junior than mine. We hired a good number of those folks. If you see age bias in the interviewing process, that's already a red flag that the team doesn't clearly understand how to make the best of the pool of engineers it could potentially hire, and I'd avoid them for that reason.
I'd also love to hire experienced folks into my team. A great thing about the FAANG that I'm at is that we're explicitly empowered in the hiring process - and hiring managers really _have_ to care about what interviewers think. I would encourage not writing this class of company off as being only youth-focused. :-)
There is no single answer that is going to be universally applicable (surprise, surprise :-)). Oftentimes, the right design is going to be informed by some view of the practicalities of actually trying to implement it. Equally, there will be many problems where diving straight into filling the buffer of your text editor, as the only way of designing the code, will result in kludgey solutions to many cases that weren't immediately apparent when one started thinking about the problem.
To combat the fact that there's no right answer here, the teams that I work in have adopted a couple of strategies.
1. Start out with a reasonable sketch of the high-level design of the system and/or library that is being written. Thinking through this high-level approach lets everyone start with a reasonable shared mental model of how things will look as development proceeds. It also means that some of the questions of "which of the more complex cases that we're going to come across are we going to solve, and which can be a TODO" can be decided up-front.
2. After this initial design exercise, do design in small chunks, iteratively. Take the smallest possible set of functionality, and have a developer on the team (not just the tech lead...) determine the options for its design and propose one as the solution. These small units of design can be informed however the developer wishes -- i.e., they can do prototyping if it makes sense.
3. At all costs, avoid development approaches and team dynamics where refactoring is not considered an option, or is discouraged. As you observe in your question, getting the design right might involve understanding more of the problem, or the requirements, and that might need the "oh, we should have solved this like that!" moment that only comes from having implemented something, or from watching your users use the solution you built entirely differently to how you expected. Ensure that there's a team culture of openness to the idea that you definitely won't be right the first time. Perfect is the enemy of "done" or "shipped".
Personally, I prefer to start to think about the problem in an abstract form before coding, because this allows me to separate the consideration of the best way to write maintainable, testable code from the problem of the system, library or application design. Of course, YMMV, but this is what works for me :-)