Thanks for sharing your report, it's frustrating to see things like this break in minor patch updates. Small tip for GitHub Gist: set the file format to markdown (give it a .md extension) so that the markdown will be rendered and won't require horizontal scrolling :)
The report says it broke when updating from macOS 15 to 26, so not a minor patch update. I'm a bit surprised no one noticed this earlier though, since 26 has been out since September and in beta since June.
We want to provide some additional information on the power issue in a single Availability Zone in the ME-CENTRAL-1 Region. At around 4:30 AM PST, one of our Availability Zones (mec1-az2) was impacted by objects that struck the data center, creating sparks and fire. The fire department shut off power to the facility and generators as they worked to put out the fire. We are still awaiting permission to turn the power back on, and once we have, we will ensure we restore power and connectivity safely. It will take several hours to restore connectivity to the impacted AZ. The other AZs in the region are functioning normally.
Most likely, that was debris from missiles flying overhead being intercepted and destroyed on their way to a military target.
AFAIK, there have been no confirmed signs of civilian sites being targeted directly, and it would also be unlikely that actual missiles would cause so little damage that you could patch your datacenter up and get it ready to go within hours.
Thanks for the links, which I've reviewed. Allow me to clarify: I meant sources that confirm that the civilian places hit (eg. hotels and residential buildings) were the actual targets.
Local and official news all say that these were hit by debris from intercepted missiles/drones (on their way to somewhere else). There is a major difference between this, vs. if those buildings were directly being targeted.
AFAICT your linked sources indicate that the oil installations and ports were targets, but not the hotels and buildings.
I'm asking in good faith as this makes a significant difference.
I don't see a large difference between a civilian port, a civilian oil facility, or a civilian aluminum factory vs. a hotel on the question of whether the Iranians are capable of targeting a civilian data center. However, assuming you are curious, here goes:
Finding these takes time, so I'm sorry, but this is going to be the last of these sources I'll paste. For example, a Bahrain luxury apartment building being hit:
This reminds me of a visit to an Equinix data centre where the sales person was droning on and on about how incredibly reliable their power supplies were, how uninterruptible everything was, etc, etc…
Essentially, he was trying to assure us that no-no-no, we don’t need multiple zones like the public clouds, they can instead guarantee 100% uninterrupted power under all circumstances.
A bit bored and annoyed, I pointed to the giant red button conspicuously placed in the middle of a pillar and asked what it is for.
“Oh, that’s in case there’s a fire!”
“What does it do?”
“It cuts… the power… uhh… for the safety of the fire department.”
“So… if there’s a wisp of smoke in a corner somewhere, the fireys turn up, the first thing they do is… cut the power?”
Equinix in Sydney plonked 2 datacenters right on top of each other, and still insists that they are useful as redundant sites.
There was a locally very funny situation for a while when a tech influencer was insisting both Equinix sites could be shut down by a single building collapse. He was wrong, but not so wrong that people shouldn't be making better infrastructure decisions.
The building in question wasn't really tall enough, and it would have had to be precision-demolished to collapse in the way he was afraid of.
It would still cause chaos and possible power issues.
This needs to be taken in context with some Sydney buildings developing structural defects a few years after opening, largely due to inferior materials imported from China. The building in question developed some cracks in supporting beams and was briefly evacuated. There was never a chance it was going to topple on its own in a way that impacted more than one of the two datacenters, so he pivoted to possible terrorism, but even that's largely nonsensical.
I just went hunting for the case and couldn't find it. The gentleman in question had published the claim through his business, which, as it happens, was trying to build contacts with defense and intelligence agencies for third-party threat assessment. As far as I can tell the business no longer exists and he has deleted its footprint.
But he also posted the claim on public mailing lists so I can probably trawl it up if necessary.
My boss at my first job hit the Big Red Button by swinging his arms too wide in our datacenter one day, shutting down hundreds of servers and the mainframe, wreaking havoc for days!
That was when we installed the Big Clear Button Cover.
> we will ensure we restore power and connectivity safely
This would require human intervention, and I am a bit worried: what if another strike happens and human lives are lost?
IIRC there have been cases in history where the same location is targeted across multiple days. Obviously, AWS might have local employees working in the region, but would this threat itself be evaluated by the relevant team within AWS? What if they try to bring the service back, missiles strike again, and human lives are lost? Let's just hope that this is part of the evaluation as well.
But I mean, are the employees safe at home?
I guess if they really targeted the data center then home is safer, but in the fog of war maybe the data center wasn't the target?
> But I mean, are the employees safe at home? I guess if they really targeted the data center then home is safer, but in the fog of war maybe the data center wasn't the target?
My gut feeling says that they would be safer at home than at the datacenter. The only attacks I have heard of so far are the ones on hotels, this datacenter, etc.
> but in the fog of war maybe the data center wasn't the target?
We can't say this for sure, but even if that was the case, I do think they would see that some damage was caused and then try to double-tap it for even more damage. So chances are, even if it wasn't the target previously, it might be the target now?
100%, absolutely, but it's a bit worrying that in the future multiple AZs/datacenters could start to get targeted: attacking the datacenters within a particular region so that the service would have a hard time.
I guess someone can use another region's DCs to get more than one (regional?) AZ, but for mission-critical infra I can see that sometimes having issues too, and you genuinely can't predict any of this.
That being said, there should be reliance on more than one AZ, but IMO off-site or multi-cloud backups should also be preferred/used as well.
> What about local hospitals which may have service from that data center? There are heroes needed everywhere, all the time.
Off-site backups and a multi-cloud strategy, with data encrypted (and the key kept safe, which is the key point), might be a better strategy for such mission-critical infrastructure.
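A rough sketch of what I mean, with made-up paths and bucket names (the point being that the key or passphrase lives somewhere other than the clouds holding the ciphertext):

    # Encrypt locally, then ship the ciphertext to two independent providers.
    # All names and paths below are hypothetical.
    tar czf - /srv/app-data \
      | gpg --symmetric --cipher-algo AES256 -o backup.tar.gz.gpg
    # gpg prompts for a passphrase; store it somewhere other than the
    # clouds that hold the ciphertext (that's the "key safe" part).
    aws s3 cp backup.tar.gz.gpg s3://example-primary-backups/
    gsutil cp backup.tar.gz.gpg gs://example-secondary-backups/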
I'm sure bezos will be really happy someone is being a hero for him in a war zone while he sails his newest yacht to wherever the new version of the island is.
on second thought there is a difference between restoring critical infrastructure in times of crisis vs restoring bot infrastructure for indian spamming operations. choose wisely
Here is something that gets lost in all the excitement about AI productivity: most software engineers became engineers because they love writing code.
I think there's a big split between those who derive meaning and enjoyment from the act of writing code or the code itself vs. those who derive it from solving problems (for which the code is often a necessary byproduct). I've worked with many across both of these groups throughout my career.
I am much more in the latter group, and the past 12mo are the most fun I've had writing software in over a decade. For those in the first group, it's easy to see how this can be an existential crisis.
Right. Most of the news articles don't link to the decision, which is worth reading.
It's a 6-3 decision. Not close.
Here's the actual decision:
> The judgment of the United States Court of Appeals for the Federal Circuit in case No. 25–250 is affirmed. The judgment of the United States District Court for the District of Columbia in case No. 24–1287 is vacated, and the case is remanded with instructions to dismiss for lack of jurisdiction.
So what does that mean in terms of action?
It means this decision [1] is now live. The vacated decision was a stay, and that's now dead.
So the live decision is now:

> We affirm the CIT’s holding that the Trafficking and Reciprocal Tariffs imposed by the Challenged Executive Orders exceed the authority delegated to the President by IEEPA’s text. We also affirm the CIT’s grant of declaratory relief that the orders are “invalid as contrary to law.”
"CIT" is the Court of International Trade. Their judgement [2], which was unanimous, is now live.
It reads:
"The court holds for the foregoing reasons that IEEPA does not authorize any of the
Worldwide, Retaliatory, or Trafficking Tariff Orders. The Worldwide and Retaliatory Tariff
Orders exceed any authority granted to the President by IEEPA to regulate importation by means
of tariffs. The Trafficking Tariffs fail because they do not deal with the threats set forth in those orders. This conclusion entitles Plaintiffs to judgment as a matter of law; as the court further finds no genuine dispute as to any material fact, summary judgment will enter against the United States. See USCIT R. 56. The challenged Tariff Orders will be vacated and their operation permanently enjoined."
So that last line is the current state: "The challenged Tariff Orders will be vacated and their operation permanently enjoined." Immediately, it appears.
A useful question for companies owed a refund is whether they can use their credit against the United States for other debts to the United States, including taxes.
“Based on two words separated by 16 others, the President asserts the independent power to impose tariffs on imports from any country, of any product, at any rate, for any amount of time. Those words cannot bear such weight.”
Zing! Surprisingly spicy writing for such a gravely serious body.
The Gorsuch concurrence is quite the read, but I wish more Americans internalized its final paragraph (excerpts below).
Yes, legislating can be hard and take time. And, yes, it can be tempting to bypass Congress when some pressing problem arises. But the deliberative nature of the legislative process was the whole point of its design. ...
But if history is any guide, the tables will turn and the day will come when those disappointed by today’s result will appreciate the legislative process for the bulwark of liberty it is.
I agree with Gorsuch, and I love this idea, but until the legislative branch abandons procedures that prevent the deliberation from happening in the first place, this will keep happening.
There is a balance to be struck to avoid a completely ineffectual congress but I'm not sure a legislative body biased towards action is one you would actually want. Making it easier to kill bills than pass them has a natural stabilizing effect which I think is a net good for the country.
Hmm, I read some of the decision, and now I'm not sure what to make of all of it.
When I came to the opinion from Jackson, J., I found it extremely compelling. She says this:
... But some of TWEA’s sections delegating this authority had lapsed, and “there [was] doubt as to the effectiveness of other sections.” Accordingly, Congress amended TWEA in 1941, adding the subsection that includes the “regulate ... importation” language on which the President relies today. The Reports explained Congress’s primary purpose for the 1941 amendment: shoring up the President’s ability to control foreign-owned property by maintaining and strengthening the “existing system of foreign property control (commonly known as freezing control).”
When Congress enacted IEEPA in 1977, limiting the circumstances under which the President could exercise his emergency authorities, it kept the “regulate ... importation” language from TWEA. The other two relevant pieces of legislative history—the Senate and House Reports that accompanied IEEPA—demonstrate that Congress’s intent regarding the scope of this statutory language remained the same. As the Senate Report explained, Congress’s sole objective for the “regulate ... importation” subsection was to grant the President the emergency authority “to control or freeze property transactions where a foreign interest is involved.” The House Report likewise described IEEPA as empowering the President to “regulate or freeze any property in which any foreign country or a national thereof has any interest.”
However, then I read Kavanaugh, J. who writes the following:
In 1971, President Nixon imposed 10 percent tariffs on almost all foreign imports. He levied the tariffs under IEEPA’s predecessor statute, the Trading with the Enemy Act (TWEA), which similarly authorized the President to “regulate ... importation.” The Nixon tariffs were upheld in court.
When IEEPA was enacted in 1977 in the wake of the Nixon and Ford tariffs and the Algonquin decision, Congress and the public plainly would have understood that the power to “regulate ... importation” included tariffs. If Congress wanted to exclude tariffs from IEEPA, it surely would not have enacted the same broad “regulate ... importation” language that had just been used to justify major American tariffs on foreign imports.
And I also find this compelling.
To add onto this, Roberts, C. J. says: IEEPA’s grant of authority to “regulate ... importation” falls short. IEEPA contains no reference to tariffs or duties. The Government points to no statute in which Congress used the word “regulate” to authorize taxation. And until now no President has read IEEPA to confer such power.
This seems directly contradictory to Kavanaugh, J.'s dissent! Kavanaugh, J. claims that Nixon used the word “regulate” to impose tariffs. And apparently the word isn't just in some random other statute — Nixon did so under TWEA, the predecessor of IEEPA: when Congress enacted IEEPA in 1977 it kept the “regulate ... importation” language from TWEA. (from Jackson, J.) So the point that no President has read IEEPA to confer such power seems pretty weak, when Nixon apparently did so under TWEA.
I have no conclusion from this, but IMO both Jackson, J. and Kavanaugh, J. have pretty strong points in opposing directions.
Kavanaugh’s reasoning is that a wartime law, TWEA, can be congruent to a peacetime law, IEEPA. The rest of the court acknowledged that the President always had control of tariffs during war.
As somebody whose first day working at Heroku was the day this acquisition closed, I think it’s mostly a misconception to blame Salesforce for Heroku’s stagnation and eventual irrelevance. Salesforce gave Heroku a ton of funding to build out a vision that was way ahead of its time. Docker didn’t even come out until 2013, AWS didn’t even have multiple regions when it was built. They mostly served as an investor and left us alone to do our thing, or so it seemed those first couple years.
The launch of the multi-language Cedar runtime in 2011 led to incredible growth, and by 2012 we were drowning in tech debt and scaling challenges. Despite more than tripling our headcount in that first year (~20 to 74), we could not keep up.
Mid 2012 was especially bad as we were severely impacted by two us-east-1 outages just 2 weeks apart. To the extent it wasn’t already, reliability and paying down tech debt became the main focus and I think we went about 18 months between major user-facing platform launches (Europe region and eventually larger sized dynos being the biggest things we eventually shipped after that drought). The organization lost its ability to ship significant changes or maybe never really had that ability at scale.
That time coincided with the founders taking a step back, leaving a loss of leadership and vision that was filled by people more concerned with process than results. I left in 2014 and at that time it already seemed clear to me that the product was basically stalled.
I’m not sure how much of this could have been done better even in hindsight. In theory Salesforce could have taken a more hands on approach early on but I don’t think that could have ended better. They were so far from profitability in late 2010 that they could not stay independent without raising more funding. The venture market in ~2010 was much smaller than a few years later—tiny rounds and low valuations. Had the company spent its pre-acquisition engineering cycles building for scalability & reliability at the expense of product velocity they probably would have never gotten successful.
Even still, it was the most amazing professional experience of my career, full of brilliant and passionate people, and it’s sad to see it end this way.
It remains the greatest engineering team I've ever seen or had the pleasure to be a part of. I was only there from early 2011 to mid 2012 but what I took with me changed me as an engineer. The sheer brilliance of the ideas and approaches...I was blessed to witness it. I don't think I can overstate it, though many will think this is all hyperbole. I didn't always agree with the decisions made and I was definitely there when the product stagnation started, but we worked hard to reduce tech debt, build better infrastructure, and improve... but man, the battles we fought. Many fond memories, including the single greatest engineering mistake I've ever seen made, one that I still think about to this day (but will never post in a public forum :)).
I'm just going to chime in here and say thank you. There still really isn't, in my mind, a comparable offering to Heroku's git push and go straight to a reasonable production.
I honestly find it a bit nuts. There are offerings that come close, but using them I still get the impression that they just haven't put in the time really refining that user interface. So I just wanted to say thank you for the work you and the GP did; it was incredibly helpful and, I'm happy to say, helped me launch and test a few product offerings as well as some fun ideas.
It absolutely boggles my mind that nothing else exists to fill this spot. Fly and others offer varying degrees of easier-than-AWS hosting, but nobody offers true PaaS like Heroku, IMHO.
The Heroku style of PaaS just isn't very interesting to most large businesses that actually pay for things. The world basically moved on to Kubernetes-based products (see Google and Red Hat)--or they just shut down, like a lot of Cloud Foundry-based products. Yes, many individuals and smaller shops care more about simplicity, but they're generally not willing/able to pay a lot (if anything).
It seems like you’re right, but it’s strange that the data world seems to be moving in the opposite direction, with PaaS products like Snowflake, Databricks, Microsoft Fabric, even Salesforce’s own Data Cloud eating the world.
PaaS has always been this thing that isn't pure infrastructure or pure hosted software that you use as-is. Salesforce draws something over 100K partner and user attendees to its annual conference. It's always been this in-between thing with a fairly loose definition. I'd argue that Salesforce was long a cross between SaaS (for the users) and PaaS (for developers). You can probably apply the same view to a lot of other companies' products.
In 2022 Render increased their prices (which for my team worked out at a doubling of costs) with a one month notice period and the CEO's response to me when I asked him if he thought that was a fair notice period was that it was operationally necessary and he was sorry to see us go.
It's a natural law to pay based on what you consume. Whenever that isn't required, you're typically being subsidized in one form or another, but don't be surprised when it goes away.
Heroku and Ruby, for me, was the 21st century answer to 'deploying' a PHP site over FTP.
The fact that it required nothing but 'git push heroku master' at the time was incredible, especially for how easy it was to spin up pre-prod environments with it, and how wiring up a database was also trivial.
Every time I come across an infrastructure that is bloated out with k8s, helm charts, and a complex web of cloud resources, all for a service not even running at scale, I look back to the simplicity we used to have.
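For anyone who never used it, the whole lifecycle looked roughly like this (from memory, so treat the exact addon and app names as approximate):

    heroku create my-app                     # hypothetical app name
    heroku addons:create heroku-postgresql   # DATABASE_URL injected into the app's config
    git push heroku master                   # build and release happen server-side
    heroku open                              # app is live
    heroku logs --tail                       # aggregated logs, one command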
I completely agree that there's nothing comparable to old-school Heroku, which is crazy. That said, Cloudflare seems promising for some types of projects and I use them for a few things. Anyone using them as a one-stop-shop?
For me Northflank have filled this spot. Though by the time I switched I was already using Docker so can't speak directly to their Heroku Buildpack support.
Vercel goes a step further and (when configured this way) allocates a new hostname (e.g. feature-branch-add-thingg.app.vercel.example.com) for new branches, to make testing even easier.
This looks nice! Wish they had a no-credit-card-required version for educational purposes. For the course I teach we use Spring Boot, and life was good with Heroku till they discontinued the no-credit-card version; the only choice we then had (with support for Spring Boot) was to move over to Azure, which works but is a bit overkill and complicated for our purposes. I guess we could just use Docker, and then many more platforms would become available, but I'd rather not add one more step to the pipeline if possible.
I’ve been using Render for close to 5 years, and it’s excellent. I can’t think of anything I use that it doesn’t do as well or better than Heroku did last I checked.
I agree with everything you said, and can only thank the founders for their tremendous insight and willingness to push the limits. The sheer number of engineering practices we take for granted today because of something like Heroku boggles my mind.
I am forever grateful for the opportunity to work there and make an effort to pass on what I learned to others.
> by 2012 we were drowning in tech debt and scaling challenges.
> the greatest engineering team I've ever seen
How do these two things reconcile, in your opinion? In my view, doing something quickly is the easy part; good engineering is only needed exactly when you want things to be maintainable and scalable, so the assertions above don't really make much sense to me.
It is hard to explain the impact of such massive growth over a 2-3 year period. New features were coming online while old ones were being abused by overuse. For instance, we launched PostgreSQL in the cloud, something we take for granted today. Not only that, but we offered an insane feature set around "follow" and "forking" that made working with databases seem futuristic.
I remember when we launched that product we went to that year's PGCon and there were people in the crowd angry and dismissive that we would treat data that way. It was actually pretty confrontational. Products like that were being produced while we were also working on migrating away from the initial implementation of the "free tier" (internally called Shen). It took me and a few others months to replace it and ensure we didn't lose data while also making it maintainable. The resulting tool, lovingly named "yobuko", ended up remaining for years after that (largely due to the stagnation and turnover).
Anyways, that was just a slice of it. Decisions made today are not always the decisions you wanted to be made tomorrow. Day0 is great, day100 comes with more knowledge and regret. :D
In general, my impression has been that you don't want to architect your solution at first for massive scaling, because:
* You probably aren't going to need it, so putting the effort into scaling means slowing down your delivery of the very features that would make customers want your solution.
* It typically slows down performance of individual features.
* It definitely significantly increases the complexity of your solution (and probably the user-facing tooling as well).
* It is difficult to achieve until you have the live traffic to test your approach.
Yeah I think there is a lot of truth here. You can't solve all the problems and in Heroku's case we focused on user experience (internally and externally). Great ideas like "git push heroku main" are game changers, but what happens once that git server is receiving 1000 pushes a minute? Totally different thought process.
Perhaps the thing I would add is that even with the tech debt and scaling problems we still had over a million applications deployed ready for that request to hit them.
Well, many of them you may already know, in that they made their way into so many systems (though arguably without the refined UX of Heroku), but here are the two that come up the most when I am teaching others:
* The simpler the interface for the user, the more decisions you can make behind the scenes. A good example here is "git push heroku". Not only is that something every user (bot or human) can run, it is also easy to script, protect, and scale. It keeps the surface area small and the abstraction makes the most sense. The code that was behind that push lasted for quite some time, and it was effectively 1-2 Python classes as a service. But once we got the code into our systems, we could do anything with it... and we did. One of the things that blows my mind is that our "slug" (this is what we called the tarballs of code that we put on the runtimes) maker was itself a Heroku app. It lived alongside everyone else. We were able to reduce our platform down to a few simple pieces (what we called the kernel) and everything else was deployed on top. We benefited from the very things our customers were using.
* NOTE: This one is going to be hard to explain because it is so simple, but when you start thinking about system design in this way the possibilities start to open up right in front of you.
The idea is that everything we do is effectively explained as "input -> filter -> output". Even down to the CPU. But especially when it comes to a platform. With this design mentality we had a logging pipeline that I am still jealous of. We had metrics flowing into dashboards that were everywhere and informed us of our work. We had things like "integration testing" that ran continuously against the platform, all from the user's perspective, which allowed us to test features long before they reached the public. All of these things were "input" that we "filtered" in some way to produce "output". When you start using that "output" as "input" and chaining these things together you get to a place where you can design a "kernel" (effectively an API service and a runtime) and start interacting with it to produce a platform.
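The closest everyday analogue is the Unix pipeline, where every stage has exactly that shape and any output can be fed back in as input. A made-up example (hypothetical log path and format):

    # input: raw router logs; each stage is a filter; the final output
    # could itself become input to a dashboard or alerting stage.
    grep 'status=5' /var/log/platform/router.log |  # keep only 5xx responses
      awk '{ print $3 }' |                          # extract the app id field
      sort | uniq -c | sort -rn |                   # count errors per app
      head -20                                      # output: top offenders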
I remember when we were paring down services to get to our "kernel", one of the Operations engineers developed our Chef setup so that an internal engineer needed maybe 5-7 lines of Ruby to deploy their app and get everything they needed. Simple input that produced a reliable application setup, which could now get us into production faster.
Thanks for sharing your story. Those early days of using Heroku were really enjoyable for me. It felt so promising and simple. I remember explaining the concept to a lot of different people who didn't believe that the deployment model could be that simple and accessible until I showed them.
Then life went on, I bounced around in my career, and forgot about Heroku. Years later I actually suggested it for someone to use for a simple project once and I could practically feel the other developers in the room losing respect for me for even mentioning it. I hadn't looked at it for so long that I didn't realize it had fallen out of favor.
> That time coincided with the founders taking a step back, leaving a loss of leadership and vision that was filled by people more concerned with process than results
This feels all too familiar. All of my enjoyable startup experiences have ended this way: The fast growing, successful startup starts attracting people who look like they should be valuable assets to the company, but they're more interested in things like policies, processes, meetings, and making the status reports look nice than actually shipping and building.
Having been on a bigco team that underwent the same sort of headcount growth in a very short time I have to imagine that "more than tripling our headcount in that first year" was likely more a driver of the inability to keep up than a solution. That's not a knock on the talents of anyone hired; it's just exceedingly difficult to simultaneously grow a team that fast and maintain any kind of velocity regardless of the complexity of the problems you're trying to tackle. The culture and knowledge that enabled the previous team's velocity just gets completely diluted.
FWIW, the team that eventually created "Docker" was working at the same time on dotCloud as a direct Heroku competitor. I remember meeting them at a meet-up in the old Twitter building but couldn't tell you exactly which year that was. Maybe 2010 or 2011?
Yep, that team did great work. I remember having lunch at the Heroku office with the dotCloud team in 2011 or 2012 and also Solomon Hykes demoing Docker to us in our office’s basement before it launched. So much cool stuff was happening back then!
Edit: I also uploaded it to Preservetube when I got interested in Wayback Machine alternatives (given that the Wayback Machine suggests not uploading videos unless absolutely necessary, due to storage constraints).
So I found Preservetube, which is sort of intended for this, and uploaded the video there.
I worked with some of the folks from there, and honestly you make it sound like tech debt is an inevitable consequence haunting projects from year one.
I disagree. I think the folks just did a sloppy "let's bungee-strap it all together" job for speed, instead of more serious planning and architecting. They self-inflicted the tech debt and got drowned in the debt interest super fast.
There's someone out there who built the scalable version of Heroku at another garage startup. But we never heard of them because Heroku beat them to market.
As far as the Salesforce acquisition goes, I'd be curious to see who made the decision to put Heroku into maintenance only mode.
I worked for a different part of Salesforce. I don't really feel like Salesforce did a ton of meddling in any of its bigger acquisitions other than maybe Tableau. I think the biggest missed opportunity was potentially creating a more unified experience between all of its subsidiaries. Though, it's hard to call that a failure since they're making tons of money.
It could be a case of post-founder leadership seeing that there's not a lot of room for growth and giving up. That happens a lot in the tech industry.
You'll need to unlock your iPhone first. Even though you're staring at the screen and just asked me to do something, and you saw the unlocked icon at the top of your screen before/while triggering me, please continue staring at this message for at least 5 seconds before I actually attempt FaceID to unlock your phone to do what you asked.
None of that means Oban or similar queues don't/can't scale—it just means a high volume of NOTIFY doesn't scale, hence the alternative notifiers and the fact that most of its job processing doesn't depend on notifications at all.
There are other reasons Oban recommends a different notifier per the doc link above:
> That keeps notifications out of the db, reduces total queries, and allows larger messages, with the tradeoff that notifications from within a database transaction may be sent even if the transaction is rolled back
No, I don't think so. Oban does not rely on a large volume of NOTIFY in order to process a large volume of jobs. The insert notifications are simply a latency optimization for lower volume environments, and for inserts can be fully disabled such that they're mainly used for control flow (canceling jobs, pausing queues, etc) and gossip among workers.
River for example also uses LISTEN/NOTIFY for some stuff, but we definitely do not emit a NOTIFY for every single job that's inserted; instead there's a debouncing setup where each client notifies at most once per fetch period, and you don't need notifications at all in order to process with extremely high throughput.
In short, the fact that high volume NOTIFY is a bottleneck does not mean these systems cannot scale, because they do not rely on a high volume of NOTIFY or even require it at all.
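To make the shape concrete, here's a minimal sketch of the debounced pattern at the psql level. The database, table, and channel names are made up, and this is not River's or Oban's actual implementation:

    # Terminal 1: a worker subscribes once; psql prints any notification
    # that arrives while it waits between fetches.
    psql jobs_db -c 'LISTEN job_available;' -c 'SELECT pg_sleep(60);'

    # Terminal 2: a producer inserts a whole batch of jobs but emits at
    # most ONE notify for the batch (the debouncing idea), not one per row.
    psql jobs_db <<'SQL'
    INSERT INTO jobs (queue, args)
    SELECT 'default', '{}'::jsonb FROM generate_series(1, 1000);
    NOTIFY job_available, 'default';
    SQL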
Does River without any extra configuration run into scaling issues at a certain point? If the answer is yes, then River doesn’t scale without optimization (Redis/Clustering in Oban’s case).
While the root cause might not be River/Oban, the claim that they don't scale still holds true. It's of extra importance given that the context of this post is moving away from Redis to strictly a database-backed queue system.
It does not scale forever, and as you grow in throughput and job table size you will probably need to do some tuning to keep things running smoothly. But after the amount of time I've spent in my career tracking down those numerous distributed systems issues arising from a non-transactional queue, I've come to believe this model is the right starting point for the vast majority of applications. That's especially true given how high the performance ceiling is on newer / more modern job queues and hardware relative to where things were 10+ years ago.
If you are lucky enough to grow into the range of many thousands of jobs per second then you can start thinking about putting in all that extra work to build a robust multi-datastore queueing system, or even just move specific high-volume jobs into a dedicated system. Most apps will never hit this point, but if you do you'll have deferred a ton of complexity and pain until it's truly justified.
I get the temptation to attribute the popularity of these systems to lazy police with nothing better to do, but from personal experience there’s more to it.
I live in a medium sized residential development about 15 minutes outside Austin. A few years ago we started getting multiple incidents per month of attempted car theft where the thieves would go driveway to driveway checking for unlocked doors. Sometimes the resident footage revealed the thieves were armed while doing so. In a couple of cases they did actually steal a car.
The sheriffs couldn’t really do much about it because a) it was happening to most of the neighborhoods around us, b) the timing was unpredictable, and c) the manpower required to camp out to attempt to catch these in progress would be pretty high.
Our neighborhood installed Flock cameras at the sole entrance in response to growing resident concerns. We also put in a strict policy around access control by non law enforcement. In the ~two years since they were installed, we’ve had two or three incidents total whereas immediately prior it was at least as many each month. And in those cases the sheriffs could easily figure out which vehicles had entered or left during that time. I continue to see stories of attempted car thefts from adjacent neighborhoods several times per month.
I totally get the privacy concerns around this and am inherently suspicious of any new surveillance. I also get the reflexive dismissal of their value. In this case it has been a clear win for our community through the obvious deterrent factor and the much higher likelihood of having evidence if anything does happen.
Our Flock cameras do not show on the map here, btw.