Hacker News | bgentry's comments

Thanks for sharing your report, it's frustrating to see things like this break in minor patch updates. Small tip for GitHub Gist: set the file format to markdown (give it a .md extension) so that the markdown will be rendered and won't require horizontal scrolling :)
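A minimal illustration of that tip, with a made-up filename:

```shell
# Gist decides whether to render a file as Markdown from its extension,
# so renaming the report is all it takes (report.txt is a hypothetical name).
echo "# Crash report" > report.txt
mv report.txt report.md
```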

The report says it broke when updating from macOS 15 to 26, so not a minor patch update. I'm a bit surprised no one noticed this earlier though, since 26 has been out since September and in beta since June.

The important quote from the timeline:

Mar 01 9:41 AM PST

We want to provide some additional information on the power issue in a single Availability Zone in the ME-CENTRAL-1 Region. At around 4:30 AM PST, one of our Availability Zones (mec1-az2) was impacted by objects that struck the data center, creating sparks and fire. The fire department shut off power to the facility and generators as they worked to put out the fire. We are still awaiting permission to turn the power back on, and once we have, we will ensure we restore power and connectivity safely. It will take several hours to restore connectivity to the impacted AZ. The other AZs in the region are functioning normally.


> impacted by objects

Well that's a way of phrasing it.

I assume these were missiles.


Most likely, those were debris from the interception of missiles flying overhead and being destroyed on their way to a military target.

AFAIK, there have been no confirmed signs of civilian sites being targeted directly, and it would also be unlikely that actual missiles would cause so little damage that you could patch your datacenter up and get it ready to go within hours.


That's incorrect; multiple hotels have been attacked, and recently oil facilities in Saudi Arabia.


Can you please share sources?


refineries, gas fields - https://www.reuters.com/business/energy/saudi-aramco-shuts-r...

dubai airport, residential buildings, hotels, ports - https://www.reuters.com/world/middle-east/several-loud-blast...

https://www.lemonde.fr/en/international/article/2026/03/01/i...

Bahrain - https://bh.usembassy.gov/security-alert-update-5-u-s-embassy...

You can google for other targets in Qatar, Oman, Jordan, Iraq, Cyprus and Kuwait


Thanks for the links, which I've reviewed. Allow me to clarify: I meant sources confirming that the civilian places hit (e.g. hotels and residential buildings) were the actual targets.

Local and official news all say that these were hit by debris from intercepted missiles/drones (on their way to somewhere else). There is a major difference between this, vs. if those buildings were directly being targeted.

AFAICT your linked sources indicate that the oil installations and ports were targets, but not the hotels and buildings.

I'm asking in good faith as this makes a significant difference.


I don't see a large difference between a civilian port, a civilian oil facility, or a civilian aluminum factory vs. a hotel on the question of whether the Iranians are capable of targeting a civilian data center. However, assuming you are curious, here goes.

Finding these takes time, so I'm sorry if this is the last of these sources I'll paste. For example, a Bahrain luxury apartment building being hit:

https://edition.cnn.com/world/video/bahrain-iran-drone-strik...

US warning that high-rise buildings in Bahrain are being targeted by Iranian drones:

https://x.com/TravelGov/status/2027843430987010446


This reminds me of a visit to an Equinix data centre where the sales person was droning on and on about how incredibly reliable their power supplies were, how uninterruptible everything was, etc, etc…

Essentially, he was trying to assure us that no-no-no, we don’t need multiple zones like the public clouds, they can instead guarantee 100% uninterrupted power under all circumstances.

A bit bored and annoyed, I pointed to the giant red button conspicuously placed in the middle of a pillar and asked what it is for.

“Oh, that’s in case there’s a fire!”

“What does it do?”

“It cuts… the power… uhh… for the safety of the fire department.”

“So… if there’s a wisp of smoke in a corner somewhere, the fireys turn up, the first thing they do is… cut the power?”

“… yes.”

“Not 100% then, is it?”


Equinix in Sydney plonked 2 datacenters right on top of each other, and still insists that they are useful as redundant sites.

There was a locally very funny situation for a while when a tech influencer was insisting both equinix sites could be shut down by a single building collapse. He was wrong, but he wasn't so wrong that people shouldn't make better infrastructure decisions.


The two in Alexandria are of course just down the road from the airport. A widebody jet cratering in that location will put them both out of business.


None of the runways point towards them; it's hard to imagine that scenario.


34R does. The immediate right turn in the 34R ILS and RNP missed approach procedures takes you pretty much directly overhead Equinix.

Aircraft routinely overfly the location either on departure from 34R or approach for 16L.


They're also below sea level.


There's an Energex office building in Brisbane that has a backup generator below sea level. (Guess when they need to use the generator lmao.)


If it's a once in 25 year flood then it won't be a problem for the person who made that decision!


I'm intrigued...how was he wrong?


The building in question wasn't really tall enough, and would have to be precision-demolished to collapse in the way he was afraid of.

It would still cause chaos and possible power issues.

Needs to be taken in context with some Sydney buildings having maintenance defects a few years after they open, largely due to inferior materials imported from China. The building in question developed some cracks in supporting beams and was briefly evacuated. There was never a chance it was going to topple on its own in a way that impacted more than one of the two datacenters, so he pivoted to possible terrorism, but even that's largely nonsensical.

I just went hunting for the case and couldn't find it. The gentleman in question had published the claim through his business, which as it happens was trying to build contacts with defense and intelligence agencies for third-party threat assessment. As far as I can tell the business no longer exists and he has deleted its footprint.

But he also posted the claim on public mailing lists so I can probably trawl it up if necessary.


Should have pushed it.


At The Planet in Dallas, c. 2002, the EPO button was exposed with no cover, in very close proximity to the "Exit" button for the doors...

One day, a colo customer hit the wrong button on the way out, and uhh, there was an outage.



How did you find out about the outage?


My boss at my first job hit the Big Red Button by swinging his arms too big in our datacenter one day, shutting down hundreds of servers and the mainframe, wreaking havoc for days!

That was when we installed the Big Clear Button Cover.


Impact assessment: yes.


> we will ensure we restore power and connectivity safely

This would require human intervention, and I'm a bit worried: what if a strike happens again and human lives are lost?

IIRC there have been cases in history where the same location is targeted across multiple days. Obviously, AWS might have local employees working in the region, but would this threat itself be evaluated within the relevant team at AWS? What if they try to bring the service back, but then missiles strike again and human lives are lost? Let's just hope that this is part of the evaluation as well.


I wouldn't risk it.

Both Americans and Israelis are known for double taps. Surely Iran can adapt their tactics too.


But I mean, are the employees safe at home? I guess if they really targeted the data center then home is safer, but in the fog of war maybe the data center wasn't the target?


> But I mean, are the employees safe at home? I guess if they really targeted the data center then home is safer, but in the fog of war maybe the data center wasn't the target?

My gut feeling says they would be safer at home than at the datacenters. The only major attacks I have heard of are the ones on hotels, this datacenter, etc. (at least so far).

> but in the fog of war maybe the data center wasn't the target?

We can't say this for sure, but even if that was the case, I do think they would see that some damage was caused and then try to double-tap it for even more damage. So chances are, even if it wasn't the target previously, it might be the target now?


> this would require human intervention

Amazon has self-propelled robots that handle their logistics and fulfillment, don't they? Send in the robots.


> this would require human intervention

That's the difference between heroes and ordinary employees who bitch about having to go into the office twice a month.

Same as the stories you hear of guys taking snow-cats up a mountain in a blizzard to restore phone circuits or radio transmitters gone offline.


Man, don’t be a “hero” trying to restore a lower ping to someone trying to buy a kindle in Jeddah.


What about local hospitals which may have service from that data center? There are heroes needed everywhere, all the time.


In that case, the hero was the person who avoided relying on a single AZ when they deployed to cloud.


100% absolutely, but it's a bit worrying that in the future multiple AZs/datacenters could start to get targeted:

attacking datacenters within a particular region so that the service would have a hard time recovering.

I guess someone can use another region's DCs to get more than (regional?) AZ redundancy, but for mission-critical infra I can see that sometimes having issues too, and you genuinely can't predict any of this.

That being said, there should be reliance on more than one AZ, but IMO off-site or multi-cloud backups should be preferred/used as well.


Their lack of multiple AZs isn't the problem of the guy making $30k a year.


> What about local hospitals which may have service from that data center? There are heroes needed everywhere, all the time.

Off-site backups / a multi-cloud strategy, while encrypting data (and keeping the key safe, key point), might be a better strategy for such mission-critical infrastructure.


I'm sure Bezos will be really happy someone is being a hero for him in a war zone while he sails his newest yacht to wherever the new version of the island is.


On second thought, there is a difference between restoring critical infrastructure in times of crisis vs. restoring bot infrastructure for Indian spamming operations. Choose wisely.


Here is something that gets lost in all the excitement about AI productivity: most software engineers became engineers because they love writing code.

I think there's a big split between those who derive meaning and enjoyment from the act of writing code or the code itself vs. those who derive it from solving problems (for which the code is often a necessary byproduct). I've worked with many across both of these groups throughout my career.

I am much more in the latter group, and the past 12 months have been the most fun I've had writing software in over a decade. For those in the first group, it's easy to see how this can be an existential crisis.



Right. Most of the news articles don't link to the decision, which is worth reading.

It's a 6-3 decision. Not close.

Here's the actual decision:

The judgment of the United States Court of Appeals for the Federal Circuit in case No. 25–250 is affirmed. The judgment of the United States District Court for the District of Columbia in case No. 24–1287 is vacated, and the case is remanded with instructions to dismiss for lack of jurisdiction.

So what does that mean in terms of action?

It means this decision [1] is now live. The vacated decision was a stay, and that's now dead.

So the live decision is now: We affirm the CIT’s holding that the Trafficking and Reciprocal Tariffs imposed by the Challenged Executive Orders exceed the authority delegated to the President by IEEPA’s text. We also affirm the CIT’s grant of declaratory relief that the orders are “invalid as contrary to law.”

"CIT" is the Court of International Trade. Their judgement [2], which was unanimous, is now live. It reads:

"The court holds for the foregoing reasons that IEEPA does not authorize any of the Worldwide, Retaliatory, or Trafficking Tariff Orders. The Worldwide and Retaliatory Tariff Orders exceed any authority granted to the President by IEEPA to regulate importation by means of tariffs. The Trafficking Tariffs fail because they do not deal with the threats set forth in those orders. This conclusion entitles Plaintiffs to judgment as a matter of law; as the court further finds no genuine dispute as to any material fact, summary judgment will enter against the United States. See USCIT R. 56. The challenged Tariff Orders will be vacated and their operation permanently enjoined."

So that last line is the current state: "The challenged Tariff Orders will be vacated and their operation permanently enjoined." Immediately, it appears.

A useful question for companies owed a refund is whether they can use their credit against the United States for other debts to the United States, including taxes.

[1] https://www.cafc.uscourts.gov/opinions-orders/25-1812.OPINIO...

[2] https://storage.courtlistener.com/recap/gov.uscourts.cit.170...


"Based on two words separated by 16 others, the President asserts the independent power to impose tariffs on imports from any country, of any product, at any rate, for any amount of time. Those words cannot bear such weight."

Zing! Surprisingly spicy writing for such a gravely serious body.


The Gorsuch concurrence is quite the read, but I wish more Americans internalized its final paragraph (excerpts below).

Yes, legislating can be hard and take time. And, yes, it can be tempting to bypass Congress when some pressing problem arises. But the deliberative nature of the legislative process was the whole point of its design. ... But if history is any guide, the tables will turn and the day will come when those disappointed by today’s result will appreciate the legislative process for the bulwark of liberty it is.


I agree with Gorsuch, and I love this idea, but until the legislative branch abandons procedures that prevent the deliberation from happening in the first place, this will keep happening.


There is a balance to be struck to avoid a completely ineffectual congress but I'm not sure a legislative body biased towards action is one you would actually want. Making it easier to kill bills than pass them has a natural stabilizing effect which I think is a net good for the country.


Sure, but the original rules largely accommodate that. Anything that can get 60 votes in the Senate should pass.


Hmm, I read some of the decision, and now I'm not sure what to make of all of it.

When I came to the opinion from Jackson, J., I found it extremely compelling. He says this:

... But some of TWEA’s sections delegating this authority had lapsed, and “there [was] doubt as to the effectiveness of other sections.” Accordingly, Congress amended TWEA in 1941, adding the subsection that includes the “regulate ... importation” language on which the President relies today. The Reports explained Congress’s primary purpose for the 1941 amendment: shoring up the President’s ability to control foreign-owned property by maintaining and strengthening the “existing system of foreign property control (commonly known as freezing control).”

When Congress enacted IEEPA in 1977, limiting the circumstances under which the President could exercise his emergency authorities, it kept the “regulate ... importation” language from TWEA. The other two relevant pieces of legislative history—the Senate and House Reports that accompanied IEEPA—demonstrate that Congress’s intent regarding the scope of this statutory language remained the same. As the Senate Report explained, Congress’s sole objective for the “regulate ... importation” subsection was to grant the President the emergency authority “to control or freeze property transactions where a foreign interest is involved.” The House Report likewise described IEEPA as empowering the President to “regulate or freeze any property in which any foreign country or a national thereof has any interest.”

However, then I read Kavanaugh, J. who writes the following:

In 1971, President Nixon imposed 10 percent tariffs on almost all foreign imports. He levied the tariffs under IEEPA’s predecessor statute, the Trading with the Enemy Act (TWEA), which similarly authorized the President to “regulate ... importation.” The Nixon tariffs were upheld in court.

When IEEPA was enacted in 1977 in the wake of the Nixon and Ford tariffs and the Algonquin decision, Congress and the public plainly would have understood that the power to “regulate ... importation” included tariffs. If Congress wanted to exclude tariffs from IEEPA, it surely would not have enacted the same broad “regulate ... importation” language that had just been used to justify major American tariffs on foreign imports.

And I also find this compelling.

To add onto this, Roberts, C. J. says: IEEPA’s grant of authority to “regulate ... importation” falls short. IEEPA contains no reference to tariffs or duties. The Government points to no statute in which Congress used the word “regulate” to authorize taxation. And until now no President has read IEEPA to confer such power.

This seems directly contradictory to Kavanaugh, J.'s dissent! Kavanaugh, J. claims that Nixon used the word “regulate” to impose tariffs. And apparently the word isn't just in some random other statute — Nixon did so from TWEA, the predecessor of IEEPA: when Congress enacted IEEPA in 1977 it kept the “regulate ... importation” language from TWEA. (from Jackson, J.) So the point that no President has read IEEPA to confer such power seems pretty weak, when Nixon apparently did so from TWEA.

I have no conclusion from this, but IMO both Jackson, J. and Kavanaugh, J. have pretty strong points in opposing directions.


Kavanaugh’s reasoning is that a wartime law, TWEA, can be congruent to a peacetime law, IEEPA. The rest of the court acknowledged that the President always had control of tariffs during war.


Jackson is a woman just fyi.


Ah thanks for clarifying.


As somebody whose first day working at Heroku was the day this acquisition closed, I think it’s mostly a misconception to blame Salesforce for Heroku’s stagnation and eventual irrelevance. Salesforce gave Heroku a ton of funding to build out a vision that was way ahead of its time. Docker didn’t even come out until 2013, and AWS didn’t even have multiple regions when Heroku was built. They mostly served as an investor and left us alone to do our thing, or so it seemed those first couple of years.

The launch of the multi-language Cedar runtime in 2011 led to incredible growth, and by 2012 we were drowning in tech debt and scaling challenges. Despite more than tripling our headcount in that first year (~20 to 74), we could not keep up.

Mid 2012 was especially bad as we were severely impacted by two us-east-1 outages just 2 weeks apart. To the extent it wasn’t already, reliability and paying down tech debt became the main focus and I think we went about 18 months between major user-facing platform launches (Europe region and eventually larger sized dynos being the biggest things we eventually shipped after that drought). The organization lost its ability to ship significant changes or maybe never really had that ability at scale.

That time coincided with the founders taking a step back, leaving a loss of leadership and vision that was filled by people more concerned with process than results. I left in 2014 and at that time it already seemed clear to me that the product was basically stalled.

I’m not sure how much of this could have been done better even in hindsight. In theory Salesforce could have taken a more hands-on approach early on, but I don’t think that could have ended better. They were so far from profitability in late 2010 that they could not have stayed independent without raising more funding. The venture market in ~2010 was much smaller than a few years later—tiny rounds and low valuations. Had the company spent its pre-acquisition engineering cycles building for scalability & reliability at the expense of product velocity, they probably would never have become successful.

Even still, it was the most amazing professional experience of my career, full of brilliant and passionate people, and it’s sad to see it end this way.


It remains the greatest engineering team I've ever seen or had the pleasure to be a part of. I was only there from early 2011 to mid 2012, but what I took with me changed me as an engineer. The sheer brilliance of the ideas and approaches... I was blessed to witness it. I don't think I can overstate it, though many will think this is all hyperbole. I didn't always agree with the decisions made, and I was definitely there when the product stagnation started, but we worked hard to reduce tech debt, build better infrastructure, and improve... but man, the battles we fought. Many fond memories, including the single greatest engineering mistake I've ever seen made, one that I still think about to this day (but will never post in a public forum :)).

It was a pleasure working with you bgentry!


I'm just going to chime in here and say thank you. There still really isn't, in my mind, a comparable offering to Heroku's "git push and go" straight to a reasonable production setup.

I honestly find it a bit nuts. There are offerings that come close, but using them I still get the impression that they just haven't put in the time really refining that user interface. So I just wanted to say thank you for the work you and the GP did; it was incredibly helpful and, I'm happy to say, helped me launch and test a few product offerings as well as some fun ideas.


This!

It absolutely boggles my mind that nothing else exists to fill this spot. Fly and others offer varying degrees of easier-than-AWS hosting, but nobody offers true PaaS like Heroku, IMHO.


The Heroku style of PaaS just isn't very interesting to most large businesses that actually pay for things. The world basically moved on to Kubernetes-based products (see Google and Red Hat)--or just shut down, like a lot of Cloud Foundry-based products. Yes, many individuals and smaller shops care more about simplicity, but they're generally not willing/able to pay a lot (if anything).


It seems like you’re right, but it’s strange that the data world seems to be moving in the opposite direction, with PaaS products like Snowflake, DataBricks, Microsoft Fabric, even Salesforce’s own Data Cloud eating the world.


PaaS has always been this thing that isn't pure infrastructure or pure hosted software that you use as-is. Salesforce draws something over 100K partner and user attendees to its annual conference. It's always been this in-between thing with a fairly loose definition. I'd argue that Salesforce was long a cross between SaaS (for the users) and PaaS (for developers). You can probably apply the same view to a lot of other company products.


I find render.com basically as good as Heroku and certainly much better than fly.io's unpredictable pricing


In 2022 Render increased their prices (which for my team worked out to a doubling of costs) with a one-month notice period. The CEO's response, when I asked him if he thought that was a fair notice period, was that it was operationally necessary and he was sorry to see us go.


In what way is Fly's pricing unpredictable?


It varies based on usage


Isn't that how things should be?


From a consumer's standpoint, why would I choose that?


It's a natural law to pay based on what you consume. Whenever that isn't required, you're typically being subsidized in one form or another, but don't be surprised when it goes away.


Heroku and Ruby, for me, was the 21st century answer to 'deploying' a PHP site over FTP.

The fact that it required nothing but 'git push heroku master' at the time was incredible, especially for how easy it was to spin up pre-prod environments with it, and how wiring up a database was also trivial.

Every time I come across an infrastructure that is bloated out with k8s, helm charts, and a complex web of cloud resources, all for a service not even running at scale, I look back to the simplicity we used to have.
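For anyone who never saw that era, the flow was roughly the sketch below. Everything here is illustrative: the app name and Procfile contents are made up, and the remote URL is a placeholder (in reality `heroku create` added the remote for you, so the actual push is left commented out).

```shell
# A rough sketch of the classic Heroku deploy flow.
git init demo-app
echo "web: bundle exec rackup" > demo-app/Procfile   # declare the web process
git -C demo-app add Procfile
git -C demo-app -c user.name=demo -c user.email=demo@example.com \
    commit -m "initial commit"
# Placeholder remote; `heroku create` normally set this up for you.
git -C demo-app remote add heroku https://git.example.com/demo-app.git
# git -C demo-app push heroku master   # the push itself triggered build + deploy
```

That push being the entire deployment interface is what the comment above is describing.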


I completely agree that there's nothing comparable to old-school Heroku, which is crazy. That said, Cloudflare seems promising for some types of projects and I use them for a few things. Anyone using them as a one-stop-shop?


For me Northflank have filled this spot. Though by the time I switched I was already using Docker so can't speak directly to their Heroku Buildpack support.


Vercel goes a step further and (when configured this way) allocates a new hostname (e.g. feature-branch-add-thing.app.vercel.example.com) for new branches, to make testing even easier.


But their offering is "frontend oriented"; what you describe doesn't work for Django / Laravel / Rails / etc., no?


Have a look at Scalingo, it's a good mix of simplicity and maturity.

https://scalingo.com/blog/heroku-alternative-europe-scalingo...


This looks nice! Wish they had a no-credit-card-required version for educational purposes. For the course I teach we use Spring Boot, and life was good with Heroku until they discontinued the no-credit-card version; then the only choice we had (with support for Spring Boot) was to move over to Azure, which works but is a bit overkill and complicated for our purposes. I guess we could just use Docker, and then many more platforms would become available, but I'd rather not add one more step to the pipeline if possible.


Yes. We are taking a stab at the entire infrastructure like Heroku did but with a focus on a coding agent-centric workflow: https://specific.dev


I’ve been using Render for close to 5 years, and it’s excellent. I can’t think of anything I use that it doesn’t do as well or better than Heroku did last I checked.


I agree with everything you said, and can only thank the founders for their tremendous insight and willingness to push the limits. The sheer number of engineering practices we take for granted today because of something like Heroku boggles my mind.

I am forever grateful for the opportunity to work there and make it an effort to pass on what I learned to others.


> by 2012 we were drowning in tech debt and scaling challenges.

> the greatest engineering team I've ever seen

How do these two things reconcile, in your opinion? In my view, doing something quickly is the easy part; good engineering is only needed exactly when you want things to be maintainable and scalable, so the assertions above don't really make much sense to me.


It is hard to explain the impact of such massive growth over a 2-3 year period. New features were coming online while old ones were being strained by overuse. For instance, we launched PostgreSQL in the cloud, something we take for granted today. Not only that, but we offered an insane feature set around "follow" and "forking" that made working with databases seem futuristic.

I remember when we launched that product we went to that year's PGCon, and there were people in the crowd angry and dismissive that we would treat data that way. It was actually pretty confrontational. Products like that were being produced while we were also working on migrating away from the initial implementation of the "free tier" (internally called Shen). It took me and a few others months to replace it and ensure we didn't lose data while also making it maintainable. The resulting tool, lovingly named "yobuko", ended up remaining for years after that (largely due to the stagnation and turnover).

Anyways, that was just a slice of it. Decisions made today are not always the decisions you wanted to be made tomorrow. Day0 is great, day100 comes with more knowledge and regret. :D


In general, my impression has been that you don't want to architect your solution at first for massive scaling, because:

* You probably aren't going to need it, so putting the effort into scaling means slowing down your delivery of the very features that would make customers want your solution.

* It typically slows down performance of individual features.

* It significantly increases the complexity of your solution (and probably the user-facing tooling as well).

* It is difficult to achieve until you have the live traffic to test your approach.


Yeah I think there is a lot of truth here. You can't solve all the problems and in Heroku's case we focused on user experience (internally and externally). Great ideas like "git push heroku main" are game changers, but what happens once that git server is receiving 1000 pushes a minute? Totally different thought process.

Perhaps the thing I would add is that even with the tech debt and scaling problems we still had over a million applications deployed ready for that request to hit them.


An organization without “tech debt” is not good at prioritizing.


Yea it seems like not much thought was put into scalability.


Tell us more about some of these ideas and approaches that changed you as an engineer! We'd love to hear!


Well, many of them you may know, in that they made their way into so many systems (though arguably without the refined UX of Heroku), but here are the two that come up the most as I teach others:

* The simpler the interface for the user, the more decisions you can make behind the scenes. A good example here is "git push heroku". Not only is that something every user (bot or human) can run, it is also easy to script, protect, and scale. It keeps the surface area small and the abstraction makes the most sense. The code that was behind that push lasted for quite some time, and it was effectively 1-2 Python classes as a service. But once we got the code into our systems, we could do anything with it... and we did. One of the things that blows my mind is that our "slug" maker (a slug is what we called the tarballs of code that we put on the runtimes) was itself a Heroku app. It lived alongside everyone else. We were able to reduce our platform down to a few simple pieces (what we called the kernel) and everything else was deployed on top. We benefited from the very things our customers were using.

* NOTE: This one is going to be hard to explain because it is so simple, but when you start thinking about system design in this way the possibilities start to open up right in front of you. The idea is that everything we do is effectively explained as "input -> filter -> output". Even down to the CPU. But especially when it comes to a platform. With this design mentality we had a logging pipeline that I am still jealous of. We had metrics flowing into dashboards that were everywhere and informed us of our work. We had things like "integration testing" that ran continuously against the platform, all from the user's perspective, that allowed us to test features long before they reached the public. All of these things were "input" that we "filtered" in some way to produce "output". When you start using that "output" as "input" and chaining these things together you get to a place where you can design a "kernel" (effectively an API service and a runtime) and start interacting with it to produce a platform.

I remember when we were paring down services to get to our "kernel", one of the operations engineers developed our Chef setup so that an internal engineer needed maybe 5-7 lines of Ruby to deploy their app and get everything they needed. Simple input that produced a reliable application setup, getting us into production faster.

Anyways, those are just a couple.


Here[0] is a talk that shows off some of these tools. Mark led the way on many of these ideas internally.

[0]: https://www.youtube.com/watch?v=yGcaofDq8SM


Thank you so much for this!


Absolutely agree and likewise buddy :)


Thanks for sharing your story. Those early days of using Heroku were really enjoyable for me. It felt so promising and simple. I remember explaining the concept to a lot of different people who didn't believe that the deployment model could be that simple and accessible until I showed them.

Then life went on, I bounced around in my career, and forgot about Heroku. Years later I actually suggested it for someone to use for a simple project once and I could practically feel the other developers in the room losing respect for me for even mentioning it. I hadn't looked at it for so long that I didn't realize it had fallen out of favor.

> That time coincided with the founders taking a step back, leaving a loss of leadership and vision that was filled by people more concerned with process than results

This feels all too familiar. All of my enjoyable startup experiences have ended this way: The fast growing, successful startup starts attracting people who look like they should be valuable assets to the company, but they're more interested in things like policies, processes, meetings, and making the status reports look nice than actually shipping and building.


The cancer of corporations: bureaucracy.


As corporations grow, they also increasingly need some level of process which can increasingly look a lot like bureaucracy.


Bureaucracy is a natural consequence of workforce growth. Respectfully, I think you are blaming the symptoms here, not the root cause.


Having been on a bigco team that underwent the same sort of headcount growth in a very short time I have to imagine that "more than tripling our headcount in that first year" was likely more a driver of the inability to keep up than a solution. That's not a knock on the talents of anyone hired; it's just exceedingly difficult to simultaneously grow a team that fast and maintain any kind of velocity regardless of the complexity of the problems you're trying to tackle. The culture and knowledge that enabled the previous team's velocity just gets completely diluted.


Thanks for a capsule tour through Heroku!

FWIW, the team that eventually created "Docker" was working at the same time on dotCloud as a direct Heroku competitor. I remember meeting them at a meet-up in the old Twitter building but couldn't tell you exactly which year that was. Maybe 2010 or 2011?


Yep, that team did great work. I remember having lunch at the Heroku office with the dotCloud team in 2011 or 2012 and also Solomon Hykes demoing Docker to us in our office’s basement before it launched. So much cool stuff was happening back then!


I remember Solomon demoing at All Hands... it was such a crazy time for tech and innovation.


Oh hey Curt!! Remember, you are not a monitoring system :)


Haha, funny you should mention that as I was just telling a coworker that story as we worked on a new dashboard for our infrastructure. :D


> people more concerned with process than results

Sounds like something Steve Jobs observed at apple https://youtu.be/l4dCJJFuMsE?si=QOCBUqcUPWu8AsAX


Link with spyware removed: https://youtu.be/l4dCJJFuMsE


I feel as if youtube itself might still contain some tracking.

People recommend invidious and piped but they are starting to have issues.

This video is archived on wayback machine (archive.org) and I feel as if that might be the best way to watch this video right now.

https://web.archive.org/web/20260207024055/https://www.youtu...

Edit: I have also uploaded it to PreserveTube after getting interested in Wayback Machine alternatives (the Wayback Machine suggests not uploading videos unless absolutely necessary, given storage constraints).

PreserveTube is sort of intended for exactly this, so I uploaded the video there:

https://preservetube.com/watch?v=l4dCJJFuMsE


Spyware?


Not OP, but the "si" parameter in the URL is an individual tracking identifier, generated specifically for YouTube to see who you share the link with.
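For what it's worth, such parameters are easy to strip programmatically; here is one way to do it in Python (the parameter list is just an example, there are other trackers like `utm_*`):

```python
# Strip known tracking parameters (like YouTube's "si") from a URL.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_tracking(url, params=("si",)):
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in params]
    return urlunsplit(parts._replace(query=urlencode(query)))

print(strip_tracking("https://youtu.be/l4dCJJFuMsE?si=QOCBUqcUPWu8AsAX"))
# https://youtu.be/l4dCJJFuMsE
```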


TY


GP means the si url parameter, which is a token that helps google track how their videos are being shared.


I worked with some of the folks from there, and honestly you make it sound like tech debt is an inevitable consequence haunting projects from year one.

I disagree. I think the folks just did a sloppy job of "let's bungee strap it all together" for speed, instead of more serious planning and architecture. They self-inflicted the tech debt on themselves and got drowned in the debt interest super fast.


There's someone out there who built the scalable version of Heroku at another garage startup. But we never heard of them because Heroku beat them to market.


Sure, but that bungee strapped slop got them pretty damn far


My point is that they would have gotten much farther. I'm not talking about architecture astronautics, just a little more thinking ahead.


As far as the Salesforce acquisition goes, I'd be curious to see who made the decision to put Heroku into maintenance only mode.

I worked for a different part of Salesforce. I don't really feel like Salesforce did a ton of meddling in any of its bigger acquisitions other than maybe Tableau. I think the biggest missed opportunity was potentially creating a more unified experience between all of its subsidiaries. Though, it's hard to call that a failure since they're making tons of money.

It could be a case of post-founder leadership seeing that there's not a lot of room for growth and giving up. That happens a lot in the tech industry.


You'll need to unlock your iPhone first. Even though you're staring at the screen and just asked me to do something, and you saw the unlocked icon at the top of your screen before/while triggering me, please continue staring at this message for at least 5 seconds before I actually attempt FaceID to unlock your phone to do what you asked.


This is largely because LISTEN/NOTIFY has an implementation which uses a global lock. At high volume this obviously breaks down: https://www.recall.ai/blog/postgres-listen-notify-does-not-s...

None of that means Oban or similar queues don't/can't scale—it just means a high volume of NOTIFY doesn't scale, hence the alternative notifiers and the fact that most of its job processing doesn't depend on notifications at all.

There are other reasons Oban recommends a different notifier per the doc link above:

> That keeps notifications out of the db, reduces total queries, and allows larger messages, with the tradeoff that notifications from within a database transaction may be sent even if the transaction is rolled back


> None of that means Oban or similar queues don't/can't scale—it just means a high volume of NOTIFY doesn't scale

Given the context of this post, it really does mean the same thing though?


No, I don't think so. Oban does not rely on a large volume of NOTIFY in order to process a large volume of jobs. The insert notifications are simply a latency optimization for lower volume environments, and for inserts can be fully disabled such that they're mainly used for control flow (canceling jobs, pausing queues, etc) and gossip among workers.

River for example also uses LISTEN/NOTIFY for some stuff, but we definitely do not emit a NOTIFY for every single job that's inserted; instead there's a debouncing setup where each client notifies at most once per fetch period, and you don't need notifications at all in order to process with extremely high throughput.

In short, the fact that high volume NOTIFY is a bottleneck does not mean these systems cannot scale, because they do not rely on a high volume of NOTIFY or even require it at all.
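The debouncing idea is simple enough to sketch (this is just an illustration of the concept, not River's actual implementation): no matter how many jobs are inserted in a burst, at most one NOTIFY goes out per fetch period, and a periodic poller catches anything the suppressed notifications would have signaled.

```python
import time

class DebouncedNotifier:
    """Emit a notification at most once per `period` seconds,
    collapsing bursts of inserts into a single wake-up signal."""

    def __init__(self, period, send, clock=time.monotonic):
        self.period = period
        self.send = send        # callable that performs the actual NOTIFY
        self.clock = clock
        self.last_sent = float("-inf")

    def job_inserted(self):
        now = self.clock()
        if now - self.last_sent >= self.period:
            self.last_sent = now
            self.send()
            return True   # notification actually went out
        return False      # suppressed; a poller picks the job up anyway

# Simulated clock so the example is deterministic:
t = [0.0]
sent = []
n = DebouncedNotifier(1.0, send=lambda: sent.append(t[0]), clock=lambda: t[0])
for _ in range(100):      # burst of 100 inserts at t=0
    n.job_inserted()
t[0] = 1.5
n.job_inserted()          # next period: one more notification
print(sent)  # [0.0, 1.5]
```

100 inserts, two notifications: NOTIFY volume is bounded by the period, not by job throughput, which is why the global-lock bottleneck never comes into play.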


Does River without any extra configuration run into scaling issues at a certain point? If the answer is yes, then River doesn’t scale without optimization (Redis/Clustering in Oban’s case).

While the root cause might not be River/Oban, them not being scalable still holds true. It’s of extra importance given the context of this post is moving away from redis and to strictly a database for a queue system.


Yup. I wasn’t talking about notify in particular but about using Postgres in general.


Yeah, River generally recommends this pattern as well (River co-author here :)

To get the benefits of transactional enqueueing you generally need to commit the jobs transactionally with other database changes. https://riverqueue.com/docs/transactional-enqueueing

It does not scale forever, and as you grow in throughput and job table size you will probably need to do some tuning to keep things running smoothly. But after the amount of time I've spent in my career tracking down those numerous distributed systems issues arising from a non-transactional queue, I've come to believe this model is the right starting point for the vast majority of applications. That's especially true given how high the performance ceiling is on newer / more modern job queues and hardware relative to where things were 10+ years ago.

If you are lucky enough to grow into the range of many thousands of jobs per second then you can start thinking about putting in all that extra work to build a robust multi-datastore queueing system, or even just move specific high-volume jobs into a dedicated system. Most apps will never hit this point, but if you do you'll have deferred a ton of complexity and pain until it's truly justified.
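The core of the pattern is just "the job insert shares a transaction with the business write". A toy illustration using sqlite3 (River and Oban use Postgres, and the table and column names here are invented for the example, not either library's schema):

```python
import sqlite3

# Transactional enqueueing sketch: the job row commits or rolls back
# together with the business data, so a failed request can never leave
# an orphaned job, and a committed job always has its data in place.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, kind TEXT, args TEXT)")

def signup(conn, email):
    with conn:  # one transaction: both inserts commit, or neither does
        cur = conn.execute("INSERT INTO accounts (email) VALUES (?)", (email,))
        conn.execute(
            "INSERT INTO jobs (kind, args) VALUES (?, ?)",
            ("send_welcome_email", email),
        )
        if "@" not in email:
            raise ValueError("invalid email")  # rolls everything back
        return cur.lastrowid

signup(conn, "ada@example.com")
try:
    signup(conn, "not-an-email")   # fails: neither row survives
except ValueError:
    pass

print(conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0])  # 1
print(conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0])      # 1
```

With a non-transactional queue (Redis sitting next to Postgres), the failed signup could still have enqueued the welcome email, and those are exactly the distributed-systems edge cases mentioned above.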


State machines to the rescue, i.e. I think the nature of asynchronous processing requires that we design for good/safe intermediate states.


I get the temptation to attribute the popularity of these systems to lazy police with nothing better to do, but from personal experience there’s more to it.

I live in a medium sized residential development about 15 minutes outside Austin. A few years ago we started getting multiple incidents per month of attempted car theft where the thieves would go driveway to driveway checking for unlocked doors. Sometimes the resident footage revealed the thieves were armed while doing so. In a couple of cases they did actually steal a car.

The sheriffs couldn’t really do much about it because a) it was happening to most of the neighborhoods around us, b) the timing was unpredictable, and c) the manpower required to camp out to attempt to catch these in progress would be pretty high.

Our neighborhood installed Flock cameras at the sole entrance in response to growing resident concerns. We also put in a strict policy around access control by non law enforcement. In the ~two years since they were installed, we’ve had two or three incidents total whereas immediately prior it was at least as many each month. And in those cases the sheriffs could easily figure out which vehicles had entered or left during that time. I continue to see stories of attempted car thefts from adjacent neighborhoods several times per month.

I totally get the privacy concerns around this and am inherently suspicious of any new surveillance. I also get the reflexive dismissal of their value. In this case it has been a clear win for our community through the obvious deterrent factor and the much higher likelihood of having evidence if anything does happen.

Our Flock cameras do not show on the map here, btw.


Aaron did RT the post here which likely indicates some agreement with the sentiment in it: https://x.com/searls/status/1972293469193351558

Also he shared it directly while saying it was good here: https://x.com/tenderlove/status/1972370330892321197

