Someone writes the migration, commits it, it passes the build and unit test stages of the pipeline, then the application as currently running passes all function and integration tests with (and this is important) both the prior and the revised schema. Your commit is tagged as release ready! Not long after, the automation tooling confidently executes the now-tested migration under machine control during the next deploy, everyone goes home happy with your shiny new published_at column, and no-one has directly touched prod.
Two days later the CTO sends everyone a stroppy email about "column bloat that should've been a table", ssh's into the personal instance that they've been keeping alive† since before you had funding and learned to launch servers as immutable black boxes, and whilst trying to prove a point by rolling it back manually, drops all tables by mistake when a cat treads on the keyboard.
> Someone writes the migration, commits it, it passes the build and unit test stages of the pipeline, then the application as currently running passes all function and integration tests with (and this is important) both the prior and the revised schema. Your commit is tagged as release ready! Not long after, the automation tooling confidently executes the now-tested migration under machine control during the next deploy, everyone goes home happy
What happens if something goes really wrong after the production deploy? Is there a way to skip steps if you need to quickly push an emergency fix?
At our company, we have "an immutable DB", too, but when there's a critical emergency (say, full downtime), we can apply fixes manually. In that case, we run the tests after applying the fix.
I think I've been walking this path for twenty years now.