Most feature flag solutions list a few case studies. Of course, each organization is different, and maturity varies from one to the next. But feature flags are a two-way-door decision, meaning that you can adopt them at a small scale, try them out, and see if they work for you before committing.
The main challenge is when things go wrong. Feature flags are designed for high-rate evaluation with low-latency responses. Configuration usually cares much less about latency, since it's typically read once at startup. That leads feature flag systems to some very specific tradeoffs, such as erring toward availability over consistency, which could be a bad choice for configuration management.
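To make the availability-over-consistency tradeoff concrete, here's a minimal sketch of how a flag client might behave. Everything in it (`FlagClient`, `fetch`, the flag names) is hypothetical and not taken from any particular vendor's SDK:

```python
from typing import Callable, Dict


class FlagClient:
    """Hypothetical client: evaluates flags from a local cache, never from the network."""

    def __init__(self, fetch: Callable[[], Dict[str, bool]], defaults: Dict[str, bool]):
        self._fetch = fetch        # assumed function that pulls all flags from the flag service
        self._defaults = defaults  # hard-coded fallbacks shipped with the code
        self._cache: Dict[str, bool] = {}

    def refresh(self) -> bool:
        """Try to update the cache; on failure, keep serving the last known values."""
        try:
            self._cache = dict(self._fetch())
            return True
        except Exception:
            # Availability over consistency: stale values stay in use.
            return False

    def is_enabled(self, name: str) -> bool:
        # Hot path: a dictionary lookup, no I/O, so evaluation stays fast.
        if name in self._cache:
            return self._cache[name]
        return self._defaults.get(name, False)
```

For a flag, silently serving a stale or default value is usually acceptable. For configuration such as a database URL, the same behavior could hide a real problem, and failing loudly at startup may be the safer choice.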
It can be done by opening a PR. I haven't tried it yet, but I'm curious to try out https://github.com/uber/piranha, or to hear from anyone who has used it.
AFAIK, it'd only open a PR if the flag is fully enabled and has some heuristics to determine when it's safe to remove. Honestly, I haven't tested it but I'm curious to know if someone had either good or bad experiences.
If all the PRs were instantly rejected, that would be a bad sign, but I couldn't find anyone who has actually used it. It's been around for a while but hasn't caught on, which already gives me a hint.
If the cleanup only happens if the flag is not used, then the "expiration date" is basically meaningless. You can either delete it or you can't. Who cares if it's expired or not.
I think the expiration is just a signal that a flag should "potentially" be removed. I believe it's a good way to focus on the ones you should pay attention to. But it would be nice if you could say "Yes, I know, please extend this for another period" (or "do not notify me again for another month").
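Here's a rough sketch of what "expiration as a signal, with a snooze" could look like. The `Flag` record and the helper names are made up for illustration and don't correspond to any specific tool:

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Flag:
    name: str
    expires_on: date                      # date after which the flag should be reviewed
    snoozed_until: Optional[date] = None  # set when the owner asks not to be reminded for a while


def needs_attention(flag: Flag, today: date) -> bool:
    """An expired flag is only reported if it isn't currently snoozed."""
    if today < flag.expires_on:
        return False
    return flag.snoozed_until is None or today >= flag.snoozed_until


def snooze(flag: Flag, today: date, days: int = 30) -> None:
    """'Yes, I know, don't notify me again for another month.'"""
    flag.snoozed_until = today + timedelta(days=days)


def stale_flags(flags: List[Flag], today: date) -> List[str]:
    """Names of flags that should show up in a cleanup report today."""
    return [f.name for f in flags if needs_attention(f, today)]
```

Nothing here deletes anything. Expiry only decides which flags appear in the report, and snoozing hides a flag from it until the snooze runs out.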
I agree that if you have only a few changes going to prod, deploy fast, and do canary testing, you should be covered. In my experience that's rarely the case, because multiple teams deploy changes at the same time, and even deployments in external services can cause side effects in other services.
Emergent inter-service issues are challenging to deal with regardless.
I’ve absolutely seen canary testing work in large environments with a lot of teams doing frequent deploys. The teams need to have the tooling to conduct their own canary testing and monitoring.
As soon as you’re involving external services or anything persistent you may not be able to undo the damage of misbehaving software by simply disabling the offending code with a flag.
In practice, the cost/benefit of feature flags has never worked out for me; it's better to just speed up your deploys and rollbacks. The caveat is that I’ve only ever worked in web environments. I can imagine that for software running on an end-user device, flags could solve some difficult problems, provided you have a way to toggle them.
It's true that there are more long-lived use cases, but if you have the ability to choose, runtime-controlled flags cover both cases, while compile-time ones cover only some. But fair point.
I faced something similar, and I think it's unavoidable. Give people a screwdriver and they'll find a way of using it as a hammer.
The best you can do is expect the feature flagging solution to give some kind of warning for tech debt. Then equip them with alternative tools for configuration management. Rather than forbidding, give them options, but if it's not your scope, I'd let them be (I know as engineers this is hard to do :P).
> Give people a screwdriver and they'll find a way of using it as a hammer.
I feel like feature flags aren't that far off though. They're fantastic for many uses of runtime configuration as mentioned in another comment.
There are multiple people in this thread complaining about "abuse" of feature flags, but no one has been able to articulate why it's abuse rather than just use, beyond esoteric dogma.
Feature Flags inherently introduce at least one branch into your codebase.
Every branch in your codebase creates a brand new state your code can run through.
The number of branches introduced by Feature Flags likely does not scale linearly, because there is a good chance they will become nested, especially as more are added.
Start with even an example of one feature flag nested inside another. That creates four possible program states. Four is not unreasonable, you can clearly define what state the program should be in for all four states.
Now scale that to a hundred feature flags, some nested, some not.
It becomes impossible to know what any particular program state should be past the most common configurations. If you can't point to a single interface in a program and tell me all of the possible states of it, your program is going to be brittle as hell. It will become a QA nightmare.
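To put rough numbers on this, here's a small sketch that enumerates flag configurations. The flag names and the `checkout` function are invented for illustration:

```python
from itertools import product


def flag_configurations(flag_names):
    """Yield every on/off assignment for the given flags."""
    for values in product([False, True], repeat=len(flag_names)):
        yield dict(zip(flag_names, values))


def checkout(flags):
    """Hypothetical code path with one flag nested inside another."""
    if flags["new_checkout"]:
        if flags["one_click_pay"]:
            return "new checkout + one-click"
        return "new checkout"
    return "old checkout"


configs = list(flag_configurations(["new_checkout", "one_click_pay"]))
behaviors = {checkout(c) for c in configs}

print(len(configs))    # number of flag configurations for two flags
print(len(behaviors))  # number of distinct behaviors in this nested example
print(2 ** 100)        # upper bound on configurations for 100 independent on/off flags
```

With two flags there are four configurations. Because the inner flag is ignored when the outer one is off, this nested example only has three distinct behaviors, but someone still has to know that the fourth configuration is harmless. The upper bound doubles with each flag added, so exhaustive testing is out of reach long before a hundred flags.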
This is why Feature Flags should be used for temporary development efforts or A/B testing, and removed.
Otherwise you're going to have a debugging nightmare on your hands eventually.
Edit: Note that this is different from normal runtime configurations because normally runtime configurations don't have a mix of in-dev options and other temporary flags. Also, they aren't usually set up to arbitrarily add new options whenever it is convenient for a developer.
Branches are difficult to reason about? Yes, I agree.
Are branches necessary to make the product behave in a different way in some circumstances? Most of the time.
Do those circumstances require a branch? Unless you’re super confident about some part of code, yes? But why would you be?
Runtime configuration is not about making QA easy. It’s introduced because QA has been hell already so you can control rollout of code which you know wasn’t properly QA’d - or it was but turns out the thing you built isn’t the thing users want and the release cycle is too long to deploy a revert.
I’d say ‘branches are bad but alternatives are worse’.
The fundamental difference between feature flags and config is that the former is meant to be a soft deploy of code where everyone is expected to eventually be on the new code. Thus it should have a built-in timer after which it stops, and you should consider launching all new customers with it on.
As for why: if you don't deprecate the feature flag within some time span, you're permanently carrying both code paths, with ongoing dev and QA resources and costs counting against your complexity budget.
Permanent costs should only be undertaken after careful consideration, and should be outside the scope of a single dev deciding to undertake them. Whereas flags should be cheap to add to enable dev to get stuff into prod faster while retaining safety.
Permanently making something a config choice should be done after heavier deliberation because of the aforementioned costs, and you often want different tools to manage it, including something heavier duty than a single checkbox/button in your internal CS admin tooling. These choices are often tied to contracts or legal needs, and in many cases Salesforce, or whatever CPQ system you're using, should be the source of truth for them.
Here's a list of case studies from some of the solutions mentioned in the comments. Some focus on operational metrics, others on lead time for changes: https://www.getunleash.io/case-studies https://launchdarkly.com/case-studies/ https://www.flagsmith.com/case-studies