Aren't there common patterns of good "just code" and bad "just code"? I've been told for a long time that global variables are a bad pattern. Maybe feature flags are a bad pattern too.
One concern about feature flags is testing: each added on/off flag doubles the number of permutations you'd have to test to cover them all. You tested with flag A on and off, you tested with flag B on and off, but did you ever test with both on, or both off? Without feature flags, a big change that could have been hidden behind a flag would hopefully have to make its way past some quality gates. With feature flags, the exact permutation you're going to create later today by flipping some flags on may well never have been tested. You can protect yourself against forgetting to test with tools and processes, but testing all the permutations may be expensive.
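To put a number on "expensive", here is a rough Python sketch that enumerates every on/off state for a set of boolean flags. The flag names and the helper `all_flag_combinations` are made up for illustration, and it assumes the flags are independent booleans. With n such flags there are 2^n states to cover.

```python
from itertools import product

def all_flag_combinations(flags):
    """Yield every on/off assignment for the given flag names as a dict."""
    for values in product([False, True], repeat=len(flags)):
        yield dict(zip(flags, values))

# Hypothetical flag names, for illustration only.
flags = ["new_checkout", "dark_mode", "fast_search"]
combos = list(all_flag_combinations(flags))

# Three boolean flags give 2 ** 3 = 8 distinct configurations.
print(len(combos))

# Ten flags would already give 2 ** 10 = 1024.
ten_flags = [f"flag_{i}" for i in range(10)]
print(sum(1 for _ in all_flag_combinations(ten_flags)))
```

If every configuration needs its own test run, each extra flag doubles the bill, which is why exhaustive coverage stops being realistic pretty quickly.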
You may not have to test all the permutations if you can predict which ones are relevant to the flags you plan to flip later today. But a lot of organizations have poor discipline about cleaning up old feature flags, so it may not be that predictable. Maybe that's an organizational problem rather than a feature flag problem, but the feature flags are gonna get blamed at some point nonetheless.
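As a sketch of the "only test what's relevant" idea: if you know the current flag state and the order you plan to flip things in, you only need the states you will actually pass through. Everything here is hypothetical (the `configs_to_test` helper, the flag names), and it assumes nobody else flips a flag in the meantime.

```python
def configs_to_test(current, planned_flips):
    """Return the current flag state plus each state reached by
    applying the planned flips one at a time, in order."""
    state = dict(current)
    configs = [dict(state)]
    for name in planned_flips:
        state[name] = not state[name]  # toggle this flag
        configs.append(dict(state))
    return configs

# Hypothetical current production state and rollout plan.
current = {"new_checkout": False, "dark_mode": True, "fast_search": False}
plan = ["new_checkout", "fast_search"]

to_test = configs_to_test(current, plan)
for config in to_test:
    print(config)
```

That is 3 configurations instead of 8 for these three flags. The catch is the one already mentioned: if stale flags are lying around that nobody remembers, `current` is incomplete and the prediction is only as good as the cleanup discipline.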
It's just code.
If the team is so bad that an intern can mess things up, they will, and the mess will have nothing to do with feature flags.