
## The Permission Creep Nobody Catches

You know what gets me? Everyone talks about AI hallucinations and bad outputs, but the scariest production failures I've seen come from something way more boring: permission sprawl.

Last month I watched a team give their Claude agent write access to update customer records. Standard stuff, right? Except nobody mapped what else shared that same permission group. Three weeks later they discovered it had been quietly rewriting inventory data because someone reused an IAM role from an old project.

No alerts. No errors. Just silent data corruption happening in the background while everyone worried about prompt injection.
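A check like this would have caught the reused role before the agent ever ran. Here's a minimal sketch, assuming the role's policy can be exported in the common IAM JSON shape (`Effect` / `Action` / `Resource`). The action names, table names, and the `unexpected_write_grants` helper are all made up for illustration; your cloud's real action names will differ.

```python
from fnmatch import fnmatchcase

# Hypothetical exported policy. Actions and resources are invented examples.
ROLE_POLICY = {
    "Statement": [
        {"Effect": "Allow", "Action": ["db:Read", "db:Write"], "Resource": ["table/customers"]},
        # Left over from an old project: the kind of grant nobody remembers.
        {"Effect": "Allow", "Action": ["db:Write"], "Resource": ["table/inventory"]},
        {"Effect": "Allow", "Action": ["db:Read"], "Resource": ["table/*"]},
    ]
}

# Assumed naming convention for write-like actions; adjust to your platform.
WRITE_PATTERNS = ("*:Write", "*:Update*", "*:Delete*")


def as_list(value):
    """Policies may hold a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_write(action):
    """Treat any wildcard action as potentially write-capable."""
    return "*" in action or any(fnmatchcase(action, p) for p in WRITE_PATTERNS)


def unexpected_write_grants(policy, intended_resources):
    """Return (action, resource) pairs that allow writes outside the intended scope."""
    findings = []
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        for action in as_list(stmt.get("Action", [])):
            if not is_write(action):
                continue
            for resource in as_list(stmt.get("Resource", [])):
                if resource not in intended_resources:
                    findings.append((action, resource))
    return findings


# The agent was only ever supposed to write to customer records.
findings = unexpected_write_grants(ROLE_POLICY, {"table/customers"})
for action, resource in findings:
    print(f"unexpected write grant: {action} on {resource}")
```

It's deliberately dumb: exact-match resources, no handling of deny statements or conditions. The point is to run something like it before handing a role to an agent, not after.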

And rollback? Good luck. Most teams building with AI have zero recovery plan beyond "restore from backup and hope we caught it early." I've been running production systems for decades and the basics haven't changed. You need:

- Audit trails showing exactly what the agent touched
- Permission boundaries that actually mean something
- A way to undo changes that doesn't involve prayer
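To make the first and last items concrete, here's a rough sketch of an audit trail that doubles as an undo log. It's an in-memory toy, not a real database layer: `AuditedStore`, `AuditEntry`, and the actor names are hypothetical, and a production version would live in your data layer or in database triggers.

```python
import copy
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditEntry:
    actor: str
    table: str
    key: str
    before: object  # None means the row did not exist before this write
    after: object
    at: str


@dataclass
class AuditedStore:
    tables: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, actor, table, key, value):
        """Apply a write and record who did it plus the before/after values."""
        rows = self.tables.setdefault(table, {})
        before = copy.deepcopy(rows.get(key))
        rows[key] = value
        stamp = datetime.now(timezone.utc).isoformat()
        self.log.append(AuditEntry(actor, table, key, before, copy.deepcopy(value), stamp))

    def touched_by(self, actor):
        """Answer 'what exactly did this actor touch?'"""
        return sorted({(e.table, e.key) for e in self.log if e.actor == actor})

    def undo(self, actor):
        """Revert the actor's writes, newest first. Returns the number reverted."""
        mine = [e for e in self.log if e.actor == actor]
        for entry in reversed(mine):
            rows = self.tables[entry.table]
            if entry.before is None:
                rows.pop(entry.key, None)
            else:
                rows[entry.key] = copy.deepcopy(entry.before)
        self.log = [e for e in self.log if e.actor != actor]
        return len(mine)


store = AuditedStore()
store.write("human:ops", "inventory", "sku-1", {"qty": 10})
store.write("agent:support-bot", "customers", "c-42", {"email": "new@example.com"})
store.write("agent:support-bot", "inventory", "sku-1", {"qty": 0})

touched = store.touched_by("agent:support-bot")
reverted = store.undo("agent:support-bot")
print(touched, reverted)
```

One caveat I'd flag: this sketch ignores writes other actors made to the same rows after the agent did, so a real undo needs conflict handling. But even this much beats "restore from backup and hope."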

The weird part is we solved these problems years ago for human users. We have approval workflows, change logs, rollback procedures. But throw an LLM in the mix and suddenly everyone forgets the fundamentals.

I've started treating AI agents like junior developers with production access. Same rules apply. Start with read-only permissions. Make them earn write access one table at a time. Log everything.
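In code, those rules can be as small as a gate that sits in front of every agent call. This is a minimal sketch under my own assumptions: `AgentAccess`, `grant_write`, and `check` are hypothetical names, and in practice you'd enforce the same thing in the database or IAM layer as well, not only in application code.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent.access")


class AgentAccess:
    """Read-only by default; write access is granted one table at a time."""

    def __init__(self, agent_id):
        self.agent_id = agent_id
        self.writable = set()

    def grant_write(self, table, approved_by):
        """Record an explicit, attributable grant for a single table."""
        self.writable.add(table)
        log.info("grant agent=%s table=%s approved_by=%s", self.agent_id, table, approved_by)

    def check(self, action, table):
        """Log every attempt, allowed or not, then raise if it is out of bounds."""
        allowed = action == "read" or (action == "write" and table in self.writable)
        log.info("access agent=%s action=%s table=%s allowed=%s",
                 self.agent_id, action, table, allowed)
        if not allowed:
            raise PermissionError(f"{self.agent_id} may not {action} {table}")


access = AgentAccess("support-bot")
access.check("read", "inventory")                      # reads are allowed from day one
access.grant_write("customers", approved_by="alice")   # earned, and attributable
access.check("write", "customers")

try:
    access.check("write", "inventory")                 # never granted
    blocked = False
except PermissionError:
    blocked = True
```

The denied attempt still lands in the log, which is the part people skip. A blocked write to a table the agent was never supposed to touch is exactly the signal you want to see early.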

Because when your AI goes rogue at 3am, you won't care about its confidence scores or temperature settings. You'll care about whether you can fix the damage before customers notice.

What does your disaster recovery plan look like? Or are you still hoping the model behaves itself?