I spent an entire day trying to make AI-generated...

## AI Detection Is Broken

So I spent a week trying to outsmart AI detectors. Threw everything at them. Claude Opus with fancy prompts, fine-tuned Llama with LoRA on 400K tokens, BART as a rewriter, even AuthorMist, which is supposedly trained against Originality.ai.

Ten different methods. Every single one scored 100% AI.

Then I wrote a paragraph myself about watching the moon launch on TV with my family. Clean grammar, proper sentences, the whole thing. No AI involved at all.

98% AI.

You know what finally worked? I rewrote some AI blog post by hand but made it sloppy. Run-on sentences, missing apostrophes, fragments everywhere. Like "got an answer, check" instead of proper English.

96% human.

## The Real Problem

These detectors aren't measuring whether AI wrote something. They're measuring whether you write the way AI was trained to write.

Professional writers are screwed. You spend years learning to write clean, organized prose. You edit your work. You fix your grammar. And now some detector flags you as a robot because you know how to use a semicolon properly.

I'm building a writing assistant that learns how you write, and I wanted to use detection scores as a quality metric. But if polished human writing scores as AI, what exactly are we measuring here?

Think about it. Every well-edited article, every professional blog post, every piece of formal writing now looks suspicious. Not because a machine wrote it, but because it doesn't have enough typos.

## New Research Makes It Worse

Remember that 2023 Stanford study showing that detectors flag non-native English speakers? Turns out that was optimistic. A February 2026 paper from UC San Diego researchers broke every major detector with 99.9% success. StealthRL, they called it. It used reinforcement learning to generate text that slipped right past detection.

But you know what's funny? The text quality scored 2.59 out of 5. Garbage prose. Total slop. Yet detectors couldn't spot it.

So we've got tools that can't catch bad AI writing but flag good human writing. Australian Catholic University found this out the hard way. Flagged 6,000 students in 2024. Most were false positives. Students who just wrote well, or spoke English as a second language, or had a consistent style. They scrapped the whole system in 2025.

## Universities Are Giving Up

Yale dropped AI detection. MIT dropped it. Cambridge, Johns Hopkins, Vanderbilt. UC San Diego ironically dropped theirs right before their own researchers broke everyone else's. Berkeley too. At least 12 elite schools have thrown in the towel.

Curtin University in Australia disabled Turnitin's detection in January 2026. Their statement was diplomatic but you could read between the lines: this doesn't work and we're tired of the drama.

Stanford's latest research shows why. False positive rates hit 61% for ESL writers. You write too cleanly? AI. You have consistent style? AI. You avoid grammar mistakes? Definitely AI.
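To see what a false positive rate like that does in practice, here's a rough back-of-the-envelope sketch. It isn't from the Stanford work. The 61% is the figure quoted above; the detector's catch rate and the share of essays that are actually AI-written are numbers I made up for illustration, and the function name is mine.

```python
def share_of_flags_that_are_false(false_positive_rate, true_positive_rate, ai_share):
    """Fraction of flagged essays that were actually written by humans."""
    human_share = 1.0 - ai_share
    false_flags = false_positive_rate * human_share  # humans wrongly flagged
    true_flags = true_positive_rate * ai_share       # AI text correctly flagged
    total_flags = false_flags + true_flags
    if total_flags == 0:
        return 0.0
    return false_flags / total_flags


# 0.61 is the ESL false positive rate quoted above.
# 1.0 (a perfect catch rate) and 0.20 (one in five essays is AI) are
# hypothetical values, chosen to be generous to the detector.
result = share_of_flags_that_are_false(0.61, 1.0, 0.20)
print(f"{result:.0%} of flags land on human writers")
# prints "71% of flags land on human writers"
```

Even with a detector that catches every AI essay, most of the flags in this made-up classroom hit people who wrote their own work.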

## The Research Pivot Nobody's Talking About

Academic papers on AI detection have taken a weird turn. They're not trying to detect AI anymore. They're trying to figure out which AI wrote something.

FAID at EACL 2026 does "LLM-family attribution." Not "is this AI?" but "which model family?" Per-LLM fine-tuned detectors hit 99.6% accuracy when you know exactly which model and version you're looking for. Great for forensics. Useless for general detection.

The field has basically admitted defeat on binary detection. You can't reliably tell human from AI. But you might be able to tell GPT-4 from Claude from Llama. If you squint. And know what you're looking for.

## Voice As Identity

I keep thinking about this shift from detection to attribution. We spent years asking "is this real?" when we should have asked "who wrote this?"

Voice fingerprinting feels like the actual answer. Not trying to catch fakes but proving authenticity. Like those old letters with wax seals. You know who wrote it because you recognize how they write.
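For a feel of what a voice fingerprint could look like, here's a minimal sketch using an old stylometry trick: count how often someone uses small function words, then compare two texts by cosine similarity. This is my assumption about one possible approach, not how any real product or paper does it. The word list, `fingerprint`, and `similarity` are all hypothetical names picked for the example.

```python
import math
import re
from collections import Counter

# Hypothetical, hand-picked list. A real fingerprint would need far more
# signals (punctuation habits, sentence length, favorite phrases).
FUNCTION_WORDS = ["the", "a", "and", "but", "of", "to", "in", "that",
                  "it", "i", "you", "is", "was", "not", "so"]


def fingerprint(text):
    """Relative frequency of each function word in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty text
    return [counts[word] / total for word in FUNCTION_WORDS]


def similarity(a, b):
    """Cosine similarity between two fingerprints (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


known = fingerprint("I wrote the thing and it was not so bad, but you know that.")
candidate = fingerprint("So I rewrote it, and the draft was not that bad.")
print(round(similarity(known, candidate), 2))
```

The point of the design is the direction of the question: you compare a new text against samples you already know are yours, instead of asking a generic detector whether it "looks like AI."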

You've probably noticed AI writing even when detectors miss it. That weird formal tone. The hedging. The way it explains things you already know. We don't need software to tell us. We need software that helps us sound like ourselves.

Binary detection is broken. The universities know it. The researchers know it. The institutions gave up because they realized they were solving the wrong problem. The problem isn't detection. It's attribution. It's authenticity. It's helping people maintain their voice while using these tools.

That moon launch paragraph is still scoring 98% AI-generated, by the way. I wrote every word.