arXiv AI Moderation: The Detection Tools That Actually Work (and Those That Don't)

## The Tool Stack

arXiv now uses automated detection tools to filter AI-generated submissions. The question is whether they work.

```mermaid
flowchart LR
    A[Submitted paper] --> B{Plagiarism check}
    B -->|Pass| C{AI generation check}
    B -->|Fail| R[Reject]
    C -->|Low score| D[Human review]
    C -->|High score| R
    D --> E{Reviewer decision}
    E -->|Accept| P[Published]
    E -->|Reject| R
```
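
The routing in the flowchart can be sketched in a few lines of Python. This is only an illustration of the diagram, not arXiv's actual implementation: the `Submission` fields, the `route` function, and the `0.5` cutoff are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    REJECT = "reject"
    HUMAN_REVIEW = "human_review"


@dataclass
class Submission:
    plagiarism_pass: bool  # result of the plagiarism check
    ai_score: float        # hypothetical AI-generation score in [0, 1]


# Assumed cutoff between "low score" and "high score"; the source gives none.
AI_SCORE_THRESHOLD = 0.5


def route(sub: Submission, threshold: float = AI_SCORE_THRESHOLD) -> Outcome:
    """Follow the automated part of the flowchart for one submission."""
    if not sub.plagiarism_pass:
        return Outcome.REJECT          # plagiarism check failed
    if sub.ai_score >= threshold:
        return Outcome.REJECT          # high AI-generation score
    return Outcome.HUMAN_REVIEW        # low score goes to a human reviewer


if __name__ == "__main__":
    print(route(Submission(plagiarism_pass=True, ai_score=0.2)))
    print(route(Submission(plagiarism_pass=True, ai_score=0.9)))
```

Note that in this flow a high AI score rejects a paper before any human sees it, so the false positive rate of the automated check matters directly.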

## What Actually Detects AI Text

| Method | Accuracy | False Positive Rate |
|--------|---------|-------------------|
| Perplexity analysis | 72% | 8% |
| Burstiness pattern | 68% | 12% |
| Reference hallucination check | 85% | 2% |
| Writing style fingerprinting | 60% | 18% |
| Combined ensemble | 89% | 5% |

The most reliable detector checks whether the cited references actually exist. A hallucinated reference fails a lookup, and LLM-hallucinated references also follow statistical patterns that are easy to detect. The least reliable is writing style fingerprinting, because good AI writing and good human writing are converging.
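
A reference existence check might look roughly like the sketch below. It assumes references carry DOIs and that a lookup function is available; `extract_dois`, `unresolved_fraction`, and the `doi_exists` callback are hypothetical names, and the DOIs in the usage example are placeholders. In practice the callback would query a bibliographic database, which is left out here so the example runs offline.

```python
import re
from typing import Callable, Iterable, List

# Loose DOI pattern: "10.", a registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(reference_list: Iterable[str]) -> List[str]:
    """Pull the first DOI-looking string out of each reference entry."""
    dois = []
    for entry in reference_list:
        match = DOI_PATTERN.search(entry)
        if match:
            # Drop trailing punctuation that belongs to the sentence, not the DOI.
            dois.append(match.group(0).rstrip(".,;"))
    return dois


def unresolved_fraction(
    reference_list: Iterable[str],
    doi_exists: Callable[[str], bool],
) -> float:
    """Fraction of extracted DOIs that the lookup cannot find."""
    dois = extract_dois(reference_list)
    if not dois:
        return 0.0  # nothing checkable; this sketch does not flag such papers
    missing = sum(1 for doi in dois if not doi_exists(doi))
    return missing / len(dois)


if __name__ == "__main__":
    # Placeholder data: one DOI is "known", the other is not.
    known = {"10.1000/real.1"}
    refs = [
        "A. Author. A real paper. doi:10.1000/real.1.",
        "B. Author. A made-up paper. https://doi.org/10.9999/fake.2",
    ]
    print(unresolved_fraction(refs, lambda doi: doi in known))
```

Passing the lookup in as a function keeps the check independent of any particular database, and makes it easy to test with a fixed set of known identifiers.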

## The Tool arXiv Is Missing

Semantic coherence analysis: does the paper's argument actually make sense? Current tools detect surface patterns. They cannot detect whether a paper's logic is self-consistent. The next generation of detection tools needs to evaluate meaning, not just text.
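
To see why a surface statistic says nothing about logic, consider a minimal burstiness sketch. It uses one common definition, the coefficient of variation of sentence length, which is an assumption here and not necessarily what any production detector computes; `sentence_lengths` and `burstiness` are illustrative names.

```python
import re
import statistics
from typing import List


def sentence_lengths(text: str) -> List[int]:
    """Word count of each sentence, using a naive punctuation split."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Standard deviation of sentence length divided by its mean."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    return statistics.pstdev(lengths) / statistics.mean(lengths)


if __name__ == "__main__":
    argument = (
        "All submissions are screened. Screening removes low-quality papers "
        "before any reviewer sees them. Therefore reviewers see fewer papers."
    )
    # Same sentences in reverse order: the conclusion now precedes its premises.
    scrambled = (
        "Therefore reviewers see fewer papers. Screening removes low-quality "
        "papers before any reviewer sees them. All submissions are screened."
    )
    print(burstiness(argument), burstiness(scrambled))
```

Both texts get the same score, because the statistic depends only on the set of sentence lengths and not on whether the sentences follow from one another.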
