An investigation into the reported dramatic increase in arXiv hep-th (High Energy Physics - Theory) submissions, originally noted in a Not Even Wrong blog post by Peter Woit.
Peter Woit reported that arXiv hep-th submissions had roughly doubled starting around December 2025, based on arXiv advanced search counts for the following periods:
| Period | 2022 | 2023 | 2024 | 2025 | 2026 |
|---|---|---|---|---|---|
| Dec 1-31 | 634 | 684 | 780 | 1192 | - |
| Jan 1 - Feb 1 | 583 | 531 | 626 | 659 | 1137 |
| Feb 1-15 | 299 | 266 | 271 | 333 | 581 |
We verified every one of these numbers by reproducing his exact search. However, further analysis reveals the spike is overwhelmingly driven by paper revisions/replacements, not new research output.
The arXiv advanced search has a date filter with three options:
- "Submission date (most recent)" (`submitted_date`) -- counts a paper based on when its latest version was uploaded
- "Submission date (original)" (`submitted_date_first`) -- counts a paper based on when v1 was first submitted
- "Announcement date" -- when v1 was announced
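In a search URL, the first two options differ by a single query parameter. A minimal sketch of how such a URL could be built is shown here. It assumes the query-string field names seen in arXiv advanced search URLs (`date-date_type`, `date-from_date`, `date-to_date`, `classification-physics_archives`). Only the `submitted_date` / `submitted_date_first` values and the `classification-include_cross_list=include` flag appear in this README. The helper name `build_search_url` is hypothetical, the sketch covers physics archives only, and the scripts in this repository may construct their requests differently.

```python
from urllib.parse import urlencode

BASE_URL = "https://arxiv.org/search/advanced"


def build_search_url(archive: str, date_from: str, date_to: str, date_type: str) -> str:
    """Build an advanced-search URL for one physics archive and date window.

    Field names other than the date_type values and the cross-list flag
    are assumptions, not taken from the repository's scripts.
    """
    params = {
        "advanced": "",
        "terms-0-term": "",
        "classification-physics": "y",
        "classification-physics_archives": archive,
        "classification-include_cross_list": "include",
        "date-filter_by": "date_range",
        "date-from_date": date_from,
        "date-to_date": date_to,
        "date-date_type": date_type,  # the only field that changes between metrics
        "size": "50",
    }
    return BASE_URL + "?" + urlencode(params)


# Same archive and window, two different date semantics.
most_recent_url = build_search_url("hep-th", "2025-12-01", "2025-12-31", "submitted_date")
original_url = build_search_url("hep-th", "2025-12-01", "2025-12-31", "submitted_date_first")

print(most_recent_url)
print(original_url)
```

Keeping everything except `date_type` fixed makes it easy to confirm that any difference in counts comes from the date semantics alone.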
Woit used option 1 ("most recent"). This means a paper originally submitted in 2020 that gets a revised version uploaded in December 2025 is counted as a December 2025 submission. When we re-run the same searches using option 2 ("original"), the dramatic spike largely disappears:
| Metric (Dec 1-31) | 2022 | 2023 | 2024 | 2025 | YoY change (2024 to 2025) |
|---|---|---|---|---|---|
| Most recent (Woit's) | 634 | 686 | 780 | 1192 | +53% |
| Original only | 800 | 811 | 815 | 855 | +5% |
The "doubling" is almost entirely a revision surge.
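The year-over-year figures in the table follow from simple arithmetic on the 2024 and 2025 columns. As a quick check (the function name `yoy_change` is ours, for illustration only):

```python
def yoy_change(previous: int, current: int) -> float:
    """Return the relative change from previous to current, in percent."""
    return (current - previous) / previous * 100.0


# December hep-th counts for 2024 and 2025, taken from the table above.
most_recent = yoy_change(780, 1192)
original_only = yoy_change(815, 855)

print(f"most recent: {most_recent:+.0f}%")   # +53%
print(f"original:    {original_only:+.0f}%")  # +5%
```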
We extended the analysis to four arXiv categories from January 2018 through February 2026:
- hep-th (High Energy Physics - Theory)
- hep-ex (High Energy Physics - Experiment)
- hep-ph (High Energy Physics - Phenomenology)
- cs.AI (Computer Science - Artificial Intelligence)
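Covering January 2018 through February 2026 means one query window per calendar month for each category and metric. A small sketch of how those windows could be enumerated is shown here. The helper name `month_windows` is hypothetical, and whether the end date is passed to the search form as inclusive or exclusive is an assumption that the fetch scripts may handle differently.

```python
from datetime import date


def month_windows(start: date, end: date):
    """Yield (first_day, first_day_of_next_month) for each month from start to end."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        # Roll over to January of the next year after December.
        next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
        yield date(year, month, 1), date(next_year, next_month, 1)
        year, month = next_year, next_month


windows = list(month_windows(date(2018, 1, 1), date(2026, 2, 1)))
print(len(windows))  # 98 monthly windows
print(windows[0], windows[-1])
```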
All four categories show the same pattern: a sudden explosion of the "most recent" count diverging from the "original" count starting around mid-2025.
The two metrics tracked each other closely from 2018 to mid-2025, then diverged sharply.
The blue area (new papers) grows slowly. The red area (replacements) suddenly explodes in 2025.
Left panel uses Woit's metric (showing a dramatic spike in 2025). Right panel uses original submission date (showing normal growth).
The red/blue divergence is replicated across all four categories.
The replacement surge is visible in every category.
Top panel (Woit's metric): physics categories appear to double; cs.AI appears to grow 5x. Bottom panel (original submissions only): physics categories grow ~20-40% over 8 years; cs.AI genuinely grows ~3.7x.
The "smoking gun". From 2018-2024, the replacement excess was ~0% +/- 10% across all categories. Starting mid-2025, all four categories simultaneously spike to +30-60%.
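One way to quantify a "replacement excess" is to take the ratio of the "most recent" count to the "original" count and express it relative to its own pre-2025 baseline. The sketch below works under that assumption and is not necessarily the exact normalization used for the plot. It uses the December hep-th counts from the comparison table; the function name and the choice of baseline years are ours.

```python
def replacement_excess(most_recent, original_only, baseline_years):
    """Percent deviation of most_recent/original_only from its baseline mean.

    The baseline-normalized definition is an assumption made for this sketch.
    """
    ratios = {year: most_recent[year] / original_only[year] for year in most_recent}
    baseline = sum(ratios[year] for year in baseline_years) / len(baseline_years)
    return {year: (ratio / baseline - 1.0) * 100.0 for year, ratio in ratios.items()}


# December hep-th counts from the "most recent" vs "original" comparison table.
most_recent = {2022: 634, 2023: 686, 2024: 780, 2025: 1192}
original_only = {2022: 800, 2023: 811, 2024: 815, 2025: 855}

excess = replacement_excess(most_recent, original_only, baseline_years=[2022, 2023, 2024])
for year, value in sorted(excess.items()):
    print(year, f"{value:+.1f}%")
```

With these four December data points, the baseline years stay within roughly 11% of zero while 2025 comes out near +61%.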
Monthly submission counts for each category are in the data/ directory:
| File | Category |
|---|---|
| `data/arxiv_hepth_monthly.csv` | hep-th |
| `data/arxiv_hep-ex_monthly.csv` | hep-ex |
| `data/arxiv_hep-ph_monthly.csv` | hep-ph |
| `data/arxiv_cs_AI_monthly.csv` | cs.AI |
Each CSV has columns:
- `year`, `month` -- the time period
- `most_recent` -- count using "Submission date (most recent)"
- `original_only` -- count using "Submission date (original)"
All searches include cross-listed papers (classification-include_cross_list=include).
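A minimal sketch of reading one of these files with the standard library follows. The sample rows reuse the December hep-th counts quoted earlier and assume months are stored as integers; the actual files may format the `month` column differently.

```python
import csv
import io

# Stand-in for open("data/arxiv_hepth_monthly.csv"); same column names.
sample = io.StringIO(
    "year,month,most_recent,original_only\n"
    "2024,12,780,815\n"
    "2025,12,1192,855\n"
)

# Convert every field to int so the counts can be compared numerically.
rows = [
    {key: int(value) for key, value in row.items()}
    for row in csv.DictReader(sample)
]

# Months where the "most recent" count exceeds the "original" count.
surge_months = [
    (row["year"], row["month"])
    for row in rows
    if row["most_recent"] > row["original_only"]
]
print(surge_months)  # [(2025, 12)]
```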
Scripts used to collect the data and generate plots are in scripts/:

    # Fetch hep-th data (takes ~10 min due to rate limiting)
    python3 scripts/fetch_arxiv_data.py
    # Fetch hep-ex, hep-ph, cs.AI data (takes ~30 min)
    python3 scripts/fetch_arxiv_multi.py
    # Generate hep-th plots
    python3 scripts/plot_arxiv.py
    # Generate cross-category comparison plots
    python3 scripts/plot_all_categories.py

Requirements: python3, requests, matplotlib
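The fetch run times reflect pauses between requests. A sketch of that kind of throttled loop is shown here, with the fetch function and the sleep call injected so it can be exercised without network access. The 3-second default and all names are illustrative assumptions, not the values used in `scripts/`.

```python
import time
from typing import Callable, Dict, Iterable


def fetch_throttled(
    urls: Iterable[str],
    fetch: Callable[[str], int],
    delay_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, int]:
    """Call fetch(url) for each URL, sleeping delay_s between consecutive calls."""
    counts: Dict[str, int] = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_s)  # pause before every request except the first
        counts[url] = fetch(url)
    return counts


# Dry run with a fake fetcher and a recording sleep, so nothing hits the network.
pauses = []
demo = fetch_throttled(["q1", "q2", "q3"], fetch=len, delay_s=3.0, sleep=pauses.append)
print(demo, pauses)
```

In real use, `fetch` would issue the HTTP request (for example with `requests`) and return the result count for that query.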
| Claim | Reality |
|---|---|
| hep-th submissions doubled in late 2025 | Using "most recent submission date": yes, confirmed |
| This represents a surge in new research | No. Using "original submission date", new papers grew ~5% YoY |
| The spike is specific to hep-th | No. All four categories we examined (hep-th, hep-ex, hep-ph, cs.AI) show the same pattern |
| Something changed around mid-2025 | Yes. A surge in paper revisions/replacements began, affecting all four examined categories simultaneously, which suggests a platform-wide effect |
The most likely explanation is a systemic change in revision behavior across arXiv -- possibly related to LLM-assisted bulk revision of existing papers.
Analysis conducted on February 24, 2026.






