Update arxiv link and citation (arXiv:2603.24511)

Assisted-by: Claude <noreply@anthropic.com>
2026-07-05 22:27:48 +02:00 · 2026-03-26 10:05:29 +00:00
parent 63974ddfee
commit 4c938fd325
2 changed files with 34 additions and 13 deletions
@@ -22,10 +22,28 @@ authors:
    given-names: Maksym
    affiliation: "ELLIS Institute Tübingen & Max Planck Institute for Intelligent Systems; Tübingen AI Center"
 license: Apache-2.0
-# TODO: update after arxiv publication
-# preferred-citation:
-#   type: article
-#   authors: <same as above>
-#   title: "Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs"
-#   year: 2026
-#   url: "https://arxiv.org/abs/XXXX.XXXXX"
+preferred-citation:
+  type: article
+  authors:
+    - family-names: Panfilov
+      given-names: Alexander
+      affiliation: "MATS; ELLIS Institute Tübingen & Max Planck Institute for Intelligent Systems; Tübingen AI Center"
+    - family-names: Romov
+      given-names: Peter
+      affiliation: "Imperial College London"
+    - family-names: Shilov
+      given-names: Igor
+      affiliation: "Imperial College London"
+    - family-names: de Montjoye
+      given-names: Yves-Alexandre
+      affiliation: "Imperial College London"
+    - family-names: Geiping
+      given-names: Jonas
+      affiliation: "ELLIS Institute Tübingen & Max Planck Institute for Intelligent Systems; Tübingen AI Center"
+    - family-names: Andriushchenko
+      given-names: Maksym
+      affiliation: "ELLIS Institute Tübingen & Max Planck Institute for Intelligent Systems; Tübingen AI Center"
+  title: "Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs"
+  year: 2026
+  url: "https://arxiv.org/abs/2603.24511"
+  doi: "10.48550/arXiv.2603.24511"
@@ -2,7 +2,7 @@

 **Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs**

-[![arXiv](https://img.shields.io/badge/arXiv-XXXX.XXXXX-b31b1b.svg?logo=arxiv)](https://arxiv.org/abs/XXXX.XXXXX)
+[![arXiv](https://img.shields.io/badge/arXiv-2603.24511-b31b1b.svg?logo=arxiv)](https://arxiv.org/abs/2603.24511)

 <p align="center">
  <img src="assets/teaser.png" width="90%" alt="Claude autoresearch vs Optuna hyperparameter search: best train and validation loss over trials">
@@ -10,7 +10,7 @@

 We show that an *[autoresearch](https://github.com/karpathy/autoresearch)*-style pipeline powered by Claude Code discovers novel white-box adversarial attack *algorithms* that **significantly outperform** all existing [methods](claudini/methods/original/README.md) in jailbreaking and prompt injection evaluations.

-This official code repository contains a demo autoresearch pipeline, the Claude-discovered methods from the paper, baseline implementations, and the evaluation benchmark. Read our [paper](https://arxiv.org/abs/XXXX.XXXXX) and consider [citing us](#citation) if you find this useful.
+This official code repository contains a demo autoresearch pipeline, the Claude-discovered methods from the paper, baseline implementations, and the evaluation benchmark. Read our [paper](https://arxiv.org/abs/2603.24511) and consider [citing us](#citation) if you find this useful.

 ## Setup

@@ -75,9 +75,12 @@ See [`CLAUDE.md`](CLAUDE.md) for how to implement a new method.

 ```bibtex
@article{panfilov2026claudini,
-  title={Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs},
-  author={Panfilov, Alexander and Romov, Peter and Shilov, Igor and de Montjoye, Yves-Alexandre and Geiping, Jonas and Andriushchenko, Maksym},
-  journal={arXiv preprint arXiv:XXXX.XXXXX},
-  year={2026}
+  title = {Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs},
+  author = {Alexander Panfilov and Peter Romov and Igor Shilov and Yves-Alexandre de Montjoye and Jonas Geiping and Maksym Andriushchenko},
+  journal = {arXiv preprint},
+  eprint = {2603.24511},
+  archivePrefix = {arXiv},
+  year = {2026},
+  url = {https://arxiv.org/abs/2603.24511},
 }
 ```