diff --git a/README.md b/README.md
index 697c436..3105dc1 100644
--- a/README.md
+++ b/README.md
@@ -5,7 +5,7 @@
[![arXiv](https://img.shields.io/badge/arXiv-2603.24511-b31b1b.svg?logo=arxiv)](https://arxiv.org/abs/2603.24511)

- Claude autoresearch vs Optuna hyperparameter search: best train and validation loss over trials + Pareto front with evolution annotations

We show that an *[autoresearch](https://github.com/karpathy/autoresearch)*-style pipeline powered by Claude Code discovers novel white-box adversarial attack *algorithms* that **significantly outperform** all existing [methods](claudini/methods/original/README.md) in jailbreaking and prompt injection evaluations.
diff --git a/assets/pareto_evolution_small.png b/assets/pareto_evolution_small.png
new file mode 100644
index 0000000..35127a5
Binary files /dev/null and b/assets/pareto_evolution_small.png differ
diff --git a/assets/teaser.png b/assets/teaser.png
deleted file mode 100644
index 49c479d..0000000
Binary files a/assets/teaser.png and /dev/null differ