# Claudini — Developer Guide
## Package manager
Use [uv](https://docs.astral.sh/uv/) for everything. `uv run -m ...` to execute, `uv add` to install dependencies. Never use `pip install`.
## Running benchmarks
```bash
# Run a preset (all methods, all samples)
uv run -m claudini.run_bench random_valid
# Override method, sample, seed, budget from CLI
uv run -m claudini.run_bench random_valid --method gcg,acg --sample 0 --seed 0
uv run -m claudini.run_bench safeguard_valid --method gcg --max-flops 1e15
# Smoke test (use /tmp to avoid polluting results/)
uv run -m claudini.run_bench random_train --method gcg --sample 0 --max-flops 1e12 --results-dir /tmp/smoke
```
Results are saved to `results/<method>/<preset_name>/<model>/sample_<S>_seed_<N>.json`. Existing results are skipped automatically.
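For instance, `gcg` on preset `random_valid` with sample 0 and seed 0 lands at a path like the one below (the exact form of the model directory name is an assumption):
```
results/gcg/random_valid/Qwen2.5-7B-Instruct/sample_0_seed_0.json
```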
## Config files
Preset YAML files in `configs/` define the full experiment specification:
```yaml
model: Qwen/Qwen2.5-7B-Instruct
optim_length: 15
max_flops: 1.0e+17
dtype: bfloat16
samples: [0, 1, 2, 3, 4]
seeds: [0]
input_spec:
  source: { type: random, query_len: 0, target_len: 10 }
  layout: { type: suffix }
  init: { type: random }
method_kwargs: # optional per-method hyperparameter overrides
  acg: { num_coords: 3 }
```
CLI flags override any preset value.
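For example, the override below trims the preset's `max_flops: 1.0e+17` down to `1e15` for a quick run:
```bash
# --max-flops on the CLI wins over max_flops in the preset YAML
uv run -m claudini.run_bench random_valid --max-flops 1e15
```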
## Adding a new method
1. Create a directory under `claudini/methods/` (e.g. `claudini/methods/claude_random/v125/`)
2. Add `optimizer.py` and `__init__.py`
3. Subclass `TokenOptimizer` (from `claudini.base`), set `method_name` as a class variable:
```python
from claudini.methods.original.gcg import GCGOptimizer

class MyOptimizer(GCGOptimizer):
    method_name = "claude_random_v125"

    def setup(self, prompt, target):
        super().setup(prompt, target)
        # custom setup

    def step(self, step_num):
        # must call self.flop_counter for every model pass
        self.flop_counter.count_forward(n_tokens)
        # ...
        return (discrete_loss, soft_loss_or_None, suffix_string)
```
4. In `__init__.py`: `from .optimizer import MyOptimizer`
5. Run: `uv run -m claudini.run_bench random_train --method claude_random_v125`
**Important:**
- Registration is automatic via `__init_subclass__` — no manual registry edits needed (see the sketch after this list).
- Do NOT set `method_name` on intermediate/abstract base classes — it silently overwrites the parent.
- Call `self.flop_counter.count_forward(n_tokens)` / `count_backward(n_tokens)` / `count_forward_backward(n_tokens)` for every model pass. This is how the FLOP budget works.
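A minimal sketch of what `__init_subclass__`-based registration typically looks like (the actual `claudini.base` code may differ; `REGISTRY` is a hypothetical name). It also shows why a `method_name` on an abstract base is dangerous: subclasses inherit it and silently overwrite the same registry slot.
```python
from abc import ABC
from typing import ClassVar

REGISTRY: dict[str, type] = {}  # hypothetical; the real registry lives in claudini.base

class TokenOptimizer(ABC):
    method_name: ClassVar[str]

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Registers any subclass that has a method_name, own OR inherited.
        # If an intermediate base sets one, every child without its own name
        # re-registers under the parent's name, overwriting that entry.
        if getattr(cls, "method_name", None):
            REGISTRY[cls.method_name] = cls
```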
## TokenOptimizer interface
```python
class TokenOptimizer(ABC):
    method_name: ClassVar[str]  # unique name, triggers auto-registration

    def setup(self, prompt, target):  # tokenize, embed, prepare state
        ...

    def step(self, step_num) -> tuple[float, float | None, str]:
        ...  # (discrete_loss, soft_loss, suffix_string)

    def log(self, key, value, prog_bar=False):
        ...  # per-step metrics
```
The `run()` method drives the loop: it calls `setup()` once, then `step()` repeatedly until the `max_flops` budget is exhausted or `num_steps` is reached.
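Conceptually (a sketch, not the real implementation; attribute names such as `flop_counter.total` are assumptions):
```python
def run(self):
    self.setup(self.prompt, self.target)
    step_num = 0
    # stop once the FLOP budget is spent or the step limit is hit
    while self.flop_counter.total < self.max_flops and step_num < self.num_steps:
        discrete_loss, soft_loss, suffix = self.step(step_num)
        step_num += 1
```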
## FLOP counting
FLOP estimates follow Kaplan et al. (2020): `FLOPs_fwd = 2 * N_params * n_tokens`, `FLOPs_bwd = 4 * N_params * n_tokens`, where `N_params` excludes embeddings. Methods must call the counter explicitly.
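A worked example of the budget arithmetic (parameter and token counts are illustrative):
```python
n_params = 7e9   # non-embedding parameters of a ~7B model
n_tokens = 512   # tokens processed in one model pass

flops_fwd = 2 * n_params * n_tokens    # ≈ 7.2e12
flops_step = 6 * n_params * n_tokens   # one fwd + bwd pass ≈ 2.15e13

budget = 1e17                          # max_flops from the preset
n_passes = budget / flops_step         # ≈ 4,650 fwd+bwd passes fit the budget
```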
## Logging in methods
| Situation | Use |
|---|---|
| Per-step metric for results JSON | `self.log("key", value)` |
| Key diagnostic on tqdm bar | `self.log("key", value, prog_bar=True)` |
| One-time setup message | `logger.info(...)` in `setup()` |
| Rare event (OOM, budget hit) | `logger.info(...)` in `step()` |
Never call `logger.info` inside the `step()` hot loop — it causes tqdm visual glitches.
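A typical pattern inside `step()` (key names are illustrative):
```python
def step(self, step_num):
    ...
    self.log("loss", discrete_loss)             # recorded in the results JSON
    self.log("best_loss", best, prog_bar=True)  # also shown live on the tqdm bar
    ...
```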
## Code formatting
```bash
uv run ruff format .
uv run ruff check --fix .
```
Line length 120. Always run before committing.
## Autoresearch skill
The `/claudini` skill (`.claude/skills/claudini/SKILL.md`) drives the autoresearch loop. It takes a **run code** and a **goal** as positional arguments:
```
/claudini <run_code> <goal>
```
The run code determines: method directory (`claudini/methods/claude_<run_code>/`), method name prefix (`claude_<run_code>_v`), git branch (`loop/<run_code>`), and agent log location.
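For example, with a hypothetical run code `myloop`:
```
/claudini myloop "beat GCG on random_valid within the FLOP budget"
```
This maps to the method directory `claudini/methods/claude_myloop/`, method names `claude_myloop_v1`, `claude_myloop_v2`, ..., and the git branch `loop/myloop`.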
**Keep the skill in sync with the framework.** When you change the `TokenOptimizer` interface, config format, CLI flags, or project structure, update the skill prompt accordingly.
## Rules
- **Never save smoke test results to `results/`** — use `--results-dir /tmp/smoke`
- Use `dtype=` (not `torch_dtype=`) in `from_pretrained()` calls
- Fair comparison: all methods share the same vocabulary filter and model within a preset