Files
claudini/configs/random_optuna_valid.yaml
Peter Romov 59106bdf3c SecAlign-70B support: configs, quantization, multi-GPU (#1)
- **PEFT adapter merging.** `model_loader.py` auto-detects PEFT adapters (e.g. `facebook/Meta-SecAlign-8B`), merges on CPU in bf16, and caches the merged model to disk. No config flags needed.
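The detect-then-merge flow above can be sketched roughly as below. The helper name and the adapter-detection heuristic are hypothetical (not taken from `model_loader.py`); the merge calls follow the public `peft`/`transformers` APIs.

```python
from pathlib import Path

def looks_like_peft_adapter(model_dir):
    """Hypothetical detection heuristic: a PEFT adapter checkout ships an
    adapter_config.json instead of a full model config + weights."""
    return (Path(model_dir) / "adapter_config.json").is_file()

# Merge sketch, assuming `peft` and `transformers` are installed:
#
#   import torch
#   from transformers import AutoModelForCausalLM
#   from peft import PeftModel
#
#   base = AutoModelForCausalLM.from_pretrained(
#       base_id, torch_dtype=torch.bfloat16, device_map="cpu")
#   merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()
#   merged.save_pretrained(cache_dir)  # cached so the merge runs only once
```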

- **Configurable quantization.** `quantization:` field in YAML or `--quantization` on CLI, accepting `nf4`, `fp4`, or `int8`. Replaces the old `load_in_4bit` boolean.
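A minimal sketch of how the three mode strings might map onto `from_pretrained` keyword arguments (hypothetical helper; the kwarg names follow `transformers`' `BitsAndBytesConfig` fields):

```python
def quantization_kwargs(mode):
    """Translate the YAML `quantization` value into loading kwargs.
    `nf4` / `fp4` select a 4-bit quant type; `int8` loads in 8-bit;
    None (field absent) means full precision."""
    if mode is None:
        return {}
    if mode in ("nf4", "fp4"):
        return {"load_in_4bit": True, "bnb_4bit_quant_type": mode}
    if mode == "int8":
        return {"load_in_8bit": True}
    raise ValueError(f"unknown quantization mode: {mode!r}")
```

This keeps the old `load_in_4bit` behavior reachable (`nf4`/`fp4`) while making 8-bit a first-class option.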

- **Multi-GPU sharding.** `device_map:` in configs or `--device-map` on CLI. Config value is now correctly read from YAML presets (was previously ignored).

- **CLI overrides.** New `--model`, `--device-map`, `--quantization` flags to override preset values from the command line.
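The override rule (CLI wins over the YAML preset; unset flags leave the preset untouched) can be illustrated with a hypothetical `argparse` sketch — the function name is illustrative, not the repo's actual entry point:

```python
import argparse

def apply_overrides(preset, argv):
    """Merge CLI flags over a preset dict: flags given on the command line
    replace preset values; flags left unset keep the preset's value."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--model")
    parser.add_argument("--device-map")          # stored as args.device_map
    parser.add_argument("--quantization", choices=["nf4", "fp4", "int8"])
    args = parser.parse_args(argv)
    merged = dict(preset)
    merged.update({k: v for k, v in vars(args).items() if v is not None})
    return merged
```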

- **SecAlign injection presets.** Configs for prompt injection on Meta-SecAlign-8B and 70B (default + Optuna-tuned), using the new `AlpacaInjectionSource`, which generates 3-role prompts from AlpacaFarm data with trusted/untrusted separation.
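A sketch of what such a 3-role prompt might look like. The role names (`system`/`user`/`input`) and the function are assumptions for illustration, not the actual `AlpacaInjectionSource` output; the point is the trusted/untrusted channel split:

```python
def build_injection_prompt(instruction, trusted_data, injection):
    """Three chat turns: trusted task instruction, trusted input data,
    and a separate untrusted-data turn carrying the injected instruction."""
    return [
        {"role": "system", "content": instruction},   # trusted: task
        {"role": "user", "content": trusted_data},    # trusted: data
        {"role": "input", "content": injection},      # untrusted channel
    ]
```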

- **Fixes.** `BenchmarkRunner.summarize()` crash when all runs are skipped. System prompt suppression now works correctly (`""` suppresses model defaults, `None` omits the turn).
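The `""` vs. `None` distinction above boils down to logic like this hypothetical helper (illustrative, not the repo's actual code):

```python
def system_turns(system_prompt):
    """'' yields an explicit empty system turn, which overrides the model's
    chat-template default; None omits the system turn entirely, letting any
    template default apply."""
    if system_prompt is None:
        return []
    return [{"role": "system", "content": system_prompt}]
```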

Co-authored-by: Peter Romov <peter@romov.com>
Co-authored-by: Alexander Panfilov <39771221+kotekjedi@users.noreply.github.com>
2026-04-06 13:36:08 +00:00

190 lines
3.8 KiB
YAML

# Random-targets track, validation set. Optuna-tuned hyperparameters (top-1 trial from a 100-trial sweep on qwen2.5-7b).
optim_length: 15
max_flops: 1.0e+17
dtype: bfloat16
system_prompt: "" # suppress model-default system prompt (e.g. Qwen's "You are Qwen…")
samples: [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
seeds: [0]
final_input: tokens
use_prefix_cache: true
model: Qwen/Qwen2.5-7B-Instruct
input_spec:
  source:
    type: random
    query_len: 0
    target_len: 10
  layout:
    type: suffix
  init:
    type: random
methods:
  - acg
  - adc
  - attngcg
  - autoprompt
  - degcg
  - egd
  - faster_gcg
  - gbda
  - gcg
  - gcg_pp
  - i_gcg
  - i_gcg_lila
  - i_gcg_lsgm
  - mac
  - magic
  - mask_gcg
  - mc_gcg
  - pez
  - pgd
  - prs
  - rails
  - reinforce_gcg
  - slot_gcg
  - sm_gcg
  - tao
  - tgcg
  - uat
method_kwargs:
  acg:
    n_replace_max: 5
    n_replace_min: 1
    num_candidates_max: 896
    num_candidates_min: 128
    topk_per_position: 256
  adc:
    ema_alpha: 0.0528294561
    lr: 48.4875250733
    momentum: 0.9980678335
    num_starts: 4
  attngcg:
    attention_weight: 10.8465888942
    n_replace: 1
    num_candidates: 94
    topk_per_position: 76
  autoprompt:
    num_candidates: 512
  degcg:
    ce_threshold: 0.6089305497
    ft_threshold: 0.6178083879
    n_replace: 1
    num_candidates: 281
    topk_per_position: 84
  egd:
    gradient_clip: 2.9705552657
    lr: 0.7525819511
    reg_anneal_steps: 65
    reg_final: 0.0004366785
    reg_init: 0.0000072363
  faster_gcg:
    cw_margin: 0.0007674575
    num_candidates: 39
    reg_weight: 2.7629825305
    topk_per_position: 117
  gbda:
    lr: 0.0612582842
    noise_scale: 1.6122922041
    num_samples: 16
  gcg:
    n_replace: 1
    num_candidates: 44
    topk_per_position: 221
  gcg_pp:
    cw_margin: 0.0048364952
    n_replace: 1
    num_candidates: 89
    oversample_factor: 2.8475546649
    topk_per_position: 37
  i_gcg:
    gamma: 0.4358561349
    n_replace: 1
    num_candidates: 82
    topk_per_position: 95
  i_gcg_lila:
    n_replace: 1
    num_candidates: 32
    topk_per_position: 158
  i_gcg_lsgm:
    gamma: 0.3392936089
    n_replace: 1
    num_candidates: 151
    topk_per_position: 85
  mac:
    momentum: 0.9081418385
    n_replace: 1
    num_candidates: 33
    topk_per_position: 118
  magic:
    num_candidates: 119
    topk_per_position: 42
  mask_gcg:
    lambda_reg: 0.8182688338
    mask_lr: 0.0044684584
    n_replace: 1
    num_candidates: 83
    topk_per_position: 100
  mc_gcg:
    merge_k: 17
    n_replace: 1
    num_candidates: 34
    topk_per_position: 109
  pez:
    lr: 0.1705300951
  pgd:
    entropy_anneal_steps: 365
    entropy_factor_max: 0.6761337823
    gradient_clip: 26.1364686831
    lr: 0.0917991617
    num_starts: 8
    suffix_control_weight: 0.0437107097
  prs:
    n_replace: 4
    num_candidates: 1620
    patience: 51
    schedule: fixed
  rails:
    alpha: 0.6105869969
    ar_penalty: 55.7502051173
    num_candidates: 761
    patience: 96
  reinforce_gcg:
    gen_temperature: 1.7038310251
    n_completions: 10
    n_replace: 2
    num_candidates: 38
    reinforce_weight: 1.5697816976
    topk_per_position: 89
  slot_gcg:
    n_replace: 1
    num_candidates: 233
    scaffold_length: 5
    temperature: 1.3754218436
    topk_per_position: 343
  sm_gcg:
    alpha: 0.1435283456
    n_candidate_samples: 3
    n_embedding_samples: 10
    n_onehot_samples: 3
    n_replace: 1
    n_token_samples: 10
    noise_variance: 0.0012686379
    num_candidates: 224
    topk_per_position: 201
  tao:
    n_replace: 1
    num_candidates: 34
    temperature: 1.8087377598
    topk_per_position: 209
  tgcg:
    alpha: 0.0011105823
    n_replace: 1
    num_candidates: 875
    t1_decay: 0.9970260277
    t1_init: 0.0239525436
    topk_per_position: 220
  uat:
    num_candidates: 2018