fix: remove duplicate ProbeDataset class from msj_data.py

msj_data.py contained a full copy of the ProbeDataset dataclass that
was already defined canonically in probe_data/models.py, violating DRY
and leaving a stale TODO comment in the source.

Changes:
- probe_data/msj_data.py: delete the 19-line duplicate ProbeDataset
  definition and the now-unused 'from dataclasses import dataclass'
  import; replace with a single re-export:
    from agentic_security.probe_data.models import ProbeDataset
  All call sites inside the file (load_dataset_generic, prepare_prompts)
  continue to work unchanged because the field signatures are identical.
  The TODO comment is removed because the refactor it asked for is done.
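
The "identical field signatures" claim can be checked mechanically. The
sketch below is illustrative only: CanonicalProbeDataset and
DuplicateProbeDataset are hypothetical stand-ins that copy the fields
visible in the removed code, not the real classes from the repository.

```python
from dataclasses import dataclass, fields


@dataclass
class CanonicalProbeDataset:  # hypothetical stand-in for probe_data/models.py
    dataset_name: str
    metadata: dict
    prompts: list[str]
    tokens: int
    approx_cost: float
    lazy: bool = False


@dataclass
class DuplicateProbeDataset:  # hypothetical stand-in for the removed copy
    dataset_name: str
    metadata: dict
    prompts: list[str]
    tokens: int
    approx_cost: float
    lazy: bool = False


def field_signature(cls):
    """Return (name, type, default) for each dataclass field, in order."""
    return [(f.name, f.type, f.default) for f in fields(cls)]


# True only if names, annotations, defaults, and field order all agree.
signatures_match = (
    field_signature(CanonicalProbeDataset)
    == field_signature(DuplicateProbeDataset)
)
print(signatures_match)
```

A check like this, run once against the real classes before deleting the
copy, would confirm that positional and keyword construction behave the
same for both definitions.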

No changes required in consumers (fuzzer.py, test_msj_data.py) because
they access ProbeDataset through msj_data's re-export.
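
Why consumers need no changes can be sketched with two in-memory modules.
This is a rough illustration under stated assumptions: demo_models and
demo_msj_data are hypothetical names standing in for probe_data/models.py
and msj_data.py, and the dataclass is trimmed to two fields.

```python
import sys
import types
from dataclasses import dataclass


@dataclass
class ProbeDataset:  # trimmed stand-in for the canonical class
    dataset_name: str
    prompts: list[str]


# Hypothetical stand-in for probe_data/models.py.
models = types.ModuleType("demo_models")
models.ProbeDataset = ProbeDataset
sys.modules["demo_models"] = models

# Hypothetical stand-in for msj_data.py after this commit:
# its only reference to the class is the import line.
msj_data = types.ModuleType("demo_msj_data")
exec("from demo_models import ProbeDataset", msj_data.__dict__)
sys.modules["demo_msj_data"] = msj_data

# A consumer that keeps importing through msj_data...
from demo_msj_data import ProbeDataset as via_msj_data
# ...receives the very same class object as one importing from models.
from demo_models import ProbeDataset as via_models

same_class = via_msj_data is via_models
print(same_class)
```

Because the re-exported name is bound to the same class object, isinstance
checks and equality comparisons agree regardless of which import path a
consumer uses.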
zhanz5
2026-06-04 21:42:19 +08:00
parent d2bbad32b4
commit 5e5469a1a7
+1 -20
@@ -1,25 +1,6 @@
-from dataclasses import dataclass
 from cache_to_disk import cache_to_disk # noqa
-# TODO: refactor this class to use from .data
-@dataclass
-class ProbeDataset:
-    dataset_name: str
-    metadata: dict
-    prompts: list[str]
-    tokens: int
-    approx_cost: float
-    lazy: bool = False
-    def metadata_summary(self):
-        return {
-            "dataset_name": self.dataset_name,
-            "num_prompts": len(self.prompts),
-            "tokens": self.tokens,
-            "approx_cost": self.approx_cost,
-        }
+from agentic_security.probe_data.models import ProbeDataset
 # @cache_to_disk(n_days_to_cache=1)