Fix STIX2 hash key parsing to accept spec-compliant algorithm names (#767)

* Fix betterproto2 migration: update generated proto code and callers

The dependency switch from betterproto to betterproto2 was incomplete.
This updates all affected files to use the betterproto2 API:

- tombstone.py: rewrite generated code to use betterproto2.field() with
  explicit TYPE_* constants, repeated/optional/group flags, and map_meta()
  for map fields
- tombstone_crashes.py: update import and fix to_dict() call to use
  keyword-only casing= argument required by betterproto2
- pyproject.toml: replace the betterproto[compiler] dev dependency with
  betterproto2-compiler
- Makefile: update protoc plugin flag to --python_betterproto2_out

* Fix STIX2 hash key parsing to accept spec-compliant algorithm names

The STIX2 specification requires single quotes around hash algorithm
names that contain hyphens (e.g. file:hashes.'SHA-256'). MVT only
accepted a non-standard lowercase form (file:hashes.sha256), silently
dropping any indicators using the spec-correct spelling.

Normalize hash algorithm keys in _process_indicator by stripping single
quotes and hyphens from the algorithm portion and lowercasing it before
matching, so all of the following are accepted (shown for SHA-256; the
same applies to SHA-1 and MD5):

  file:hashes.'SHA-256'   (STIX2 spec)
  file:hashes.SHA-256
  file:hashes.SHA256
  file:hashes.sha256      (previously the only accepted form)

The same normalization is applied to app:cert.* keys.
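
A minimal sketch of the normalization described above, pulled out into a
standalone function so it can be run in isolation. The name
normalize_stix_key is hypothetical (in the commit the logic sits inline in
_process_indicator); the body follows the loop shown in the diff.

```python
def normalize_stix_key(key: str) -> str:
    """Hypothetical helper: normalize the algorithm part of a STIX2 key.

    Only the text after "hashes." or "cert." is changed: single quotes and
    hyphens are removed and the result is lowercased. Other keys are
    returned unchanged.
    """
    for sep in ("hashes.", "cert."):
        if sep in key:
            prefix, _, algo = key.partition(sep)
            return prefix + sep + algo.replace("'", "").replace("-", "").lower()
    return key


# The four spellings listed in the commit message.
variants = [
    "file:hashes.'SHA-256'",
    "file:hashes.SHA-256",
    "file:hashes.SHA256",
    "file:hashes.sha256",
]
normalized = {normalize_stix_key(v) for v in variants}
```

All four spellings collapse to a single key, so the existing comparisons
against the lowercase form keep working without further changes.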

Update generate_stix.py to use the spec-compliant quoted forms, and add
test_parse_stix2_hash_key_variants to cover all spelling variants.
besendorf
2026-04-07 20:41:40 +02:00
committed by GitHub
parent fd31f31aae
commit 6c537c624e
3 changed files with 85 additions and 2 deletions
+11
@@ -100,6 +100,17 @@ class Indicators:
        key, value = indicator.get("pattern", "").strip("[]").split("=")
        key = key.strip()
        # Normalize hash algorithm keys so that both the STIX2-spec-compliant
        # form (e.g. file:hashes.'SHA-256', which requires quotes around
        # algorithm names that contain hyphens) and the non-standard lowercase
        # form (e.g. file:hashes.sha256) are accepted. Strip single quotes and
        # hyphens from the algorithm name only, then lowercase it.
        for sep in ("hashes.", "cert."):
            if sep in key:
                prefix, _, algo = key.partition(sep)
                key = prefix + sep + algo.replace("'", "").replace("-", "").lower()
                break
        if key == "domain-name:value":
            # We force domain names to lower case.
            self._add_indicator(