Commit Graph

Author: Victor Kuznetsov
SHA1: 99e57c872f
Message: perf(text-mark): footprint-sized arrays in reverse-alpha CPU path

The reverse-alpha text-mark engine (Doubao/Jimeng/Samsung) allocated
full-frame arrays where only the glyph footprint is ever read:

  - _fixed_alpha_map / _aligned_alpha_map each built a full (h, w) float32
    alpha map non-zero only inside the glyph box, and two were held at once
    during removal (~96 MB of mostly-zeros on a 12 MP frame);
  - extract_mask built a full (h, w) uint8 mask that every caller cropped to
    the located box (~12 MB, rebuilt per text-mark detector on the
    memory-tight identify path).
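
The sizes quoted above can be sanity-checked with a few lines of arithmetic. This is only a sketch: it assumes "12 MP" means exactly 12,000,000 pixels and that MB means 10**6 bytes, and the 400x120 glyph box is a hypothetical footprint chosen for illustration, not a measured one.

```python
# Back-of-the-envelope check of the allocation sizes quoted above.
# Assumptions: 12 MP = 12,000,000 pixels, 1 MB = 10**6 bytes.
PIXELS = 12_000_000
FLOAT32_BYTES = 4
UINT8_BYTES = 1
MB = 10**6


def array_mb(pixels, bytes_per_pixel, copies=1):
    """Size in MB of `copies` arrays with `pixels` elements each."""
    return pixels * bytes_per_pixel * copies / MB


# Two full-frame float32 alpha maps held at once during removal.
alpha_maps_mb = array_mb(PIXELS, FLOAT32_BYTES, copies=2)

# One full-frame uint8 mask from extract_mask.
mask_mb = array_mb(PIXELS, UINT8_BYTES)

# Hypothetical 400x120 glyph box: two float32 blocks of that size instead.
footprint_mb = array_mb(400 * 120, FLOAT32_BYTES, copies=2)

print(alpha_maps_mb, mask_mb, footprint_mb)
```

Under those assumptions the two full-frame alpha maps come to 96 MB and the mask to 12 MB, while two blocks of the hypothetical footprint stay well under 1 MB.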

Both now return footprint-sized arrays: the alpha helpers return the
glyph-sized block plus its placement (ax, ay, gw, gh), and extract_mask
returns the box-sized mask. _apply_reverse_alpha consumes the block
directly; the residual inpaint embeds it into one full-frame uint8 mask only
at cv2.inpaint time (which needs a full-frame mask).
remove_watermark_reverse_alpha tracks the winning region alongside best_amap
so the block can be placed.
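
A minimal sketch of the idea, not the code in this commit: the function names below are hypothetical, the frame is single-channel, and the blend model (observed = alpha * logo + (1 - alpha) * original, with a flat logo value of 255) is an assumption made for illustration. It shows a footprint block plus its placement producing the same bytes as a full-frame alpha map, and a box-sized mask being embedded into a full-frame uint8 mask only when one is needed.

```python
import numpy as np


def full_frame_alpha(h, w, block, ax, ay):
    """Old-style layout: a full (h, w) float32 map, zero outside the glyph box."""
    amap = np.zeros((h, w), dtype=np.float32)
    gh, gw = block.shape
    amap[ay:ay + gh, ax:ax + gw] = block
    return amap


def reverse_alpha_full(img, amap, logo=255.0):
    """Invert the assumed blend over the whole frame."""
    out = (img.astype(np.float32) - amap * logo) / (1.0 - amap)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


def reverse_alpha_footprint(img, block, ax, ay, logo=255.0):
    """Same arithmetic, applied only to the (gh, gw) region at (ax, ay)."""
    gh, gw = block.shape
    out = img.copy()
    roi = out[ay:ay + gh, ax:ax + gw].astype(np.float32)
    restored = (roi - block * logo) / (1.0 - block)
    out[ay:ay + gh, ax:ax + gw] = np.clip(np.rint(restored), 0, 255).astype(np.uint8)
    return out


def embed_mask(h, w, box_mask, ax, ay):
    """Build the full-frame uint8 mask only when a consumer requires one."""
    full = np.zeros((h, w), dtype=np.uint8)
    gh, gw = box_mask.shape
    full[ay:ay + gh, ax:ax + gw] = box_mask
    return full


# Small synthetic frame; alpha stays below 1 so the division is well defined.
rng = np.random.default_rng(0)
h, w = 64, 96
ax, ay, gw, gh = 30, 12, 20, 10
img = rng.integers(0, 256, size=(h, w), dtype=np.uint8)
block = rng.uniform(0.0, 0.6, size=(gh, gw)).astype(np.float32)

old = reverse_alpha_full(img, full_frame_alpha(h, w, block, ax, ay))
new = reverse_alpha_footprint(img, block, ax, ay)
box_mask = (block > 0.3).astype(np.uint8) * 255
full_mask = embed_mask(h, w, box_mask, ax, ay)

print(np.array_equal(old, new), block.nbytes, h * w * 4)
```

Outside the glyph box the full-frame path divides by 1 and subtracts 0, so it reproduces the input pixels; inside the box both paths run the same float32 operations on the same values, which is why the outputs match exactly in this sketch.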

Peak allocation drops from two O(image) float32 maps plus one O(image)
uint8 mask to two O(footprint) blocks plus one O(image) uint8 mask that is
built only when the residual inpaint runs -- a win every consumer gets,
motivated by the 512 MB raiw.cc worker that OOMs on large decodes. The GPU
path is untouched.

Output is byte-identical to the old full-frame path (verified: 17 output hashes
across the three engines, inpaint/no-inpaint, detect, and the real
doubao-1.png fixture, unchanged before/after). tests/test_text_mark_memory.py
guards it by reconstructing the old full-frame path inline and asserting
equality, so the proof survives a cv2/asset bump, and pins the O(footprint)
shape so a regression to full-frame fails loudly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Date: 2026-06-19 10:01:07 -07:00