mirror of
https://github.com/hacksider/Deep-Live-Cam.git
synced 2026-05-14 02:42:09 +02:00
890a6d41b6
ORT's CoreML EP rejects rank-0 (scalar) index tensors in GatherOpBuilder::IsOpSupportedImpl. StyleGAN-derived models hit this in the generator: GFPGAN's 1024 variant has 16 such indices, one per style-code slice. Each rejected Gather falls back to CPU, which splits the CoreML subgraph into multiple partitions and adds boundary crossings on every inference.

Add a load-time ONNX rewrite that promotes each scalar index to shape [1] and squeezes the added axis from the Gather output. The result is semantically identical but CoreML-compatible. GFPGAN now runs as a single CoreML partition with zero CPU-fallback nodes, and inference drops from ~87 ms to ~81 ms on an M-series Mac.

The fix has been filed upstream as microsoft/onnxruntime#28180. The existing code comment in gather_op_builder.cc already describes this exact workaround; it just isn't applied. Once the upstream fix ships and the ORT floor is raised, this pass can be deleted.