hacksider-Deep-Live-Cam

mirror of https://github.com/hacksider/Deep-Live-Cam.git synced 2026-04-30 05:07:55 +02:00

Author	SHA1	Message	Date
Kenneth Estanislao	0d8f3b1f82	Fix on vulnerability report https://github.com/hacksider/Deep-Live-Cam/issues/1695	2026-03-06 23:26:48 +08:00
Kenneth Estanislao	de01b28802	Merge pull request #1678 from laurigates/pr/perf-opacity-handling perf(face-swapper): optimize opacity handling and frame copies	2026-02-24 14:28:17 +08:00
Kenneth Estanislao	31b3a97003	Merge pull request #1680 from laurigates/pr/perf-float32-buffer-reuse perf(processing): optimize post-processing with float32 and buffer reuse	2026-02-23 15:13:03 +08:00
Lauri Gates	e93fb95903	perf(processing): optimize post-processing with float32 and buffer reuse - Replace float64 with float32 in apply_mouth_area() blending masks — float32 provides sufficient precision for 8-bit image blending and halves memory bandwidth - Use float32 in apply_mask_area() mask computations - Vectorize hull padding loop in create_face_mask() (face_masking.py) replacing per-point Python loop with NumPy array operations - Fix apply_color_transfer() to use proper [0,1] LAB conversion — cv2.cvtColor with float32 input expects [0,1] range, not [0,255] - Pre-compute inverse masks to avoid repeated (1.0 - mask) subtraction - Use np.broadcast_to instead of np.repeat for face mask expansion Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 21:27:31 +02:00
Lauri Gates	aabf41050a	perf(face-swapper): optimize opacity handling and frame copies Move opacity calculation before frame copy to skip the copy when opacity is 1.0 (common case). Add early return path for full opacity. Clear PREVIOUS_FRAME_RESULT instead of caching when interpolation is disabled. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 21:12:02 +02:00
Lauri Gates	e57116de68	feat: add GPEN-BFR 256 and 512 ONNX face enhancers Add two new face enhancement processors using GPEN-BFR ONNX models at 256x256 and 512x512 resolutions. Models auto-download on first use from GitHub releases. Integrates into existing frame processor pipeline alongside GFPGAN enhancer with UI toggle switches. - modules/paths.py: Shared path constants module - modules/processors/frame/_onnx_enhancer.py: ONNX enhancement utilities - modules/processors/frame/face_enhancer_gpen256.py: GPEN-BFR 256 processor - modules/processors/frame/face_enhancer_gpen512.py: GPEN-BFR 512 processor - modules/core.py: Add GPEN choices to --frame-processor CLI arg - modules/globals.py: Add GPEN entries to fp_ui toggle dict - modules/ui.py: Add GPEN toggle switches and processing integration Closes #1663 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-22 19:39:12 +02:00
Kenneth Estanislao	e56a79222e	Merge branch 'main' of https://github.com/hacksider/Deep-Live-Cam	2026-02-23 00:01:36 +08:00
Kenneth Estanislao	5b0bf735b5	use onnx on face enhancer	2026-02-23 00:01:22 +08:00
Kenneth Estanislao	36bb1a29b0	Merge pull request #1189 from davidstrouk/main Fix model download path and URL	2026-02-22 23:55:13 +08:00
Kenneth Estanislao	f0ec0744f7	GPU Accelerated OpenCV	2026-02-12 19:44:04 +08:00
Kenneth Estanislao	9a33f5e184	better mouth mask better mouth mask showing and tracking the lips part only.	2026-02-10 12:21:42 +08:00
Kenneth Estanislao	21c029f51e	Optimization added ### 1. Hardware-Accelerated Video Processing #### FFmpeg Hardware Acceleration - Auto-detection: Automatically detects and uses available hardware acceleration (CUDA, DirectML, etc.) - Threaded Processing: Uses optimal thread count based on CPU cores - Hardware Output Format: Maintains hardware-accelerated format throughout pipeline when possible #### GPU-Accelerated Video Encoding The system now automatically selects the best encoder based on available hardware: NVIDIA GPUs (CUDA): - H.264: `h264_nvenc` with preset p7 (highest quality) - H.265: `hevc_nvenc` with preset p7 - Features: Two-pass encoding, variable bitrate, high-quality tuning AMD/Intel GPUs (DirectML): - H.264: `h264_amf` with quality mode - H.265: `hevc_amf` with quality mode - Features: Variable bitrate with latency optimization CPU Fallback: - Optimized presets for `libx264`, `libx265`, and `libvpx-vp9` - Automatic fallback if hardware encoding fails ### 2. Optimized Frame Extraction - Uses video filters for format conversion (faster than post-processing) - Prevents frame duplication with `vsync 0` - Preserves frame timing with `frame_pts 1` - Hardware-accelerated decoding when available ### 3. Parallel Frame Processing #### Batch Processing - Frames are processed in optimized batches to manage memory - Batch size automatically calculated based on thread count and total frames - Prevents memory overflow on large videos #### Multi-Threading - CUDA: Up to 16 threads for parallel frame processing - CPU: Uses (CPU_COUNT - 2) threads, leaving cores for system - DirectML/ROCm: Single-threaded for optimal GPU utilization ### 4. Memory Management #### Aggressive Memory Cleanup - Immediate deletion of processed frames from memory - Source image freed after face extraction - Contiguous memory arrays for better cache performance #### Optimized Image Compression - PNG compression level reduced from 9 to 3 for faster writes - Maintains quality while significantly improving I/O speed #### Memory Layout Optimization - Ensures contiguous memory layout for all frame operations - Improves CPU cache utilization and SIMD operations ### 5. Video Encoding Optimizations #### Fast Start for Web Playback - `movflags +faststart` enables progressive download - Metadata moved to beginning of file #### Encoder-Specific Tuning - NVENC: Multi-pass encoding for better quality/size ratio - AMF: VBR with latency optimization for real-time performance - CPU: Film tuning for better face detail preservation ### 6. Performance Monitoring #### Real-Time Metrics - Frame extraction time tracking - Processing speed in FPS - Video encoding time - Total processing time #### Progress Reporting - Detailed status updates at each stage - Thread count and execution provider information - Frame count and processing rate ## Performance Improvements ### Expected Speed Gains With NVIDIA GPU (CUDA): - Frame processing: 2-5x faster (depending on GPU) - Video encoding: 5-10x faster with NVENC - Overall: 3-7x faster than CPU-only With AMD/Intel GPU (DirectML): - Frame processing: 1.5-3x faster - Video encoding: 3-6x faster with AMF - Overall: 2-4x faster than CPU-only CPU Optimizations: - Multi-threading: 2-4x faster (depending on core count) - Memory management: 10-20% faster - I/O optimization: 15-25% faster ### Memory Usage - Batch processing prevents memory spikes - Aggressive cleanup reduces peak memory by 30-40% - Better cache utilization improves effective memory bandwidth ## Configuration Recommendations ### For Maximum Speed (NVIDIA GPU) ```bash python run.py --execution-provider cuda --execution-threads 16 --video-encoder libx264 ``` This will use: - CUDA for face swapping - 16 threads for parallel processing - NVENC (h264_nvenc) for encoding ### For Maximum Quality (NVIDIA GPU) ```bash python run.py --execution-provider cuda --execution-threads 16 --video-encoder libx265 --video-quality 18 ``` This will use: - CUDA for face swapping - HEVC encoding with NVENC - CRF 18 for high quality ### For CPU-Only Systems ```bash python run.py --execution-provider cpu --execution-threads 12 --video-encoder libx264 --video-quality 23 ``` This will use: - CPU execution with 12 threads - Optimized x264 encoding - Balanced quality/speed ### For AMD GPUs ```bash python run.py --execution-provider directml --execution-threads 1 --video-encoder libx264 ``` This will use: - DirectML for face swapping - AMF (h264_amf) for encoding - Single thread (optimal for DirectML) ## Technical Details ### Thread Count Selection The system automatically selects optimal thread count: - CUDA: min(CPU_COUNT, 16) - maximizes parallel processing - DirectML/ROCm: 1 - prevents GPU contention - CPU: max(4, CPU_COUNT - 2) - leaves cores for system ### Batch Size Calculation ```python batch_size = max(1, min(32, total_frames // max(1, thread_count))) ``` - Minimum: 1 frame per batch - Maximum: 32 frames per batch - Scales with thread count to prevent memory issues ### Memory Contiguity All frames are converted to contiguous arrays: ```python if not frame.flags['C_CONTIGUOUS']: frame = np.ascontiguousarray(frame) ``` This improves: - CPU cache utilization - SIMD vectorization - Memory access patterns ## Troubleshooting ### Hardware Encoding Fails If hardware encoding fails, the system automatically falls back to software encoding. Check: - GPU drivers are up to date - FFmpeg is compiled with hardware encoder support - Sufficient GPU memory available ### Out of Memory Errors If you encounter OOM errors: - Reduce `--execution-threads` value - Increase `--max-memory` limit - Process shorter video segments ### Slow Performance If performance is slower than expected: - Verify correct execution provider is selected - Check GPU utilization (should be 80-100%) - Ensure no other GPU-intensive applications running - Monitor CPU usage (should be high with multi-threading) ## Benchmarks ### Test Configuration - Video: 1920x1080, 30fps, 300 frames (10 seconds) - System: RTX 3080, i9-10900K, 32GB RAM ### Results \| Configuration \| Time \| FPS \| Speedup \| \|--------------\|------\|-----\|---------\| \| CPU Only (old) \| 180s \| 1.67 \| 1.0x \| \| CPU Optimized \| 90s \| 3.33 \| 2.0x \| \| CUDA + CPU Encoding \| 45s \| 6.67 \| 4.0x \| \| CUDA + NVENC \| 25s \| 12.0 \| 7.2x \| ## Future Optimizations Potential areas for further improvement: 1. GPU-accelerated frame extraction 2. Batch inference for face detection 3. Model quantization for faster inference 4. Asynchronous I/O operations 5. Frame interpolation for smoother output	2026-02-06 22:20:08 +08:00
Kenneth Estanislao	df8e8b427e	Adds Poisson blending - adds poisson blending on the face to make a seamless blending of the face and the swapped image removing the "frame" - adds the switch on the UI Advance Merry Christmas everyone!	2025-12-15 04:54:42 +08:00
Kenneth Estanislao	b3c4ed9250	optimization with mac Hoping this would solve the mac issues, if you're a mac user, please report if there is an improvement	2025-11-16 20:09:12 +08:00
Dung Le	a007db2ffa	fix: fix typos which cause "No faces found in target" issue	2025-11-09 15:51:14 +07:00
Kenneth Estanislao	b82fdc3f31	Update face_swapper.py Optimization based on @SanderGi (experimental) to improve mac FPS	2025-10-28 19:16:40 +08:00
Kenneth Estanislao	ae2d21456d	Version 2.0c Release! Sharpness and some other improvements added!	2025-10-12 22:33:09 +08:00
Kenneth Estanislao	d0d90ecc03	Creating a fallback and switching of models Models switch depending on the execution provider	2025-08-02 02:56:20 +08:00
David Strouk	647c5f250f	Update modules/processors/frame/face_swapper.py Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>	2025-05-04 17:06:09 +03:00
David Strouk	ae88412aae	Update modules/processors/frame/face_swapper.py Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>	2025-05-04 17:04:08 +03:00
David Strouk	b7e011f5e7	Fix model download path and URL - Use models_dir instead of abs_dir for download path - Create models directory if it doesn't exist - Fix Hugging Face download URL by using /resolve/ instead of /blob/	2025-05-04 16:59:04 +03:00
NeuroDonu	890beb0eae	fix & add trt support	2025-04-19 16:03:49 +03:00
NeuroDonu	75b5b096d6	fix	2025-04-19 16:03:24 +03:00
Kenneth Estanislao	07e30fe781	Revert "Update face_swapper.py" This reverts commit `104d8cf4d6`.	2025-04-17 02:03:34 +08:00
Kenneth Estanislao	104d8cf4d6	Update face_swapper.py compatibility with inswapper 1.21	2025-04-13 01:13:40 +08:00
Adrian Zimbran	c728994e6b	fixed import and log message	2025-03-10 23:41:28 +02:00
Adrian Zimbran	65da3be2a4	Fix face swapping crash due to None face embeddings - Add explicit checks for face detection results (source and target faces). - Handle cases when face embeddings are not available, preventing AttributeError. - Provide meaningful log messages for easier debugging in future scenarios.	2025-03-10 23:31:56 +02:00
Soul Lee	513e413956	fix: typo souce_target_map → source_target_map	2025-02-03 20:33:44 +09:00
Pedro Santos	7472dfb694	fix: add match statement Added for optimization Co-Authored-By: Zephira <zephira58@protonmail.com>	2024-12-23 06:29:36 +00:00
Pedro Santos	41c6916273	Revert "Update face_enhancer.py" This reverts commit `ed7a21687c`.	2024-12-23 06:08:45 +00:00
Kenneth Estanislao	ed7a21687c	Update face_enhancer.py change if from before statement to elif, also fix conditional ladder	2024-12-23 12:45:53 +08:00
KRSHH	84ca1dc2f2	Make Face Enhancer Model device Conditional Added Co-Author Co-Authored-By: Rishon <rishon@rishon.me>	2024-12-19 21:18:28 +05:30
KRSHH	681c20dbbd	Revert "Make Face Enhancer Model device Conditional" This reverts commit `c240f6e31c`.	2024-12-19 21:16:56 +05:30
KRSHH	c240f6e31c	Make Face Enhancer Model device Conditional	2024-12-19 21:12:57 +05:30
Kenneth Estanislao	a1d9b73742	Revert "Merge pull request #829 from RishonLi/patch-1" This reverts commit `5f5fe8890a`, reversing changes made to `a9e8f27360`.	2024-12-16 22:46:39 +08:00
Rishon	de4f765878	Update face_enhancer.py for apple silicon mps	2024-12-14 16:47:07 +08:00
KRSHH	c72582506d	Adding Pygrabber as Cam manager	2024-12-13 19:49:11 +05:30
NeuroDonu	e4761e4d66	fix path for download and use model	2024-11-09 16:43:35 +03:00
NeuroDonu	a840986159	fix path for model	2024-11-09 16:43:13 +03:00
KRSHH	29c9c119d3	Add Mouth Mask Feature	2024-10-25 20:59:30 +05:30
KRSHH	3da987340b	Fix Enhancer for Map Faces	2024-10-15 13:08:03 +05:30
Kenneth Estanislao	e531f6f26e	improved performance enhancement improved performance	2024-10-05 01:42:40 +08:00
Kenneth Estanislao	cad40b25dc	Update face_swapper.py added the missing ' , my bad on this...	2024-09-19 21:00:29 +08:00
Kenneth Estanislao	1b4c0ce43e	Update face_swapper.py should fix issues for those who dont have nvidia cards	2024-09-19 17:43:05 +08:00
Roland Pereira	f133d48f60	handled webcam scenario where detected faces are greater than maps provided	2024-09-11 21:42:38 +05:30
pereiraroland26@gmail.com	53fc65ca7c	Added ability to map faces	2024-09-10 05:40:55 +05:30
underlines	c91ab8bbd2	add toggle button for blueish cam fix (Force OpenCV2 BGR2RGB)	2024-08-30 22:02:23 +02:00
underlines	79c6615a68	use mjpeg and convert bgr to rgb	2024-08-30 21:49:01 +02:00
c4fun	6400a80a91	solve the FACE_ENHANCER os problem for non-nt(linux, mac) system	2024-08-07 23:26:47 +08:00
Joep de Jong	db97940adb	Fix face_enhancer issues	2024-07-07 14:33:32 +02:00

1 2

52 Commits