commit 47d3520c19 by vnyash, 2024-06-13 07:56:13 +05:30
184 changed files with 10075 additions and 0 deletions
Binary file not shown.
+8
@@ -0,0 +1,8 @@
root = true
[*]
end_of_line = lf
insert_final_newline = true
indent_size = 4
indent_style = tab
trim_trailing_whitespace = true
+4
@@ -0,0 +1,4 @@
.assets
.idea
.vscode
__pycache__
+144
@@ -0,0 +1,144 @@
# DeepFuze
![DeepFuze Lipsync](https://user-images.githubusercontent.com/4397546/222490039-b1f6156b-bf00-405b-9fda-0c9a9156f991.gif)
## Overview
DeepFuze is a state-of-the-art deep learning tool that seamlessly integrates with [ComfyUI](https://github.com/comfyanonymous/ComfyUI) to revolutionize facial transformations, lipsyncing, video generation, voice cloning, face swapping, and lipsync translation. Leveraging advanced algorithms, DeepFuze enables users to combine audio and video with unparalleled realism, ensuring perfectly synchronized facial movements. This innovative solution is ideal for content creators, animators, developers, and anyone seeking to elevate their video editing projects with sophisticated AI-driven features.
[![DeepFuze Lipsync](https://github.com/SamKhoze/ComfyUI-DeepFuze/blob/main/imgs/DeepFuze_Lipsync.jpg)](https://www.youtube.com/watch?v=9WbvlOK_BlI "DeepFuze Lipsync")
[![DeepFuze Lipsync demo](https://github.com/SamKhoze/ComfyUI-DeepFuze/blob/main/imgs/DeepFuze_Lipsync_02.jpg)](https://www.youtube.com/watch?v=1c5TK3zTKr8)
---
## Installation
### Prerequisites for Voice Cloning and Lipsyncing
Below are the two ComfyUI repositories required to load video and audio. Install them into your `custom_nodes` folder:
1. Clone the repositories:
```bash
cd custom_nodes
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git
git clone https://github.com/a1lazydog/ComfyUI-AudioScheduler.git
```
### Installing and Running the Model
2. Clone this repository into the `custom_nodes` folder and install requirements:
```bash
git clone https://github.com/SamKhoze/ComfyUI-DeepFuze.git
cd ComfyUI-DeepFuze
pip3 install -r requirements.txt
```
3. Download models from the links below or download all models at once via [DeepFuze Models](https://drive.google.com/drive/folders/1dyu81WAP7_us8-loHjOXZzBJETNeJYJk?usp=sharing)
---
### Windows Native
- Make sure `ffmpeg` is on your `%PATH%`; follow [this](https://www.geeksforgeeks.org/how-to-install-ffmpeg-on-windows/) tutorial to install `ffmpeg`, or install it with Scoop.
---
### For Mac users, set this environment variable before running
This has been tested on M1 and M3 Macs:
```bash
export PYTORCH_ENABLE_MPS_FALLBACK=1
```
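To confirm the fallback is active, a quick check (assuming PyTorch is installed in the same environment):
```python
# Sanity check (assumes PyTorch is installed): verify the MPS backend is
# visible and the fallback variable is set for the current process.
import os
import torch

print("MPS available:", torch.backends.mps.is_available())
print("PYTORCH_ENABLE_MPS_FALLBACK =", os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK"))
```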
### macOS users also need to install the original dlib
```bash
pip install dlib
```
---
## DeepFuze Lipsync
This node generates a lipsynced video from video or image input and a WAV audio file.
**Input Types:**
- `images`: Extracted frame images as PyTorch tensors.
- `audio`: An instance of loaded audio data.
- `mata_batch`: Load batch numbers via the Meta Batch Manager node.
**Output Types:**
- `IMAGES`: Extracted frame images as PyTorch tensors.
- `frame_count`: The number of output frames (int).
- `audio`: Output audio.
- `video_info`: Output video metadata.
**DeepFuze Lipsync Features:**
- `enhancer`: Optionally improve the quality of the generated video by restoring the generated face with a face restoration network (gfpgan or RestoreFormer).
- `frame_enhancer`: Optionally enhance the whole video frame.
- `face_mask_padding_left`: Padding to the left of the face while lipsyncing.
- `face_mask_padding_right`: Padding to the right of the face while lipsyncing.
- `face_mask_padding_bottom`: Padding below the face while lipsyncing.
- `face_mask_padding_top`: Padding above the face while lipsyncing.
- `device`: [cpu, gpu]
- `trim_frame_start`: Number of frames to remove from the start.
- `trim_frame_end`: Number of frames to remove from the end.
- `save_outpou`: If True, the output video is saved.

A sketch of how these options might map onto the programmatic API appears after the example further below.
![Lipsyncing Node example](https://github.com/SamKhoze/ComfyUI-DeepFuze/blob/main/examples/node.jpeg)
### DeepFuze_TTS
**Languages:**
**DeepFuze_TTS voice cloning supports 17 languages: English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Turkish (tr), Russian (ru), Dutch (nl), Czech (cs), Arabic (ar), Chinese (zh-cn), Japanese (ja), Hungarian (hu), Korean (ko), and Hindi (hi).**
This node clones any voice from typed text input. For best results, the reference audio should be 10-15 seconds long and contain little background noise.
**Input Types:**
- `audio`: An instance of loaded audio data.
- `text`: Text to generate the cloned voice audio.
**Output Types:**
- `audio`: An instance of loaded audio data.
![TTS Node example](https://github.com/SamKhoze/ComfyUI-DeepFuze/blob/main/imgs/DeepFuze_TTS.jpg)
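For scripted use, the same audio-in/text-in/audio-out shape can be expressed programmatically. This is a hypothetical sketch only: the `clone_voice` method and its parameters are assumptions, not a documented DeepFuze API:
```python
# Hypothetical sketch: `clone_voice` and its keyword names are assumptions,
# not a documented DeepFuze API.
from deepfuze import DeepFuze

deepfuze = DeepFuze()
deepfuze.load_audio('path/to/reference_voice.wav')  # 10-15 s clean reference clip
output_path = deepfuze.clone_voice(
    text = 'Text to speak in the cloned voice.',
    language = 'en',  # one of the 17 supported language codes
    output = 'path/to/cloned_voice.wav'
)
```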
**Basic Integration**
![BasicWorkspace](https://github.com/SamKhoze/ComfyUI-DeepFuze/blob/main/imgs/BasicWorkspace.jpg)
---
## Example of How to Use DeepFuze Programmatically
```python
from deepfuze import DeepFuze
# Initialize the DeepFuze instance
deepfuze = DeepFuze()
# Load video and audio files
deepfuze.load_video('path/to/video.mp4')
deepfuze.load_audio('path/to/audio.mp3')
deepfuze.load_checkpoint('path/to/checkpoint_path')
# Set parameters (optional)
deepfuze.set_parameters(sync_level=5, transform_intensity=3)
# Generate lipsynced video
output_path = deepfuze.generate(output='path/to/output.mp4')
print(f"Lipsynced video saved at {output_path}")
```
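The lipsync node options listed earlier map naturally onto keyword arguments. A hypothetical extension of the example above; these keyword names mirror the node options and are assumptions, not a documented API:
```python
# Hypothetical sketch: keyword names mirror the node options above and are
# assumptions, not a documented DeepFuze API.
deepfuze.set_parameters(
    enhancer = 'gfpgan',          # face restoration network
    face_mask_padding_left = 10,  # mask padding while lipsyncing
    face_mask_padding_right = 10,
    trim_frame_start = 5,         # drop the first 5 frames
    trim_frame_end = 5,           # drop the last 5 frames
    device = 'gpu'
)
```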
## Acknowledgements
This repository could not have been completed without the contributions from [SadTalker](https://github.com/OpenTalker/SadTalker/tree/main), [Facexlib](https://github.com/xinntao/facexlib), [GFPGAN](https://github.com/TencentARC/GFPGAN), [GPEN](https://github.com/yangxy/GPEN), [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN), [TTS](https://github.com/coqui-ai/TTS/tree/dev), [SSD](https://pytorch.org/hub/nvidia_deeplearningexamples_ssd/), and [wav2lip](https://github.com/Rudrabha/Wav2Lip).
1. Please carefully read and comply with the open-source license applicable to this code and models before using it.
2. Please carefully read and comply with the intellectual property declaration applicable to this code and models before using it.
3. This open-source code runs completely offline and does not collect any personal information or other data. If you use this code to provide services to end-users and collect related data, please take necessary compliance measures according to applicable laws and regulations (such as publishing privacy policies, adopting necessary data security strategies, etc.). If the collected data involves personal information, user consent must be obtained (if applicable).
4. It is prohibited to use this open-source code for activities that harm the legitimate rights and interests of others (including but not limited to fraud, deception, and infringement of others' portrait or reputation rights), or for other behaviors that violate applicable laws and regulations or offend social ethics and good customs (including providing incorrect or false information, or publishing terrorist, child/minor pornographic, or violent content). Otherwise, you may be held legally responsible.
The DeepFuze code is developed by Dr. Sam Khoze and his team. Feel free to use the DeepFuze code for personal, research, academic, and non-commercial purposes. You can create videos with this tool, but please make sure to follow local laws and use it responsibly. The developers will not be responsible for any misuse of the tool by users. For commercial use, please contact us at info@cogidigm.com.
+6
@@ -0,0 +1,6 @@
from .nodes import NODE_CLASS_MAPPINGS, NODE_DISPLAY_NAME_MAPPINGS
import folder_paths
WEB_DIRECTORY = "./web"
__all__ = ["NODE_CLASS_MAPPINGS", "NODE_DISPLAY_NAME_MAPPINGS", "WEB_DIRECTORY"]
+20
@@ -0,0 +1,20 @@
import sounddevice
class PlayBackAudio:
    @classmethod
    def INPUT_TYPES(self):
        return {
            "required": {
                "audio": ("AUDIO",)
            }
        }

    OUTPUT_NODE = True
    RETURN_NAMES = ()
    RETURN_TYPES = ()
    CATEGORY = "DeepFuze (Adavance)"
    FUNCTION = "play_audio"

    def play_audio(self, audio):
        sounddevice.play(audio.audio_data, audio.sample_rate)
        return ()
Binary file not shown.
+137
@@ -0,0 +1,137 @@
from typing import Optional, Any, List
from functools import lru_cache
import numpy
import scipy
from deepfuze.filesystem import is_audio
from deepfuze.ffmpeg import read_audio_buffer
from deepfuze.typing import Fps, Audio, AudioFrame, Spectrogram, MelFilterBank
from deepfuze.voice_extractor import batch_extract_voice
@lru_cache(maxsize = 128)
def read_static_audio(audio_path : str, fps : Fps) -> Optional[List[AudioFrame]]:
    return read_audio(audio_path, fps)

def read_audio(audio_path : str, fps : Fps) -> Optional[List[AudioFrame]]:
    sample_rate = 48000
    channel_total = 2

    if is_audio(audio_path):
        audio_buffer = read_audio_buffer(audio_path, sample_rate, channel_total)
        audio = numpy.frombuffer(audio_buffer, dtype = numpy.int16).reshape(-1, 2)
        audio = prepare_audio(audio)
        spectrogram = create_spectrogram(audio)
        audio_frames = extract_audio_frames(spectrogram, fps)
        return audio_frames
    return None

@lru_cache(maxsize = 128)
def read_static_voice(audio_path : str, fps : Fps) -> Optional[List[AudioFrame]]:
    return read_voice(audio_path, fps)

def read_voice(audio_path : str, fps : Fps) -> Optional[List[AudioFrame]]:
    sample_rate = 48000
    channel_total = 2
    chunk_size = 1024 * 240
    step_size = 1024 * 180

    if is_audio(audio_path):
        audio_buffer = read_audio_buffer(audio_path, sample_rate, channel_total)
        audio = numpy.frombuffer(audio_buffer, dtype = numpy.int16).reshape(-1, 2)
        audio = batch_extract_voice(audio, chunk_size, step_size)
        audio = prepare_voice(audio)
        spectrogram = create_spectrogram(audio)
        audio_frames = extract_audio_frames(spectrogram, fps)
        return audio_frames
    return None

def get_audio_frame(audio_path : str, fps : Fps, frame_number : int = 0) -> Optional[AudioFrame]:
    if is_audio(audio_path):
        audio_frames = read_static_audio(audio_path, fps)
        if frame_number in range(len(audio_frames)):
            return audio_frames[frame_number]
    return None

def get_voice_frame(audio_path : str, fps : Fps, frame_number : int = 0) -> Optional[AudioFrame]:
    if is_audio(audio_path):
        voice_frames = read_static_voice(audio_path, fps)
        if frame_number in range(len(voice_frames)):
            return voice_frames[frame_number]
    return None

def create_empty_audio_frame() -> AudioFrame:
    mel_filter_total = 80
    step_size = 16
    audio_frame = numpy.zeros((mel_filter_total, step_size)).astype(numpy.int16)
    return audio_frame

def prepare_audio(audio : numpy.ndarray[Any, Any]) -> Audio:
    if audio.ndim > 1:
        audio = numpy.mean(audio, axis = 1)
    audio = audio / numpy.max(numpy.abs(audio), axis = 0)
    audio = scipy.signal.lfilter([ 1.0, -0.97 ], [ 1.0 ], audio)
    return audio

def prepare_voice(audio : numpy.ndarray[Any, Any]) -> Audio:
    sample_rate = 48000
    resample_rate = 16000
    audio = scipy.signal.resample(audio, int(len(audio) * resample_rate / sample_rate))
    audio = prepare_audio(audio)
    return audio

def convert_hertz_to_mel(hertz : float) -> float:
    return 2595 * numpy.log10(1 + hertz / 700)

def convert_mel_to_hertz(mel : numpy.ndarray[Any, Any]) -> numpy.ndarray[Any, Any]:
    return 700 * (10 ** (mel / 2595) - 1)

def create_mel_filter_bank() -> MelFilterBank:
    mel_filter_total = 80
    mel_bin_total = 800
    sample_rate = 16000
    min_frequency = 55.0
    max_frequency = 7600.0
    mel_filter_bank = numpy.zeros((mel_filter_total, mel_bin_total // 2 + 1))
    mel_frequency_range = numpy.linspace(convert_hertz_to_mel(min_frequency), convert_hertz_to_mel(max_frequency), mel_filter_total + 2)
    indices = numpy.floor((mel_bin_total + 1) * convert_mel_to_hertz(mel_frequency_range) / sample_rate).astype(numpy.int16)

    for index in range(mel_filter_total):
        start = indices[index]
        end = indices[index + 1]
        mel_filter_bank[index, start:end] = scipy.signal.windows.triang(end - start)
    return mel_filter_bank

def create_spectrogram(audio : Audio) -> Spectrogram:
    mel_bin_total = 800
    mel_bin_overlap = 600
    mel_filter_bank = create_mel_filter_bank()
    spectrogram = scipy.signal.stft(audio, nperseg = mel_bin_total, nfft = mel_bin_total, noverlap = mel_bin_overlap)[2]
    spectrogram = numpy.dot(mel_filter_bank, numpy.abs(spectrogram))
    return spectrogram

def extract_audio_frames(spectrogram : Spectrogram, fps : Fps) -> List[AudioFrame]:
    mel_filter_total = 80
    step_size = 16
    audio_frames = []
    indices = numpy.arange(0, spectrogram.shape[1], mel_filter_total / fps).astype(numpy.int16)
    indices = indices[indices >= step_size]

    for index in indices:
        start = max(0, index - step_size)
        audio_frames.append(spectrogram[:, start:index])
    return audio_frames
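For orientation, a minimal usage sketch of the helpers above (the `audio.wav` path and the 25 fps rate are assumptions): `read_static_audio` caches the decoded frame list, `get_audio_frame` indexes into it, and `create_empty_audio_frame` supplies silence when a frame is out of range.
```python
# Minimal usage sketch (assumed inputs): fetch the mel-spectrogram slice for
# frame 40 of a 25 fps timeline, falling back to a silent frame.
from deepfuze.audio import get_audio_frame, create_empty_audio_frame

frame = get_audio_frame('audio.wav', 25.0, frame_number = 40)
if frame is None:
    frame = create_empty_audio_frame()
print(frame.shape)  # (80, 16): 80 mel filters x 16 spectrogram steps
```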
+37
@@ -0,0 +1,37 @@
from typing import List, Dict
from deepfuze.typing import VideoMemoryStrategy, FaceSelectorMode, FaceAnalyserOrder, FaceAnalyserAge, FaceAnalyserGender, FaceDetectorModel, FaceMaskType, FaceMaskRegion, TempFrameFormat, OutputVideoEncoder, OutputVideoPreset
from deepfuze.common_helper import create_int_range, create_float_range
video_memory_strategies : List[VideoMemoryStrategy] = [ 'strict', 'moderate', 'tolerant' ]
face_analyser_orders : List[FaceAnalyserOrder] = [ 'left-right', 'right-left', 'top-bottom', 'bottom-top', 'small-large', 'large-small', 'best-worst', 'worst-best' ]
face_analyser_ages : List[FaceAnalyserAge] = [ 'child', 'teen', 'adult', 'senior' ]
face_analyser_genders : List[FaceAnalyserGender] = [ 'female', 'male' ]
face_detector_set : Dict[FaceDetectorModel, List[str]] =\
{
'many': [ '640x640' ],
'retinaface': [ '160x160', '320x320', '480x480', '512x512', '640x640' ],
'scrfd': [ '160x160', '320x320', '480x480', '512x512', '640x640' ],
'yoloface': [ '640x640' ],
'yunet': [ '160x160', '320x320', '480x480', '512x512', '640x640', '768x768', '960x960', '1024x1024' ]
}
face_selector_modes : List[FaceSelectorMode] = [ 'many', 'one', 'reference' ]
face_mask_types : List[FaceMaskType] = [ 'box', 'occlusion', 'region' ]
face_mask_regions : List[FaceMaskRegion] = [ 'skin', 'left-eyebrow', 'right-eyebrow', 'left-eye', 'right-eye', 'glasses', 'nose', 'mouth', 'upper-lip', 'lower-lip' ]
temp_frame_formats : List[TempFrameFormat] = [ 'bmp', 'jpg', 'png' ]
output_video_encoders : List[OutputVideoEncoder] = [ 'libx264', 'libx265', 'libvpx-vp9', 'h264_nvenc', 'hevc_nvenc', 'h264_amf', 'hevc_amf' ]
output_video_presets : List[OutputVideoPreset] = [ 'ultrafast', 'superfast', 'veryfast', 'faster', 'fast', 'medium', 'slow', 'slower', 'veryslow' ]
image_template_sizes : List[float] = [ 0.25, 0.5, 0.75, 1, 1.5, 2, 2.5, 3, 3.5, 4 ]
video_template_sizes : List[int] = [ 240, 360, 480, 540, 720, 1080, 1440, 2160, 4320 ]
execution_thread_count_range : List[int] = create_int_range(1, 128, 1)
execution_queue_count_range : List[int] = create_int_range(1, 32, 1)
system_memory_limit_range : List[int] = create_int_range(0, 128, 1)
face_detector_score_range : List[float] = create_float_range(0.0, 1.0, 0.05)
face_landmarker_score_range : List[float] = create_float_range(0.0, 1.0, 0.05)
face_mask_blur_range : List[float] = create_float_range(0.0, 1.0, 0.05)
face_mask_padding_range : List[int] = create_int_range(0, 100, 1)
reference_face_distance_range : List[float] = create_float_range(0.0, 1.5, 0.05)
output_image_quality_range : List[int] = create_int_range(0, 100, 1)
output_video_quality_range : List[int] = create_int_range(0, 100, 1)
+46
@@ -0,0 +1,46 @@
from typing import List, Any
import platform
def create_metavar(ranges : List[Any]) -> str:
    return '[' + str(ranges[0]) + '-' + str(ranges[-1]) + ']'

def create_int_range(start : int, end : int, step : int) -> List[int]:
    int_range = []
    current = start

    while current <= end:
        int_range.append(current)
        current += step
    return int_range

def create_float_range(start : float, end : float, step : float) -> List[float]:
    float_range = []
    current = start

    while current <= end:
        float_range.append(round(current, 2))
        current = round(current + step, 2)
    return float_range

def is_linux() -> bool:
    return to_lower_case(platform.system()) == 'linux'

def is_macos() -> bool:
    return to_lower_case(platform.system()) == 'darwin'

def is_windows() -> bool:
    return to_lower_case(platform.system()) == 'windows'

def to_lower_case(__string__ : Any) -> str:
    return str(__string__).lower()

def get_first(__list__ : Any) -> Any:
    return next(iter(__list__), None)
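A short usage sketch: these ranges feed argparse `choices`, and `create_metavar` renders the matching help text.
```python
# Usage sketch: build the value range for a CLI option and its help metavar.
from deepfuze.common_helper import create_float_range, create_metavar

score_range = create_float_range(0.0, 1.0, 0.05)  # [0.0, 0.05, ..., 1.0]
print(create_metavar(score_range))                # '[0.0-1.0]'
```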
+91
@@ -0,0 +1,91 @@
from configparser import ConfigParser
from typing import Any, Optional, List
import deepfuze.globals
CONFIG = None
def get_config() -> ConfigParser:
    global CONFIG

    if CONFIG is None:
        CONFIG = ConfigParser()
        CONFIG.read(deepfuze.globals.config_path, encoding = 'utf-8')
    return CONFIG

def clear_config() -> None:
    global CONFIG

    CONFIG = None

def get_str_value(key : str, fallback : Optional[str] = None) -> Optional[str]:
    value = get_value_by_notation(key)

    if value or fallback:
        return str(value or fallback)
    return None

def get_int_value(key : str, fallback : Optional[str] = None) -> Optional[int]:
    value = get_value_by_notation(key)

    if value or fallback:
        return int(value or fallback)
    return None

def get_float_value(key : str, fallback : Optional[str] = None) -> Optional[float]:
    value = get_value_by_notation(key)

    if value or fallback:
        return float(value or fallback)
    return None

def get_bool_value(key : str, fallback : Optional[str] = None) -> Optional[bool]:
    value = get_value_by_notation(key)

    if value == 'True' or fallback == 'True':
        return True
    if value == 'False' or fallback == 'False':
        return False
    return None

def get_str_list(key : str, fallback : Optional[str] = None) -> Optional[List[str]]:
    value = get_value_by_notation(key)

    if value or fallback:
        return [ str(value) for value in (value or fallback).split(' ') ]
    return None

def get_int_list(key : str, fallback : Optional[str] = None) -> Optional[List[int]]:
    value = get_value_by_notation(key)

    if value or fallback:
        return [ int(value) for value in (value or fallback).split(' ') ]
    return None

def get_float_list(key : str, fallback : Optional[str] = None) -> Optional[List[float]]:
    value = get_value_by_notation(key)

    if value or fallback:
        return [ float(value) for value in (value or fallback).split(' ') ]
    return None

def get_value_by_notation(key : str) -> Optional[Any]:
    config = get_config()

    if '.' in key:
        section, name = key.split('.')
        if section in config and name in config[section]:
            return config[section][name]
    if key in config:
        return config[key]
    return None
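A sketch of the dotted-key lookup, assuming a `deepfuze.ini` with an `[execution]` section containing `execution_thread_count = 8`:
```python
# Sketch: dotted keys resolve to [section] name pairs in deepfuze.ini;
# the string fallback is used when the key is absent.
import deepfuze.globals
from deepfuze import config

deepfuze.globals.config_path = 'deepfuze.ini'
print(config.get_int_value('execution.execution_thread_count', '4'))  # 8, or 4 if unset
```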
+112
@@ -0,0 +1,112 @@
from typing import Any
from functools import lru_cache
from time import sleep
import cv2
import numpy
import onnxruntime
from tqdm import tqdm
import deepfuze.globals
from deepfuze import process_manager, wording
from deepfuze.thread_helper import thread_lock, conditional_thread_semaphore
from deepfuze.typing import VisionFrame, ModelSet, Fps
from deepfuze.execution import apply_execution_provider_options
from deepfuze.vision import get_video_frame, count_video_frame_total, read_image, detect_video_fps
from deepfuze.filesystem import resolve_relative_path, is_file
from deepfuze.download import conditional_download
CONTENT_ANALYSER = None
MODELS : ModelSet =\
{
'open_nsfw':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/open_nsfw.onnx',
'path': resolve_relative_path('../../../models/deepfuze/open_nsfw.onnx')
}
}
PROBABILITY_LIMIT = 0.80
RATE_LIMIT = 10
STREAM_COUNTER = 0
def get_content_analyser() -> Any:
    global CONTENT_ANALYSER

    with thread_lock():
        while process_manager.is_checking():
            sleep(0.5)
        if CONTENT_ANALYSER is None:
            model_path = MODELS.get('open_nsfw').get('path')
            CONTENT_ANALYSER = onnxruntime.InferenceSession(model_path, providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
    return CONTENT_ANALYSER

def clear_content_analyser() -> None:
    global CONTENT_ANALYSER

    CONTENT_ANALYSER = None

def pre_check() -> bool:
    download_directory_path = resolve_relative_path('../../../models/deepfuze')
    model_url = MODELS.get('open_nsfw').get('url')
    model_path = MODELS.get('open_nsfw').get('path')

    if not deepfuze.globals.skip_download:
        process_manager.check()
        conditional_download(download_directory_path, [ model_url ])
        process_manager.end()
    return is_file(model_path)

def analyse_stream(vision_frame : VisionFrame, video_fps : Fps) -> bool:
    global STREAM_COUNTER

    STREAM_COUNTER = STREAM_COUNTER + 1
    if STREAM_COUNTER % int(video_fps) == 0:
        return analyse_frame(vision_frame)
    return False

def analyse_frame(vision_frame : VisionFrame) -> bool:
    content_analyser = get_content_analyser()
    vision_frame = prepare_frame(vision_frame)

    with conditional_thread_semaphore(deepfuze.globals.execution_providers):
        probability = content_analyser.run(None,
        {
            content_analyser.get_inputs()[0].name: vision_frame
        })[0][0][1]
    return probability > PROBABILITY_LIMIT

def prepare_frame(vision_frame : VisionFrame) -> VisionFrame:
    vision_frame = cv2.resize(vision_frame, (224, 224)).astype(numpy.float32)
    vision_frame -= numpy.array([ 104, 117, 123 ]).astype(numpy.float32)
    vision_frame = numpy.expand_dims(vision_frame, axis = 0)
    return vision_frame

@lru_cache(maxsize = None)
def analyse_image(image_path : str) -> bool:
    frame = read_image(image_path)
    return analyse_frame(frame)

@lru_cache(maxsize = None)
def analyse_video(video_path : str, start_frame : int, end_frame : int) -> bool:
    video_frame_total = count_video_frame_total(video_path)
    video_fps = detect_video_fps(video_path)
    frame_range = range(start_frame or 0, end_frame or video_frame_total)
    rate = 0.0
    counter = 0

    with tqdm(total = len(frame_range), desc = wording.get('analysing'), unit = 'frame', ascii = ' =', disable = deepfuze.globals.log_level in [ 'warn', 'error' ]) as progress:
        for frame_number in frame_range:
            if frame_number % int(video_fps) == 0:
                frame = get_video_frame(video_path, frame_number)
                if analyse_frame(frame):
                    counter += 1
            rate = counter * int(video_fps) / len(frame_range) * 100
            progress.update()
            progress.set_postfix(rate = rate)
    return rate > RATE_LIMIT
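`analyse_video` probes one frame per second of footage and rejects the video when more than `RATE_LIMIT` percent of probes flag content. A worked example with assumed numbers:
```python
# Worked example (assumed numbers): a 30 fps clip with 3000 frames is probed
# every 30th frame (100 probes); 12 flagged probes give a rate of 12 percent.
video_fps, frame_total, flagged = 30, 3000, 12
rate = flagged * int(video_fps) / frame_total * 100  # same formula as analyse_video
print(rate, rate > 10)  # 12.0 True -> video is rejected
```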
+438
@@ -0,0 +1,438 @@
import os
os.environ['OMP_NUM_THREADS'] = '1'
import signal
import sys
import warnings
import shutil
import numpy
import onnxruntime
from time import sleep, time
from argparse import ArgumentParser, HelpFormatter
import deepfuze.choices
import deepfuze.globals
from deepfuze.face_analyser import get_one_face, get_average_face
from deepfuze.face_store import get_reference_faces, append_reference_face
from deepfuze import face_analyser, face_masker, content_analyser, config, process_manager, metadata, logger, wording, voice_extractor
from deepfuze.content_analyser import analyse_image, analyse_video
from deepfuze.processors.frame.core import get_frame_processors_modules, load_frame_processor_module
from deepfuze.common_helper import create_metavar, get_first
from deepfuze.execution import encode_execution_providers, decode_execution_providers
from deepfuze.normalizer import normalize_output_path, normalize_padding, normalize_fps
from deepfuze.memory import limit_system_memory
from deepfuze.statistics import conditional_log_statistics
from deepfuze.download import conditional_download
from deepfuze.filesystem import get_temp_frame_paths, get_temp_file_path, create_temp, move_temp, clear_temp, is_image, is_video, filter_audio_paths, resolve_relative_path, list_directory
from deepfuze.ffmpeg import extract_frames, merge_video, copy_image, finalize_image, restore_audio, replace_audio
from deepfuze.vision import read_image, read_static_images, detect_image_resolution, restrict_video_fps, create_image_resolutions, get_video_frame, detect_video_resolution, detect_video_fps, restrict_video_resolution, restrict_image_resolution, create_video_resolutions, pack_resolution, unpack_resolution
onnxruntime.set_default_logger_severity(3)
warnings.filterwarnings('ignore', category = UserWarning, module = 'gradio')
def cli() -> None:
signal.signal(signal.SIGINT, lambda signal_number, frame: destroy())
program = ArgumentParser(formatter_class = lambda prog: HelpFormatter(prog, max_help_position = 200), add_help = False)
# general
program.add_argument('-c', '--config', help = wording.get('help.config'), dest = 'config_path', default = 'deepfuze.ini')
apply_config(program)
program.add_argument('-s', '--source', help = wording.get('help.source'), action = 'append', dest = 'source_paths', default = config.get_str_list('general.source_paths'))
program.add_argument('-t', '--target', help = wording.get('help.target'), dest = 'target_path', default = config.get_str_value('general.target_path'))
program.add_argument('-o', '--output', help = wording.get('help.output'), dest = 'output_path', default = config.get_str_value('general.output_path'))
program.add_argument('-v', '--version', version = metadata.get('name') + ' ' + metadata.get('version'), action = 'version')
# misc
group_misc = program.add_argument_group('misc')
group_misc.add_argument('--force-download', help = wording.get('help.force_download'), action = 'store_true', default = config.get_bool_value('misc.force_download'))
group_misc.add_argument('--skip-download', help = wording.get('help.skip_download'), action = 'store_true', default = config.get_bool_value('misc.skip_download'))
group_misc.add_argument('--headless', help = wording.get('help.headless'), action = 'store_true', default = config.get_bool_value('misc.headless'))
group_misc.add_argument('--log-level', help = wording.get('help.log_level'), default = config.get_str_value('misc.log_level', 'info'), choices = logger.get_log_levels())
# execution
execution_providers = encode_execution_providers(onnxruntime.get_available_providers())
group_execution = program.add_argument_group('execution')
group_execution.add_argument('--execution-device-id', help = wording.get('help.execution_device_id'), default = config.get_str_value('execution.execution_device_id', '0'))
group_execution.add_argument('--execution-providers', help = wording.get('help.execution_providers').format(choices = ', '.join(execution_providers)), default = config.get_str_list('execution.execution_providers', 'cpu'), choices = execution_providers, nargs = '+', metavar = 'EXECUTION_PROVIDERS')
group_execution.add_argument('--execution-thread-count', help = wording.get('help.execution_thread_count'), type = int, default = config.get_int_value('execution.execution_thread_count', '4'), choices = deepfuze.choices.execution_thread_count_range, metavar = create_metavar(deepfuze.choices.execution_thread_count_range))
group_execution.add_argument('--execution-queue-count', help = wording.get('help.execution_queue_count'), type = int, default = config.get_int_value('execution.execution_queue_count', '1'), choices = deepfuze.choices.execution_queue_count_range, metavar = create_metavar(deepfuze.choices.execution_queue_count_range))
# memory
group_memory = program.add_argument_group('memory')
group_memory.add_argument('--video-memory-strategy', help = wording.get('help.video_memory_strategy'), default = config.get_str_value('memory.video_memory_strategy', 'strict'), choices = deepfuze.choices.video_memory_strategies)
group_memory.add_argument('--system-memory-limit', help = wording.get('help.system_memory_limit'), type = int, default = config.get_int_value('memory.system_memory_limit', '0'), choices = deepfuze.choices.system_memory_limit_range, metavar = create_metavar(deepfuze.choices.system_memory_limit_range))
# face analyser
group_face_analyser = program.add_argument_group('face analyser')
group_face_analyser.add_argument('--face-analyser-order', help = wording.get('help.face_analyser_order'), default = config.get_str_value('face_analyser.face_analyser_order', 'left-right'), choices = deepfuze.choices.face_analyser_orders)
group_face_analyser.add_argument('--face-analyser-age', help = wording.get('help.face_analyser_age'), default = config.get_str_value('face_analyser.face_analyser_age'), choices = deepfuze.choices.face_analyser_ages)
group_face_analyser.add_argument('--face-analyser-gender', help = wording.get('help.face_analyser_gender'), default = config.get_str_value('face_analyser.face_analyser_gender'), choices = deepfuze.choices.face_analyser_genders)
group_face_analyser.add_argument('--face-detector-model', help = wording.get('help.face_detector_model'), default = config.get_str_value('face_analyser.face_detector_model', 'yoloface'), choices = deepfuze.choices.face_detector_set.keys())
group_face_analyser.add_argument('--face-detector-size', help = wording.get('help.face_detector_size'), default = config.get_str_value('face_analyser.face_detector_size', '640x640'))
group_face_analyser.add_argument('--face-detector-score', help = wording.get('help.face_detector_score'), type = float, default = config.get_float_value('face_analyser.face_detector_score', '0.5'), choices = deepfuze.choices.face_detector_score_range, metavar = create_metavar(deepfuze.choices.face_detector_score_range))
group_face_analyser.add_argument('--face-landmarker-score', help = wording.get('help.face_landmarker_score'), type = float, default = config.get_float_value('face_analyser.face_landmarker_score', '0.5'), choices = deepfuze.choices.face_landmarker_score_range, metavar = create_metavar(deepfuze.choices.face_landmarker_score_range))
# face selector
group_face_selector = program.add_argument_group('face selector')
group_face_selector.add_argument('--face-selector-mode', help = wording.get('help.face_selector_mode'), default = config.get_str_value('face_selector.face_selector_mode', 'reference'), choices = deepfuze.choices.face_selector_modes)
group_face_selector.add_argument('--reference-face-position', help = wording.get('help.reference_face_position'), type = int, default = config.get_int_value('face_selector.reference_face_position', '0'))
group_face_selector.add_argument('--reference-face-distance', help = wording.get('help.reference_face_distance'), type = float, default = config.get_float_value('face_selector.reference_face_distance', '0.6'), choices = deepfuze.choices.reference_face_distance_range, metavar = create_metavar(deepfuze.choices.reference_face_distance_range))
group_face_selector.add_argument('--reference-frame-number', help = wording.get('help.reference_frame_number'), type = int, default = config.get_int_value('face_selector.reference_frame_number', '0'))
# face mask
group_face_mask = program.add_argument_group('face mask')
group_face_mask.add_argument('--face-mask-types', help = wording.get('help.face_mask_types').format(choices = ', '.join(deepfuze.choices.face_mask_types)), default = config.get_str_list('face_mask.face_mask_types', 'box'), choices = deepfuze.choices.face_mask_types, nargs = '+', metavar = 'FACE_MASK_TYPES')
group_face_mask.add_argument('--face-mask-blur', help = wording.get('help.face_mask_blur'), type = float, default = config.get_float_value('face_mask.face_mask_blur', '0.3'), choices = deepfuze.choices.face_mask_blur_range, metavar = create_metavar(deepfuze.choices.face_mask_blur_range))
group_face_mask.add_argument('--face-mask-padding', help = wording.get('help.face_mask_padding'), type = int, default = config.get_int_list('face_mask.face_mask_padding', '0 0 0 0'), nargs = '+')
group_face_mask.add_argument('--face-mask-regions', help = wording.get('help.face_mask_regions').format(choices = ', '.join(deepfuze.choices.face_mask_regions)), default = config.get_str_list('face_mask.face_mask_regions', ' '.join(deepfuze.choices.face_mask_regions)), choices = deepfuze.choices.face_mask_regions, nargs = '+', metavar = 'FACE_MASK_REGIONS')
# frame extraction
group_frame_extraction = program.add_argument_group('frame extraction')
group_frame_extraction.add_argument('--trim-frame-start', help = wording.get('help.trim_frame_start'), type = int, default = deepfuze.config.get_int_value('frame_extraction.trim_frame_start'))
group_frame_extraction.add_argument('--trim-frame-end', help = wording.get('help.trim_frame_end'), type = int, default = deepfuze.config.get_int_value('frame_extraction.trim_frame_end'))
group_frame_extraction.add_argument('--temp-frame-format', help = wording.get('help.temp_frame_format'), default = config.get_str_value('frame_extraction.temp_frame_format', 'png'), choices = deepfuze.choices.temp_frame_formats)
group_frame_extraction.add_argument('--keep-temp', help = wording.get('help.keep_temp'), action = 'store_true', default = config.get_bool_value('frame_extraction.keep_temp'))
# output creation
group_output_creation = program.add_argument_group('output creation')
group_output_creation.add_argument('--output-image-quality', help = wording.get('help.output_image_quality'), type = int, default = config.get_int_value('output_creation.output_image_quality', '80'), choices = deepfuze.choices.output_image_quality_range, metavar = create_metavar(deepfuze.choices.output_image_quality_range))
group_output_creation.add_argument('--output-image-resolution', help = wording.get('help.output_image_resolution'), default = config.get_str_value('output_creation.output_image_resolution'))
group_output_creation.add_argument('--output-video-encoder', help = wording.get('help.output_video_encoder'), default = config.get_str_value('output_creation.output_video_encoder', 'libx264'), choices = deepfuze.choices.output_video_encoders)
group_output_creation.add_argument('--output-video-preset', help = wording.get('help.output_video_preset'), default = config.get_str_value('output_creation.output_video_preset', 'veryfast'), choices = deepfuze.choices.output_video_presets)
group_output_creation.add_argument('--output-video-quality', help = wording.get('help.output_video_quality'), type = int, default = config.get_int_value('output_creation.output_video_quality', '80'), choices = deepfuze.choices.output_video_quality_range, metavar = create_metavar(deepfuze.choices.output_video_quality_range))
group_output_creation.add_argument('--output-video-resolution', help = wording.get('help.output_video_resolution'), default = config.get_str_value('output_creation.output_video_resolution'))
group_output_creation.add_argument('--output-video-fps', help = wording.get('help.output_video_fps'), type = float, default = config.get_str_value('output_creation.output_video_fps'))
group_output_creation.add_argument('--skip-audio', help = wording.get('help.skip_audio'), action = 'store_true', default = config.get_bool_value('output_creation.skip_audio'))
# frame processors
available_frame_processors = list_directory('deepfuze/processors/frame/modules')
program = ArgumentParser(parents = [ program ], formatter_class = program.formatter_class, add_help = True)
group_frame_processors = program.add_argument_group('frame processors')
group_frame_processors.add_argument('--frame-processors', help = wording.get('help.frame_processors').format(choices = ', '.join(available_frame_processors)), default = config.get_str_list('frame_processors.frame_processors', 'face_swapper'), nargs = '+')
for frame_processor in available_frame_processors:
frame_processor_module = load_frame_processor_module(frame_processor)
frame_processor_module.register_args(group_frame_processors)
# uis
available_ui_layouts = list_directory('deepfuze/uis/layouts')
group_uis = program.add_argument_group('uis')
group_uis.add_argument('--open-browser', help=wording.get('help.open_browser'), action = 'store_true', default = config.get_bool_value('uis.open_browser'))
group_uis.add_argument('--ui-layouts', help = wording.get('help.ui_layouts').format(choices = ', '.join(available_ui_layouts)), default = config.get_str_list('uis.ui_layouts', 'default'), nargs = '+')
run(program)
def apply_config(program : ArgumentParser) -> None:
known_args = program.parse_known_args()
deepfuze.globals.config_path = get_first(known_args).config_path
def validate_args(program : ArgumentParser) -> None:
try:
for action in program._actions:
if action.default:
if isinstance(action.default, list):
for default in action.default:
program._check_value(action, default)
else:
program._check_value(action, action.default)
except Exception as exception:
program.error(str(exception))
def apply_args(program : ArgumentParser) -> None:
args = program.parse_args()
# general
deepfuze.globals.source_paths = args.source_paths
deepfuze.globals.target_path = args.target_path
deepfuze.globals.output_path = args.output_path
# misc
deepfuze.globals.force_download = args.force_download
deepfuze.globals.skip_download = args.skip_download
deepfuze.globals.headless = args.headless
deepfuze.globals.log_level = args.log_level
# execution
deepfuze.globals.execution_device_id = args.execution_device_id
deepfuze.globals.execution_providers = decode_execution_providers(args.execution_providers)
deepfuze.globals.execution_thread_count = args.execution_thread_count
deepfuze.globals.execution_queue_count = args.execution_queue_count
# memory
deepfuze.globals.video_memory_strategy = args.video_memory_strategy
deepfuze.globals.system_memory_limit = args.system_memory_limit
# face analyser
deepfuze.globals.face_analyser_order = args.face_analyser_order
deepfuze.globals.face_analyser_age = args.face_analyser_age
deepfuze.globals.face_analyser_gender = args.face_analyser_gender
deepfuze.globals.face_detector_model = args.face_detector_model
if args.face_detector_size in deepfuze.choices.face_detector_set[args.face_detector_model]:
deepfuze.globals.face_detector_size = args.face_detector_size
else:
deepfuze.globals.face_detector_size = '640x640'
deepfuze.globals.face_detector_score = args.face_detector_score
deepfuze.globals.face_landmarker_score = args.face_landmarker_score
# face selector
deepfuze.globals.face_selector_mode = args.face_selector_mode
deepfuze.globals.reference_face_position = args.reference_face_position
deepfuze.globals.reference_face_distance = args.reference_face_distance
deepfuze.globals.reference_frame_number = args.reference_frame_number
# face mask
deepfuze.globals.face_mask_types = args.face_mask_types
deepfuze.globals.face_mask_blur = args.face_mask_blur
deepfuze.globals.face_mask_padding = normalize_padding(args.face_mask_padding)
deepfuze.globals.face_mask_regions = args.face_mask_regions
# frame extraction
deepfuze.globals.trim_frame_start = args.trim_frame_start
deepfuze.globals.trim_frame_end = args.trim_frame_end
deepfuze.globals.temp_frame_format = args.temp_frame_format
deepfuze.globals.keep_temp = args.keep_temp
# output creation
deepfuze.globals.output_image_quality = args.output_image_quality
if is_image(args.target_path):
output_image_resolution = detect_image_resolution(args.target_path)
output_image_resolutions = create_image_resolutions(output_image_resolution)
if args.output_image_resolution in output_image_resolutions:
deepfuze.globals.output_image_resolution = args.output_image_resolution
else:
deepfuze.globals.output_image_resolution = pack_resolution(output_image_resolution)
deepfuze.globals.output_video_encoder = args.output_video_encoder
deepfuze.globals.output_video_preset = args.output_video_preset
deepfuze.globals.output_video_quality = args.output_video_quality
if is_video(args.target_path):
output_video_resolution = detect_video_resolution(args.target_path)
output_video_resolutions = create_video_resolutions(output_video_resolution)
if args.output_video_resolution in output_video_resolutions:
deepfuze.globals.output_video_resolution = args.output_video_resolution
else:
deepfuze.globals.output_video_resolution = pack_resolution(output_video_resolution)
if args.output_video_fps or is_video(args.target_path):
deepfuze.globals.output_video_fps = normalize_fps(args.output_video_fps) or detect_video_fps(args.target_path)
deepfuze.globals.skip_audio = args.skip_audio
# frame processors
available_frame_processors = list_directory('deepfuze/processors/frame/modules')
deepfuze.globals.frame_processors = args.frame_processors
for frame_processor in available_frame_processors:
frame_processor_module = load_frame_processor_module(frame_processor)
frame_processor_module.apply_args(program)
# uis
deepfuze.globals.open_browser = args.open_browser
deepfuze.globals.ui_layouts = args.ui_layouts
def run(program : ArgumentParser) -> None:
validate_args(program)
apply_args(program)
logger.init(deepfuze.globals.log_level)
if deepfuze.globals.system_memory_limit > 0:
limit_system_memory(deepfuze.globals.system_memory_limit)
if deepfuze.globals.force_download:
force_download()
return
if not pre_check() or not content_analyser.pre_check() or not face_analyser.pre_check() or not face_masker.pre_check() or not voice_extractor.pre_check():
return
for frame_processor_module in get_frame_processors_modules(deepfuze.globals.frame_processors):
if not frame_processor_module.pre_check():
return
if deepfuze.globals.headless:
conditional_process()
else:
import deepfuze.uis.core as ui
for ui_layout in ui.get_ui_layouts_modules(deepfuze.globals.ui_layouts):
if not ui_layout.pre_check():
return
ui.launch()
def destroy() -> None:
process_manager.stop()
while process_manager.is_processing():
sleep(0.5)
if deepfuze.globals.target_path:
clear_temp(deepfuze.globals.target_path)
sys.exit(0)
def pre_check() -> bool:
if sys.version_info < (3, 9):
logger.error(wording.get('python_not_supported').format(version = '3.9'), __name__.upper())
return False
if not shutil.which('ffmpeg'):
logger.error(wording.get('ffmpeg_not_installed'), __name__.upper())
return False
return True
def conditional_process() -> None:
start_time = time()
for frame_processor_module in get_frame_processors_modules(deepfuze.globals.frame_processors):
while not frame_processor_module.post_check():
logger.disable()
sleep(0.5)
logger.enable()
if not frame_processor_module.pre_process('output'):
return
conditional_append_reference_faces()
if is_image(deepfuze.globals.target_path):
process_image(start_time)
if is_video(deepfuze.globals.target_path):
process_video(start_time)
def conditional_append_reference_faces() -> None:
if 'reference' in deepfuze.globals.face_selector_mode and not get_reference_faces():
source_frames = read_static_images(deepfuze.globals.source_paths)
source_face = get_average_face(source_frames)
if is_video(deepfuze.globals.target_path):
reference_frame = get_video_frame(deepfuze.globals.target_path, deepfuze.globals.reference_frame_number)
else:
reference_frame = read_image(deepfuze.globals.target_path)
reference_face = get_one_face(reference_frame, deepfuze.globals.reference_face_position)
append_reference_face('origin', reference_face)
if source_face and reference_face:
for frame_processor_module in get_frame_processors_modules(deepfuze.globals.frame_processors):
abstract_reference_frame = frame_processor_module.get_reference_frame(source_face, reference_face, reference_frame)
if numpy.any(abstract_reference_frame):
reference_frame = abstract_reference_frame
reference_face = get_one_face(reference_frame, deepfuze.globals.reference_face_position)
append_reference_face(frame_processor_module.__name__, reference_face)
def force_download() -> None:
download_directory_path = resolve_relative_path('../../../models/deepfuze')
available_frame_processors = list_directory('deepfuze/processors/frame/modules')
model_list =\
[
content_analyser.MODELS,
face_analyser.MODELS,
face_masker.MODELS,
voice_extractor.MODELS
]
for frame_processor_module in get_frame_processors_modules(available_frame_processors):
if hasattr(frame_processor_module, 'MODELS'):
model_list.append(frame_processor_module.MODELS)
model_urls = [ models[model].get('url') for models in model_list for model in models ]
conditional_download(download_directory_path, model_urls)
def process_image(start_time : float) -> None:
normed_output_path = normalize_output_path(deepfuze.globals.target_path, deepfuze.globals.output_path)
if analyse_image(deepfuze.globals.target_path):
return
# clear temp
logger.debug(wording.get('clearing_temp'), __name__.upper())
clear_temp(deepfuze.globals.target_path)
# create temp
logger.debug(wording.get('creating_temp'), __name__.upper())
create_temp(deepfuze.globals.target_path)
# copy image
process_manager.start()
temp_image_resolution = pack_resolution(restrict_image_resolution(deepfuze.globals.target_path, unpack_resolution(deepfuze.globals.output_image_resolution)))
logger.info(wording.get('copying_image').format(resolution = temp_image_resolution), __name__.upper())
if copy_image(deepfuze.globals.target_path, temp_image_resolution):
logger.debug(wording.get('copying_image_succeed'), __name__.upper())
else:
logger.error(wording.get('copying_image_failed'), __name__.upper())
return
# process image
temp_file_path = get_temp_file_path(deepfuze.globals.target_path)
for frame_processor_module in get_frame_processors_modules(deepfuze.globals.frame_processors):
logger.info(wording.get('processing'), frame_processor_module.NAME)
frame_processor_module.process_image(deepfuze.globals.source_paths, temp_file_path, temp_file_path)
frame_processor_module.post_process()
if is_process_stopping():
return
# finalize image
logger.info(wording.get('finalizing_image').format(resolution = deepfuze.globals.output_image_resolution), __name__.upper())
if finalize_image(deepfuze.globals.target_path, normed_output_path, deepfuze.globals.output_image_resolution):
logger.debug(wording.get('finalizing_image_succeed'), __name__.upper())
else:
logger.warn(wording.get('finalizing_image_skipped'), __name__.upper())
# clear temp
logger.debug(wording.get('clearing_temp'), __name__.upper())
clear_temp(deepfuze.globals.target_path)
# validate image
if is_image(normed_output_path):
seconds = '{:.2f}'.format((time() - start_time) % 60)
logger.info(wording.get('processing_image_succeed').format(seconds = seconds), __name__.upper())
conditional_log_statistics()
else:
logger.error(wording.get('processing_image_failed'), __name__.upper())
process_manager.end()
def process_video(start_time : float) -> None:
normed_output_path = normalize_output_path(deepfuze.globals.target_path, deepfuze.globals.output_path)
if analyse_video(deepfuze.globals.target_path, deepfuze.globals.trim_frame_start, deepfuze.globals.trim_frame_end):
return
# clear temp
logger.debug(wording.get('clearing_temp'), __name__.upper())
clear_temp(deepfuze.globals.target_path)
# create temp
logger.debug(wording.get('creating_temp'), __name__.upper())
create_temp(deepfuze.globals.target_path)
# extract frames
process_manager.start()
temp_video_resolution = pack_resolution(restrict_video_resolution(deepfuze.globals.target_path, unpack_resolution(deepfuze.globals.output_video_resolution)))
temp_video_fps = restrict_video_fps(deepfuze.globals.target_path, deepfuze.globals.output_video_fps)
logger.info(wording.get('extracting_frames').format(resolution = temp_video_resolution, fps = temp_video_fps), __name__.upper())
if extract_frames(deepfuze.globals.target_path, temp_video_resolution, temp_video_fps):
logger.debug(wording.get('extracting_frames_succeed'), __name__.upper())
else:
if is_process_stopping():
return
logger.error(wording.get('extracting_frames_failed'), __name__.upper())
return
# process frames
temp_frame_paths = get_temp_frame_paths(deepfuze.globals.target_path)
if temp_frame_paths:
for frame_processor_module in get_frame_processors_modules(deepfuze.globals.frame_processors):
logger.info(wording.get('processing'), frame_processor_module.NAME)
frame_processor_module.process_video(deepfuze.globals.source_paths, temp_frame_paths)
frame_processor_module.post_process()
if is_process_stopping():
return
else:
logger.error(wording.get('temp_frames_not_found'), __name__.upper())
return
# merge video
logger.info(wording.get('merging_video').format(resolution = deepfuze.globals.output_video_resolution, fps = deepfuze.globals.output_video_fps), __name__.upper())
if merge_video(deepfuze.globals.target_path, deepfuze.globals.output_video_resolution, deepfuze.globals.output_video_fps):
logger.debug(wording.get('merging_video_succeed'), __name__.upper())
else:
if is_process_stopping():
return
logger.error(wording.get('merging_video_failed'), __name__.upper())
return
# handle audio
if deepfuze.globals.skip_audio:
logger.info(wording.get('skipping_audio'), __name__.upper())
move_temp(deepfuze.globals.target_path, normed_output_path)
else:
if 'lip_syncer' in deepfuze.globals.frame_processors:
source_audio_path = get_first(filter_audio_paths(deepfuze.globals.source_paths))
if source_audio_path and replace_audio(deepfuze.globals.target_path, source_audio_path, normed_output_path):
logger.debug(wording.get('restoring_audio_succeed'), __name__.upper())
else:
if is_process_stopping():
return
logger.warn(wording.get('restoring_audio_skipped'), __name__.upper())
move_temp(deepfuze.globals.target_path, normed_output_path)
else:
if restore_audio(deepfuze.globals.target_path, normed_output_path, deepfuze.globals.output_video_fps):
logger.debug(wording.get('restoring_audio_succeed'), __name__.upper())
else:
if is_process_stopping():
return
logger.warn(wording.get('restoring_audio_skipped'), __name__.upper())
move_temp(deepfuze.globals.target_path, normed_output_path)
# clear temp
logger.debug(wording.get('clearing_temp'), __name__.upper())
clear_temp(deepfuze.globals.target_path)
# validate video
if is_video(normed_output_path):
seconds = '{:.2f}'.format((time() - start_time))
logger.info(wording.get('processing_video_succeed').format(seconds = seconds), __name__.upper())
conditional_log_statistics()
else:
logger.error(wording.get('processing_video_failed'), __name__.upper())
process_manager.end()
def is_process_stopping() -> bool:
if process_manager.is_stopping():
process_manager.end()
logger.info(wording.get('processing_stopped'), __name__.upper())
return process_manager.is_pending()
+49
@@ -0,0 +1,49 @@
import os
import subprocess
import ssl
import urllib.request
from typing import List
from functools import lru_cache
from tqdm import tqdm
import deepfuze.globals
from deepfuze import wording
from deepfuze.common_helper import is_macos
from deepfuze.filesystem import get_file_size, is_file
if is_macos():
    ssl._create_default_https_context = ssl._create_unverified_context

def conditional_download(download_directory_path : str, urls : List[str]) -> None:
    for url in urls:
        download_file_path = os.path.join(download_directory_path, os.path.basename(url))
        initial_size = get_file_size(download_file_path)
        download_size = get_download_size(url)

        if initial_size < download_size:
            with tqdm(total = download_size, initial = initial_size, desc = wording.get('downloading'), unit = 'B', unit_scale = True, unit_divisor = 1024, ascii = ' =', disable = deepfuze.globals.log_level in [ 'warn', 'error' ]) as progress:
                subprocess.Popen([ 'curl', '--create-dirs', '--silent', '--insecure', '--location', '--continue-at', '-', '--output', download_file_path, url ])
                current_size = initial_size

                while current_size < download_size:
                    if is_file(download_file_path):
                        current_size = get_file_size(download_file_path)
                        progress.update(current_size - progress.n)
        if download_size and not is_download_done(url, download_file_path):
            os.remove(download_file_path)
            conditional_download(download_directory_path, [ url ])

@lru_cache(maxsize = None)
def get_download_size(url : str) -> int:
    try:
        response = urllib.request.urlopen(url, timeout = 10)
        return int(response.getheader('Content-Length'))
    except (OSError, ValueError):
        return 0

def is_download_done(url : str, file_path : str) -> bool:
    if is_file(file_path):
        return get_download_size(url) == get_file_size(file_path)
    return False
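Downloads resume via `curl --continue-at -`, so an interrupted fetch restarts where it stopped. A sketch using the `open_nsfw` URL declared in `content_analyser` above (the target directory is an assumption):
```python
# Sketch: fetch (or resume) a model listed elsewhere in this commit.
from deepfuze.download import conditional_download

conditional_download('models/deepfuze', [
    'https://github.com/facefusion/facefusion-assets/releases/download/models/open_nsfw.onnx'
])
```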
+112
@@ -0,0 +1,112 @@
from typing import List, Any
from functools import lru_cache
import subprocess
import xml.etree.ElementTree as ElementTree
import onnxruntime
from deepfuze.typing import ExecutionDevice, ValueAndUnit
def encode_execution_providers(execution_providers : List[str]) -> List[str]:
    return [ execution_provider.replace('ExecutionProvider', '').lower() for execution_provider in execution_providers ]

def decode_execution_providers(execution_providers : List[str]) -> List[str]:
    available_execution_providers = onnxruntime.get_available_providers()
    encoded_execution_providers = encode_execution_providers(available_execution_providers)

    return [ execution_provider for execution_provider, encoded_execution_provider in zip(available_execution_providers, encoded_execution_providers) if any(execution_provider in encoded_execution_provider for execution_provider in execution_providers) ]

def has_execution_provider(execution_provider : str) -> bool:
    return execution_provider in onnxruntime.get_available_providers()

def apply_execution_provider_options(execution_device_id : str, execution_providers : List[str]) -> List[Any]:
    execution_providers_with_options : List[Any] = []

    for execution_provider in execution_providers:
        if execution_provider == 'CUDAExecutionProvider':
            execution_providers_with_options.append((execution_provider,
            {
                'device_id': execution_device_id,
                'cudnn_conv_algo_search': 'EXHAUSTIVE' if use_exhaustive() else 'DEFAULT'
            }))
        elif execution_provider == 'OpenVINOExecutionProvider':
            execution_providers_with_options.append((execution_provider,
            {
                'device_id': execution_device_id,
                'device_type': execution_device_id + '_FP32'
            }))
        elif execution_provider in [ 'DmlExecutionProvider', 'ROCMExecutionProvider' ]:
            execution_providers_with_options.append((execution_provider,
            {
                'device_id': execution_device_id
            }))
        else:
            execution_providers_with_options.append(execution_provider)
    return execution_providers_with_options

def use_exhaustive() -> bool:
    execution_devices = detect_static_execution_devices()
    product_names = ('GeForce GTX 1630', 'GeForce GTX 1650', 'GeForce GTX 1660')

    return any(execution_device.get('product').get('name').startswith(product_names) for execution_device in execution_devices)

def run_nvidia_smi() -> subprocess.Popen[bytes]:
    commands = [ 'nvidia-smi', '--query', '--xml-format' ]
    return subprocess.Popen(commands, stdout = subprocess.PIPE)

@lru_cache(maxsize = None)
def detect_static_execution_devices() -> List[ExecutionDevice]:
    return detect_execution_devices()

def detect_execution_devices() -> List[ExecutionDevice]:
    execution_devices : List[ExecutionDevice] = []

    try:
        output, _ = run_nvidia_smi().communicate()
        root_element = ElementTree.fromstring(output)
    except Exception:
        root_element = ElementTree.Element('xml')

    for gpu_element in root_element.findall('gpu'):
        execution_devices.append(
        {
            'driver_version': root_element.find('driver_version').text,
            'framework':
            {
                'name': 'CUDA',
                'version': root_element.find('cuda_version').text
            },
            'product':
            {
                'vendor': 'NVIDIA',
                'name': gpu_element.find('product_name').text.replace('NVIDIA ', '')
            },
            'video_memory':
            {
                'total': create_value_and_unit(gpu_element.find('fb_memory_usage/total').text),
                'free': create_value_and_unit(gpu_element.find('fb_memory_usage/free').text)
            },
            'utilization':
            {
                'gpu': create_value_and_unit(gpu_element.find('utilization/gpu_util').text),
                'memory': create_value_and_unit(gpu_element.find('utilization/memory_util').text)
            }
        })
    return execution_devices

def create_value_and_unit(text : str) -> ValueAndUnit:
    value, unit = text.split()
    value_and_unit : ValueAndUnit =\
    {
        'value': value,
        'unit': unit
    }
    return value_and_unit
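A sketch of how the provider helpers compose (output shown for a CUDA-capable machine; actual values depend on the installed ONNX Runtime build):
```python
# Sketch: decode the CLI's short provider names back to ONNX Runtime names,
# then attach per-device options for session creation.
from deepfuze.execution import decode_execution_providers, apply_execution_provider_options

providers = decode_execution_providers([ 'cuda', 'cpu' ])
# e.g. [ 'CUDAExecutionProvider', 'CPUExecutionProvider' ]
print(apply_execution_provider_options('0', providers))
# e.g. [('CUDAExecutionProvider', {'device_id': '0', 'cudnn_conv_algo_search': 'DEFAULT'}), 'CPUExecutionProvider']
```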
+586
@@ -0,0 +1,586 @@
from typing import Any, Optional, List, Tuple
from time import sleep
import cv2
import numpy
import onnxruntime
import deepfuze.globals
from deepfuze import process_manager
from deepfuze.common_helper import get_first
from deepfuze.face_helper import estimate_matrix_by_face_landmark_5, warp_face_by_face_landmark_5, warp_face_by_translation, create_static_anchors, distance_to_face_landmark_5, distance_to_bounding_box, convert_face_landmark_68_to_5, apply_nms, categorize_age, categorize_gender
from deepfuze.face_store import get_static_faces, set_static_faces
from deepfuze.execution import apply_execution_provider_options
from deepfuze.download import conditional_download
from deepfuze.filesystem import resolve_relative_path, is_file
from deepfuze.thread_helper import thread_lock, thread_semaphore, conditional_thread_semaphore
from deepfuze.typing import VisionFrame, Face, FaceSet, FaceAnalyserOrder, FaceAnalyserAge, FaceAnalyserGender, ModelSet, BoundingBox, FaceLandmarkSet, FaceLandmark5, FaceLandmark68, Score, FaceScoreSet, Embedding
from deepfuze.vision import resize_frame_resolution, unpack_resolution
FACE_ANALYSER = None
MODELS : ModelSet =\
{
'face_detector_retinaface':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/retinaface_10g.onnx',
'path': resolve_relative_path('../../../models/deepfuze/retinaface_10g.onnx')
},
'face_detector_scrfd':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/scrfd_2.5g.onnx',
'path': resolve_relative_path('../../../models/deepfuze/scrfd_2.5g.onnx')
},
'face_detector_yoloface':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/yoloface_8n.onnx',
'path': resolve_relative_path('../../../models/deepfuze/yoloface_8n.onnx')
},
'face_detector_yunet':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/yunet_2023mar.onnx',
'path': resolve_relative_path('../../../models/deepfuze/yunet_2023mar.onnx')
},
'face_recognizer_arcface_blendswap':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/arcface_w600k_r50.onnx',
'path': resolve_relative_path('../../../models/deepfuze/arcface_w600k_r50.onnx')
},
'face_recognizer_arcface_inswapper':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/arcface_w600k_r50.onnx',
'path': resolve_relative_path('../../../models/deepfuze/arcface_w600k_r50.onnx')
},
'face_recognizer_arcface_simswap':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/arcface_simswap.onnx',
'path': resolve_relative_path('../../../models/deepfuze/arcface_simswap.onnx')
},
'face_recognizer_arcface_uniface':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/arcface_w600k_r50.onnx',
'path': resolve_relative_path('../../../models/deepfuze/arcface_w600k_r50.onnx')
},
'face_landmarker_68':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/2dfan4.onnx',
'path': resolve_relative_path('../../../models/deepfuze/2dfan4.onnx')
},
'face_landmarker_68_5':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/face_landmarker_68_5.onnx',
'path': resolve_relative_path('../../../models/deepfuze/face_landmarker_68_5.onnx')
},
'gender_age':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gender_age.onnx',
'path': resolve_relative_path('../../../models/deepfuze/gender_age.onnx')
}
}
def get_face_analyser() -> Any:
global FACE_ANALYSER
face_detectors = {}
face_landmarkers = {}
with thread_lock():
while process_manager.is_checking():
sleep(0.5)
if FACE_ANALYSER is None:
if deepfuze.globals.face_detector_model in [ 'many', 'retinaface' ]:
face_detectors['retinaface'] = onnxruntime.InferenceSession(MODELS.get('face_detector_retinaface').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
if deepfuze.globals.face_detector_model in [ 'many', 'scrfd' ]:
face_detectors['scrfd'] = onnxruntime.InferenceSession(MODELS.get('face_detector_scrfd').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
if deepfuze.globals.face_detector_model in [ 'many', 'yoloface' ]:
face_detectors['yoloface'] = onnxruntime.InferenceSession(MODELS.get('face_detector_yoloface').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
if deepfuze.globals.face_detector_model in [ 'yunet' ]:
face_detectors['yunet'] = cv2.FaceDetectorYN.create(MODELS.get('face_detector_yunet').get('path'), '', (0, 0))
if deepfuze.globals.face_recognizer_model == 'arcface_blendswap':
face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_blendswap').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
if deepfuze.globals.face_recognizer_model == 'arcface_inswapper':
face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_inswapper').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
if deepfuze.globals.face_recognizer_model == 'arcface_simswap':
face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_simswap').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
if deepfuze.globals.face_recognizer_model == 'arcface_uniface':
face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_uniface').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
face_landmarkers['68'] = onnxruntime.InferenceSession(MODELS.get('face_landmarker_68').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
face_landmarkers['68_5'] = onnxruntime.InferenceSession(MODELS.get('face_landmarker_68_5').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
gender_age = onnxruntime.InferenceSession(MODELS.get('gender_age').get('path'), providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
FACE_ANALYSER =\
{
'face_detectors': face_detectors,
'face_recognizer': face_recognizer,
'face_landmarkers': face_landmarkers,
'gender_age': gender_age
}
return FACE_ANALYSER
def clear_face_analyser() -> Any:
global FACE_ANALYSER
FACE_ANALYSER = None
def pre_check() -> bool:
download_directory_path = resolve_relative_path('../../../models/deepfuze')
model_urls =\
[
MODELS.get('face_landmarker_68').get('url'),
MODELS.get('face_landmarker_68_5').get('url'),
MODELS.get('gender_age').get('url')
]
model_paths =\
[
MODELS.get('face_landmarker_68').get('path'),
MODELS.get('face_landmarker_68_5').get('path'),
MODELS.get('gender_age').get('path')
]
if deepfuze.globals.face_detector_model in [ 'many', 'retinaface' ]:
model_urls.append(MODELS.get('face_detector_retinaface').get('url'))
model_paths.append(MODELS.get('face_detector_retinaface').get('path'))
if deepfuze.globals.face_detector_model in [ 'many', 'scrfd' ]:
model_urls.append(MODELS.get('face_detector_scrfd').get('url'))
model_paths.append(MODELS.get('face_detector_scrfd').get('path'))
if deepfuze.globals.face_detector_model in [ 'many', 'yoloface' ]:
model_urls.append(MODELS.get('face_detector_yoloface').get('url'))
model_paths.append(MODELS.get('face_detector_yoloface').get('path'))
if deepfuze.globals.face_detector_model in [ 'yunet' ]:
model_urls.append(MODELS.get('face_detector_yunet').get('url'))
model_paths.append(MODELS.get('face_detector_yunet').get('path'))
if deepfuze.globals.face_recognizer_model == 'arcface_blendswap':
model_urls.append(MODELS.get('face_recognizer_arcface_blendswap').get('url'))
model_paths.append(MODELS.get('face_recognizer_arcface_blendswap').get('path'))
if deepfuze.globals.face_recognizer_model == 'arcface_inswapper':
model_urls.append(MODELS.get('face_recognizer_arcface_inswapper').get('url'))
model_paths.append(MODELS.get('face_recognizer_arcface_inswapper').get('path'))
if deepfuze.globals.face_recognizer_model == 'arcface_simswap':
model_urls.append(MODELS.get('face_recognizer_arcface_simswap').get('url'))
model_paths.append(MODELS.get('face_recognizer_arcface_simswap').get('path'))
if deepfuze.globals.face_recognizer_model == 'arcface_uniface':
model_urls.append(MODELS.get('face_recognizer_arcface_uniface').get('url'))
model_paths.append(MODELS.get('face_recognizer_arcface_uniface').get('path'))
if not deepfuze.globals.skip_download:
process_manager.check()
conditional_download(download_directory_path, model_urls)
process_manager.end()
return all(is_file(model_path) for model_path in model_paths)
def detect_with_retinaface(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
face_detector = get_face_analyser().get('face_detectors').get('retinaface')
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
temp_vision_frame = resize_frame_resolution(vision_frame, (face_detector_width, face_detector_height))
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
feature_strides = [ 8, 16, 32 ]
feature_map_channel = 3
anchor_total = 2
bounding_box_list = []
face_landmark_5_list = []
score_list = []
detect_vision_frame = prepare_detect_frame(temp_vision_frame, face_detector_size)
with thread_semaphore():
detections = face_detector.run(None,
{
face_detector.get_inputs()[0].name: detect_vision_frame
})
for index, feature_stride in enumerate(feature_strides):
keep_indices = numpy.where(detections[index] >= deepfuze.globals.face_detector_score)[0]
if keep_indices.any():
stride_height = face_detector_height // feature_stride
stride_width = face_detector_width // feature_stride
anchors = create_static_anchors(feature_stride, anchor_total, stride_height, stride_width)
bounding_box_raw = detections[index + feature_map_channel] * feature_stride
face_landmark_5_raw = detections[index + feature_map_channel * 2] * feature_stride
for bounding_box in distance_to_bounding_box(anchors, bounding_box_raw)[keep_indices]:
bounding_box_list.append(numpy.array(
[
bounding_box[0] * ratio_width,
bounding_box[1] * ratio_height,
bounding_box[2] * ratio_width,
bounding_box[3] * ratio_height
]))
for face_landmark_5 in distance_to_face_landmark_5(anchors, face_landmark_5_raw)[keep_indices]:
face_landmark_5_list.append(face_landmark_5 * [ ratio_width, ratio_height ])
for score in detections[index][keep_indices]:
score_list.append(score[0])
return bounding_box_list, face_landmark_5_list, score_list
def detect_with_scrfd(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
face_detector = get_face_analyser().get('face_detectors').get('scrfd')
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
temp_vision_frame = resize_frame_resolution(vision_frame, (face_detector_width, face_detector_height))
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
feature_strides = [ 8, 16, 32 ]
feature_map_channel = 3
anchor_total = 2
bounding_box_list = []
face_landmark_5_list = []
score_list = []
detect_vision_frame = prepare_detect_frame(temp_vision_frame, face_detector_size)
with thread_semaphore():
detections = face_detector.run(None,
{
face_detector.get_inputs()[0].name: detect_vision_frame
})
for index, feature_stride in enumerate(feature_strides):
keep_indices = numpy.where(detections[index] >= deepfuze.globals.face_detector_score)[0]
if keep_indices.any():
stride_height = face_detector_height // feature_stride
stride_width = face_detector_width // feature_stride
anchors = create_static_anchors(feature_stride, anchor_total, stride_height, stride_width)
bounding_box_raw = detections[index + feature_map_channel] * feature_stride
face_landmark_5_raw = detections[index + feature_map_channel * 2] * feature_stride
for bounding_box in distance_to_bounding_box(anchors, bounding_box_raw)[keep_indices]:
bounding_box_list.append(numpy.array(
[
bounding_box[0] * ratio_width,
bounding_box[1] * ratio_height,
bounding_box[2] * ratio_width,
bounding_box[3] * ratio_height
]))
for face_landmark_5 in distance_to_face_landmark_5(anchors, face_landmark_5_raw)[keep_indices]:
face_landmark_5_list.append(face_landmark_5 * [ ratio_width, ratio_height ])
for score in detections[index][keep_indices]:
score_list.append(score[0])
return bounding_box_list, face_landmark_5_list, score_list
def detect_with_yoloface(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
face_detector = get_face_analyser().get('face_detectors').get('yoloface')
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
temp_vision_frame = resize_frame_resolution(vision_frame, (face_detector_width, face_detector_height))
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
bounding_box_list = []
face_landmark_5_list = []
score_list = []
detect_vision_frame = prepare_detect_frame(temp_vision_frame, face_detector_size)
with thread_semaphore():
detections = face_detector.run(None,
{
face_detector.get_inputs()[0].name: detect_vision_frame
})
detections = numpy.squeeze(detections).T
bounding_box_raw, score_raw, face_landmark_5_raw = numpy.split(detections, [ 4, 5 ], axis = 1)
keep_indices = numpy.where(score_raw > deepfuze.globals.face_detector_score)[0]
if keep_indices.any():
bounding_box_raw, face_landmark_5_raw, score_raw = bounding_box_raw[keep_indices], face_landmark_5_raw[keep_indices], score_raw[keep_indices]
for bounding_box in bounding_box_raw:
bounding_box_list.append(numpy.array(
[
(bounding_box[0] - bounding_box[2] / 2) * ratio_width,
(bounding_box[1] - bounding_box[3] / 2) * ratio_height,
(bounding_box[0] + bounding_box[2] / 2) * ratio_width,
(bounding_box[1] + bounding_box[3] / 2) * ratio_height
]))
face_landmark_5_raw[:, 0::3] = (face_landmark_5_raw[:, 0::3]) * ratio_width
face_landmark_5_raw[:, 1::3] = (face_landmark_5_raw[:, 1::3]) * ratio_height
for face_landmark_5 in face_landmark_5_raw:
face_landmark_5_list.append(numpy.array(face_landmark_5.reshape(-1, 3)[:, :2]))
score_list = score_raw.ravel().tolist()
return bounding_box_list, face_landmark_5_list, score_list
def detect_with_yunet(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
face_detector = get_face_analyser().get('face_detectors').get('yunet')
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
temp_vision_frame = resize_frame_resolution(vision_frame, (face_detector_width, face_detector_height))
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
bounding_box_list = []
face_landmark_5_list = []
score_list = []
face_detector.setInputSize((temp_vision_frame.shape[1], temp_vision_frame.shape[0]))
face_detector.setScoreThreshold(deepfuze.globals.face_detector_score)
with thread_semaphore():
_, detections = face_detector.detect(temp_vision_frame)
if numpy.any(detections):
for detection in detections:
bounding_box_list.append(numpy.array(
[
detection[0] * ratio_width,
detection[1] * ratio_height,
(detection[0] + detection[2]) * ratio_width,
(detection[1] + detection[3]) * ratio_height
]))
face_landmark_5_list.append(detection[4:14].reshape((5, 2)) * [ ratio_width, ratio_height ])
score_list.append(detection[14])
return bounding_box_list, face_landmark_5_list, score_list
def prepare_detect_frame(temp_vision_frame : VisionFrame, face_detector_size : str) -> VisionFrame:
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
detect_vision_frame = numpy.zeros((face_detector_height, face_detector_width, 3))
detect_vision_frame[:temp_vision_frame.shape[0], :temp_vision_frame.shape[1], :] = temp_vision_frame
detect_vision_frame = (detect_vision_frame - 127.5) / 128.0
detect_vision_frame = numpy.expand_dims(detect_vision_frame.transpose(2, 0, 1), axis = 0).astype(numpy.float32)
return detect_vision_frame
def create_faces(vision_frame : VisionFrame, bounding_box_list : List[BoundingBox], face_landmark_5_list : List[FaceLandmark5], score_list : List[Score]) -> List[Face]:
faces = []
if deepfuze.globals.face_detector_score > 0:
sort_indices = numpy.argsort(-numpy.array(score_list))
bounding_box_list = [ bounding_box_list[index] for index in sort_indices ]
		face_landmark_5_list = [ face_landmark_5_list[index] for index in sort_indices ]
score_list = [ score_list[index] for index in sort_indices ]
iou_threshold = 0.1 if deepfuze.globals.face_detector_model == 'many' else 0.4
keep_indices = apply_nms(bounding_box_list, iou_threshold)
for index in keep_indices:
bounding_box = bounding_box_list[index]
face_landmark_5_68 = face_landmark_5_list[index]
face_landmark_68_5 = expand_face_landmark_68_from_5(face_landmark_5_68)
face_landmark_68 = face_landmark_68_5
face_landmark_68_score = 0.0
if deepfuze.globals.face_landmarker_score > 0:
face_landmark_68, face_landmark_68_score = detect_face_landmark_68(vision_frame, bounding_box)
if face_landmark_68_score > deepfuze.globals.face_landmarker_score:
face_landmark_5_68 = convert_face_landmark_68_to_5(face_landmark_68)
landmarks : FaceLandmarkSet =\
{
'5': face_landmark_5_list[index],
'5/68': face_landmark_5_68,
'68': face_landmark_68,
'68/5': face_landmark_68_5
}
			scores : FaceScoreSet =\
{
'detector': score_list[index],
'landmarker': face_landmark_68_score
}
embedding, normed_embedding = calc_embedding(vision_frame, landmarks.get('5/68'))
gender, age = detect_gender_age(vision_frame, bounding_box)
faces.append(Face(
bounding_box = bounding_box,
landmarks = landmarks,
scores = scores,
embedding = embedding,
normed_embedding = normed_embedding,
gender = gender,
age = age
))
return faces
def calc_embedding(temp_vision_frame : VisionFrame, face_landmark_5 : FaceLandmark5) -> Tuple[Embedding, Embedding]:
face_recognizer = get_face_analyser().get('face_recognizer')
crop_vision_frame, matrix = warp_face_by_face_landmark_5(temp_vision_frame, face_landmark_5, 'arcface_112_v2', (112, 112))
crop_vision_frame = crop_vision_frame / 127.5 - 1
crop_vision_frame = crop_vision_frame[:, :, ::-1].transpose(2, 0, 1).astype(numpy.float32)
crop_vision_frame = numpy.expand_dims(crop_vision_frame, axis = 0)
with conditional_thread_semaphore(deepfuze.globals.execution_providers):
embedding = face_recognizer.run(None,
{
face_recognizer.get_inputs()[0].name: crop_vision_frame
})[0]
embedding = embedding.ravel()
normed_embedding = embedding / numpy.linalg.norm(embedding)
return embedding, normed_embedding
def detect_face_landmark_68(temp_vision_frame : VisionFrame, bounding_box : BoundingBox) -> Tuple[FaceLandmark68, Score]:
face_landmarker = get_face_analyser().get('face_landmarkers').get('68')
scale = 195 / numpy.subtract(bounding_box[2:], bounding_box[:2]).max()
translation = (256 - numpy.add(bounding_box[2:], bounding_box[:2]) * scale) * 0.5
crop_vision_frame, affine_matrix = warp_face_by_translation(temp_vision_frame, translation, scale, (256, 256))
crop_vision_frame = cv2.cvtColor(crop_vision_frame, cv2.COLOR_RGB2Lab)
if numpy.mean(crop_vision_frame[:, :, 0]) < 30:
crop_vision_frame[:, :, 0] = cv2.createCLAHE(clipLimit = 2).apply(crop_vision_frame[:, :, 0])
crop_vision_frame = cv2.cvtColor(crop_vision_frame, cv2.COLOR_Lab2RGB)
crop_vision_frame = crop_vision_frame.transpose(2, 0, 1).astype(numpy.float32) / 255.0
with conditional_thread_semaphore(deepfuze.globals.execution_providers):
face_landmark_68, face_heatmap = face_landmarker.run(None,
{
face_landmarker.get_inputs()[0].name: [ crop_vision_frame ]
})
face_landmark_68 = face_landmark_68[:, :, :2][0] / 64
face_landmark_68 = face_landmark_68.reshape(1, -1, 2) * 256
face_landmark_68 = cv2.transform(face_landmark_68, cv2.invertAffineTransform(affine_matrix))
face_landmark_68 = face_landmark_68.reshape(-1, 2)
face_landmark_68_score = numpy.amax(face_heatmap, axis = (2, 3))
face_landmark_68_score = numpy.mean(face_landmark_68_score)
return face_landmark_68, face_landmark_68_score
def expand_face_landmark_68_from_5(face_landmark_5 : FaceLandmark5) -> FaceLandmark68:
face_landmarker = get_face_analyser().get('face_landmarkers').get('68_5')
affine_matrix = estimate_matrix_by_face_landmark_5(face_landmark_5, 'ffhq_512', (1, 1))
face_landmark_5 = cv2.transform(face_landmark_5.reshape(1, -1, 2), affine_matrix).reshape(-1, 2)
with conditional_thread_semaphore(deepfuze.globals.execution_providers):
face_landmark_68_5 = face_landmarker.run(None,
{
face_landmarker.get_inputs()[0].name: [ face_landmark_5 ]
})[0][0]
face_landmark_68_5 = cv2.transform(face_landmark_68_5.reshape(1, -1, 2), cv2.invertAffineTransform(affine_matrix)).reshape(-1, 2)
return face_landmark_68_5
def detect_gender_age(temp_vision_frame : VisionFrame, bounding_box : BoundingBox) -> Tuple[int, int]:
gender_age = get_face_analyser().get('gender_age')
bounding_box = bounding_box.reshape(2, -1)
scale = 64 / numpy.subtract(*bounding_box[::-1]).max()
translation = 48 - bounding_box.sum(axis = 0) * scale * 0.5
crop_vision_frame, affine_matrix = warp_face_by_translation(temp_vision_frame, translation, scale, (96, 96))
crop_vision_frame = crop_vision_frame[:, :, ::-1].transpose(2, 0, 1).astype(numpy.float32)
crop_vision_frame = numpy.expand_dims(crop_vision_frame, axis = 0)
with conditional_thread_semaphore(deepfuze.globals.execution_providers):
prediction = gender_age.run(None,
{
gender_age.get_inputs()[0].name: crop_vision_frame
})[0][0]
gender = int(numpy.argmax(prediction[:2]))
age = int(numpy.round(prediction[2] * 100))
return gender, age
def get_one_face(vision_frame : VisionFrame, position : int = 0) -> Optional[Face]:
many_faces = get_many_faces(vision_frame)
if many_faces:
try:
return many_faces[position]
except IndexError:
return many_faces[-1]
return None
def get_average_face(vision_frames : List[VisionFrame], position : int = 0) -> Optional[Face]:
average_face = None
faces = []
embedding_list = []
normed_embedding_list = []
for vision_frame in vision_frames:
face = get_one_face(vision_frame, position)
if face:
faces.append(face)
embedding_list.append(face.embedding)
normed_embedding_list.append(face.normed_embedding)
if faces:
first_face = get_first(faces)
average_face = Face(
bounding_box = first_face.bounding_box,
landmarks = first_face.landmarks,
scores = first_face.scores,
embedding = numpy.mean(embedding_list, axis = 0),
normed_embedding = numpy.mean(normed_embedding_list, axis = 0),
gender = first_face.gender,
age = first_face.age
)
return average_face
def get_many_faces(vision_frame : VisionFrame) -> List[Face]:
faces = []
try:
faces_cache = get_static_faces(vision_frame)
if faces_cache:
faces = faces_cache
else:
bounding_box_list = []
face_landmark_5_list = []
score_list = []
			if deepfuze.globals.face_detector_model in [ 'many', 'retinaface' ]:
bounding_box_list_retinaface, face_landmark_5_list_retinaface, score_list_retinaface = detect_with_retinaface(vision_frame, deepfuze.globals.face_detector_size)
bounding_box_list.extend(bounding_box_list_retinaface)
face_landmark_5_list.extend(face_landmark_5_list_retinaface)
score_list.extend(score_list_retinaface)
if deepfuze.globals.face_detector_model in [ 'many', 'scrfd' ]:
bounding_box_list_scrfd, face_landmark_5_list_scrfd, score_list_scrfd = detect_with_scrfd(vision_frame, deepfuze.globals.face_detector_size)
bounding_box_list.extend(bounding_box_list_scrfd)
face_landmark_5_list.extend(face_landmark_5_list_scrfd)
score_list.extend(score_list_scrfd)
if deepfuze.globals.face_detector_model in [ 'many', 'yoloface' ]:
bounding_box_list_yoloface, face_landmark_5_list_yoloface, score_list_yoloface = detect_with_yoloface(vision_frame, deepfuze.globals.face_detector_size)
bounding_box_list.extend(bounding_box_list_yoloface)
face_landmark_5_list.extend(face_landmark_5_list_yoloface)
score_list.extend(score_list_yoloface)
if deepfuze.globals.face_detector_model in [ 'yunet' ]:
bounding_box_list_yunet, face_landmark_5_list_yunet, score_list_yunet = detect_with_yunet(vision_frame, deepfuze.globals.face_detector_size)
bounding_box_list.extend(bounding_box_list_yunet)
face_landmark_5_list.extend(face_landmark_5_list_yunet)
score_list.extend(score_list_yunet)
if bounding_box_list and face_landmark_5_list and score_list:
faces = create_faces(vision_frame, bounding_box_list, face_landmark_5_list, score_list)
if faces:
set_static_faces(vision_frame, faces)
if deepfuze.globals.face_analyser_order:
faces = sort_by_order(faces, deepfuze.globals.face_analyser_order)
if deepfuze.globals.face_analyser_age:
faces = filter_by_age(faces, deepfuze.globals.face_analyser_age)
if deepfuze.globals.face_analyser_gender:
faces = filter_by_gender(faces, deepfuze.globals.face_analyser_gender)
except (AttributeError, ValueError):
pass
return faces
def find_similar_faces(reference_faces : FaceSet, vision_frame : VisionFrame, face_distance : float) -> List[Face]:
similar_faces : List[Face] = []
many_faces = get_many_faces(vision_frame)
if reference_faces:
for reference_set in reference_faces:
if not similar_faces:
for reference_face in reference_faces[reference_set]:
for face in many_faces:
if compare_faces(face, reference_face, face_distance):
similar_faces.append(face)
return similar_faces
def compare_faces(face : Face, reference_face : Face, face_distance : float) -> bool:
current_face_distance = calc_face_distance(face, reference_face)
return current_face_distance < face_distance
def calc_face_distance(face : Face, reference_face : Face) -> float:
if hasattr(face, 'normed_embedding') and hasattr(reference_face, 'normed_embedding'):
return 1 - numpy.dot(face.normed_embedding, reference_face.normed_embedding)
return 0
def sort_by_order(faces : List[Face], order : FaceAnalyserOrder) -> List[Face]:
if order == 'left-right':
return sorted(faces, key = lambda face: face.bounding_box[0])
if order == 'right-left':
return sorted(faces, key = lambda face: face.bounding_box[0], reverse = True)
if order == 'top-bottom':
return sorted(faces, key = lambda face: face.bounding_box[1])
if order == 'bottom-top':
return sorted(faces, key = lambda face: face.bounding_box[1], reverse = True)
if order == 'small-large':
return sorted(faces, key = lambda face: (face.bounding_box[2] - face.bounding_box[0]) * (face.bounding_box[3] - face.bounding_box[1]))
if order == 'large-small':
return sorted(faces, key = lambda face: (face.bounding_box[2] - face.bounding_box[0]) * (face.bounding_box[3] - face.bounding_box[1]), reverse = True)
if order == 'best-worst':
return sorted(faces, key = lambda face: face.scores.get('detector'), reverse = True)
if order == 'worst-best':
return sorted(faces, key = lambda face: face.scores.get('detector'))
return faces
def filter_by_age(faces : List[Face], age : FaceAnalyserAge) -> List[Face]:
filter_faces = []
for face in faces:
if categorize_age(face.age) == age:
filter_faces.append(face)
return filter_faces
def filter_by_gender(faces : List[Face], gender : FaceAnalyserGender) -> List[Face]:
filter_faces = []
for face in faces:
if categorize_gender(face.gender) == gender:
filter_faces.append(face)
return filter_faces
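As a quick illustration of the ordering keys in `sort_by_order`, the sketch below sorts plain stand-ins by the same bounding-box expressions; `FakeFace` is a hypothetical substitute for the deepfuze `Face` type:

```python
from collections import namedtuple

# Hypothetical stand-in for the deepfuze Face type; only bounding_box is needed here.
FakeFace = namedtuple('FakeFace', [ 'bounding_box' ])
faces = [ FakeFace((300, 0, 400, 100)), FakeFace((10, 0, 60, 50)) ]
# 'left-right' sorts by the left edge; 'large-small' sorts by box area, descending.
left_right = sorted(faces, key = lambda face: face.bounding_box[0])
large_small = sorted(faces, key = lambda face: (face.bounding_box[2] - face.bounding_box[0]) * (face.bounding_box[3] - face.bounding_box[1]), reverse = True)
print([ face.bounding_box[0] for face in left_right ])   # [10, 300]
print([ face.bounding_box[0] for face in large_small ])  # [300, 10]
```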
+169
View File
@@ -0,0 +1,169 @@
from typing import Any, Tuple, List
from cv2.typing import Size
from functools import lru_cache
import cv2
import numpy
from deepfuze.typing import BoundingBox, FaceLandmark5, FaceLandmark68, VisionFrame, Mask, Matrix, Translation, WarpTemplate, WarpTemplateSet, FaceAnalyserAge, FaceAnalyserGender
WARP_TEMPLATES : WarpTemplateSet =\
{
'arcface_112_v1': numpy.array(
[
[ 0.35473214, 0.45658929 ],
[ 0.64526786, 0.45658929 ],
[ 0.50000000, 0.61154464 ],
[ 0.37913393, 0.77687500 ],
[ 0.62086607, 0.77687500 ]
]),
'arcface_112_v2': numpy.array(
[
[ 0.34191607, 0.46157411 ],
[ 0.65653393, 0.45983393 ],
[ 0.50022500, 0.64050536 ],
[ 0.37097589, 0.82469196 ],
[ 0.63151696, 0.82325089 ]
]),
'arcface_128_v2': numpy.array(
[
[ 0.36167656, 0.40387734 ],
[ 0.63696719, 0.40235469 ],
[ 0.50019687, 0.56044219 ],
[ 0.38710391, 0.72160547 ],
[ 0.61507734, 0.72034453 ]
]),
'ffhq_512': numpy.array(
[
[ 0.37691676, 0.46864664 ],
[ 0.62285697, 0.46912813 ],
[ 0.50123859, 0.61331904 ],
[ 0.39308822, 0.72541100 ],
[ 0.61150205, 0.72490465 ]
])
}
def estimate_matrix_by_face_landmark_5(face_landmark_5 : FaceLandmark5, warp_template : WarpTemplate, crop_size : Size) -> Matrix:
normed_warp_template = WARP_TEMPLATES.get(warp_template) * crop_size
affine_matrix = cv2.estimateAffinePartial2D(face_landmark_5, normed_warp_template, method = cv2.RANSAC, ransacReprojThreshold = 100)[0]
return affine_matrix
def warp_face_by_face_landmark_5(temp_vision_frame : VisionFrame, face_landmark_5 : FaceLandmark5, warp_template : WarpTemplate, crop_size : Size) -> Tuple[VisionFrame, Matrix]:
affine_matrix = estimate_matrix_by_face_landmark_5(face_landmark_5, warp_template, crop_size)
crop_vision_frame = cv2.warpAffine(temp_vision_frame, affine_matrix, crop_size, borderMode = cv2.BORDER_REPLICATE, flags = cv2.INTER_AREA)
return crop_vision_frame, affine_matrix
def warp_face_by_bounding_box(temp_vision_frame : VisionFrame, bounding_box : BoundingBox, crop_size : Size) -> Tuple[VisionFrame, Matrix]:
source_points = numpy.array([ [ bounding_box[0], bounding_box[1] ], [bounding_box[2], bounding_box[1] ], [ bounding_box[0], bounding_box[3] ] ]).astype(numpy.float32)
target_points = numpy.array([ [ 0, 0 ], [ crop_size[0], 0 ], [ 0, crop_size[1] ] ]).astype(numpy.float32)
affine_matrix = cv2.getAffineTransform(source_points, target_points)
if bounding_box[2] - bounding_box[0] > crop_size[0] or bounding_box[3] - bounding_box[1] > crop_size[1]:
interpolation_method = cv2.INTER_AREA
else:
interpolation_method = cv2.INTER_LINEAR
crop_vision_frame = cv2.warpAffine(temp_vision_frame, affine_matrix, crop_size, flags = interpolation_method)
return crop_vision_frame, affine_matrix
def warp_face_by_translation(temp_vision_frame : VisionFrame, translation : Translation, scale : float, crop_size : Size) -> Tuple[VisionFrame, Matrix]:
affine_matrix = numpy.array([ [ scale, 0, translation[0] ], [ 0, scale, translation[1] ] ])
crop_vision_frame = cv2.warpAffine(temp_vision_frame, affine_matrix, crop_size)
return crop_vision_frame, affine_matrix
def paste_back(temp_vision_frame : VisionFrame, crop_vision_frame : VisionFrame, crop_mask : Mask, affine_matrix : Matrix) -> VisionFrame:
inverse_matrix = cv2.invertAffineTransform(affine_matrix)
temp_size = temp_vision_frame.shape[:2][::-1]
inverse_mask = cv2.warpAffine(crop_mask, inverse_matrix, temp_size).clip(0, 1)
inverse_vision_frame = cv2.warpAffine(crop_vision_frame, inverse_matrix, temp_size, borderMode = cv2.BORDER_REPLICATE)
paste_vision_frame = temp_vision_frame.copy()
paste_vision_frame[:, :, 0] = inverse_mask * inverse_vision_frame[:, :, 0] + (1 - inverse_mask) * temp_vision_frame[:, :, 0]
paste_vision_frame[:, :, 1] = inverse_mask * inverse_vision_frame[:, :, 1] + (1 - inverse_mask) * temp_vision_frame[:, :, 1]
paste_vision_frame[:, :, 2] = inverse_mask * inverse_vision_frame[:, :, 2] + (1 - inverse_mask) * temp_vision_frame[:, :, 2]
return paste_vision_frame
@lru_cache(maxsize = None)
def create_static_anchors(feature_stride : int, anchor_total : int, stride_height : int, stride_width : int) -> numpy.ndarray[Any, Any]:
y, x = numpy.mgrid[:stride_height, :stride_width][::-1]
anchors = numpy.stack((y, x), axis = -1)
anchors = (anchors * feature_stride).reshape((-1, 2))
anchors = numpy.stack([ anchors ] * anchor_total, axis = 1).reshape((-1, 2))
return anchors
def create_bounding_box_from_face_landmark_68(face_landmark_68 : FaceLandmark68) -> BoundingBox:
min_x, min_y = numpy.min(face_landmark_68, axis = 0)
max_x, max_y = numpy.max(face_landmark_68, axis = 0)
bounding_box = numpy.array([ min_x, min_y, max_x, max_y ]).astype(numpy.int16)
return bounding_box
def distance_to_bounding_box(points : numpy.ndarray[Any, Any], distance : numpy.ndarray[Any, Any]) -> BoundingBox:
x1 = points[:, 0] - distance[:, 0]
y1 = points[:, 1] - distance[:, 1]
x2 = points[:, 0] + distance[:, 2]
y2 = points[:, 1] + distance[:, 3]
bounding_box = numpy.column_stack([ x1, y1, x2, y2 ])
return bounding_box
def distance_to_face_landmark_5(points : numpy.ndarray[Any, Any], distance : numpy.ndarray[Any, Any]) -> FaceLandmark5:
x = points[:, 0::2] + distance[:, 0::2]
y = points[:, 1::2] + distance[:, 1::2]
face_landmark_5 = numpy.stack((x, y), axis = -1)
return face_landmark_5
def convert_face_landmark_68_to_5(face_landmark_68 : FaceLandmark68) -> FaceLandmark5:
face_landmark_5 = numpy.array(
[
numpy.mean(face_landmark_68[36:42], axis = 0),
numpy.mean(face_landmark_68[42:48], axis = 0),
face_landmark_68[30],
face_landmark_68[48],
face_landmark_68[54]
])
return face_landmark_5
def apply_nms(bounding_box_list : List[BoundingBox], iou_threshold : float) -> List[int]:
keep_indices = []
dimension_list = numpy.reshape(bounding_box_list, (-1, 4))
x1 = dimension_list[:, 0]
y1 = dimension_list[:, 1]
x2 = dimension_list[:, 2]
y2 = dimension_list[:, 3]
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
indices = numpy.arange(len(bounding_box_list))
while indices.size > 0:
index = indices[0]
remain_indices = indices[1:]
keep_indices.append(index)
xx1 = numpy.maximum(x1[index], x1[remain_indices])
yy1 = numpy.maximum(y1[index], y1[remain_indices])
xx2 = numpy.minimum(x2[index], x2[remain_indices])
yy2 = numpy.minimum(y2[index], y2[remain_indices])
width = numpy.maximum(0, xx2 - xx1 + 1)
height = numpy.maximum(0, yy2 - yy1 + 1)
iou = width * height / (areas[index] + areas[remain_indices] - width * height)
indices = indices[numpy.where(iou <= iou_threshold)[0] + 1]
return keep_indices
def categorize_age(age : int) -> FaceAnalyserAge:
if age < 13:
return 'child'
elif age < 19:
return 'teen'
elif age < 60:
return 'adult'
return 'senior'
def categorize_gender(gender : int) -> FaceAnalyserGender:
if gender == 0:
return 'female'
return 'male'
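To make the NMS thresholds concrete, here is a worked IoU computation using the same formula as `apply_nms`; the two boxes are hypothetical:

```python
import numpy

# Two hypothetical boxes in (x1, y1, x2, y2) form, overlapping by half their width.
box_a = numpy.array([ 0, 0, 100, 100 ])
box_b = numpy.array([ 50, 0, 150, 100 ])
xx1, yy1 = numpy.maximum(box_a[:2], box_b[:2])
xx2, yy2 = numpy.minimum(box_a[2:], box_b[2:])
width = numpy.maximum(0, xx2 - xx1 + 1)
height = numpy.maximum(0, yy2 - yy1 + 1)
area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
iou = width * height / (area_a + area_b - width * height)
print(round(float(iou), 3))  # 0.338: suppressed at the 'many' threshold (0.1), kept at 0.4
```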
+155
View File
@@ -0,0 +1,155 @@
from typing import Any, Dict, List
from cv2.typing import Size
from functools import lru_cache
from time import sleep
import cv2
import numpy
import onnxruntime
import deepfuze.globals
from deepfuze import process_manager
from deepfuze.thread_helper import thread_lock, conditional_thread_semaphore
from deepfuze.typing import FaceLandmark68, VisionFrame, Mask, Padding, FaceMaskRegion, ModelSet
from deepfuze.execution import apply_execution_provider_options
from deepfuze.filesystem import resolve_relative_path, is_file
from deepfuze.download import conditional_download
FACE_OCCLUDER = None
FACE_PARSER = None
MODELS : ModelSet =\
{
'face_occluder':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/face_occluder.onnx',
'path': resolve_relative_path('../../../models/deepfuze/face_occluder.onnx')
},
'face_parser':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/face_parser.onnx',
'path': resolve_relative_path('../../../models/deepfuze/face_parser.onnx')
}
}
FACE_MASK_REGIONS : Dict[FaceMaskRegion, int] =\
{
'skin': 1,
'left-eyebrow': 2,
'right-eyebrow': 3,
'left-eye': 4,
'right-eye': 5,
'glasses': 6,
'nose': 10,
'mouth': 11,
'upper-lip': 12,
'lower-lip': 13
}
def get_face_occluder() -> Any:
global FACE_OCCLUDER
with thread_lock():
while process_manager.is_checking():
sleep(0.5)
if FACE_OCCLUDER is None:
model_path = MODELS.get('face_occluder').get('path')
FACE_OCCLUDER = onnxruntime.InferenceSession(model_path, providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
return FACE_OCCLUDER
def get_face_parser() -> Any:
global FACE_PARSER
with thread_lock():
while process_manager.is_checking():
sleep(0.5)
if FACE_PARSER is None:
model_path = MODELS.get('face_parser').get('path')
FACE_PARSER = onnxruntime.InferenceSession(model_path, providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
return FACE_PARSER
def clear_face_occluder() -> None:
global FACE_OCCLUDER
FACE_OCCLUDER = None
def clear_face_parser() -> None:
global FACE_PARSER
FACE_PARSER = None
def pre_check() -> bool:
download_directory_path = resolve_relative_path('../../../models/deepfuze')
model_urls =\
[
MODELS.get('face_occluder').get('url'),
MODELS.get('face_parser').get('url')
]
model_paths =\
[
MODELS.get('face_occluder').get('path'),
MODELS.get('face_parser').get('path')
]
if not deepfuze.globals.skip_download:
process_manager.check()
conditional_download(download_directory_path, model_urls)
process_manager.end()
return all(is_file(model_path) for model_path in model_paths)
@lru_cache(maxsize = None)
def create_static_box_mask(crop_size : Size, face_mask_blur : float, face_mask_padding : Padding) -> Mask:
blur_amount = int(crop_size[0] * 0.5 * face_mask_blur)
blur_area = max(blur_amount // 2, 1)
box_mask : Mask = numpy.ones(crop_size, numpy.float32)
box_mask[:max(blur_area, int(crop_size[1] * face_mask_padding[0] / 100)), :] = 0
box_mask[-max(blur_area, int(crop_size[1] * face_mask_padding[2] / 100)):, :] = 0
box_mask[:, :max(blur_area, int(crop_size[0] * face_mask_padding[3] / 100))] = 0
box_mask[:, -max(blur_area, int(crop_size[0] * face_mask_padding[1] / 100)):] = 0
if blur_amount > 0:
box_mask = cv2.GaussianBlur(box_mask, (0, 0), blur_amount * 0.25)
return box_mask
def create_occlusion_mask(crop_vision_frame : VisionFrame) -> Mask:
face_occluder = get_face_occluder()
prepare_vision_frame = cv2.resize(crop_vision_frame, face_occluder.get_inputs()[0].shape[1:3][::-1])
prepare_vision_frame = numpy.expand_dims(prepare_vision_frame, axis = 0).astype(numpy.float32) / 255
prepare_vision_frame = prepare_vision_frame.transpose(0, 1, 2, 3)
with conditional_thread_semaphore(deepfuze.globals.execution_providers):
occlusion_mask : Mask = face_occluder.run(None,
{
face_occluder.get_inputs()[0].name: prepare_vision_frame
})[0][0]
occlusion_mask = occlusion_mask.transpose(0, 1, 2).clip(0, 1).astype(numpy.float32)
occlusion_mask = cv2.resize(occlusion_mask, crop_vision_frame.shape[:2][::-1])
occlusion_mask = (cv2.GaussianBlur(occlusion_mask.clip(0, 1), (0, 0), 5).clip(0.5, 1) - 0.5) * 2
return occlusion_mask
def create_region_mask(crop_vision_frame : VisionFrame, face_mask_regions : List[FaceMaskRegion]) -> Mask:
face_parser = get_face_parser()
prepare_vision_frame = cv2.flip(cv2.resize(crop_vision_frame, (512, 512)), 1)
prepare_vision_frame = numpy.expand_dims(prepare_vision_frame, axis = 0).astype(numpy.float32)[:, :, ::-1] / 127.5 - 1
prepare_vision_frame = prepare_vision_frame.transpose(0, 3, 1, 2)
with conditional_thread_semaphore(deepfuze.globals.execution_providers):
region_mask : Mask = face_parser.run(None,
{
face_parser.get_inputs()[0].name: prepare_vision_frame
})[0][0]
region_mask = numpy.isin(region_mask.argmax(0), [ FACE_MASK_REGIONS[region] for region in face_mask_regions ])
region_mask = cv2.resize(region_mask.astype(numpy.float32), crop_vision_frame.shape[:2][::-1])
region_mask = (cv2.GaussianBlur(region_mask.clip(0, 1), (0, 0), 5).clip(0.5, 1) - 0.5) * 2
return region_mask
def create_mouth_mask(face_landmark_68 : FaceLandmark68) -> Mask:
convex_hull = cv2.convexHull(face_landmark_68[numpy.r_[3:14, 31:36]].astype(numpy.int32))
mouth_mask : Mask = numpy.zeros((512, 512)).astype(numpy.float32)
mouth_mask = cv2.fillConvexPoly(mouth_mask, convex_hull, 1.0)
mouth_mask = cv2.erode(mouth_mask.clip(0, 1), numpy.ones((21, 3)))
mouth_mask = cv2.GaussianBlur(mouth_mask, (0, 0), sigmaX = 1, sigmaY = 15)
return mouth_mask
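The percent-based padding in `create_static_box_mask` follows the CSS order (top, right, bottom, left), and the zeroed border is never thinner than the blur falloff area. A standalone sketch of the same steps on a hypothetical 100x100 crop:

```python
import cv2
import numpy

crop_size, face_mask_blur, padding = (100, 100), 0.3, (10, 10, 10, 10)
blur_amount = int(crop_size[0] * 0.5 * face_mask_blur)  # 15
blur_area = max(blur_amount // 2, 1)                    # 7
box_mask = numpy.ones(crop_size, numpy.float32)
box_mask[:max(blur_area, 10), :] = 0    # top: 10% of 100 rows
box_mask[-max(blur_area, 10):, :] = 0   # bottom
box_mask[:, :max(blur_area, 10)] = 0    # left
box_mask[:, -max(blur_area, 10):] = 0   # right
box_mask = cv2.GaussianBlur(box_mask, (0, 0), blur_amount * 0.25)
print(box_mask.shape, round(float(box_mask[50, 50]), 2))  # (100, 100) 1.0 at the centre
```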
+48
View File
@@ -0,0 +1,48 @@
from typing import Optional, List
import hashlib
import numpy
from deepfuze.typing import VisionFrame, Face, FaceStore, FaceSet
FACE_STORE : FaceStore =\
{
'static_faces': {},
'reference_faces': {}
}
def get_static_faces(vision_frame : VisionFrame) -> Optional[List[Face]]:
frame_hash = create_frame_hash(vision_frame)
if frame_hash in FACE_STORE['static_faces']:
return FACE_STORE['static_faces'][frame_hash]
return None
def set_static_faces(vision_frame : VisionFrame, faces : List[Face]) -> None:
frame_hash = create_frame_hash(vision_frame)
if frame_hash:
FACE_STORE['static_faces'][frame_hash] = faces
def clear_static_faces() -> None:
FACE_STORE['static_faces'] = {}
def create_frame_hash(vision_frame : VisionFrame) -> Optional[str]:
return hashlib.sha1(vision_frame.tobytes()).hexdigest() if numpy.any(vision_frame) else None
def get_reference_faces() -> Optional[FaceSet]:
if FACE_STORE['reference_faces']:
return FACE_STORE['reference_faces']
return None
def append_reference_face(name : str, face : Face) -> None:
if name not in FACE_STORE['reference_faces']:
FACE_STORE['reference_faces'][name] = []
FACE_STORE['reference_faces'][name].append(face)
def clear_reference_faces() -> None:
FACE_STORE['reference_faces'] = {}
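The cache key in `create_frame_hash` is a SHA-1 over the raw pixel buffer, so byte-identical frames share one entry, while an all-zero frame yields no key at all; a standalone sketch:

```python
import hashlib
import numpy

frame_a = numpy.full((4, 4, 3), 1, numpy.uint8)
frame_b = frame_a.copy()
print(hashlib.sha1(frame_a.tobytes()).hexdigest() == hashlib.sha1(frame_b.tobytes()).hexdigest())  # True
print(bool(numpy.any(numpy.zeros((4, 4, 3)))))  # False: an all-zero frame is never cached
```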
+146
View File
@@ -0,0 +1,146 @@
from typing import List, Optional
import os
import subprocess
import filetype
import deepfuze.globals
from deepfuze import logger, process_manager
from deepfuze.typing import OutputVideoPreset, Fps, AudioBuffer
from deepfuze.filesystem import get_temp_frames_pattern, get_temp_file_path
from deepfuze.vision import restrict_video_fps
def run_ffmpeg(args : List[str]) -> bool:
commands = [ 'ffmpeg', '-hide_banner', '-loglevel', 'error' ]
commands.extend(args)
process = subprocess.Popen(commands, stderr = subprocess.PIPE, stdout = subprocess.PIPE)
while process_manager.is_processing():
try:
if deepfuze.globals.log_level == 'debug':
log_debug(process)
return process.wait(timeout = 0.5) == 0
except subprocess.TimeoutExpired:
continue
return process.returncode == 0
def open_ffmpeg(args : List[str]) -> subprocess.Popen[bytes]:
commands = [ 'ffmpeg', '-hide_banner', '-loglevel', 'quiet' ]
commands.extend(args)
return subprocess.Popen(commands, stdin = subprocess.PIPE, stdout = subprocess.PIPE)
def log_debug(process : subprocess.Popen[bytes]) -> None:
_, stderr = process.communicate()
errors = stderr.decode().split(os.linesep)
for error in errors:
if error.strip():
logger.debug(error.strip(), __name__.upper())
def extract_frames(target_path : str, temp_video_resolution : str, temp_video_fps : Fps) -> bool:
trim_frame_start = deepfuze.globals.trim_frame_start
trim_frame_end = deepfuze.globals.trim_frame_end
temp_frames_pattern = get_temp_frames_pattern(target_path, '%04d')
commands = [ '-i', target_path, '-s', str(temp_video_resolution), '-q:v', '0' ]
if trim_frame_start is not None and trim_frame_end is not None:
commands.extend([ '-vf', 'trim=start_frame=' + str(trim_frame_start) + ':end_frame=' + str(trim_frame_end) + ',fps=' + str(temp_video_fps) ])
elif trim_frame_start is not None:
commands.extend([ '-vf', 'trim=start_frame=' + str(trim_frame_start) + ',fps=' + str(temp_video_fps) ])
elif trim_frame_end is not None:
commands.extend([ '-vf', 'trim=end_frame=' + str(trim_frame_end) + ',fps=' + str(temp_video_fps) ])
else:
commands.extend([ '-vf', 'fps=' + str(temp_video_fps) ])
commands.extend([ '-vsync', '0', temp_frames_pattern ])
return run_ffmpeg(commands)
def merge_video(target_path : str, output_video_resolution : str, output_video_fps : Fps) -> bool:
temp_video_fps = restrict_video_fps(target_path, output_video_fps)
temp_file_path = get_temp_file_path(target_path)
temp_frames_pattern = get_temp_frames_pattern(target_path, '%04d')
commands = [ '-r', str(temp_video_fps), '-i', temp_frames_pattern, '-s', str(output_video_resolution), '-c:v', deepfuze.globals.output_video_encoder ]
if deepfuze.globals.output_video_encoder in [ 'libx264', 'libx265' ]:
output_video_compression = round(51 - (deepfuze.globals.output_video_quality * 0.51))
commands.extend([ '-crf', str(output_video_compression), '-preset', deepfuze.globals.output_video_preset ])
if deepfuze.globals.output_video_encoder in [ 'libvpx-vp9' ]:
output_video_compression = round(63 - (deepfuze.globals.output_video_quality * 0.63))
commands.extend([ '-crf', str(output_video_compression) ])
if deepfuze.globals.output_video_encoder in [ 'h264_nvenc', 'hevc_nvenc' ]:
output_video_compression = round(51 - (deepfuze.globals.output_video_quality * 0.51))
commands.extend([ '-cq', str(output_video_compression), '-preset', map_nvenc_preset(deepfuze.globals.output_video_preset) ])
if deepfuze.globals.output_video_encoder in [ 'h264_amf', 'hevc_amf' ]:
output_video_compression = round(51 - (deepfuze.globals.output_video_quality * 0.51))
commands.extend([ '-qp_i', str(output_video_compression), '-qp_p', str(output_video_compression), '-quality', map_amf_preset(deepfuze.globals.output_video_preset) ])
commands.extend([ '-vf', 'framerate=fps=' + str(output_video_fps), '-pix_fmt', 'yuv420p', '-colorspace', 'bt709', '-y', temp_file_path ])
return run_ffmpeg(commands)
def copy_image(target_path : str, temp_image_resolution : str) -> bool:
temp_file_path = get_temp_file_path(target_path)
is_webp = filetype.guess_mime(target_path) == 'image/webp'
temp_image_compression = 100 if is_webp else 0
commands = [ '-i', target_path, '-s', str(temp_image_resolution), '-q:v', str(temp_image_compression), '-y', temp_file_path ]
return run_ffmpeg(commands)
def finalize_image(target_path : str, output_path : str, output_image_resolution : str) -> bool:
temp_file_path = get_temp_file_path(target_path)
output_image_compression = round(31 - (deepfuze.globals.output_image_quality * 0.31))
commands = [ '-i', temp_file_path, '-s', str(output_image_resolution), '-q:v', str(output_image_compression), '-y', output_path ]
return run_ffmpeg(commands)
def read_audio_buffer(target_path : str, sample_rate : int, channel_total : int) -> Optional[AudioBuffer]:
	commands = [ '-i', target_path, '-vn', '-f', 's16le', '-acodec', 'pcm_s16le', '-ar', str(sample_rate), '-ac', str(channel_total), '-' ]
process = open_ffmpeg(commands)
audio_buffer, _ = process.communicate()
if process.returncode == 0:
return audio_buffer
return None
def restore_audio(target_path : str, output_path : str, output_video_fps : Fps) -> bool:
trim_frame_start = deepfuze.globals.trim_frame_start
trim_frame_end = deepfuze.globals.trim_frame_end
temp_file_path = get_temp_file_path(target_path)
commands = [ '-i', temp_file_path ]
if trim_frame_start is not None:
start_time = trim_frame_start / output_video_fps
commands.extend([ '-ss', str(start_time) ])
if trim_frame_end is not None:
end_time = trim_frame_end / output_video_fps
commands.extend([ '-to', str(end_time) ])
commands.extend([ '-i', target_path, '-c', 'copy', '-map', '0:v:0', '-map', '1:a:0', '-shortest', '-y', output_path ])
return run_ffmpeg(commands)
def replace_audio(target_path : str, audio_path : str, output_path : str) -> bool:
temp_file_path = get_temp_file_path(target_path)
commands = [ '-i', temp_file_path, '-i', audio_path, '-af', 'apad', '-shortest', '-y', output_path ]
return run_ffmpeg(commands)
def map_nvenc_preset(output_video_preset : OutputVideoPreset) -> Optional[str]:
if output_video_preset in [ 'ultrafast', 'superfast', 'veryfast', 'faster', 'fast' ]:
return 'fast'
if output_video_preset == 'medium':
return 'medium'
if output_video_preset in [ 'slow', 'slower', 'veryslow' ]:
return 'slow'
return None
def map_amf_preset(output_video_preset : OutputVideoPreset) -> Optional[str]:
if output_video_preset in [ 'ultrafast', 'superfast', 'veryfast' ]:
return 'speed'
if output_video_preset in [ 'faster', 'fast', 'medium' ]:
return 'balanced'
if output_video_preset in [ 'slow', 'slower', 'veryslow' ]:
return 'quality'
return None
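For reference, the quality-to-compression mapping in `merge_video` is linear in `output_video_quality`; with a quality of 80 the worked values are:

```python
output_video_quality = 80
print(round(51 - (output_video_quality * 0.51)))  # 10: -crf for libx264/libx265, -cq for NVENC, -qp for AMF
print(round(63 - (output_video_quality * 0.63)))  # 13: -crf for libvpx-vp9
```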
+135
View File
@@ -0,0 +1,135 @@
from typing import List, Optional
import glob
import os
import shutil
import tempfile
import filetype
from pathlib import Path
import deepfuze.globals
from deepfuze.common_helper import is_windows
if is_windows():
import ctypes
def get_temp_frame_paths(target_path : str) -> List[str]:
temp_frames_pattern = get_temp_frames_pattern(target_path, '*')
return sorted(glob.glob(temp_frames_pattern))
def get_temp_frames_pattern(target_path : str, temp_frame_prefix : str) -> str:
temp_directory_path = get_temp_directory_path(target_path)
return os.path.join(temp_directory_path, temp_frame_prefix + '.' + deepfuze.globals.temp_frame_format)
def get_temp_file_path(target_path : str) -> str:
_, target_extension = os.path.splitext(os.path.basename(target_path))
temp_directory_path = get_temp_directory_path(target_path)
return os.path.join(temp_directory_path, 'temp' + target_extension)
def get_temp_directory_path(target_path : str) -> str:
target_name, _ = os.path.splitext(os.path.basename(target_path))
temp_directory_path = os.path.join(tempfile.gettempdir(), 'facefusion')
return os.path.join(temp_directory_path, target_name)
def create_temp(target_path : str) -> None:
temp_directory_path = get_temp_directory_path(target_path)
Path(temp_directory_path).mkdir(parents = True, exist_ok = True)
def move_temp(target_path : str, output_path : str) -> None:
temp_file_path = get_temp_file_path(target_path)
if is_file(temp_file_path):
if is_file(output_path):
os.remove(output_path)
shutil.move(temp_file_path, output_path)
def clear_temp(target_path : str) -> None:
temp_directory_path = get_temp_directory_path(target_path)
parent_directory_path = os.path.dirname(temp_directory_path)
if not deepfuze.globals.keep_temp and is_directory(temp_directory_path):
shutil.rmtree(temp_directory_path, ignore_errors = True)
if os.path.exists(parent_directory_path) and not os.listdir(parent_directory_path):
os.rmdir(parent_directory_path)
def get_file_size(file_path : str) -> int:
if is_file(file_path):
return os.path.getsize(file_path)
return 0
def is_file(file_path : str) -> bool:
return bool(file_path and os.path.isfile(file_path))
def is_directory(directory_path : str) -> bool:
return bool(directory_path and os.path.isdir(directory_path))
def is_audio(audio_path : str) -> bool:
return is_file(audio_path) and filetype.helpers.is_audio(audio_path)
def has_audio(audio_paths : List[str]) -> bool:
if audio_paths:
return any(is_audio(audio_path) for audio_path in audio_paths)
return False
def is_image(image_path : str) -> bool:
return is_file(image_path) and filetype.helpers.is_image(image_path)
def has_image(image_paths : List[str]) -> bool:
if image_paths:
return any(is_image(image_path) for image_path in image_paths)
return False
def is_video(video_path : str) -> bool:
return is_file(video_path) and filetype.helpers.is_video(video_path)
def filter_audio_paths(paths : List[str]) -> List[str]:
if paths:
return [ path for path in paths if is_audio(path) ]
return []
def filter_image_paths(paths : List[str]) -> List[str]:
if paths:
return [ path for path in paths if is_image(path) ]
return []
def resolve_relative_path(path : str) -> str:
return os.path.abspath(os.path.join(os.path.dirname(__file__), path))
def list_directory(directory_path : str) -> Optional[List[str]]:
if is_directory(directory_path):
files = os.listdir(directory_path)
files = [ Path(file).stem for file in files if not Path(file).stem.startswith(('.', '__')) ]
return sorted(files)
return None
def sanitize_path_for_windows(full_path : str) -> Optional[str]:
buffer_size = 0
while True:
unicode_buffer = ctypes.create_unicode_buffer(buffer_size)
buffer_threshold = ctypes.windll.kernel32.GetShortPathNameW(full_path, unicode_buffer, buffer_size) #type:ignore[attr-defined]
if buffer_size > buffer_threshold:
return unicode_buffer.value
if buffer_threshold == 0:
return None
buffer_size = buffer_threshold
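Putting the path helpers together, extracted frames land in a per-target folder inside the system temp directory; this standalone sketch assumes `temp_frame_format` is `'png'` and uses a hypothetical target path:

```python
import os
import tempfile

target_path = '/videos/demo.mp4'  # hypothetical input
target_name, _ = os.path.splitext(os.path.basename(target_path))
temp_directory_path = os.path.join(tempfile.gettempdir(), 'facefusion', target_name)
print(os.path.join(temp_directory_path, '%04d' + '.png'))
# e.g. /tmp/facefusion/demo/%04d.png on Linux
```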
+60
View File
@@ -0,0 +1,60 @@
from typing import List, Optional
from deepfuze.typing import LogLevel, VideoMemoryStrategy, FaceSelectorMode, FaceAnalyserOrder, FaceAnalyserAge, FaceAnalyserGender, FaceMaskType, FaceMaskRegion, OutputVideoEncoder, OutputVideoPreset, FaceDetectorModel, FaceRecognizerModel, TempFrameFormat, Padding
# general
config_path : Optional[str] = None
source_paths : Optional[List[str]] = None
target_path : Optional[str] = None
output_path : Optional[str] = None
# misc
force_download : Optional[bool] = None
skip_download : Optional[bool] = None
headless : Optional[bool] = None
log_level : Optional[LogLevel] = None
# execution
execution_device_id : Optional[str] = None
execution_providers : List[str] = []
execution_thread_count : Optional[int] = None
execution_queue_count : Optional[int] = None
# memory
video_memory_strategy : Optional[VideoMemoryStrategy] = None
system_memory_limit : Optional[int] = None
# face analyser
face_analyser_order : Optional[FaceAnalyserOrder] = None
face_analyser_age : Optional[FaceAnalyserAge] = None
face_analyser_gender : Optional[FaceAnalyserGender] = None
face_detector_model : Optional[FaceDetectorModel] = None
face_detector_size : Optional[str] = None
face_detector_score : Optional[float] = None
face_landmarker_score : Optional[float] = None
face_recognizer_model : Optional[FaceRecognizerModel] = None
# face selector
face_selector_mode : Optional[FaceSelectorMode] = None
reference_face_position : Optional[int] = None
reference_face_distance : Optional[float] = None
reference_frame_number : Optional[int] = None
# face mask
face_mask_types : Optional[List[FaceMaskType]] = None
face_mask_blur : Optional[float] = None
face_mask_padding : Optional[Padding] = None
face_mask_regions : Optional[List[FaceMaskRegion]] = None
# frame extraction
trim_frame_start : Optional[int] = None
trim_frame_end : Optional[int] = None
temp_frame_format : Optional[TempFrameFormat] = None
keep_temp : Optional[bool] = None
# output creation
output_image_quality : Optional[int] = None
output_image_resolution : Optional[str] = None
output_video_encoder : Optional[OutputVideoEncoder] = None
output_video_preset : Optional[OutputVideoPreset] = None
output_video_quality : Optional[int] = None
output_video_resolution : Optional[str] = None
output_video_fps : Optional[float] = None
skip_audio : Optional[bool] = None
# frame processors
frame_processors : List[str] = []
# uis
open_browser : Optional[bool] = None
ui_layouts : List[str] = []
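The modules above read these globals at call time rather than taking parameters, so callers configure them up front; a minimal sketch, assuming the deepfuze package is importable:

```python
import deepfuze.globals

# Minimal configuration before calling the face analyser.
deepfuze.globals.face_detector_model = 'yoloface'
deepfuze.globals.face_detector_size = '640x640'
deepfuze.globals.face_detector_score = 0.5
deepfuze.globals.face_landmarker_score = 0.5
deepfuze.globals.face_recognizer_model = 'arcface_inswapper'
deepfuze.globals.execution_providers = [ 'CPUExecutionProvider' ]
```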
+77
View File
@@ -0,0 +1,77 @@
from typing import Dict, Tuple
import sys
import os
import tempfile
import subprocess
import inquirer
from argparse import ArgumentParser, HelpFormatter
from deepfuze import metadata, wording
from deepfuze.common_helper import is_linux, is_macos, is_windows
if is_macos():
os.environ['SYSTEM_VERSION_COMPAT'] = '0'
ONNXRUNTIMES : Dict[str, Tuple[str, str]] = {}
if is_macos():
ONNXRUNTIMES['default'] = ('onnxruntime', '1.17.3')
else:
ONNXRUNTIMES['default'] = ('onnxruntime', '1.17.3')
ONNXRUNTIMES['cuda-12.2'] = ('onnxruntime-gpu', '1.17.1')
ONNXRUNTIMES['cuda-11.8'] = ('onnxruntime-gpu', '1.17.1')
ONNXRUNTIMES['openvino'] = ('onnxruntime-openvino', '1.15.0')
if is_linux():
ONNXRUNTIMES['rocm-5.4.2'] = ('onnxruntime-rocm', '1.16.3')
ONNXRUNTIMES['rocm-5.6'] = ('onnxruntime-rocm', '1.16.3')
if is_windows():
ONNXRUNTIMES['directml'] = ('onnxruntime-directml', '1.17.3')
def cli() -> None:
program = ArgumentParser(formatter_class = lambda prog: HelpFormatter(prog, max_help_position = 200))
program.add_argument('--onnxruntime', help = wording.get('help.install_dependency').format(dependency = 'onnxruntime'), choices = ONNXRUNTIMES.keys())
program.add_argument('--skip-conda', help = wording.get('help.skip_conda'), action = 'store_true')
program.add_argument('-v', '--version', version = metadata.get('name') + ' ' + metadata.get('version'), action = 'version')
run(program)
def run(program : ArgumentParser) -> None:
args = program.parse_args()
python_id = 'cp' + str(sys.version_info.major) + str(sys.version_info.minor)
if not args.skip_conda and 'CONDA_PREFIX' not in os.environ:
sys.stdout.write(wording.get('conda_not_activated') + os.linesep)
sys.exit(1)
if args.onnxruntime:
answers =\
{
'onnxruntime': args.onnxruntime
}
else:
answers = inquirer.prompt(
[
inquirer.List('onnxruntime', message = wording.get('help.install_dependency').format(dependency = 'onnxruntime'), choices = list(ONNXRUNTIMES.keys()))
])
if answers:
onnxruntime = answers['onnxruntime']
onnxruntime_name, onnxruntime_version = ONNXRUNTIMES[onnxruntime]
subprocess.call([ 'pip', 'install', '-r', 'requirements.txt', '--force-reinstall' ])
if onnxruntime == 'rocm-5.4.2' or onnxruntime == 'rocm-5.6':
if python_id in [ 'cp39', 'cp310', 'cp311' ]:
rocm_version = onnxruntime.replace('-', '')
rocm_version = rocm_version.replace('.', '')
wheel_name = 'onnxruntime_training-' + onnxruntime_version + '+' + rocm_version + '-' + python_id + '-' + python_id + '-manylinux_2_17_x86_64.manylinux2014_x86_64.whl'
wheel_path = os.path.join(tempfile.gettempdir(), wheel_name)
wheel_url = 'https://download.onnxruntime.ai/' + wheel_name
subprocess.call([ 'curl', '--silent', '--location', '--continue-at', '-', '--output', wheel_path, wheel_url ])
				subprocess.call([ 'pip', 'uninstall', 'onnxruntime', onnxruntime_name, '-y', '-q' ])
subprocess.call([ 'pip', 'install', wheel_path, '--force-reinstall' ])
os.remove(wheel_path)
else:
subprocess.call([ 'pip', 'uninstall', 'onnxruntime', onnxruntime_name, '-y', '-q' ])
if onnxruntime == 'cuda-12.2':
subprocess.call([ 'pip', 'install', onnxruntime_name + '==' + onnxruntime_version, '--extra-index-url', 'https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/onnxruntime-cuda-12/pypi/simple', '--force-reinstall' ])
else:
subprocess.call([ 'pip', 'install', onnxruntime_name + '==' + onnxruntime_version, '--force-reinstall' ])
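To show what the ROCm branch actually downloads, here is the wheel-name construction for a hypothetical Python 3.10 environment:

```python
onnxruntime, onnxruntime_version, python_id = 'rocm-5.6', '1.16.3', 'cp310'
rocm_version = onnxruntime.replace('-', '').replace('.', '')
wheel_name = 'onnxruntime_training-' + onnxruntime_version + '+' + rocm_version + '-' + python_id + '-' + python_id + '-manylinux_2_17_x86_64.manylinux2014_x86_64.whl'
print(wheel_name)
# onnxruntime_training-1.16.3+rocm56-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
```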
+47
View File
@@ -0,0 +1,47 @@
from typing import Dict
from logging import basicConfig, getLogger, Logger, DEBUG, INFO, WARNING, ERROR
from deepfuze.typing import LogLevel
def init(log_level : LogLevel) -> None:
basicConfig(format = None)
get_package_logger().setLevel(get_log_levels()[log_level])
def get_package_logger() -> Logger:
return getLogger('facefusion')
def debug(message : str, scope : str) -> None:
get_package_logger().debug('[' + scope + '] ' + message)
def info(message : str, scope : str) -> None:
get_package_logger().info('[' + scope + '] ' + message)
def warn(message : str, scope : str) -> None:
get_package_logger().warning('[' + scope + '] ' + message)
def error(message : str, scope : str) -> None:
get_package_logger().error('[' + scope + '] ' + message)
def enable() -> None:
get_package_logger().disabled = False
def disable() -> None:
get_package_logger().disabled = True
def get_log_levels() -> Dict[LogLevel, int]:
return\
{
'error': ERROR,
'warn': WARNING,
'info': INFO,
'debug': DEBUG
}
+21
View File
@@ -0,0 +1,21 @@
from deepfuze.common_helper import is_macos, is_windows
if is_windows():
import ctypes
else:
import resource
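# Cap process memory: SetProcessWorkingSetSize on Windows, setrlimit(RLIMIT_DATA) elsewhere; returns False on failure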
def limit_system_memory(system_memory_limit : int = 1) -> bool:
if is_macos():
system_memory_limit = system_memory_limit * (1024 ** 6)
else:
system_memory_limit = system_memory_limit * (1024 ** 3)
try:
if is_windows():
ctypes.windll.kernel32.SetProcessWorkingSetSize(-1, ctypes.c_size_t(system_memory_limit), ctypes.c_size_t(system_memory_limit)) #type:ignore[attr-defined]
else:
resource.setrlimit(resource.RLIMIT_DATA, (system_memory_limit, system_memory_limit))
return True
except Exception:
return False
+13
View File
@@ -0,0 +1,13 @@
METADATA =\
{
'name': 'FaceFusion',
'description': 'Next generation face swapper and enhancer',
'version': '2.6.0',
'license': 'MIT',
'author': 'Henry Ruhs',
'url': 'https://deepfuze.io'
}
def get(key : str) -> str:
return METADATA[key]
+39
View File
@@ -0,0 +1,39 @@
from typing import List, Optional
import hashlib
import os
import deepfuze.globals
from deepfuze.filesystem import is_directory
from deepfuze.typing import Padding, Fps
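# A directory output becomes "<target name>-<globals hash><target extension>"; a file output keeps the target extension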
def normalize_output_path(target_path : Optional[str], output_path : Optional[str]) -> Optional[str]:
if target_path and output_path:
target_name, target_extension = os.path.splitext(os.path.basename(target_path))
if is_directory(output_path):
output_hash = hashlib.sha1(str(deepfuze.globals.__dict__).encode('utf-8')).hexdigest()[:8]
output_name = target_name + '-' + output_hash
return os.path.join(output_path, output_name + target_extension)
output_name, output_extension = os.path.splitext(os.path.basename(output_path))
output_directory_path = os.path.dirname(output_path)
if is_directory(output_directory_path) and output_extension:
return os.path.join(output_directory_path, output_name + target_extension)
return None
def normalize_padding(padding : Optional[List[int]]) -> Optional[Padding]:
if padding and len(padding) == 1:
return tuple([ padding[0] ] * 4) #type:ignore[return-value]
if padding and len(padding) == 2:
return tuple([ padding[0], padding[1], padding[0], padding[1] ]) #type:ignore[return-value]
if padding and len(padding) == 3:
return tuple([ padding[0], padding[1], padding[2], padding[1] ]) #type:ignore[return-value]
if padding and len(padding) == 4:
return tuple(padding) #type:ignore[return-value]
return None
def normalize_fps(fps : Optional[float]) -> Optional[Fps]:
if fps is not None:
return max(1.0, min(fps, 60.0))
return None
+53
View File
@@ -0,0 +1,53 @@
from typing import Generator, List
from deepfuze.typing import QueuePayload, ProcessState
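# Module-level state machine shared by the UI and the processing loop: pending -> checking -> processing -> stopping -> pending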
PROCESS_STATE : ProcessState = 'pending'
def get_process_state() -> ProcessState:
return PROCESS_STATE
def set_process_state(process_state : ProcessState) -> None:
global PROCESS_STATE
PROCESS_STATE = process_state
def is_checking() -> bool:
return get_process_state() == 'checking'
def is_processing() -> bool:
return get_process_state() == 'processing'
def is_stopping() -> bool:
return get_process_state() == 'stopping'
def is_pending() -> bool:
return get_process_state() == 'pending'
def check() -> None:
set_process_state('checking')
def start() -> None:
set_process_state('processing')
def stop() -> None:
set_process_state('stopping')
def end() -> None:
set_process_state('pending')
def manage(queue_payloads : List[QueuePayload]) -> Generator[QueuePayload, None, None]:
for queue_payload in queue_payloads:
if is_processing():
yield queue_payload
+16
View File
@@ -0,0 +1,16 @@
from typing import List
from deepfuze.common_helper import create_int_range
from deepfuze.processors.frame.typings import FaceDebuggerItem, FaceEnhancerModel, FaceSwapperModel, FrameColorizerModel, FrameEnhancerModel, LipSyncerModel
face_debugger_items : List[FaceDebuggerItem] = [ 'bounding-box', 'face-landmark-5', 'face-landmark-5/68', 'face-landmark-68', 'face-landmark-68/5', 'face-mask', 'face-detector-score', 'face-landmarker-score', 'age', 'gender' ]
face_enhancer_models : List[FaceEnhancerModel] = [ 'codeformer', 'gfpgan_1.2', 'gfpgan_1.3', 'gfpgan_1.4', 'gpen_bfr_256', 'gpen_bfr_512', 'gpen_bfr_1024', 'gpen_bfr_2048', 'restoreformer_plus_plus' ]
face_swapper_models : List[FaceSwapperModel] = [ 'blendswap_256', 'inswapper_128', 'inswapper_128_fp16', 'simswap_256', 'simswap_512_unofficial', 'uniface_256' ]
frame_colorizer_models : List[FrameColorizerModel] = [ 'ddcolor', 'ddcolor_artistic', 'deoldify', 'deoldify_artistic', 'deoldify_stable' ]
frame_colorizer_sizes : List[str] = [ '192x192', '256x256', '384x384', '512x512' ]
frame_enhancer_models : List[FrameEnhancerModel] = [ 'clear_reality_x4', 'lsdir_x4', 'nomos8k_sc_x4', 'real_esrgan_x2', 'real_esrgan_x2_fp16', 'real_esrgan_x4', 'real_esrgan_x4_fp16', 'real_hatgan_x4', 'span_kendata_x4', 'ultra_sharp_x4' ]
lip_syncer_models : List[LipSyncerModel] = [ 'wav2lip_gan' ]
face_enhancer_blend_range : List[int] = create_int_range(0, 100, 1)
frame_colorizer_blend_range : List[int] = create_int_range(0, 100, 1)
frame_enhancer_blend_range : List[int] = create_int_range(0, 100, 1)
+116
View File
@@ -0,0 +1,116 @@
import os
import sys
import importlib
from concurrent.futures import ThreadPoolExecutor, as_completed
from queue import Queue
from types import ModuleType
from typing import Any, List
from tqdm import tqdm
import deepfuze.globals
from deepfuze.typing import ProcessFrames, QueuePayload
from deepfuze.execution import encode_execution_providers
from deepfuze import logger, wording
FRAME_PROCESSORS_MODULES : List[ModuleType] = []
FRAME_PROCESSORS_METHODS =\
[
'get_frame_processor',
'clear_frame_processor',
'get_options',
'set_options',
'register_args',
'apply_args',
'pre_check',
'post_check',
'pre_process',
'post_process',
'get_reference_frame',
'process_frame',
'process_frames',
'process_image',
'process_video'
]
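# Import a processor module by name and verify it exposes the full method interface listed above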
def load_frame_processor_module(frame_processor : str) -> Any:
try:
frame_processor_module = importlib.import_module('deepfuze.processors.frame.modules.' + frame_processor)
for method_name in FRAME_PROCESSORS_METHODS:
if not hasattr(frame_processor_module, method_name):
raise NotImplementedError
except ModuleNotFoundError as exception:
logger.error(wording.get('frame_processor_not_loaded').format(frame_processor = frame_processor), __name__.upper())
logger.debug(exception.msg, __name__.upper())
sys.exit(1)
except NotImplementedError:
logger.error(wording.get('frame_processor_not_implemented').format(frame_processor = frame_processor), __name__.upper())
sys.exit(1)
return frame_processor_module
def get_frame_processors_modules(frame_processors : List[str]) -> List[ModuleType]:
global FRAME_PROCESSORS_MODULES
if not FRAME_PROCESSORS_MODULES:
for frame_processor in frame_processors:
frame_processor_module = load_frame_processor_module(frame_processor)
FRAME_PROCESSORS_MODULES.append(frame_processor_module)
return FRAME_PROCESSORS_MODULES
def clear_frame_processors_modules() -> None:
global FRAME_PROCESSORS_MODULES
for frame_processor_module in get_frame_processors_modules(deepfuze.globals.frame_processors):
frame_processor_module.clear_frame_processor()
FRAME_PROCESSORS_MODULES = []
def multi_process_frames(source_paths : List[str], temp_frame_paths : List[str], process_frames : ProcessFrames) -> None:
queue_payloads = create_queue_payloads(temp_frame_paths)
with tqdm(total = len(queue_payloads), desc = wording.get('processing'), unit = 'frame', ascii = ' =', disable = deepfuze.globals.log_level in [ 'warn', 'error' ]) as progress:
progress.set_postfix(
{
'execution_providers': encode_execution_providers(deepfuze.globals.execution_providers),
'execution_thread_count': deepfuze.globals.execution_thread_count,
'execution_queue_count': deepfuze.globals.execution_queue_count
})
with ThreadPoolExecutor(max_workers = deepfuze.globals.execution_thread_count) as executor:
futures = []
queue : Queue[QueuePayload] = create_queue(queue_payloads)
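# Each submitted future drains a slice of the queue sized by the thread and queue counts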
queue_per_future = max(len(queue_payloads) // deepfuze.globals.execution_thread_count * deepfuze.globals.execution_queue_count, 1)
while not queue.empty():
future = executor.submit(process_frames, source_paths, pick_queue(queue, queue_per_future), progress.update)
futures.append(future)
for future_done in as_completed(futures):
future_done.result()
def create_queue(queue_payloads : List[QueuePayload]) -> Queue[QueuePayload]:
queue : Queue[QueuePayload] = Queue()
for queue_payload in queue_payloads:
queue.put(queue_payload)
return queue
def pick_queue(queue : Queue[QueuePayload], queue_per_future : int) -> List[QueuePayload]:
queues = []
for _ in range(queue_per_future):
if not queue.empty():
queues.append(queue.get())
return queues
def create_queue_payloads(temp_frame_paths : List[str]) -> List[QueuePayload]:
queue_payloads = []
temp_frame_paths = sorted(temp_frame_paths, key = os.path.basename)
for frame_number, frame_path in enumerate(temp_frame_paths):
frame_payload : QueuePayload =\
{
'frame_number': frame_number,
'frame_path': frame_path
}
queue_payloads.append(frame_payload)
return queue_payloads
+14
View File
@@ -0,0 +1,14 @@
from typing import List, Optional
from deepfuze.processors.frame.typings import FaceDebuggerItem, FaceEnhancerModel, FaceSwapperModel, FrameColorizerModel, FrameEnhancerModel, LipSyncerModel
face_debugger_items : Optional[List[FaceDebuggerItem]] = None
face_enhancer_model : Optional[FaceEnhancerModel] = None
face_enhancer_blend : Optional[int] = None
face_swapper_model : Optional[FaceSwapperModel] = None
frame_colorizer_model : Optional[FrameColorizerModel] = None
frame_colorizer_blend : Optional[int] = None
frame_colorizer_size : Optional[str] = None
frame_enhancer_model : Optional[FrameEnhancerModel] = None
frame_enhancer_blend : Optional[int] = None
lip_syncer_model : Optional[LipSyncerModel] = None
+192
View File
@@ -0,0 +1,192 @@
from typing import Any, List, Literal
from argparse import ArgumentParser
import cv2
import numpy
import deepfuze.globals
import deepfuze.processors.frame.core as frame_processors
from deepfuze import config, process_manager, wording
from deepfuze.face_analyser import get_one_face, get_many_faces, find_similar_faces, clear_face_analyser
from deepfuze.face_masker import create_static_box_mask, create_occlusion_mask, create_region_mask, clear_face_occluder, clear_face_parser
from deepfuze.face_helper import warp_face_by_face_landmark_5, categorize_age, categorize_gender
from deepfuze.face_store import get_reference_faces
from deepfuze.content_analyser import clear_content_analyser
from deepfuze.typing import Face, VisionFrame, UpdateProgress, ProcessMode, QueuePayload
from deepfuze.vision import read_image, read_static_image, write_image
from deepfuze.processors.frame.typings import FaceDebuggerInputs
from deepfuze.processors.frame import globals as frame_processors_globals, choices as frame_processors_choices
NAME = __name__.upper()
def get_frame_processor() -> None:
pass
def clear_frame_processor() -> None:
pass
def get_options(key : Literal['model']) -> None:
pass
def set_options(key : Literal['model'], value : Any) -> None:
pass
def register_args(program : ArgumentParser) -> None:
program.add_argument('--face-debugger-items', help = wording.get('help.face_debugger_items').format(choices = ', '.join(frame_processors_choices.face_debugger_items)), default = config.get_str_list('frame_processors.face_debugger_items', 'face-landmark-5/68 face-mask'), choices = frame_processors_choices.face_debugger_items, nargs = '+', metavar = 'FACE_DEBUGGER_ITEMS')
def apply_args(program : ArgumentParser) -> None:
args = program.parse_args()
frame_processors_globals.face_debugger_items = args.face_debugger_items
def pre_check() -> bool:
return True
def post_check() -> bool:
return True
def pre_process(mode : ProcessMode) -> bool:
return True
def post_process() -> None:
read_static_image.cache_clear()
if deepfuze.globals.video_memory_strategy == 'strict' or deepfuze.globals.video_memory_strategy == 'moderate':
clear_frame_processor()
if deepfuze.globals.video_memory_strategy == 'strict':
clear_face_analyser()
clear_content_analyser()
clear_face_occluder()
clear_face_parser()
def debug_face(target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
primary_color = (0, 0, 255)
secondary_color = (0, 255, 0)
tertiary_color = (255, 255, 0)
bounding_box = target_face.bounding_box.astype(numpy.int32)
temp_vision_frame = temp_vision_frame.copy()
has_face_landmark_5_fallback = numpy.array_equal(target_face.landmarks.get('5'), target_face.landmarks.get('5/68'))
has_face_landmark_68_fallback = numpy.array_equal(target_face.landmarks.get('68'), target_face.landmarks.get('68/5'))
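# Fallback landmarks (copied from the other landmark set) are drawn in the tertiary color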
if 'bounding-box' in frame_processors_globals.face_debugger_items:
cv2.rectangle(temp_vision_frame, (bounding_box[0], bounding_box[1]), (bounding_box[2], bounding_box[3]), primary_color, 2)
if 'face-mask' in frame_processors_globals.face_debugger_items:
crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmarks.get('5/68'), 'arcface_128_v2', (512, 512))
inverse_matrix = cv2.invertAffineTransform(affine_matrix)
temp_size = temp_vision_frame.shape[:2][::-1]
crop_mask_list = []
if 'box' in deepfuze.globals.face_mask_types:
box_mask = create_static_box_mask(crop_vision_frame.shape[:2][::-1], 0, deepfuze.globals.face_mask_padding)
crop_mask_list.append(box_mask)
if 'occlusion' in deepfuze.globals.face_mask_types:
occlusion_mask = create_occlusion_mask(crop_vision_frame)
crop_mask_list.append(occlusion_mask)
if 'region' in deepfuze.globals.face_mask_types:
region_mask = create_region_mask(crop_vision_frame, deepfuze.globals.face_mask_regions)
crop_mask_list.append(region_mask)
crop_mask = numpy.minimum.reduce(crop_mask_list).clip(0, 1)
crop_mask = (crop_mask * 255).astype(numpy.uint8)
inverse_vision_frame = cv2.warpAffine(crop_mask, inverse_matrix, temp_size)
inverse_vision_frame = cv2.threshold(inverse_vision_frame, 100, 255, cv2.THRESH_BINARY)[1]
inverse_vision_frame[inverse_vision_frame > 0] = 255
inverse_contours = cv2.findContours(inverse_vision_frame, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)[0]
cv2.drawContours(temp_vision_frame, inverse_contours, -1, tertiary_color if has_face_landmark_5_fallback else secondary_color, 2)
if 'face-landmark-5' in frame_processors_globals.face_debugger_items and numpy.any(target_face.landmarks.get('5')):
face_landmark_5 = target_face.landmarks.get('5').astype(numpy.int32)
for index in range(face_landmark_5.shape[0]):
cv2.circle(temp_vision_frame, (face_landmark_5[index][0], face_landmark_5[index][1]), 3, primary_color, -1)
if 'face-landmark-5/68' in frame_processors_globals.face_debugger_items and numpy.any(target_face.landmarks.get('5/68')):
face_landmark_5_68 = target_face.landmarks.get('5/68').astype(numpy.int32)
for index in range(face_landmark_5_68.shape[0]):
cv2.circle(temp_vision_frame, (face_landmark_5_68[index][0], face_landmark_5_68[index][1]), 3, tertiary_color if has_face_landmark_5_fallback else secondary_color, -1)
if 'face-landmark-68' in frame_processors_globals.face_debugger_items and numpy.any(target_face.landmarks.get('68')):
face_landmark_68 = target_face.landmarks.get('68').astype(numpy.int32)
for index in range(face_landmark_68.shape[0]):
cv2.circle(temp_vision_frame, (face_landmark_68[index][0], face_landmark_68[index][1]), 3, tertiary_color if has_face_landmark_68_fallback else secondary_color, -1)
if 'face-landmark-68/5' in frame_processors_globals.face_debugger_items and numpy.any(target_face.landmarks.get('68/5')):
face_landmark_68 = target_face.landmarks.get('68/5').astype(numpy.int32)
for index in range(face_landmark_68.shape[0]):
cv2.circle(temp_vision_frame, (face_landmark_68[index][0], face_landmark_68[index][1]), 3, primary_color, -1)
if bounding_box[3] - bounding_box[1] > 50 and bounding_box[2] - bounding_box[0] > 50:
top = bounding_box[1]
left = bounding_box[0] - 20
if 'face-detector-score' in frame_processors_globals.face_debugger_items:
face_score_text = str(round(target_face.scores.get('detector'), 2))
top = top + 20
cv2.putText(temp_vision_frame, face_score_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, primary_color, 2)
if 'face-landmarker-score' in frame_processors_globals.face_debugger_items:
face_score_text = str(round(target_face.scores.get('landmarker'), 2))
top = top + 20
cv2.putText(temp_vision_frame, face_score_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, tertiary_color if has_face_landmark_5_fallback else secondary_color, 2)
if 'age' in frame_processors_globals.face_debugger_items:
face_age_text = categorize_age(target_face.age)
top = top + 20
cv2.putText(temp_vision_frame, face_age_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, primary_color, 2)
if 'gender' in frame_processors_globals.face_debugger_items:
face_gender_text = categorize_gender(target_face.gender)
top = top + 20
cv2.putText(temp_vision_frame, face_gender_text, (left, top), cv2.FONT_HERSHEY_SIMPLEX, 0.5, primary_color, 2)
return temp_vision_frame
def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
pass
def process_frame(inputs : FaceDebuggerInputs) -> VisionFrame:
reference_faces = inputs.get('reference_faces')
target_vision_frame = inputs.get('target_vision_frame')
if deepfuze.globals.face_selector_mode == 'many':
many_faces = get_many_faces(target_vision_frame)
if many_faces:
for target_face in many_faces:
target_vision_frame = debug_face(target_face, target_vision_frame)
if deepfuze.globals.face_selector_mode == 'one':
target_face = get_one_face(target_vision_frame)
if target_face:
target_vision_frame = debug_face(target_face, target_vision_frame)
if deepfuze.globals.face_selector_mode == 'reference':
similar_faces = find_similar_faces(reference_faces, target_vision_frame, deepfuze.globals.reference_face_distance)
if similar_faces:
for similar_face in similar_faces:
target_vision_frame = debug_face(similar_face, target_vision_frame)
return target_vision_frame
def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProgress) -> None:
reference_faces = get_reference_faces() if 'reference' in deepfuze.globals.face_selector_mode else None
for queue_payload in process_manager.manage(queue_payloads):
target_vision_path = queue_payload['frame_path']
target_vision_frame = read_image(target_vision_path)
output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'target_vision_frame': target_vision_frame
})
write_image(target_vision_path, output_vision_frame)
update_progress(1)
def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
reference_faces = get_reference_faces() if 'reference' in deepfuze.globals.face_selector_mode else None
target_vision_frame = read_static_image(target_path)
output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'target_vision_frame': target_vision_frame
})
write_image(output_path, output_vision_frame)
def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
frame_processors.multi_process_frames(source_paths, temp_frame_paths, process_frames)
+301
View File
@@ -0,0 +1,301 @@
from typing import Any, List, Literal, Optional
from argparse import ArgumentParser
from time import sleep
import cv2
import numpy
import onnxruntime
import deepfuze.globals
import deepfuze.processors.frame.core as frame_processors
from deepfuze import config, process_manager, logger, wording
from deepfuze.face_analyser import get_many_faces, clear_face_analyser, find_similar_faces, get_one_face
from deepfuze.face_masker import create_static_box_mask, create_occlusion_mask, clear_face_occluder
from deepfuze.face_helper import warp_face_by_face_landmark_5, paste_back
from deepfuze.execution import apply_execution_provider_options
from deepfuze.content_analyser import clear_content_analyser
from deepfuze.face_store import get_reference_faces
from deepfuze.normalizer import normalize_output_path
from deepfuze.thread_helper import thread_lock, thread_semaphore
from deepfuze.typing import Face, VisionFrame, UpdateProgress, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
from deepfuze.common_helper import create_metavar
from deepfuze.filesystem import is_file, is_image, is_video, resolve_relative_path
from deepfuze.download import conditional_download, is_download_done
from deepfuze.vision import read_image, read_static_image, write_image
from deepfuze.processors.frame.typings import FaceEnhancerInputs
from deepfuze.processors.frame import globals as frame_processors_globals
from deepfuze.processors.frame import choices as frame_processors_choices
FRAME_PROCESSOR = None
NAME = __name__.upper()
MODELS : ModelSet =\
{
'codeformer':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/codeformer.onnx',
'path': resolve_relative_path('../../../models/deepfuze/codeformer.onnx'),
'template': 'ffhq_512',
'size': (512, 512)
},
'gfpgan_1.2':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gfpgan_1.2.onnx',
'path': resolve_relative_path('../../../models/deepfuze/gfpgan_1.2.onnx'),
'template': 'ffhq_512',
'size': (512, 512)
},
'gfpgan_1.3':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gfpgan_1.3.onnx',
'path': resolve_relative_path('../../../models/deepfuze/gfpgan_1.3.onnx'),
'template': 'ffhq_512',
'size': (512, 512)
},
'gfpgan_1.4':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gfpgan_1.4.onnx',
'path': resolve_relative_path('../../../models/deepfuze/gfpgan_1.4.onnx'),
'template': 'ffhq_512',
'size': (512, 512)
},
'gpen_bfr_256':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gpen_bfr_256.onnx',
'path': resolve_relative_path('../../../models/deepfuze/gpen_bfr_256.onnx'),
'template': 'arcface_128_v2',
'size': (256, 256)
},
'gpen_bfr_512':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gpen_bfr_512.onnx',
'path': resolve_relative_path('../../../models/deepfuze/gpen_bfr_512.onnx'),
'template': 'ffhq_512',
'size': (512, 512)
},
'gpen_bfr_1024':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gpen_bfr_1024.onnx',
'path': resolve_relative_path('../../../models/deepfuze/gpen_bfr_1024.onnx'),
'template': 'ffhq_512',
'size': (1024, 1024)
},
'gpen_bfr_2048':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gpen_bfr_2048.onnx',
'path': resolve_relative_path('../../../models/deepfuze/gpen_bfr_2048.onnx'),
'template': 'ffhq_512',
'size': (2048, 2048)
},
'restoreformer_plus_plus':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/restoreformer_plus_plus.onnx',
'path': resolve_relative_path('../../../models/deepfuze/restoreformer_plus_plus.onnx'),
'template': 'ffhq_512',
'size': (512, 512)
}
}
OPTIONS : Optional[OptionsWithModel] = None
def get_frame_processor() -> Any:
global FRAME_PROCESSOR
with thread_lock():
while process_manager.is_checking():
sleep(0.5)
if FRAME_PROCESSOR is None:
model_path = get_options('model').get('path')
FRAME_PROCESSOR = onnxruntime.InferenceSession(model_path, providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
return FRAME_PROCESSOR
def clear_frame_processor() -> None:
global FRAME_PROCESSOR
FRAME_PROCESSOR = None
def get_options(key : Literal['model']) -> Any:
global OPTIONS
if OPTIONS is None:
OPTIONS =\
{
'model': MODELS[frame_processors_globals.face_enhancer_model]
}
return OPTIONS.get(key)
def set_options(key : Literal['model'], value : Any) -> None:
global OPTIONS
OPTIONS[key] = value
def register_args(program : ArgumentParser) -> None:
program.add_argument('--face-enhancer-model', help = wording.get('help.face_enhancer_model'), default = config.get_str_value('frame_processors.face_enhancer_model', 'gfpgan_1.4'), choices = frame_processors_choices.face_enhancer_models)
program.add_argument('--face-enhancer-blend', help = wording.get('help.face_enhancer_blend'), type = int, default = config.get_int_value('frame_processors.face_enhancer_blend', '80'), choices = frame_processors_choices.face_enhancer_blend_range, metavar = create_metavar(frame_processors_choices.face_enhancer_blend_range))
def apply_args(program : ArgumentParser) -> None:
args = program.parse_args()
frame_processors_globals.face_enhancer_model = args.face_enhancer_model
frame_processors_globals.face_enhancer_blend = args.face_enhancer_blend
def pre_check() -> bool:
download_directory_path = resolve_relative_path('../../../models/deepfuze')
model_url = get_options('model').get('url')
model_path = get_options('model').get('path')
if not deepfuze.globals.skip_download:
process_manager.check()
conditional_download(download_directory_path, [ model_url ])
process_manager.end()
return is_file(model_path)
def post_check() -> bool:
model_url = get_options('model').get('url')
model_path = get_options('model').get('path')
if not deepfuze.globals.skip_download and not is_download_done(model_url, model_path):
logger.error(wording.get('model_download_not_done') + wording.get('exclamation_mark'), NAME)
return False
if not is_file(model_path):
logger.error(wording.get('model_file_not_present') + wording.get('exclamation_mark'), NAME)
return False
return True
def pre_process(mode : ProcessMode) -> bool:
if mode in [ 'output', 'preview' ] and not is_image(deepfuze.globals.target_path) and not is_video(deepfuze.globals.target_path):
logger.error(wording.get('select_image_or_video_target') + wording.get('exclamation_mark'), NAME)
return False
if mode == 'output' and not normalize_output_path(deepfuze.globals.target_path, deepfuze.globals.output_path):
logger.error(wording.get('select_file_or_directory_output') + wording.get('exclamation_mark'), NAME)
return False
return True
def post_process() -> None:
read_static_image.cache_clear()
if deepfuze.globals.video_memory_strategy == 'strict' or deepfuze.globals.video_memory_strategy == 'moderate':
clear_frame_processor()
if deepfuze.globals.video_memory_strategy == 'strict':
clear_face_analyser()
clear_content_analyser()
clear_face_occluder()
def enhance_face(target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
model_template = get_options('model').get('template')
model_size = get_options('model').get('size')
crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmarks.get('5/68'), model_template, model_size)
box_mask = create_static_box_mask(crop_vision_frame.shape[:2][::-1], deepfuze.globals.face_mask_blur, (0, 0, 0, 0))
crop_mask_list =\
[
box_mask
]
if 'occlusion' in deepfuze.globals.face_mask_types:
occlusion_mask = create_occlusion_mask(crop_vision_frame)
crop_mask_list.append(occlusion_mask)
crop_vision_frame = prepare_crop_frame(crop_vision_frame)
crop_vision_frame = apply_enhance(crop_vision_frame)
crop_vision_frame = normalize_crop_frame(crop_vision_frame)
crop_mask = numpy.minimum.reduce(crop_mask_list).clip(0, 1)
paste_vision_frame = paste_back(temp_vision_frame, crop_vision_frame, crop_mask, affine_matrix)
temp_vision_frame = blend_frame(temp_vision_frame, paste_vision_frame)
return temp_vision_frame
def apply_enhance(crop_vision_frame : VisionFrame) -> VisionFrame:
frame_processor = get_frame_processor()
frame_processor_inputs = {}
for frame_processor_input in frame_processor.get_inputs():
if frame_processor_input.name == 'input':
frame_processor_inputs[frame_processor_input.name] = crop_vision_frame
if frame_processor_input.name == 'weight':
weight = numpy.array([ 1 ]).astype(numpy.double)
frame_processor_inputs[frame_processor_input.name] = weight
with thread_semaphore():
crop_vision_frame = frame_processor.run(None, frame_processor_inputs)[0][0]
return crop_vision_frame
def prepare_crop_frame(crop_vision_frame : VisionFrame) -> VisionFrame:
crop_vision_frame = crop_vision_frame[:, :, ::-1] / 255.0
crop_vision_frame = (crop_vision_frame - 0.5) / 0.5
crop_vision_frame = numpy.expand_dims(crop_vision_frame.transpose(2, 0, 1), axis = 0).astype(numpy.float32)
return crop_vision_frame
def normalize_crop_frame(crop_vision_frame : VisionFrame) -> VisionFrame:
crop_vision_frame = numpy.clip(crop_vision_frame, -1, 1)
crop_vision_frame = (crop_vision_frame + 1) / 2
crop_vision_frame = crop_vision_frame.transpose(1, 2, 0)
crop_vision_frame = (crop_vision_frame * 255.0).round()
crop_vision_frame = crop_vision_frame.astype(numpy.uint8)[:, :, ::-1]
return crop_vision_frame
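# A face_enhancer_blend of 100 keeps only the enhanced result; 0 keeps the original frame untouched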
def blend_frame(temp_vision_frame : VisionFrame, paste_vision_frame : VisionFrame) -> VisionFrame:
face_enhancer_blend = 1 - (frame_processors_globals.face_enhancer_blend / 100)
temp_vision_frame = cv2.addWeighted(temp_vision_frame, face_enhancer_blend, paste_vision_frame, 1 - face_enhancer_blend, 0)
return temp_vision_frame
def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
return enhance_face(target_face, temp_vision_frame)
def process_frame(inputs : FaceEnhancerInputs) -> VisionFrame:
reference_faces = inputs.get('reference_faces')
target_vision_frame = inputs.get('target_vision_frame')
if deepfuze.globals.face_selector_mode == 'many':
many_faces = get_many_faces(target_vision_frame)
if many_faces:
for target_face in many_faces:
target_vision_frame = enhance_face(target_face, target_vision_frame)
if deepfuze.globals.face_selector_mode == 'one':
target_face = get_one_face(target_vision_frame)
if target_face:
target_vision_frame = enhance_face(target_face, target_vision_frame)
if deepfuze.globals.face_selector_mode == 'reference':
similar_faces = find_similar_faces(reference_faces, target_vision_frame, deepfuze.globals.reference_face_distance)
if similar_faces:
for similar_face in similar_faces:
target_vision_frame = enhance_face(similar_face, target_vision_frame)
return target_vision_frame
def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProgress) -> None:
reference_faces = get_reference_faces() if 'reference' in deepfuze.globals.face_selector_mode else None
for queue_payload in process_manager.manage(queue_payloads):
target_vision_path = queue_payload['frame_path']
target_vision_frame = read_image(target_vision_path)
output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'target_vision_frame': target_vision_frame
})
write_image(target_vision_path, output_vision_frame)
update_progress(1)
def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
reference_faces = get_reference_faces() if 'reference' in deepfuze.globals.face_selector_mode else None
target_vision_frame = read_static_image(target_path)
output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'target_vision_frame': target_vision_frame
})
write_image(output_path, output_vision_frame)
def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
frame_processors.multi_process_frames(None, temp_frame_paths, process_frames)
+369
View File
@@ -0,0 +1,369 @@
from typing import Any, List, Literal, Optional
from argparse import ArgumentParser
from time import sleep
import numpy
import onnx
import onnxruntime
from onnx import numpy_helper
import deepfuze.globals
import deepfuze.processors.frame.core as frame_processors
from deepfuze import config, process_manager, logger, wording
from deepfuze.execution import has_execution_provider, apply_execution_provider_options
from deepfuze.face_analyser import get_one_face, get_average_face, get_many_faces, find_similar_faces, clear_face_analyser
from deepfuze.face_masker import create_static_box_mask, create_occlusion_mask, create_region_mask, clear_face_occluder, clear_face_parser
from deepfuze.face_helper import warp_face_by_face_landmark_5, paste_back
from deepfuze.face_store import get_reference_faces
from deepfuze.content_analyser import clear_content_analyser
from deepfuze.normalizer import normalize_output_path
from deepfuze.thread_helper import thread_lock, conditional_thread_semaphore
from deepfuze.typing import Face, Embedding, VisionFrame, UpdateProgress, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
from deepfuze.filesystem import is_file, is_image, has_image, is_video, filter_image_paths, resolve_relative_path
from deepfuze.download import conditional_download, is_download_done
from deepfuze.vision import read_image, read_static_image, read_static_images, write_image
from deepfuze.processors.frame.typings import FaceSwapperInputs
from deepfuze.processors.frame import globals as frame_processors_globals
from deepfuze.processors.frame import choices as frame_processors_choices
FRAME_PROCESSOR = None
MODEL_INITIALIZER = None
NAME = __name__.upper()
MODELS : ModelSet =\
{
'blendswap_256':
{
'type': 'blendswap',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/blendswap_256.onnx',
'path': resolve_relative_path('../../../models/deepfuze/blendswap_256.onnx'),
'template': 'ffhq_512',
'size': (256, 256),
'mean': [ 0.0, 0.0, 0.0 ],
'standard_deviation': [ 1.0, 1.0, 1.0 ]
},
'inswapper_128':
{
'type': 'inswapper',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/inswapper_128.onnx',
'path': resolve_relative_path('../../../models/deepfuze/inswapper_128.onnx'),
'template': 'arcface_128_v2',
'size': (128, 128),
'mean': [ 0.0, 0.0, 0.0 ],
'standard_deviation': [ 1.0, 1.0, 1.0 ]
},
'inswapper_128_fp16':
{
'type': 'inswapper',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/inswapper_128_fp16.onnx',
'path': resolve_relative_path('../../../models/deepfuze/inswapper_128_fp16.onnx'),
'template': 'arcface_128_v2',
'size': (128, 128),
'mean': [ 0.0, 0.0, 0.0 ],
'standard_deviation': [ 1.0, 1.0, 1.0 ]
},
'simswap_256':
{
'type': 'simswap',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/simswap_256.onnx',
'path': resolve_relative_path('../../../models/deepfuze/simswap_256.onnx'),
'template': 'arcface_112_v1',
'size': (256, 256),
'mean': [ 0.485, 0.456, 0.406 ],
'standard_deviation': [ 0.229, 0.224, 0.225 ]
},
'simswap_512_unofficial':
{
'type': 'simswap',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/simswap_512_unofficial.onnx',
'path': resolve_relative_path('../../../models/deepfuze/simswap_512_unofficial.onnx'),
'template': 'arcface_112_v1',
'size': (512, 512),
'mean': [ 0.0, 0.0, 0.0 ],
'standard_deviation': [ 1.0, 1.0, 1.0 ]
},
'uniface_256':
{
'type': 'uniface',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/uniface_256.onnx',
'path': resolve_relative_path('../../../models/deepfuze/uniface_256.onnx'),
'template': 'ffhq_512',
'size': (256, 256),
'mean': [ 0.0, 0.0, 0.0 ],
'standard_deviation': [ 1.0, 1.0, 1.0 ]
}
}
OPTIONS : Optional[OptionsWithModel] = None
def get_frame_processor() -> Any:
global FRAME_PROCESSOR
with thread_lock():
while process_manager.is_checking():
sleep(0.5)
if FRAME_PROCESSOR is None:
model_path = get_options('model').get('path')
FRAME_PROCESSOR = onnxruntime.InferenceSession(model_path, providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
return FRAME_PROCESSOR
def clear_frame_processor() -> None:
global FRAME_PROCESSOR
FRAME_PROCESSOR = None
def get_model_initializer() -> Any:
global MODEL_INITIALIZER
with thread_lock():
while process_manager.is_checking():
sleep(0.5)
if MODEL_INITIALIZER is None:
model_path = get_options('model').get('path')
model = onnx.load(model_path)
MODEL_INITIALIZER = numpy_helper.to_array(model.graph.initializer[-1])
return MODEL_INITIALIZER
def clear_model_initializer() -> None:
global MODEL_INITIALIZER
MODEL_INITIALIZER = None
def get_options(key : Literal['model']) -> Any:
global OPTIONS
if OPTIONS is None:
OPTIONS =\
{
'model': MODELS[frame_processors_globals.face_swapper_model]
}
return OPTIONS.get(key)
def set_options(key : Literal['model'], value : Any) -> None:
global OPTIONS
OPTIONS[key] = value
def register_args(program : ArgumentParser) -> None:
if has_execution_provider('CoreMLExecutionProvider') or has_execution_provider('OpenVINOExecutionProvider'):
face_swapper_model_fallback = 'inswapper_128'
else:
face_swapper_model_fallback = 'inswapper_128_fp16'
program.add_argument('--face-swapper-model', help = wording.get('help.face_swapper_model'), default = config.get_str_value('frame_processors.face_swapper_model', face_swapper_model_fallback), choices = frame_processors_choices.face_swapper_models)
def apply_args(program : ArgumentParser) -> None:
args = program.parse_args()
frame_processors_globals.face_swapper_model = args.face_swapper_model
if args.face_swapper_model == 'blendswap_256':
deepfuze.globals.face_recognizer_model = 'arcface_blendswap'
if args.face_swapper_model == 'inswapper_128' or args.face_swapper_model == 'inswapper_128_fp16':
deepfuze.globals.face_recognizer_model = 'arcface_inswapper'
if args.face_swapper_model == 'simswap_256' or args.face_swapper_model == 'simswap_512_unofficial':
deepfuze.globals.face_recognizer_model = 'arcface_simswap'
if args.face_swapper_model == 'uniface_256':
deepfuze.globals.face_recognizer_model = 'arcface_uniface'
def pre_check() -> bool:
download_directory_path = resolve_relative_path('../../../models/deepfuze')
model_url = get_options('model').get('url')
model_path = get_options('model').get('path')
if not deepfuze.globals.skip_download:
process_manager.check()
conditional_download(download_directory_path, [ model_url ])
process_manager.end()
return is_file(model_path)
def post_check() -> bool:
model_url = get_options('model').get('url')
model_path = get_options('model').get('path')
if not deepfuze.globals.skip_download and not is_download_done(model_url, model_path):
logger.error(wording.get('model_download_not_done') + wording.get('exclamation_mark'), NAME)
return False
if not is_file(model_path):
logger.error(wording.get('model_file_not_present') + wording.get('exclamation_mark'), NAME)
return False
return True
def pre_process(mode : ProcessMode) -> bool:
if not has_image(deepfuze.globals.source_paths):
logger.error(wording.get('select_image_source') + wording.get('exclamation_mark'), NAME)
return False
source_image_paths = filter_image_paths(deepfuze.globals.source_paths)
source_frames = read_static_images(source_image_paths)
for source_frame in source_frames:
if not get_one_face(source_frame):
logger.error(wording.get('no_source_face_detected') + wording.get('exclamation_mark'), NAME)
return False
if mode in [ 'output', 'preview' ] and not is_image(deepfuze.globals.target_path) and not is_video(deepfuze.globals.target_path):
logger.error(wording.get('select_image_or_video_target') + wording.get('exclamation_mark'), NAME)
return False
if mode == 'output' and not normalize_output_path(deepfuze.globals.target_path, deepfuze.globals.output_path):
logger.error(wording.get('select_file_or_directory_output') + wording.get('exclamation_mark'), NAME)
return False
return True
def post_process() -> None:
read_static_image.cache_clear()
if deepfuze.globals.video_memory_strategy == 'strict' or deepfuze.globals.video_memory_strategy == 'moderate':
clear_model_initializer()
clear_frame_processor()
if deepfuze.globals.video_memory_strategy == 'strict':
clear_face_analyser()
clear_content_analyser()
clear_face_occluder()
clear_face_parser()
def swap_face(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
model_template = get_options('model').get('template')
model_size = get_options('model').get('size')
crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmarks.get('5/68'), model_template, model_size)
crop_mask_list = []
if 'box' in deepfuze.globals.face_mask_types:
box_mask = create_static_box_mask(crop_vision_frame.shape[:2][::-1], deepfuze.globals.face_mask_blur, deepfuze.globals.face_mask_padding)
crop_mask_list.append(box_mask)
if 'occlusion' in deepfuze.globals.face_mask_types:
occlusion_mask = create_occlusion_mask(crop_vision_frame)
crop_mask_list.append(occlusion_mask)
crop_vision_frame = prepare_crop_frame(crop_vision_frame)
crop_vision_frame = apply_swap(source_face, crop_vision_frame)
crop_vision_frame = normalize_crop_frame(crop_vision_frame)
if 'region' in deepfuze.globals.face_mask_types:
region_mask = create_region_mask(crop_vision_frame, deepfuze.globals.face_mask_regions)
crop_mask_list.append(region_mask)
crop_mask = numpy.minimum.reduce(crop_mask_list).clip(0, 1)
temp_vision_frame = paste_back(temp_vision_frame, crop_vision_frame, crop_mask, affine_matrix)
return temp_vision_frame
def apply_swap(source_face : Face, crop_vision_frame : VisionFrame) -> VisionFrame:
frame_processor = get_frame_processor()
model_type = get_options('model').get('type')
frame_processor_inputs = {}
for frame_processor_input in frame_processor.get_inputs():
if frame_processor_input.name == 'source':
if model_type == 'blendswap' or model_type == 'uniface':
frame_processor_inputs[frame_processor_input.name] = prepare_source_frame(source_face)
else:
frame_processor_inputs[frame_processor_input.name] = prepare_source_embedding(source_face)
if frame_processor_input.name == 'target':
frame_processor_inputs[frame_processor_input.name] = crop_vision_frame
with conditional_thread_semaphore(deepfuze.globals.execution_providers):
crop_vision_frame = frame_processor.run(None, frame_processor_inputs)[0][0]
return crop_vision_frame
def prepare_source_frame(source_face : Face) -> VisionFrame:
model_type = get_options('model').get('type')
source_vision_frame = read_static_image(deepfuze.globals.source_paths[0])
if model_type == 'blendswap':
source_vision_frame, _ = warp_face_by_face_landmark_5(source_vision_frame, source_face.landmarks.get('5/68'), 'arcface_112_v2', (112, 112))
if model_type == 'uniface':
source_vision_frame, _ = warp_face_by_face_landmark_5(source_vision_frame, source_face.landmarks.get('5/68'), 'ffhq_512', (256, 256))
source_vision_frame = source_vision_frame[:, :, ::-1] / 255.0
source_vision_frame = source_vision_frame.transpose(2, 0, 1)
source_vision_frame = numpy.expand_dims(source_vision_frame, axis = 0).astype(numpy.float32)
return source_vision_frame
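# inswapper projects the raw embedding through the model's final initializer matrix; the other models take the normed embedding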
def prepare_source_embedding(source_face : Face) -> Embedding:
model_type = get_options('model').get('type')
if model_type == 'inswapper':
model_initializer = get_model_initializer()
source_embedding = source_face.embedding.reshape((1, -1))
source_embedding = numpy.dot(source_embedding, model_initializer) / numpy.linalg.norm(source_embedding)
else:
source_embedding = source_face.normed_embedding.reshape(1, -1)
return source_embedding
def prepare_crop_frame(crop_vision_frame : VisionFrame) -> VisionFrame:
model_mean = get_options('model').get('mean')
model_standard_deviation = get_options('model').get('standard_deviation')
crop_vision_frame = crop_vision_frame[:, :, ::-1] / 255.0
crop_vision_frame = (crop_vision_frame - model_mean) / model_standard_deviation
crop_vision_frame = crop_vision_frame.transpose(2, 0, 1)
crop_vision_frame = numpy.expand_dims(crop_vision_frame, axis = 0).astype(numpy.float32)
return crop_vision_frame
def normalize_crop_frame(crop_vision_frame : VisionFrame) -> VisionFrame:
crop_vision_frame = crop_vision_frame.transpose(1, 2, 0)
crop_vision_frame = (crop_vision_frame * 255.0).round()
crop_vision_frame = crop_vision_frame[:, :, ::-1]
return crop_vision_frame
def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
return swap_face(source_face, target_face, temp_vision_frame)
def process_frame(inputs : FaceSwapperInputs) -> VisionFrame:
reference_faces = inputs.get('reference_faces')
source_face = inputs.get('source_face')
target_vision_frame = inputs.get('target_vision_frame')
if deepfuze.globals.face_selector_mode == 'many':
many_faces = get_many_faces(target_vision_frame)
if many_faces:
for target_face in many_faces:
target_vision_frame = swap_face(source_face, target_face, target_vision_frame)
if deepfuze.globals.face_selector_mode == 'one':
target_face = get_one_face(target_vision_frame)
if target_face:
target_vision_frame = swap_face(source_face, target_face, target_vision_frame)
if deepfuze.globals.face_selector_mode == 'reference':
similar_faces = find_similar_faces(reference_faces, target_vision_frame, deepfuze.globals.reference_face_distance)
if similar_faces:
for similar_face in similar_faces:
target_vision_frame = swap_face(source_face, similar_face, target_vision_frame)
return target_vision_frame
def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProgress) -> None:
reference_faces = get_reference_faces() if 'reference' in deepfuze.globals.face_selector_mode else None
source_frames = read_static_images(source_paths)
source_face = get_average_face(source_frames)
for queue_payload in process_manager.manage(queue_payloads):
target_vision_path = queue_payload['frame_path']
target_vision_frame = read_image(target_vision_path)
output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'source_face': source_face,
'target_vision_frame': target_vision_frame
})
write_image(target_vision_path, output_vision_frame)
update_progress(1)
def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
reference_faces = get_reference_faces() if 'reference' in deepfuze.globals.face_selector_mode else None
source_frames = read_static_images(source_paths)
source_face = get_average_face(source_frames)
target_vision_frame = read_static_image(target_path)
output_vision_frame = process_frame(
{
'reference_faces': reference_faces,
'source_face': source_face,
'target_vision_frame': target_vision_frame
})
write_image(output_path, output_vision_frame)
def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
frame_processors.multi_process_frames(source_paths, temp_frame_paths, process_frames)
+241
View File
@@ -0,0 +1,241 @@
from typing import Any, List, Literal, Optional
from argparse import ArgumentParser
from time import sleep
import cv2
import numpy
import onnxruntime
import deepfuze.globals
import deepfuze.processors.frame.core as frame_processors
from deepfuze import config, process_manager, logger, wording
from deepfuze.face_analyser import clear_face_analyser
from deepfuze.content_analyser import clear_content_analyser
from deepfuze.execution import apply_execution_provider_options
from deepfuze.normalizer import normalize_output_path
from deepfuze.thread_helper import thread_lock, thread_semaphore
from deepfuze.typing import Face, VisionFrame, UpdateProgress, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
from deepfuze.common_helper import create_metavar
from deepfuze.filesystem import is_file, resolve_relative_path, is_image, is_video
from deepfuze.download import conditional_download, is_download_done
from deepfuze.vision import read_image, read_static_image, write_image, unpack_resolution
from deepfuze.processors.frame.typings import FrameColorizerInputs
from deepfuze.processors.frame import globals as frame_processors_globals
from deepfuze.processors.frame import choices as frame_processors_choices
FRAME_PROCESSOR = None
NAME = __name__.upper()
MODELS : ModelSet =\
{
'ddcolor':
{
'type': 'ddcolor',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/ddcolor.onnx',
'path': resolve_relative_path('../../../models/deepfuze/ddcolor.onnx')
},
'ddcolor_artistic':
{
'type': 'ddcolor',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/ddcolor_artistic.onnx',
'path': resolve_relative_path('../../../models/deepfuze/ddcolor_artistic.onnx')
},
'deoldify':
{
'type': 'deoldify',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/deoldify.onnx',
'path': resolve_relative_path('../../../models/deepfuze/deoldify.onnx')
},
'deoldify_artistic':
{
'type': 'deoldify',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/deoldify_artistic.onnx',
'path': resolve_relative_path('../../../models/deepfuze/deoldify_artistic.onnx')
},
'deoldify_stable':
{
'type': 'deoldify',
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/deoldify_stable.onnx',
'path': resolve_relative_path('../../../models/deepfuze/deoldify_stable.onnx')
}
}
OPTIONS : Optional[OptionsWithModel] = None
def get_frame_processor() -> Any:
global FRAME_PROCESSOR
with thread_lock():
while process_manager.is_checking():
sleep(0.5)
if FRAME_PROCESSOR is None:
model_path = get_options('model').get('path')
FRAME_PROCESSOR = onnxruntime.InferenceSession(model_path, providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
return FRAME_PROCESSOR
def clear_frame_processor() -> None:
global FRAME_PROCESSOR
FRAME_PROCESSOR = None
def get_options(key : Literal['model']) -> Any:
global OPTIONS
if OPTIONS is None:
OPTIONS =\
{
'model': MODELS[frame_processors_globals.frame_colorizer_model]
}
return OPTIONS.get(key)
def set_options(key : Literal['model'], value : Any) -> None:
global OPTIONS
OPTIONS[key] = value
def register_args(program : ArgumentParser) -> None:
program.add_argument('--frame-colorizer-model', help = wording.get('help.frame_colorizer_model'), default = config.get_str_value('frame_processors.frame_colorizer_model', 'ddcolor'), choices = frame_processors_choices.frame_colorizer_models)
program.add_argument('--frame-colorizer-blend', help = wording.get('help.frame_colorizer_blend'), type = int, default = config.get_int_value('frame_processors.frame_colorizer_blend', '100'), choices = frame_processors_choices.frame_colorizer_blend_range, metavar = create_metavar(frame_processors_choices.frame_colorizer_blend_range))
program.add_argument('--frame-colorizer-size', help = wording.get('help.frame_colorizer_size'), type = str, default = config.get_str_value('frame_processors.frame_colorizer_size', '256x256'), choices = frame_processors_choices.frame_colorizer_sizes)
def apply_args(program : ArgumentParser) -> None:
args = program.parse_args()
frame_processors_globals.frame_colorizer_model = args.frame_colorizer_model
frame_processors_globals.frame_colorizer_blend = args.frame_colorizer_blend
frame_processors_globals.frame_colorizer_size = args.frame_colorizer_size
def pre_check() -> bool:
download_directory_path = resolve_relative_path('../../../models/deepfuze')
model_url = get_options('model').get('url')
model_path = get_options('model').get('path')
if not deepfuze.globals.skip_download:
process_manager.check()
conditional_download(download_directory_path, [ model_url ])
process_manager.end()
return is_file(model_path)
def post_check() -> bool:
model_url = get_options('model').get('url')
model_path = get_options('model').get('path')
if not deepfuze.globals.skip_download and not is_download_done(model_url, model_path):
logger.error(wording.get('model_download_not_done') + wording.get('exclamation_mark'), NAME)
return False
if not is_file(model_path):
logger.error(wording.get('model_file_not_present') + wording.get('exclamation_mark'), NAME)
return False
return True
def pre_process(mode : ProcessMode) -> bool:
if mode in [ 'output', 'preview' ] and not is_image(deepfuze.globals.target_path) and not is_video(deepfuze.globals.target_path):
logger.error(wording.get('select_image_or_video_target') + wording.get('exclamation_mark'), NAME)
return False
if mode == 'output' and not normalize_output_path(deepfuze.globals.target_path, deepfuze.globals.output_path):
logger.error(wording.get('select_file_or_directory_output') + wording.get('exclamation_mark'), NAME)
return False
return True
def post_process() -> None:
read_static_image.cache_clear()
if deepfuze.globals.video_memory_strategy == 'strict' or deepfuze.globals.video_memory_strategy == 'moderate':
clear_frame_processor()
if deepfuze.globals.video_memory_strategy == 'strict':
clear_face_analyser()
clear_content_analyser()
def colorize_frame(temp_vision_frame : VisionFrame) -> VisionFrame:
frame_processor = get_frame_processor()
prepare_vision_frame = prepare_temp_frame(temp_vision_frame)
with thread_semaphore():
color_vision_frame = frame_processor.run(None,
{
frame_processor.get_inputs()[0].name: prepare_vision_frame
})[0][0]
color_vision_frame = merge_color_frame(temp_vision_frame, color_vision_frame)
color_vision_frame = blend_frame(temp_vision_frame, color_vision_frame)
return color_vision_frame
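# ddcolor keeps only the LAB lightness channel (ab zeroed); deoldify gets the plain grayscale frame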
def prepare_temp_frame(temp_vision_frame : VisionFrame) -> VisionFrame:
model_size = unpack_resolution(frame_processors_globals.frame_colorizer_size)
model_type = get_options('model').get('type')
temp_vision_frame = cv2.cvtColor(temp_vision_frame, cv2.COLOR_BGR2GRAY)
temp_vision_frame = cv2.cvtColor(temp_vision_frame, cv2.COLOR_GRAY2RGB)
if model_type == 'ddcolor':
temp_vision_frame = (temp_vision_frame / 255.0).astype(numpy.float32)
temp_vision_frame = cv2.cvtColor(temp_vision_frame, cv2.COLOR_RGB2LAB)[:, :, :1]
temp_vision_frame = numpy.concatenate((temp_vision_frame, numpy.zeros_like(temp_vision_frame), numpy.zeros_like(temp_vision_frame)), axis = -1)
temp_vision_frame = cv2.cvtColor(temp_vision_frame, cv2.COLOR_LAB2RGB)
temp_vision_frame = cv2.resize(temp_vision_frame, model_size)
temp_vision_frame = temp_vision_frame.transpose((2, 0, 1))
temp_vision_frame = numpy.expand_dims(temp_vision_frame, axis = 0).astype(numpy.float32)
return temp_vision_frame
def merge_color_frame(temp_vision_frame : VisionFrame, color_vision_frame : VisionFrame) -> VisionFrame:
model_type = get_options('model').get('type')
color_vision_frame = color_vision_frame.transpose(1, 2, 0)
color_vision_frame = cv2.resize(color_vision_frame, (temp_vision_frame.shape[1], temp_vision_frame.shape[0]))
if model_type == 'ddcolor':
temp_vision_frame = (temp_vision_frame / 255.0).astype(numpy.float32)
temp_vision_frame = cv2.cvtColor(temp_vision_frame, cv2.COLOR_BGR2LAB)[:, :, :1]
color_vision_frame = numpy.concatenate((temp_vision_frame, color_vision_frame), axis = -1)
color_vision_frame = cv2.cvtColor(color_vision_frame, cv2.COLOR_LAB2BGR)
color_vision_frame = (color_vision_frame * 255.0).round().astype(numpy.uint8)
if model_type == 'deoldify':
temp_blue_channel, _, _ = cv2.split(temp_vision_frame)
color_vision_frame = cv2.cvtColor(color_vision_frame, cv2.COLOR_BGR2RGB).astype(numpy.uint8)
color_vision_frame = cv2.cvtColor(color_vision_frame, cv2.COLOR_BGR2LAB)
_, color_green_channel, color_red_channel = cv2.split(color_vision_frame)
color_vision_frame = cv2.merge((temp_blue_channel, color_green_channel, color_red_channel))
color_vision_frame = cv2.cvtColor(color_vision_frame, cv2.COLOR_LAB2BGR)
return color_vision_frame
def blend_frame(temp_vision_frame : VisionFrame, paste_vision_frame : VisionFrame) -> VisionFrame:
frame_colorizer_blend = 1 - (frame_processors_globals.frame_colorizer_blend / 100)
temp_vision_frame = cv2.addWeighted(temp_vision_frame, frame_colorizer_blend, paste_vision_frame, 1 - frame_colorizer_blend, 0)
return temp_vision_frame
def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
pass
def process_frame(inputs : FrameColorizerInputs) -> VisionFrame:
target_vision_frame = inputs.get('target_vision_frame')
return colorize_frame(target_vision_frame)
def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProgress) -> None:
for queue_payload in process_manager.manage(queue_payloads):
target_vision_path = queue_payload['frame_path']
target_vision_frame = read_image(target_vision_path)
output_vision_frame = process_frame(
{
'target_vision_frame': target_vision_frame
})
write_image(target_vision_path, output_vision_frame)
update_progress(1)
def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
target_vision_frame = read_static_image(target_path)
output_vision_frame = process_frame(
{
'target_vision_frame': target_vision_frame
})
write_image(output_path, output_vision_frame)
def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
frame_processors.multi_process_frames(None, temp_frame_paths, process_frames)
+263
View File
@@ -0,0 +1,263 @@
from typing import Any, List, Literal, Optional
from argparse import ArgumentParser
from time import sleep
import cv2
import numpy
import onnxruntime
import deepfuze.globals
import deepfuze.processors.frame.core as frame_processors
from deepfuze import config, process_manager, logger, wording
from deepfuze.face_analyser import clear_face_analyser
from deepfuze.content_analyser import clear_content_analyser
from deepfuze.execution import apply_execution_provider_options
from deepfuze.normalizer import normalize_output_path
from deepfuze.thread_helper import thread_lock, conditional_thread_semaphore
from deepfuze.typing import Face, VisionFrame, UpdateProgress, ProcessMode, ModelSet, OptionsWithModel, QueuePayload
from deepfuze.common_helper import create_metavar
from deepfuze.filesystem import is_file, resolve_relative_path, is_image, is_video
from deepfuze.download import conditional_download, is_download_done
from deepfuze.vision import read_image, read_static_image, write_image, merge_tile_frames, create_tile_frames
from deepfuze.processors.frame.typings import FrameEnhancerInputs
from deepfuze.processors.frame import globals as frame_processors_globals
from deepfuze.processors.frame import choices as frame_processors_choices
FRAME_PROCESSOR = None
NAME = __name__.upper()
# 'size' is the tile geometry passed to create_tile_frames / merge_tile_frames,
# 'scale' the upscale factor of the model
MODELS : ModelSet =\
{
    'clear_reality_x4':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/clear_reality_x4.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/clear_reality_x4.onnx'),
        'size': (128, 8, 4),
        'scale': 4
    },
    'lsdir_x4':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/lsdir_x4.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/lsdir_x4.onnx'),
        'size': (128, 8, 4),
        'scale': 4
    },
    'nomos8k_sc_x4':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/nomos8k_sc_x4.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/nomos8k_sc_x4.onnx'),
        'size': (128, 8, 4),
        'scale': 4
    },
    'real_esrgan_x2':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_esrgan_x2.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/real_esrgan_x2.onnx'),
        'size': (256, 16, 8),
        'scale': 2
    },
    'real_esrgan_x2_fp16':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_esrgan_x2_fp16.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/real_esrgan_x2_fp16.onnx'),
        'size': (256, 16, 8),
        'scale': 2
    },
    'real_esrgan_x4':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_esrgan_x4.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/real_esrgan_x4.onnx'),
        'size': (256, 16, 8),
        'scale': 4
    },
    'real_esrgan_x4_fp16':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_esrgan_x4_fp16.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/real_esrgan_x4_fp16.onnx'),
        'size': (256, 16, 8),
        'scale': 4
    },
    'real_hatgan_x4':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/real_hatgan_x4.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/real_hatgan_x4.onnx'),
        'size': (256, 16, 8),
        'scale': 4
    },
    'span_kendata_x4':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/span_kendata_x4.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/span_kendata_x4.onnx'),
        'size': (128, 8, 4),
        'scale': 4
    },
    'ultra_sharp_x4':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/ultra_sharp_x4.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/ultra_sharp_x4.onnx'),
        'size': (128, 8, 4),
        'scale': 4
    }
}
OPTIONS : Optional[OptionsWithModel] = None
def get_frame_processor() -> Any:
    global FRAME_PROCESSOR

    with thread_lock():
        while process_manager.is_checking():
            sleep(0.5)
        if FRAME_PROCESSOR is None:
            model_path = get_options('model').get('path')
            FRAME_PROCESSOR = onnxruntime.InferenceSession(model_path, providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
    return FRAME_PROCESSOR

def clear_frame_processor() -> None:
    global FRAME_PROCESSOR

    FRAME_PROCESSOR = None

def get_options(key : Literal['model']) -> Any:
    global OPTIONS

    if OPTIONS is None:
        OPTIONS =\
        {
            'model': MODELS[frame_processors_globals.frame_enhancer_model]
        }
    return OPTIONS.get(key)

def set_options(key : Literal['model'], value : Any) -> None:
    global OPTIONS

    OPTIONS[key] = value

def register_args(program : ArgumentParser) -> None:
    program.add_argument('--frame-enhancer-model', help = wording.get('help.frame_enhancer_model'), default = config.get_str_value('frame_processors.frame_enhancer_model', 'span_kendata_x4'), choices = frame_processors_choices.frame_enhancer_models)
    program.add_argument('--frame-enhancer-blend', help = wording.get('help.frame_enhancer_blend'), type = int, default = config.get_int_value('frame_processors.frame_enhancer_blend', '80'), choices = frame_processors_choices.frame_enhancer_blend_range, metavar = create_metavar(frame_processors_choices.frame_enhancer_blend_range))

def apply_args(program : ArgumentParser) -> None:
    args = program.parse_args()
    frame_processors_globals.frame_enhancer_model = args.frame_enhancer_model
    frame_processors_globals.frame_enhancer_blend = args.frame_enhancer_blend

def pre_check() -> bool:
    download_directory_path = resolve_relative_path('../../../models/deepfuze')
    model_url = get_options('model').get('url')
    model_path = get_options('model').get('path')

    if not deepfuze.globals.skip_download:
        process_manager.check()
        conditional_download(download_directory_path, [ model_url ])
        process_manager.end()
    return is_file(model_path)

def post_check() -> bool:
    model_url = get_options('model').get('url')
    model_path = get_options('model').get('path')

    if not deepfuze.globals.skip_download and not is_download_done(model_url, model_path):
        logger.error(wording.get('model_download_not_done') + wording.get('exclamation_mark'), NAME)
        return False
    if not is_file(model_path):
        logger.error(wording.get('model_file_not_present') + wording.get('exclamation_mark'), NAME)
        return False
    return True

def pre_process(mode : ProcessMode) -> bool:
    if mode in [ 'output', 'preview' ] and not is_image(deepfuze.globals.target_path) and not is_video(deepfuze.globals.target_path):
        logger.error(wording.get('select_image_or_video_target') + wording.get('exclamation_mark'), NAME)
        return False
    if mode == 'output' and not normalize_output_path(deepfuze.globals.target_path, deepfuze.globals.output_path):
        logger.error(wording.get('select_file_or_directory_output') + wording.get('exclamation_mark'), NAME)
        return False
    return True

def post_process() -> None:
    read_static_image.cache_clear()
    if deepfuze.globals.video_memory_strategy == 'strict' or deepfuze.globals.video_memory_strategy == 'moderate':
        clear_frame_processor()
    if deepfuze.globals.video_memory_strategy == 'strict':
        clear_face_analyser()
        clear_content_analyser()

def enhance_frame(temp_vision_frame : VisionFrame) -> VisionFrame:
    frame_processor = get_frame_processor()
    size = get_options('model').get('size')
    scale = get_options('model').get('scale')
    temp_height, temp_width = temp_vision_frame.shape[:2]
    tile_vision_frames, pad_width, pad_height = create_tile_frames(temp_vision_frame, size)

    # upscale tile by tile through the ONNX session, then stitch the scaled tiles back together
    for index, tile_vision_frame in enumerate(tile_vision_frames):
        with conditional_thread_semaphore(deepfuze.globals.execution_providers):
            tile_vision_frame = frame_processor.run(None,
            {
                frame_processor.get_inputs()[0].name : prepare_tile_frame(tile_vision_frame)
            })[0]
        tile_vision_frames[index] = normalize_tile_frame(tile_vision_frame)
    merge_vision_frame = merge_tile_frames(tile_vision_frames, temp_width * scale, temp_height * scale, pad_width * scale, pad_height * scale, (size[0] * scale, size[1] * scale, size[2] * scale))
    temp_vision_frame = blend_frame(temp_vision_frame, merge_vision_frame)
    return temp_vision_frame

def prepare_tile_frame(vision_tile_frame : VisionFrame) -> VisionFrame:
    # BGR -> RGB, add a batch dimension, HWC -> NCHW and scale to [0, 1]
    vision_tile_frame = numpy.expand_dims(vision_tile_frame[:, :, ::-1], axis = 0)
    vision_tile_frame = vision_tile_frame.transpose(0, 3, 1, 2)
    vision_tile_frame = vision_tile_frame.astype(numpy.float32) / 255
    return vision_tile_frame

def normalize_tile_frame(vision_tile_frame : VisionFrame) -> VisionFrame:
    # invert the preparation: NCHW -> HWC, scale back to [0, 255] and restore BGR
    vision_tile_frame = vision_tile_frame.transpose(0, 2, 3, 1).squeeze(0) * 255
    vision_tile_frame = vision_tile_frame.clip(0, 255).astype(numpy.uint8)[:, :, ::-1]
    return vision_tile_frame

def blend_frame(temp_vision_frame : VisionFrame, merge_vision_frame : VisionFrame) -> VisionFrame:
    # a frame_enhancer_blend of 100 yields the fully enhanced frame, 0 keeps the (resized) original
    frame_enhancer_blend = 1 - (frame_processors_globals.frame_enhancer_blend / 100)
    temp_vision_frame = cv2.resize(temp_vision_frame, (merge_vision_frame.shape[1], merge_vision_frame.shape[0]))
    temp_vision_frame = cv2.addWeighted(temp_vision_frame, frame_enhancer_blend, merge_vision_frame, 1 - frame_enhancer_blend, 0)
    return temp_vision_frame

def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
    pass

def process_frame(inputs : FrameEnhancerInputs) -> VisionFrame:
    target_vision_frame = inputs.get('target_vision_frame')
    return enhance_frame(target_vision_frame)

def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProgress) -> None:
    for queue_payload in process_manager.manage(queue_payloads):
        target_vision_path = queue_payload['frame_path']
        target_vision_frame = read_image(target_vision_path)
        output_vision_frame = process_frame(
        {
            'target_vision_frame': target_vision_frame
        })
        write_image(target_vision_path, output_vision_frame)
        update_progress(1)

def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
    target_vision_frame = read_static_image(target_path)
    output_vision_frame = process_frame(
    {
        'target_vision_frame': target_vision_frame
    })
    write_image(output_path, output_vision_frame)

def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
    frame_processors.multi_process_frames(None, temp_frame_paths, process_frames)
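The layout conversions in `prepare_tile_frame` and `normalize_tile_frame` are exact inverses of each other. A standalone sketch of the round trip, using a random tile in place of real model output and assuming only `numpy`:

```python
import numpy

tile = numpy.random.randint(0, 256, (128, 128, 3), dtype = numpy.uint8) # fake BGR tile

# prepare: BGR -> RGB, HWC -> NCHW, [0, 255] -> [0, 1]
prepared = numpy.expand_dims(tile[:, :, ::-1], axis = 0).transpose(0, 3, 1, 2).astype(numpy.float32) / 255

# normalize applies the inverse, exactly as it does to the model output
restored = (prepared.transpose(0, 2, 3, 1).squeeze(0) * 255).clip(0, 255).astype(numpy.uint8)[:, :, ::-1]

# lossless up to float32 truncation (off by at most one intensity level)
assert numpy.abs(restored.astype(numpy.int16) - tile.astype(numpy.int16)).max() <= 1
```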
@@ -0,0 +1,260 @@
from typing import Any, List, Literal, Optional
from argparse import ArgumentParser
from time import sleep
import cv2
import numpy
import onnxruntime
import deepfuze.globals
import deepfuze.processors.frame.core as frame_processors
from deepfuze import config, process_manager, logger, wording
from deepfuze.execution import apply_execution_provider_options
from deepfuze.face_analyser import get_one_face, get_many_faces, find_similar_faces, clear_face_analyser
from deepfuze.face_masker import create_static_box_mask, create_occlusion_mask, create_mouth_mask, clear_face_occluder, clear_face_parser
from deepfuze.face_helper import warp_face_by_face_landmark_5, warp_face_by_bounding_box, paste_back, create_bounding_box_from_face_landmark_68
from deepfuze.face_store import get_reference_faces
from deepfuze.content_analyser import clear_content_analyser
from deepfuze.normalizer import normalize_output_path
from deepfuze.thread_helper import thread_lock, conditional_thread_semaphore
from deepfuze.typing import Face, VisionFrame, UpdateProgress, ProcessMode, ModelSet, OptionsWithModel, AudioFrame, QueuePayload
from deepfuze.filesystem import is_file, has_audio, resolve_relative_path
from deepfuze.download import conditional_download, is_download_done
from deepfuze.audio import read_static_voice, get_voice_frame, create_empty_audio_frame
from deepfuze.filesystem import is_image, is_video, filter_audio_paths
from deepfuze.common_helper import get_first
from deepfuze.vision import read_image, read_static_image, write_image, restrict_video_fps
from deepfuze.processors.frame.typings import LipSyncerInputs
from deepfuze.voice_extractor import clear_voice_extractor
from deepfuze.processors.frame import globals as frame_processors_globals
from deepfuze.processors.frame import choices as frame_processors_choices
FRAME_PROCESSOR = None
NAME = __name__.upper()
MODELS : ModelSet =\
{
    'wav2lip_gan':
    {
        'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/wav2lip_gan.onnx',
        'path': resolve_relative_path('../../../models/deepfuze/wav2lip_gan.onnx')
    }
}
OPTIONS : Optional[OptionsWithModel] = None
def get_frame_processor() -> Any:
    global FRAME_PROCESSOR

    with thread_lock():
        while process_manager.is_checking():
            sleep(0.5)
        if FRAME_PROCESSOR is None:
            model_path = get_options('model').get('path')
            FRAME_PROCESSOR = onnxruntime.InferenceSession(model_path, providers = apply_execution_provider_options(deepfuze.globals.execution_device_id, deepfuze.globals.execution_providers))
    return FRAME_PROCESSOR

def clear_frame_processor() -> None:
    global FRAME_PROCESSOR

    FRAME_PROCESSOR = None

def get_options(key : Literal['model']) -> Any:
    global OPTIONS

    if OPTIONS is None:
        OPTIONS =\
        {
            'model': MODELS[frame_processors_globals.lip_syncer_model]
        }
    return OPTIONS.get(key)

def set_options(key : Literal['model'], value : Any) -> None:
    global OPTIONS

    OPTIONS[key] = value

def register_args(program : ArgumentParser) -> None:
    program.add_argument('--lip-syncer-model', help = wording.get('help.lip_syncer_model'), default = config.get_str_value('frame_processors.lip_syncer_model', 'wav2lip_gan'), choices = frame_processors_choices.lip_syncer_models)

def apply_args(program : ArgumentParser) -> None:
    args = program.parse_args()
    frame_processors_globals.lip_syncer_model = args.lip_syncer_model

def pre_check() -> bool:
    download_directory_path = resolve_relative_path('../../../models/deepfuze')
    model_url = get_options('model').get('url')
    model_path = get_options('model').get('path')

    if not deepfuze.globals.skip_download:
        process_manager.check()
        conditional_download(download_directory_path, [ model_url ])
        process_manager.end()
    return is_file(model_path)

def post_check() -> bool:
    model_url = get_options('model').get('url')
    model_path = get_options('model').get('path')

    if not deepfuze.globals.skip_download and not is_download_done(model_url, model_path):
        logger.error(wording.get('model_download_not_done') + wording.get('exclamation_mark'), NAME)
        return False
    if not is_file(model_path):
        logger.error(wording.get('model_file_not_present') + wording.get('exclamation_mark'), NAME)
        return False
    return True

def pre_process(mode : ProcessMode) -> bool:
    if not has_audio(deepfuze.globals.source_paths):
        logger.error(wording.get('select_audio_source') + wording.get('exclamation_mark'), NAME)
        return False
    if mode in [ 'output', 'preview' ] and not is_image(deepfuze.globals.target_path) and not is_video(deepfuze.globals.target_path):
        logger.error(wording.get('select_image_or_video_target') + wording.get('exclamation_mark'), NAME)
        return False
    if mode == 'output' and not normalize_output_path(deepfuze.globals.target_path, deepfuze.globals.output_path):
        logger.error(wording.get('select_file_or_directory_output') + wording.get('exclamation_mark'), NAME)
        return False
    return True

def post_process() -> None:
    read_static_image.cache_clear()
    read_static_voice.cache_clear()
    if deepfuze.globals.video_memory_strategy == 'strict' or deepfuze.globals.video_memory_strategy == 'moderate':
        clear_frame_processor()
    if deepfuze.globals.video_memory_strategy == 'strict':
        clear_face_analyser()
        clear_content_analyser()
        clear_face_occluder()
        clear_face_parser()
        clear_voice_extractor()
def sync_lip(target_face : Face, temp_audio_frame : AudioFrame, temp_vision_frame : VisionFrame) -> VisionFrame:
    frame_processor = get_frame_processor()
    crop_mask_list = []
    temp_audio_frame = prepare_audio_frame(temp_audio_frame)
    # align the face to the ffhq template at 512x512 and map the 68 landmarks into crop space
    crop_vision_frame, affine_matrix = warp_face_by_face_landmark_5(temp_vision_frame, target_face.landmarks.get('5/68'), 'ffhq_512', (512, 512))
    face_landmark_68 = cv2.transform(target_face.landmarks.get('68').reshape(1, -1, 2), affine_matrix).reshape(-1, 2)
    bounding_box = create_bounding_box_from_face_landmark_68(face_landmark_68)
    # extend the box upward by 12.5% of its height so it covers the full mouth region
    bounding_box[1] -= numpy.abs(bounding_box[3] - bounding_box[1]) * 0.125
    mouth_mask = create_mouth_mask(face_landmark_68)
    crop_mask_list.append(mouth_mask)
    box_mask = create_static_box_mask(crop_vision_frame.shape[:2][::-1], deepfuze.globals.face_mask_blur, deepfuze.globals.face_mask_padding)
    crop_mask_list.append(box_mask)

    if 'occlusion' in deepfuze.globals.face_mask_types:
        occlusion_mask = create_occlusion_mask(crop_vision_frame)
        crop_mask_list.append(occlusion_mask)
    # the lip syncer model operates on a 96x96 mouth crop
    close_vision_frame, close_matrix = warp_face_by_bounding_box(crop_vision_frame, bounding_box, (96, 96))
    close_vision_frame = prepare_crop_frame(close_vision_frame)

    with conditional_thread_semaphore(deepfuze.globals.execution_providers):
        close_vision_frame = frame_processor.run(None,
        {
            'source': temp_audio_frame,
            'target': close_vision_frame
        })[0]
    crop_vision_frame = normalize_crop_frame(close_vision_frame)
    crop_vision_frame = cv2.warpAffine(crop_vision_frame, cv2.invertAffineTransform(close_matrix), (512, 512), borderMode = cv2.BORDER_REPLICATE)
    # intersect the mouth, box and occlusion masks and paste the synced mouth back into the frame
    crop_mask = numpy.minimum.reduce(crop_mask_list)
    paste_vision_frame = paste_back(temp_vision_frame, crop_vision_frame, crop_mask, affine_matrix)
    return paste_vision_frame

def prepare_audio_frame(temp_audio_frame : AudioFrame) -> AudioFrame:
    # floor the mel spectrogram at 1e-5, move it onto a scaled log scale and clip to [-4, 4]
    temp_audio_frame = numpy.maximum(numpy.exp(-5 * numpy.log(10)), temp_audio_frame)
    temp_audio_frame = numpy.log10(temp_audio_frame) * 1.6 + 3.2
    temp_audio_frame = temp_audio_frame.clip(-4, 4).astype(numpy.float32)
    temp_audio_frame = numpy.expand_dims(temp_audio_frame, axis = (0, 1))
    return temp_audio_frame

def prepare_crop_frame(crop_vision_frame : VisionFrame) -> VisionFrame:
    crop_vision_frame = numpy.expand_dims(crop_vision_frame, axis = 0)
    prepare_vision_frame = crop_vision_frame.copy()
    # mask the lower half of the face so the model has to inpaint the mouth region
    prepare_vision_frame[:, 48:] = 0
    # concatenate the masked frame and the reference frame into a 6 channel input
    crop_vision_frame = numpy.concatenate((prepare_vision_frame, crop_vision_frame), axis = 3)
    crop_vision_frame = crop_vision_frame.transpose(0, 3, 1, 2).astype('float32') / 255.0
    return crop_vision_frame

def normalize_crop_frame(crop_vision_frame : VisionFrame) -> VisionFrame:
    # CHW -> HWC and rescale from [0, 1] back to [0, 255]
    crop_vision_frame = crop_vision_frame[0].transpose(1, 2, 0)
    crop_vision_frame = crop_vision_frame.clip(0, 1) * 255
    crop_vision_frame = crop_vision_frame.astype(numpy.uint8)
    return crop_vision_frame
def get_reference_frame(source_face : Face, target_face : Face, temp_vision_frame : VisionFrame) -> VisionFrame:
    pass

def process_frame(inputs : LipSyncerInputs) -> VisionFrame:
    reference_faces = inputs.get('reference_faces')
    source_audio_frame = inputs.get('source_audio_frame')
    target_vision_frame = inputs.get('target_vision_frame')

    if deepfuze.globals.face_selector_mode == 'many':
        many_faces = get_many_faces(target_vision_frame)
        if many_faces:
            for target_face in many_faces:
                target_vision_frame = sync_lip(target_face, source_audio_frame, target_vision_frame)
    if deepfuze.globals.face_selector_mode == 'one':
        target_face = get_one_face(target_vision_frame)
        if target_face:
            target_vision_frame = sync_lip(target_face, source_audio_frame, target_vision_frame)
    if deepfuze.globals.face_selector_mode == 'reference':
        similar_faces = find_similar_faces(reference_faces, target_vision_frame, deepfuze.globals.reference_face_distance)
        if similar_faces:
            for similar_face in similar_faces:
                target_vision_frame = sync_lip(similar_face, source_audio_frame, target_vision_frame)
    return target_vision_frame

def process_frames(source_paths : List[str], queue_payloads : List[QueuePayload], update_progress : UpdateProgress) -> None:
    reference_faces = get_reference_faces() if 'reference' in deepfuze.globals.face_selector_mode else None
    source_audio_path = get_first(filter_audio_paths(source_paths))
    temp_video_fps = restrict_video_fps(deepfuze.globals.target_path, deepfuze.globals.output_video_fps)

    for queue_payload in process_manager.manage(queue_payloads):
        frame_number = queue_payload['frame_number']
        target_vision_path = queue_payload['frame_path']
        source_audio_frame = get_voice_frame(source_audio_path, temp_video_fps, frame_number)
        if not numpy.any(source_audio_frame):
            source_audio_frame = create_empty_audio_frame()
        target_vision_frame = read_image(target_vision_path)
        output_vision_frame = process_frame(
        {
            'reference_faces': reference_faces,
            'source_audio_frame': source_audio_frame,
            'target_vision_frame': target_vision_frame
        })
        write_image(target_vision_path, output_vision_frame)
        update_progress(1)

def process_image(source_paths : List[str], target_path : str, output_path : str) -> None:
    reference_faces = get_reference_faces() if 'reference' in deepfuze.globals.face_selector_mode else None
    source_audio_frame = create_empty_audio_frame()
    target_vision_frame = read_static_image(target_path)
    output_vision_frame = process_frame(
    {
        'reference_faces': reference_faces,
        'source_audio_frame': source_audio_frame,
        'target_vision_frame': target_vision_frame
    })
    write_image(output_path, output_vision_frame)

def process_video(source_paths : List[str], temp_frame_paths : List[str]) -> None:
    source_audio_paths = filter_audio_paths(deepfuze.globals.source_paths)
    temp_video_fps = restrict_video_fps(deepfuze.globals.target_path, deepfuze.globals.output_video_fps)

    for source_audio_path in source_audio_paths:
        read_static_voice(source_audio_path, temp_video_fps)
    frame_processors.multi_process_frames(source_paths, temp_frame_paths, process_frames)
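The scaling in `prepare_audio_frame` floors the mel spectrogram at 1e-5 (which is what `exp(-5 * ln 10)` evaluates to) and maps it onto roughly [-4, 4] before it reaches the model. A quick numeric check with made-up mel magnitudes, assuming only `numpy`:

```python
import numpy

mel = numpy.array([ 1e-7, 1e-5, 1e-2, 1.0 ]) # hypothetical mel magnitudes
mel = numpy.maximum(numpy.exp(-5 * numpy.log(10)), mel) # floor at 1e-5
scaled = (numpy.log10(mel) * 1.6 + 3.2).clip(-4, 4)
print(scaled) # [-4. -4. 0. 3.2] -> values at or below the floor are clipped to -4
```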
@@ -0,0 +1,41 @@
from typing import Literal, TypedDict
from deepfuze.typing import Face, FaceSet, AudioFrame, VisionFrame
FaceDebuggerItem = Literal['bounding-box', 'face-landmark-5', 'face-landmark-5/68', 'face-landmark-68', 'face-landmark-68/5', 'face-mask', 'face-detector-score', 'face-landmarker-score', 'age', 'gender']
FaceEnhancerModel = Literal['codeformer', 'gfpgan_1.2', 'gfpgan_1.3', 'gfpgan_1.4', 'gpen_bfr_256', 'gpen_bfr_512', 'gpen_bfr_1024', 'gpen_bfr_2048', 'restoreformer_plus_plus']
FaceSwapperModel = Literal['blendswap_256', 'inswapper_128', 'inswapper_128_fp16', 'simswap_256', 'simswap_512_unofficial', 'uniface_256']
FrameColorizerModel = Literal['ddcolor', 'ddcolor_artistic', 'deoldify', 'deoldify_artistic', 'deoldify_stable']
FrameEnhancerModel = Literal['clear_reality_x4', 'lsdir_x4', 'nomos8k_sc_x4', 'real_esrgan_x2', 'real_esrgan_x2_fp16', 'real_esrgan_x4', 'real_esrgan_x4_fp16', 'real_hatgan_x4', 'span_kendata_x4', 'ultra_sharp_x4']
LipSyncerModel = Literal['wav2lip_gan']
FaceDebuggerInputs = TypedDict('FaceDebuggerInputs',
{
    'reference_faces' : FaceSet,
    'target_vision_frame' : VisionFrame
})
FaceEnhancerInputs = TypedDict('FaceEnhancerInputs',
{
    'reference_faces' : FaceSet,
    'target_vision_frame' : VisionFrame
})
FaceSwapperInputs = TypedDict('FaceSwapperInputs',
{
    'reference_faces' : FaceSet,
    'source_face' : Face,
    'target_vision_frame' : VisionFrame
})
FrameColorizerInputs = TypedDict('FrameColorizerInputs',
{
    'target_vision_frame' : VisionFrame
})
FrameEnhancerInputs = TypedDict('FrameEnhancerInputs',
{
    'target_vision_frame' : VisionFrame
})
LipSyncerInputs = TypedDict('LipSyncerInputs',
{
    'reference_faces' : FaceSet,
    'source_audio_frame' : AudioFrame,
    'target_vision_frame' : VisionFrame
})
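These TypedDicts constrain keys only at type-check time; at runtime the inputs are plain dicts. A sketch of constructing `LipSyncerInputs` with placeholder values (the shapes are illustrative assumptions, and the import assumes the package is on the path):

```python
import numpy
from deepfuze.processors.frame.typings import LipSyncerInputs

# placeholder shapes for illustration only; real callers pass detected faces,
# a voice mel frame and a decoded video frame
inputs : LipSyncerInputs =\
{
    'reference_faces': {},
    'source_audio_frame': numpy.zeros((80, 16), dtype = numpy.float32),
    'target_vision_frame': numpy.zeros((720, 1280, 3), dtype = numpy.uint8)
}
```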
@@ -0,0 +1,51 @@
from typing import Any, Dict
import numpy
import deepfuze.globals
from deepfuze.face_store import FACE_STORE
from deepfuze.typing import FaceSet
from deepfuze import logger
def create_statistics(static_faces : FaceSet) -> Dict[str, Any]:
    face_detector_score_list = []
    face_landmarker_score_list = []
    statistics =\
    {
        'min_face_detector_score': 0,
        'min_face_landmarker_score': 0,
        'max_face_detector_score': 0,
        'max_face_landmarker_score': 0,
        'average_face_detector_score': 0,
        'average_face_landmarker_score': 0,
        'total_face_landmark_5_fallbacks': 0,
        'total_frames_with_faces': 0,
        'total_faces': 0
    }

    for faces in static_faces.values():
        statistics['total_frames_with_faces'] = statistics.get('total_frames_with_faces') + 1
        for face in faces:
            statistics['total_faces'] = statistics.get('total_faces') + 1
            face_detector_score_list.append(face.scores.get('detector'))
            face_landmarker_score_list.append(face.scores.get('landmarker'))
            if numpy.array_equal(face.landmarks.get('5'), face.landmarks.get('5/68')):
                statistics['total_face_landmark_5_fallbacks'] = statistics.get('total_face_landmark_5_fallbacks') + 1

    if face_detector_score_list:
        statistics['min_face_detector_score'] = round(min(face_detector_score_list), 2)
        statistics['max_face_detector_score'] = round(max(face_detector_score_list), 2)
        statistics['average_face_detector_score'] = round(numpy.mean(face_detector_score_list), 2)
    if face_landmarker_score_list:
        statistics['min_face_landmarker_score'] = round(min(face_landmarker_score_list), 2)
        statistics['max_face_landmarker_score'] = round(max(face_landmarker_score_list), 2)
        statistics['average_face_landmarker_score'] = round(numpy.mean(face_landmarker_score_list), 2)
    return statistics

def conditional_log_statistics() -> None:
    if deepfuze.globals.log_level == 'debug':
        statistics = create_statistics(FACE_STORE.get('static_faces'))

        for name, value in statistics.items():
            logger.debug(str(name) + ': ' + str(value), __name__.upper())
@@ -0,0 +1,21 @@
from typing import List, Union, ContextManager
import threading
from contextlib import nullcontext
THREAD_LOCK : threading.Lock = threading.Lock()
THREAD_SEMAPHORE : threading.Semaphore = threading.Semaphore()
NULL_CONTEXT : ContextManager[None] = nullcontext()
def thread_lock() -> threading.Lock:
    return THREAD_LOCK

def thread_semaphore() -> threading.Semaphore:
    return THREAD_SEMAPHORE

def conditional_thread_semaphore(execution_providers : List[str]) -> Union[threading.Semaphore, ContextManager[None]]:
    if 'DmlExecutionProvider' in execution_providers:
        return THREAD_SEMAPHORE
    return NULL_CONTEXT
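`conditional_thread_semaphore` serializes inference only when the DirectML execution provider is active, presumably because concurrent `run` calls are unreliable on that provider; every other provider gets a no-op context. A usage sketch with a hypothetical provider list:

```python
from deepfuze.thread_helper import conditional_thread_semaphore

execution_providers = [ 'DmlExecutionProvider' ] # hypothetical configuration

with conditional_thread_semaphore(execution_providers):
    pass # one ONNX inference would run here, serialized only under DirectML
```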
@@ -0,0 +1,122 @@
from typing import Any, Literal, Callable, List, Tuple, Dict, TypedDict
from collections import namedtuple
import numpy
BoundingBox = numpy.ndarray[Any, Any]
FaceLandmark5 = numpy.ndarray[Any, Any]
FaceLandmark68 = numpy.ndarray[Any, Any]
FaceLandmarkSet = TypedDict('FaceLandmarkSet',
{
    '5' : FaceLandmark5, #type:ignore[valid-type]
    '5/68' : FaceLandmark5, #type:ignore[valid-type]
    '68' : FaceLandmark68, #type:ignore[valid-type]
    '68/5' : FaceLandmark68 #type:ignore[valid-type]
})
Score = float
FaceScoreSet = TypedDict('FaceScoreSet',
{
    'detector' : Score,
    'landmarker' : Score
})
Embedding = numpy.ndarray[Any, Any]
Face = namedtuple('Face',
[
    'bounding_box',
    'landmarks',
    'scores',
    'embedding',
    'normed_embedding',
    'gender',
    'age'
])
FaceSet = Dict[str, List[Face]]
FaceStore = TypedDict('FaceStore',
{
    'static_faces' : FaceSet,
    'reference_faces': FaceSet
})
VisionFrame = numpy.ndarray[Any, Any]
Mask = numpy.ndarray[Any, Any]
Matrix = numpy.ndarray[Any, Any]
Translation = numpy.ndarray[Any, Any]
AudioBuffer = bytes
Audio = numpy.ndarray[Any, Any]
AudioChunk = numpy.ndarray[Any, Any]
AudioFrame = numpy.ndarray[Any, Any]
Spectrogram = numpy.ndarray[Any, Any]
MelFilterBank = numpy.ndarray[Any, Any]
Fps = float
Padding = Tuple[int, int, int, int]
Resolution = Tuple[int, int]
ProcessState = Literal['checking', 'processing', 'stopping', 'pending']
QueuePayload = TypedDict('QueuePayload',
{
    'frame_number' : int,
    'frame_path' : str
})
UpdateProgress = Callable[[int], None]
ProcessFrames = Callable[[List[str], List[QueuePayload], UpdateProgress], None]
WarpTemplate = Literal['arcface_112_v1', 'arcface_112_v2', 'arcface_128_v2', 'ffhq_512']
WarpTemplateSet = Dict[WarpTemplate, numpy.ndarray[Any, Any]]
ProcessMode = Literal['output', 'preview', 'stream']
LogLevel = Literal['error', 'warn', 'info', 'debug']
VideoMemoryStrategy = Literal['strict', 'moderate', 'tolerant']
FaceSelectorMode = Literal['many', 'one', 'reference']
FaceAnalyserOrder = Literal['left-right', 'right-left', 'top-bottom', 'bottom-top', 'small-large', 'large-small', 'best-worst', 'worst-best']
FaceAnalyserAge = Literal['child', 'teen', 'adult', 'senior']
FaceAnalyserGender = Literal['female', 'male']
FaceDetectorModel = Literal['many', 'retinaface', 'scrfd', 'yoloface', 'yunet']
FaceDetectorTweak = Literal['low-luminance', 'high-luminance']
FaceRecognizerModel = Literal['arcface_blendswap', 'arcface_inswapper', 'arcface_simswap', 'arcface_uniface']
FaceMaskType = Literal['box', 'occlusion', 'region']
FaceMaskRegion = Literal['skin', 'left-eyebrow', 'right-eyebrow', 'left-eye', 'right-eye', 'glasses', 'nose', 'mouth', 'upper-lip', 'lower-lip']
TempFrameFormat = Literal['jpg', 'png', 'bmp']
OutputVideoEncoder = Literal['libx264', 'libx265', 'libvpx-vp9', 'h264_nvenc', 'hevc_nvenc', 'h264_amf', 'hevc_amf']
OutputVideoPreset = Literal['ultrafast', 'superfast', 'veryfast', 'faster', 'fast', 'medium', 'slow', 'slower', 'veryslow']
ModelValue = Dict[str, Any]
ModelSet = Dict[str, ModelValue]
OptionsWithModel = TypedDict('OptionsWithModel',
{
    'model' : ModelValue
})
ValueAndUnit = TypedDict('ValueAndUnit',
{
    'value' : str,
    'unit' : str
})
ExecutionDeviceFramework = TypedDict('ExecutionDeviceFramework',
{
    'name' : str,
    'version' : str
})
ExecutionDeviceProduct = TypedDict('ExecutionDeviceProduct',
{
    'vendor' : str,
    'name' : str
})
ExecutionDeviceVideoMemory = TypedDict('ExecutionDeviceVideoMemory',
{
    'total' : ValueAndUnit,
    'free' : ValueAndUnit
})
ExecutionDeviceUtilization = TypedDict('ExecutionDeviceUtilization',
{
    'gpu' : ValueAndUnit,
    'memory' : ValueAndUnit
})
ExecutionDevice = TypedDict('ExecutionDevice',
{
    'driver_version' : str,
    'framework' : ExecutionDeviceFramework,
    'product' : ExecutionDeviceProduct,
    'video_memory' : ExecutionDeviceVideoMemory,
    'utilization' : ExecutionDeviceUtilization
})
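A sketch of how `QueuePayload` and `UpdateProgress` fit together, with hypothetical frame paths; the real queue is assembled by the frame processor core:

```python
from typing import List
from deepfuze.typing import QueuePayload, UpdateProgress

# hypothetical frame paths standing in for extracted video frames
queue_payloads : List[QueuePayload] =\
[
    { 'frame_number': 0, 'frame_path': '/tmp/frames/0000.png' },
    { 'frame_number': 1, 'frame_path': '/tmp/frames/0001.png' }
]

def update_progress(amount : int) -> None:
    print('processed ' + str(amount) + ' frame(s)') # stand-in for the real progress bar

progress : UpdateProgress = update_progress # matches Callable[[int], None]
progress(len(queue_payloads))
```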
Binary files not shown.
@@ -0,0 +1,7 @@
:root:root:root button:not([class])
{
    border-radius: 0.375rem;
    float: left;
    overflow: hidden;
    width: 100%;
}
@@ -0,0 +1,58 @@
:root:root:root input[type="number"]
{
    max-width: 6rem;
}

:root:root:root [type="checkbox"],
:root:root:root [type="radio"]
{
    border-radius: 50%;
    height: 1.125rem;
    width: 1.125rem;
}

:root:root:root input[type="range"]
{
    height: 0.5rem;
}

:root:root:root input[type="range"]::-moz-range-thumb,
:root:root:root input[type="range"]::-webkit-slider-thumb
{
    background: var(--neutral-300);
    border: unset;
    border-radius: 50%;
    height: 1.125rem;
    width: 1.125rem;
}

:root:root:root input[type="range"]::-webkit-slider-thumb
{
    margin-top: 0.375rem;
}

:root:root:root .grid-wrap.fixed-height
{
    min-height: unset;
}

:root:root:root .grid-container
{
    grid-auto-rows: minmax(5em, 1fr);
    grid-template-columns: repeat(var(--grid-cols), minmax(5em, 1fr));
    grid-template-rows: repeat(var(--grid-rows), minmax(5em, 1fr));
}

:root:root:root .tab-nav > button
{
    border: unset;
    border-bottom: 0.125rem solid transparent;
    font-size: 1.125em;
    margin: 0.5rem 1rem;
    padding: 0;
}

:root:root:root .tab-nav > button.selected
{
    border-bottom: 0.125rem solid;
}
@@ -0,0 +1,7 @@
from typing import List
from deepfuze.uis.typing import WebcamMode
common_options : List[str] = [ 'keep-temp', 'skip-audio', 'skip-download' ]
webcam_modes : List[WebcamMode] = [ 'inline', 'udp', 'v4l2' ]
webcam_resolutions : List[str] = [ '320x240', '640x480', '800x600', '1024x768', '1280x720', '1280x960', '1920x1080', '2560x1440', '3840x2160' ]
Some files were not shown because too many files have changed in this diff.