Are there parameters for making the source facial details stronger? #14
First of all, great work. Are there parameters that can keep more of the details from the source image, or is this something that needs to be trained on? For example, sometimes there might be key details missing from the eyes, or there are other features (piercings, tattoos, moles, and so on) that I may want to keep. Thanks!
Thank you for carefully studying our project. We do not have parameters to adjust the degree of facial feature retention, because this project is our open-source code for ACMMM2020. From a research perspective, our goal at the time was that features such as tattoos and piercings should be removed, because they are not intrinsic to the person's identity. Of course we will consider your suggestions, and maybe we will add an interface to control the degree of feature retention in version 2, which we will release in the future. This should be cool, many thanks~
If you have any other suggestions, please feel free to ask questions, and we will consider your comments and make improvements.
:)
Thanks for your reply! I'm honestly impressed that you can throw a full video into the script without any pre-production editing, so great work. I have some suggestions that should be easy to implement.
The bounding box is very visible after inference. Filling the face based on facial landmarks, then using that as a blurred matte should solve that issue.
An option to perform super-resolution on the face during or after inference. An example would be something like Face-SPARNet.
Support for the side of faces, or more extreme angles. I'm sure this is due to how the model was trained and not the actual project itself.
All of these things I can do myself, but I have to use other projects to implement them. It would be nice to have them native to what the script uses like ONNX and insightface.
Cool, I will try to add these modules to see if there is any improvement.
Great. Another idea for a problem I've just realized. If the face is tilted too far, there are a lot of jitters on the swapped face. For example, if the head goes from a 0 degree angle to a 90 degree one, the facial alignment struggles a bit. To fix this, you can set a parameter by getting the rotation matrix from the top of the head to the bottom. It would be something like this in steps.
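The steps themselves got lost here, but the gist of the idea can be sketched in code. This is only a hypothetical sketch with made-up helper names: it estimates the roll angle from two landmarks and builds the same 2x3 affine matrix that cv2.getRotationMatrix2D would return, so the frame could be counter-rotated before alignment and rotated back afterwards.

```python
import numpy as np

def roll_angle(left_eye, right_eye):
    """Head roll in degrees, estimated from two landmark points
    (e.g. the eye centres); 0 means the eyes are level."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return float(np.degrees(np.arctan2(dy, dx)))

def rotation_matrix(center, angle_deg):
    """2x3 affine matrix rotating by angle_deg around center,
    following the cv2.getRotationMatrix2D convention (scale = 1)."""
    a = np.radians(angle_deg)
    cos, sin = np.cos(a), np.sin(a)
    cx, cy = center
    return np.array([
        [cos,  sin, (1 - cos) * cx - sin * cy],
        [-sin, cos, sin * cx + (1 - cos) * cy],
    ])
```

The frame would then be warped with this matrix (e.g. via cv2.warpAffine), swapped, and warped back with the matrix for -angle_deg.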
That should solve that issue. If this was Dlib I could probably implement this, but I'm not familiar with what you're using :).
I see, it sounds very reasonable, let me spend some time to try this idea these two days.
For info, I use an optional post-processing with GPEN [1] in one of my SimSwap-Colab notebooks:
gpen/SimSwap_images.ipynb

If you want to try GPEN on Colab, independently from SimSwap, I have a small GitHub Gist for that purpose: https://gist.github.com/woctezuma/ecfc9849fac9f8d99edd7c1d6d03758a
Based on my experiments with super-resolution, GPEN looks the best among open-source super-resolution algorithms for faces. When it works, it is great. However, one has to be careful, because it can introduce artifacts, especially if the input photos had already been processed with a super-resolution algorithm. So it really has to be optional, to avoid inadvertently stacking super-resolution processing steps, which would degrade the final result.
[1] Yang, Tao, et al. "GAN Prior Embedded Network for Blind Face Restoration in the Wild." arXiv preprint arXiv:2105.06070 (2021).
(code and paper)
@woctezuma Wow, that works really well. Thanks for the tip!
@NNNNAI I actually tried implementing the video rotation idea, but that didn't solve the issue. The performance comes down to the function here. The alignment algorithm seems to make the face crop shake a lot, resulting in poor alignment on some more extreme angles.
It would also be good to have the mask saved as a separate output, to bring the footage into After Effects to edit.
After playing with GPEN and DFNet, GPEN gives much better quality results in my opinion, based on some tests.
Like the suggestion above, but I would like to focus on the local machine (Windows 10 + Anaconda).
It would be really nice and useful to have an optional extra command to run GPEN as a post-processing step:
1️⃣ - GPEN should run on every frame that SimSwap generated.
2️⃣ - Then SimSwap merges everything together, hopefully with lossless quality from the source video.
I couldn't install GPEN locally, so I could only test it via Google Colab. But if this could be merged into SimSwap as an optional command, so the user won't have to use GPEN separately (like in the current version) but can run GPEN as a post-processing step with its basic properties (scaling, and generating ONLY the final GPEN output without the other files, to speed things up and keep the focus on SimSwap's goal), that would be more than nice!
@ftaker887 you say that you can solve the problem with the bounding box after inference.
Would you share your solution/ source code for that?
You can try using this face parsing repo to get the mask, and blend the original image and the swapped image according to the mask. The results using the mask and using the bounding box are shown below. Many thanks~.


Impressive! The smooth masked bounding box looks really nice!
@NNNNAI Any chance this feature will be added to SimSwap as an extra command?
--MaskSmooth, or extra parameters or values if needed?
I will release this feature maybe within a week, because there is other work I have been busy with recently. Many thanks~.
That will be great! thank you 👍
Thanks.
I think I will wait until this feature is released.
Btw.
I've added two extra parameters to set the start and end frame (cut_in/cut_out) for inference, settable from the command line if one wants to process only part of the input video.
Video works fine so far, but I'm having problems setting the audio cut_in...
@instant-high You can check ffmpeg and moviepy, both third party libraries support audio injection.
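For example, a single ffmpeg call can cut the matching audio span out of the source and mux it onto the swapped clip. A hypothetical sketch (the helper name and flag choices are mine, not SimSwap's; ffmpeg must be on the PATH):

```python
import subprocess

def mux_cut_audio(src_video, swapped_video, out_path, cut_in, cut_out):
    """Build an ffmpeg command that copies the audio of src_video between
    cut_in and cut_out (seconds) onto the silent swapped_video."""
    cmd = [
        "ffmpeg", "-y",
        "-i", swapped_video,               # input 0: swapped clip (video only)
        "-ss", str(cut_in),                # seek applies to the NEXT input
        "-t", str(cut_out - cut_in),       # duration of the audio slice
        "-i", src_video,                   # input 1: original clip with audio
        "-map", "0:v:0", "-map", "1:a:0",  # video from input 0, audio from input 1
        "-c:v", "copy",                    # do not re-encode the video
        "-shortest",
        out_path,
    ]
    return cmd

# subprocess.run(mux_cut_audio("src.mp4", "swapped.mp4", "out.mp4", 2.0, 7.5), check=True)
```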
@NNNNAI Yes, I know... but I've never done that in Python... Most of the time I use Visual Basic for my projects.
(Some video tools, GUIs for First Order Motion and motion co-segmentation, Wav2Lip...)
@instant-high Hhhhh, I see, hope U can get it done~.
I've actually started training a model a few days ago using Self Correction for Human Parsing. It's very fast, and I believe it works without adding too many modules to SimSwap's repository.
The dataset I'm using is one that I've found here called LaPa, and would probably be best as it deals with occlusion, faces, and difficult situations.
The idea is that you would either mask it first for better alignment, or use it as a post processing step. There's another idea of using face inpainting after masking, but that's starting to get into third party plugin territory which might add complexity.
Would you like me to create a pull request when or if I get this implemented?
I don't know how to train my own model or anything complicated like that; I'm currently messing around with SimSwap's built-in models. I see some issues I would love to see solved in a future SimSwap: sometimes it can't handle some parts, unlike the example above; many times it flickers in a way where it won't recognize some frames that are similar to the ones before or after, which is weird; and sometimes it resizes the face, because of inaccurate face detection I guess.
@ftaker887 I must mention that I'm VERY impressed by how accurate the mask is in the example you just posted! :o
I currently see issues with the face behind shoulders, and sometimes with hands over the face, when someone fixes their hair for example. But with THIS example you just showed... WOW!!! It looks very accurate, with so much detail in how it masks.
@NNNNAI is there a chance this will be used for making in SimSwap future version?
If this is possible, I imagine it would fix the issues that I and others have already run into. I'm no programmer and have no idea how to do such a thing, but I hope someone here can merge in that way of masking. 🙏
@ftaker887 Wonderful~~~~~!!!!!!!!!!!!!!! But it will indeed add complexity to the original SimSwap repo by introducing a self-trained mask model. How about this: you create an individual repo named "simswap with more accurate mask" or something like that, and I will add a link to your repo on the SimSwap homepage when you get the function implemented. It all depends on you; looking forward to your feedback.
Sorry, but just another stupid question:
Messing around with ./util/reverse2original.py I found out that in line 11
swaped_img = swaped_img.cpu().detach().numpy().transpose((1, 2, 0))
is the "swapped" result image
As an absolute python beginner I've managed to blur / resize this swaped image by adding this code:
```python
for swaped_img, mat in zip(swaped_imgs, mats):
    swaped_img = swaped_img.cpu().detach().numpy().transpose((1, 2, 0))
    img_white = np.full((crop_size, crop_size), 255, dtype=float)
    swaped_img = cv2.resize(swaped_img, (256, 256))
    swaped_img = cv2.GaussianBlur(swaped_img, (19, 19), sigmaX=0, sigmaY=0, borderType=cv2.BORDER_DEFAULT)
```
Wouldn't it be possible to make a smooth bounding box there, before blending it into the original image?
I need help with this, because it's very hard to search the net for every function and every error message when something goes wrong.
Here's an example of blur and resize swaped_img

Sure, that seems like a good idea! It may take a bit of time since I have to make sure everything is neat and ready to use.
The model is almost ready (this is an early iteration). It's not perfect, but it works well when it wants to. Hopefully when the training is done it will be a bit better.
Ok. But I was thinking about a simple, fixed mask around the square image that contains the swapped face before blending it to the whole frame. ? Soft border for that square.?
@instant-high I see. But actually, I already use soft border blending in the original code. You can check it at lines 34-35 of ./util/reverse2original.py. There are some small problems with the code you provided. Firstly, you should not resize the swapped image, as this will lead to misalignment when blending the swapped image back into the whole image. Secondly, you currently smooth the whole swapped image instead of just the border, which leads to your current over-smoothed result inside the bounding box. If you are still confused, please feel free to ask~. Have a nice day~.
Resize and blur I only did to clarify what I mean. Of course that doesn't make sense.
I'll check line 34/35
But why do we see sharp edges when you already soft-blend the whole square?
Thank you
Check these images: https://github.com/neuralchen/SimSwap/issues/14#issuecomment-877135637
Or this link for an easier comparison: https://imgsli.com/NjA2MzE
You can see that there is a blur near the edges of the bounding-box, whereas the center with the face remains sharp.
Bounding-box:

The issue is less important with the mask method:
Mask:

Because our current method makes the background more or less different from the original, a soft border can only alleviate the sharp edge but cannot completely remove it. So using a mask to blend may be a better choice.
Sorry again. Maybe I don't understand the code completely. What I'm trying to achieve is a mask like this for the whole square image with the face and the background; that should work even if parts of the face, e.g. the chin, are outside the square.
Thank you for your patience
fc4b701354/util/reverse2original.py (L12)
fc4b701354/util/reverse2original.py (L27)
fc4b701354/util/reverse2original.py (L32-L35)
@instant-high I see. But if you want to achieve the goal you mentioned, you should do the blur on the image mask instead of on the swapped image. The goal you mentioned is already implemented by lines 34-35 in util/reverse2original.py; you can try to tune the kernel size for cv2.erode or the number of iterations to see if it gives better results.
I've tried. But increasing the kernel size and/or iterations only decreases the size of the "bounding box", with the same sharp edges.
It looks to me that you have managed to blur the edges on the picture shown at https://github.com/neuralchen/SimSwap/issues/14#issuecomment-877754868
Otherwise, try to figure out the mask that you want. Here is an example with Gaussian Blur, which works on Colab:
Anyway, this (figuring out how you exactly want to blur the edges) should be another issue in my opinion.
Plus, the mask approach is more potent than the bounding-box approach, as it is based on the face parts.
I finally added the following 3 lines of code to 'reverse2original.py' from the above example for blurred mask and got the desired result.
The blur kernel value should be 2 × kernel_size + 1, since GaussianBlur needs an odd kernel size.
```python
img_mask = img_white
kernel = np.ones((40, 40), np.uint8)
img_mask = cv2.erode(img_mask, kernel, iterations=1)
kernel_size = (20, 20)
blur_size = tuple(2 * i + 1 for i in kernel_size)
img_mask = cv2.GaussianBlur(img_mask, blur_size, 0)
```
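For reference, the erode-then-blur idea can be reproduced in a self-contained way. This is only a sketch with numpy-only stand-ins for cv2.erode and cv2.GaussianBlur (a shrunken box plus a separable box blur), so the names and kernels are not the ones the script uses:

```python
import numpy as np

def feather_mask(size, shrink, blur):
    """Soft square mask: a white box shrunk by `shrink` px on each side,
    then smoothed with a separable box blur of odd width `blur`."""
    mask = np.zeros((size, size), dtype=float)
    mask[shrink:size - shrink, shrink:size - shrink] = 1.0
    k = np.ones(blur) / blur
    mask = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, mask)
    mask = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, mask)
    return mask

def blend(swapped, original, mask):
    """Alpha-blend the swapped crop over the original using the soft mask."""
    return mask[..., None] * swapped + (1.0 - mask[..., None]) * original
```

The center of the mask stays at 1.0 (fully swapped) while the border ramps down to 0.0, which is what hides the bounding-box edge.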
Original script with visible bounding box:

Modified script with blurred mask:

Looking good.
@instant-high Great! It does make the results much better.
@instant-high That looks REALLY good!! 😮
@NNNNAI Any chance we'll see this soon on the next update of SimSwap?
I would like to test it as well, but I'm not a programmer, so I can only try this when it's officially included in the official .ZIP.
@AlonDan
If you run SimSwap locally on your computer, you can insert the above lines yourself using something like Windows Notepad...
Copy and paste the lines into reverse2original.py starting at line 36. Be aware of the indentation; it is the same as line 35. Then save it by overwriting the original file.
In case something goes wrong, you could make a backup first.
Don't forget to change the kernel to (40, 40) at line 34.
Thanks @instant-high! I tried it by putting in the 6 lines of code above,
and I got this error:
Without the 3 other lines (anything with the word KERNEL) it works, so I see the 20 blur gets to 40, but it seems like it also blurs the inside, so I'm not sure if it's correct. I would love to get the nice results like the example you got above :)
Do I need to install something to make it work?
Or, if possible, could you share the right file and I'll replace/overwrite mine? (I made a backup.)
TabError: inconsistent use of tabs and spaces in indentation
I don't know how to say it in English; that's the indentation error I mentioned.
Set the three lines to the left border and then put 8 spaces in front, to get them to the same position as the lines above, one by one.
It's just a problem of text formatting.
As an alternative, I could send you the modified script...
Oh! I just used TABS; now I did it manually with 8 spaces as you mentioned, and it works!
Thank you, it looks REALLY GOOD!
I hope that the next step will be more like whatever magic is done in the examples above, so we can get much more accurate masks in general. But I guess that's a totally different thing to handle and probably much more complicated... It will be AWESOME for sure!
@instant-high Your work is great. Do you mind if I upload the modified reverse2original.py in the next SimSwap update? I will make an acknowledgement for you in the "News" section.
Of course you can do that.
This is an amazing repo & this is also a fantastic enhancement. I think it's worth pointing out, since I can't see it mentioned anywhere above, that adding the extra code to reverse2original.py adds extra processing time. I assume it's the Gaussian blur adding the extra time, but I have not benchmarked it. All I know is that a clip that used to take 4 minutes now takes 6 minutes, so it's a noticeable speed decrease (worth it, in my opinion). I'm using Anaconda + Windows 10 + RTX 3070. Feel free to do your own tests or to see if there's a cheaper way of getting the same effect. Thanks again for everything :)
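One cheap option, assuming the erode and blur parameters stay fixed during a run: the feathered mask depends only on the crop size, not on the frame content, so it can be built once and reused for every frame instead of being re-eroded and re-blurred each time. A numpy-only sketch (the function name is hypothetical; the real script uses cv2 calls):

```python
from functools import lru_cache

import numpy as np

@lru_cache(maxsize=None)
def cached_soft_mask(crop_size, shrink=40, blur=41):
    """Build the frame-independent soft mask once per (size, shrink, blur).
    Treat the returned array as read-only, since it is cached."""
    m = np.zeros((crop_size, crop_size))
    m[shrink:crop_size - shrink, shrink:crop_size - shrink] = 1.0
    k = np.ones(blur) / blur  # separable box blur as a cheap feathering
    m = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, m)
    m = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, m)
    return m
```

The per-frame cost then reduces to the blend itself.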
Hey guys, I have updated SimSwap. All the example command lines in "Inference for image or video face swapping" now use the mask (better visual effects). Please check it for details. Please don't forget to go to "Preparation" to check the latest setup. Have a nice day~.
The Colab notebook is not ready yet; I will let you know when it is done.
Thank you for the updates!
I have tested --use_mask true & false, and "false" still gives better output for my use cases. use_mask = true gives a very obvious line through the chin, which is not present for use_mask = false (thanks to the new blur, mentioned in previous comments). Can anyone suggest which code to add to only blur the bottom of the changed image (ie, the chin)? I would like to use the mask but also blur the chin to remove the line. I can do this in post-processing (ie After Effects) if not.
Also, will it ever be possible to replace the full face - including the full chin?
Thank you again.
@bmc84 Could you share the picture which you found there is a very obvious line through the chin? It will help me to locate the problem, many thanks~.
You can play around with the values for the mask erode kernel and blur.
In my local installation I've added a few more parameters, like mask height and width, cut-in and duration, and detection size (640/480/360/256), which makes it much faster, depending on the size of the face in the full video frame.
I've also added a preview of the first swapped frame, to decide whether to continue swapping or not...
But to set all the parameters I've inserted SimSwap into my FOM GUI.
Hello thank you for responding so quickly. Your project is amazing!! Thank you for everything you've released.
This isn't an example with the chin (the clip I first tested had a big beard, so that was probably causing the obvious line)... but this example shows that a bounding-box line is visible (not blurred) with masking. This example was using the ironman.jpg & use_mask = True.
@bmc84 Could you send me the video, so I can test it myself and find out how to fix it? My email is nicklau26@foxmail.com. Btw, can you share the command line you are using?
Done :) I have emailed the .mp4 and the target.jpg
The command being used was
python test_video_swapspecific.py --use_mask --pic_specific_path ./demo_file/target.jpg --isTrain false --name people --Arc_path arcface_model/arcface_checkpoint.tar --pic_a_path ./demo_file/Iron_man.jpg --video_path ./demo_file/officeSpace_test.mp4 --output_path ./output/officeSpace_output.mp4 --temp_path ./temp_results
Thank you
Having a quick look at the new reverse2original, I see you've added part segmentation / a face part id list.
Does that mean I can select which parts to swap, just like in motion co-segmentation (part swap)?
That would be the greatest thing so far...
(Unfortunately I can't try it right now.)
Yes, you can do it.
The problem has been solved; make sure you get the latest version of the code. Using the mask should be better than not using the mask now.
It's definitely fixed! Looks fantastic now, great job :) Thank you.
@bmc84 If you think this repo is helpful for you, please star it. Many thanks~.
With the last update (07/19/2021) I got this error:
FileNotFoundError: [Errno 2] No such file or directory: './parsing_model/checkpoint\79999_iter.pth'
I only replaced following scripts:
../options/base-options, train_options, test_options
../util/reverse2original, videoswap
test_video_swapsingle
and new folder
parsing_model --> model and resnet.py
Or do I have to reinstall all files?
Check it here: Preparation. Download the file and place it in ./parsing_model/checkpoint.
Thank you, it seems to work now. Maybe I didn't read that carefully enough...
The new version's results are much better. Using the mask is of course way slower, but that doesn't matter...
Is '79999_iter.pth' just the same as in zllrunning's faceparsing/facemakeup? So one could change the color of, e.g., the hair?
Honestly, I am not sure whether the file from face-parsing.PyTorch is the same as in zllrunning's faceparsing/facemakeup. But you can have a try. Btw, I have updated reverse2original.py to fix a small bug; make sure you get the newest version.
That was the small bug where the soft bounding box only worked when using the mask? I've noticed that.
Line 12 in reverse2original.py lists the parts that are parsed when using the mask, I think:

```python
face_part_ids = [1, 2, 3, 4, 5, 6, 10, 12, 13] if no_neck else [1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 13, 14]
```

As I've figured out, every part in that line will be swapped. Is that correct? (I've made some short tests.)
Removing e.g. numbers 4 and 5 will not swap the eyes, I think.
So this should be the complete part list:

```python
atts = ['skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r', 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
```
1 skin (face)
2 l_brow (left brow)
3 r_brow (right brow)
4 l_eye (left eye)
5 r_eye (right eye)
6 eye_g (eyeglasses)
7 l_ear (left ear)
8 r_ear (right ear)
9 ear_r (earring)
10 nose
11 mouth
12 u_lip (upper lip)
13 l_lip (lower lip)
14 neck
15 neck_l (necklace)
16 cloth
17 hair
18 hat
@instant-high
Yes, you are correct.
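Selecting which parts get swapped then boils down to an id lookup on the parsing map. A minimal numpy sketch (the function name is mine; FACE_PART_IDS mirrors the no-neck list from reverse2original.py):

```python
import numpy as np

# no-neck variant of the id list from reverse2original.py
FACE_PART_IDS = [1, 2, 3, 4, 5, 6, 10, 12, 13]

def parts_mask(parsing, part_ids=FACE_PART_IDS):
    """Binary float mask selecting the listed part ids from a parsing map
    (an HxW array of integer labels, as produced by the parsing model)."""
    return np.isin(parsing, part_ids).astype(np.float32)
```

Dropping ids from the list (e.g. 4 and 5 for the eyes) removes those regions from the swap, exactly as described above.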
Have you ever gotten GPEN to work on Windows? I am trying to replicate what you did locally, but I'm running into this error: "Ninja is required to load C++ extensions". Ninja is installed, ugh.
I am interested in this. Did you get it working?
Did you fail to get GPEN working on Windows?
Was this ever integrated?
@cwalt2014 the soft bounding box is integrated.
Sorry, I have never tried that. I have always used Google Colab.
This is just my opinion:
GPEN is often too artificial, I suppose.
On the other hand, GFPGAN is more suitable.
I was satisfied when I tried tg-bomze's repo below.
https://github.com/tg-bomze/collection-of-notebooks/blob/master/QuickFaceSwap.ipynb
Although it still has some unresolved issues...
Hey what's your email? I would like to collaborate with you on some of my edits.
Hello everyone! First of all, thank you so much for your work. Secondly, has anyone encountered the problem where the final face is superimposed on a hand or other object in front of the face? Is there any way to improve/add segmentation? Are there ready-made solutions?
@ea-evdokimov
https://github.com/neuralchen/SimSwap/issues/14#issuecomment-882652186
Try removing 1, 11, 12, 13
from the list in reverse2original.py.
Thanks. Is it possible to find out which areas the numbers correspond to?
Look here
https://github.com/neuralchen/SimSwap/issues/14#issuecomment-882652186
Thanks one more time. But is there any way to automate the overlay process when there are hands or something else in front of the face, so that those parts of the face are transferred only while nothing is blocking them? For example, using face segmentation on each frame?
Hi @ftaker887 , Did you get around to solving this issue? I am also facing the problem where when the subject in the video looks down, or tilts their head at extreme angles, the mask becomes distorted and it is visibly apparent.
Any leads on this would be helpful. Thanks!
@NNNNAI Thank you for all the great work!
I am observing a problem where, when the subject in the video looks to the right or the left, the mask distorts completely, displaying a very obvious line through the chin. Has this already been taken care of, or is it still a WIP?
@rkhilnani9 Hello. I haven't been able to get around to it, but the same rules should still apply. To get around this, either mess with the mask parameters or train a model using FFHQ alignment.
UPDATE - The subject in the video was a bearded male, hence the distortions. It is working fine with a female subject. Thanks again for the great repo!
Hello! I have a question - does it make sense to train model 224 with the --gdeep True parameter (if I understand correctly, this is FFHQ alignment)? Will this increase the quality and detail of the face with 224 trained model?
I had convergence issues when setting Gdeep to True. For some reason it wouldn't capture head rotations properly. I don't know if adding this parameter adds a step or two to training, but I didn't feel like spending more money to wait and see :). Disabling it made everything work fine for me on an FFHQ-aligned VGGFace2 & CelebAMask dataset, both at 512 and 256.
In theory, training on a dataset with full head alignment will allow for better results than the insightface crop method used before. The reason is that the crop method used to train the old model produces the edge artifacts, which is why we needed a second masking solution such as face parsing.
My results so far are showing better convergence than how the public model was trained.
I completely agree with you, so before I start training I'm trying to learn how to do it right :) By your results, do you mean the model that you trained with the published code, or something from your own developments and solutions?
With the published code. Not finished training yet, but I expect it to be done in the next few days as I'm on a batch size of 60.
I train models in Colab; the max batch size with which I managed to start training is 22 on a Tesla T4 and 17 on a K80. Please post your results after finishing training, very interested to see :)
Hey ftaker887, can you leave your email? I rewrote the SimSwap code and improved performance. I am making a generalized framework for multiple single-shot models. Would love to collaborate with you!
Recently I have been working on SimSwap using the pretrained model.
The result is good and smooth.
But I want to improve the result.
How can I achieve this?
Should I use transfer learning on the existing model?
@ExponentialML I am encountering the same issue. Do you have any insights on how you did it? I am planning to train LaPa on BiSeNet, is that what you did? Thank you very much.
Has someone created a Windows GUI or an auto-install .bat for this? :)