Is this the expected quality after training 800k iterations? #262

Open
opened 2022-05-09 16:22:58 +02:00 by Noixas · 1 comment
Noixas commented 2022-05-09 16:22:58 +02:00 (Migrated from github.com)

Hi, I have been training the SimSwap model on the VGGFace2-224 dataset for a few days, up to the suggested 800k iterations. Training crashed just before 300k iterations due to a problem unrelated to the code (I think), so I used the flags to load the checkpoint and continue training. The output right before the crash showed decent face swapping, but after resuming, by the time training reached 800k the face swapping seemed negligible.
I monitor the losses in wandb; after resuming, they picked up at the same values as before the crash and kept slowly decreasing as usual, so I assumed everything was working as intended.

I use batch size 16 on one RTX 3090.
**tldr**: Are these results normal after this many iterations, or is the saving/loading of checkpoints not working properly?

**Output at iteration 291k:**

<img src="https://user-images.githubusercontent.com/23318473/167426704-48682bfd-a8d2-4691-b78d-d24e3309f053.jpg" width="80%" height="80%">

**Losses up to 291k iter**

<img src="https://user-images.githubusercontent.com/23318473/167429519-97256259-9684-4908-9851-971943847973.png" width="80%" height="80%">

---

**Output at iteration 800k:**

<img src="https://user-images.githubusercontent.com/23318473/167426710-a27be0ea-2eaa-4ab4-b0ef-03b91caf224e.jpg" width="80%" height="80%">

**Losses from 291k to 900k iter**

<img src="https://user-images.githubusercontent.com/23318473/167429211-c279b364-435a-417d-a78d-c4547e6fbf9d.png" width="80%" height="80%">
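For anyone weighing the "is checkpoint loading broken?" hypothesis: one common cause of quality degrading after a resume is restoring only the network weights and not the optimizer state, so Adam's moment estimates restart from zero. This is a generic, minimal PyTorch sketch of a resume that keeps both (it is not SimSwap's actual checkpoint code; the model here is a stand-in `nn.Linear`):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the generator, just to illustrate the pattern.
net = nn.Linear(4, 4)
opt = torch.optim.Adam(net.parameters(), lr=4e-4)

# One training step so the optimizer accumulates state (Adam's moments).
net(torch.randn(2, 4)).sum().backward()
opt.step()

# Save BOTH the weights and the optimizer state. Saving weights only is
# a frequent cause of degraded results after resuming GAN training.
torch.save({"iter": 291_000,
            "net": net.state_dict(),
            "opt": opt.state_dict()}, "ckpt.pth")

# Resume: restore both. If the "opt" entry were skipped, Adam would
# restart with zeroed moment estimates and training could destabilize.
ckpt = torch.load("ckpt.pth")
net2 = nn.Linear(4, 4)
opt2 = torch.optim.Adam(net2.parameters(), lr=4e-4)
net2.load_state_dict(ckpt["net"])
opt2.load_state_dict(ckpt["opt"])
```

If the training script's resume flags only restore the generator/discriminator weights, checking for a missing optimizer (and discriminator) restore would be the first thing to verify.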

Fibonacci134 commented 2023-03-21 14:58:25 +01:00 (Migrated from github.com)

Mind sharing the model?
