arcface
@@ -0,0 +1,5 @@
**__pycache__/
.vscode
bak*/
work_dirs/
models/
@@ -0,0 +1,136 @@
# Distributed Arcface Training in PyTorch

This is a deep learning library that makes face recognition training efficient and effective. It can train on tens of millions of identities on a single server.

## Requirements

- Install [PyTorch](http://pytorch.org) (torch>=1.6.0); see [install.md](docs/install.md).
- (Optional) Install [DALI](https://docs.nvidia.com/deeplearning/dali/user-guide/docs/); see [install_dali.md](docs/install_dali.md).
- `pip install -r requirement.txt`.

## How to Train

To train a model, run `train.py` with the path to a config.
The example commands below show how to run distributed training.

### 1. To run on a machine with 8 GPUs:

```shell
python -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --node_rank=0 --master_addr="127.0.0.1" --master_port=12581 train.py configs/ms1mv3_r50_lr02
```

### 2. To run on 2 machines with 8 GPUs each:

Node 0:

```shell
python -m torch.distributed.launch --nproc_per_node=8 --nnodes=2 --node_rank=0 --master_addr="ip1" --master_port=12581 train.py configs/webface42m_r100_lr01_pfc02_bs4k_16gpus
```

Node 1:

```shell
python -m torch.distributed.launch --nproc_per_node=8 --nnodes=2 --node_rank=1 --master_addr="ip1" --master_port=12581 train.py configs/webface42m_r100_lr01_pfc02_bs4k_16gpus
```
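
To see how the launcher flags in these commands map to processes, the sketch below reproduces the rank arithmetic that `torch.distributed.launch` applies: one process per GPU, with global rank `node_rank * nproc_per_node + local_rank`. The helper name `launch_layout` is hypothetical and is not part of this repository.

```python
def launch_layout(nnodes, nproc_per_node):
    """Return the world size and the global ranks hosted on each node."""
    world_size = nnodes * nproc_per_node
    ranks = {
        node_rank: [node_rank * nproc_per_node + local_rank
                    for local_rank in range(nproc_per_node)]
        for node_rank in range(nnodes)
    }
    return world_size, ranks


# Two machines with 8 GPUs each, as in the second example.
world_size, ranks = launch_layout(nnodes=2, nproc_per_node=8)
print(world_size)  # 16
print(ranks[1])    # [8, 9, 10, 11, 12, 13, 14, 15]
```

Every node must be given the same `--master_addr` and `--master_port` so that all 16 processes join the same process group.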

## Download Datasets or Prepare Datasets

- [MS1MV3](https://github.com/deepinsight/insightface/tree/master/recognition/_datasets_#ms1m-retinaface) (93k IDs, 5.2M images)
- [Glint360K](https://github.com/deepinsight/insightface/tree/master/recognition/partial_fc#4-download) (360k IDs, 17.1M images)
- [WebFace42M](docs/prepare_webface42m.md) (2M IDs, 42.5M images)

## Model Zoo

- The models are available for non-commercial research purposes only.
- All models can be found here:
- [Baidu Yun Pan](https://pan.baidu.com/s/1CL-l4zWqsI1oDuEEYVhj-g): e8pw
- [OneDrive](https://1drv.ms/u/s!AswpsDO2toNKq0lWY69vN58GR6mw?e=p9Ov5d)

### Performance on IJB-C and [**ICCV2021-MFR**](https://github.com/deepinsight/insightface/blob/master/challenges/mfr/README.md)

The ICCV2021-MFR test set consists of non-celebrities, so it has very little overlap with publicly available face
recognition training sets such as MS1M and CASIA, which were mostly collected from online celebrities.
As a result, it allows a fair comparison of different algorithms.

For the **ICCV2021-MFR-ALL** set, TAR is measured on the all-to-all 1:1 protocol, with FAR less than 0.000001 (1e-6). The
globalised multi-racial test set contains 242,143 identities and 1,624,305 images.

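
As an illustration of what "TAR at a fixed FAR" means, here is a minimal pure-Python sketch. It picks the strictest threshold at which the fraction of accepted impostor pairs does not exceed the target FAR, then reports the fraction of genuine pairs accepted. The function name `tar_at_far` and the toy scores are assumptions for illustration; the official MFR evaluation code may differ in details such as tie handling.

```python
def tar_at_far(genuine_scores, impostor_scores, far):
    """Return (TAR, threshold) with the false accept rate kept at or below `far`."""
    impostors = sorted(impostor_scores, reverse=True)
    # Number of impostor pairs that may be accepted at the target FAR.
    allowed = int(far * len(impostors))
    # A pair is accepted only if its score is strictly above the threshold,
    # so at most `allowed` impostor pairs pass.
    threshold = impostors[allowed] if allowed < len(impostors) else float("-inf")
    accepted = sum(1 for score in genuine_scores if score > threshold)
    return accepted / len(genuine_scores), threshold


genuine = [0.9, 0.8, 0.4, 0.7]
impostor = [0.1, 0.2, 0.3, 0.5, 0.15, 0.25, 0.45, 0.6]
tar, threshold = tar_at_far(genuine, impostor, far=0.25)
print(tar, threshold)  # 0.75 0.45
```

At FAR = 1e-6 the same logic needs more than a million impostor pairs before even one false accept is allowed, which is why an all-to-all protocol over a large test set is used.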

| Datasets | Backbone | **MFR-ALL** | IJB-C(1E-4) | IJB-C(1E-5) | Training Throughput | Log / Config |
|:-------------------------|:-----------|:------------|:------------|:------------|:--------------------|:-------------|
| MS1MV3 | mobileface | 65.76 | 94.44 | 91.85 | ~13000 | [log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/ms1mv3_mobileface_lr02/training.log)\|[config](configs/ms1mv3_mobileface_lr02.py) |
| Glint360K | mobileface | 69.83 | 95.17 | 92.58 | ~11000 | [log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/glint360k_mobileface_lr02_bs4k/training.log)\|[config](configs/glint360k_mobileface_lr02_bs4k.py) |
| WebFace42M-PartialFC-0.2 | mobileface | 73.80 | 95.40 | 92.64 | (16 GPUs) ~18583 | [log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/webface42m_mobilefacenet_pfc02_bs8k_16gpus/training.log)\|[config](configs/webface42m_mobilefacenet_pfc02_bs8k_16gpus.py) |
| MS1MV3 | r100 | 83.23 | 96.88 | 95.31 | ~3400 | [log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/ms1mv3_r100_lr02/training.log)\|[config](configs/ms1mv3_r100_lr02.py) |
| Glint360K | r100 | 90.86 | 97.53 | 96.43 | ~5000 | [log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/glint360k_r100_lr02_bs4k_16gpus/training.log)\|[config](configs/glint360k_r100_lr02_bs4k_16gpus.py) |
| WebFace42M-PartialFC-0.2 | r50(bs4k) | 93.83 | 97.53 | 96.16 | (8 GPUs) ~5900 | [log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/webface42m_r50_bs4k_pfc02/training.log)\|[config](configs/webface42m_r50_lr01_pfc02_bs4k_8gpus.py) |
| WebFace42M-PartialFC-0.2 | r50(bs8k) | 93.96 | 97.46 | 96.12 | (16 GPUs) ~11000 | [log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/webface42m_r50_lr01_pfc02_bs8k_16gpus/training.log)\|[config](configs/webface42m_r50_lr01_pfc02_bs8k_16gpus.py) |
| WebFace42M-PartialFC-0.2 | r50(bs4k) | 94.04 | 97.48 | 95.94 | (32 GPUs) ~17000 | log\|[config](configs/webface42m_r50_lr01_pfc02_bs4k_32gpus.py) |
| WebFace42M-PartialFC-0.2 | r100(bs4k) | 96.69 | 97.85 | 96.63 | (16 GPUs) ~5200 | [log](https://raw.githubusercontent.com/anxiangsir/insightface_arcface_log/master/webface42m_r100_bs4k_pfc02/training.log)\|[config](configs/webface42m_r100_lr01_pfc02_bs4k_16gpus.py) |
| WebFace42M-PartialFC-0.2 | r200 | - | - | - | - | log\|config |

`PartialFC-0.2` means the sampling rate of negative class centers is 0.2.

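
The following single-process sketch illustrates what a class-center sampling rate means: every class that appears in the batch (a positive) is always kept, and random negatives are added until `sample_rate * num_classes` centers are selected. The function name `sample_class_centers` and its signature are assumptions for illustration. The actual implementation shards class centers across GPUs and operates on tensors.

```python
import random


def sample_class_centers(batch_labels, num_classes, sample_rate, rng):
    """Pick the class centers used for one batch and remap the labels.

    Returns the sorted list of sampled class ids and, for each label in the
    batch, its index within that list.
    """
    positives = set(batch_labels)
    num_sample = max(int(sample_rate * num_classes), len(positives))
    negative_pool = [c for c in range(num_classes) if c not in positives]
    negatives = rng.sample(negative_pool, num_sample - len(positives))
    sampled = sorted(list(positives) + negatives)
    index_of = {class_id: i for i, class_id in enumerate(sampled)}
    return sampled, [index_of[label] for label in batch_labels]


rng = random.Random(0)
sampled, new_labels = sample_class_centers(
    [3, 7, 7, 42], num_classes=100, sample_rate=0.2, rng=rng)
print(len(sampled))                      # 20
print([sampled[i] for i in new_labels])  # [3, 7, 7, 42]
```

The softmax is then computed over the 20 sampled centers only, using the remapped labels.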

## Speed Benchmark

`arcface_torch` can train on large-scale face recognition training sets efficiently. When the number of
classes in the training set exceeds 1 million, the Partial FC sampling strategy reaches the same
accuracy while training several times faster and using less GPU memory.
Partial FC is a sparse variant of the model-parallel architecture for large-scale face recognition. It uses a
sparse softmax, where each batch dynamically samples a subset of class centers for training. In each iteration, only a
sparse part of the parameters is updated, which greatly reduces GPU memory use and computation. With Partial FC,
we can scale the training set to 29 million identities, the largest to date. Partial FC also supports multi-machine distributed
training and mixed-precision training.

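
A rough back-of-the-envelope calculation shows why sampling helps at this scale. The sketch below estimates, per GPU, the size of the sharded class-center matrix and of the logits for one batch. All defaults are assumptions for illustration (fp32 values, 512-dimensional embeddings as in the configs, 8 GPUs, a global batch of 512); gradients, optimizer state, and backbone activations are ignored, so the figures are not comparable to measured memory.

```python
def fc_memory_per_gpu(num_classes, embedding_size=512, num_gpus=8,
                      global_batch=512, sample_rate=1.0, bytes_per_value=4):
    """Estimate (weight_bytes, logit_bytes) held by one GPU."""
    classes_per_gpu = num_classes // num_gpus
    # Every class center is stored, whether or not it is sampled.
    weight_bytes = classes_per_gpu * embedding_size * bytes_per_value
    # Logits are only computed against the sampled centers.
    sampled_per_gpu = int(classes_per_gpu * sample_rate)
    logit_bytes = global_batch * sampled_per_gpu * bytes_per_value
    return weight_bytes, logit_bytes


full_w, full_l = fc_memory_per_gpu(29_000_000)
pfc_w, pfc_l = fc_memory_per_gpu(29_000_000, sample_rate=0.1)
print(full_w / 1e9, full_l / 1e9)  # 7.424 7.424
print(pfc_w / 1e9, pfc_l / 1e9)    # 7.424 0.7424
```

Under these assumptions the stored centers cost the same either way, while the logits (and the work to compute them) shrink in proportion to the sampling rate.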

![](https://github.com/anxiangsir/insightface_arcface_log/blob/master/partial_fc_v2.png)

For more details, see [speed_benchmark.md](docs/speed_benchmark.md) in docs.

### 1. Training speed of different parallel methods (samples / second), Tesla V100 32GB * 8. (Larger is better)

`-` means training failed because of GPU memory limitations.

| Number of Identities in Dataset | Data Parallel | Model Parallel | Partial FC 0.1 |
|:--------------------------------|:--------------|:---------------|:---------------|
| 125000 | 4681 | 4824 | 5004 |
| 1400000 | **1672** | 3043 | 4738 |
| 5500000 | **-** | **1389** | 3975 |
| 8000000 | **-** | **-** | 3565 |
| 16000000 | **-** | **-** | 2679 |
| 29000000 | **-** | **-** | **1855** |

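
To read the training-speed table as relative speedups, the short sketch below divides the Partial FC throughput by each baseline. The numbers are copied from the table; the helper name `speedup` is hypothetical.

```python
# (data parallel, model parallel, Partial FC 0.1) in samples / second;
# None marks a run that failed because of GPU memory limitations.
throughput = {
    125_000: (4681, 4824, 5004),
    1_400_000: (1672, 3043, 4738),
    5_500_000: (None, 1389, 3975),
}


def speedup(partial_fc, baseline):
    """Return partial_fc / baseline rounded to 2 decimals, or None if the baseline failed."""
    if baseline is None:
        return None
    return round(partial_fc / baseline, 2)


for identities, (dp, mp, pfc) in throughput.items():
    print(identities, speedup(pfc, dp), speedup(pfc, mp))
# 125000 1.07 1.04
# 1400000 2.83 1.56
# 5500000 None 2.86
```

The gap widens as the number of identities grows, and beyond 5.5 million identities only Partial FC completes training on this hardware.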

### 2. GPU memory cost of different parallel methods (MB per GPU), Tesla V100 32GB * 8. (Smaller is better)

| Number of Identities in Dataset | Data Parallel | Model Parallel | Partial FC 0.1 |
|:--------------------------------|:--------------|:---------------|:---------------|
| 125000 | 7358 | 5306 | 4868 |
| 1400000 | 32252 | 11178 | 6056 |
| 5500000 | **-** | 32188 | 9854 |
| 8000000 | **-** | **-** | 12310 |
| 16000000 | **-** | **-** | 19950 |
| 29000000 | **-** | **-** | 32324 |

## Citations

```
@inproceedings{deng2019arcface,
  title={Arcface: Additive angular margin loss for deep face recognition},
  author={Deng, Jiankang and Guo, Jia and Xue, Niannan and Zafeiriou, Stefanos},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={4690--4699},
  year={2019}
}
@inproceedings{an2020partical_fc,
  title={Partial FC: Training 10 Million Identities on a Single Machine},
  author={An, Xiang and Zhu, Xuhan and Xiao, Yang and Wu, Lan and Zhang, Ming and Gao, Yuan and Qin, Bin and
  Zhang, Debing and Fu, Ying},
  booktitle={Arxiv 2010.05222},
  year={2020}
}
```
@@ -0,0 +1,25 @@
from .iresnet import iresnet18, iresnet34, iresnet50, iresnet100, iresnet200
from .mobilefacenet import get_mbf


def get_model(name, **kwargs):
    # resnet
    if name == "r18":
        return iresnet18(False, **kwargs)
    elif name == "r34":
        return iresnet34(False, **kwargs)
    elif name == "r50":
        return iresnet50(False, **kwargs)
    elif name == "r100":
        return iresnet100(False, **kwargs)
    elif name == "r200":
        return iresnet200(False, **kwargs)
    elif name == "r2060":
        from .iresnet2060 import iresnet2060
        return iresnet2060(False, **kwargs)
    elif name == "mbf":
        fp16 = kwargs.get("fp16", False)
        num_features = kwargs.get("num_features", 512)
        return get_mbf(fp16=fp16, num_features=num_features)
    else:
        raise ValueError("unknown backbone name: {}".format(name))
@@ -0,0 +1,186 @@
import torch
from torch import nn

__all__ = ['iresnet18', 'iresnet34', 'iresnet50', 'iresnet100', 'iresnet200']


def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes,
                     out_planes,
                     kernel_size=3,
                     stride=stride,
                     padding=dilation,
                     groups=groups,
                     bias=False,
                     dilation=dilation)


def conv1x1(in_planes, out_planes, stride=1):
    """1x1 convolution"""
    return nn.Conv2d(in_planes,
                     out_planes,
                     kernel_size=1,
                     stride=stride,
                     bias=False)


class IBasicBlock(nn.Module):
    expansion = 1
    def __init__(self, inplanes, planes, stride=1, downsample=None,
                 groups=1, base_width=64, dilation=1):
        super(IBasicBlock, self).__init__()
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
        self.bn1 = nn.BatchNorm2d(inplanes, eps=1e-05,)
        self.conv1 = conv3x3(inplanes, planes)
        self.bn2 = nn.BatchNorm2d(planes, eps=1e-05,)
        self.prelu = nn.PReLU(planes)
        self.conv2 = conv3x3(planes, planes, stride)
        self.bn3 = nn.BatchNorm2d(planes, eps=1e-05,)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x
        out = self.bn1(x)
        out = self.conv1(out)
        out = self.bn2(out)
        out = self.prelu(out)
        out = self.conv2(out)
        out = self.bn3(out)
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity
        return out


class IResNet(nn.Module):
    fc_scale = 7 * 7
    def __init__(self,
                 block, layers, dropout=0, num_features=512, zero_init_residual=False,
                 groups=1, width_per_group=64, replace_stride_with_dilation=None, fp16=False):
        super(IResNet, self).__init__()
        self.fp16 = fp16
        self.inplanes = 64
        self.dilation = 1
        if replace_stride_with_dilation is None:
            replace_stride_with_dilation = [False, False, False]
        if len(replace_stride_with_dilation) != 3:
            raise ValueError("replace_stride_with_dilation should be None "
                             "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
        self.groups = groups
        self.base_width = width_per_group
        self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(self.inplanes, eps=1e-05)
        self.prelu = nn.PReLU(self.inplanes)
        self.layer1 = self._make_layer(block, 64, layers[0], stride=2)
        self.layer2 = self._make_layer(block,
                                       128,
                                       layers[1],
                                       stride=2,
                                       dilate=replace_stride_with_dilation[0])
        self.layer3 = self._make_layer(block,
                                       256,
                                       layers[2],
                                       stride=2,
                                       dilate=replace_stride_with_dilation[1])
        self.layer4 = self._make_layer(block,
                                       512,
                                       layers[3],
                                       stride=2,
                                       dilate=replace_stride_with_dilation[2])
        self.bn2 = nn.BatchNorm2d(512 * block.expansion, eps=1e-05,)
        self.dropout = nn.Dropout(p=dropout, inplace=True)
        self.fc = nn.Linear(512 * block.expansion * self.fc_scale, num_features)
        self.features = nn.BatchNorm1d(num_features, eps=1e-05)
        nn.init.constant_(self.features.weight, 1.0)
        self.features.weight.requires_grad = False

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.normal_(m.weight, 0, 0.1)
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

        if zero_init_residual:
            for m in self.modules():
                if isinstance(m, IBasicBlock):
                    nn.init.constant_(m.bn2.weight, 0)

    def _make_layer(self, block, planes, blocks, stride=1, dilate=False):
        downsample = None
        previous_dilation = self.dilation
        if dilate:
            self.dilation *= stride
            stride = 1
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                nn.BatchNorm2d(planes * block.expansion, eps=1e-05, ),
            )
        layers = []
        layers.append(
            block(self.inplanes, planes, stride, downsample, self.groups,
                  self.base_width, previous_dilation))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(
                block(self.inplanes,
                      planes,
                      groups=self.groups,
                      base_width=self.base_width,
                      dilation=self.dilation))

        return nn.Sequential(*layers)

    def forward(self, x):
        with torch.cuda.amp.autocast(self.fp16):
            x = self.conv1(x)
            x = self.bn1(x)
            x = self.prelu(x)
            x = self.layer1(x)
            x = self.layer2(x)
            x = self.layer3(x)
            x = self.layer4(x)
            x = self.bn2(x)
            x = torch.flatten(x, 1)
            x = self.dropout(x)
        x = self.fc(x.float() if self.fp16 else x)
        x = self.features(x)
        return x


def _iresnet(arch, block, layers, pretrained, progress, **kwargs):
    model = IResNet(block, layers, **kwargs)
    if pretrained:
        raise ValueError("pretrained weights are not supported")
    return model


def iresnet18(pretrained=False, progress=True, **kwargs):
    return _iresnet('iresnet18', IBasicBlock, [2, 2, 2, 2], pretrained,
                    progress, **kwargs)


def iresnet34(pretrained=False, progress=True, **kwargs):
    return _iresnet('iresnet34', IBasicBlock, [3, 4, 6, 3], pretrained,
                    progress, **kwargs)


def iresnet50(pretrained=False, progress=True, **kwargs):
    return _iresnet('iresnet50', IBasicBlock, [3, 4, 14, 3], pretrained,
                    progress, **kwargs)


def iresnet100(pretrained=False, progress=True, **kwargs):
    return _iresnet('iresnet100', IBasicBlock, [3, 13, 30, 3], pretrained,
                    progress, **kwargs)


def iresnet200(pretrained=False, progress=True, **kwargs):
    return _iresnet('iresnet200', IBasicBlock, [6, 26, 60, 6], pretrained,
                    progress, **kwargs)
@@ -0,0 +1,176 @@
import torch
from torch import nn

assert torch.__version__ >= "1.8.1"
from torch.utils.checkpoint import checkpoint_sequential

__all__ = ['iresnet2060']


def conv3x3(in_planes, out_planes, stride=1, groups=1, dilation=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes,
                     out_planes,
                     kernel_size=3,
                     stride=stride,
                     padding=dilation,
                     groups=groups,
                     bias=False,
                     dilation=dilation)


def conv1x1(in_planes, out_planes, stride=1):
    """1x1 convolution"""
    return nn.Conv2d(in_planes,
                     out_planes,
                     kernel_size=1,
                     stride=stride,
                     bias=False)


class IBasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None,
                 groups=1, base_width=64, dilation=1):
        super(IBasicBlock, self).__init__()
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
        self.bn1 = nn.BatchNorm2d(inplanes, eps=1e-05, )
        self.conv1 = conv3x3(inplanes, planes)
        self.bn2 = nn.BatchNorm2d(planes, eps=1e-05, )
        self.prelu = nn.PReLU(planes)
        self.conv2 = conv3x3(planes, planes, stride)
        self.bn3 = nn.BatchNorm2d(planes, eps=1e-05, )
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x
        out = self.bn1(x)
        out = self.conv1(out)
        out = self.bn2(out)
        out = self.prelu(out)
        out = self.conv2(out)
        out = self.bn3(out)
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity
        return out


class IResNet(nn.Module):
    fc_scale = 7 * 7

    def __init__(self,
                 block, layers, dropout=0, num_features=512, zero_init_residual=False,
                 groups=1, width_per_group=64, replace_stride_with_dilation=None, fp16=False):
        super(IResNet, self).__init__()
        self.fp16 = fp16
        self.inplanes = 64
        self.dilation = 1
        if replace_stride_with_dilation is None:
            replace_stride_with_dilation = [False, False, False]
        if len(replace_stride_with_dilation) != 3:
            raise ValueError("replace_stride_with_dilation should be None "
                             "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
        self.groups = groups
        self.base_width = width_per_group
        self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(self.inplanes, eps=1e-05)
        self.prelu = nn.PReLU(self.inplanes)
        self.layer1 = self._make_layer(block, 64, layers[0], stride=2)
        self.layer2 = self._make_layer(block,
                                       128,
                                       layers[1],
                                       stride=2,
                                       dilate=replace_stride_with_dilation[0])
        self.layer3 = self._make_layer(block,
                                       256,
                                       layers[2],
                                       stride=2,
                                       dilate=replace_stride_with_dilation[1])
        self.layer4 = self._make_layer(block,
                                       512,
                                       layers[3],
                                       stride=2,
                                       dilate=replace_stride_with_dilation[2])
        self.bn2 = nn.BatchNorm2d(512 * block.expansion, eps=1e-05, )
        self.dropout = nn.Dropout(p=dropout, inplace=True)
        self.fc = nn.Linear(512 * block.expansion * self.fc_scale, num_features)
        self.features = nn.BatchNorm1d(num_features, eps=1e-05)
        nn.init.constant_(self.features.weight, 1.0)
        self.features.weight.requires_grad = False

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.normal_(m.weight, 0, 0.1)
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

        if zero_init_residual:
            for m in self.modules():
                if isinstance(m, IBasicBlock):
                    nn.init.constant_(m.bn2.weight, 0)

    def _make_layer(self, block, planes, blocks, stride=1, dilate=False):
        downsample = None
        previous_dilation = self.dilation
        if dilate:
            self.dilation *= stride
            stride = 1
        if stride != 1 or self.inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                nn.BatchNorm2d(planes * block.expansion, eps=1e-05, ),
            )
        layers = []
        layers.append(
            block(self.inplanes, planes, stride, downsample, self.groups,
                  self.base_width, previous_dilation))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(
                block(self.inplanes,
                      planes,
                      groups=self.groups,
                      base_width=self.base_width,
                      dilation=self.dilation))

        return nn.Sequential(*layers)

    def checkpoint(self, func, num_seg, x):
        # Activation checkpointing is only applied while training.
        if self.training:
            return checkpoint_sequential(func, num_seg, x)
        else:
            return func(x)

    def forward(self, x):
        with torch.cuda.amp.autocast(self.fp16):
            x = self.conv1(x)
            x = self.bn1(x)
            x = self.prelu(x)
            x = self.layer1(x)
            x = self.checkpoint(self.layer2, 20, x)
            x = self.checkpoint(self.layer3, 100, x)
            x = self.layer4(x)
            x = self.bn2(x)
            x = torch.flatten(x, 1)
            x = self.dropout(x)
        x = self.fc(x.float() if self.fp16 else x)
        x = self.features(x)
        return x


def _iresnet(arch, block, layers, pretrained, progress, **kwargs):
    model = IResNet(block, layers, **kwargs)
    if pretrained:
        raise ValueError("pretrained weights are not supported")
    return model


def iresnet2060(pretrained=False, progress=True, **kwargs):
    return _iresnet('iresnet2060', IBasicBlock, [3, 128, 1024 - 128, 3], pretrained, progress, **kwargs)
@@ -0,0 +1,130 @@
'''
Adapted from https://github.com/cavalleria/cavaface.pytorch/blob/master/backbone/mobilefacenet.py
Original author cavalleria
'''

import torch.nn as nn
from torch.nn import Linear, Conv2d, BatchNorm1d, BatchNorm2d, PReLU, Sequential, Module
import torch


class Flatten(Module):
    def forward(self, x):
        return x.view(x.size(0), -1)


class ConvBlock(Module):
    def __init__(self, in_c, out_c, kernel=(1, 1), stride=(1, 1), padding=(0, 0), groups=1):
        super(ConvBlock, self).__init__()
        self.layers = nn.Sequential(
            Conv2d(in_c, out_c, kernel, groups=groups, stride=stride, padding=padding, bias=False),
            BatchNorm2d(num_features=out_c),
            PReLU(num_parameters=out_c)
        )

    def forward(self, x):
        return self.layers(x)


class LinearBlock(Module):
    def __init__(self, in_c, out_c, kernel=(1, 1), stride=(1, 1), padding=(0, 0), groups=1):
        super(LinearBlock, self).__init__()
        self.layers = nn.Sequential(
            Conv2d(in_c, out_c, kernel, stride, padding, groups=groups, bias=False),
            BatchNorm2d(num_features=out_c)
        )

    def forward(self, x):
        return self.layers(x)


class DepthWise(Module):
    def __init__(self, in_c, out_c, residual=False, kernel=(3, 3), stride=(2, 2), padding=(1, 1), groups=1):
        super(DepthWise, self).__init__()
        self.residual = residual
        self.layers = nn.Sequential(
            ConvBlock(in_c, out_c=groups, kernel=(1, 1), padding=(0, 0), stride=(1, 1)),
            ConvBlock(groups, groups, groups=groups, kernel=kernel, padding=padding, stride=stride),
            LinearBlock(groups, out_c, kernel=(1, 1), padding=(0, 0), stride=(1, 1))
        )

    def forward(self, x):
        short_cut = None
        if self.residual:
            short_cut = x
        x = self.layers(x)
        if self.residual:
            output = short_cut + x
        else:
            output = x
        return output


class Residual(Module):
    def __init__(self, c, num_block, groups, kernel=(3, 3), stride=(1, 1), padding=(1, 1)):
        super(Residual, self).__init__()
        modules = []
        for _ in range(num_block):
            modules.append(DepthWise(c, c, True, kernel, stride, padding, groups))
        self.layers = Sequential(*modules)

    def forward(self, x):
        return self.layers(x)


class GDC(Module):
    def __init__(self, embedding_size):
        super(GDC, self).__init__()
        self.layers = nn.Sequential(
            LinearBlock(512, 512, groups=512, kernel=(7, 7), stride=(1, 1), padding=(0, 0)),
            Flatten(),
            Linear(512, embedding_size, bias=False),
            BatchNorm1d(embedding_size))

    def forward(self, x):
        return self.layers(x)


class MobileFaceNet(Module):
    def __init__(self, fp16=False, num_features=512):
        super(MobileFaceNet, self).__init__()
        scale = 2
        self.fp16 = fp16
        self.layers = nn.Sequential(
            ConvBlock(3, 64 * scale, kernel=(3, 3), stride=(2, 2), padding=(1, 1)),
            ConvBlock(64 * scale, 64 * scale, kernel=(3, 3), stride=(1, 1), padding=(1, 1), groups=64),
            DepthWise(64 * scale, 64 * scale, kernel=(3, 3), stride=(2, 2), padding=(1, 1), groups=128),
            Residual(64 * scale, num_block=4, groups=128, kernel=(3, 3), stride=(1, 1), padding=(1, 1)),
            DepthWise(64 * scale, 128 * scale, kernel=(3, 3), stride=(2, 2), padding=(1, 1), groups=256),
            Residual(128 * scale, num_block=6, groups=256, kernel=(3, 3), stride=(1, 1), padding=(1, 1)),
            DepthWise(128 * scale, 128 * scale, kernel=(3, 3), stride=(2, 2), padding=(1, 1), groups=512),
            Residual(128 * scale, num_block=2, groups=256, kernel=(3, 3), stride=(1, 1), padding=(1, 1)),
        )
        self.conv_sep = ConvBlock(128 * scale, 512, kernel=(1, 1), stride=(1, 1), padding=(0, 0))
        self.features = GDC(num_features)
        self._initialize_weights()

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    m.bias.data.zero_()
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                if m.bias is not None:
                    m.bias.data.zero_()

    def forward(self, x):
        with torch.cuda.amp.autocast(self.fp16):
            x = self.layers(x)
        x = self.conv_sep(x.float() if self.fp16 else x)
        x = self.features(x)
        return x


def get_mbf(fp16, num_features):
    return MobileFaceNet(fp16, num_features)
@@ -0,0 +1,22 @@
from easydict import EasyDict as edict

# config for speed testing

config = edict()
config.loss = "cosface"
config.network = "r50"
config.resume = False
config.output = None
config.embedding_size = 512
config.sample_rate = 0.99
config.fp16 = True
config.momentum = 0.9
config.weight_decay = 5e-4
config.batch_size = 64  # total_batch_size = batch_size * num_gpus
config.lr = 0.1  # batch size is 512

config.rec = "synthetic"
config.num_classes = 300 * 10000
config.num_epoch = 30
config.warmup_epoch = -1
config.val_targets = []
@@ -0,0 +1,47 @@
from easydict import EasyDict as edict

# make training faster
# our RAM is 256G
# mount -t tmpfs -o size=140G tmpfs /train_tmp

config = edict()
config.loss = "arcface"
config.network = "r50"
config.resume = False
config.output = "ms1mv3_arcface_r50"

config.embedding_size = 512
config.sample_rate = 1
config.fp16 = False
config.momentum = 0.9
config.weight_decay = 5e-4
config.batch_size = 128
config.lr = 0.1  # batch size is 512
config.dali = False
config.verbose = 2000
config.frequent = 10
config.score = None

# if config.dataset == "emore":
#     config.rec = "/train_tmp/faces_emore"
#     config.num_classes = 85742
#     config.num_image = 5822653
#     config.num_epoch = 16
#     config.warmup_epoch = -1
#     config.val_targets = ["lfw", ]

# elif config.dataset == "ms1m-retinaface-t1":
#     config.rec = "/train_tmp/ms1m-retinaface-t1"
#     config.num_classes = 93431
#     config.num_image = 5179510
#     config.num_epoch = 25
#     config.warmup_epoch = -1
#     config.val_targets = ["lfw", "cfp_fp", "agedb_30"]

# elif config.dataset == "glint360k":
#     config.rec = "/train_tmp/glint360k"
#     config.num_classes = 360232
#     config.num_image = 17091657
#     config.num_epoch = 20
#     config.warmup_epoch = -1
#     config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
@@ -0,0 +1,27 @@
from easydict import EasyDict as edict

# make training faster
# our RAM is 256G
# mount -t tmpfs -o size=140G tmpfs /train_tmp

config = edict()
config.loss = "cosface"
config.network = "mbf"
config.resume = False
config.output = None
config.embedding_size = 512
config.sample_rate = 1.0
config.fp16 = True
config.momentum = 0.9
config.weight_decay = 1e-4
config.batch_size = 512
config.lr = 0.4
config.verbose = 5000
config.dali = False

config.rec = "/train_tmp/glint360k"
config.num_classes = 360232
config.num_image = 17091657
config.num_epoch = 20
config.warmup_epoch = 2
config.val_targets = ['lfw', 'cfp_fp', "agedb_30"]
@@ -0,0 +1,27 @@
from easydict import EasyDict as edict

# make training faster
# our RAM is 256G
# mount -t tmpfs -o size=140G tmpfs /train_tmp

config = edict()
config.loss = "cosface"
config.network = "r100"
config.resume = False
config.output = None
config.embedding_size = 512
config.sample_rate = 1.0
config.fp16 = True
config.momentum = 0.9
config.weight_decay = 5e-4
config.batch_size = 256
config.lr = 0.4
config.verbose = 5000
config.dali = False

config.rec = "/train_tmp/glint360k"
config.num_classes = 360232
config.num_image = 17091657
config.num_epoch = 20
config.warmup_epoch = 2
config.val_targets = ['lfw', 'cfp_fp', "agedb_30"]
@@ -0,0 +1,27 @@
from easydict import EasyDict as edict

# make training faster
# our RAM is 256G
# mount -t tmpfs -o size=140G tmpfs /train_tmp

config = edict()
config.loss = "arcface"
config.network = "mbf"
config.resume = False
config.output = None
config.embedding_size = 512
config.sample_rate = 1.0
config.fp16 = True
config.momentum = 0.9
config.weight_decay = 1e-4
config.batch_size = 256
config.lr = 0.2
config.verbose = 5000
config.dali = False

config.rec = "/train_tmp/ms1m-retinaface-t1"
config.num_classes = 93431
config.num_image = 5179510
config.num_epoch = 40
config.warmup_epoch = 2
config.val_targets = ['lfw', 'cfp_fp', "agedb_30"]
@@ -0,0 +1,27 @@
|
||||
from easydict import EasyDict as edict
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "arcface"
|
||||
config.network = "r100"
|
||||
config.resume = False
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.sample_rate = 1.0
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.2
|
||||
config.verbose = 2000
|
||||
config.dali = False
|
||||
|
||||
config.rec = "/train_tmp/ms1m-retinaface-t1"
|
||||
config.num_classes = 93431
|
||||
config.num_image = 5179510
|
||||
config.num_epoch = 25
|
||||
config.warmup_epoch = 0
|
||||
config.val_targets = ['lfw', 'cfp_fp', "agedb_30"]
|
||||
@@ -0,0 +1,27 @@
|
||||
from easydict import EasyDict as edict
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "arcface"
|
||||
config.network = "r50"
|
||||
config.resume = False
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.sample_rate = 1.0
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.2
|
||||
config.verbose = 2000
|
||||
config.dali = False
|
||||
|
||||
config.rec = "/train_tmp/ms1m-retinaface-t1"
|
||||
config.num_classes = 93431
|
||||
config.num_image = 5179510
|
||||
config.num_epoch = 25
|
||||
config.warmup_epoch = 2
|
||||
config.val_targets = ['lfw', 'cfp_fp', "agedb_30"]
|
||||
@@ -0,0 +1,27 @@
|
||||
from easydict import EasyDict as edict
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "mbf"
|
||||
config.resume = False
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.sample_rate = 0.2
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 1e-4
|
||||
config.batch_size = 512
|
||||
config.lr = 0.4
|
||||
config.verbose = 10000
|
||||
config.dali = False
|
||||
|
||||
config.rec = "/train_tmp/WebFace42M"
|
||||
config.num_classes = 2059906
|
||||
config.num_image = 42474557
|
||||
config.num_epoch = 20
|
||||
config.warmup_epoch = 2
|
||||
config.val_targets = []
|
||||
@@ -0,0 +1,27 @@
|
||||
from easydict import EasyDict as edict
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r100"
|
||||
config.resume = False
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.sample_rate = 0.2
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 256
|
||||
config.lr = 0.3
|
||||
config.verbose = 2000
|
||||
config.dali = False
|
||||
|
||||
config.rec = "/train_tmp/WebFace42M"
|
||||
config.num_classes = 2059906
|
||||
config.num_image = 42474557
|
||||
config.num_epoch = 20
|
||||
config.warmup_epoch = 1
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
@@ -0,0 +1,27 @@
|
||||
from easydict import EasyDict as edict
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r50"
|
||||
config.resume = False
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.sample_rate = 0.2
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 128
|
||||
config.lr = 0.4
|
||||
config.verbose = 10000
|
||||
config.dali = False
|
||||
|
||||
config.rec = "/train_tmp/WebFace42M"
|
||||
config.num_classes = 2059906
|
||||
config.num_image = 42474557
|
||||
config.num_epoch = 20
|
||||
config.warmup_epoch = 2
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
@@ -0,0 +1,27 @@
|
||||
from easydict import EasyDict as edict
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r50"
|
||||
config.resume = False
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.sample_rate = 0.2
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 512
|
||||
config.lr = 0.4
|
||||
config.verbose = 10000
|
||||
config.dali = False
|
||||
|
||||
config.rec = "/train_tmp/WebFace42M"
|
||||
config.num_classes = 2059906
|
||||
config.num_image = 42474557
|
||||
config.num_epoch = 20
|
||||
config.warmup_epoch = 2
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
@@ -0,0 +1,27 @@
|
||||
from easydict import EasyDict as edict
|
||||
|
||||
# make training faster
|
||||
# our RAM is 256G
|
||||
# mount -t tmpfs -o size=140G tmpfs /train_tmp
|
||||
|
||||
config = edict()
|
||||
config.loss = "cosface"
|
||||
config.network = "r50"
|
||||
config.resume = False
|
||||
config.output = None
|
||||
config.embedding_size = 512
|
||||
config.sample_rate = 0.2
|
||||
config.fp16 = True
|
||||
config.momentum = 0.9
|
||||
config.weight_decay = 5e-4
|
||||
config.batch_size = 512
|
||||
config.lr = 0.6
|
||||
config.verbose = 10000
|
||||
config.dali = False
|
||||
|
||||
config.rec = "/train_tmp/WebFace42M"
|
||||
config.num_classes = 2059906
|
||||
config.num_image = 42474557
|
||||
config.num_epoch = 20
|
||||
config.warmup_epoch = 4
|
||||
config.val_targets = ["lfw", "cfp_fp", "agedb_30"]
|
||||
@@ -0,0 +1,209 @@
|
||||
import numbers
|
||||
import os
|
||||
import queue as Queue
|
||||
import threading
|
||||
from typing import Iterable
|
||||
|
||||
import mxnet as mx
|
||||
import numpy as np
|
||||
import torch
|
||||
from torch import distributed
|
||||
from torch.utils.data import DataLoader, Dataset
|
||||
from torchvision import transforms
|
||||
|
||||
def get_dataloader(
    root_dir: str,
    local_rank: int,
    batch_size: int,
    dali = False) -> Iterable:
    if dali and root_dir != "synthetic":
        rec = os.path.join(root_dir, 'train.rec')
        idx = os.path.join(root_dir, 'train.idx')
        return dali_data_iter(
            batch_size=batch_size, rec_file=rec,
            idx_file=idx, num_threads=2, local_rank=local_rank)
    else:
        if root_dir == "synthetic":
            train_set = SyntheticDataset()
        else:
            train_set = MXFaceDataset(root_dir=root_dir, local_rank=local_rank)
        train_sampler = torch.utils.data.distributed.DistributedSampler(train_set, shuffle=True)
        train_loader = DataLoaderX(
            local_rank=local_rank,
            dataset=train_set,
            batch_size=batch_size,
            sampler=train_sampler,
            num_workers=2,
            pin_memory=True,
            drop_last=True,
        )
        return train_loader
|
||||
|
||||
class BackgroundGenerator(threading.Thread):
    def __init__(self, generator, local_rank, max_prefetch=6):
        super(BackgroundGenerator, self).__init__()
        self.queue = Queue.Queue(max_prefetch)
        self.generator = generator
        self.local_rank = local_rank
        self.daemon = True
        self.start()

    def run(self):
        torch.cuda.set_device(self.local_rank)
        for item in self.generator:
            self.queue.put(item)
        self.queue.put(None)

    def next(self):
        next_item = self.queue.get()
        if next_item is None:
            raise StopIteration
        return next_item

    def __next__(self):
        return self.next()

    def __iter__(self):
        return self
|
||||
|
||||
|
||||
class DataLoaderX(DataLoader):

    def __init__(self, local_rank, **kwargs):
        super(DataLoaderX, self).__init__(**kwargs)
        self.stream = torch.cuda.Stream(local_rank)
        self.local_rank = local_rank

    def __iter__(self):
        self.iter = super(DataLoaderX, self).__iter__()
        self.iter = BackgroundGenerator(self.iter, self.local_rank)
        self.preload()
        return self

    def preload(self):
        self.batch = next(self.iter, None)
        if self.batch is None:
            return None
        with torch.cuda.stream(self.stream):
            for k in range(len(self.batch)):
                self.batch[k] = self.batch[k].to(device=self.local_rank, non_blocking=True)

    def __next__(self):
        torch.cuda.current_stream().wait_stream(self.stream)
        batch = self.batch
        if batch is None:
            raise StopIteration
        self.preload()
        return batch
|
||||
|
||||
|
||||
class MXFaceDataset(Dataset):
    def __init__(self, root_dir, local_rank):
        super(MXFaceDataset, self).__init__()
        self.transform = transforms.Compose(
            [transforms.ToPILImage(),
             transforms.RandomHorizontalFlip(),
             transforms.ToTensor(),
             transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
             ])
        self.root_dir = root_dir
        self.local_rank = local_rank
        path_imgrec = os.path.join(root_dir, 'train.rec')
        path_imgidx = os.path.join(root_dir, 'train.idx')
        self.imgrec = mx.recordio.MXIndexedRecordIO(path_imgidx, path_imgrec, 'r')
        s = self.imgrec.read_idx(0)
        header, _ = mx.recordio.unpack(s)
        if header.flag > 0:
            self.header0 = (int(header.label[0]), int(header.label[1]))
            self.imgidx = np.array(range(1, int(header.label[0])))
        else:
            self.imgidx = np.array(list(self.imgrec.keys))

    def __getitem__(self, index):
        idx = self.imgidx[index]
        s = self.imgrec.read_idx(idx)
        header, img = mx.recordio.unpack(s)
        label = header.label
        if not isinstance(label, numbers.Number):
            label = label[0]
        label = torch.tensor(label, dtype=torch.long)
        sample = mx.image.imdecode(img).asnumpy()
        if self.transform is not None:
            sample = self.transform(sample)
        return sample, label

    def __len__(self):
        return len(self.imgidx)
|
||||
|
||||
|
||||
class SyntheticDataset(Dataset):
    def __init__(self):
        super(SyntheticDataset, self).__init__()
        img = np.random.randint(0, 255, size=(112, 112, 3), dtype=np.int32)
        img = np.transpose(img, (2, 0, 1))
        img = torch.from_numpy(img).squeeze(0).float()
        img = ((img / 255) - 0.5) / 0.5
        self.img = img
        self.label = 1

    def __getitem__(self, index):
        return self.img, self.label

    def __len__(self):
        return 1000000
|
||||
|
||||
|
||||
def dali_data_iter(
    batch_size: int, rec_file: str, idx_file: str, num_threads: int,
    initial_fill=32768, random_shuffle=True,
    prefetch_queue_depth=1, local_rank=0, name="reader",
    mean=(127.5, 127.5, 127.5),
    std=(127.5, 127.5, 127.5)):
    """
    Parameters:
    ----------
    initial_fill: int
        Size of the buffer that is used for shuffling. If random_shuffle is False, this parameter is ignored.

    """
    rank: int = distributed.get_rank()
    world_size: int = distributed.get_world_size()
    import nvidia.dali.fn as fn
    import nvidia.dali.types as types
    from nvidia.dali.pipeline import Pipeline
    from nvidia.dali.plugin.pytorch import DALIClassificationIterator

    pipe = Pipeline(
        batch_size=batch_size, num_threads=num_threads,
        device_id=local_rank, prefetch_queue_depth=prefetch_queue_depth, )
    condition_flip = fn.random.coin_flip(probability=0.5)
    with pipe:
        jpegs, labels = fn.readers.mxnet(
            path=rec_file, index_path=idx_file, initial_fill=initial_fill,
            num_shards=world_size, shard_id=rank,
            random_shuffle=random_shuffle, pad_last_batch=False, name=name)
        images = fn.decoders.image(jpegs, device="mixed", output_type=types.RGB)
        images = fn.crop_mirror_normalize(
            images, dtype=types.FLOAT, mean=mean, std=std, mirror=condition_flip)
        pipe.set_outputs(images, labels)
    pipe.build()
    return DALIWarper(DALIClassificationIterator(pipelines=[pipe], reader_name=name, ))
|
||||
|
||||
|
||||
@torch.no_grad()
class DALIWarper(object):
    def __init__(self, dali_iter):
        self.iter = dali_iter

    def __next__(self):
        data_dict = self.iter.__next__()[0]
        tensor_data = data_dict['data'].cuda()
        tensor_label: torch.Tensor = data_dict['label'].cuda().long()
        tensor_label.squeeze_()
        return tensor_data, tensor_label

    def __iter__(self):
        return self

    def reset(self):
        self.iter.reset()
|
||||
@@ -0,0 +1,31 @@
|
||||
## Eval on ICCV2021-MFR
|
||||
|
||||
coming soon.
|
||||
|
||||
|
||||
## Eval IJBC
|
||||
You can evaluate IJB-C with either PyTorch or ONNX.
|
||||
|
||||
|
||||
1. Eval IJBC With Onnx
|
||||
```shell
|
||||
CUDA_VISIBLE_DEVICES=0 python onnx_ijbc.py --model-root ms1mv3_arcface_r50 --image-path IJB_release/IJBC --result-dir ms1mv3_arcface_r50
|
||||
```
|
||||
|
||||
2. Eval IJBC With Pytorch
|
||||
```shell
|
||||
CUDA_VISIBLE_DEVICES=0,1 python eval_ijbc.py \
|
||||
--model-prefix ms1mv3_arcface_r50/backbone.pth \
|
||||
--image-path IJB_release/IJBC \
|
||||
--result-dir ms1mv3_arcface_r50 \
|
||||
--batch-size 128 \
|
||||
--job ms1mv3_arcface_r50 \
|
||||
--target IJBC \
|
||||
--network iresnet50
|
||||
```
|
||||
|
||||
## Inference
|
||||
|
||||
```shell
|
||||
python inference.py --weight ms1mv3_arcface_r50/backbone.pth --network r50
|
||||
```
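
Assuming the inference step yields one embedding vector per face image (512 values with `config.embedding_size = 512`), two faces are commonly compared by the cosine similarity of their embeddings. The sketch below is not part of this repository: `cosine_similarity` is a hypothetical helper, and it assumes the embeddings are already available as plain sequences of floats.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)
```

Values close to 1 suggest the two images show the same identity; the decision threshold depends on the model and the application, so it is left out of this sketch.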
|
||||
@@ -0,0 +1,51 @@
|
||||
## v1.8.0
|
||||
### Linux and Windows
|
||||
```shell
|
||||
# CUDA 11.1
|
||||
pip --default-timeout=100 install torch==1.8.0+cu111 torchvision==0.9.0+cu111 torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
|
||||
|
||||
# CUDA 10.2
|
||||
pip --default-timeout=100 install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0
|
||||
|
||||
# CPU only
|
||||
pip --default-timeout=100 install torch==1.8.0+cpu torchvision==0.9.0+cpu torchaudio==0.8.0 -f https://download.pytorch.org/whl/torch_stable.html
|
||||
|
||||
```
|
||||
|
||||
|
||||
## v1.7.1
|
||||
### Linux and Windows
|
||||
```shell
|
||||
# CUDA 11.0
|
||||
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
|
||||
|
||||
# CUDA 10.2
|
||||
pip install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2
|
||||
|
||||
# CUDA 10.1
|
||||
pip install torch==1.7.1+cu101 torchvision==0.8.2+cu101 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
|
||||
|
||||
# CUDA 9.2
|
||||
pip install torch==1.7.1+cu92 torchvision==0.8.2+cu92 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
|
||||
|
||||
# CPU only
|
||||
pip install torch==1.7.1+cpu torchvision==0.8.2+cpu torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
|
||||
```
|
||||
|
||||
|
||||
## v1.6.0
|
||||
|
||||
### Linux and Windows
|
||||
```shell
|
||||
# CUDA 10.2
|
||||
pip install torch==1.6.0 torchvision==0.7.0
|
||||
|
||||
# CUDA 10.1
|
||||
pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
|
||||
|
||||
# CUDA 9.2
|
||||
pip install torch==1.6.0+cu92 torchvision==0.7.0+cu92 -f https://download.pytorch.org/whl/torch_stable.html
|
||||
|
||||
# CPU only
|
||||
pip install torch==1.6.0+cpu torchvision==0.7.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
|
||||
```
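
After installing, it can be useful to confirm that the installed version satisfies the `torch>=1.6.0` requirement. The following is a minimal sketch using only the standard library; `parse_version` and `meets_minimum` are hypothetical helpers, not part of PyTorch or of this repository. It assumes version strings of the form used above, such as `1.8.0+cu111`.

```python
import re
from typing import Tuple

MIN_TORCH = (1, 6, 0)  # minimum version named in the requirements


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.8.0+cu111' into (1, 8, 0), ignoring the local build suffix."""
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version: str, minimum: Tuple[int, ...] = MIN_TORCH) -> bool:
    """Return True when the version is at least the minimum."""
    return parse_version(version) >= minimum
```

In practice you would call `meets_minimum(torch.__version__)` after `import torch`.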
|
||||
@@ -0,0 +1 @@
|
||||
TODO
|
||||
@@ -0,0 +1,22 @@
|
||||
|
||||
|
||||
|
||||
## 1. Download Datasets and Unzip
|
||||
|
||||
Download WebFace42M from [https://www.face-benchmark.org/download.html](https://www.face-benchmark.org/download.html).
|
||||
|
||||
|
||||
## 2. Create **Pre-shuffle** Rec File for DALI
|
||||
|
||||
Note: a pre-shuffled rec file is essential for DALI; a rec file that has not been pre-shuffled can degrade performance. The original InsightFace-style rec files do not support NVIDIA DALI, so you must generate a pre-shuffled rec file with [mxnet.tools.im2rec](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py), using the commands below.
|
||||
|
||||
```shell
|
||||
# 1) Create train.lst with the following command
|
||||
python -m mxnet.tools.im2rec --list --recursive train "Your WebFace42M Root"
|
||||
|
||||
# 2) Create train.rec and train.idx from train.lst with the following command
|
||||
python -m mxnet.tools.im2rec --num-thread 16 --quality 100 train "Your WebFace42M Root"
|
||||
```
|
||||
|
||||
When both commands finish, you will have three files: `train.lst`, `train.rec`, and `train.idx`. Only `train.rec` and `train.idx` are used for training.
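
Before launching training, a quick check that both record files are present in the dataset directory can save a failed run. This is a minimal sketch under the assumption that the directory passed as `config.rec` must contain `train.rec` and `train.idx`; `missing_rec_files` is a hypothetical helper, not part of this repository.

```python
import os
from typing import List

REQUIRED_FILES = ("train.rec", "train.idx")  # the two files used for training


def missing_rec_files(root_dir: str) -> List[str]:
    """Return the required record files that are absent from root_dir."""
    return [
        name for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(root_dir, name))
    ]
```

An empty list means both files were found; `train.lst` is deliberately not checked because it is only an intermediate product.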
|
||||
@@ -0,0 +1,93 @@
|
||||
## Test Training Speed
|
||||
|
||||
- Test Commands
|
||||
|
||||
Use the following two commands to test Partial FC training performance. The test uses **3 million** identities (synthetic data), mixed-precision training, a ResNet-50 backbone, and a batch size of 1024.
|
||||
```shell
|
||||
# Model Parallel
|
||||
python -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --node_rank=0 --master_addr="127.0.0.1" --master_port=1234 train.py configs/3millions
|
||||
# Partial FC 0.1
|
||||
python -m torch.distributed.launch --nproc_per_node=8 --nnodes=1 --node_rank=0 --master_addr="127.0.0.1" --master_port=1234 train.py configs/3millions_pfc
|
||||
```
|
||||
|
||||
- GPU Memory
|
||||
|
||||
```
|
||||
# (Model Parallel) gpustat -i
|
||||
[0] Tesla V100-SXM2-32GB | 64'C, 94 % | 30338 / 32510 MB
|
||||
[1] Tesla V100-SXM2-32GB | 60'C, 99 % | 28876 / 32510 MB
|
||||
[2] Tesla V100-SXM2-32GB | 60'C, 99 % | 28872 / 32510 MB
|
||||
[3] Tesla V100-SXM2-32GB | 69'C, 99 % | 28872 / 32510 MB
|
||||
[4] Tesla V100-SXM2-32GB | 66'C, 99 % | 28888 / 32510 MB
|
||||
[5] Tesla V100-SXM2-32GB | 60'C, 99 % | 28932 / 32510 MB
|
||||
[6] Tesla V100-SXM2-32GB | 68'C, 100 % | 28916 / 32510 MB
|
||||
[7] Tesla V100-SXM2-32GB | 65'C, 99 % | 28860 / 32510 MB
|
||||
|
||||
# (Partial FC 0.1) gpustat -i
|
||||
[0] Tesla V100-SXM2-32GB | 60'C, 95 % | 10488 / 32510 MB
[1] Tesla V100-SXM2-32GB | 60'C, 97 % | 10344 / 32510 MB
[2] Tesla V100-SXM2-32GB | 61'C, 95 % | 10340 / 32510 MB
[3] Tesla V100-SXM2-32GB | 66'C, 95 % | 10340 / 32510 MB
[4] Tesla V100-SXM2-32GB | 65'C, 94 % | 10356 / 32510 MB
[5] Tesla V100-SXM2-32GB | 61'C, 95 % | 10400 / 32510 MB
[6] Tesla V100-SXM2-32GB | 68'C, 96 % | 10384 / 32510 MB
[7] Tesla V100-SXM2-32GB | 64'C, 95 % | 10328 / 32510 MB
|
||||
```
|
||||
|
||||
- Training Speed
|
||||
|
||||
```
# (Model Parallel) training.log
|
||||
Training: Speed 2271.33 samples/sec Loss 1.1624 LearningRate 0.2000 Epoch: 0 Global Step: 100
|
||||
Training: Speed 2269.94 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 150
|
||||
Training: Speed 2272.67 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 200
|
||||
Training: Speed 2266.55 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 250
|
||||
Training: Speed 2272.54 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 300
|
||||
|
||||
# (Partial FC 0.1) training.log
|
||||
Training: Speed 5299.56 samples/sec Loss 1.0965 LearningRate 0.2000 Epoch: 0 Global Step: 100
|
||||
Training: Speed 5296.37 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 150
|
||||
Training: Speed 5304.37 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 200
|
||||
Training: Speed 5274.43 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 250
|
||||
Training: Speed 5300.10 samples/sec Loss 0.0000 LearningRate 0.2000 Epoch: 0 Global Step: 300
|
||||
```
|
||||
|
||||
In this test case, Partial FC 0.1 uses only about 1/3 of the GPU memory of model parallel (roughly 10,400 MB vs. 29,000 MB per GPU), and its training speed is about 2.3 times that of model parallel (roughly 5,300 vs. 2,270 samples/sec).
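
The two ratios can be re-derived from the numbers logged above. The snippet below is only a worked calculation: the speeds are copied from the two training logs, and the memory figures are the GPU 0 readings from the two `gpustat` outputs.

```python
from statistics import mean

# Speeds (samples/sec) copied from the training logs above.
model_parallel = [2271.33, 2269.94, 2272.67, 2266.55, 2272.54]
partial_fc = [5299.56, 5296.37, 5304.37, 5274.43, 5300.10]

# Memory used on GPU 0 (MB) copied from the gpustat output above.
model_parallel_mem_mb = 30338
partial_fc_mem_mb = 10488

speedup = mean(partial_fc) / mean(model_parallel)
memory_ratio = partial_fc_mem_mb / model_parallel_mem_mb

print(f"speedup: {speedup:.2f}x")            # speedup: 2.33x
print(f"memory ratio: {memory_ratio:.2f}")   # memory ratio: 0.35
```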
|
||||
|
||||
|
||||
## Speed Benchmark
|
||||
|
||||
1. Training speed of different parallel methods (samples/second), Tesla V100 32GB * 8. (Larger is better)
|
||||
|
||||
| Number of Identities in Dataset | Data Parallel | Model Parallel | Partial FC 0.1 |
|
||||
| :--- | :--- | :--- | :--- |
|
||||
|125000 | 4681 | 4824 | 5004 |
|
||||
|250000 | 4047 | 4521 | 4976 |
|
||||
|500000 | 3087 | 4013 | 4900 |
|
||||
|1000000 | 2090 | 3449 | 4803 |
|
||||
|1400000 | 1672 | 3043 | 4738 |
|
||||
|2000000 | - | 2593 | 4626 |
|
||||
|4000000 | - | 1748 | 4208 |
|
||||
|5500000 | - | 1389 | 3975 |
|
||||
|8000000 | - | - | 3565 |
|
||||
|16000000 | - | - | 2679 |
|
||||
|29000000 | - | - | 1855 |
|
||||
|
||||
2. GPU memory cost of different parallel methods (MB per GPU), Tesla V100 32GB * 8. (Smaller is better)
|
||||
|
||||
| Number of Identities in Dataset | Data Parallel | Model Parallel | Partial FC 0.1 |
|
||||
| :--- | :--- | :--- | :--- |
|
||||
|125000 | 7358 | 5306 | 4868 |
|
||||
|250000 | 9940 | 5826 | 5004 |
|
||||
|500000 | 14220 | 7114 | 5202 |
|
||||
|1000000 | 23708 | 9966 | 5620 |
|
||||
|1400000 | 32252 | 11178 | 6056 |
|
||||
|2000000 | - | 13978 | 6472 |
|
||||
|4000000 | - | 23238 | 8284 |
|
||||
|5500000 | - | 32188 | 9854 |
|
||||
|8000000 | - | - | 12310 |
|
||||
|16000000 | - | - | 19950 |
|
||||
|29000000 | - | - | 32324 |
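
One way to read the training-speed table is as the ratio of Partial FC 0.1 throughput to model-parallel throughput at each dataset size. The sketch below copies the rows that report both methods and computes that ratio; `speedups` is a hypothetical helper introduced only for this illustration.

```python
# (identities, model-parallel samples/sec, Partial FC 0.1 samples/sec),
# copied from the rows of the training-speed table that report both methods.
rows = [
    (125_000, 4824, 5004),
    (250_000, 4521, 4976),
    (500_000, 4013, 4900),
    (1_000_000, 3449, 4803),
    (1_400_000, 3043, 4738),
    (2_000_000, 2593, 4626),
    (4_000_000, 1748, 4208),
    (5_500_000, 1389, 3975),
]


def speedups(table):
    """Map identity count to Partial FC throughput divided by model-parallel throughput."""
    return {n: pfc / mp for n, mp, pfc in table}


for n, ratio in speedups(rows).items():
    print(f"{n:>9,d} identities: {ratio:.2f}x")
```

On these rows the ratio grows steadily with the number of identities, from about 1.04x at 125,000 identities to about 2.86x at 5,500,000.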
|
||||
@@ -0,0 +1,409 @@
|
||||
"""Helper for evaluation on the Labeled Faces in the Wild dataset
|
||||
"""
|
||||
|
||||
# MIT License
|
||||
#
|
||||
# Copyright (c) 2016 David Sandberg
|
||||
#
|
||||
# Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
# of this software and associated documentation files (the "Software"), to deal
|
||||
# in the Software without restriction, including without limitation the rights
|
||||
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
# copies of the Software, and to permit persons to whom the Software is
|
||||
# furnished to do so, subject to the following conditions:
|
||||
#
|
||||
# The above copyright notice and this permission notice shall be included in all
|
||||
# copies or substantial portions of the Software.
|
||||
#
|
||||
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
# SOFTWARE.
|
||||
|
||||
|
||||
import datetime
|
||||
import os
|
||||
import pickle
|
||||
|
||||
import mxnet as mx
|
||||
import numpy as np
|
||||
import sklearn
|
||||
import torch
|
||||
from mxnet import ndarray as nd
|
||||
from scipy import interpolate
|
||||
from sklearn.decomposition import PCA
|
||||
from sklearn.model_selection import KFold
|
||||
|
||||
|
||||
class LFold:
    def __init__(self, n_splits=2, shuffle=False):
        self.n_splits = n_splits
        if self.n_splits > 1:
            self.k_fold = KFold(n_splits=n_splits, shuffle=shuffle)

    def split(self, indices):
        if self.n_splits > 1:
            return self.k_fold.split(indices)
        else:
            return [(indices, indices)]
|
||||
|
||||
|
||||
def calculate_roc(thresholds,
                  embeddings1,
                  embeddings2,
                  actual_issame,
                  nrof_folds=10,
                  pca=0):
    assert (embeddings1.shape[0] == embeddings2.shape[0])
    assert (embeddings1.shape[1] == embeddings2.shape[1])
    nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
    nrof_thresholds = len(thresholds)
    k_fold = LFold(n_splits=nrof_folds, shuffle=False)

    tprs = np.zeros((nrof_folds, nrof_thresholds))
    fprs = np.zeros((nrof_folds, nrof_thresholds))
    accuracy = np.zeros((nrof_folds))
    indices = np.arange(nrof_pairs)

    if pca == 0:
        diff = np.subtract(embeddings1, embeddings2)
        dist = np.sum(np.square(diff), 1)

    for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):
        if pca > 0:
            print('doing pca on', fold_idx)
            embed1_train = embeddings1[train_set]
            embed2_train = embeddings2[train_set]
            _embed_train = np.concatenate((embed1_train, embed2_train), axis=0)
            pca_model = PCA(n_components=pca)
            pca_model.fit(_embed_train)
            embed1 = pca_model.transform(embeddings1)
            embed2 = pca_model.transform(embeddings2)
            embed1 = sklearn.preprocessing.normalize(embed1)
            embed2 = sklearn.preprocessing.normalize(embed2)
            diff = np.subtract(embed1, embed2)
            dist = np.sum(np.square(diff), 1)

        # Find the best threshold for the fold
        acc_train = np.zeros((nrof_thresholds))
        for threshold_idx, threshold in enumerate(thresholds):
            _, _, acc_train[threshold_idx] = calculate_accuracy(
                threshold, dist[train_set], actual_issame[train_set])
        best_threshold_index = np.argmax(acc_train)
        for threshold_idx, threshold in enumerate(thresholds):
            tprs[fold_idx, threshold_idx], fprs[fold_idx, threshold_idx], _ = calculate_accuracy(
                threshold, dist[test_set],
                actual_issame[test_set])
        _, _, accuracy[fold_idx] = calculate_accuracy(
            thresholds[best_threshold_index], dist[test_set],
            actual_issame[test_set])

    tpr = np.mean(tprs, 0)
    fpr = np.mean(fprs, 0)
    return tpr, fpr, accuracy
|
||||
|
||||
|
||||
def calculate_accuracy(threshold, dist, actual_issame):
    predict_issame = np.less(dist, threshold)
    tp = np.sum(np.logical_and(predict_issame, actual_issame))
    fp = np.sum(np.logical_and(predict_issame, np.logical_not(actual_issame)))
    tn = np.sum(
        np.logical_and(np.logical_not(predict_issame),
                       np.logical_not(actual_issame)))
    fn = np.sum(np.logical_and(np.logical_not(predict_issame), actual_issame))

    tpr = 0 if (tp + fn == 0) else float(tp) / float(tp + fn)
    fpr = 0 if (fp + tn == 0) else float(fp) / float(fp + tn)
    acc = float(tp + tn) / dist.size
    return tpr, fpr, acc
|
||||
|
||||
|
||||
def calculate_val(thresholds,
                  embeddings1,
                  embeddings2,
                  actual_issame,
                  far_target,
                  nrof_folds=10):
    assert (embeddings1.shape[0] == embeddings2.shape[0])
    assert (embeddings1.shape[1] == embeddings2.shape[1])
    nrof_pairs = min(len(actual_issame), embeddings1.shape[0])
    nrof_thresholds = len(thresholds)
    k_fold = LFold(n_splits=nrof_folds, shuffle=False)

    val = np.zeros(nrof_folds)
    far = np.zeros(nrof_folds)

    diff = np.subtract(embeddings1, embeddings2)
    dist = np.sum(np.square(diff), 1)
    indices = np.arange(nrof_pairs)

    for fold_idx, (train_set, test_set) in enumerate(k_fold.split(indices)):

        # Find the threshold that gives FAR = far_target
        far_train = np.zeros(nrof_thresholds)
        for threshold_idx, threshold in enumerate(thresholds):
            _, far_train[threshold_idx] = calculate_val_far(
                threshold, dist[train_set], actual_issame[train_set])
        if np.max(far_train) >= far_target:
            f = interpolate.interp1d(far_train, thresholds, kind='slinear')
            threshold = f(far_target)
        else:
            threshold = 0.0

        val[fold_idx], far[fold_idx] = calculate_val_far(
            threshold, dist[test_set], actual_issame[test_set])

    val_mean = np.mean(val)
    far_mean = np.mean(far)
    val_std = np.std(val)
    return val_mean, val_std, far_mean
|
||||
|
||||
|
||||
def calculate_val_far(threshold, dist, actual_issame):
    predict_issame = np.less(dist, threshold)
    true_accept = np.sum(np.logical_and(predict_issame, actual_issame))
    false_accept = np.sum(
        np.logical_and(predict_issame, np.logical_not(actual_issame)))
    n_same = np.sum(actual_issame)
    n_diff = np.sum(np.logical_not(actual_issame))
    # print(true_accept, false_accept)
    # print(n_same, n_diff)
    val = float(true_accept) / float(n_same)
    far = float(false_accept) / float(n_diff)
    return val, far
|
||||
|
||||
|
||||
def evaluate(embeddings, actual_issame, nrof_folds=10, pca=0):
    # Calculate evaluation metrics
    thresholds = np.arange(0, 4, 0.01)
    embeddings1 = embeddings[0::2]
    embeddings2 = embeddings[1::2]
    tpr, fpr, accuracy = calculate_roc(thresholds,
                                       embeddings1,
                                       embeddings2,
                                       np.asarray(actual_issame),
                                       nrof_folds=nrof_folds,
                                       pca=pca)
    thresholds = np.arange(0, 4, 0.001)
    val, val_std, far = calculate_val(thresholds,
                                      embeddings1,
                                      embeddings2,
                                      np.asarray(actual_issame),
                                      1e-3,
                                      nrof_folds=nrof_folds)
    return tpr, fpr, accuracy, val, val_std, far
|
||||
|
||||
@torch.no_grad()
def load_bin(path, image_size):
    try:
        with open(path, 'rb') as f:
            bins, issame_list = pickle.load(f)  # py2
    except UnicodeDecodeError as e:
        with open(path, 'rb') as f:
            bins, issame_list = pickle.load(f, encoding='bytes')  # py3
    data_list = []
    for flip in [0, 1]:
        data = torch.empty((len(issame_list) * 2, 3, image_size[0], image_size[1]))
        data_list.append(data)
    for idx in range(len(issame_list) * 2):
        _bin = bins[idx]
        img = mx.image.imdecode(_bin)
        if img.shape[1] != image_size[0]:
            img = mx.image.resize_short(img, image_size[0])
        img = nd.transpose(img, axes=(2, 0, 1))
        for flip in [0, 1]:
            if flip == 1:
                img = mx.ndarray.flip(data=img, axis=2)
            data_list[flip][idx][:] = torch.from_numpy(img.asnumpy())
        if idx % 1000 == 0:
            print('loading bin', idx)
    print(data_list[0].shape)
    return data_list, issame_list
|
||||
|
||||
@torch.no_grad()
def test(data_set, backbone, batch_size, nfolds=10):
    print('testing verification..')
    data_list = data_set[0]
    issame_list = data_set[1]
    embeddings_list = []
    time_consumed = 0.0
    for i in range(len(data_list)):
        data = data_list[i]
        embeddings = None
        ba = 0
        while ba < data.shape[0]:
            bb = min(ba + batch_size, data.shape[0])
            count = bb - ba
            _data = data[bb - batch_size: bb]
            time0 = datetime.datetime.now()
            img = ((_data / 255) - 0.5) / 0.5
            net_out: torch.Tensor = backbone(img)
            _embeddings = net_out.detach().cpu().numpy()
            time_now = datetime.datetime.now()
            diff = time_now - time0
            time_consumed += diff.total_seconds()
            if embeddings is None:
                embeddings = np.zeros((data.shape[0], _embeddings.shape[1]))
            embeddings[ba:bb, :] = _embeddings[(batch_size - count):, :]
            ba = bb
        embeddings_list.append(embeddings)

    _xnorm = 0.0
    _xnorm_cnt = 0
    for embed in embeddings_list:
        for i in range(embed.shape[0]):
            _em = embed[i]
            _norm = np.linalg.norm(_em)
            _xnorm += _norm
            _xnorm_cnt += 1
    _xnorm /= _xnorm_cnt

    embeddings = embeddings_list[0].copy()
    embeddings = sklearn.preprocessing.normalize(embeddings)
    acc1 = 0.0
    std1 = 0.0
    embeddings = embeddings_list[0] + embeddings_list[1]
    embeddings = sklearn.preprocessing.normalize(embeddings)
    print(embeddings.shape)
    print('infer time', time_consumed)
    _, _, accuracy, val, val_std, far = evaluate(embeddings, issame_list, nrof_folds=nfolds)
    acc2, std2 = np.mean(accuracy), np.std(accuracy)
    return acc1, std1, acc2, std2, _xnorm, embeddings_list
|
||||
|
||||
|
||||
def dumpR(data_set,
|
||||
backbone,
|
||||
batch_size,
|
||||
name='',
|
||||
data_extra=None,
|
||||
label_shape=None):
|
||||
print('dump verification embedding..')
|
||||
data_list = data_set[0]
|
||||
issame_list = data_set[1]
|
||||
embeddings_list = []
|
||||
time_consumed = 0.0
|
||||
for i in range(len(data_list)):
|
||||
data = data_list[i]
|
||||
embeddings = None
|
||||
ba = 0
|
||||
while ba < data.shape[0]:
|
||||
bb = min(ba + batch_size, data.shape[0])
|
||||
count = bb - ba
|
||||
|
||||
# Take the last `batch_size` rows ending at bb so every batch has a fixed size.
_data = data[bb - batch_size: bb]
time0 = datetime.datetime.now()
# Normalize pixels to [-1, 1] and run the PyTorch backbone.
# This replaces the legacy MXNet Module call, which referenced the undefined names `model` and `_label`.
img = ((_data / 255) - 0.5) / 0.5
net_out: torch.Tensor = backbone(img)
_embeddings = net_out.detach().cpu().numpy()
|
||||
time_now = datetime.datetime.now()
|
||||
diff = time_now - time0
|
||||
time_consumed += diff.total_seconds()
|
||||
if embeddings is None:
|
||||
embeddings = np.zeros((data.shape[0], _embeddings.shape[1]))
|
||||
embeddings[ba:bb, :] = _embeddings[(batch_size - count):, :]
|
||||
ba = bb
|
||||
embeddings_list.append(embeddings)
|
||||
embeddings = embeddings_list[0] + embeddings_list[1]
|
||||
embeddings = sklearn.preprocessing.normalize(embeddings)
|
||||
actual_issame = np.asarray(issame_list)
|
||||
outname = os.path.join('temp.bin')
|
||||
with open(outname, 'wb') as f:
|
||||
pickle.dump((embeddings, issame_list),
|
||||
f,
|
||||
protocol=pickle.HIGHEST_PROTOCOL)
|
||||
|
||||
|
||||
# if __name__ == '__main__':
|
||||
#
|
||||
# parser = argparse.ArgumentParser(description='do verification')
|
||||
# # general
|
||||
# parser.add_argument('--data-dir', default='', help='')
|
||||
# parser.add_argument('--model',
|
||||
# default='../model/softmax,50',
|
||||
# help='path to load model.')
|
||||
# parser.add_argument('--target',
|
||||
# default='lfw,cfp_ff,cfp_fp,agedb_30',
|
||||
# help='test targets.')
|
||||
# parser.add_argument('--gpu', default=0, type=int, help='gpu id')
|
||||
# parser.add_argument('--batch-size', default=32, type=int, help='')
|
||||
# parser.add_argument('--max', default='', type=str, help='')
|
||||
# parser.add_argument('--mode', default=0, type=int, help='')
|
||||
# parser.add_argument('--nfolds', default=10, type=int, help='')
|
||||
# args = parser.parse_args()
|
||||
# image_size = [112, 112]
|
||||
# print('image_size', image_size)
|
||||
# ctx = mx.gpu(args.gpu)
|
||||
# nets = []
|
||||
# vec = args.model.split(',')
|
||||
# prefix = args.model.split(',')[0]
|
||||
# epochs = []
|
||||
# if len(vec) == 1:
|
||||
# pdir = os.path.dirname(prefix)
|
||||
# for fname in os.listdir(pdir):
|
||||
# if not fname.endswith('.params'):
|
||||
# continue
|
||||
# _file = os.path.join(pdir, fname)
|
||||
# if _file.startswith(prefix):
|
||||
# epoch = int(fname.split('.')[0].split('-')[1])
|
||||
# epochs.append(epoch)
|
||||
# epochs = sorted(epochs, reverse=True)
|
||||
# if len(args.max) > 0:
|
||||
# _max = [int(x) for x in args.max.split(',')]
|
||||
# assert len(_max) == 2
|
||||
# if len(epochs) > _max[1]:
|
||||
# epochs = epochs[_max[0]:_max[1]]
|
||||
#
|
||||
# else:
|
||||
# epochs = [int(x) for x in vec[1].split('|')]
|
||||
# print('model number', len(epochs))
|
||||
# time0 = datetime.datetime.now()
|
||||
# for epoch in epochs:
|
||||
# print('loading', prefix, epoch)
|
||||
# sym, arg_params, aux_params = mx.model.load_checkpoint(prefix, epoch)
|
||||
# # arg_params, aux_params = ch_dev(arg_params, aux_params, ctx)
|
||||
# all_layers = sym.get_internals()
|
||||
# sym = all_layers['fc1_output']
|
||||
# model = mx.mod.Module(symbol=sym, context=ctx, label_names=None)
|
||||
# # model.bind(data_shapes=[('data', (args.batch_size, 3, image_size[0], image_size[1]))], label_shapes=[('softmax_label', (args.batch_size,))])
|
||||
# model.bind(data_shapes=[('data', (args.batch_size, 3, image_size[0],
|
||||
# image_size[1]))])
|
||||
# model.set_params(arg_params, aux_params)
|
||||
# nets.append(model)
|
||||
# time_now = datetime.datetime.now()
|
||||
# diff = time_now - time0
|
||||
# print('model loading time', diff.total_seconds())
|
||||
#
|
||||
# ver_list = []
|
||||
# ver_name_list = []
|
||||
# for name in args.target.split(','):
|
||||
# path = os.path.join(args.data_dir, name + ".bin")
|
||||
# if os.path.exists(path):
|
||||
# print('loading.. ', name)
|
||||
# data_set = load_bin(path, image_size)
|
||||
# ver_list.append(data_set)
|
||||
# ver_name_list.append(name)
|
||||
#
|
||||
# if args.mode == 0:
|
||||
# for i in range(len(ver_list)):
|
||||
# results = []
|
||||
# for model in nets:
|
||||
# acc1, std1, acc2, std2, xnorm, embeddings_list = test(
|
||||
# ver_list[i], model, args.batch_size, args.nfolds)
|
||||
# print('[%s]XNorm: %f' % (ver_name_list[i], xnorm))
|
||||
# print('[%s]Accuracy: %1.5f+-%1.5f' % (ver_name_list[i], acc1, std1))
|
||||
# print('[%s]Accuracy-Flip: %1.5f+-%1.5f' % (ver_name_list[i], acc2, std2))
|
||||
# results.append(acc2)
|
||||
# print('Max of [%s] is %1.5f' % (ver_name_list[i], np.max(results)))
|
||||
# elif args.mode == 1:
|
||||
# raise ValueError
|
||||
# else:
|
||||
# model = nets[0]
|
||||
# dumpR(ver_list[0], model, args.batch_size, args.target)
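The batching loop in `test` earlier in this file always feeds the backbone a full window of `batch_size` rows: the final window is shifted back so that it ends at the last row, and only the trailing `count` outputs are kept. The following is a minimal pure-Python sketch of that indexing, not the repository's code. `embed_in_fixed_batches` and `toy_model` are hypothetical names used only for illustration, and the sketch assumes the input has at least `batch_size` rows.

```python
def embed_in_fixed_batches(rows, batch_size, model):
    """Run `model` over `rows` in windows that always contain `batch_size` items."""
    n = len(rows)
    if n < batch_size:
        # Assumption of this sketch: a negative window start is not handled.
        raise ValueError("need at least batch_size rows")
    out = [None] * n
    ba = 0
    while ba < n:
        bb = min(ba + batch_size, n)
        count = bb - ba                      # rows not yet embedded in this step
        window = rows[bb - batch_size:bb]    # full-size window ending at bb
        feats = model(window)
        out[ba:bb] = feats[batch_size - count:]  # keep only the new rows
        ba = bb
    return out


rows = list(range(10))
seen_sizes = []


def toy_model(window):
    # Stand-in for the backbone: records the batch size and scales each row.
    seen_sizes.append(len(window))
    return [x * 10 for x in window]


result = embed_in_fixed_batches(rows, 4, toy_model)
print(result)
print(seen_sizes)
```

The last window overlaps rows that were already embedded; recomputing them costs a little time but keeps the input shape constant for every forward pass.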
|
||||
@@ -0,0 +1,483 @@
|
||||
# coding: utf-8
|
||||
|
||||
import os
|
||||
import pickle
|
||||
|
||||
import matplotlib
|
||||
import pandas as pd
|
||||
|
||||
matplotlib.use('Agg')
|
||||
import matplotlib.pyplot as plt
|
||||
import timeit
|
||||
import sklearn
|
||||
import argparse
|
||||
import cv2
|
||||
import numpy as np
|
||||
import torch
|
||||
from skimage import transform as trans
|
||||
from backbones import get_model
|
||||
from sklearn.metrics import roc_curve, auc
|
||||
|
||||
from menpo.visualize.viewmatplotlib import sample_colours_from_colourmap
|
||||
from prettytable import PrettyTable
|
||||
from pathlib import Path
|
||||
|
||||
import sys
|
||||
import warnings
|
||||
|
||||
sys.path.insert(0, "../")
|
||||
warnings.filterwarnings("ignore")
|
||||
|
||||
parser = argparse.ArgumentParser(description='do ijb test')
|
||||
# general
|
||||
parser.add_argument('--model-prefix', default='', help='path to load model.')
|
||||
parser.add_argument('--image-path', default='', type=str, help='')
|
||||
parser.add_argument('--result-dir', default='.', type=str, help='')
|
||||
parser.add_argument('--batch-size', default=128, type=int, help='')
|
||||
parser.add_argument('--network', default='iresnet50', type=str, help='')
|
||||
parser.add_argument('--job', default='insightface', type=str, help='job name')
|
||||
parser.add_argument('--target', default='IJBC', type=str, help='target, set to IJBC or IJBB')
|
||||
args = parser.parse_args()
|
||||
|
||||
target = args.target
|
||||
model_path = args.model_prefix
|
||||
image_path = args.image_path
|
||||
result_dir = args.result_dir
|
||||
gpu_id = None
|
||||
use_norm_score = True # if True, TestMode(N1)
|
||||
use_detector_score = True # if True, TestMode(D1)
|
||||
use_flip_test = True # if True, TestMode(F1)
|
||||
job = args.job
|
||||
batch_size = args.batch_size
|
||||
|
||||
|
||||
class Embedding(object):
|
||||
def __init__(self, prefix, data_shape, batch_size=1):
|
||||
image_size = (112, 112)
|
||||
self.image_size = image_size
|
||||
weight = torch.load(prefix)
|
||||
resnet = get_model(args.network, dropout=0, fp16=False).cuda()
|
||||
resnet.load_state_dict(weight)
|
||||
model = torch.nn.DataParallel(resnet)
|
||||
self.model = model
|
||||
self.model.eval()
|
||||
src = np.array([
|
||||
[30.2946, 51.6963],
|
||||
[65.5318, 51.5014],
|
||||
[48.0252, 71.7366],
|
||||
[33.5493, 92.3655],
|
||||
[62.7299, 92.2041]], dtype=np.float32)
|
||||
src[:, 0] += 8.0
|
||||
self.src = src
|
||||
self.batch_size = batch_size
|
||||
self.data_shape = data_shape
|
||||
|
||||
def get(self, rimg, landmark):
|
||||
|
||||
assert landmark.shape[0] == 68 or landmark.shape[0] == 5
|
||||
assert landmark.shape[1] == 2
|
||||
if landmark.shape[0] == 68:
|
||||
landmark5 = np.zeros((5, 2), dtype=np.float32)
|
||||
landmark5[0] = (landmark[36] + landmark[39]) / 2
|
||||
landmark5[1] = (landmark[42] + landmark[45]) / 2
|
||||
landmark5[2] = landmark[30]
|
||||
landmark5[3] = landmark[48]
|
||||
landmark5[4] = landmark[54]
|
||||
else:
|
||||
landmark5 = landmark
|
||||
tform = trans.SimilarityTransform()
|
||||
tform.estimate(landmark5, self.src)
|
||||
M = tform.params[0:2, :]
|
||||
img = cv2.warpAffine(rimg,
|
||||
M, (self.image_size[1], self.image_size[0]),
|
||||
borderValue=0.0)
|
||||
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
|
||||
img_flip = np.fliplr(img)
|
||||
img = np.transpose(img, (2, 0, 1)) # 3*112*112, RGB
|
||||
img_flip = np.transpose(img_flip, (2, 0, 1))
|
||||
input_blob = np.zeros((2, 3, self.image_size[1], self.image_size[0]), dtype=np.uint8)
|
||||
input_blob[0] = img
|
||||
input_blob[1] = img_flip
|
||||
return input_blob
|
||||
|
||||
@torch.no_grad()
|
||||
def forward_db(self, batch_data):
|
||||
imgs = torch.Tensor(batch_data).cuda()
|
||||
imgs.div_(255).sub_(0.5).div_(0.5)
|
||||
feat = self.model(imgs)
|
||||
feat = feat.reshape([self.batch_size, 2 * feat.shape[1]])
|
||||
return feat.cpu().numpy()
|
||||
|
||||
|
||||
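# Split a list as evenly as possible into n parts; the result always has length n, and when n exceeds the number of elements the extra parts are empty lists.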
|
||||
def divideIntoNstrand(listTemp, n):
|
||||
twoList = [[] for i in range(n)]
|
||||
for i, e in enumerate(listTemp):
|
||||
twoList[i % n].append(e)
|
||||
return twoList
|
||||
|
||||
|
||||
def read_template_media_list(path):
|
||||
# ijb_meta = np.loadtxt(path, dtype=str)
|
||||
ijb_meta = pd.read_csv(path, sep=' ', header=None).values
|
||||
templates = ijb_meta[:, 1].astype(int)  # np.int was removed in NumPy 1.24
|
||||
medias = ijb_meta[:, 2].astype(int)
|
||||
return templates, medias
|
||||
|
||||
|
||||
# In[ ]:
|
||||
|
||||
|
||||
def read_template_pair_list(path):
|
||||
# pairs = np.loadtxt(path, dtype=str)
|
||||
pairs = pd.read_csv(path, sep=' ', header=None).values
|
||||
# print(pairs.shape)
|
||||
# print(pairs[:, 0].astype(int))
|
||||
t1 = pairs[:, 0].astype(int)  # np.int was removed in NumPy 1.24
|
||||
t2 = pairs[:, 1].astype(int)
|
||||
label = pairs[:, 2].astype(int)
|
||||
return t1, t2, label
|
||||
|
||||
|
||||
# In[ ]:
|
||||
|
||||
|
||||
def read_image_feature(path):
|
||||
with open(path, 'rb') as fid:
|
||||
img_feats = pickle.load(fid)
|
||||
return img_feats
|
||||
|
||||
|
||||
# In[ ]:
|
||||
|
||||
|
||||
def get_image_feature(img_path, files_list, model_path, epoch, gpu_id):
|
||||
batch_size = args.batch_size
|
||||
data_shape = (3, 112, 112)
|
||||
|
||||
files = files_list
|
||||
print('files:', len(files))
|
||||
rare_size = len(files) % batch_size
|
||||
faceness_scores = []
|
||||
batch = 0
|
||||
img_feats = np.empty((len(files), 1024), dtype=np.float32)
|
||||
|
||||
batch_data = np.empty((2 * batch_size, 3, 112, 112))
|
||||
embedding = Embedding(model_path, data_shape, batch_size)
|
||||
for img_index, each_line in enumerate(files[:len(files) - rare_size]):
|
||||
name_lmk_score = each_line.strip().split(' ')
|
||||
img_name = os.path.join(img_path, name_lmk_score[0])
|
||||
img = cv2.imread(img_name)
|
||||
lmk = np.array([float(x) for x in name_lmk_score[1:-1]],
|
||||
dtype=np.float32)
|
||||
lmk = lmk.reshape((5, 2))
|
||||
input_blob = embedding.get(img, lmk)
|
||||
|
||||
batch_data[2 * (img_index - batch * batch_size)][:] = input_blob[0]
|
||||
batch_data[2 * (img_index - batch * batch_size) + 1][:] = input_blob[1]
|
||||
if (img_index + 1) % batch_size == 0:
|
||||
print('batch', batch)
|
||||
img_feats[batch * batch_size:batch * batch_size +
|
||||
batch_size][:] = embedding.forward_db(batch_data)
|
||||
batch += 1
|
||||
faceness_scores.append(name_lmk_score[-1])
|
||||
|
||||
batch_data = np.empty((2 * rare_size, 3, 112, 112))
|
||||
embedding = Embedding(model_path, data_shape, rare_size)
|
||||
for img_index, each_line in enumerate(files[len(files) - rare_size:]):
|
||||
name_lmk_score = each_line.strip().split(' ')
|
||||
img_name = os.path.join(img_path, name_lmk_score[0])
|
||||
img = cv2.imread(img_name)
|
||||
lmk = np.array([float(x) for x in name_lmk_score[1:-1]],
|
||||
dtype=np.float32)
|
||||
lmk = lmk.reshape((5, 2))
|
||||
input_blob = embedding.get(img, lmk)
|
||||
batch_data[2 * img_index][:] = input_blob[0]
|
||||
batch_data[2 * img_index + 1][:] = input_blob[1]
|
||||
if (img_index + 1) % rare_size == 0:
|
||||
print('batch', batch)
|
||||
img_feats[len(files) -
|
||||
rare_size:][:] = embedding.forward_db(batch_data)
|
||||
batch += 1
|
||||
faceness_scores.append(name_lmk_score[-1])
|
||||
faceness_scores = np.array(faceness_scores).astype(np.float32)
|
||||
# img_feats = np.ones( (len(files), 1024), dtype=np.float32) * 0.01
|
||||
# faceness_scores = np.ones( (len(files), ), dtype=np.float32 )
|
||||
return img_feats, faceness_scores
|
||||
|
||||
|
||||
# In[ ]:
|
||||
|
||||
|
||||
def image2template_feature(img_feats=None, templates=None, medias=None):
|
||||
# ==========================================================
|
||||
# 1. face image feature l2 normalization. img_feats:[number_image x feats_dim]
|
||||
# 2. compute media feature.
|
||||
# 3. compute template feature.
|
||||
# ==========================================================
|
||||
unique_templates = np.unique(templates)
|
||||
template_feats = np.zeros((len(unique_templates), img_feats.shape[1]))
|
||||
|
||||
for count_template, uqt in enumerate(unique_templates):
|
||||
|
||||
(ind_t,) = np.where(templates == uqt)
|
||||
face_norm_feats = img_feats[ind_t]
|
||||
face_medias = medias[ind_t]
|
||||
unique_medias, unique_media_counts = np.unique(face_medias,
|
||||
return_counts=True)
|
||||
media_norm_feats = []
|
||||
for u, ct in zip(unique_medias, unique_media_counts):
|
||||
(ind_m,) = np.where(face_medias == u)
|
||||
if ct == 1:
|
||||
media_norm_feats += [face_norm_feats[ind_m]]
|
||||
else: # image features from the same video will be aggregated into one feature
|
||||
media_norm_feats += [
|
||||
np.mean(face_norm_feats[ind_m], axis=0, keepdims=True)
|
||||
]
|
||||
media_norm_feats = np.array(media_norm_feats)
|
||||
# media_norm_feats = media_norm_feats / np.sqrt(np.sum(media_norm_feats ** 2, -1, keepdims=True))
|
||||
template_feats[count_template] = np.sum(media_norm_feats, axis=0)
|
||||
if count_template % 2000 == 0:
|
||||
print('Finish Calculating {} template features.'.format(
|
||||
count_template))
|
||||
# template_norm_feats = template_feats / np.sqrt(np.sum(template_feats ** 2, -1, keepdims=True))
|
||||
template_norm_feats = sklearn.preprocessing.normalize(template_feats)
|
||||
# print(template_norm_feats.shape)
|
||||
return template_norm_feats, unique_templates
|
||||
|
||||
|
||||
# In[ ]:
|
||||
|
||||
|
||||
def verification(template_norm_feats=None,
|
||||
unique_templates=None,
|
||||
p1=None,
|
||||
p2=None):
|
||||
# ==========================================================
|
||||
# Compute set-to-set Similarity Score.
|
||||
# ==========================================================
|
||||
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
|
||||
for count_template, uqt in enumerate(unique_templates):
|
||||
template2id[uqt] = count_template
|
||||
|
||||
score = np.zeros((len(p1),)) # cosine similarity between template pairs
|
||||
|
||||
total_pairs = np.array(range(len(p1)))
|
||||
batchsize = 100000 # process pairs in chunks rather than all at once because of memory limitations
|
||||
sublists = [
|
||||
total_pairs[i:i + batchsize] for i in range(0, len(p1), batchsize)
|
||||
]
|
||||
total_sublists = len(sublists)
|
||||
for c, s in enumerate(sublists):
|
||||
feat1 = template_norm_feats[template2id[p1[s]]]
|
||||
feat2 = template_norm_feats[template2id[p2[s]]]
|
||||
similarity_score = np.sum(feat1 * feat2, -1)
|
||||
score[s] = similarity_score.flatten()
|
||||
if c % 10 == 0:
|
||||
print('Finish {}/{} pairs.'.format(c, total_sublists))
|
||||
return score
|
||||
|
||||
|
||||
# In[ ]:
|
||||
def verification2(template_norm_feats=None,
|
||||
unique_templates=None,
|
||||
p1=None,
|
||||
p2=None):
|
||||
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
|
||||
for count_template, uqt in enumerate(unique_templates):
|
||||
template2id[uqt] = count_template
|
||||
score = np.zeros((len(p1),)) # cosine similarity between template pairs
|
||||
total_pairs = np.array(range(len(p1)))
|
||||
batchsize = 100000 # process pairs in chunks rather than all at once because of memory limitations
|
||||
sublists = [
|
||||
total_pairs[i:i + batchsize] for i in range(0, len(p1), batchsize)
|
||||
]
|
||||
total_sublists = len(sublists)
|
||||
for c, s in enumerate(sublists):
|
||||
feat1 = template_norm_feats[template2id[p1[s]]]
|
||||
feat2 = template_norm_feats[template2id[p2[s]]]
|
||||
similarity_score = np.sum(feat1 * feat2, -1)
|
||||
score[s] = similarity_score.flatten()
|
||||
if c % 10 == 0:
|
||||
print('Finish {}/{} pairs.'.format(c, total_sublists))
|
||||
return score
|
||||
|
||||
|
||||
def read_score(path):
|
||||
with open(path, 'rb') as fid:
|
||||
img_feats = pickle.load(fid)
|
||||
return img_feats
|
||||
|
||||
|
||||
# # Step1: Load Meta Data
|
||||
|
||||
# In[ ]:
|
||||
|
||||
assert target == 'IJBC' or target == 'IJBB'
|
||||
|
||||
# =============================================================
|
||||
# load image and template relationships for template feature embedding
|
||||
# tid --> template id, mid --> media id
|
||||
# format:
|
||||
# image_name tid mid
|
||||
# =============================================================
|
||||
start = timeit.default_timer()
|
||||
templates, medias = read_template_media_list(
|
||||
os.path.join('%s/meta' % image_path,
|
||||
'%s_face_tid_mid.txt' % target.lower()))
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
|
||||
# In[ ]:
|
||||
|
||||
# =============================================================
|
||||
# load template pairs for template-to-template verification
|
||||
# tid : template id, label : 1/0
|
||||
# format:
|
||||
# tid_1 tid_2 label
|
||||
# =============================================================
|
||||
start = timeit.default_timer()
|
||||
p1, p2, label = read_template_pair_list(
|
||||
os.path.join('%s/meta' % image_path,
|
||||
'%s_template_pair_label.txt' % target.lower()))
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
|
||||
# # Step 2: Get Image Features
|
||||
|
||||
# In[ ]:
|
||||
|
||||
# =============================================================
|
||||
# load image features
|
||||
# format:
|
||||
# img_feats: [image_num x feats_dim] (227630, 512)
|
||||
# =============================================================
|
||||
start = timeit.default_timer()
|
||||
img_path = '%s/loose_crop' % image_path
|
||||
img_list_path = '%s/meta/%s_name_5pts_score.txt' % (image_path, target.lower())
|
||||
img_list = open(img_list_path)
|
||||
files = img_list.readlines()
|
||||
# files_list = divideIntoNstrand(files, rank_size)
|
||||
files_list = files
|
||||
|
||||
# img_feats
|
||||
# for i in range(rank_size):
|
||||
img_feats, faceness_scores = get_image_feature(img_path, files_list,
|
||||
model_path, 0, gpu_id)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
print('Feature Shape: ({} , {}) .'.format(img_feats.shape[0],
|
||||
img_feats.shape[1]))
|
||||
|
||||
# # Step3: Get Template Features
|
||||
|
||||
# In[ ]:
|
||||
|
||||
# =============================================================
|
||||
# compute template features from image features.
|
||||
# =============================================================
|
||||
start = timeit.default_timer()
|
||||
# ==========================================================
|
||||
# Norm feature before aggregation into template feature?
|
||||
# Feature norm from embedding network and faceness score are able to decrease weights for noise samples (not face).
|
||||
# ==========================================================
|
||||
# 1. FaceScore (Feature Norm)
|
||||
# 2. FaceScore (Detector)
|
||||
|
||||
if use_flip_test:
|
||||
# concat --- F1
|
||||
# img_input_feats = img_feats
|
||||
# add --- F2
|
||||
img_input_feats = img_feats[:, 0:img_feats.shape[1] //
|
||||
2] + img_feats[:, img_feats.shape[1] // 2:]
|
||||
else:
|
||||
img_input_feats = img_feats[:, 0:img_feats.shape[1] // 2]
|
||||
|
||||
if use_norm_score:
|
||||
img_input_feats = img_input_feats
|
||||
else:
|
||||
# normalise features to remove norm information
|
||||
img_input_feats = img_input_feats / np.sqrt(
|
||||
np.sum(img_input_feats ** 2, -1, keepdims=True))
|
||||
|
||||
if use_detector_score:
|
||||
print(img_input_feats.shape, faceness_scores.shape)
|
||||
img_input_feats = img_input_feats * faceness_scores[:, np.newaxis]
|
||||
else:
|
||||
img_input_feats = img_input_feats
|
||||
|
||||
template_norm_feats, unique_templates = image2template_feature(
|
||||
img_input_feats, templates, medias)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
|
||||
# # Step 4: Get Template Similarity Scores
|
||||
|
||||
# In[ ]:
|
||||
|
||||
# =============================================================
|
||||
# compute verification scores between template pairs.
|
||||
# =============================================================
|
||||
start = timeit.default_timer()
|
||||
score = verification(template_norm_feats, unique_templates, p1, p2)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
|
||||
# In[ ]:
|
||||
save_path = os.path.join(result_dir, args.job)
|
||||
# save_path = result_dir + '/%s_result' % target
|
||||
|
||||
if not os.path.exists(save_path):
|
||||
os.makedirs(save_path)
|
||||
|
||||
score_save_file = os.path.join(save_path, "%s.npy" % target.lower())
|
||||
np.save(score_save_file, score)
|
||||
|
||||
# # Step 5: Get ROC Curves and TPR@FPR Table
|
||||
|
||||
# In[ ]:
|
||||
|
||||
files = [score_save_file]
|
||||
methods = []
|
||||
scores = []
|
||||
for file in files:
|
||||
methods.append(Path(file).stem)
|
||||
scores.append(np.load(file))
|
||||
|
||||
methods = np.array(methods)
|
||||
scores = dict(zip(methods, scores))
|
||||
colours = dict(
|
||||
zip(methods, sample_colours_from_colourmap(methods.shape[0], 'Set2')))
|
||||
x_labels = [10 ** -6, 10 ** -5, 10 ** -4, 10 ** -3, 10 ** -2, 10 ** -1]
|
||||
tpr_fpr_table = PrettyTable(['Methods'] + [str(x) for x in x_labels])
|
||||
fig = plt.figure()
|
||||
for method in methods:
|
||||
fpr, tpr, _ = roc_curve(label, scores[method])
|
||||
roc_auc = auc(fpr, tpr)
|
||||
fpr = np.flipud(fpr)
|
||||
tpr = np.flipud(tpr) # select largest tpr at same fpr
|
||||
plt.plot(fpr,
|
||||
tpr,
|
||||
color=colours[method],
|
||||
lw=1,
|
||||
label=('[%s (AUC = %0.4f %%)]' %
|
||||
(method.split('-')[-1], roc_auc * 100)))
|
||||
tpr_fpr_row = []
|
||||
tpr_fpr_row.append("%s-%s" % (method, target))
|
||||
for fpr_iter in np.arange(len(x_labels)):
|
||||
_, min_index = min(
|
||||
list(zip(abs(fpr - x_labels[fpr_iter]), range(len(fpr)))))
|
||||
tpr_fpr_row.append('%.2f' % (tpr[min_index] * 100))
|
||||
tpr_fpr_table.add_row(tpr_fpr_row)
|
||||
plt.xlim([10 ** -6, 0.1])
|
||||
plt.ylim([0.3, 1.0])
|
||||
plt.grid(linestyle='--', linewidth=1)
|
||||
plt.xticks(x_labels)
|
||||
plt.yticks(np.linspace(0.3, 1.0, 8, endpoint=True))
|
||||
plt.xscale('log')
|
||||
plt.xlabel('False Positive Rate')
|
||||
plt.ylabel('True Positive Rate')
|
||||
plt.title('ROC on IJB')
|
||||
plt.legend(loc="lower right")
|
||||
fig.savefig(os.path.join(save_path, '%s.pdf' % target.lower()))
|
||||
print(tpr_fpr_table)
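The TPR@FPR table above is filled by reversing the ROC arrays and then picking the entry whose FPR is closest to each target, so that ties on distance resolve to the largest TPR. The following is a minimal pure-Python sketch of that lookup under stated assumptions: `tpr_at_fpr` is a hypothetical helper name, and the small ROC arrays are made-up illustration values, not results from any model.

```python
def tpr_at_fpr(fpr, tpr, target):
    """Return the TPR at the FPR closest to `target`.

    `fpr` and `tpr` are assumed to be in ascending order, as returned by an
    ROC routine. They are reversed first so that, among entries with the same
    distance to the target, the one with the largest TPR wins the tie.
    """
    fpr_rev = list(reversed(fpr))
    tpr_rev = list(reversed(tpr))
    # min over (distance, index) tuples: smallest distance, then smallest index.
    _, idx = min((abs(f - target), i) for i, f in enumerate(fpr_rev))
    return tpr_rev[idx]


# Illustrative ROC points only (two points share FPR = 0.001).
fpr = [0.0, 0.001, 0.001, 0.01, 0.1, 1.0]
tpr = [0.0, 0.5, 0.7, 0.8, 0.9, 1.0]

print(tpr_at_fpr(fpr, tpr, 0.001))
print(tpr_at_fpr(fpr, tpr, 0.01))
```

Because this is a nearest-neighbour lookup rather than an interpolation, the reported value can belong to an FPR slightly above or below the requested one when the curve has few points near the target.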
|
||||
@@ -0,0 +1,35 @@
|
||||
import argparse
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
import torch
|
||||
|
||||
from backbones import get_model
|
||||
|
||||
|
||||
@torch.no_grad()
|
||||
def inference(weight, name, img):
|
||||
if img is None:
|
||||
img = np.random.randint(0, 255, size=(112, 112, 3), dtype=np.uint8)
|
||||
else:
|
||||
img = cv2.imread(img)
|
||||
img = cv2.resize(img, (112, 112))
|
||||
|
||||
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
|
||||
img = np.transpose(img, (2, 0, 1))
|
||||
img = torch.from_numpy(img).unsqueeze(0).float()
|
||||
img.div_(255).sub_(0.5).div_(0.5)
|
||||
net = get_model(name, fp16=False)
|
||||
net.load_state_dict(torch.load(weight))
|
||||
net.eval()
|
||||
feat = net(img).numpy()
|
||||
print(feat)
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
parser = argparse.ArgumentParser(description='PyTorch ArcFace Inference')
|
||||
parser.add_argument('--network', type=str, default='r50', help='backbone network')
|
||||
parser.add_argument('--weight', type=str, default='')
|
||||
parser.add_argument('--img', type=str, default=None)
|
||||
args = parser.parse_args()
|
||||
inference(args.weight, args.network, args.img)
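The in-place chain `img.div_(255).sub_(0.5).div_(0.5)` used above maps 8-bit pixel values from [0, 255] to [-1, 1]. The following sketch, with the hypothetical helper names `normalize_pixel` and `normalize_pixel_mean_std`, checks on plain floats that this is the same as subtracting a mean of 127.5 and dividing by a standard deviation of 127.5.

```python
import math


def normalize_pixel(value):
    """Scalar version of div_(255).sub_(0.5).div_(0.5)."""
    return (value / 255.0 - 0.5) / 0.5


def normalize_pixel_mean_std(value, mean=127.5, std=127.5):
    """Equivalent mean/std form of the same normalization."""
    return (value - mean) / std


# The two forms agree for every 8-bit pixel value, up to float rounding.
all_close = all(
    math.isclose(normalize_pixel(v), normalize_pixel_mean_std(v),
                 rel_tol=1e-9, abs_tol=1e-12)
    for v in range(256)
)
print(normalize_pixel(0), normalize_pixel(255), all_close)
```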
|
||||
@@ -0,0 +1,47 @@
|
||||
import torch
|
||||
import math
|
||||
|
||||
class ArcFace(torch.nn.Module):
|
||||
""" ArcFace (https://arxiv.org/pdf/1801.07698v1.pdf):
|
||||
"""
|
||||
def __init__(self, s=64.0, margin=0.5):
|
||||
super(ArcFace, self).__init__()
|
||||
self.scale = s
|
||||
self.cos_m = math.cos(margin)
|
||||
self.sin_m = math.sin(margin)
|
||||
self.theta = math.cos(math.pi - margin)
|
||||
self.sinmm = math.sin(math.pi - margin) * margin
|
||||
self.easy_margin = False
|
||||
|
||||
|
||||
def forward(self, logits: torch.Tensor, labels: torch.Tensor):
|
||||
index = torch.where(labels != -1)[0]
|
||||
target_logit = logits[index, labels[index].view(-1)]
|
||||
|
||||
sin_theta = torch.sqrt(1.0 - torch.pow(target_logit, 2))
|
||||
cos_theta_m = target_logit * self.cos_m - sin_theta * self.sin_m # cos(theta + margin), where target_logit = cos(theta)
|
||||
if self.easy_margin:
|
||||
final_target_logit = torch.where(
|
||||
target_logit > 0, cos_theta_m, target_logit)
|
||||
else:
|
||||
final_target_logit = torch.where(
|
||||
target_logit > self.theta, cos_theta_m, target_logit - self.sinmm)
|
||||
|
||||
logits[index, labels[index].view(-1)] = final_target_logit
|
||||
logits = logits * self.scale
|
||||
return logits
|
||||
|
||||
|
||||
class CosFace(torch.nn.Module):
|
||||
def __init__(self, s=64.0, m=0.40):
|
||||
super(CosFace, self).__init__()
|
||||
self.s = s
|
||||
self.m = m
|
||||
|
||||
def forward(self, logits: torch.Tensor, labels: torch.Tensor):
|
||||
index = torch.where(labels != -1)[0]
|
||||
target_logit = logits[index, labels[index].view(-1)]
|
||||
final_target_logit = target_logit - self.m
|
||||
logits[index, labels[index].view(-1)] = final_target_logit
|
||||
logits = logits * self.s
|
||||
return logits
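To make the margin arithmetic in the two classes above concrete, here is a scalar sketch using only the `math` module. `arcface_target_logit` and `cosface_target_logit` are hypothetical helper names; they mirror the per-element formulas applied to the target logit before the final multiplication by the scale `s`, and they are an illustration rather than a replacement for the tensor code.

```python
import math


def arcface_target_logit(cos_theta, margin=0.5, easy_margin=False):
    """Apply the additive angular margin to one target logit cos(theta)."""
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    # Angle-addition identity: cos(theta + m) = cos(theta)cos(m) - sin(theta)sin(m)
    cos_theta_m = cos_theta * math.cos(margin) - sin_theta * math.sin(margin)
    if easy_margin:
        return cos_theta_m if cos_theta > 0 else cos_theta
    threshold = math.cos(math.pi - margin)          # theta + m would exceed pi below this
    fallback = math.sin(math.pi - margin) * margin  # linear penalty used instead
    return cos_theta_m if cos_theta > threshold else cos_theta - fallback


def cosface_target_logit(cos_theta, m=0.40):
    """Apply the additive cosine margin to one target logit."""
    return cos_theta - m


# For theta = 0.3 rad and m = 0.5, the ArcFace logit equals cos(0.8).
print(arcface_target_logit(math.cos(0.3)), math.cos(0.8))
print(cosface_target_logit(0.9))
```

The threshold branch keeps the penalized logit monotonic in theta: once theta + m would pass pi, cos(theta + m) starts increasing again, so a fixed subtraction is used instead.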
|
||||
@@ -0,0 +1,29 @@
|
||||
from torch.optim.lr_scheduler import _LRScheduler
|
||||
|
||||
|
||||
class PolyScheduler(_LRScheduler):
|
||||
def __init__(self, optimizer, base_lr, max_steps, warmup_steps, last_epoch=-1):
|
||||
self.base_lr = base_lr
|
||||
self.warmup_lr_init = 0.0001
|
||||
self.max_steps: int = max_steps
|
||||
self.warmup_steps: int = warmup_steps
|
||||
self.power = 2
|
||||
super(PolyScheduler, self).__init__(optimizer, last_epoch, False)
|
||||
|
||||
def get_warmup_lr(self):
|
||||
alpha = float(self.last_epoch) / float(self.warmup_steps)
|
||||
return [self.base_lr * alpha for _ in self.optimizer.param_groups]
|
||||
|
||||
def get_lr(self):
|
||||
if self.last_epoch == -1:
|
||||
return [self.warmup_lr_init for _ in self.optimizer.param_groups]
|
||||
if self.last_epoch < self.warmup_steps:
|
||||
return self.get_warmup_lr()
|
||||
else:
|
||||
alpha = pow(
|
||||
1
|
||||
- float(self.last_epoch - self.warmup_steps)
|
||||
/ float(self.max_steps - self.warmup_steps),
|
||||
self.power,
|
||||
)
|
||||
return [self.base_lr * alpha for _ in self.optimizer.param_groups]
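The schedule implemented by `PolyScheduler` above is a linear warmup followed by a polynomial decay with power 2. The following standalone sketch reproduces that formula for a single step so the shape of the curve can be checked by hand; `poly_lr` is a hypothetical function name, and the values in the example (base learning rate 0.1, 100 steps, 10 warmup steps) are arbitrary.

```python
def poly_lr(step, base_lr, max_steps, warmup_steps, power=2, warmup_lr_init=0.0001):
    """Learning rate at `step` for linear warmup plus polynomial decay."""
    if step == -1:
        # Value reported before the first scheduler step.
        return warmup_lr_init
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the warmup steps.
        return base_lr * float(step) / float(warmup_steps)
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = float(step - warmup_steps) / float(max_steps - warmup_steps)
    return base_lr * (1 - progress) ** power


for step in (-1, 0, 5, 10, 55, 100):
    print(step, poly_lr(step, base_lr=0.1, max_steps=100, warmup_steps=10))
```

Halfway through the decay phase (step 55 in this example) the rate has already dropped to a quarter of `base_lr`, which is the effect of the squared term.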
|
||||
@@ -0,0 +1,250 @@
|
||||
from __future__ import division
|
||||
import datetime
|
||||
import os
|
||||
import os.path as osp
|
||||
import glob
|
||||
import numpy as np
|
||||
import cv2
|
||||
import sys
|
||||
import onnxruntime
|
||||
import onnx
|
||||
import argparse
|
||||
from onnx import numpy_helper
|
||||
from insightface.data import get_image
|
||||
|
||||
class ArcFaceORT:
|
||||
def __init__(self, model_path, cpu=False):
|
||||
self.model_path = model_path
|
||||
# providers = None will use available provider, for onnxruntime-gpu it will be "CUDAExecutionProvider"
|
||||
self.providers = ['CPUExecutionProvider'] if cpu else None
|
||||
|
||||
# input_size is (w, h); check() returns an error message on failure and None on success
|
||||
def check(self, track='cfat', test_img = None):
|
||||
#default is cfat
|
||||
max_model_size_mb=1024
|
||||
max_feat_dim=512
|
||||
max_time_cost=15
|
||||
if track.startswith('ms1m'):
|
||||
max_model_size_mb=1024
|
||||
max_feat_dim=512
|
||||
max_time_cost=10
|
||||
elif track.startswith('glint'):
|
||||
max_model_size_mb=1024
|
||||
max_feat_dim=1024
|
||||
max_time_cost=20
|
||||
elif track.startswith('cfat'):
|
||||
max_model_size_mb = 1024
|
||||
max_feat_dim = 512
|
||||
max_time_cost = 15
|
||||
elif track.startswith('unconstrained'):
|
||||
max_model_size_mb=1024
|
||||
max_feat_dim=1024
|
||||
max_time_cost=30
|
||||
else:
|
||||
return "track not found"
|
||||
|
||||
if not os.path.exists(self.model_path):
|
||||
return "model_path not exists"
|
||||
if not os.path.isdir(self.model_path):
|
||||
return "model_path should be directory"
|
||||
onnx_files = []
|
||||
for _file in os.listdir(self.model_path):
|
||||
if _file.endswith('.onnx'):
|
||||
onnx_files.append(osp.join(self.model_path, _file))
|
||||
if len(onnx_files)==0:
|
||||
return "do not have onnx files"
|
||||
self.model_file = sorted(onnx_files)[-1]
|
||||
print('use onnx-model:', self.model_file)
|
||||
try:
|
||||
session = onnxruntime.InferenceSession(self.model_file, providers=self.providers)
|
||||
except Exception:
|
||||
return "load onnx failed"
|
||||
input_cfg = session.get_inputs()[0]
|
||||
input_shape = input_cfg.shape
|
||||
print('input-shape:', input_shape)
|
||||
if len(input_shape)!=4:
|
||||
return "length of input_shape should be 4"
|
||||
if not isinstance(input_shape[0], str):
|
||||
#return "input_shape[0] should be str to support batch-inference"
|
||||
print('reset input-shape[0] to None')
|
||||
model = onnx.load(self.model_file)
|
||||
model.graph.input[0].type.tensor_type.shape.dim[0].dim_param = 'None'
|
||||
new_model_file = osp.join(self.model_path, 'zzzzrefined.onnx')
|
||||
onnx.save(model, new_model_file)
|
||||
self.model_file = new_model_file
|
||||
print('use new onnx-model:', self.model_file)
|
||||
try:
|
||||
session = onnxruntime.InferenceSession(self.model_file, providers=self.providers)
|
||||
except Exception:
|
||||
return "load onnx failed"
|
||||
input_cfg = session.get_inputs()[0]
|
||||
input_shape = input_cfg.shape
|
||||
print('new-input-shape:', input_shape)
|
||||
|
||||
self.image_size = tuple(input_shape[2:4][::-1])
|
||||
#print('image_size:', self.image_size)
|
||||
input_name = input_cfg.name
|
||||
outputs = session.get_outputs()
|
||||
output_names = []
|
||||
for o in outputs:
|
||||
output_names.append(o.name)
|
||||
#print(o.name, o.shape)
|
||||
if len(output_names)!=1:
|
||||
return "number of output nodes should be 1"
|
||||
self.session = session
|
||||
self.input_name = input_name
|
||||
self.output_names = output_names
|
||||
#print(self.output_names)
|
||||
model = onnx.load(self.model_file)
|
||||
graph = model.graph
|
||||
if len(graph.node)<8:
|
||||
return "too small onnx graph"
|
||||
|
||||
input_size = (112,112)
|
||||
self.crop = None
|
||||
if track=='cfat':
|
||||
crop_file = osp.join(self.model_path, 'crop.txt')
|
||||
if osp.exists(crop_file):
|
||||
lines = open(crop_file,'r').readlines()
|
||||
if len(lines)!=6:
|
||||
return "crop.txt should contain 6 lines"
|
||||
lines = [int(x) for x in lines]
|
||||
self.crop = lines[:4]
|
||||
input_size = tuple(lines[4:6])
|
||||
if input_size!=self.image_size:
|
||||
return "input-size is inconsistent with onnx model input, %s vs %s"%(input_size, self.image_size)
|
||||
|
||||
self.model_size_mb = os.path.getsize(self.model_file) / float(1024*1024)
|
||||
if self.model_size_mb > max_model_size_mb:
|
||||
return "max model size exceeded, given %.3f-MB"%self.model_size_mb
|
||||
|
||||
input_mean = None
|
||||
input_std = None
|
||||
if track=='cfat':
|
||||
pn_file = osp.join(self.model_path, 'pixel_norm.txt')
|
||||
if osp.exists(pn_file):
|
||||
lines = open(pn_file,'r').readlines()
|
||||
if len(lines)!=2:
|
||||
return "pixel_norm.txt should contain 2 lines"
|
||||
input_mean = float(lines[0])
|
||||
input_std = float(lines[1])
|
||||
if input_mean is not None or input_std is not None:
|
||||
if input_mean is None or input_std is None:
|
||||
return "please set input_mean and input_std simultaneously"
|
||||
else:
|
||||
find_sub = False
|
||||
find_mul = False
|
||||
for nid, node in enumerate(graph.node[:8]):
|
||||
print(nid, node.name)
|
||||
if node.name.startswith('Sub') or node.name.startswith('_minus'):
|
||||
find_sub = True
|
||||
if node.name.startswith('Mul') or node.name.startswith('_mul') or node.name.startswith('Div'):
|
||||
find_mul = True
|
||||
if find_sub and find_mul:
|
||||
print("find sub and mul")
|
||||
#mxnet arcface model
|
||||
input_mean = 0.0
|
||||
input_std = 1.0
|
||||
else:
|
||||
input_mean = 127.5
|
||||
input_std = 127.5
|
||||
self.input_mean = input_mean
|
||||
self.input_std = input_std
|
||||
for initn in graph.initializer:
|
||||
weight_array = numpy_helper.to_array(initn)
|
||||
dt = weight_array.dtype
|
||||
if dt.itemsize<4:
|
||||
return 'invalid weight type - (%s:%s)' % (initn.name, dt.name)
|
||||
if test_img is None:
|
||||
test_img = get_image('Tom_Hanks_54745')
|
||||
test_img = cv2.resize(test_img, self.image_size)
|
||||
else:
|
||||
test_img = cv2.resize(test_img, self.image_size)
|
||||
feat, cost = self.benchmark(test_img)
|
||||
batch_result = self.check_batch(test_img)
|
||||
batch_result_sum = float(np.sum(batch_result))
|
||||
if batch_result_sum in [float('inf'), -float('inf')] or batch_result_sum != batch_result_sum:
|
||||
print(batch_result)
|
||||
print(batch_result_sum)
|
||||
return "batch result output contains NaN or Inf!"
|
||||
|
||||
if len(feat.shape) < 2:
|
||||
return "the feature must have at least two dimensions, but got shape {}".format(str(feat.shape))
|
||||
|
||||
if feat.shape[1] > max_feat_dim:
|
||||
return "max feat dim exceeded, given %d"%feat.shape[1]
|
||||
self.feat_dim = feat.shape[1]
|
||||
cost_ms = cost*1000
|
||||
if cost_ms>max_time_cost:
|
||||
return "max time cost exceeded, given %.4f"%cost_ms
|
||||
self.cost_ms = cost_ms
|
||||
print('check stat:, model-size-mb: %.4f, feat-dim: %d, time-cost-ms: %.4f, input-mean: %.3f, input-std: %.3f'%(self.model_size_mb, self.feat_dim, self.cost_ms, self.input_mean, self.input_std))
|
||||
return None
|
||||
|
||||
def check_batch(self, img):
|
||||
imgs = img if isinstance(img, list) else [img] * 32  # a list input is used as-is
|
||||
if self.crop is not None:
|
||||
nimgs = []
|
||||
for img in imgs:
|
||||
nimg = img[self.crop[1]:self.crop[3], self.crop[0]:self.crop[2], :]
|
||||
if nimg.shape[0] != self.image_size[1] or nimg.shape[1] != self.image_size[0]:
|
||||
nimg = cv2.resize(nimg, self.image_size)
|
||||
nimgs.append(nimg)
|
||||
imgs = nimgs
|
||||
blob = cv2.dnn.blobFromImages(
|
||||
images=imgs, scalefactor=1.0 / self.input_std, size=self.image_size,
|
||||
mean=(self.input_mean, self.input_mean, self.input_mean), swapRB=True)
|
||||
net_out = self.session.run(self.output_names, {self.input_name: blob})[0]
|
||||
return net_out
|
||||
|
||||
|
||||
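The `cv2.dnn.blobFromImages` call used by `check_batch` and `forward` amounts to a per-pixel `(pixel - input_mean) / input_std`, a BGR-to-RGB swap, and an HWC-to-NCHW transpose. The following is a minimal NumPy sketch of that arithmetic, assuming the images are already at the model's input size. The helper name `to_blob` is hypothetical and not part of this repository.

```python
import numpy as np


def to_blob(imgs, input_mean=127.5, input_std=127.5):
    """Mimic the normalisation requested from blobFromImages (no resizing)."""
    batch = np.stack(imgs).astype(np.float32)      # (N, H, W, 3), BGR order
    batch = batch[..., ::-1]                       # swapRB=True: BGR -> RGB
    batch = (batch - input_mean) / input_std       # subtract the mean, then scale
    return np.ascontiguousarray(batch.transpose(0, 3, 1, 2))  # (N, 3, H, W)


img = np.full((112, 112, 3), 127.5, dtype=np.float32)
img[..., 0] = 255.0                                # blue channel of a BGR image
blob = to_blob([img] * 2)
print(blob.shape)
```

With the default mean and std of 127.5, pixel values in [0, 255] map to [-1, 1], which is the range selected when no `Sub`/`Mul` nodes are found at the start of the graph.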
def meta_info(self):
|
||||
return {'model-size-mb':self.model_size_mb, 'feature-dim':self.feat_dim, 'infer': self.cost_ms}
|
||||
|
||||
|
||||
def forward(self, imgs):
|
||||
if not isinstance(imgs, list):
|
||||
imgs = [imgs]
|
||||
input_size = self.image_size
|
||||
if self.crop is not None:
|
||||
nimgs = []
|
||||
for img in imgs:
|
||||
nimg = img[self.crop[1]:self.crop[3],self.crop[0]:self.crop[2],:]
|
||||
if nimg.shape[0]!=input_size[1] or nimg.shape[1]!=input_size[0]:
|
||||
nimg = cv2.resize(nimg, input_size)
|
||||
nimgs.append(nimg)
|
||||
imgs = nimgs
|
||||
blob = cv2.dnn.blobFromImages(imgs, 1.0/self.input_std, input_size, (self.input_mean, self.input_mean, self.input_mean), swapRB=True)
|
||||
net_out = self.session.run(self.output_names, {self.input_name : blob})[0]
|
||||
return net_out
|
||||
|
||||
def benchmark(self, img):
|
||||
input_size = self.image_size
|
||||
if self.crop is not None:
|
||||
nimg = img[self.crop[1]:self.crop[3],self.crop[0]:self.crop[2],:]
|
||||
if nimg.shape[0]!=input_size[1] or nimg.shape[1]!=input_size[0]:
|
||||
nimg = cv2.resize(nimg, input_size)
|
||||
img = nimg
|
||||
blob = cv2.dnn.blobFromImage(img, 1.0/self.input_std, input_size, (self.input_mean, self.input_mean, self.input_mean), swapRB=True)
|
||||
costs = []
|
||||
for _ in range(50):
|
||||
ta = datetime.datetime.now()
|
||||
net_out = self.session.run(self.output_names, {self.input_name : blob})[0]
|
||||
tb = datetime.datetime.now()
|
||||
cost = (tb-ta).total_seconds()
|
||||
costs.append(cost)
|
||||
costs = sorted(costs)
|
||||
cost = costs[5]
|
||||
return net_out, cost
|
||||
|
||||
|
||||
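`benchmark` times 50 runs, sorts the durations, and reports the sixth-fastest one, so slow warm-up runs and the few luckiest runs do not decide the result. Below is a small standard-library sketch of the same idea. The helper name `robust_latency` and the stand-in workload are assumptions, and it uses `time.perf_counter` instead of the `datetime.datetime.now` calls in the source.

```python
import time


def robust_latency(fn, runs=50, pick=5):
    """Time fn() `runs` times and return the (pick + 1)-th fastest duration in seconds."""
    costs = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        costs.append(time.perf_counter() - start)
    return sorted(costs)[pick]


latency = robust_latency(lambda: sum(range(1000)))
print('time-cost-ms: %.4f' % (latency * 1000))
```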
if __name__ == '__main__':
|
||||
parser = argparse.ArgumentParser(description='')
|
||||
# general
|
||||
parser.add_argument('workdir', help='submitted work dir', type=str)
|
||||
parser.add_argument('--track', help='track name, for different challenge', type=str, default='cfat')
|
||||
args = parser.parse_args()
|
||||
handler = ArcFaceORT(args.workdir)
|
||||
err = handler.check(args.track)
|
||||
print('err:', err)
|
||||
@@ -0,0 +1,269 @@
|
||||
import argparse
|
||||
import os
|
||||
import pickle
|
||||
import timeit
|
||||
|
||||
import cv2
|
||||
import mxnet as mx
|
||||
import numpy as np
|
||||
import pandas as pd
|
||||
import prettytable
|
||||
import skimage.transform
|
||||
import torch
|
||||
from sklearn.metrics import roc_curve
|
||||
from sklearn.preprocessing import normalize
|
||||
from torch.utils.data import DataLoader
|
||||
from onnx_helper import ArcFaceORT
|
||||
|
||||
SRC = np.array(
|
||||
[
|
||||
[30.2946, 51.6963],
|
||||
[65.5318, 51.5014],
|
||||
[48.0252, 71.7366],
|
||||
[33.5493, 92.3655],
|
||||
[62.7299, 92.2041]]
|
||||
, dtype=np.float32)
|
||||
SRC[:, 0] += 8.0
|
||||
|
||||
|
||||
class AlignedDataSet(mx.gluon.data.Dataset):
|
||||
def __init__(self, root, lines, align=True):
|
||||
self.lines = lines
|
||||
self.root = root
|
||||
self.align = align
|
||||
|
||||
def __len__(self):
|
||||
return len(self.lines)
|
||||
|
||||
def __getitem__(self, idx):
|
||||
each_line = self.lines[idx]
|
||||
name_lmk_score = each_line.strip().split(' ')
|
||||
name = os.path.join(self.root, name_lmk_score[0])
|
||||
img = cv2.cvtColor(cv2.imread(name), cv2.COLOR_BGR2RGB)
|
||||
landmark5 = np.array([float(x) for x in name_lmk_score[1:-1]], dtype=np.float32).reshape((5, 2))
|
||||
st = skimage.transform.SimilarityTransform()
|
||||
st.estimate(landmark5, SRC)
|
||||
img = cv2.warpAffine(img, st.params[0:2, :], (112, 112), borderValue=0.0)
|
||||
img_1 = np.expand_dims(img, 0)
|
||||
img_2 = np.expand_dims(np.fliplr(img), 0)
|
||||
output = np.concatenate((img_1, img_2), axis=0).astype(np.float32)
|
||||
output = np.transpose(output, (0, 3, 1, 2))
|
||||
return torch.from_numpy(output)
|
||||
|
||||
|
||||
@torch.no_grad()
|
||||
def extract(model_root, dataset):
|
||||
model = ArcFaceORT(model_path=model_root)
|
||||
model.check()
|
||||
feat_mat = np.zeros(shape=(len(dataset), 2 * model.feat_dim))
|
||||
|
||||
def collate_fn(data):
|
||||
return torch.cat(data, dim=0)
|
||||
|
||||
data_loader = DataLoader(
|
||||
dataset, batch_size=128, drop_last=False, num_workers=4, collate_fn=collate_fn, )
|
||||
num_iter = 0
|
||||
for batch in data_loader:
|
||||
batch = batch.numpy()
|
||||
batch = (batch - model.input_mean) / model.input_std
|
||||
feat = model.session.run(model.output_names, {model.input_name: batch})[0]
|
||||
feat = np.reshape(feat, (-1, model.feat_dim * 2))
|
||||
feat_mat[128 * num_iter: 128 * num_iter + feat.shape[0], :] = feat
|
||||
num_iter += 1
|
||||
if num_iter % 50 == 0:
|
||||
print(num_iter)
|
||||
return feat_mat
|
||||
|
||||
|
||||
def read_template_media_list(path):
|
||||
ijb_meta = pd.read_csv(path, sep=' ', header=None).values
|
||||
templates = ijb_meta[:, 1].astype(int)
|
||||
medias = ijb_meta[:, 2].astype(int)
|
||||
return templates, medias
|
||||
|
||||
|
||||
def read_template_pair_list(path):
|
||||
pairs = pd.read_csv(path, sep=' ', header=None).values
|
||||
t1 = pairs[:, 0].astype(int)
|
||||
t2 = pairs[:, 1].astype(int)
|
||||
label = pairs[:, 2].astype(int)
|
||||
return t1, t2, label
|
||||
|
||||
|
||||
def read_image_feature(path):
|
||||
with open(path, 'rb') as fid:
|
||||
img_feats = pickle.load(fid)
|
||||
return img_feats
|
||||
|
||||
|
||||
def image2template_feature(img_feats=None,
|
||||
templates=None,
|
||||
medias=None):
|
||||
unique_templates = np.unique(templates)
|
||||
template_feats = np.zeros((len(unique_templates), img_feats.shape[1]))
|
||||
for count_template, uqt in enumerate(unique_templates):
|
||||
(ind_t,) = np.where(templates == uqt)
|
||||
face_norm_feats = img_feats[ind_t]
|
||||
face_medias = medias[ind_t]
|
||||
unique_medias, unique_media_counts = np.unique(face_medias, return_counts=True)
|
||||
media_norm_feats = []
|
||||
for u, ct in zip(unique_medias, unique_media_counts):
|
||||
(ind_m,) = np.where(face_medias == u)
|
||||
if ct == 1:
|
||||
media_norm_feats += [face_norm_feats[ind_m]]
|
||||
else: # image features from the same video will be aggregated into one feature
|
||||
media_norm_feats += [np.mean(face_norm_feats[ind_m], axis=0, keepdims=True), ]
|
||||
media_norm_feats = np.array(media_norm_feats)
|
||||
template_feats[count_template] = np.sum(media_norm_feats, axis=0)
|
||||
if count_template % 2000 == 0:
|
||||
print('Finish Calculating {} template features.'.format(
|
||||
count_template))
|
||||
template_norm_feats = normalize(template_feats)
|
||||
return template_norm_feats, unique_templates
|
||||
|
||||
|
||||
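`image2template_feature` pools each template in three steps: average the features that share a media id, sum the per-media averages, and L2-normalise the result. Here is a minimal sketch of that pooling for a single template. The function name `pool_template` is hypothetical, and unlike `sklearn.preprocessing.normalize` it does not guard against a zero-norm vector.

```python
import numpy as np


def pool_template(feats, medias):
    """Average features within each media id, sum across media, then L2-normalise."""
    feats = np.asarray(feats, dtype=np.float64)
    medias = np.asarray(medias)
    pooled = np.zeros(feats.shape[1])
    for media_id in np.unique(medias):
        pooled += feats[medias == media_id].mean(axis=0)
    return pooled / np.linalg.norm(pooled)


# Three faces of one template: two frames from video 7 and one still image 9.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
template = pool_template(feats, medias=[7, 7, 9])
print(template)
```

Averaging within a media id first keeps a long video from outweighing a single still image of the same person.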
def verification(template_norm_feats=None,
|
||||
unique_templates=None,
|
||||
p1=None,
|
||||
p2=None):
|
||||
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
|
||||
for count_template, uqt in enumerate(unique_templates):
|
||||
template2id[uqt] = count_template
|
||||
score = np.zeros((len(p1),))
|
||||
total_pairs = np.array(range(len(p1)))
|
||||
batchsize = 100000
|
||||
sublists = [total_pairs[i: i + batchsize] for i in range(0, len(p1), batchsize)]
|
||||
total_sublists = len(sublists)
|
||||
for c, s in enumerate(sublists):
|
||||
feat1 = template_norm_feats[template2id[p1[s]]]
|
||||
feat2 = template_norm_feats[template2id[p2[s]]]
|
||||
similarity_score = np.sum(feat1 * feat2, -1)
|
||||
score[s] = similarity_score.flatten()
|
||||
if c % 10 == 0:
|
||||
print('Finish {}/{} pairs.'.format(c, total_sublists))
|
||||
return score
|
||||
|
||||
|
||||
def verification2(template_norm_feats=None,
|
||||
unique_templates=None,
|
||||
p1=None,
|
||||
p2=None):
|
||||
template2id = np.zeros((max(unique_templates) + 1, 1), dtype=int)
|
||||
for count_template, uqt in enumerate(unique_templates):
|
||||
template2id[uqt] = count_template
|
||||
score = np.zeros((len(p1),)) # save cosine distance between pairs
|
||||
total_pairs = np.array(range(len(p1)))
|
||||
batchsize = 100000 # process pairs in small batches rather than all at once because of the memory limitation
|
||||
sublists = [total_pairs[i:i + batchsize] for i in range(0, len(p1), batchsize)]
|
||||
total_sublists = len(sublists)
|
||||
for c, s in enumerate(sublists):
|
||||
feat1 = template_norm_feats[template2id[p1[s]]]
|
||||
feat2 = template_norm_feats[template2id[p2[s]]]
|
||||
similarity_score = np.sum(feat1 * feat2, -1)
|
||||
score[s] = similarity_score.flatten()
|
||||
if c % 10 == 0:
|
||||
print('Finish {}/{} pairs.'.format(c, total_sublists))
|
||||
return score
|
||||
|
||||
|
||||
def main(args):
|
||||
use_norm_score = True # if True, TestMode(N1)
|
||||
use_detector_score = True # if True, TestMode(D1)
|
||||
use_flip_test = True # if True, TestMode(F1)
|
||||
assert args.target == 'IJBC' or args.target == 'IJBB'
|
||||
|
||||
start = timeit.default_timer()
|
||||
templates, medias = read_template_media_list(
|
||||
os.path.join('%s/meta' % args.image_path, '%s_face_tid_mid.txt' % args.target.lower()))
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
|
||||
start = timeit.default_timer()
|
||||
p1, p2, label = read_template_pair_list(
|
||||
os.path.join('%s/meta' % args.image_path,
|
||||
'%s_template_pair_label.txt' % args.target.lower()))
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
|
||||
start = timeit.default_timer()
|
||||
img_path = '%s/loose_crop' % args.image_path
|
||||
img_list_path = '%s/meta/%s_name_5pts_score.txt' % (args.image_path, args.target.lower())
|
||||
img_list = open(img_list_path)
|
||||
files = img_list.readlines()
|
||||
dataset = AlignedDataSet(root=img_path, lines=files, align=True)
|
||||
img_feats = extract(args.model_root, dataset)
|
||||
|
||||
faceness_scores = []
|
||||
for each_line in files:
|
||||
name_lmk_score = each_line.split()
|
||||
faceness_scores.append(name_lmk_score[-1])
|
||||
faceness_scores = np.array(faceness_scores).astype(np.float32)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
print('Feature Shape: ({} , {}) .'.format(img_feats.shape[0], img_feats.shape[1]))
|
||||
start = timeit.default_timer()
|
||||
|
||||
if use_flip_test:
|
||||
img_input_feats = img_feats[:, 0:img_feats.shape[1] // 2] + img_feats[:, img_feats.shape[1] // 2:]
|
||||
else:
|
||||
img_input_feats = img_feats[:, 0:img_feats.shape[1] // 2]
|
||||
|
||||
if use_norm_score:
|
||||
img_input_feats = img_input_feats
|
||||
else:
|
||||
img_input_feats = img_input_feats / np.sqrt(np.sum(img_input_feats ** 2, -1, keepdims=True))
|
||||
|
||||
if use_detector_score:
|
||||
print(img_input_feats.shape, faceness_scores.shape)
|
||||
img_input_feats = img_input_feats * faceness_scores[:, np.newaxis]
|
||||
else:
|
||||
img_input_feats = img_input_feats
|
||||
|
||||
template_norm_feats, unique_templates = image2template_feature(
|
||||
img_input_feats, templates, medias)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
|
||||
start = timeit.default_timer()
|
||||
score = verification(template_norm_feats, unique_templates, p1, p2)
|
||||
stop = timeit.default_timer()
|
||||
print('Time: %.2f s. ' % (stop - start))
|
||||
result_dir = args.model_root
|
||||
|
||||
save_path = os.path.join(result_dir, "{}_result".format(args.target))
|
||||
if not os.path.exists(save_path):
|
||||
os.makedirs(save_path)
|
||||
score_save_file = os.path.join(save_path, "{}.npy".format(args.target))
|
||||
np.save(score_save_file, score)
|
||||
files = [score_save_file]
|
||||
methods = []
|
||||
scores = []
|
||||
for file in files:
|
||||
methods.append(os.path.basename(file))
|
||||
scores.append(np.load(file))
|
||||
methods = np.array(methods)
|
||||
scores = dict(zip(methods, scores))
|
||||
x_labels = [10 ** -6, 10 ** -5, 10 ** -4, 10 ** -3, 10 ** -2, 10 ** -1]
|
||||
tpr_fpr_table = prettytable.PrettyTable(['Methods'] + [str(x) for x in x_labels])
|
||||
for method in methods:
|
||||
fpr, tpr, _ = roc_curve(label, scores[method])
|
||||
fpr = np.flipud(fpr)
|
||||
tpr = np.flipud(tpr)
|
||||
tpr_fpr_row = []
|
||||
tpr_fpr_row.append("%s-%s" % (method, args.target))
|
||||
for fpr_iter in np.arange(len(x_labels)):
|
||||
_, min_index = min(
|
||||
list(zip(abs(fpr - x_labels[fpr_iter]), range(len(fpr)))))
|
||||
tpr_fpr_row.append('%.2f' % (tpr[min_index] * 100))
|
||||
tpr_fpr_table.add_row(tpr_fpr_row)
|
||||
print(tpr_fpr_table)
|
||||
|
||||
|
||||
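The table printed at the end of `main` reports, for each target false-positive rate, the true-positive rate at the ROC point whose FPR is closest to that target. The sketch below shows that lookup in plain NumPy. It is a simplified stand-in for the `roc_curve` based code: the name `tpr_at_fpr` is hypothetical, it scans every distinct score as a threshold, and ties may be resolved differently from the source.

```python
import numpy as np


def tpr_at_fpr(labels, scores, target_fpr):
    """Return the TPR at the operating point whose FPR is closest to target_fpr."""
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=np.float64)
    best_gap, best_tpr = None, 0.0
    for thr in np.unique(scores):                  # thresholds in ascending order
        accepted = scores >= thr
        fpr = np.sum(accepted & ~labels) / np.sum(~labels)
        tpr = np.sum(accepted & labels) / np.sum(labels)
        gap = abs(fpr - target_fpr)
        if best_gap is None or gap < best_gap:
            best_gap, best_tpr = gap, tpr
    return best_tpr


labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]
print('%.2f' % (tpr_at_fpr(labels, scores, 0.0) * 100))
```

With only five pairs the achievable FPR values are 0, 0.5 and 1, which is why the benchmark needs millions of pairs before a target such as 1e-6 becomes meaningful.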
if __name__ == '__main__':
|
||||
parser = argparse.ArgumentParser(description='do ijb test')
|
||||
# general
|
||||
parser.add_argument('--model-root', default='', help='path to load model.')
|
||||
parser.add_argument('--image-path', default='/train_tmp/IJB_release/IJBC', type=str, help='')
|
||||
parser.add_argument('--target', default='IJBC', type=str, help='target, set to IJBC or IJBB')
|
||||
main(parser.parse_args())
|
||||
@@ -0,0 +1,330 @@
|
||||
import collections
|
||||
from typing import Callable
|
||||
|
||||
import torch
|
||||
from torch import distributed
|
||||
from torch.nn.functional import linear, normalize
|
||||
|
||||
|
||||
class PartialFC(torch.nn.Module):
|
||||
"""
|
||||
https://arxiv.org/abs/2010.05222
|
||||
A distributed sparsely updating variant of the FC layer, named Partial FC (PFC).
|
||||
|
||||
When the sample rate is less than 1, in each iteration the positive class centers and a random subset of
|
||||
negative class centers are selected to compute the margin-based softmax loss. All class
|
||||
centers are still maintained throughout the whole training process, but only a subset is
|
||||
selected and updated in each iteration.
|
||||
|
||||
.. note::
|
||||
When the sample rate equals 1, Partial FC is equivalent to model parallelism (the default sample rate is 1).
|
||||
|
||||
Example:
|
||||
--------
|
||||
>>> module_pfc = PartialFC(margin_loss=ArcFace(), embedding_size=512, num_classes=8000000, sample_rate=0.2)
|
||||
>>> for img, labels in data_loader:
|
||||
>>> embeddings = net(img)
|
||||
>>> loss = module_pfc(embeddings, labels, optimizer)
|
||||
>>> loss.backward()
|
||||
>>> optimizer.step()
|
||||
"""
|
||||
_version = 1
|
||||
def __init__(
|
||||
self,
|
||||
margin_loss: Callable,
|
||||
embedding_size: int,
|
||||
num_classes: int,
|
||||
sample_rate: float = 1.0,
|
||||
fp16: bool = False,
|
||||
):
|
||||
"""
|
||||
Parameters:
|
||||
-----------
|
||||
embedding_size: int
|
||||
The dimension of embedding, required
|
||||
num_classes: int
|
||||
Total number of classes, required
|
||||
sample_rate: float
|
||||
The rate of negative centers participating in the calculation, default is 1.0.
|
||||
"""
|
||||
super(PartialFC, self).__init__()
|
||||
assert (
|
||||
distributed.is_initialized()
|
||||
), "torch.distributed must be initialized before creating PartialFC"
|
||||
self.rank = distributed.get_rank()
|
||||
self.world_size = distributed.get_world_size()
|
||||
|
||||
self.dist_cross_entropy = DistCrossEntropy()
|
||||
self.embedding_size = embedding_size
|
||||
self.sample_rate: float = sample_rate
|
||||
self.fp16 = fp16
|
||||
self.num_local: int = num_classes // self.world_size + int(
|
||||
self.rank < num_classes % self.world_size
|
||||
)
|
||||
self.class_start: int = num_classes // self.world_size * self.rank + min(
|
||||
self.rank, num_classes % self.world_size
|
||||
)
|
||||
self.num_sample: int = int(self.sample_rate * self.num_local)
|
||||
self.last_batch_size: int = 0
|
||||
self.weight: torch.Tensor
|
||||
self.weight_mom: torch.Tensor
|
||||
self.weight_activated: torch.nn.Parameter
|
||||
self.weight_activated_mom: torch.Tensor
|
||||
self.is_updated: bool = True
|
||||
self.init_weight_update: bool = True
|
||||
|
||||
if self.sample_rate < 1:
|
||||
self.register_buffer("weight",
|
||||
tensor=torch.normal(0, 0.01, (self.num_local, embedding_size)))
|
||||
self.register_buffer("weight_mom",
|
||||
tensor=torch.zeros_like(self.weight))
|
||||
self.register_parameter("weight_activated",
|
||||
param=torch.nn.Parameter(torch.empty(0, 0)))
|
||||
self.register_buffer("weight_activated_mom",
|
||||
tensor=torch.empty(0, 0))
|
||||
self.register_buffer("weight_index",
|
||||
tensor=torch.empty(0, 0))
|
||||
else:
|
||||
self.weight_activated = torch.nn.Parameter(torch.normal(0, 0.01, (self.num_local, embedding_size)))
|
||||
|
||||
# margin_loss
|
||||
if isinstance(margin_loss, Callable):
|
||||
self.margin_softmax = margin_loss
|
||||
else:
|
||||
raise TypeError("margin_loss must be callable")
|
||||
|
||||
@torch.no_grad()
|
||||
def sample(self,
|
||||
labels: torch.Tensor,
|
||||
index_positive: torch.Tensor,
|
||||
optimizer: torch.optim.Optimizer):
|
||||
"""
|
||||
This function modifies the values of labels in place.
|
||||
|
||||
Parameters:
|
||||
-----------
|
||||
labels: torch.Tensor
|
||||
pass
|
||||
index_positive: torch.Tensor
|
||||
pass
|
||||
optimizer: torch.optim.Optimizer
|
||||
pass
|
||||
"""
|
||||
positive = torch.unique(labels[index_positive], sorted=True).cuda()
|
||||
if self.num_sample - positive.size(0) >= 0:
|
||||
perm = torch.rand(size=[self.num_local]).cuda()
|
||||
perm[positive] = 2.0
|
||||
index = torch.topk(perm, k=self.num_sample)[1].cuda()
|
||||
index = index.sort()[0].cuda()
|
||||
else:
|
||||
index = positive
|
||||
self.weight_index = index
|
||||
|
||||
labels[index_positive] = torch.searchsorted(index, labels[index_positive])
|
||||
|
||||
self.weight_activated = torch.nn.Parameter(self.weight[self.weight_index])
|
||||
self.weight_activated_mom = self.weight_mom[self.weight_index]
|
||||
|
||||
if isinstance(optimizer, torch.optim.SGD):
|
||||
# TODO the params of partial fc must be last in the params list
|
||||
optimizer.state.pop(optimizer.param_groups[-1]["params"][0], None)
|
||||
optimizer.param_groups[-1]["params"][0] = self.weight_activated
|
||||
optimizer.state[self.weight_activated][
|
||||
"momentum_buffer"
|
||||
] = self.weight_activated_mom
|
||||
else:
|
||||
raise TypeError("sampling in PartialFC only supports torch.optim.SGD")
|
||||
|
||||
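The `sample` method keeps every positive class and fills the remaining budget with random negatives by giving positives a priority of 2.0, which no uniform draw in [0, 1) can beat, and then taking the top `num_sample` priorities. Below is a CPU-only NumPy sketch of that trick. The function name `sample_centers` and its arguments are assumptions for illustration, and it uses a full `argsort` where the source uses `torch.topk`.

```python
import numpy as np


def sample_centers(local_labels, num_local, num_sample, rng):
    """Pick num_sample class indices that always include the positive classes."""
    positive = np.unique(local_labels)               # sorted positive class ids
    if num_sample >= positive.size:
        perm = rng.random(num_local)                 # random priority in [0, 1)
        perm[positive] = 2.0                         # positives outrank every negative
        index = np.sort(np.argsort(-perm)[:num_sample])
    else:
        index = positive                             # more positives than the budget
    remapped = np.searchsorted(index, local_labels)  # label -> row in the sampled weights
    return index, remapped


rng = np.random.default_rng(0)
labels = np.array([7, 2, 7])
index, remapped = sample_centers(labels, num_local=10, num_sample=4, rng=rng)
print(index, remapped)
```

Because `index` is sorted, `searchsorted` maps each original label to its row in the sampled weight matrix, which is what the in-place label rewrite in `sample` relies on.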
@torch.no_grad()
|
||||
def update(self):
|
||||
""" partial weight to global
|
||||
"""
|
||||
if self.init_weight_update:
|
||||
self.init_weight_update = False
|
||||
return
|
||||
|
||||
if self.sample_rate < 1:
|
||||
self.weight[self.weight_index] = self.weight_activated
|
||||
self.weight_mom[self.weight_index] = self.weight_activated_mom
|
||||
|
||||
|
||||
def forward(
|
||||
self,
|
||||
local_embeddings: torch.Tensor,
|
||||
local_labels: torch.Tensor,
|
||||
optimizer: torch.optim.Optimizer,
|
||||
):
|
||||
"""
|
||||
Parameters:
|
||||
----------
|
||||
local_embeddings: torch.Tensor
|
||||
feature embeddings on each GPU(Rank).
|
||||
local_labels: torch.Tensor
|
||||
labels on each GPU(Rank).
|
||||
|
||||
Returns:
|
||||
-------
|
||||
loss: torch.Tensor
|
||||
pass
|
||||
"""
|
||||
local_labels.squeeze_()
|
||||
local_labels = local_labels.long()
|
||||
self.update()
|
||||
|
||||
batch_size = local_embeddings.size(0)
|
||||
if self.last_batch_size == 0:
|
||||
self.last_batch_size = batch_size
|
||||
assert self.last_batch_size == batch_size, (
|
||||
"last batch size does not equal current batch size: {} vs {}".format(
|
||||
self.last_batch_size, batch_size))
|
||||
|
||||
_gather_embeddings = [
|
||||
torch.zeros((batch_size, self.embedding_size)).cuda()
|
||||
for _ in range(self.world_size)
|
||||
]
|
||||
_gather_labels = [
|
||||
torch.zeros(batch_size).long().cuda() for _ in range(self.world_size)
|
||||
]
|
||||
_list_embeddings = AllGather(local_embeddings, *_gather_embeddings)
|
||||
distributed.all_gather(_gather_labels, local_labels)
|
||||
|
||||
embeddings = torch.cat(_list_embeddings)
|
||||
labels = torch.cat(_gather_labels)
|
||||
|
||||
labels = labels.view(-1, 1)
|
||||
index_positive = (self.class_start <= labels) & (
|
||||
labels < self.class_start + self.num_local
|
||||
)
|
||||
labels[~index_positive] = -1
|
||||
labels[index_positive] -= self.class_start
|
||||
|
||||
if self.sample_rate < 1:
|
||||
self.sample(labels, index_positive, optimizer)
|
||||
|
||||
with torch.cuda.amp.autocast(self.fp16):
|
||||
norm_embeddings = normalize(embeddings)
|
||||
norm_weight_activated = normalize(self.weight_activated)
|
||||
logits = linear(norm_embeddings, norm_weight_activated)
|
||||
if self.fp16:
|
||||
logits = logits.float()
|
||||
logits = logits.clamp(-1, 1)
|
||||
|
||||
logits = self.margin_softmax(logits, labels)
|
||||
loss = self.dist_cross_entropy(logits, labels)
|
||||
return loss
|
||||
|
||||
def state_dict(self, destination=None, prefix="", keep_vars=False):
|
||||
if destination is None:
|
||||
destination = collections.OrderedDict()
|
||||
destination._metadata = collections.OrderedDict()
|
||||
|
||||
for name, module in self._modules.items():
|
||||
if module is not None:
|
||||
module.state_dict(destination, prefix + name + ".", keep_vars=keep_vars)
|
||||
if self.sample_rate < 1:
|
||||
destination["weight"] = self.weight.detach()
|
||||
else:
|
||||
destination["weight"] = self.weight_activated.data.detach()
|
||||
return destination
|
||||
|
||||
def load_state_dict(self, state_dict, strict: bool = True):
|
||||
if self.sample_rate < 1:
|
||||
self.weight = state_dict["weight"].to(self.weight.device)
|
||||
self.weight_mom.zero_()
|
||||
self.weight_activated.data.zero_()
|
||||
self.weight_activated_mom.zero_()
|
||||
self.weight_index.zero_()
|
||||
else:
|
||||
self.weight_activated.data = state_dict["weight"].to(self.weight_activated.data.device)
|
||||
|
||||
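`PartialFC.__init__` splits the class centers across ranks so that the first `num_classes % world_size` ranks each hold one extra class. The arithmetic behind `num_local` and `class_start` can be checked with a short pure-Python sketch. The function name `partition` is hypothetical.

```python
def partition(num_classes, world_size, rank):
    """Return (class_start, num_local) for one rank; low ranks absorb the remainder."""
    base, extra = divmod(num_classes, world_size)
    num_local = base + int(rank < extra)
    class_start = base * rank + min(rank, extra)
    return class_start, num_local


shards = [partition(10, 4, rank) for rank in range(4)]
print(shards)  # [(0, 3), (3, 3), (6, 2), (8, 2)]
```

Each shard starts where the previous one ends, so every class id belongs to exactly one rank.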
class DistCrossEntropyFunc(torch.autograd.Function):
|
||||
"""
|
||||
Cross-entropy loss is computed in parallel: the row-wise maximum and the softmax denominator are all-reduced across GPUs before the softmax is evaluated.
|
||||
Implementation of ArcFace (https://arxiv.org/pdf/1801.07698v1.pdf).
|
||||
"""
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, logits: torch.Tensor, label: torch.Tensor):
|
||||
""" """
|
||||
batch_size = logits.size(0)
|
||||
# for numerical stability
|
||||
max_logits, _ = torch.max(logits, dim=1, keepdim=True)
|
||||
# local to global
|
||||
distributed.all_reduce(max_logits, distributed.ReduceOp.MAX)
|
||||
logits.sub_(max_logits)
|
||||
logits.exp_()
|
||||
sum_logits_exp = torch.sum(logits, dim=1, keepdim=True)
|
||||
# local to global
|
||||
distributed.all_reduce(sum_logits_exp, distributed.ReduceOp.SUM)
|
||||
logits.div_(sum_logits_exp)
|
||||
index = torch.where(label != -1)[0]
|
||||
# loss
|
||||
loss = torch.zeros(batch_size, 1, device=logits.device)
|
||||
loss[index] = logits[index].gather(1, label[index])
|
||||
distributed.all_reduce(loss, distributed.ReduceOp.SUM)
|
||||
ctx.save_for_backward(index, logits, label)
|
||||
return loss.clamp_min_(1e-30).log_().mean() * (-1)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, loss_gradient):
|
||||
"""
|
||||
Args:
|
||||
loss_gradient (torch.Tensor): gradient propagated back from the following layer
|
||||
Returns:
|
||||
gradients for each input in forward function
|
||||
`None` gradients for one-hot label
|
||||
"""
|
||||
(
|
||||
index,
|
||||
logits,
|
||||
label,
|
||||
) = ctx.saved_tensors
|
||||
batch_size = logits.size(0)
|
||||
one_hot = torch.zeros(
|
||||
size=[index.size(0), logits.size(1)], device=logits.device
|
||||
)
|
||||
one_hot.scatter_(1, label[index], 1)
|
||||
logits[index] -= one_hot
|
||||
logits.div_(batch_size)
|
||||
return logits * loss_gradient.item(), None
|
||||
|
||||
|
||||
class DistCrossEntropy(torch.nn.Module):
|
||||
def __init__(self):
|
||||
super(DistCrossEntropy, self).__init__()
|
||||
|
||||
def forward(self, logit_part, label_part):
|
||||
return DistCrossEntropyFunc.apply(logit_part, label_part)
|
||||
|
||||
|
||||
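`DistCrossEntropyFunc` never materialises the full logit matrix: each rank holds a slice of the classes and only the row-wise maximum, the softmax denominator, and the per-sample target probability are reduced across ranks. The single-process NumPy simulation below illustrates why this gives the ordinary cross-entropy. Python lists stand in for ranks, the name `sharded_cross_entropy` is hypothetical, and no real collective communication takes place.

```python
import numpy as np


def sharded_cross_entropy(logit_shards, labels):
    """Cross-entropy over class-sharded logits using only per-shard reductions."""
    # stands in for all_reduce(MAX): global row-wise maximum for numerical stability
    global_max = np.max([s.max(axis=1) for s in logit_shards], axis=0)[:, None]
    exp_shards = [np.exp(s - global_max) for s in logit_shards]
    # stands in for all_reduce(SUM): global softmax denominator
    denom = np.sum([e.sum(axis=1) for e in exp_shards], axis=0)
    # each shard contributes the probability of the labels it owns
    prob = np.zeros(len(labels))
    start = 0
    for e in exp_shards:
        width = e.shape[1]
        owned = (labels >= start) & (labels < start + width)
        prob[owned] = e[owned, labels[owned] - start] / denom[owned]
        start += width
    return -np.log(np.clip(prob, 1e-30, None)).mean()


rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 6))
labels = np.array([0, 5, 2, 3])
loss = sharded_cross_entropy(np.split(logits, 2, axis=1), labels)

# Reference: cross-entropy computed on the unsplit logits.
shifted = logits - logits.max(axis=1, keepdims=True)
reference = -(shifted[np.arange(4), labels] - np.log(np.exp(shifted).sum(axis=1))).mean()
print(loss, reference)
```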
class AllGatherFunc(torch.autograd.Function):
|
||||
"""AllGather op with gradient backward"""
|
||||
|
||||
@staticmethod
|
||||
def forward(ctx, tensor, *gather_list):
|
||||
gather_list = list(gather_list)
|
||||
distributed.all_gather(gather_list, tensor)
|
||||
return tuple(gather_list)
|
||||
|
||||
@staticmethod
|
||||
def backward(ctx, *grads):
|
||||
grad_list = list(grads)
|
||||
rank = distributed.get_rank()
|
||||
grad_out = grad_list[rank]
|
||||
|
||||
dist_ops = [
|
||||
distributed.reduce(grad_out, rank, distributed.ReduceOp.SUM, async_op=True)
|
||||
if i == rank
|
||||
else distributed.reduce(
|
||||
grad_list[i], i, distributed.ReduceOp.SUM, async_op=True
|
||||
)
|
||||
for i in range(distributed.get_world_size())
|
||||
]
|
||||
for _op in dist_ops:
|
||||
_op.wait()
|
||||
|
||||
grad_out *= len(grad_list) # cooperate with distributed loss function
|
||||
return (grad_out, *[None for _ in range(len(grad_list))])
|
||||
|
||||
|
||||
AllGather = AllGatherFunc.apply
|
||||
@@ -0,0 +1,5 @@
|
||||
tensorboard
|
||||
easydict
|
||||
mxnet
|
||||
onnx
|
||||
scikit-learn
|
||||
@@ -0,0 +1,9 @@
|
||||
|
||||
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch \
|
||||
--nproc_per_node=8 \
|
||||
--nnodes=1 \
|
||||
--node_rank=0 \
|
||||
--master_addr="127.0.0.1" \
|
||||
--master_port=12345 train.py "$@"
|
||||
|
||||
ps -ef | grep "train" | grep -v grep | awk '{print "kill -9 "$2}' | sh
|
||||
@@ -0,0 +1,53 @@
|
||||
import numpy as np
|
||||
import onnx
|
||||
import torch
|
||||
|
||||
|
||||
def convert_onnx(net, path_module, output, opset=11, simplify=False):
|
||||
assert isinstance(net, torch.nn.Module)
|
||||
img = np.random.randint(0, 255, size=(112, 112, 3), dtype=np.int32)
|
||||
img = img.astype(np.float64)
|
||||
img = (img / 255. - 0.5) / 0.5 # torch style norm
|
||||
img = img.transpose((2, 0, 1))
|
||||
img = torch.from_numpy(img).unsqueeze(0).float()
|
||||
|
||||
weight = torch.load(path_module)
|
||||
net.load_state_dict(weight, strict=True)
|
||||
net.eval()
|
||||
torch.onnx.export(net, img, output, keep_initializers_as_inputs=False, verbose=False, opset_version=opset)
|
||||
model = onnx.load(output)
|
||||
graph = model.graph
|
||||
graph.input[0].type.tensor_type.shape.dim[0].dim_param = 'None'
|
||||
if simplify:
|
||||
from onnxsim import simplify
|
||||
model, check = simplify(model)
|
||||
assert check, "Simplified ONNX model could not be validated"
|
||||
onnx.save(model, output)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
import os
|
||||
import argparse
|
||||
from backbones import get_model
|
||||
|
||||
parser = argparse.ArgumentParser(description='ArcFace PyTorch to onnx')
|
||||
parser.add_argument('input', type=str, help='input backbone.pth file or path')
|
||||
parser.add_argument('--output', type=str, default=None, help='output onnx path')
|
||||
parser.add_argument('--network', type=str, default=None, help='backbone network')
|
||||
parser.add_argument('--simplify', action='store_true', help='onnx simplify')
|
||||
args = parser.parse_args()
|
||||
input_file = args.input
|
||||
if os.path.isdir(input_file):
|
||||
input_file = os.path.join(input_file, "model.pt")
|
||||
assert os.path.exists(input_file)
|
||||
# model_name = os.path.basename(os.path.dirname(input_file)).lower()
|
||||
# params = model_name.split("_")
|
||||
# if len(params) >= 3 and params[1] in ('arcface', 'cosface'):
|
||||
# if args.network is None:
|
||||
# args.network = params[2]
|
||||
assert args.network is not None
|
||||
print(args)
|
||||
backbone_onnx = get_model(args.network, dropout=0)
|
||||
if args.output is None:
|
||||
args.output = os.path.join(os.path.dirname(input_file), "model.onnx")
|
||||
convert_onnx(backbone_onnx, input_file, args.output, simplify=args.simplify)
|
||||
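Boolean command-line switches such as `--simplify` are easy to get wrong with `argparse`: `type=bool` calls `bool()` on the raw string, so any non-empty value, including `False`, parses as true. The standalone demonstration below compares that with `action='store_true'`. The option names `--typed` and `--switch` are made up for the demo.

```python
import argparse

parser = argparse.ArgumentParser(description='flag parsing demo')
parser.add_argument('--typed', type=bool, default=False)   # bool('False') is True
parser.add_argument('--switch', action='store_true')       # present -> True, absent -> False

args = parser.parse_args(['--typed', 'False'])
print(args.typed, args.switch)   # True False

switched = parser.parse_args(['--switch'])
print(switched.switch)           # True
```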
@@ -0,0 +1,161 @@
|
||||
import argparse
|
||||
import logging
|
||||
import os
|
||||
|
||||
import torch
|
||||
from torch import distributed
|
||||
from torch.utils.tensorboard import SummaryWriter
|
||||
|
||||
from backbones import get_model
|
||||
from dataset import get_dataloader
|
||||
from torch.utils.data import DataLoader
|
||||
from lr_scheduler import PolyScheduler
|
||||
from losses import CosFace, ArcFace
|
||||
from partial_fc import PartialFC
|
||||
from utils.utils_callbacks import CallBackLogging, CallBackVerification
|
||||
from utils.utils_config import get_config
|
||||
from utils.utils_logging import AverageMeter, init_logging
|
||||
|
||||
|
||||
try:
|
||||
world_size = int(os.environ["WORLD_SIZE"])
|
||||
rank = int(os.environ["RANK"])
|
||||
distributed.init_process_group("nccl")
|
||||
except KeyError:
|
||||
world_size = 1
|
||||
rank = 0
|
||||
distributed.init_process_group(
|
||||
backend="nccl",
|
||||
init_method="tcp://127.0.0.1:12584",
|
||||
rank=rank,
|
||||
world_size=world_size,
|
||||
)
|
||||
|
||||
|
||||
def main(args):
|
||||
torch.cuda.set_device(args.local_rank)
|
||||
cfg = get_config(args.config)
|
||||
|
||||
os.makedirs(cfg.output, exist_ok=True)
|
||||
init_logging(rank, cfg.output)
|
||||
summary_writer = (
|
||||
SummaryWriter(log_dir=os.path.join(cfg.output, "tensorboard"))
|
||||
if rank == 0
|
||||
else None
|
||||
)
|
||||
train_loader = get_dataloader(
|
||||
cfg.rec, local_rank=args.local_rank, batch_size=cfg.batch_size, dali=cfg.dali)
|
||||
backbone = get_model(
|
||||
cfg.network, dropout=0.0, fp16=cfg.fp16, num_features=cfg.embedding_size
|
||||
).cuda()
|
||||
|
||||
backbone = torch.nn.parallel.DistributedDataParallel(
|
||||
module=backbone, broadcast_buffers=False, device_ids=[args.local_rank])
|
||||
backbone.train()
|
||||
|
||||
if cfg.loss == "arcface":
|
||||
margin_loss = ArcFace()
|
||||
elif cfg.loss == "cosface":
|
||||
margin_loss = CosFace()
|
||||
else:
|
||||
raise ValueError("unsupported loss: %s" % cfg.loss)
|
||||
|
||||
module_partial_fc = PartialFC(
|
||||
margin_loss,
|
||||
cfg.embedding_size,
|
||||
cfg.num_classes,
|
||||
cfg.sample_rate,
|
||||
cfg.fp16
|
||||
)
|
||||
module_partial_fc.train().cuda()
|
||||
|
||||
# TODO the params of partial fc must be last in the params list
|
||||
opt = torch.optim.SGD(
|
||||
params=[
|
||||
{"params": backbone.parameters(), },
|
||||
{"params": module_partial_fc.parameters(), },
|
||||
],
|
||||
lr=cfg.lr,
|
||||
momentum=0.9,
|
||||
weight_decay=cfg.weight_decay
|
||||
)
|
||||
total_batch_size = cfg.batch_size * world_size
|
||||
cfg.warmup_step = cfg.num_image // total_batch_size * cfg.warmup_epoch
|
||||
cfg.total_step = cfg.num_image // total_batch_size * cfg.num_epoch
|
||||
lr_scheduler = PolyScheduler(
|
||||
optimizer=opt,
|
||||
base_lr=cfg.lr,
|
||||
max_steps=cfg.total_step,
|
||||
warmup_steps=cfg.warmup_step
|
||||
)
|
||||
|
||||
for key, value in cfg.items():
|
||||
num_space = 25 - len(key)
|
||||
logging.info(": " + key + " " * num_space + str(value))
|
||||
|
||||
callback_verification = CallBackVerification(
|
||||
val_targets=cfg.val_targets, rec_prefix=cfg.rec, summary_writer=summary_writer
|
||||
)
|
||||
callback_logging = CallBackLogging(
|
||||
frequent=cfg.frequent,
|
||||
total_step=cfg.total_step,
|
||||
batch_size=cfg.batch_size,
|
||||
writer=summary_writer
|
||||
)
|
||||
|
||||
loss_am = AverageMeter()
|
||||
start_epoch = 0
|
||||
global_step = 0
|
||||
amp = torch.cuda.amp.grad_scaler.GradScaler(growth_interval=100)
|
||||
|
||||
for epoch in range(start_epoch, cfg.num_epoch):
|
||||
|
||||
if isinstance(train_loader, DataLoader):
|
||||
train_loader.sampler.set_epoch(epoch)
|
||||
for _, (img, local_labels) in enumerate(train_loader):
|
||||
global_step += 1
|
||||
local_embeddings = backbone(img)
|
||||
loss: torch.Tensor = module_partial_fc(local_embeddings, local_labels, opt)
|
||||
|
||||
if cfg.fp16:
|
||||
amp.scale(loss).backward()
|
||||
amp.unscale_(opt)
|
||||
torch.nn.utils.clip_grad_norm_(backbone.parameters(), 5)
|
||||
amp.step(opt)
|
||||
amp.update()
|
||||
else:
|
||||
loss.backward()
|
||||
torch.nn.utils.clip_grad_norm_(backbone.parameters(), 5)
|
||||
opt.step()
|
||||
|
||||
opt.zero_grad()
|
||||
lr_scheduler.step()
|
||||
|
||||
with torch.no_grad():
|
||||
loss_am.update(loss.item(), 1)
|
||||
callback_logging(global_step, loss_am, epoch, cfg.fp16, lr_scheduler.get_last_lr()[0], amp)
|
||||
|
||||
if global_step % cfg.verbose == 0 and global_step > 200:
|
||||
callback_verification(global_step, backbone)
|
||||
|
||||
path_pfc = os.path.join(cfg.output, "softmax_fc_gpu_{}.pt".format(rank))
|
||||
torch.save(module_partial_fc.state_dict(), path_pfc)
|
||||
if rank == 0:
|
||||
path_module = os.path.join(cfg.output, "model.pt")
|
||||
torch.save(backbone.module.state_dict(), path_module)
|
||||
|
||||
if cfg.dali:
|
||||
train_loader.reset()
|
||||
|
||||
if rank == 0:
|
||||
path_module = os.path.join(cfg.output, "model.pt")
|
||||
torch.save(backbone.module.state_dict(), path_module)
|
||||
distributed.destroy_process_group()
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
torch.backends.cudnn.benchmark = True
|
||||
parser = argparse.ArgumentParser(description="Distributed Arcface Training in Pytorch")
|
||||
parser.add_argument("config", type=str, help="py config file")
|
||||
parser.add_argument("--local_rank", type=int, default=0, help="local_rank")
|
||||
main(parser.parse_args())
|
||||
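The step counts passed to `PolyScheduler` in `main` come from integer division of the dataset size by the global batch size. A worked sketch of that arithmetic follows; every number in it is hypothetical and chosen only for illustration, not taken from any config in this commit.

```python
# Hypothetical values for illustration only.
num_image = 5_000_000   # images in the training set
batch_size = 128        # per-process (per-GPU) batch size
world_size = 8          # number of processes
warmup_epoch = 1
num_epoch = 20

# Same arithmetic as in main(): the global batch is the per-GPU batch
# times the number of processes, and a partial final batch is dropped
# by the floor division.
total_batch_size = batch_size * world_size
steps_per_epoch = num_image // total_batch_size
warmup_step = steps_per_epoch * warmup_epoch
total_step = steps_per_epoch * num_epoch
print(total_batch_size, steps_per_epoch, warmup_step, total_step)
```

Because of the floor division, the 832 images left over after 4882 full batches of 1024 do not contribute to the step count in this example.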
@@ -0,0 +1,71 @@
|
||||
import os
|
||||
import sys
|
||||
|
||||
import matplotlib.pyplot as plt
|
||||
import numpy as np
|
||||
import pandas as pd
|
||||
from menpo.visualize.viewmatplotlib import sample_colours_from_colourmap
|
||||
from prettytable import PrettyTable
|
||||
from sklearn.metrics import roc_curve, auc
|
||||
|
||||
with open(sys.argv[1], "r") as f:
|
||||
files = f.readlines()
|
||||
|
||||
files = [x.strip() for x in files]
|
||||
image_path = "/train_tmp/IJB_release/IJBC"
|
||||
|
||||
|
||||
def read_template_pair_list(path):
|
||||
pairs = pd.read_csv(path, sep=' ', header=None).values
|
||||
t1 = pairs[:, 0].astype(int)
|
||||
t2 = pairs[:, 1].astype(int)
|
||||
label = pairs[:, 2].astype(int)
|
||||
return t1, t2, label
|
||||
|
||||
|
||||
p1, p2, label = read_template_pair_list(
|
||||
os.path.join('%s/meta' % image_path,
|
||||
'%s_template_pair_label.txt' % 'ijbc'))
|
||||
|
||||
methods = []
|
||||
scores = []
|
||||
for file in files:
|
||||
methods.append(file)
|
||||
scores.append(np.load(file))
|
||||
|
||||
methods = np.array(methods)
|
||||
scores = dict(zip(methods, scores))
|
||||
colours = dict(
|
||||
zip(methods, sample_colours_from_colourmap(methods.shape[0], 'Set2')))
|
||||
x_labels = [10 ** -6, 10 ** -5, 10 ** -4, 10 ** -3, 10 ** -2, 10 ** -1]
|
||||
tpr_fpr_table = PrettyTable(['Methods'] + [str(x) for x in x_labels])
|
||||
fig = plt.figure()
|
||||
for method in methods:
|
||||
fpr, tpr, _ = roc_curve(label, scores[method])
|
||||
roc_auc = auc(fpr, tpr)
|
||||
fpr = np.flipud(fpr)
|
||||
tpr = np.flipud(tpr) # select largest tpr at same fpr
|
||||
plt.plot(fpr,
|
||||
tpr,
|
||||
color=colours[method],
|
||||
lw=1,
|
||||
label=('[%s (AUC = %0.4f %%)]' %
|
||||
(method.split('-')[-1], roc_auc * 100)))
|
||||
tpr_fpr_row = []
|
||||
tpr_fpr_row.append(method)
|
||||
for fpr_iter in np.arange(len(x_labels)):
|
||||
_, min_index = min(
|
||||
list(zip(abs(fpr - x_labels[fpr_iter]), range(len(fpr)))))
|
||||
tpr_fpr_row.append('%.2f' % (tpr[min_index] * 100))
|
||||
tpr_fpr_table.add_row(tpr_fpr_row)
|
||||
plt.xlim([10 ** -6, 0.1])
|
||||
plt.ylim([0.3, 1.0])
|
||||
plt.grid(linestyle='--', linewidth=1)
|
||||
plt.xticks(x_labels)
|
||||
plt.yticks(np.linspace(0.3, 1.0, 8, endpoint=True))
|
||||
plt.xscale('log')
|
||||
plt.xlabel('False Positive Rate')
|
||||
plt.ylabel('True Positive Rate')
|
||||
plt.title('ROC on IJB')
|
||||
plt.legend(loc="lower right")
|
||||
print(tpr_fpr_table)
|
||||
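Each table row in the script above is filled by taking, for every target false positive rate, the TPR at the ROC point whose FPR lies closest to that target. A minimal pure-Python sketch of that lookup is shown here. The `fpr`/`tpr` values are made up, and `tpr_at_fpr` is a hypothetical helper that stands in for the `min(zip(...))` idiom.

```python
# Made-up ROC points for illustration only.
fpr = [0.0, 1e-5, 1e-4, 1e-3, 1e-2, 1.0]
tpr = [0.0, 0.50, 0.70, 0.85, 0.95, 1.0]
targets = [1e-4, 1e-2]


def tpr_at_fpr(fpr, tpr, target):
    """Return the TPR at the ROC point whose FPR is nearest to target."""
    idx = min(range(len(fpr)), key=lambda i: abs(fpr[i] - target))
    return tpr[idx]


# Format as percentages with two decimals, as the table does.
row = ['%.2f' % (tpr_at_fpr(fpr, tpr, t) * 100) for t in targets]
print(row)
```

A nearest-point lookup does not interpolate, so the reported value depends on how densely the ROC curve is sampled around each target.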
@@ -0,0 +1,110 @@
|
||||
import logging
|
||||
import os
|
||||
import time
|
||||
from typing import List
|
||||
|
||||
import torch
|
||||
|
||||
from eval import verification
|
||||
from utils.utils_logging import AverageMeter
|
||||
from torch.utils.tensorboard import SummaryWriter
|
||||
from torch import distributed
|
||||
|
||||
|
||||
class CallBackVerification(object):
|
||||
|
||||
def __init__(self, val_targets, rec_prefix, summary_writer=None, image_size=(112, 112)):
|
||||
self.rank: int = distributed.get_rank()
|
||||
self.highest_acc: float = 0.0
|
||||
self.highest_acc_list: List[float] = [0.0] * len(val_targets)
|
||||
self.ver_list: List[object] = []
|
||||
self.ver_name_list: List[str] = []
|
||||
if self.rank == 0:
|
||||
self.init_dataset(val_targets=val_targets, data_dir=rec_prefix, image_size=image_size)
|
||||
|
||||
self.summary_writer = summary_writer
|
||||
|
||||
def ver_test(self, backbone: torch.nn.Module, global_step: int):
|
||||
results = []
|
||||
for i in range(len(self.ver_list)):
|
||||
acc1, std1, acc2, std2, xnorm, embeddings_list = verification.test(
|
||||
self.ver_list[i], backbone, 10, 10)
|
||||
logging.info('[%s][%d]XNorm: %f' % (self.ver_name_list[i], global_step, xnorm))
|
||||
logging.info('[%s][%d]Accuracy-Flip: %1.5f+-%1.5f' % (self.ver_name_list[i], global_step, acc2, std2))
|
||||
|
||||
self.summary_writer: SummaryWriter
|
||||
self.summary_writer.add_scalar(tag=self.ver_name_list[i], scalar_value=acc2, global_step=global_step, )
|
||||
|
||||
if acc2 > self.highest_acc_list[i]:
|
||||
self.highest_acc_list[i] = acc2
|
||||
logging.info(
|
||||
'[%s][%d]Accuracy-Highest: %1.5f' % (self.ver_name_list[i], global_step, self.highest_acc_list[i]))
|
||||
results.append(acc2)
|
||||
|
||||
def init_dataset(self, val_targets, data_dir, image_size):
|
||||
for name in val_targets:
|
||||
path = os.path.join(data_dir, name + ".bin")
|
||||
if os.path.exists(path):
|
||||
data_set = verification.load_bin(path, image_size)
|
||||
self.ver_list.append(data_set)
|
||||
self.ver_name_list.append(name)
|
||||
|
||||
def __call__(self, num_update, backbone: torch.nn.Module):
|
||||
if self.rank == 0 and num_update > 0:
|
||||
backbone.eval()
|
||||
self.ver_test(backbone, num_update)
|
||||
backbone.train()
|
||||
|
||||
|
||||
class CallBackLogging(object):
|
||||
def __init__(self, frequent, total_step, batch_size, writer=None):
|
||||
self.frequent: int = frequent
|
||||
self.rank: int = distributed.get_rank()
|
||||
self.world_size: int = distributed.get_world_size()
|
||||
self.time_start = time.time()
|
||||
self.total_step: int = total_step
|
||||
self.batch_size: int = batch_size
|
||||
self.writer = writer
|
||||
|
||||
self.init = False
|
||||
self.tic = 0
|
||||
|
||||
def __call__(self,
|
||||
global_step: int,
|
||||
loss: AverageMeter,
|
||||
epoch: int,
|
||||
fp16: bool,
|
||||
learning_rate: float,
|
||||
grad_scaler: torch.cuda.amp.GradScaler):
|
||||
if self.rank == 0 and global_step > 0 and global_step % self.frequent == 0:
|
||||
if self.init:
|
||||
try:
|
||||
speed: float = self.frequent * self.batch_size / (time.time() - self.tic)
|
||||
speed_total = speed * self.world_size
|
||||
except ZeroDivisionError:
|
||||
speed_total = float('inf')
|
||||
|
||||
time_now = (time.time() - self.time_start) / 3600
|
||||
time_total = time_now / ((global_step + 1) / self.total_step)
|
||||
time_for_end = time_total - time_now
|
||||
if self.writer is not None:
|
||||
self.writer.add_scalar('time_for_end', time_for_end, global_step)
|
||||
self.writer.add_scalar('learning_rate', learning_rate, global_step)
|
||||
self.writer.add_scalar('loss', loss.avg, global_step)
|
||||
if fp16:
|
||||
msg = "Speed %.2f samples/sec Loss %.4f LearningRate %.4f Epoch: %d Global Step: %d " \
|
||||
"Fp16 Grad Scale: %2.f Required: %1.f hours" % (
|
||||
speed_total, loss.avg, learning_rate, epoch, global_step,
|
||||
grad_scaler.get_scale(), time_for_end
|
||||
)
|
||||
else:
|
||||
msg = "Speed %.2f samples/sec Loss %.4f LearningRate %.4f Epoch: %d Global Step: %d " \
|
||||
"Required: %1.f hours" % (
|
||||
speed_total, loss.avg, learning_rate, epoch, global_step, time_for_end
|
||||
)
|
||||
logging.info(msg)
|
||||
loss.reset()
|
||||
self.tic = time.time()
|
||||
else:
|
||||
self.init = True
|
||||
self.tic = time.time()
|
||||
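`CallBackLogging` estimates the remaining training time by dividing the elapsed time by the fraction of steps completed and subtracting what has already elapsed. A small sketch of that arithmetic as a standalone function follows; `hours_remaining` is a hypothetical helper and the inputs are illustrative.

```python
def hours_remaining(elapsed_seconds, global_step, total_step):
    """Linear extrapolation of remaining hours from progress so far."""
    time_now = elapsed_seconds / 3600                      # hours elapsed
    time_total = time_now / ((global_step + 1) / total_step)
    return time_total - time_now


# Two hours elapsed after 1000 of 4000 steps (global_step is 0-based here).
eta = hours_remaining(elapsed_seconds=7200, global_step=999, total_step=4000)
print(eta)
```

The estimate assumes a constant step time, so it will drift whenever throughput changes, for example during periodic verification.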
@@ -0,0 +1,16 @@
|
||||
import importlib
|
||||
import os.path as osp
|
||||
|
||||
|
||||
def get_config(config_file):
|
||||
assert config_file.startswith('configs/'), 'config file setting must start with configs/'
|
||||
temp_config_name = osp.basename(config_file)
|
||||
temp_module_name = osp.splitext(temp_config_name)[0]
|
||||
config = importlib.import_module("configs.base")
|
||||
cfg = config.config
|
||||
config = importlib.import_module("configs.%s" % temp_module_name)
|
||||
job_cfg = config.config
|
||||
cfg.update(job_cfg)
|
||||
if cfg.output is None:
|
||||
cfg.output = osp.join('work_dirs', temp_module_name)
|
||||
return cfg
|
||||
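`get_config` loads `configs.base` and then overlays the job-specific module with `cfg.update(job_cfg)`, falling back to `work_dirs/<module name>` when no output directory is set. A minimal sketch of that overlay with plain dictionaries is shown here. The keys and values are hypothetical, and the real config object (attribute-style, defined outside this section) is not reproduced.

```python
import os.path as osp

# Hypothetical stand-ins for configs.base.config and a job config.
base = {"lr": 0.1, "fp16": True, "output": None}
job = {"lr": 0.02, "network": "r50"}

config_file = "configs/ms1mv3_r50_lr02"
module_name = osp.splitext(osp.basename(config_file))[0]

cfg = dict(base)   # copy so the base defaults are not mutated
cfg.update(job)    # job values override base values key by key
if cfg["output"] is None:
    cfg["output"] = osp.join("work_dirs", module_name)
print(cfg)
```

Keys present only in the base config survive the merge, which is why a job config needs to list only the settings it changes.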
@@ -0,0 +1,41 @@
|
||||
import logging
|
||||
import os
|
||||
import sys
|
||||
|
||||
|
||||
class AverageMeter(object):
|
||||
"""Computes and stores the average and current value
|
||||
"""
|
||||
|
||||
def __init__(self):
|
||||
self.val = None
|
||||
self.avg = None
|
||||
self.sum = None
|
||||
self.count = None
|
||||
self.reset()
|
||||
|
||||
def reset(self):
|
||||
self.val = 0
|
||||
self.avg = 0
|
||||
self.sum = 0
|
||||
self.count = 0
|
||||
|
||||
def update(self, val, n=1):
|
||||
self.val = val
|
||||
self.sum += val * n
|
||||
self.count += n
|
||||
self.avg = self.sum / self.count
|
||||
|
||||
|
||||
def init_logging(rank, models_root):
|
||||
if rank == 0:
|
||||
log_root = logging.getLogger()
|
||||
log_root.setLevel(logging.INFO)
|
||||
formatter = logging.Formatter("Training: %(asctime)s-%(message)s")
|
||||
handler_file = logging.FileHandler(os.path.join(models_root, "training.log"))
|
||||
handler_stream = logging.StreamHandler(sys.stdout)
|
||||
handler_file.setFormatter(formatter)
|
||||
handler_stream.setFormatter(formatter)
|
||||
log_root.addHandler(handler_file)
|
||||
log_root.addHandler(handler_stream)
|
||||
log_root.info('rank_id: %d' % rank)
|
||||
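A short usage sketch for `AverageMeter`, showing how the weight `n` affects the running average. The class is repeated in compact form only so that the snippet runs on its own; the sample values are illustrative.

```python
class AverageMeter(object):
    """Compact copy of the meter above: tracks last value and running mean."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


meter = AverageMeter()
meter.update(2.0)        # one sample with value 2.0
meter.update(4.0, n=3)   # three samples with value 4.0
print(meter.val, meter.avg)

meter.reset()            # the logging callback resets after each report
```

The training loop above always passes `n=1`, so there `avg` is the plain mean of the losses seen since the last reset.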
+2
-2
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"breakpoint": [
|
||||
1877,
|
||||
29
|
||||
31,
|
||||
110
|
||||
]
|
||||
}
|
||||
+258
@@ -6726,3 +6726,261 @@ n002000\0058_02.jpg
|
||||
n002000\0130_01.jpg
|
||||
n002000\0135_01.jpg
|
||||
n002000\0160_02.jpg
|
||||
n000002\0054_01.jpg
|
||||
n000002\0055_01.jpg
|
||||
n000002\0138_01.jpg
|
||||
n000002\0150_02.jpg
|
||||
n000002\0208_01.jpg
|
||||
n000002\0252_01.jpg
|
||||
n000002\0273_01.jpg
|
||||
n000002\0276_01.jpg
|
||||
n000003\0024_01.jpg
|
||||
n000003\0098_01.jpg
|
||||
n000003\0219_01.jpg
|
||||
n000004\0026_01.jpg
|
||||
n000004\0084_01.jpg
|
||||
n000004\0103_02.jpg
|
||||
n000004\0118_01.jpg
|
||||
n000004\0144_02.jpg
|
||||
n000004\0155_01.jpg
|
||||
n000004\0180_01.jpg
|
||||
n000004\0231_01.jpg
|
||||
n000004\0237_01.jpg
|
||||
n000004\0239_01.jpg
|
||||
n000004\0258_01.jpg
|
||||
n000005\0138_01.jpg
|
||||
n000005\0144_01.jpg
|
||||
n000005\0287_01.jpg
|
||||
n000006\0007_01.jpg
|
||||
n000006\0014_01.jpg
|
||||
n000006\0036_02.jpg
|
||||
n000006\0091_01.jpg
|
||||
n000006\0103_01.jpg
|
||||
n000006\0281_01.jpg
|
||||
n000006\0300_01.jpg
|
||||
n000006\0351_01.jpg
|
||||
n000006\0430_01.jpg
|
||||
n000006\0519_01.jpg
|
||||
n000007\0021_01.jpg
|
||||
n000007\0042_01.jpg
|
||||
n000007\0045_01.jpg
|
||||
n000007\0050_02.jpg
|
||||
n000007\0080_01.jpg
|
||||
n000007\0086_01.jpg
|
||||
n000007\0106_02.jpg
|
||||
n000007\0115_01.jpg
|
||||
n000007\0116_03.jpg
|
||||
n000007\0119_01.jpg
|
||||
n000007\0137_01.jpg
|
||||
n000007\0140_02.jpg
|
||||
n000007\0148_02.jpg
|
||||
n000007\0174_01.jpg
|
||||
n000007\0181_01.jpg
|
||||
n000007\0182_02.jpg
|
||||
n000007\0213_02.jpg
|
||||
n000007\0226_02.jpg
|
||||
n000007\0229_01.jpg
|
||||
n000007\0432_01.jpg
|
||||
n000008\0072_01.jpg
|
||||
n000008\0297_01.jpg
|
||||
n000010\0068_01.jpg
|
||||
n000010\0069_01.jpg
|
||||
n000010\0096_01.jpg
|
||||
n000010\0150_02.jpg
|
||||
n000010\0155_02.jpg
|
||||
n000010\0223_01.jpg
|
||||
n000011\0112_01.jpg
|
||||
n000011\0142_02.jpg
|
||||
n000011\0200_01.jpg
|
||||
n000011\0217_01.jpg
|
||||
n000011\0229_02.jpg
|
||||
n000011\0291_02.jpg
|
||||
n000012\0173_01.jpg
|
||||
n000012\0180_01.jpg
|
||||
n000012\0198_01.jpg
|
||||
n000012\0282_01.jpg
|
||||
n000012\0294_01.jpg
|
||||
n000012\0307_01.jpg
|
||||
n000012\0338_01.jpg
|
||||
n000013\0029_06.jpg
|
||||
n000013\0128_01.jpg
|
||||
n000013\0132_01.jpg
|
||||
n000013\0148_01.jpg
|
||||
n000013\0190_02.jpg
|
||||
n000013\0225_01.jpg
|
||||
n000013\0277_01.jpg
|
||||
n000013\0335_01.jpg
|
||||
n000013\0337_01.jpg
|
||||
n000013\0341_02.jpg
|
||||
n000014\0163_01.jpg
|
||||
n000015\0029_02.jpg
|
||||
n000015\0059_01.jpg
|
||||
n000015\0133_01.jpg
|
||||
n000015\0243_02.jpg
|
||||
n000015\0392_02.jpg
|
||||
n000015\0393_01.jpg
|
||||
n000015\0402_01.jpg
|
||||
n000016\0189_01.jpg
|
||||
n000016\0237_01.jpg
|
||||
n000016\0266_01.jpg
|
||||
n000016\0385_04.jpg
|
||||
n000016\0391_01.jpg
|
||||
n000016\0405_01.jpg
|
||||
n000016\0477_02.jpg
|
||||
n000016\0500_01.jpg
|
||||
n000016\0503_01.jpg
|
||||
n000016\0503_01.jpg
|
||||
n000017\0123_02.jpg
|
||||
n000017\0124_01.jpg
|
||||
n000017\0163_01.jpg
|
||||
n000017\0262_01.jpg
|
||||
n000019\0038_01.jpg
|
||||
n000019\0055_01.jpg
|
||||
n000019\0061_01.jpg
|
||||
n000019\0114_01.jpg
|
||||
n000019\0130_02.jpg
|
||||
n000019\0149_02.jpg
|
||||
n000019\0170_01.jpg
|
||||
n000019\0182_01.jpg
|
||||
n000019\0219_01.jpg
|
||||
n000019\0221_02.jpg
|
||||
n000019\0234_02.jpg
|
||||
n000019\0249_01.jpg
|
||||
n000019\0259_01.jpg
|
||||
n000019\0273_01.jpg
|
||||
n000019\0306_01.jpg
|
||||
n000019\0313_01.jpg
|
||||
n000019\0333_01.jpg
|
||||
n000019\0350_02.jpg
|
||||
n000020\0006_01.jpg
|
||||
n000020\0071_01.jpg
|
||||
n000020\0074_02.jpg
|
||||
n000020\0099_02.jpg
|
||||
n000020\0379_01.jpg
|
||||
n000020\0400_01.jpg
|
||||
n000021\0120_02.jpg
|
||||
n000021\0221_01.jpg
|
||||
n000022\0051_01.jpg
|
||||
n000022\0071_01.jpg
|
||||
n000022\0146_02.jpg
|
||||
n000022\0146_02.jpg
|
||||
n000022\0236_01.jpg
|
||||
n000023\0008_01.jpg
|
||||
n000023\0078_01.jpg
|
||||
n000023\0093_01.jpg
|
||||
n000023\0133_01.jpg
|
||||
n000023\0162_01.jpg
|
||||
n000023\0198_01.jpg
|
||||
n000023\0207_03.jpg
|
||||
n000023\0269_02.jpg
|
||||
n000023\0265_01.jpg
|
||||
n000023\0280_01.jpg
|
||||
n000023\0366_01.jpg
|
||||
n000023\0389_01.jpg
|
||||
n000024\0062_01.jpg
|
||||
n000024\0073_01.jpg
|
||||
n000024\0354_04.jpg
|
||||
n000024\0409_01.jpg
|
||||
n000025\0100_02.jpg
|
||||
n000025\0274_02.jpg
|
||||
n000026\0038_01.jpg
|
||||
n000026\0041_01.jpg
|
||||
n000026\0059_01.jpg
|
||||
n000026\0062_01.jpg
|
||||
n000026\0065_01.jpg
|
||||
n000026\0082_02.jpg
|
||||
n000026\0103_01.jpg
|
||||
n000026\0137_01.jpg
|
||||
n000026\0060_01.jpg
|
||||
n000026\0179_03.jpg
|
||||
n000026\0196_01.jpg
|
||||
n000026\0248_01.jpg
|
||||
n000026\0255_01.jpg
|
||||
n000026\0273_01.jpg
|
||||
n000026\0280_01.jpg
|
||||
n000027\0023_02.jpg
|
||||
n000027\0023_05.jpg
|
||||
n000027\0115_01.jpg
|
||||
n000027\0157_02.jpg
|
||||
n000027\0171_01.jpg
|
||||
n000027\0182_02.jpg
|
||||
n000027\0211_02.jpg
|
||||
n000027\0255_01.jpg
|
||||
n000027\0274_04.jpg
|
||||
n000027\0318_04.jpg
|
||||
n000027\0326_01.jpg
|
||||
n000027\0401_01.jpg
|
||||
n000027\0402_01.jpg
|
||||
n000027\0438_01.jpg
|
||||
n000027\0442_01.jpg
|
||||
n000027\0493_01.jpg
|
||||
n000028\0040_04.jpg
|
||||
n000028\0056_01.jpg
|
||||
n000028\0134_01.jpg
|
||||
n000028\0136_03.jpg
|
||||
n000028\0138_01.jpg
|
||||
n000028\0144_02.jpg
|
||||
n000028\0156_01.jpg
|
||||
n000028\0162_01.jpg
|
||||
n000028\0168_01.jpg
|
||||
n000028\0205_01.jpg
|
||||
n000028\0220_01.jpg
|
||||
n000028\0249_01.jpg
|
||||
n000028\0300_01.jpg
|
||||
n000028\0324_02.jpg
|
||||
n000028\0343_01.jpg
|
||||
n000028\0352_01.jpg
|
||||
n000028\0384_01.jpg
|
||||
n000028\0392_01.jpg
|
||||
n000028\0408_02.jpg
|
||||
n000028\0412_02.jpg
|
||||
n000030\0112_01.jpg
|
||||
n000030\0119_01.jpg
|
||||
n000030\0156_01.jpg
|
||||
n000030\0192_01.jpg
|
||||
n000030\0195_01.jpg
|
||||
n000030\0203_01.jpg
|
||||
n000030\0218_02.jpg
|
||||
n000030\0305_01.jpg
|
||||
n000031\0025_01.jpg
|
||||
n000031\0080_02.jpg
|
||||
n000031\0141_01.jpg
|
||||
n000031\0196_01.jpg
|
||||
n000031\0215_01.jpg
|
||||
n000031\0286_02.jpg
|
||||
n000032\0085_01.jpg
|
||||
n000032\0100_01.jpg
|
||||
n000032\0100_02.jpg
|
||||
n000032\0233_01.jpg
|
||||
n000032\0261_01.jpg
|
||||
n000032\0350_01.jpg
|
||||
n000032\0374_01.jpg
|
||||
n000032\0393_02.jpg
|
||||
n000032\0428_01.jpg
|
||||
n000032\0443_01.jpg
|
||||
n000032\0459_01.jpg
|
||||
n000032\0465_02.jpg
|
||||
n000033\0031_01.jpg
|
||||
n000033\0032_02.jpg
|
||||
n000033\0034_01.jpg
|
||||
n000033\0034_02.jpg
|
||||
n000033\0080_01.jpg
|
||||
n000033\0100_01.jpg
|
||||
n000033\0100_02.jpg
|
||||
n000033\0122_01.jpg
|
||||
n000033\0164_02.jpg
|
||||
n000033\0166_01.jpg
|
||||
n000033\0250_02.jpg
|
||||
n000033\0327_01.jpg
|
||||
n000033\0337_01.jpg
|
||||
n000034\0327_01.jpg
|
||||
n000035\0072_02.jpg
|
||||
n000035\0099_01.jpg
|
||||
n000035\0132_03.jpg
|
||||
n000035\0134_01.jpg
|
||||
n000035\0150_01.jpg
|
||||
n000035\0158_01.jpg
|
||||
n000035\0159_02.jpg
|
||||
n000035\0167_01.jpg
|
||||
n000035\0170_01.jpg
|
||||
n000035\0200_01.jpg
|
||||
|
||||
@@ -0,0 +1,17 @@
|
||||
#!/usr/bin/env python3
|
||||
# -*- coding:utf-8 -*-
|
||||
#############################################################
|
||||
# File: test_arcface.py
|
||||
# Created Date: Thursday March 17th 2022
|
||||
# Author: Chen Xuanhong
|
||||
# Email: chenxuanhongzju@outlook.com
|
||||
# Last Modified: Thursday, 17th March 2022 12:34:57 am
|
||||
# Modified By: Chen Xuanhong
|
||||
# Copyright (c) 2022 Shanghai Jiao Tong University
|
||||
#############################################################
|
||||
import torch
|
||||
|
||||
if __name__ == "__main__":
|
||||
arcface1 = torch.load("./arcface_ckpt/arcface_checkpoint.tar", map_location=torch.device("cpu"))
|
||||
print(arcface1)
|
||||
arcface = arcface1['model'].module
|
||||
@@ -5,7 +5,7 @@
|
||||
# Created Date: Sunday January 9th 2022
|
||||
# Author: Chen Xuanhong
|
||||
# Email: chenxuanhongzju@outlook.com
|
||||
# Last Modified: Tuesday, 15th February 2022 12:00:24 am
|
||||
# Last Modified: Thursday, 17th March 2022 1:01:52 am
|
||||
# Modified By: Chen Xuanhong
|
||||
# Copyright (c) 2022 Shanghai Jiao Tong University
|
||||
#############################################################
|
||||
@@ -26,6 +26,8 @@ from torch_utils import training_stats
|
||||
from torch_utils.ops import conv2d_gradfix
|
||||
from torch_utils.ops import grid_sample_gradfix
|
||||
|
||||
from arcface_torch.backbones.iresnet import iresnet100
|
||||
|
||||
from utilities.plot import plot_batch
|
||||
from losses.cos import cosin_metric
|
||||
from train_scripts.trainer_multigpu_base import TrainerBase
|
||||
@@ -95,8 +97,12 @@ def init_framework(config, reporter, device, rank):
|
||||
reporter.writeInfo("Discriminator structure:")
|
||||
reporter.writeModel(dis.__str__())
|
||||
|
||||
arcface1 = torch.load(config["arcface_ckpt"], map_location=torch.device("cpu"))
|
||||
arcface = arcface1['model'].module
|
||||
# arcface1 = torch.load(config["arcface_ckpt"], map_location=torch.device("cpu"))
|
||||
# arcface = arcface1['model'].module
|
||||
|
||||
arcface = iresnet100(pretrained=False, fp16=False)
|
||||
arcface.load_state_dict(torch.load(config["arcface_ckpt"], map_location='cpu'))
|
||||
arcface.eval()
|
||||
|
||||
# train in GPU
|
||||
|
||||
|
||||
File diff suppressed because it is too large