c7ab9653c5
After every batch_size*16 samples, the model collects the samples with the highest error and trains on them again, so hard samples are trained more often.
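
A minimal sketch of the replay idea described above, not the code from this commit. It assumes a scalar linear model y ~ w * x trained with mini-batch SGD, and it assumes that exactly batch_size hard samples are replayed per window (the message does not say how many). The names hardest_indices, train_with_replay and window_batches are hypothetical.

```python
def hardest_indices(errors, k):
    """Return the indices of the k largest errors, hardest first."""
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return order[:k]


def train_with_replay(xs, ys, batch_size=4, window_batches=16, lr=0.1):
    """One pass of mini-batch SGD on y ~ w * x with hard-sample replay.

    Returns the learned weight and the number of replay steps performed.
    """
    w = 0.0
    window = batch_size * window_batches  # batch_size*16 samples by default
    replays = 0

    def sgd_step(weight, idx):
        # Mean-squared-error gradient for the scalar model y ~ w * x.
        grad = sum(2 * (weight * xs[i] - ys[i]) * xs[i] for i in idx) / len(idx)
        return weight - lr * grad

    for start in range(0, len(xs), window):
        idx_window = list(range(start, min(start + window, len(xs))))
        errors = []
        for b in range(0, len(idx_window), batch_size):
            batch = idx_window[b:b + batch_size]
            # Record each sample's squared error before the update that uses it.
            errors.extend((w * xs[i] - ys[i]) ** 2 for i in batch)
            w = sgd_step(w, batch)
        # Assumption: replay only the batch_size hardest samples of this window.
        hard = [idx_window[j] for j in hardest_indices(errors, batch_size)]
        w = sgd_step(w, hard)
        replays += 1
    return w, replays
```

With 128 samples and the defaults, the loop sees two windows of 64 samples and therefore performs two replay steps. Replaying a single extra batch keeps the overhead at roughly one additional update per 16 regular ones; a real implementation might pick a different count or track errors across windows.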