Improved model generalization, overall accuracy, and sharpness by using the new 'Learning rate dropout' technique from the paper https://arxiv.org/abs/1912.00144. An example of a loss histogram where this feature is enabled from the red arrow onward: https://i.imgur.com/3olskOd.jpg
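
For readers unfamiliar with the technique, here is a minimal sketch of the idea as described in the paper: at each step a random binary mask is drawn per parameter, and only the parameters that are kept receive an update, so the learning rate is effectively zero for the rest. This is an illustration on plain SGD with NumPy, not this project's actual implementation. The function name `sgd_step_with_lr_dropout` and the parameter `keep_prob` are hypothetical names chosen for the example.

```python
import numpy as np

def sgd_step_with_lr_dropout(params, grads, lr=0.01, keep_prob=0.5, rng=None):
    """One plain-SGD step with learning rate dropout (illustrative sketch).

    keep_prob is an assumed name for the probability that a parameter's
    learning rate is kept (not zeroed) on this step.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Per-parameter Bernoulli mask: True keeps the update, False drops it.
    mask = rng.random(params.shape) < keep_prob
    # Dropped coordinates get an effective learning rate of 0 for this step.
    return params - lr * mask * grads

# Usage: about half of the weights move on each step, the rest stay put.
rng = np.random.default_rng(0)
w = np.ones(8)
g = np.full(8, 2.0)
w_new = sgd_step_with_lr_dropout(w, g, lr=0.1, keep_prob=0.5, rng=rng)
```

Unlike ordinary dropout, which masks activations, the mask here applies to the parameter update itself; the gradients are still computed for every parameter.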