WeightWatcher: Data-Free Diagnostics for Deep Learning

ResNetV2 Models

The ResNet models are Computer Vision (CV) models that were introduced in 2015 as a DNN architecture that can have hundreds of layers. These models use skip connections between layers to help information flow from the data to the labels during training. (They are also sometimes described as a form of Highway Nets.)

Unlike the VGG models, the Correlation Flow in ResNets is quite good, with all but the last layers having layer alphas near the optimal value of 2.0. As promised, the weightwatcher average alpha is a good predictor of the ResNet model accuracy (or quality). Indeed, all three weightwatcher metrics shown here (alpha, alpha-hat, and rand_distance) are well correlated with the ResNet test accuracies. And while we only show a few models here, the weightwatcher alpha metric works across a wide range of ResNet models, as shown in Figure 3 of our Nature paper.

Note that the ResNet models have no Fully Connected or Linear layers. Also, many of the layer alphas are less than 2.0, except for those closest to the labels. This could indicate that the early layers, with very small alphas, are slightly overfit. However, this may also simply be because it is difficult to estimate the power-law (PL) exponent alpha from such very small convolutional layers.
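
To illustrate why alpha is hard to estimate from small layers, here is a minimal sketch using only the Python standard library. It is not the weightwatcher fitting procedure: it assumes the textbook continuous power-law maximum-likelihood estimator with a known lower cutoff xmin, and draws synthetic samples with a true alpha of 2.0. The helper names (sample_power_law, mle_alpha, alpha_spread) and the sample sizes 9 and 512 are hypothetical choices made for illustration only; n stands in for the number of eigenvalues available to the fit.

```python
import math
import random


def sample_power_law(n, alpha, xmin=1.0, rng=random):
    """Draw n samples from p(x) ~ x^(-alpha), x >= xmin, by inverse transform."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def mle_alpha(xs, xmin=1.0):
    """Continuous power-law MLE: alpha = 1 + n / sum(ln(x / xmin)).

    Assumes xmin is known; a real fit also has to choose xmin.
    """
    tail = [x for x in xs if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


def alpha_spread(n, alpha=2.0, trials=200, seed=0):
    """Mean and standard deviation of the fitted alpha over repeated trials."""
    rng = random.Random(seed)
    estimates = [mle_alpha(sample_power_law(n, alpha, rng=rng)) for _ in range(trials)]
    mean = sum(estimates) / trials
    std = math.sqrt(sum((e - mean) ** 2 for e in estimates) / trials)
    return mean, std


# Hypothetical sizes: very few "eigenvalues" versus many.
small = alpha_spread(n=9)
large = alpha_spread(n=512)
print("n=9   mean alpha, std:", small)
print("n=512 mean alpha, std:", large)
```

With only a handful of samples the fitted alpha scatters widely around the true value, while with hundreds of samples it is tightly concentrated. Under these assumptions, a very small or very large alpha on a small convolutional layer is weaker evidence than the same value on a large layer.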

Primary Reference: https://pytorch.org/vision/stable/models/resnet.html
Secondary Reference: https://pytorch.org/vision/stable/models.html
Paper: Deep Residual Learning for Image Recognition

ResNetV2 Models Included

ResNetV2 Model Set Plots

ResNetV2 % Randomness Metric Plots