albert-base-v2


albert-base-v2 Model Summary Plots

albert-base-v2 Model Selected Details
layer_id  layer_type      N    M       Q  alpha     D  alpha-hat  log_SN  % Rand  num_traps  num_fingers  rank_loss
       2  EMBEDDING   30000  128  234.38   3.93  0.07      11.39    2.90   76.03          1            0          0
       3  EMBEDDING     512  128    4.00   1.19  0.16       1.32    1.11   30.66          0            1          0
       8  DENSE         768  128    6.00   7.45  0.05       2.43    0.33   93.63          0            0          0
      15  DENSE         768  768    1.00   4.04  0.07       4.62    1.14   84.68          0            0          6
      16  DENSE         768  768    1.00   3.89  0.05       4.75    1.22   83.45          0            0          6
      17  DENSE         768  768    1.00   3.75  0.08       4.56    1.22   90.62          0            0          1
      20  DENSE         768  768    1.00   3.39  0.04       4.73    1.39   88.67          0            0          1
      22  DENSE        3072  768    4.00   4.12  0.03      10.18    2.47   82.95          0            0          0
      23  DENSE        3072  768    4.00   4.56  0.03       9.58    2.10   83.53          1            0          0
      26  DENSE         768  768    1.00   2.56  0.03       5.09    1.99   59.07          0            0          3
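
The derived columns in the table above appear to follow two simple relations: Q looks like the aspect ratio N / M, and alpha-hat looks like alpha multiplied by log_SN. The page does not state either relation, so both are assumptions here, as is the reading of log_SN as the log10 of the layer's spectral norm. The Python sketch below checks the two relations against the table rows, with tolerances chosen to absorb the two-decimal rounding of the published values. The names `ROWS` and `check_row` are illustrative only and do not come from any library.

```python
# Values copied from the table above:
# (layer_id, N, M, Q, alpha, alpha_hat, log_SN)
ROWS = [
    (2, 30000, 128, 234.38, 3.93, 11.39, 2.90),
    (3, 512, 128, 4.00, 1.19, 1.32, 1.11),
    (8, 768, 128, 6.00, 7.45, 2.43, 0.33),
    (15, 768, 768, 1.00, 4.04, 4.62, 1.14),
    (16, 768, 768, 1.00, 3.89, 4.75, 1.22),
    (17, 768, 768, 1.00, 3.75, 4.56, 1.22),
    (20, 768, 768, 1.00, 3.39, 4.73, 1.39),
    (22, 3072, 768, 4.00, 4.12, 10.18, 2.47),
    (23, 3072, 768, 4.00, 4.56, 9.58, 2.10),
    (26, 768, 768, 1.00, 2.56, 5.09, 1.99),
]


def check_row(row, q_tol=0.01, alpha_hat_tol=0.05):
    """Return (q_ok, alpha_hat_ok) for one table row.

    The tolerances are loose on purpose: every published value is
    rounded to two decimals, so exact equality is not expected.
    """
    layer_id, n, m, q, alpha, alpha_hat, log_sn = row
    # Assumption: Q is the aspect ratio N / M.
    q_ok = abs(n / m - q) <= q_tol
    # Assumption: alpha-hat is alpha weighted by log_SN.
    alpha_hat_ok = abs(alpha * log_sn - alpha_hat) <= alpha_hat_tol
    return q_ok, alpha_hat_ok


results = {row[0]: check_row(row) for row in ROWS}

for layer_id, (q_ok, alpha_hat_ok) in results.items():
    print(f"layer {layer_id:>2}: Q consistent={q_ok}, "
          f"alpha-hat consistent={alpha_hat_ok}")
```

A check like this is mainly useful as a sanity test when copying or re-deriving the table. A row that fails either relation more likely points to a transcription slip than to a property of the model.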

albert-base-v2 Layer Plots
Layer 2
Layer=2 | N=30000 | M=128 | Q=234.38 | alpha=3.93 | D_ks=0.07 | alpha-hat=11.39 | num traps=1

Layer 3
Layer=3 | N=512 | M=128 | Q=4.00 | alpha=1.19 | D_ks=0.16 | alpha-hat=1.32 | num traps=0

Layer 8
Layer=8 | N=768 | M=128 | Q=6.00 | alpha=7.45 | D_ks=0.05 | alpha-hat=2.43 | num traps=0

Layer 15
Layer=15 | N=768 | M=768 | Q=1.00 | alpha=4.04 | D_ks=0.07 | alpha-hat=4.62 | num traps=0

Layer 16
Layer=16 | N=768 | M=768 | Q=1.00 | alpha=3.89 | D_ks=0.05 | alpha-hat=4.75 | num traps=0

Layer 17
Layer=17 | N=768 | M=768 | Q=1.00 | alpha=3.75 | D_ks=0.08 | alpha-hat=4.56 | num traps=0

Layer 20
Layer=20 | N=768 | M=768 | Q=1.00 | alpha=3.39 | D_ks=0.04 | alpha-hat=4.73 | num traps=0

Layer 22
Layer=22 | N=3072 | M=768 | Q=4.00 | alpha=4.12 | D_ks=0.03 | alpha-hat=10.18 | num traps=0

Layer 23
Layer=23 | N=3072 | M=768 | Q=4.00 | alpha=4.56 | D_ks=0.03 | alpha-hat=9.58 | num traps=1

Layer 26
Layer=26 | N=768 | M=768 | Q=1.00 | alpha=2.56 | D_ks=0.03 | alpha-hat=5.09 | num traps=0