albert-xlarge-v2


Find this model in the ALBERT model summary
albert-xlarge-v2 Model Summary Plots

albert-xlarge-v2 Model Selected Details
layer_id  layer_type      N     M       Q  alpha     D  alpha-hat  log_SN  % Rand  num_traps  num_fingers  rank_loss
       2  EMBEDDING   30000   128  234.38   2.32  0.14       7.50    3.23   74.64          1            5          0
       3  EMBEDDING     512   128    4.00   1.41  0.12       1.38    0.98   36.22          0            1          0
       8  DENSE        2048   128   16.00   6.77  0.13       3.80    0.56   96.11          0            0          0
      15  DENSE        2048  2048    1.00   2.88  0.04       4.86    1.69   83.66          0            0          2
      16  DENSE        2048  2048    1.00   2.71  0.04       5.08    1.88   81.85          0            0          2
      17  DENSE        2048  2048    1.00   3.43  0.05       4.26    1.24   90.53          0            0          2
      20  DENSE        2048  2048    1.00   3.56  0.02       6.60    1.85   88.32          0            0          2
      22  DENSE        8192  2048    4.00   3.67  0.02      10.76    2.94   85.75          0            0          0
      23  DENSE        8192  2048    4.00   3.66  0.02      10.13    2.77   88.41          0            0          0
      26  DENSE        2048  2048    1.00   1.84  0.01       4.45    2.42   57.02          0            0          3
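
The values in the table above suggest two relations between its columns: Q looks like the aspect ratio N / M, and alpha-hat looks like alpha multiplied by log_SN. The page does not state either relation, so both are assumptions inferred from the numbers. The following Python sketch checks them against the rows, with tolerances chosen to absorb the two-decimal rounding of the published values.

```python
# Rows copied from the table above:
# (layer_id, N, M, Q, alpha, alpha_hat, log_SN)
ROWS = [
    (2, 30000, 128, 234.38, 2.32, 7.50, 3.23),
    (3, 512, 128, 4.00, 1.41, 1.38, 0.98),
    (8, 2048, 128, 16.00, 6.77, 3.80, 0.56),
    (15, 2048, 2048, 1.00, 2.88, 4.86, 1.69),
    (16, 2048, 2048, 1.00, 2.71, 5.08, 1.88),
    (17, 2048, 2048, 1.00, 3.43, 4.26, 1.24),
    (20, 2048, 2048, 1.00, 3.56, 6.60, 1.85),
    (22, 8192, 2048, 4.00, 3.67, 10.76, 2.94),
    (23, 8192, 2048, 4.00, 3.66, 10.13, 2.77),
    (26, 2048, 2048, 1.00, 1.84, 4.45, 2.42),
]


def check_row(row, q_tol=0.01, alpha_hat_tol=0.05):
    """Return (q_ok, alpha_hat_ok) for one table row.

    Assumed relations (inferred from the data, not stated on the page):
      Q         ~= N / M
      alpha-hat ~= alpha * log_SN
    """
    _layer_id, n, m, q, alpha, alpha_hat, log_sn = row
    q_ok = abs(n / m - q) <= q_tol
    alpha_hat_ok = abs(alpha * log_sn - alpha_hat) <= alpha_hat_tol
    return q_ok, alpha_hat_ok


# Map each layer_id to the outcome of both checks.
results = {row[0]: check_row(row) for row in ROWS}

for layer_id, (q_ok, alpha_hat_ok) in results.items():
    print(f"layer {layer_id}: Q consistent={q_ok}, alpha-hat consistent={alpha_hat_ok}")
```

The tolerances are deliberately loose because every input is already rounded to two decimals; a tighter check would need the unrounded values.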

albert-xlarge-v2 Layer Plots
Layer 2
   Layer=2  |  N=30000  |  M=128  |  Q=234.38  |  alpha=2.32  |  D_ks=0.14  |  alpha-hat=7.50  |  num traps=1

Layer 3
   Layer=3  |  N=512  |  M=128  |  Q=4.00  |  alpha=1.41  |  D_ks=0.12  |  alpha-hat=1.38  |  num traps=0

Layer 8
   Layer=8  |  N=2048  |  M=128  |  Q=16.00  |  alpha=6.77  |  D_ks=0.13  |  alpha-hat=3.80  |  num traps=0

Layer 15
   Layer=15  |  N=2048  |  M=2048  |  Q=1.00  |  alpha=2.88  |  D_ks=0.04  |  alpha-hat=4.86  |  num traps=0

Layer 16
   Layer=16  |  N=2048  |  M=2048  |  Q=1.00  |  alpha=2.71  |  D_ks=0.04  |  alpha-hat=5.08  |  num traps=0

Layer 17
   Layer=17  |  N=2048  |  M=2048  |  Q=1.00  |  alpha=3.43  |  D_ks=0.05  |  alpha-hat=4.26  |  num traps=0

Layer 20
   Layer=20  |  N=2048  |  M=2048  |  Q=1.00  |  alpha=3.56  |  D_ks=0.02  |  alpha-hat=6.60  |  num traps=0

Layer 22
   Layer=22  |  N=8192  |  M=2048  |  Q=4.00  |  alpha=3.67  |  D_ks=0.02  |  alpha-hat=10.76  |  num traps=0

Layer 23
   Layer=23  |  N=8192  |  M=2048  |  Q=4.00  |  alpha=3.66  |  D_ks=0.02  |  alpha-hat=10.13  |  num traps=0

Layer 26
   Layer=26  |  N=2048  |  M=2048  |  Q=1.00  |  alpha=1.84  |  D_ks=0.01  |  alpha-hat=4.45  |  num traps=0
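
Each per-layer caption above uses the same `key=value | key=value` layout, so the captions can be read back into structured records. The sketch below is one possible way to do that in Python; `parse_caption` is a hypothetical helper, not part of any tool used to generate this page, and it assumes every value is numeric.

```python
def parse_caption(line):
    """Parse 'key=value | key=value' caption text into a dict of numbers."""
    fields = {}
    for part in line.split("|"):
        key, sep, value = part.strip().partition("=")
        if not sep:
            continue  # skip fragments that are not key=value pairs
        value = value.strip()
        # Values without a decimal point are treated as integers.
        fields[key.strip()] = float(value) if "." in value else int(value)
    return fields


# Caption text copied from the Layer 26 entry above.
caption = (
    "Layer=26  |  N=2048  |  M=2048  |  Q=1.00  |  alpha=1.84  |  "
    "D_ks=0.01  |  alpha-hat=4.45  |  num traps=0"
)
layer = parse_caption(caption)
print(layer["Layer"], layer["alpha"], layer["num traps"])
```

Keys are kept exactly as they appear in the caption (for example `num traps` with a space), so they differ slightly from the underscore-style column names in the details table.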