Find this model in the Llama3.2 model summary.

id | layer_type | N | M | Q | alpha | D | alpha-hat | num_spikes | warning |
---|---|---|---|---|---|---|---|---|---|
1 | dense | 8192 | 2048 | 4.0 | 9.421698 | 0.035515 | -5.965082 | 100 | under-trained |
2 | dense | 8192 | 2048 | 4.0 | 5.950188 | 0.030677 | -3.159531 | 94 | |
3 | dense | 8192 | 2048 | 4.0 | 7.464558 | 0.027772 | -4.761066 | 68 | under-trained |
4 | dense | 2048 | 512 | 4.0 | 2.020664 | 0.034868 | 0.606245 | 84 | |
5 | dense | 2048 | 2048 | 1.0 | 4.488910 | 0.029646 | -3.663352 | 69 | |
6 | dense | 2048 | 2048 | 1.0 | 2.360407 | 0.031176 | 1.703370 | 52 | |
7 | dense | 2048 | 512 | 4.0 | 3.925694 | 0.033773 | -4.620999 | 36 | |
8 | dense | 2048 | 2048 | 1.0 | 3.641031 | 0.032100 | -0.116329 | 43 | |
9 | dense | 2048 | 2048 | 1.0 | 5.264590 | 0.037251 | -4.061943 | 32 | |
10 | dense | 2048 | 512 | 4.0 | 4.010883 | 0.045891 | -2.084470 | 21 | |
11 | dense | 2048 | 512 | 4.0 | 6.332639 | 0.047976 | -9.303010 | 36 | under-trained |
12 | dense | 8192 | 2048 | 4.0 | 6.244164 | 0.020647 | -1.686553 | 83 | under-trained |
13 | dense | 8192 | 2048 | 4.0 | 10.933645 | 0.030272 | -7.205779 | 57 | under-trained |
14 | dense | 8192 | 2048 | 4.0 | 7.699749 | 0.020161 | -3.507166 | 71 | under-trained |
15 | dense | 2048 | 512 | 4.0 | 6.223312 | 0.098870 | -9.030522 | 71 | under-trained |
16 | dense | 2048 | 2048 | 1.0 | 4.246627 | 0.036597 | -1.296309 | 66 | |
17 | dense | 2048 | 2048 | 1.0 | 8.091381 | 0.035954 | -7.437897 | 42 | under-trained |
18 | dense | 2048 | 512 | 4.0 | 4.415654 | 0.055436 | -3.236097 | 33 | |
19 | dense | 8192 | 2048 | 4.0 | 7.649070 | 0.023182 | -3.085672 | 61 | under-trained |
20 | dense | 8192 | 2048 | 4.0 | 6.667444 | 0.032559 | -1.337082 | 33 | under-trained |
21 | dense | 8192 | 2048 | 4.0 | 11.281225 | 0.041699 | -7.284295 | 37 | under-trained |
22 | dense | 2048 | 512 | 4.0 | 6.196003 | 0.094518 | -9.182601 | 69 | under-trained |
23 | dense | 8192 | 2048 | 4.0 | 6.717740 | 0.025897 | -2.632054 | 63 | under-trained |
24 | dense | 8192 | 2048 | 4.0 | 5.721546 | 0.027544 | -0.584405 | 45 | |
25 | dense | 8192 | 2048 | 4.0 | 9.464621 | 0.043929 | -5.828644 | 52 | under-trained |
26 | dense | 2048 | 2048 | 1.0 | 3.839392 | 0.028996 | -0.880947 | 57 | |
27 | dense | 2048 | 512 | 4.0 | 4.077587 | 0.067094 | -2.628718 | 49 | |
28 | dense | 2048 | 2048 | 1.0 | 5.486808 | 0.024861 | -4.653895 | 72 | |
29 | dense | 2048 | 512 | 4.0 | 5.227200 | 0.081454 | -7.739984 | 80 | |
30 | dense | 8192 | 2048 | 4.0 | 7.743287 | 0.087989 | -4.796593 | 159 | under-trained |
31 | dense | 8192 | 2048 | 4.0 | 5.552786 | 0.024679 | -0.462238 | 67 | |
32 | dense | 8192 | 2048 | 4.0 | 6.554204 | 0.030192 | -2.695512 | 61 | under-trained |
33 | dense | 2048 | 512 | 4.0 | 2.823983 | 0.108977 | -1.455137 | 150 | |
34 | dense | 2048 | 2048 | 1.0 | 5.158814 | 0.085864 | -4.923454 | 97 | |
35 | dense | 2048 | 2048 | 1.0 | 3.901705 | 0.032842 | -0.286085 | 39 | |
36 | dense | 8192 | 2048 | 4.0 | 8.789849 | 0.057408 | -5.113988 | 75 | under-trained |
37 | dense | 8192 | 2048 | 4.0 | 5.409071 | 0.039457 | -0.231722 | 77 | |
38 | dense | 8192 | 2048 | 4.0 | 6.687619 | 0.037080 | -2.401148 | 53 | under-trained |
39 | dense | 2048 | 512 | 4.0 | 3.570507 | 0.097687 | -1.705782 | 91 | |
40 | dense | 2048 | 2048 | 1.0 | 6.632956 | 0.052276 | -6.388009 | 37 | under-trained |
41 | dense | 2048 | 512 | 4.0 | 5.650185 | 0.096535 | -8.848127 | 78 | |
42 | dense | 2048 | 2048 | 1.0 | 4.003132 | 0.033383 | -0.659275 | 55 | |
43 | dense | 2048 | 512 | 4.0 | 8.691844 | 0.114689 | -13.376314 | 40 | under-trained |
44 | dense | 8192 | 2048 | 4.0 | 5.458287 | 0.034098 | -0.078183 | 51 | |
45 | dense | 8192 | 2048 | 4.0 | 6.032575 | 0.031060 | -1.894421 | 70 | under-trained |
46 | dense | 2048 | 512 | 4.0 | 3.007362 | 0.109434 | -0.851700 | 135 | |
47 | dense | 2048 | 2048 | 1.0 | 6.011258 | 0.041852 | -5.999695 | 51 | under-trained |
48 | dense | 8192 | 2048 | 4.0 | 6.902764 | 0.039217 | -3.313018 | 92 | under-trained |
49 | dense | 2048 | 2048 | 1.0 | 3.875596 | 0.047102 | -0.587975 | 51 | |
50 | dense | 2048 | 512 | 4.0 | 4.522608 | 0.055328 | -1.862352 | 28 | |
51 | dense | 8192 | 2048 | 4.0 | 5.934280 | 0.033856 | -1.162457 | 45 | |
52 | dense | 2048 | 2048 | 1.0 | 4.285398 | 0.051286 | -3.500149 | 97 | |
53 | dense | 8192 | 2048 | 4.0 | 6.945759 | 0.048291 | -3.298934 | 80 | under-trained |
54 | dense | 2048 | 512 | 4.0 | 6.912559 | 0.077030 | -10.032872 | 43 | under-trained |
55 | dense | 2048 | 2048 | 1.0 | 4.346657 | 0.050640 | -0.677867 | 33 | |
56 | dense | 8192 | 2048 | 4.0 | 5.036899 | 0.032998 | 0.216070 | 59 | |
57 | dense | 8192 | 2048 | 4.0 | 6.630471 | 0.086818 | -3.016427 | 173 | under-trained |
58 | dense | 8192 | 2048 | 4.0 | 6.066240 | 0.066120 | -1.640014 | 135 | under-trained |
59 | dense | 2048 | 512 | 4.0 | 4.316368 | 0.027283 | -1.253823 | 40 | |
60 | dense | 2048 | 2048 | 1.0 | 5.917830 | 0.063974 | -6.265611 | 63 | |
61 | dense | 2048 | 2048 | 1.0 | 3.689886 | 0.072247 | -0.909121 | 115 | |
62 | dense | 2048 | 512 | 4.0 | 11.256640 | 0.089314 | -17.596674 | 26 | under-trained |
63 | dense | 8192 | 2048 | 4.0 | 6.087767 | 0.041463 | -0.152961 | 47 | under-trained |
64 | dense | 8192 | 2048 | 4.0 | 7.844570 | 0.097030 | -4.000637 | 154 | under-trained |
65 | dense | 8192 | 2048 | 4.0 | 6.705917 | 0.033854 | -0.384617 | 39 | under-trained |
66 | dense | 8192 | 2048 | 4.0 | 6.767449 | 0.085609 | -1.883431 | 118 | under-trained |
67 | dense | 2048 | 512 | 4.0 | 3.229746 | 0.116212 | -1.331462 | 123 | |
68 | dense | 2048 | 2048 | 1.0 | 3.598116 | 0.070727 | -2.375259 | 197 | |
69 | dense | 2048 | 512 | 4.0 | 9.015283 | 0.107876 | -14.124009 | 41 | under-trained |
70 | dense | 2048 | 2048 | 1.0 | 4.012338 | 0.058270 | -0.486432 | 63 | |
71 | dense | 2048 | 512 | 4.0 | 4.877246 | 0.121103 | -7.251835 | 118 | |
72 | dense | 2048 | 2048 | 1.0 | 7.306414 | 0.087748 | -6.365103 | 56 | under-trained |
73 | dense | 2048 | 512 | 4.0 | 2.716125 | 0.120827 | -0.859173 | 172 | |
74 | dense | 2048 | 2048 | 1.0 | 4.580215 | 0.051587 | -0.523799 | 51 | |
75 | dense | 8192 | 2048 | 4.0 | 6.574870 | 0.040861 | -0.746741 | 80 | under-trained |
76 | dense | 8192 | 2048 | 4.0 | 9.980163 | 0.094855 | -5.533906 | 89 | under-trained |
77 | dense | 8192 | 2048 | 4.0 | 7.173291 | 0.045686 | -2.319722 | 78 | under-trained |
78 | dense | 8192 | 2048 | 4.0 | 9.266222 | 0.090372 | -5.185787 | 108 | under-trained |
79 | dense | 8192 | 2048 | 4.0 | 8.150679 | 0.041536 | -1.195119 | 29 | under-trained |
80 | dense | 8192 | 2048 | 4.0 | 7.184082 | 0.086907 | -2.425534 | 129 | under-trained |
81 | dense | 2048 | 2048 | 1.0 | 4.695973 | 0.108588 | -3.324541 | 183 | |
82 | dense | 2048 | 2048 | 1.0 | 4.055418 | 0.098369 | -0.429523 | 132 | |
83 | dense | 2048 | 512 | 4.0 | 11.268829 | 0.125374 | -15.902215 | 34 | under-trained |
84 | dense | 2048 | 512 | 4.0 | 3.114963 | 0.115024 | -1.199879 | 146 | |
85 | dense | 2048 | 2048 | 1.0 | 5.963024 | 0.054076 | -4.555063 | 63 | |
86 | dense | 8192 | 2048 | 4.0 | 10.958605 | 0.080680 | -5.438422 | 75 | under-trained |
87 | dense | 8192 | 2048 | 4.0 | 7.550548 | 0.035029 | -1.286163 | 56 | under-trained |
88 | dense | 8192 | 2048 | 4.0 | 8.549903 | 0.037558 | -2.157261 | 35 | under-trained |
89 | dense | 2048 | 512 | 4.0 | 3.969946 | 0.038440 | -1.873265 | 43 | |
90 | dense | 2048 | 2048 | 1.0 | 4.803454 | 0.052480 | -0.381091 | 18 | |
91 | dense | 2048 | 512 | 4.0 | 6.103760 | 0.103801 | -7.496321 | 68 | under-trained |
92 | dense | 8192 | 2048 | 4.0 | 9.952363 | 0.081068 | -4.508674 | 89 | under-trained |
93 | dense | 8192 | 2048 | 4.0 | 8.370968 | 0.030615 | -1.330666 | 49 | under-trained |
94 | dense | 2048 | 512 | 4.0 | 3.910072 | 0.056321 | -1.909579 | 44 | |
95 | dense | 2048 | 2048 | 1.0 | 5.362286 | 0.081559 | -3.933633 | 102 | |
96 | dense | 2048 | 2048 | 1.0 | 3.923021 | 0.050322 | -0.635198 | 68 | |
97 | dense | 2048 | 512 | 4.0 | 5.383228 | 0.103707 | -7.023873 | 87 | |
98 | dense | 8192 | 2048 | 4.0 | 8.610355 | 0.027673 | -2.125461 | 58 | under-trained |
99 | dense | 8192 | 2048 | 4.0 | 9.126770 | 0.094217 | -3.633662 | 104 | under-trained |
100 | dense | 8192 | 2048 | 4.0 | 7.092784 | 0.024012 | -0.412510 | 62 | under-trained |
101 | dense | 8192 | 2048 | 4.0 | 7.162833 | 0.088416 | -1.703839 | 134 | under-trained |
102 | dense | 2048 | 512 | 4.0 | 3.615304 | 0.042787 | -1.444610 | 33 | |
103 | dense | 2048 | 2048 | 1.0 | 4.815216 | 0.030950 | -2.586566 | 88 | |
104 | dense | 2048 | 2048 | 1.0 | 3.593925 | 0.043769 | -0.115587 | 64 | |
105 | dense | 2048 | 512 | 4.0 | 7.376213 | 0.118778 | -9.697736 | 58 | under-trained |
106 | dense | 2048 | 2048 | 1.0 | 3.709058 | 0.029690 | 0.033373 | 49 | |
107 | dense | 8192 | 2048 | 4.0 | 6.022702 | 0.032055 | 0.074158 | 111 | under-trained |
108 | dense | 8192 | 2048 | 4.0 | 5.599232 | 0.025234 | -0.081990 | 110 | |
109 | dense | 2048 | 512 | 4.0 | 3.730787 | 0.030812 | -1.342692 | 34 | |
110 | dense | 2048 | 2048 | 1.0 | 5.993729 | 0.094574 | -4.702663 | 99 | |
111 | dense | 8192 | 2048 | 4.0 | 5.845211 | 0.047917 | 0.553819 | 148 | |
112 | dense | 2048 | 512 | 4.0 | 8.411724 | 0.065107 | -10.738161 | 35 | under-trained |
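
As a rough cross-check of the table, the sketch below recomputes two of its columns for a few rows copied from it. It assumes that `Q` is the ratio `N / M` and that the `under-trained` warning corresponds to `alpha` above 6. Both assumptions are inferred from the rows shown here (every flagged row has `alpha` above 6 and no unflagged row does), not taken from the tool that produced the table. The function names and the constant are hypothetical.

```python
# Minimal sketch: recompute Q and the warning column for sample rows.
# ASSUMPTION: the 6.0 cutoff is inferred from the table, not from the tool's source.
UNDER_TRAINED_ALPHA = 6.0


def aspect_ratio(n: int, m: int) -> float:
    """Return Q as the larger dimension divided by the smaller one."""
    return max(n, m) / min(n, m)


def warning_for(alpha: float) -> str:
    """Return the warning label implied by the assumed alpha cutoff."""
    return "under-trained" if alpha > UNDER_TRAINED_ALPHA else ""


# (id, N, M, alpha) copied from rows 1, 2, 4 and 47 of the table.
sample_rows = [
    (1, 8192, 2048, 9.421698),
    (2, 8192, 2048, 5.950188),
    (4, 2048, 512, 2.020664),
    (47, 2048, 2048, 6.011258),
]

for layer_id, n, m, alpha in sample_rows:
    print(layer_id, aspect_ratio(n, m), warning_for(alpha) or "-")
```

Rows 2 and 47 sit just on either side of the assumed cutoff (5.95 and 6.01), which makes them useful spot checks for the rule.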