Find this model in the Llama model summary
id | layer_type | N | M | Q | alpha | D | alpha-hat | num_spikes | warning |
---|---|---|---|---|---|---|---|---|---|
1 | dense | 8192 | 2048 | 4.0 | 4.141185 | 0.025400 | 5.769416 | 67 | |
2 | dense | 8192 | 2048 | 4.0 | 3.713097 | 0.047539 | 6.434524 | 591 | |
3 | dense | 8192 | 2048 | 4.0 | 5.228047 | 0.038880 | 7.520166 | 328 | |
4 | dense | 2048 | 512 | 4.0 | 2.923211 | 0.027567 | 7.108979 | 46 | |
5 | dense | 2048 | 2048 | 1.0 | 3.704523 | 0.023680 | 3.835827 | 70 | |
6 | dense | 2048 | 2048 | 1.0 | 2.520152 | 0.020432 | 7.542082 | 142 | |
7 | dense | 2048 | 512 | 4.0 | 5.542469 | 0.041540 | 0.233354 | 51 | |
8 | dense | 2048 | 2048 | 1.0 | 3.711641 | 0.018657 | 8.277174 | 68 | |
9 | dense | 2048 | 2048 | 1.0 | 3.933387 | 0.038313 | 4.203196 | 80 | |
10 | dense | 2048 | 512 | 4.0 | 4.837489 | 0.041216 | 9.078401 | 40 | |
11 | dense | 2048 | 512 | 4.0 | 6.019533 | 0.058894 | -0.386183 | 48 | under-trained |
12 | dense | 8192 | 2048 | 4.0 | 4.092583 | 0.013882 | 8.300081 | 370 | |
13 | dense | 8192 | 2048 | 4.0 | 4.993378 | 0.015454 | 6.841099 | 111 | |
14 | dense | 8192 | 2048 | 4.0 | 7.031296 | 0.028135 | 11.910484 | 187 | under-trained |
15 | dense | 2048 | 512 | 4.0 | 5.127281 | 0.072064 | 0.172276 | 70 | |
16 | dense | 2048 | 2048 | 1.0 | 3.030573 | 0.037979 | 6.290513 | 188 | |
17 | dense | 2048 | 2048 | 1.0 | 3.358006 | 0.021306 | 3.225524 | 192 | |
18 | dense | 2048 | 512 | 4.0 | 5.374304 | 0.052900 | 9.137867 | 47 | |
19 | dense | 8192 | 2048 | 4.0 | 9.570801 | 0.026665 | 10.045216 | 101 | under-trained |
20 | dense | 8192 | 2048 | 4.0 | 3.663868 | 0.027038 | 7.669229 | 63 | |
21 | dense | 8192 | 2048 | 4.0 | 4.440621 | 0.009935 | 6.384034 | 242 | |
22 | dense | 2048 | 512 | 4.0 | 3.835539 | 0.120145 | 0.252894 | 199 | |
23 | dense | 8192 | 2048 | 4.0 | 7.499175 | 0.020929 | 8.661017 | 137 | under-trained |
24 | dense | 8192 | 2048 | 4.0 | 3.604147 | 0.013904 | 8.381533 | 401 | |
25 | dense | 8192 | 2048 | 4.0 | 4.508443 | 0.011691 | 6.000222 | 235 | |
26 | dense | 2048 | 2048 | 1.0 | 3.398757 | 0.028117 | 7.213640 | 101 | |
27 | dense | 2048 | 512 | 4.0 | 4.411974 | 0.048516 | 8.106639 | 43 | |
28 | dense | 2048 | 2048 | 1.0 | 3.500768 | 0.074918 | 2.948367 | 257 | |
29 | dense | 2048 | 512 | 4.0 | 8.286389 | 0.106424 | 0.474268 | 55 | under-trained |
30 | dense | 8192 | 2048 | 4.0 | 4.186985 | 0.012432 | 5.952185 | 324 | |
31 | dense | 8192 | 2048 | 4.0 | 3.776847 | 0.013131 | 9.042824 | 300 | |
32 | dense | 8192 | 2048 | 4.0 | 6.797984 | 0.021940 | 6.989183 | 174 | under-trained |
33 | dense | 2048 | 512 | 4.0 | 3.696215 | 0.031989 | 6.766707 | 55 | |
34 | dense | 2048 | 2048 | 1.0 | 4.584244 | 0.047419 | 3.651966 | 89 | |
35 | dense | 2048 | 2048 | 1.0 | 3.656006 | 0.023809 | 7.707858 | 63 | |
36 | dense | 8192 | 2048 | 4.0 | 3.874047 | 0.012214 | 6.116188 | 288 | |
37 | dense | 8192 | 2048 | 4.0 | 3.831212 | 0.009427 | 8.830629 | 131 | |
38 | dense | 8192 | 2048 | 4.0 | 6.299683 | 0.016846 | 7.162571 | 130 | under-trained |
39 | dense | 2048 | 512 | 4.0 | 4.321357 | 0.032016 | 7.934952 | 49 | |
40 | dense | 2048 | 2048 | 1.0 | 3.545900 | 0.047909 | 2.821221 | 142 | |
41 | dense | 2048 | 512 | 4.0 | 3.967355 | 0.072516 | 0.158733 | 106 | |
42 | dense | 2048 | 2048 | 1.0 | 3.910493 | 0.030217 | 8.114717 | 32 | |
43 | dense | 2048 | 512 | 4.0 | 8.427666 | 0.084036 | 0.832062 | 35 | under-trained |
44 | dense | 8192 | 2048 | 4.0 | 3.699589 | 0.009776 | 8.156571 | 238 | |
45 | dense | 8192 | 2048 | 4.0 | 5.413743 | 0.013649 | 5.794042 | 202 | |
46 | dense | 2048 | 512 | 4.0 | 2.332383 | 0.089875 | 4.382392 | 158 | |
47 | dense | 2048 | 2048 | 1.0 | 4.249280 | 0.045112 | 3.010547 | 95 | |
48 | dense | 8192 | 2048 | 4.0 | 3.895261 | 0.012799 | 6.517769 | 303 | |
49 | dense | 2048 | 2048 | 1.0 | 3.254713 | 0.036529 | 6.763376 | 68 | |
50 | dense | 2048 | 512 | 4.0 | 4.422180 | 0.038753 | 7.863219 | 47 | |
51 | dense | 8192 | 2048 | 4.0 | 4.747338 | 0.013036 | 6.141709 | 220 | |
52 | dense | 2048 | 2048 | 1.0 | 3.095703 | 0.087419 | 2.428759 | 187 | |
53 | dense | 8192 | 2048 | 4.0 | 3.927546 | 0.017785 | 6.677279 | 353 | |
54 | dense | 2048 | 512 | 4.0 | 3.943979 | 0.109902 | 0.723015 | 114 | |
55 | dense | 2048 | 2048 | 1.0 | 2.148718 | 0.089356 | 4.359238 | 392 | |
56 | dense | 8192 | 2048 | 4.0 | 3.481583 | 0.020449 | 7.683062 | 234 | |
57 | dense | 8192 | 2048 | 4.0 | 4.256401 | 0.026063 | 6.347934 | 383 | |
58 | dense | 8192 | 2048 | 4.0 | 5.913016 | 0.021849 | 6.836058 | 153 | |
59 | dense | 2048 | 512 | 4.0 | 2.711612 | 0.090447 | 4.958893 | 119 | |
60 | dense | 2048 | 2048 | 1.0 | 3.289975 | 0.081035 | 2.508980 | 208 | |
61 | dense | 2048 | 2048 | 1.0 | 2.863129 | 0.068437 | 5.603831 | 118 | |
62 | dense | 2048 | 512 | 4.0 | 3.357515 | 0.119901 | 0.114515 | 196 | |
63 | dense | 8192 | 2048 | 4.0 | 3.872807 | 0.018303 | 8.484145 | 138 | |
64 | dense | 8192 | 2048 | 4.0 | 5.041432 | 0.038128 | 7.360838 | 330 | |
65 | dense | 8192 | 2048 | 4.0 | 4.153827 | 0.023253 | 9.482279 | 111 | |
66 | dense | 8192 | 2048 | 4.0 | 6.034470 | 0.014928 | 7.274925 | 160 | under-trained |
67 | dense | 2048 | 512 | 4.0 | 2.652407 | 0.110240 | 4.914322 | 155 | |
68 | dense | 2048 | 2048 | 1.0 | 4.234541 | 0.049741 | 3.653870 | 125 | |
69 | dense | 2048 | 512 | 4.0 | 7.648357 | 0.067938 | 0.193734 | 68 | under-trained |
70 | dense | 2048 | 2048 | 1.0 | 2.468285 | 0.079693 | 5.344289 | 200 | |
71 | dense | 2048 | 512 | 4.0 | 5.226811 | 0.083666 | 0.872216 | 108 | |
72 | dense | 2048 | 2048 | 1.0 | 3.474159 | 0.037347 | 3.378123 | 144 | |
73 | dense | 2048 | 512 | 4.0 | 3.066793 | 0.116169 | 5.314032 | 107 | |
74 | dense | 2048 | 2048 | 1.0 | 2.303945 | 0.083852 | 4.776214 | 255 | |
75 | dense | 8192 | 2048 | 4.0 | 4.662133 | 0.021340 | 10.490778 | 174 | |
76 | dense | 8192 | 2048 | 4.0 | 5.498780 | 0.035110 | 7.451332 | 258 | |
77 | dense | 8192 | 2048 | 4.0 | 6.986783 | 0.023493 | 7.969898 | 138 | under-trained |
78 | dense | 8192 | 2048 | 4.0 | 6.692958 | 0.025412 | 8.668008 | 159 | under-trained |
79 | dense | 8192 | 2048 | 4.0 | 5.331706 | 0.027066 | 11.357352 | 170 | |
80 | dense | 8192 | 2048 | 4.0 | 7.798572 | 0.034156 | 9.485667 | 116 | under-trained |
81 | dense | 2048 | 2048 | 1.0 | 7.043375 | 0.031527 | 6.471495 | 47 | under-trained |
82 | dense | 2048 | 2048 | 1.0 | 2.529226 | 0.055272 | 5.226962 | 228 | |
83 | dense | 2048 | 512 | 4.0 | 5.730204 | 0.109994 | 0.299868 | 105 | |
84 | dense | 2048 | 512 | 4.0 | 2.405827 | 0.077464 | 4.578793 | 162 | |
85 | dense | 2048 | 2048 | 1.0 | 5.222538 | 0.018855 | 5.831666 | 82 | |
86 | dense | 8192 | 2048 | 4.0 | 5.324917 | 0.024077 | 6.346215 | 244 | |
87 | dense | 8192 | 2048 | 4.0 | 5.580990 | 0.025853 | 10.381848 | 150 | |
88 | dense | 8192 | 2048 | 4.0 | 8.062598 | 0.024412 | 9.386697 | 99 | under-trained |
89 | dense | 2048 | 512 | 4.0 | 4.218250 | 0.036900 | 7.715116 | 36 | |
90 | dense | 2048 | 2048 | 1.0 | 2.252705 | 0.075679 | 4.733342 | 407 | |
91 | dense | 2048 | 512 | 4.0 | 8.252301 | 0.080187 | 0.826841 | 51 | under-trained |
92 | dense | 8192 | 2048 | 4.0 | 6.739482 | 0.028846 | 7.730602 | 137 | under-trained |
93 | dense | 8192 | 2048 | 4.0 | 4.868857 | 0.021607 | 9.016641 | 183 | |
94 | dense | 2048 | 512 | 4.0 | 3.360103 | 0.045088 | 5.870726 | 55 | |
95 | dense | 2048 | 2048 | 1.0 | 6.239053 | 0.055622 | 7.254163 | 86 | under-trained |
96 | dense | 2048 | 2048 | 1.0 | 2.581595 | 0.046928 | 5.296993 | 237 | |
97 | dense | 2048 | 512 | 4.0 | 4.050393 | 0.127968 | 0.806446 | 167 | |
98 | dense | 8192 | 2048 | 4.0 | 6.257810 | 0.022769 | 8.066945 | 159 | under-trained |
99 | dense | 8192 | 2048 | 4.0 | 9.329234 | 0.033155 | 11.540999 | 62 | under-trained |
100 | dense | 8192 | 2048 | 4.0 | 4.436869 | 0.025872 | 8.872160 | 222 | |
101 | dense | 8192 | 2048 | 4.0 | 5.269412 | 0.013837 | 8.560699 | 149 | |
102 | dense | 2048 | 512 | 4.0 | 3.871641 | 0.047139 | 6.539589 | 21 | |
103 | dense | 2048 | 2048 | 1.0 | 3.959309 | 0.121883 | 4.740210 | 246 | |
104 | dense | 2048 | 2048 | 1.0 | 2.045696 | 0.037863 | 4.459243 | 329 | |
105 | dense | 2048 | 512 | 4.0 | 2.221702 | 0.067403 | 1.970060 | 247 | |
106 | dense | 2048 | 2048 | 1.0 | 2.164987 | 0.045854 | 4.806125 | 256 | |
107 | dense | 8192 | 2048 | 4.0 | 3.781602 | 0.028628 | 8.958692 | 338 | |
108 | dense | 8192 | 2048 | 4.0 | 3.543507 | 0.028730 | 8.094537 | 374 | |
109 | dense | 2048 | 512 | 4.0 | 3.228773 | 0.039283 | 5.854102 | 49 | |
110 | dense | 2048 | 2048 | 1.0 | 3.663659 | 0.055250 | 3.361705 | 158 | |
111 | dense | 8192 | 2048 | 4.0 | 9.090875 | 0.055415 | 11.396832 | 87 | under-trained |
112 | dense | 2048 | 512 | 4.0 | 2.541051 | 0.072589 | 1.671547 | 232 | |
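
Two patterns in the table can be checked mechanically: `Q` equals `N / M` in every row, and the `under-trained` warning appears exactly on the rows whose `alpha` exceeds 6. The following is a minimal sketch of that check, not the tool's own logic. The threshold of 6 is an assumption inferred from the rows above, and `LayerRow`, `parse_row`, and `expected_warning` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Assumption: inferred from the table, where every row with alpha > 6
# carries the "under-trained" warning and no other row does.
UNDER_TRAINED_ALPHA = 6.0


@dataclass
class LayerRow:
    layer_id: int
    layer_type: str
    n: int
    m: int
    q: float
    alpha: float
    d: float
    alpha_hat: float
    num_spikes: int
    warning: str


def parse_row(line: str) -> LayerRow:
    """Parse one pipe-delimited data row of the table (hypothetical helper)."""
    cells = [c.strip() for c in line.split("|")]
    # Keep the ten named columns; pad if the trailing warning cell is missing.
    cells = (cells + [""] * 10)[:10]
    return LayerRow(
        layer_id=int(cells[0]),
        layer_type=cells[1],
        n=int(cells[2]),
        m=int(cells[3]),
        q=float(cells[4]),
        alpha=float(cells[5]),
        d=float(cells[6]),
        alpha_hat=float(cells[7]),
        num_spikes=int(cells[8]),
        warning=cells[9],
    )


def expected_warning(alpha: float) -> str:
    """Return the warning the assumed alpha threshold would produce."""
    return "under-trained" if alpha > UNDER_TRAINED_ALPHA else ""


if __name__ == "__main__":
    rows = [
        "11 | dense | 2048 | 512 | 4.0 | 6.019533 | 0.058894 | -0.386183 | 48 | under-trained |",
        "58 | dense | 8192 | 2048 | 4.0 | 5.913016 | 0.021849 | 6.836058 | 153 | |",
    ]
    for line in rows:
        row = parse_row(line)
        # Q should match the aspect ratio N / M.
        assert row.q == row.n / row.m
        # The recorded warning should match the assumed threshold rule.
        assert expected_warning(row.alpha) == row.warning
        print(row.layer_id, row.alpha, repr(row.warning))
```

Rows 11 and 58 are used because they sit just above and just below the assumed cutoff (`alpha` of 6.019533 and 5.913016).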