Find this model in the SmolLM-base model summary

| id | layer_type | N | M | Q | alpha | D | alpha-hat | num_spikes | warning |
|---|---|---|---|---|---|---|---|---|---|
| 1 | dense | 49152 | 2048 | 24.0 | 4.985810 | 0.035900 | 26.094815 | 606 | |
| 2 | dense | 8192 | 2048 | 4.0 | 6.061865 | 0.047682 | 22.467151 | 377 | under-trained |
| 3 | dense | 8192 | 2048 | 4.0 | 4.444656 | 0.025693 | 14.708213 | 315 | |
| 4 | dense | 8192 | 2048 | 4.0 | 5.348869 | 0.022820 | 15.849016 | 192 | |
| 5 | dense | 2048 | 2048 | 1.0 | 1.907949 | 0.044071 | 8.227395 | 64 | over-trained |
| 6 | dense | 2048 | 2048 | 1.0 | 4.234248 | 0.027679 | 11.671066 | 91 | |
| 7 | dense | 2048 | 2048 | 1.0 | 2.007735 | 0.042923 | 9.100340 | 41 | |
| 8 | dense | 2048 | 2048 | 1.0 | 5.673907 | 0.034425 | 12.232675 | 38 | |
| 9 | dense | 8192 | 2048 | 4.0 | 5.877559 | 0.048391 | 21.147509 | 341 | |
| 10 | dense | 8192 | 2048 | 4.0 | 3.492588 | 0.048506 | 11.783397 | 494 | |
| 11 | dense | 8192 | 2048 | 4.0 | 4.657179 | 0.030300 | 13.649515 | 263 | |
| 12 | dense | 2048 | 2048 | 1.0 | 2.379652 | 0.030604 | 8.545119 | 134 | |
| 13 | dense | 2048 | 2048 | 1.0 | 4.192792 | 0.028741 | 14.108445 | 132 | |
| 14 | dense | 2048 | 2048 | 1.0 | 2.217322 | 0.034528 | 8.043939 | 134 | |
| 15 | dense | 2048 | 2048 | 1.0 | 3.205916 | 0.049733 | 7.847394 | 181 | |
| 16 | dense | 8192 | 2048 | 4.0 | 3.689984 | 0.044762 | 12.635370 | 497 | |
| 17 | dense | 8192 | 2048 | 4.0 | 6.824178 | 0.046423 | 22.886140 | 274 | under-trained |
| 18 | dense | 2048 | 2048 | 1.0 | 2.606312 | 0.031805 | 9.119949 | 102 | |
| 19 | dense | 2048 | 2048 | 1.0 | 4.129533 | 0.050163 | 9.838797 | 118 | |
| 20 | dense | 2048 | 2048 | 1.0 | 4.276058 | 0.031084 | 12.893443 | 123 | |
| 21 | dense | 2048 | 2048 | 1.0 | 2.439897 | 0.041158 | 8.605260 | 105 | |
| 22 | dense | 8192 | 2048 | 4.0 | 5.270880 | 0.023578 | 15.242650 | 230 | |
| 23 | dense | 8192 | 2048 | 4.0 | 7.257522 | 0.048512 | 24.878408 | 229 | under-trained |
| 24 | dense | 8192 | 2048 | 4.0 | 4.166175 | 0.036450 | 14.262419 | 400 | |
| 25 | dense | 2048 | 2048 | 1.0 | 4.753083 | 0.017375 | 14.079333 | 116 | |
| 26 | dense | 2048 | 2048 | 1.0 | 2.948238 | 0.017577 | 10.002619 | 119 | |
| 27 | dense | 8192 | 2048 | 4.0 | 5.909732 | 0.022335 | 16.575888 | 190 | |
| 28 | dense | 2048 | 2048 | 1.0 | 2.778066 | 0.019825 | 9.944534 | 99 | |
| 29 | dense | 2048 | 2048 | 1.0 | 4.583617 | 0.056380 | 10.586130 | 134 | |
| 30 | dense | 8192 | 2048 | 4.0 | 3.171970 | 0.029821 | 11.012554 | 105 | |
| 31 | dense | 8192 | 2048 | 4.0 | 6.204448 | 0.036914 | 21.031265 | 262 | under-trained |
| 32 | dense | 8192 | 2048 | 4.0 | 5.583837 | 0.033250 | 15.906687 | 196 | |
| 33 | dense | 2048 | 2048 | 1.0 | 3.173510 | 0.019116 | 10.622360 | 107 | |
| 34 | dense | 2048 | 2048 | 1.0 | 5.208979 | 0.034556 | 15.137898 | 110 | |
| 35 | dense | 2048 | 2048 | 1.0 | 3.499016 | 0.015965 | 11.420596 | 130 | |
| 36 | dense | 2048 | 2048 | 1.0 | 7.380008 | 0.054563 | 16.385398 | 54 | under-trained |
| 37 | dense | 8192 | 2048 | 4.0 | 6.292303 | 0.050843 | 17.251547 | 183 | under-trained |
| 38 | dense | 8192 | 2048 | 4.0 | 3.546132 | 0.031751 | 12.726826 | 535 | |
| 39 | dense | 8192 | 2048 | 4.0 | 5.741278 | 0.035362 | 19.211060 | 281 | |
| 40 | dense | 2048 | 2048 | 1.0 | 3.495771 | 0.018625 | 11.975130 | 137 | |
| 41 | dense | 2048 | 2048 | 1.0 | 4.613460 | 0.021383 | 11.091040 | 125 | |
| 42 | dense | 2048 | 2048 | 1.0 | 3.102129 | 0.013069 | 10.211464 | 156 | |
| 43 | dense | 2048 | 2048 | 1.0 | 4.640748 | 0.015478 | 13.745047 | 95 | |
| 44 | dense | 8192 | 2048 | 4.0 | 5.236114 | 0.014344 | 17.360837 | 194 | |
| 45 | dense | 8192 | 2048 | 4.0 | 3.481919 | 0.023798 | 12.642276 | 504 | |
| 46 | dense | 2048 | 2048 | 1.0 | 3.512885 | 0.031995 | 11.869035 | 195 | |
| 47 | dense | 8192 | 2048 | 4.0 | 5.775285 | 0.057441 | 16.001157 | 217 | |
| 48 | dense | 2048 | 2048 | 1.0 | 3.953079 | 0.031498 | 11.758530 | 122 | |
| 49 | dense | 2048 | 2048 | 1.0 | 3.092364 | 0.021136 | 10.001817 | 218 | |
| 50 | dense | 2048 | 2048 | 1.0 | 4.591788 | 0.036275 | 10.951975 | 94 | |
| 51 | dense | 2048 | 2048 | 1.0 | 3.628250 | 0.022703 | 11.833557 | 63 | |
| 52 | dense | 8192 | 2048 | 4.0 | 3.589795 | 0.024592 | 12.924875 | 493 | |
| 53 | dense | 8192 | 2048 | 4.0 | 4.917805 | 0.021771 | 16.323827 | 144 | |
| 54 | dense | 2048 | 2048 | 1.0 | 4.262291 | 0.027046 | 13.110964 | 77 | |
| 55 | dense | 8192 | 2048 | 4.0 | 6.063698 | 0.041386 | 16.785643 | 182 | under-trained |
| 56 | dense | 2048 | 2048 | 1.0 | 3.310731 | 0.027859 | 10.800938 | 71 | |
| 57 | dense | 2048 | 2048 | 1.0 | 4.768189 | 0.042679 | 11.810830 | 77 | |
| 58 | dense | 2048 | 2048 | 1.0 | 4.053091 | 0.035803 | 12.772964 | 127 | |
| 59 | dense | 2048 | 2048 | 1.0 | 3.368119 | 0.016368 | 10.985669 | 146 | |
| 60 | dense | 8192 | 2048 | 4.0 | 5.823414 | 0.029105 | 16.424994 | 181 | |
| 61 | dense | 8192 | 2048 | 4.0 | 3.313864 | 0.021751 | 11.887602 | 111 | |
| 62 | dense | 8192 | 2048 | 4.0 | 4.103623 | 0.043359 | 13.540755 | 334 | |
| 63 | dense | 2048 | 2048 | 1.0 | 3.066863 | 0.015747 | 9.984157 | 141 | |
| 64 | dense | 2048 | 2048 | 1.0 | 4.489395 | 0.033328 | 11.331343 | 63 | |
| 65 | dense | 2048 | 2048 | 1.0 | 4.423856 | 0.048219 | 11.121372 | 96 | |
| 66 | dense | 8192 | 2048 | 4.0 | 6.476095 | 0.044759 | 18.893815 | 92 | under-trained |
| 67 | dense | 2048 | 2048 | 1.0 | 3.663394 | 0.022355 | 11.287357 | 70 | |
| 68 | dense | 8192 | 2048 | 4.0 | 4.731981 | 0.042626 | 15.517160 | 123 | |
| 69 | dense | 8192 | 2048 | 4.0 | 3.711486 | 0.011649 | 12.592826 | 384 | |
| 70 | dense | 2048 | 2048 | 1.0 | 4.921503 | 0.035260 | 14.583494 | 45 | |
| 71 | dense | 2048 | 2048 | 1.0 | 2.707099 | 0.027965 | 8.664626 | 241 | |
| 72 | dense | 8192 | 2048 | 4.0 | 3.532370 | 0.009439 | 11.999676 | 346 | |
| 73 | dense | 8192 | 2048 | 4.0 | 4.112336 | 0.025554 | 13.960611 | 219 | |
| 74 | dense | 2048 | 2048 | 1.0 | 2.418397 | 0.038347 | 7.500156 | 271 | |
| 75 | dense | 2048 | 2048 | 1.0 | 3.652943 | 0.086832 | 10.245887 | 244 | |
| 76 | dense | 2048 | 2048 | 1.0 | 2.729312 | 0.030026 | 8.434665 | 254 | |
| 77 | dense | 2048 | 2048 | 1.0 | 4.211135 | 0.051518 | 10.459128 | 113 | |
| 78 | dense | 8192 | 2048 | 4.0 | 5.241774 | 0.034134 | 15.733354 | 185 | |
| 79 | dense | 8192 | 2048 | 4.0 | 4.670991 | 0.035909 | 14.031295 | 218 | |
| 80 | dense | 8192 | 2048 | 4.0 | 4.053647 | 0.035053 | 13.780550 | 289 | |
| 81 | dense | 8192 | 2048 | 4.0 | 3.282725 | 0.010121 | 11.143081 | 370 | |
| 82 | dense | 2048 | 2048 | 1.0 | 3.476585 | 0.042547 | 9.239317 | 148 | |
| 83 | dense | 2048 | 2048 | 1.0 | 2.513107 | 0.029821 | 7.856054 | 224 | |
| 84 | dense | 2048 | 2048 | 1.0 | 3.991637 | 0.036708 | 11.472920 | 92 | |
| 85 | dense | 2048 | 2048 | 1.0 | 2.801399 | 0.032188 | 8.697806 | 216 | |
| 86 | dense | 2048 | 2048 | 1.0 | 3.361373 | 0.041090 | 10.606104 | 57 | |
| 87 | dense | 8192 | 2048 | 4.0 | 4.639168 | 0.036442 | 14.163982 | 191 | |
| 88 | dense | 8192 | 2048 | 4.0 | 3.330364 | 0.013417 | 11.036625 | 297 | |
| 89 | dense | 8192 | 2048 | 4.0 | 4.137977 | 0.025324 | 14.109428 | 132 | |
| 90 | dense | 2048 | 2048 | 1.0 | 2.476118 | 0.060575 | 7.870749 | 241 | |
| 91 | dense | 2048 | 2048 | 1.0 | 3.881286 | 0.076151 | 10.313551 | 91 | |
| 92 | dense | 2048 | 2048 | 1.0 | 4.341679 | 0.058734 | 12.502720 | 68 | |
| 93 | dense | 2048 | 2048 | 1.0 | 2.959650 | 0.076142 | 8.589222 | 296 | |
| 94 | dense | 8192 | 2048 | 4.0 | 3.880562 | 0.029010 | 12.979227 | 288 | |
| 95 | dense | 2048 | 2048 | 1.0 | 2.564307 | 0.033808 | 8.021692 | 244 | |
| 96 | dense | 8192 | 2048 | 4.0 | 4.851072 | 0.031249 | 14.836685 | 135 | |
| 97 | dense | 8192 | 2048 | 4.0 | 3.476809 | 0.018308 | 11.271706 | 234 | |
| 98 | dense | 2048 | 2048 | 1.0 | 2.907507 | 0.059473 | 7.802307 | 247 | |
| 99 | dense | 2048 | 2048 | 1.0 | 2.388114 | 0.049874 | 7.467722 | 278 | |
| 100 | dense | 2048 | 2048 | 1.0 | 2.294156 | 0.048259 | 7.055529 | 236 | |
| 101 | dense | 2048 | 2048 | 1.0 | 3.818825 | 0.046852 | 11.603113 | 153 | |
| 102 | dense | 8192 | 2048 | 4.0 | 3.459069 | 0.015853 | 11.197832 | 245 | |
| 103 | dense | 8192 | 2048 | 4.0 | 4.294616 | 0.022258 | 14.375133 | 228 | |
| 104 | dense | 2048 | 2048 | 1.0 | 2.385462 | 0.025676 | 7.377367 | 266 | |
| 105 | dense | 8192 | 2048 | 4.0 | 4.585219 | 0.038642 | 14.011958 | 171 | |
| 106 | dense | 2048 | 2048 | 1.0 | 3.429516 | 0.060301 | 8.924242 | 221 | |
| 107 | dense | 2048 | 2048 | 1.0 | 4.520516 | 0.062814 | 11.641754 | 105 | |
| 108 | dense | 2048 | 2048 | 1.0 | 4.001494 | 0.042004 | 12.447019 | 105 | |
| 109 | dense | 2048 | 2048 | 1.0 | 2.342501 | 0.049112 | 7.210903 | 289 | |
| 110 | dense | 8192 | 2048 | 4.0 | 4.462607 | 0.035344 | 13.719445 | 239 | |
| 111 | dense | 8192 | 2048 | 4.0 | 3.573419 | 0.012997 | 11.493275 | 245 | |
| 112 | dense | 8192 | 2048 | 4.0 | 4.171793 | 0.016879 | 13.705107 | 270 | |
| 113 | dense | 2048 | 2048 | 1.0 | 2.649582 | 0.029696 | 8.157236 | 180 | |
| 114 | dense | 2048 | 2048 | 1.0 | 2.798565 | 0.020672 | 8.525745 | 210 | |
| 115 | dense | 2048 | 2048 | 1.0 | 4.701752 | 0.079109 | 12.155946 | 154 | |
| 116 | dense | 2048 | 2048 | 1.0 | 2.628936 | 0.043744 | 8.016969 | 190 | |
| 117 | dense | 2048 | 2048 | 1.0 | 4.373851 | 0.023457 | 13.656324 | 101 | |
| 118 | dense | 8192 | 2048 | 4.0 | 3.593117 | 0.012376 | 11.544061 | 295 | |
| 119 | dense | 8192 | 2048 | 4.0 | 4.500922 | 0.026694 | 13.797287 | 219 | |
| 120 | dense | 8192 | 2048 | 4.0 | 4.497480 | 0.009951 | 14.760452 | 240 | |
| 121 | dense | 2048 | 2048 | 1.0 | 2.707310 | 0.018194 | 8.362941 | 201 | |
| 122 | dense | 2048 | 2048 | 1.0 | 2.447662 | 0.041650 | 7.395742 | 261 | |
| 123 | dense | 8192 | 2048 | 4.0 | 5.185202 | 0.021314 | 16.294687 | 240 | |
| 124 | dense | 8192 | 2048 | 4.0 | 3.654104 | 0.010326 | 11.761810 | 312 | |
| 125 | dense | 8192 | 2048 | 4.0 | 4.590249 | 0.017140 | 14.221978 | 181 | |
| 126 | dense | 2048 | 2048 | 1.0 | 4.163768 | 0.081519 | 13.236066 | 215 | |
| 127 | dense | 2048 | 2048 | 1.0 | 4.615185 | 0.059889 | 11.916508 | 148 | |
| 128 | dense | 8192 | 2048 | 4.0 | 6.732997 | 0.027807 | 20.724183 | 177 | under-trained |
| 129 | dense | 8192 | 2048 | 4.0 | 3.560899 | 0.014532 | 11.455630 | 359 | |
| 130 | dense | 8192 | 2048 | 4.0 | 4.405660 | 0.017968 | 13.780654 | 256 | |
| 131 | dense | 2048 | 2048 | 1.0 | 2.577489 | 0.017489 | 7.809797 | 257 | |
| 132 | dense | 2048 | 2048 | 1.0 | 4.002647 | 0.120227 | 12.895966 | 323 | |
| 133 | dense | 2048 | 2048 | 1.0 | 2.431265 | 0.034764 | 7.217835 | 282 | |
| 134 | dense | 2048 | 2048 | 1.0 | 3.286023 | 0.111645 | 8.282764 | 314 | |
| 135 | dense | 8192 | 2048 | 4.0 | 8.551152 | 0.037032 | 24.877356 | 142 | under-trained |
| 136 | dense | 8192 | 2048 | 4.0 | 3.621229 | 0.013640 | 11.595202 | 360 | |
| 137 | dense | 8192 | 2048 | 4.0 | 4.383819 | 0.022844 | 13.865581 | 241 | |
| 138 | dense | 2048 | 2048 | 1.0 | 2.631697 | 0.022029 | 7.880670 | 252 | |
| 139 | dense | 2048 | 2048 | 1.0 | 3.344890 | 0.061878 | 11.141460 | 310 | |
| 140 | dense | 2048 | 2048 | 1.0 | 2.436662 | 0.028662 | 7.243400 | 242 | |
| 141 | dense | 2048 | 2048 | 1.0 | 2.727758 | 0.042429 | 7.871635 | 314 | |
| 142 | dense | 8192 | 2048 | 4.0 | 8.806831 | 0.046651 | 26.001409 | 150 | under-trained |
| 143 | dense | 8192 | 2048 | 4.0 | 3.688949 | 0.012926 | 11.736405 | 362 | |
| 144 | dense | 8192 | 2048 | 4.0 | 4.563524 | 0.034064 | 14.092819 | 230 | |
| 145 | dense | 2048 | 2048 | 1.0 | 2.704949 | 0.022842 | 8.303454 | 294 | |
| 146 | dense | 2048 | 2048 | 1.0 | 5.481369 | 0.060181 | 18.966246 | 128 | |
| 147 | dense | 2048 | 2048 | 1.0 | 2.490630 | 0.022876 | 7.465164 | 335 | |
| 148 | dense | 2048 | 2048 | 1.0 | 2.713586 | 0.062785 | 7.590794 | 359 | |
| 149 | dense | 8192 | 2048 | 4.0 | 8.540277 | 0.033041 | 25.806277 | 152 | under-trained |
| 150 | dense | 8192 | 2048 | 4.0 | 3.745582 | 0.012952 | 11.884021 | 353 | |
| 151 | dense | 8192 | 2048 | 4.0 | 4.675792 | 0.037899 | 14.153032 | 246 | |
| 152 | dense | 2048 | 2048 | 1.0 | 2.615634 | 0.024370 | 8.157795 | 308 | |
| 153 | dense | 2048 | 2048 | 1.0 | 5.975373 | 0.063874 | 20.476260 | 154 | |
| 154 | dense | 2048 | 2048 | 1.0 | 2.485479 | 0.023897 | 7.489174 | 323 | |
| 155 | dense | 2048 | 2048 | 1.0 | 2.841628 | 0.039679 | 8.186765 | 324 | |
| 156 | dense | 8192 | 2048 | 4.0 | 7.267474 | 0.038396 | 22.527937 | 181 | under-trained |
| 157 | dense | 8192 | 2048 | 4.0 | 3.850526 | 0.014250 | 12.456862 | 345 | |
| 158 | dense | 8192 | 2048 | 4.0 | 5.369081 | 0.017470 | 17.702984 | 76 | |
| 159 | dense | 2048 | 2048 | 1.0 | 2.735886 | 0.025589 | 8.599283 | 307 | |
| 160 | dense | 2048 | 2048 | 1.0 | 5.705724 | 0.072234 | 20.206546 | 178 | |
| 161 | dense | 2048 | 2048 | 1.0 | 2.588641 | 0.030018 | 7.834319 | 358 | |
| 162 | dense | 2048 | 2048 | 1.0 | 3.855806 | 0.043412 | 10.680908 | 215 | |
| 163 | dense | 8192 | 2048 | 4.0 | 5.519475 | 0.032620 | 19.585211 | 258 | |
| 164 | dense | 8192 | 2048 | 4.0 | 3.926951 | 0.035242 | 13.158016 | 452 | |
| 165 | dense | 8192 | 2048 | 4.0 | 3.917016 | 0.025082 | 13.797526 | 35 | |
| 166 | dense | 2048 | 2048 | 1.0 | 3.083176 | 0.018359 | 9.817370 | 229 | |
| 167 | dense | 2048 | 2048 | 1.0 | 4.443981 | 0.074175 | 15.695763 | 274 | |
| 168 | dense | 2048 | 2048 | 1.0 | 2.827124 | 0.017963 | 8.378949 | 273 | |
| 169 | dense | 2048 | 2048 | 1.0 | 6.049623 | 0.100282 | 16.720815 | 174 | under-trained |
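
The `Q` and `warning` columns above appear to follow simple rules that can be checked against the rows. The sketch below is a minimal illustration, not the tool's actual implementation. It assumes `Q` is the aspect ratio `N / M` and that a layer is flagged `over-trained` when `alpha < 2` and `under-trained` when `alpha > 6`. Both thresholds are inferred from the rows in this table, and the helper names `aspect_ratio` and `warning_for` are hypothetical.

```python
from typing import List, Tuple


def aspect_ratio(n: int, m: int) -> float:
    """Q as it appears in the table: larger dimension over smaller dimension."""
    return max(n, m) / min(n, m)


def warning_for(alpha: float, low: float = 2.0, high: float = 6.0) -> str:
    """Return the warning label implied by alpha.

    The thresholds 2.0 and 6.0 are assumptions inferred from the table rows,
    not values confirmed by the source.
    """
    if alpha < low:
        return "over-trained"
    if alpha > high:
        return "under-trained"
    return ""


# A few rows copied from the table: (id, N, M, alpha).
rows: List[Tuple[int, int, int, float]] = [
    (1, 49152, 2048, 4.985810),
    (2, 8192, 2048, 6.061865),
    (5, 2048, 2048, 1.907949),
    (7, 2048, 2048, 2.007735),
    (153, 2048, 2048, 5.975373),
]

# Recompute Q and the warning label for each sampled row.
results = {
    layer_id: (aspect_ratio(n, m), warning_for(alpha))
    for layer_id, n, m, alpha in rows
}

for layer_id, (q, warning) in results.items():
    print(f"layer {layer_id}: Q={q:.1f} warning={warning or '-'}")
```

Layers 7 and 153 sit just inside the assumed thresholds (alpha of about 2.008 and 5.975) and carry no warning in the table, which is consistent with strict inequalities at 2 and 6.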