Llama-3.1-8B-Instruct




Llama-3.1-8B-Instruct Model Set Plots


Llama3.1 Compared to Base Model Plots



Llama-3.1-8B-Instruct Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 14336 4096 3.5 5.566277 0.036783 -7.484827 257
2 dense 14336 4096 3.5 3.744572 0.016308 -4.346715 243
3 dense 14336 4096 3.5 3.953621 0.024032 -4.737875 205
4 dense 4096 1024 4.0 1.829401 0.029053 -1.778983 148 over-trained
5 dense 4096 4096 1.0 2.903583 0.024394 -3.552889 164
6 dense 4096 4096 1.0 1.961999 0.020429 -0.422494 232 over-trained
7 dense 4096 1024 4.0 2.892228 0.016609 -5.220379 158
8 dense 4096 1024 4.0 3.904200 0.016518 -7.035455 105
9 dense 4096 4096 1.0 2.034319 0.028225 -2.470295 378
10 dense 4096 1024 4.0 2.110392 0.026734 -3.064152 117
11 dense 4096 4096 1.0 3.394994 0.014317 -4.539148 151
12 dense 14336 4096 3.5 4.304065 0.038416 -4.699831 147
13 dense 14336 4096 3.5 5.319059 0.027326 -6.444489 337
14 dense 14336 4096 3.5 4.638550 0.039024 -5.443615 138
15 dense 14336 4096 3.5 5.440712 0.018162 -7.302276 209
16 dense 14336 4096 3.5 3.876195 0.017399 -4.030499 425
17 dense 14336 4096 3.5 3.929544 0.013290 -4.291484 369
18 dense 4096 1024 4.0 2.688238 0.019762 -4.532494 164
19 dense 4096 4096 1.0 3.810574 0.027142 -5.425370 112
20 dense 4096 4096 1.0 2.568932 0.021869 -2.992376 300
21 dense 4096 1024 4.0 4.204961 0.023961 -8.154221 71
22 dense 4096 4096 1.0 2.440665 0.020927 -3.008347 335
23 dense 4096 1024 4.0 4.618718 0.026747 -9.298781 44
24 dense 4096 1024 4.0 2.598886 0.020203 -4.312166 173
25 dense 4096 4096 1.0 4.086834 0.019777 -5.659924 90
26 dense 14336 4096 3.5 3.586677 0.006002 -3.276137 278
27 dense 14336 4096 3.5 5.306092 0.013875 -7.211346 190
28 dense 14336 4096 3.5 3.818615 0.011928 -3.811465 160
29 dense 14336 4096 3.5 5.288662 0.014028 -7.321813 154
30 dense 14336 4096 3.5 3.475786 0.006908 -3.127371 278
31 dense 14336 4096 3.5 3.870221 0.016374 -3.894667 174
32 dense 4096 1024 4.0 2.275382 0.027502 -3.414949 240
33 dense 4096 4096 1.0 3.617898 0.032009 -5.050236 76
34 dense 4096 4096 1.0 2.237941 0.033755 -2.549039 370
35 dense 4096 1024 4.0 3.789872 0.031270 -7.380141 68
36 dense 4096 1024 4.0 2.523389 0.033862 -4.050649 126
37 dense 4096 1024 4.0 4.236551 0.039970 -8.604957 84
38 dense 4096 4096 1.0 2.396390 0.042810 -2.709584 333
39 dense 4096 4096 1.0 4.562651 0.027370 -6.546579 59
40 dense 14336 4096 3.5 3.345224 0.018337 -2.906197 268
41 dense 14336 4096 3.5 5.742532 0.015142 -7.997184 118
42 dense 14336 4096 3.5 4.058726 0.021147 -4.036190 59
43 dense 4096 4096 1.0 2.256764 0.053891 -2.737779 496
44 dense 14336 4096 3.5 5.483703 0.022006 -7.509832 118
45 dense 14336 4096 3.5 3.322000 0.024683 -2.670651 202
46 dense 14336 4096 3.5 4.039261 0.022260 -3.748096 75
47 dense 4096 1024 4.0 2.575460 0.037876 -3.724308 143
48 dense 4096 4096 1.0 4.613319 0.029404 -7.031493 64
49 dense 4096 1024 4.0 4.421184 0.031630 -8.449028 68
50 dense 4096 4096 1.0 5.482070 0.028429 -8.999247 43
51 dense 4096 4096 1.0 2.300242 0.064950 -2.730321 508
52 dense 4096 1024 4.0 2.873448 0.044821 -4.714789 70
53 dense 4096 1024 4.0 4.281765 0.092534 -9.579145 156
54 dense 14336 4096 3.5 3.305825 0.025538 -2.462425 167
55 dense 14336 4096 3.5 5.273812 0.019953 -6.986482 110
56 dense 14336 4096 3.5 3.778090 0.023315 -3.451383 136
57 dense 14336 4096 3.5 4.857646 0.013794 -5.629543 126
58 dense 14336 4096 3.5 3.418502 0.023981 -2.437481 89
59 dense 14336 4096 3.5 3.632789 0.019624 -3.222215 167
60 dense 4096 1024 4.0 2.493414 0.049723 -4.038852 190
61 dense 4096 4096 1.0 4.825733 0.023394 -7.557893 65
62 dense 4096 4096 1.0 2.537791 0.065874 -3.030965 301
63 dense 4096 1024 4.0 5.435277 0.041492 -11.417254 47
64 dense 14336 4096 3.5 3.773290 0.020846 -3.385495 79
65 dense 4096 1024 4.0 2.522993 0.052379 -3.877708 197
66 dense 4096 4096 1.0 3.766005 0.023528 -4.865797 137
67 dense 4096 4096 1.0 3.256556 0.040422 -3.518876 66
68 dense 4096 1024 4.0 4.383698 0.024910 -7.951996 83
69 dense 14336 4096 3.5 4.923977 0.018218 -6.181336 119
70 dense 14336 4096 3.5 3.482154 0.027412 -2.392972 61
71 dense 4096 1024 4.0 4.856759 0.031993 -9.907825 43
72 dense 14336 4096 3.5 4.778724 0.030886 -5.940484 104
73 dense 14336 4096 3.5 3.397191 0.030359 -2.178533 72
74 dense 14336 4096 3.5 3.566970 0.022815 -3.074182 172
75 dense 4096 1024 4.0 2.626219 0.050535 -4.249996 156
76 dense 4096 4096 1.0 4.515628 0.034145 -7.263082 67
77 dense 4096 4096 1.0 2.401896 0.068155 -2.766601 412
78 dense 14336 4096 3.5 4.115812 0.037242 -4.923190 202
79 dense 14336 4096 3.5 3.155625 0.025266 -1.836681 124
80 dense 14336 4096 3.5 3.490654 0.022166 -2.765869 120
81 dense 4096 1024 4.0 2.695721 0.053725 -4.152259 127
82 dense 4096 4096 1.0 3.974410 0.031570 -5.999063 75
83 dense 4096 4096 1.0 3.267564 0.041047 -3.698597 92
84 dense 4096 1024 4.0 4.464224 0.045865 -9.274534 82
85 dense 4096 4096 1.0 2.094559 0.076795 -2.311837 700
86 dense 14336 4096 3.5 4.230543 0.033043 -4.762989 143
87 dense 14336 4096 3.5 3.208724 0.020508 -1.940045 109
88 dense 14336 4096 3.5 3.447957 0.020300 -2.727383 132
89 dense 4096 1024 4.0 3.185031 0.066376 -5.321613 60
90 dense 4096 1024 4.0 5.259201 0.033806 -11.333432 50
91 dense 4096 4096 1.0 4.653701 0.023562 -7.206140 67
92 dense 14336 4096 3.5 3.486638 0.028560 -2.095000 67
93 dense 14336 4096 3.5 4.301591 0.034038 -4.751005 135
94 dense 4096 1024 4.0 5.599922 0.041605 -12.088022 48
95 dense 14336 4096 3.5 3.719462 0.018987 -2.865647 65
96 dense 4096 1024 4.0 2.676461 0.058559 -3.999240 151
97 dense 4096 4096 1.0 4.889089 0.032124 -7.994793 55
98 dense 4096 4096 1.0 3.263748 0.044998 -3.520784 81
99 dense 4096 1024 4.0 2.666811 0.060553 -4.079505 164
100 dense 4096 4096 1.0 5.833037 0.034403 -10.052303 51
101 dense 4096 4096 1.0 2.460032 0.081570 -2.730429 424
102 dense 4096 1024 4.0 4.761513 0.101614 -10.182945 145
103 dense 14336 4096 3.5 3.403841 0.031532 -2.245913 130
104 dense 14336 4096 3.5 3.664970 0.024135 -3.103680 167
105 dense 14336 4096 3.5 4.554698 0.034979 -4.630477 71
106 dense 4096 4096 1.0 3.861364 0.017876 -4.763029 86
107 dense 4096 4096 1.0 3.336432 0.049241 -3.281043 77
108 dense 4096 1024 4.0 2.678235 0.051177 -3.991945 158
109 dense 14336 4096 3.5 4.124877 0.027509 -3.676328 99
110 dense 14336 4096 3.5 2.942788 0.054924 -2.057289 554
111 dense 4096 1024 4.0 5.343460 0.035658 -10.363828 47
112 dense 14336 4096 3.5 4.137147 0.036676 -4.642545 205
113 dense 4096 1024 4.0 5.453090 0.042485 -11.379989 49
114 dense 4096 1024 4.0 2.961276 0.050823 -4.490973 92
115 dense 14336 4096 3.5 5.194557 0.043443 -5.789716 97
116 dense 14336 4096 3.5 3.591772 0.030046 -2.441147 133
117 dense 14336 4096 3.5 4.191695 0.025207 -3.603833 86
118 dense 4096 4096 1.0 4.762577 0.020839 -6.846206 63
119 dense 4096 4096 1.0 2.466467 0.075247 -2.480368 404
120 dense 14336 4096 3.5 4.774535 0.054608 -5.533087 201
121 dense 14336 4096 3.5 4.043116 0.029824 -2.698235 45
122 dense 4096 1024 4.0 4.995331 0.025293 -9.777764 66
123 dense 4096 4096 1.0 2.687236 0.051787 -2.403087 186
124 dense 4096 4096 1.0 3.730546 0.020139 -4.450703 142
125 dense 4096 1024 4.0 2.828873 0.044022 -4.038707 107
126 dense 14336 4096 3.5 4.246484 0.021018 -3.619156 85
127 dense 14336 4096 3.5 4.001497 0.024469 -2.795359 45
128 dense 14336 4096 3.5 4.818884 0.043329 -5.384868 134
129 dense 14336 4096 3.5 4.333223 0.025583 -3.830013 62
130 dense 4096 1024 4.0 3.739384 0.102830 -7.763754 210
131 dense 4096 4096 1.0 2.890685 0.064631 -2.932307 189
132 dense 4096 4096 1.0 3.602685 0.021159 -4.541940 134
133 dense 4096 1024 4.0 2.773756 0.040463 -3.983166 108
134 dense 4096 1024 4.0 2.605117 0.037866 -3.519287 129
135 dense 14336 4096 3.5 3.811864 0.020283 -2.583575 79
136 dense 4096 4096 1.0 2.534311 0.045690 -2.064493 218
137 dense 4096 1024 4.0 3.788415 0.077754 -7.567815 152
138 dense 14336 4096 3.5 4.077397 0.046358 -4.633618 321
139 dense 14336 4096 3.5 4.205559 0.018214 -3.523855 79
140 dense 4096 4096 1.0 3.401140 0.018897 -3.701802 120
141 dense 14336 4096 3.5 4.248404 0.022072 -3.627240 75
142 dense 14336 4096 3.5 4.282263 0.044585 -4.824118 238
143 dense 4096 1024 4.0 3.434701 0.094322 -6.654858 234
144 dense 4096 4096 1.0 2.459529 0.060041 -2.156628 349
145 dense 4096 4096 1.0 4.002706 0.071486 -5.471758 156
146 dense 4096 1024 4.0 2.801481 0.035508 -3.927915 93
147 dense 14336 4096 3.5 3.739231 0.028171 -2.563615 96
148 dense 14336 4096 3.5 4.667609 0.033556 -5.326261 195
149 dense 14336 4096 3.5 3.782943 0.022981 -2.501314 78
150 dense 14336 4096 3.5 4.139187 0.023189 -3.545766 94
151 dense 4096 1024 4.0 2.708548 0.025557 -3.421375 112
152 dense 4096 4096 1.0 4.627657 0.034273 -6.537403 79
153 dense 4096 4096 1.0 2.488661 0.051379 -2.003094 297
154 dense 4096 1024 4.0 4.337292 0.055716 -8.544352 90
155 dense 14336 4096 3.5 4.271438 0.024087 -3.777454 77
156 dense 4096 1024 4.0 5.102286 0.055429 -10.327493 52
157 dense 4096 4096 1.0 2.662460 0.062519 -2.513626 260
158 dense 4096 4096 1.0 4.723034 0.028986 -6.205449 62
159 dense 14336 4096 3.5 4.909658 0.027117 -5.944382 174
160 dense 14336 4096 3.5 3.739089 0.023923 -2.636720 109
161 dense 4096 1024 4.0 2.888455 0.040537 -4.167628 93
162 dense 4096 4096 1.0 2.640950 0.049703 -2.573006 206
163 dense 14336 4096 3.5 5.000543 0.025480 -6.023193 156
164 dense 14336 4096 3.5 3.603040 0.031477 -2.499370 137
165 dense 14336 4096 3.5 4.231198 0.028894 -3.662233 81
166 dense 4096 1024 4.0 2.685098 0.034546 -4.075439 124
167 dense 4096 4096 1.0 4.299441 0.083730 -5.389531 199
168 dense 4096 1024 4.0 4.687312 0.040545 -9.005686 51
169 dense 4096 1024 4.0 3.658518 0.042463 -6.832878 122
170 dense 4096 4096 1.0 2.495831 0.042071 -2.301120 295
171 dense 4096 4096 1.0 3.941541 0.019981 -4.838795 67
172 dense 14336 4096 3.5 5.122823 0.018292 -5.925772 122
173 dense 14336 4096 3.5 4.140157 0.031284 -3.570270 82
174 dense 14336 4096 3.5 3.707505 0.029033 -2.593846 103
175 dense 4096 1024 4.0 2.637588 0.025598 -3.875651 148
176 dense 14336 4096 3.5 5.059369 0.016025 -5.670223 114
177 dense 14336 4096 3.5 3.769210 0.028027 -2.590619 73
178 dense 14336 4096 3.5 4.061810 0.031584 -3.393754 87
179 dense 4096 1024 4.0 2.636098 0.023425 -3.672644 113
180 dense 4096 4096 1.0 3.865161 0.023610 -4.967077 100
181 dense 4096 4096 1.0 2.366307 0.035209 -2.310895 359
182 dense 4096 1024 4.0 3.324234 0.072874 -5.910475 200
183 dense 4096 1024 4.0 3.537435 0.036191 -6.177606 113
184 dense 4096 4096 1.0 2.438285 0.043889 -1.864263 226
185 dense 4096 4096 1.0 3.538867 0.018582 -3.711131 142
186 dense 4096 1024 4.0 2.548875 0.035833 -3.534538 115
187 dense 14336 4096 3.5 3.329358 0.027520 -2.134180 251
188 dense 14336 4096 3.5 5.238333 0.026579 -5.711918 63
189 dense 14336 4096 3.5 3.545623 0.031017 -2.830998 254
190 dense 14336 4096 3.5 5.087988 0.026789 -5.535566 60
191 dense 14336 4096 3.5 3.262706 0.024458 -1.940640 274
192 dense 14336 4096 3.5 3.531401 0.023633 -2.588921 240
193 dense 4096 1024 4.0 2.607612 0.030054 -3.533266 112
194 dense 4096 4096 1.0 3.624019 0.029149 -3.705106 165
195 dense 4096 4096 1.0 2.470905 0.028849 -2.222166 215
196 dense 4096 1024 4.0 3.724990 0.019131 -6.076207 99
197 dense 4096 1024 4.0 4.809634 0.040594 -9.007583 51
198 dense 4096 4096 1.0 2.464958 0.047648 -1.861644 313
199 dense 4096 4096 1.0 4.863806 0.028548 -6.789878 52
200 dense 14336 4096 3.5 3.212355 0.023528 -1.724386 308
201 dense 14336 4096 3.5 3.553009 0.023099 -2.453656 197
202 dense 14336 4096 3.5 5.193632 0.027637 -5.370406 60
203 dense 4096 1024 4.0 2.629146 0.037752 -3.743482 108
204 dense 14336 4096 3.5 4.679653 0.023142 -4.082676 66
205 dense 14336 4096 3.5 3.090518 0.023316 -1.484805 313
206 dense 14336 4096 3.5 3.556667 0.021469 -2.175238 91
207 dense 4096 1024 4.0 2.514435 0.031008 -3.393185 105
208 dense 4096 4096 1.0 3.368597 0.012765 -3.888572 151
209 dense 4096 4096 1.0 2.218191 0.037997 -1.873997 391
210 dense 4096 1024 4.0 3.107204 0.029493 -5.043023 186
211 dense 4096 1024 4.0 3.429105 0.015942 -5.607908 119
212 dense 4096 4096 1.0 2.607589 0.034715 -1.414339 92
213 dense 4096 4096 1.0 3.080218 0.025337 -2.215568 208
214 dense 14336 4096 3.5 3.149136 0.015422 -1.214684 162
215 dense 14336 4096 3.5 3.304820 0.011774 -1.720937 162
216 dense 14336 4096 3.5 4.290126 0.020322 -3.677406 86
217 dense 4096 1024 4.0 2.419672 0.035917 -2.619083 119
218 dense 4096 1024 4.0 3.306088 0.022248 -5.376384 124
219 dense 14336 4096 3.5 3.119090 0.014628 -1.104154 132
220 dense 14336 4096 3.5 3.171803 0.009477 -1.189946 183
221 dense 4096 1024 4.0 2.199924 0.026173 -2.069228 153
222 dense 4096 4096 1.0 3.212068 0.017559 -2.883493 157
223 dense 4096 4096 1.0 2.161602 0.031998 -0.860058 311
224 dense 14336 4096 3.5 3.274087 0.014122 -3.227743 239
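
The table above can be checked mechanically. The following is a minimal sketch, not the tool that produced the table: it parses a few rows copied verbatim from the table, confirms that Q equals N / M, and flags layers whose alpha falls below 2. The threshold of 2.0 is an assumption inferred from the two rows marked "over-trained" (both have alpha < 2); the names `LayerRow`, `parse_row`, and `is_flagged` are hypothetical helpers introduced only for this illustration.

```python
from typing import NamedTuple, Optional


class LayerRow(NamedTuple):
    """One row of the details table (column names follow the table header)."""
    layer_id: int
    layer_type: str
    N: int
    M: int
    Q: float
    alpha: float
    D: float
    alpha_hat: float
    num_spikes: int
    warning: Optional[str]  # None when the warning column is empty


def parse_row(line: str) -> LayerRow:
    """Split one whitespace-delimited table row into typed fields."""
    parts = line.split()
    warning = parts[9] if len(parts) > 9 else None
    return LayerRow(
        int(parts[0]), parts[1], int(parts[2]), int(parts[3]),
        float(parts[4]), float(parts[5]), float(parts[6]),
        float(parts[7]), int(parts[8]), warning,
    )


def is_flagged(alpha: float, threshold: float = 2.0) -> bool:
    """Assumed rule: a layer is flagged when alpha is below the threshold."""
    return alpha < threshold


# Rows copied verbatim from the table (ids 4, 5, 6, and 16).
SAMPLE = """\
4 dense 4096 1024 4.0 1.829401 0.029053 -1.778983 148 over-trained
5 dense 4096 4096 1.0 2.903583 0.024394 -3.552889 164
6 dense 4096 4096 1.0 1.961999 0.020429 -0.422494 232 over-trained
16 dense 14336 4096 3.5 3.876195 0.017399 -4.030499 425
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]

# Q should be the aspect ratio N / M for every row.
q_consistent = all(abs(r.N / r.M - r.Q) < 1e-9 for r in rows)

# Layers flagged by the assumed alpha < 2 rule.
flagged = [r.layer_id for r in rows if is_flagged(r.alpha)]

# The assumed rule should agree with the table's own warning column.
rule_matches_table = all(
    is_flagged(r.alpha) == (r.warning == "over-trained") for r in rows
)

print(q_consistent, flagged, rule_matches_table)
```

To check the full table, paste all 224 rows into `SAMPLE`; the same three checks apply unchanged.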