Mistral-7B-Instruct-v0.2


Mistral-7B-Instruct-v0.2 Model Set Plots



Mistral-7B-Instruct-v0.2 Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 32000 4096 7.8125 4.203737 0.014068 -8.441438 471
2 dense 14336 4096 3.5000 4.733687 0.012908 -10.875004 169
3 dense 14336 4096 3.5000 3.510602 0.024584 -6.945640 123
4 dense 14336 4096 3.5000 3.628967 0.024639 -7.293491 115
5 dense 4096 1024 4.0000 1.596066 0.042569 -3.886465 117 over-trained
6 dense 4096 4096 1.0000 2.177463 0.017518 -4.479705 276
7 dense 4096 4096 1.0000 1.632014 0.041549 -3.404980 312 over-trained
8 dense 4096 1024 4.0000 2.618894 0.022334 -7.525651 109
9 dense 14336 4096 3.5000 4.312518 0.017595 -8.978763 328
10 dense 14336 4096 3.5000 3.772527 0.027181 -7.519665 138
11 dense 14336 4096 3.5000 3.931944 0.027686 -8.083925 116
12 dense 4096 1024 4.0000 1.903114 0.021639 -4.359000 130 over-trained
13 dense 4096 4096 1.0000 3.482952 0.027183 -8.508045 102
14 dense 4096 4096 1.0000 1.888538 0.031889 -4.051359 317 over-trained
15 dense 4096 1024 4.0000 3.977264 0.034350 -11.843199 60
16 dense 14336 4096 3.5000 3.459268 0.010400 -7.028127 287
17 dense 4096 1024 4.0000 2.547533 0.025037 -6.605645 195
18 dense 4096 4096 1.0000 2.455597 0.022953 -5.787639 360
19 dense 4096 1024 4.0000 3.769848 0.021938 -11.073785 83
20 dense 14336 4096 3.5000 3.448001 0.011382 -6.931971 380
21 dense 14336 4096 3.5000 4.561222 0.015811 -10.967391 263
22 dense 4096 4096 1.0000 3.590978 0.022518 -8.617295 69
23 dense 14336 4096 3.5000 3.779212 0.013363 -7.854068 145
24 dense 14336 4096 3.5000 3.661848 0.008865 -7.415620 208
25 dense 14336 4096 3.5000 4.562392 0.013112 -10.815673 203
26 dense 4096 1024 4.0000 2.264325 0.024380 -5.433680 158
27 dense 4096 4096 1.0000 3.606121 0.017375 -8.866164 99
28 dense 4096 1024 4.0000 3.788218 0.024923 -11.547376 99
29 dense 4096 4096 1.0000 2.266343 0.029777 -4.624250 355
30 dense 14336 4096 3.5000 3.453640 0.009204 -6.847408 287
31 dense 14336 4096 3.5000 4.791925 0.012351 -11.779283 157
32 dense 4096 1024 4.0000 2.371907 0.021928 -5.936052 154
33 dense 14336 4096 3.5000 3.541961 0.018994 -7.232893 179
34 dense 4096 1024 4.0000 4.230963 0.045857 -13.075153 40
35 dense 4096 4096 1.0000 2.346568 0.024307 -5.362782 305
36 dense 4096 4096 1.0000 4.069751 0.049807 -10.399839 41
37 dense 14336 4096 3.5000 4.562499 0.014820 -11.236652 206
38 dense 4096 1024 4.0000 2.505110 0.023242 -6.386699 180
39 dense 4096 4096 1.0000 4.694188 0.029028 -12.391569 44
40 dense 14336 4096 3.5000 3.676031 0.010141 -7.339683 203
41 dense 14336 4096 3.5000 3.800342 0.017593 -7.820028 155
42 dense 4096 4096 1.0000 2.503113 0.023307 -5.968375 313
43 dense 4096 1024 4.0000 4.580972 0.037013 -14.151661 61
44 dense 14336 4096 3.5000 4.265790 0.009654 -8.637205 215
45 dense 14336 4096 3.5000 3.557537 0.009478 -7.054638 243
46 dense 4096 4096 1.0000 3.786904 0.033144 -9.433137 56
47 dense 4096 4096 1.0000 2.436830 0.022040 -5.431690 299
48 dense 14336 4096 3.5000 3.599938 0.022444 -7.318726 200
49 dense 4096 1024 4.0000 2.400270 0.020035 -5.866752 191
50 dense 4096 1024 4.0000 3.884670 0.040457 -11.582129 89
51 dense 14336 4096 3.5000 4.069773 0.025722 -8.323739 66
52 dense 14336 4096 3.5000 3.560399 0.016740 -7.053281 220
53 dense 4096 4096 1.0000 3.974126 0.035104 -9.562791 45
54 dense 4096 4096 1.0000 2.557987 0.025767 -5.883009 220
55 dense 4096 1024 4.0000 4.422815 0.033538 -13.603901 67
56 dense 4096 1024 4.0000 2.442003 0.035191 -6.329586 168
57 dense 14336 4096 3.5000 4.335012 0.017308 -10.425464 224
58 dense 14336 4096 3.5000 4.344943 0.015130 -10.227773 196
59 dense 14336 4096 3.5000 3.552611 0.014023 -6.839512 166
60 dense 4096 1024 4.0000 2.455676 0.025660 -6.253049 161
61 dense 4096 1024 4.0000 4.357362 0.037588 -13.117375 39
62 dense 4096 4096 1.0000 2.552467 0.027999 -5.779556 182
63 dense 4096 4096 1.0000 3.852859 0.036601 -9.857648 62
64 dense 14336 4096 3.5000 3.913985 0.017676 -7.779513 93
65 dense 14336 4096 3.5000 4.250533 0.021540 -10.146900 177
66 dense 4096 1024 4.0000 2.546773 0.020314 -6.730488 161
67 dense 4096 4096 1.0000 3.059012 0.076545 -7.497885 224
68 dense 4096 4096 1.0000 2.651910 0.032967 -5.808549 162
69 dense 4096 1024 4.0000 4.524737 0.030459 -13.615502 47
70 dense 14336 4096 3.5000 3.456398 0.018155 -6.494726 171
71 dense 14336 4096 3.5000 3.902418 0.023001 -7.642034 62
72 dense 4096 4096 1.0000 2.590670 0.033961 -5.927052 210
73 dense 4096 1024 4.0000 4.051532 0.028172 -11.847315 51
74 dense 4096 1024 4.0000 2.493528 0.019975 -6.659156 161
75 dense 4096 4096 1.0000 2.631671 0.082463 -6.394868 386
76 dense 14336 4096 3.5000 3.547667 0.022371 -6.390552 130
77 dense 14336 4096 3.5000 3.278953 0.016117 -5.571624 169
78 dense 14336 4096 3.5000 4.664701 0.033924 -10.901967 67
79 dense 14336 4096 3.5000 4.359637 0.022864 -8.935726 115
80 dense 14336 4096 3.5000 3.306984 0.015410 -5.766857 149
81 dense 14336 4096 3.5000 3.504422 0.018436 -6.460753 128
82 dense 4096 1024 4.0000 2.607449 0.024974 -6.710978 115
83 dense 4096 4096 1.0000 3.761255 0.032235 -9.333848 77
84 dense 4096 4096 1.0000 2.587706 0.031913 -5.448416 203
85 dense 4096 1024 4.0000 4.264689 0.036522 -13.066962 64
86 dense 14336 4096 3.5000 5.004588 0.032739 -11.905549 58
87 dense 14336 4096 3.5000 3.387061 0.016359 -6.006331 138
88 dense 14336 4096 3.5000 3.627385 0.022660 -6.762875 123
89 dense 4096 1024 4.0000 2.857705 0.033148 -7.966353 109
90 dense 4096 4096 1.0000 4.352224 0.037385 -11.001469 61
91 dense 4096 4096 1.0000 2.726420 0.039587 -5.932402 212
92 dense 4096 1024 4.0000 4.640696 0.032876 -14.654337 52
93 dense 14336 4096 3.5000 4.306918 0.019296 -9.665541 111
94 dense 14336 4096 3.5000 3.365953 0.016181 -5.911097 138
95 dense 14336 4096 3.5000 3.535558 0.018600 -6.514901 139
96 dense 4096 1024 4.0000 2.690063 0.027779 -7.253728 121
97 dense 4096 4096 1.0000 3.210383 0.075764 -8.560363 187
98 dense 4096 4096 1.0000 2.745946 0.037431 -6.036613 171
99 dense 4096 1024 4.0000 4.822337 0.035583 -15.042088 36
100 dense 14336 4096 3.5000 4.612170 0.027317 -9.485820 47
101 dense 14336 4096 3.5000 3.407339 0.013395 -5.906861 133
102 dense 14336 4096 3.5000 3.588206 0.021832 -6.581275 129
103 dense 4096 1024 4.0000 2.730392 0.024686 -7.493436 112
104 dense 4096 4096 1.0000 3.877505 0.023527 -9.771112 84
105 dense 4096 4096 1.0000 2.772656 0.034757 -6.021258 152
106 dense 4096 1024 4.0000 4.569337 0.035140 -14.330306 55
107 dense 14336 4096 3.5000 4.938308 0.023841 -11.764437 82
108 dense 14336 4096 3.5000 3.655703 0.018453 -6.544411 87
109 dense 14336 4096 3.5000 3.761819 0.020158 -7.082439 142
110 dense 4096 1024 4.0000 2.990539 0.030977 -8.492327 113
111 dense 4096 4096 1.0000 4.567245 0.024774 -11.858297 50
112 dense 4096 4096 1.0000 3.828254 0.036475 -7.925271 23
113 dense 4096 1024 4.0000 5.364672 0.030650 -17.288942 37
114 dense 14336 4096 3.5000 4.798428 0.015532 -10.782619 101
115 dense 14336 4096 3.5000 3.649221 0.017426 -6.493037 110
116 dense 14336 4096 3.5000 3.918208 0.019776 -7.332674 123
117 dense 4096 1024 4.0000 2.945434 0.028568 -8.286244 113
118 dense 4096 4096 1.0000 4.657990 0.085824 -12.772596 99
119 dense 4096 4096 1.0000 2.957599 0.039487 -6.390502 165
120 dense 4096 1024 4.0000 4.247543 0.104074 -13.553236 168
121 dense 14336 4096 3.5000 3.963035 0.027422 -7.248754 142
122 dense 14336 4096 3.5000 3.675893 0.018643 -6.676007 117
123 dense 14336 4096 3.5000 3.948102 0.015917 -7.600706 121
124 dense 4096 1024 4.0000 2.804325 0.025880 -7.585108 138
125 dense 4096 4096 1.0000 3.093849 0.093064 -7.956238 323
126 dense 4096 4096 1.0000 2.832636 0.027655 -6.154823 182
127 dense 4096 1024 4.0000 4.183370 0.102107 -13.157252 155
128 dense 14336 4096 3.5000 5.204901 0.027694 -12.320735 68
129 dense 14336 4096 3.5000 3.643744 0.015779 -6.415090 186
130 dense 14336 4096 3.5000 3.835792 0.023769 -7.187244 202
131 dense 4096 1024 4.0000 2.876916 0.026259 -7.943902 133
132 dense 4096 4096 1.0000 4.040527 0.041766 -9.785718 101
133 dense 4096 4096 1.0000 2.964783 0.032265 -6.144357 133
134 dense 4096 1024 4.0000 4.738347 0.036301 -15.262266 60
135 dense 14336 4096 3.5000 5.500719 0.023850 -13.308314 75
136 dense 14336 4096 3.5000 3.788311 0.014971 -6.878936 173
137 dense 14336 4096 3.5000 4.057911 0.018303 -7.867506 191
138 dense 4096 1024 4.0000 2.904249 0.026482 -8.033185 147
139 dense 4096 4096 1.0000 4.809506 0.035970 -12.539096 54
140 dense 4096 4096 1.0000 2.865922 0.028495 -6.207238 188
141 dense 4096 1024 4.0000 4.044413 0.087821 -12.558212 165
142 dense 14336 4096 3.5000 5.587532 0.017665 -13.400464 112
143 dense 14336 4096 3.5000 3.984359 0.017771 -7.542779 142
144 dense 14336 4096 3.5000 4.331687 0.020476 -8.858633 162
145 dense 4096 1024 4.0000 2.862048 0.027073 -7.886461 145
146 dense 4096 4096 1.0000 4.833628 0.093872 -13.551103 151
147 dense 4096 4096 1.0000 2.836795 0.025478 -6.115545 196
148 dense 4096 1024 4.0000 5.418748 0.038312 -17.264443 42
149 dense 14336 4096 3.5000 5.665082 0.019399 -13.474947 127
150 dense 14336 4096 3.5000 4.021767 0.017297 -7.728965 203
151 dense 14336 4096 3.5000 4.506619 0.023256 -9.499272 180
152 dense 4096 1024 4.0000 2.946150 0.022485 -7.969663 153
153 dense 4096 4096 1.0000 4.766407 0.097630 -13.461404 182
154 dense 4096 4096 1.0000 2.921269 0.028329 -6.366020 202
155 dense 4096 1024 4.0000 5.665043 0.039256 -18.170723 48
156 dense 4096 1024 4.0000 2.846977 0.022549 -7.705626 161
157 dense 4096 4096 1.0000 3.928633 0.034868 -9.062613 160
158 dense 4096 4096 1.0000 2.793254 0.019268 -6.012539 226
159 dense 4096 1024 4.0000 4.678478 0.020364 -14.137935 60
160 dense 32000 4096 7.8125 2.540780 0.012594 -1.023954 1333
161 dense 14336 4096 3.5000 5.508372 0.015461 -13.159321 148
162 dense 14336 4096 3.5000 4.088446 0.016525 -7.871214 188
163 dense 14336 4096 3.5000 4.543216 0.021575 -9.417306 170
164 dense 14336 4096 3.5000 5.636580 0.017464 -13.679005 134
165 dense 14336 4096 3.5000 4.141042 0.014189 -7.835773 211
166 dense 14336 4096 3.5000 4.498945 0.019979 -9.191790 187
167 dense 4096 1024 4.0000 2.641827 0.025397 -6.510555 195
168 dense 4096 4096 1.0000 3.182613 0.031088 -7.057005 240
169 dense 4096 4096 1.0000 2.587088 0.021499 -5.520694 320
170 dense 4096 1024 4.0000 3.582739 0.023791 -10.198984 89
171 dense 14336 4096 3.5000 5.579236 0.024480 -13.582682 115
172 dense 14336 4096 3.5000 4.193355 0.016067 -7.813632 245
173 dense 14336 4096 3.5000 4.542454 0.020803 -9.225365 218
174 dense 4096 1024 4.0000 2.769781 0.026994 -7.330878 190
175 dense 4096 4096 1.0000 4.227953 0.026118 -10.830021 80
176 dense 4096 4096 1.0000 2.720853 0.024870 -5.916113 303
177 dense 4096 1024 4.0000 3.900889 0.033332 -11.814334 113
178 dense 14336 4096 3.5000 5.500867 0.022525 -13.333621 82
179 dense 14336 4096 3.5000 4.143864 0.012331 -7.657870 277
180 dense 14336 4096 3.5000 4.489940 0.021188 -9.117943 230
181 dense 4096 1024 4.0000 2.567605 0.019586 -6.285827 200
182 dense 4096 4096 1.0000 3.608813 0.017666 -8.945581 139
183 dense 4096 4096 1.0000 2.502648 0.018021 -5.315170 327
184 dense 4096 1024 4.0000 3.770262 0.015535 -11.046560 102
185 dense 14336 4096 3.5000 4.482052 0.046324 -10.630615 307
186 dense 14336 4096 3.5000 4.077554 0.015520 -7.163220 319
187 dense 14336 4096 3.5000 4.349122 0.023450 -8.377735 275
188 dense 4096 1024 4.0000 2.527340 0.021389 -6.138722 207
189 dense 4096 4096 1.0000 3.476790 0.023336 -7.606497 185
190 dense 4096 4096 1.0000 2.526623 0.022384 -5.174276 310
191 dense 4096 1024 4.0000 3.671831 0.022679 -10.291584 76
192 dense 14336 4096 3.5000 4.155028 0.044148 -9.927388 328
193 dense 14336 4096 3.5000 4.004957 0.014718 -6.832835 370
194 dense 14336 4096 3.5000 4.131026 0.023634 -7.720696 333
195 dense 4096 1024 4.0000 2.513946 0.015404 -6.318807 195
196 dense 4096 4096 1.0000 3.262373 0.012906 -7.812919 158
197 dense 4096 4096 1.0000 2.438479 0.018974 -5.034456 321
198 dense 4096 1024 4.0000 3.567803 0.021476 -9.933741 63
199 dense 14336 4096 3.5000 4.123616 0.027382 -9.075101 253
200 dense 14336 4096 3.5000 3.824352 0.014725 -6.252389 412
201 dense 14336 4096 3.5000 3.983702 0.023416 -7.200108 340
202 dense 4096 1024 4.0000 2.588461 0.017454 -6.946087 183
203 dense 4096 4096 1.0000 3.909249 0.017768 -9.587696 93
204 dense 4096 4096 1.0000 2.595344 0.017758 -5.110323 278
205 dense 4096 1024 4.0000 3.576519 0.022720 -10.394674 108
206 dense 14336 4096 3.5000 3.919404 0.016831 -8.426507 202
207 dense 14336 4096 3.5000 3.626669 0.015553 -5.653272 412
208 dense 14336 4096 3.5000 3.814270 0.022999 -6.655269 390
209 dense 4096 1024 4.0000 2.669492 0.021644 -7.018748 179
210 dense 4096 4096 1.0000 3.604436 0.026515 -8.193559 164
211 dense 4096 4096 1.0000 2.573991 0.019568 -5.208369 256
212 dense 4096 1024 4.0000 3.416111 0.021050 -9.273863 134
213 dense 14336 4096 3.5000 3.680616 0.014578 -7.320332 148
214 dense 14336 4096 3.5000 3.484967 0.015964 -5.134664 383
215 dense 14336 4096 3.5000 3.667138 0.023114 -6.178529 395
216 dense 4096 1024 4.0000 2.610263 0.020454 -6.639865 160
217 dense 4096 4096 1.0000 3.066347 0.025614 -6.370299 201
218 dense 4096 4096 1.0000 2.591561 0.015583 -4.505477 238
219 dense 4096 1024 4.0000 3.413148 0.019005 -9.663683 92
220 dense 14336 4096 3.5000 3.046010 0.023057 -5.567714 393
221 dense 14336 4096 3.5000 3.312722 0.024198 -4.872403 375
222 dense 14336 4096 3.5000 3.516450 0.029939 -6.032777 332
223 dense 4096 1024 4.0000 2.495677 0.021230 -6.113855 178
224 dense 4096 4096 1.0000 3.218557 0.024361 -6.684414 147
225 dense 4096 4096 1.0000 2.587624 0.020531 -4.412254 186
226 dense 4096 1024 4.0000 3.275774 0.031062 -8.896318 155
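
The rows above can be sanity-checked programmatically. The sketch below is a minimal illustration, not the tool that produced this table: it parses a few rows copied from the table, confirms that Q equals N / M, and reproduces the warning column under the assumption that a layer is flagged "over-trained" when alpha falls below 2. That threshold is an assumption that happens to match the four flagged rows in this table (ids 5, 7, 12 and 14). The names `LayerRow`, `parse_row` and `expected_warning` are hypothetical helpers introduced only for this example.

```python
from typing import NamedTuple, Optional


class LayerRow(NamedTuple):
    """One row of the details table (hypothetical container for this example)."""
    layer_id: int
    layer_type: str
    N: int
    M: int
    Q: float
    alpha: float
    D: float
    alpha_hat: float
    num_spikes: int
    warning: Optional[str]


def parse_row(line: str) -> LayerRow:
    """Split a whitespace-delimited row; the trailing warning field is optional."""
    parts = line.split()
    warning = parts[9] if len(parts) > 9 else None
    return LayerRow(
        int(parts[0]), parts[1], int(parts[2]), int(parts[3]),
        float(parts[4]), float(parts[5]), float(parts[6]),
        float(parts[7]), int(parts[8]), warning,
    )


def expected_warning(alpha: float, low: float = 2.0) -> Optional[str]:
    """Assumed rule: alpha below `low` is reported as over-trained."""
    return "over-trained" if alpha < low else None


# Three rows copied verbatim from the table above.
sample = """\
5 dense 4096 1024 4.0000 1.596066 0.042569 -3.886465 117 over-trained
6 dense 4096 4096 1.0000 2.177463 0.017518 -4.479705 276
160 dense 32000 4096 7.8125 2.540780 0.012594 -1.023954 1333
"""

rows = [parse_row(line) for line in sample.splitlines() if line.strip()]

for row in rows:
    # Q is the aspect ratio of the weight matrix.
    assert abs(row.Q - row.N / row.M) < 1e-4
    # The warning column should agree with the assumed alpha threshold.
    assert expected_warning(row.alpha) == row.warning
```

To check the whole table, paste all 226 rows into `sample`; the same two assertions then run over every layer.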