Mistral-7B-Instruct-v0.1


Mistral-7B-Instruct-v0.1 Model Set Plots



Mistral-7B-Instruct-v0.1 Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 32000 4096 7.8125 4.177157 0.012259 -3.749135 409
2 dense 14336 4096 3.5000 4.612069 0.011515 -4.845850 231
3 dense 14336 4096 3.5000 2.663000 0.056834 -2.118550 700
4 dense 14336 4096 3.5000 2.687217 0.063953 -2.315404 732
5 dense 4096 1024 4.0000 1.542214 0.040778 -1.278025 360 over-trained
6 dense 4096 4096 1.0000 2.171973 0.013668 -2.000766 424
7 dense 4096 4096 1.0000 1.672855 0.039901 -0.077019 235 over-trained
8 dense 4096 1024 4.0000 2.377697 0.017625 -3.099050 137
9 dense 14336 4096 3.5000 4.976780 0.020959 -6.227903 302
10 dense 14336 4096 3.5000 2.817269 0.063098 -2.561524 864
11 dense 14336 4096 3.5000 2.796676 0.068391 -2.762693 944
12 dense 4096 1024 4.0000 1.942930 0.036675 -2.218563 162 over-trained
13 dense 4096 4096 1.0000 2.606918 0.014246 -3.255849 374
14 dense 4096 4096 1.0000 2.047868 0.035052 -1.214612 205
15 dense 4096 1024 4.0000 2.943095 0.021591 -3.964865 183
16 dense 4096 4096 1.0000 2.938168 0.017650 -4.415913 265
17 dense 14336 4096 3.5000 5.080226 0.013639 -6.410482 247
18 dense 4096 4096 1.0000 2.984530 0.027797 -3.647198 206
19 dense 4096 1024 4.0000 2.913064 0.019253 -2.614287 139
20 dense 14336 4096 3.5000 4.214622 0.017863 -4.059813 139
21 dense 14336 4096 3.5000 4.263908 0.024742 -4.335311 143
22 dense 4096 1024 4.0000 3.768094 0.040266 -7.925006 109
23 dense 14336 4096 3.5000 5.056935 0.020989 -5.790466 90
24 dense 4096 4096 1.0000 3.529345 0.034633 -5.799860 195
25 dense 4096 1024 4.0000 2.419449 0.027517 -2.666805 179
26 dense 4096 1024 4.0000 4.257996 0.043573 -9.214424 117
27 dense 4096 4096 1.0000 2.601985 0.026392 -2.111602 265
28 dense 14336 4096 3.5000 4.837082 0.018824 -5.517049 112
29 dense 14336 4096 3.5000 5.187278 0.014581 -6.663764 241
30 dense 4096 1024 4.0000 2.802846 0.029281 -3.803653 90
31 dense 14336 4096 3.5000 5.461317 0.017313 -7.546581 251
32 dense 14336 4096 3.5000 4.355019 0.023663 -4.834182 153
33 dense 4096 4096 1.0000 3.384518 0.047379 -5.719165 201
34 dense 4096 4096 1.0000 2.819175 0.029323 -3.158053 196
35 dense 4096 1024 4.0000 4.449592 0.047137 -9.230259 75
36 dense 14336 4096 3.5000 5.158527 0.025719 -6.003100 55
37 dense 14336 4096 3.5000 5.736490 0.012695 -7.748082 184
38 dense 4096 1024 4.0000 2.910532 0.024104 -4.152354 132
39 dense 14336 4096 3.5000 4.734260 0.024067 -5.529093 129
40 dense 4096 4096 1.0000 4.361183 0.042875 -7.634222 127
41 dense 14336 4096 3.5000 5.567482 0.028212 -6.877623 56
42 dense 4096 1024 4.0000 5.651586 0.075907 -12.586818 85
43 dense 4096 4096 1.0000 3.149199 0.031709 -3.735124 158
44 dense 4096 4096 1.0000 2.970371 0.030523 -3.249474 190
45 dense 4096 4096 1.0000 3.999845 0.043014 -6.694516 135
46 dense 14336 4096 3.5000 4.563176 0.024594 -5.382572 159
47 dense 14336 4096 3.5000 5.872145 0.012791 -7.576777 160
48 dense 4096 1024 4.0000 2.657318 0.033239 -3.418456 209
49 dense 14336 4096 3.5000 5.612667 0.032606 -6.988908 52
50 dense 4096 1024 4.0000 4.625699 0.088638 -9.486988 145
51 dense 4096 1024 4.0000 6.695783 0.033565 -14.273389 43 under-trained
52 dense 4096 4096 1.0000 4.908094 0.043501 -8.785204 73
53 dense 4096 4096 1.0000 2.988237 0.049383 -3.301821 277
54 dense 14336 4096 3.5000 4.967781 0.057137 -6.382917 223
55 dense 14336 4096 3.5000 4.051699 0.049575 -4.769616 426
56 dense 14336 4096 3.5000 6.092691 0.015708 -8.053749 135 under-trained
57 dense 4096 1024 4.0000 2.878890 0.039263 -3.846623 109
58 dense 4096 1024 4.0000 2.890783 0.032882 -3.931944 115
59 dense 4096 1024 4.0000 5.863754 0.028998 -11.811689 47
60 dense 4096 4096 1.0000 3.035793 0.043394 -3.168100 187
61 dense 4096 4096 1.0000 4.483597 0.043778 -8.040171 96
62 dense 14336 4096 3.5000 4.721418 0.032536 -5.556826 123
63 dense 14336 4096 3.5000 5.779237 0.026519 -7.364079 51
64 dense 14336 4096 3.5000 6.415464 0.025102 -8.271571 122 under-trained
65 dense 14336 4096 3.5000 6.541695 0.022045 -8.429768 85 under-trained
66 dense 4096 4096 1.0000 5.110517 0.046288 -8.884407 75
67 dense 4096 1024 4.0000 3.149848 0.029961 -5.134707 100
68 dense 4096 1024 4.0000 6.538267 0.040714 -14.271676 35 under-trained
69 dense 4096 4096 1.0000 3.189403 0.041160 -3.995186 207
70 dense 14336 4096 3.5000 4.937985 0.060625 -6.321351 189
71 dense 14336 4096 3.5000 3.882527 0.049703 -4.584042 435
72 dense 14336 4096 3.5000 4.306591 0.061789 -5.214449 295
73 dense 4096 1024 4.0000 3.041818 0.036289 -4.585230 111
74 dense 4096 1024 4.0000 5.804926 0.031157 -12.073632 53
75 dense 14336 4096 3.5000 3.839159 0.048963 -4.264109 391
76 dense 4096 4096 1.0000 5.219934 0.042307 -9.495970 73
77 dense 4096 4096 1.0000 2.873439 0.060235 -3.381892 381
78 dense 14336 4096 3.5000 6.462813 0.015207 -8.783731 86 under-trained
79 dense 4096 4096 1.0000 5.800402 0.034238 -9.977027 42
80 dense 4096 1024 4.0000 3.304741 0.036638 -5.246216 79
81 dense 14336 4096 3.5000 5.985911 0.014235 -8.212040 127
82 dense 14336 4096 3.5000 3.570293 0.056163 -3.958492 566
83 dense 14336 4096 3.5000 3.522910 0.069654 -4.245696 675
84 dense 4096 4096 1.0000 4.020276 0.047686 -4.824447 52
85 dense 4096 1024 4.0000 4.590136 0.108481 -10.640141 181
86 dense 4096 4096 1.0000 4.873868 0.084506 -8.085970 171
87 dense 4096 4096 1.0000 3.500626 0.066803 -4.571146 229
88 dense 14336 4096 3.5000 6.214930 0.030905 -9.011196 89 under-trained
89 dense 14336 4096 3.5000 3.496776 0.064378 -3.921654 667
90 dense 14336 4096 3.5000 5.772664 0.027227 -7.040748 58
91 dense 4096 1024 4.0000 4.053041 0.045742 -6.731407 49
92 dense 4096 1024 4.0000 5.454282 0.106812 -12.698761 142
93 dense 4096 1024 4.0000 3.199614 0.045261 -5.371509 146
94 dense 4096 4096 1.0000 3.841453 0.087918 -6.397902 275
95 dense 14336 4096 3.5000 4.280676 0.069618 -5.152013 369
96 dense 14336 4096 3.5000 3.783820 0.059018 -4.163572 483
97 dense 14336 4096 3.5000 5.510406 0.016042 -7.401818 125
98 dense 4096 1024 4.0000 6.459720 0.096964 -14.847963 98 under-trained
99 dense 4096 4096 1.0000 3.687419 0.047729 -4.485705 134
100 dense 4096 4096 1.0000 3.583143 0.047409 -4.417630 145
101 dense 4096 4096 1.0000 5.296092 0.023652 -8.222126 75
102 dense 14336 4096 3.5000 4.003043 0.058738 -4.228880 410
103 dense 14336 4096 3.5000 5.644066 0.024497 -7.963046 123
104 dense 4096 1024 4.0000 3.606717 0.045604 -6.175371 74
105 dense 14336 4096 3.5000 5.252100 0.019587 -6.136502 74
106 dense 4096 1024 4.0000 5.659511 0.106769 -13.176295 137
107 dense 4096 4096 1.0000 4.413198 0.040904 -5.433138 54
108 dense 4096 1024 4.0000 6.910392 0.115070 -16.569533 108 under-trained
109 dense 4096 4096 1.0000 6.779364 0.033597 -11.599863 61 under-trained
110 dense 14336 4096 3.5000 4.550833 0.065594 -5.605533 309
111 dense 14336 4096 3.5000 4.278459 0.055864 -4.739472 291
112 dense 14336 4096 3.5000 6.098923 0.019156 -8.468406 103 under-trained
113 dense 4096 1024 4.0000 4.025977 0.036565 -6.526504 68
114 dense 4096 1024 4.0000 6.109803 0.107652 -14.361898 124 under-trained
115 dense 14336 4096 3.5000 5.743302 0.018732 -6.934830 62
116 dense 4096 4096 1.0000 4.030467 0.042956 -5.124188 106
117 dense 4096 4096 1.0000 6.629360 0.042837 -10.641320 82 under-trained
118 dense 4096 1024 4.0000 3.879442 0.032655 -6.353647 73
119 dense 14336 4096 3.5000 6.066311 0.020202 -8.498700 94 under-trained
120 dense 14336 4096 3.5000 4.048670 0.062157 -4.425876 432
121 dense 14336 4096 3.5000 5.522053 0.014832 -7.502289 154
122 dense 4096 1024 4.0000 8.258802 0.052426 -19.210866 29 under-trained
123 dense 14336 4096 3.5000 5.597413 0.017654 -6.754997 81
124 dense 4096 4096 1.0000 4.029405 0.034240 -4.513771 78
125 dense 4096 4096 1.0000 6.022698 0.019570 -9.927458 54 under-trained
126 dense 4096 1024 4.0000 3.651351 0.026739 -5.483350 80
127 dense 14336 4096 3.5000 4.203422 0.064080 -4.476612 452
128 dense 14336 4096 3.5000 5.213909 0.023924 -5.276854 90
129 dense 14336 4096 3.5000 6.125069 0.015259 -8.843103 114 under-trained
130 dense 4096 4096 1.0000 4.273602 0.034246 -4.780746 49
131 dense 4096 4096 1.0000 5.292689 0.025149 -9.098347 95
132 dense 4096 1024 4.0000 3.589276 0.034178 -5.739828 120
133 dense 4096 1024 4.0000 6.670422 0.059195 -15.471622 55 under-trained
134 dense 14336 4096 3.5000 5.335802 0.021862 -6.305673 111
135 dense 14336 4096 3.5000 5.390832 0.019663 -6.460445 135
136 dense 14336 4096 3.5000 6.468054 0.019321 -9.192529 180 under-trained
137 dense 4096 1024 4.0000 5.922777 0.097529 -13.303531 112
138 dense 4096 1024 4.0000 3.730478 0.029494 -5.854632 93
139 dense 14336 4096 3.5000 5.086188 0.018261 -5.009838 118
140 dense 4096 4096 1.0000 3.710071 0.052162 -4.293286 189
141 dense 4096 4096 1.0000 5.610549 0.034976 -9.666071 84
142 dense 14336 4096 3.5000 5.388687 0.021421 -5.487280 131
143 dense 14336 4096 3.5000 6.717315 0.020924 -9.283990 214 under-trained
144 dense 4096 1024 4.0000 3.611821 0.022265 -5.696851 96
145 dense 4096 4096 1.0000 5.332957 0.022528 -8.925430 96
146 dense 4096 4096 1.0000 3.879725 0.033617 -4.549600 103
147 dense 14336 4096 3.5000 5.918640 0.020526 -7.661623 102
148 dense 4096 1024 4.0000 7.265642 0.040568 -16.038889 40 under-trained
149 dense 14336 4096 3.5000 5.587241 0.019981 -5.484506 115
150 dense 14336 4096 3.5000 6.855736 0.019758 -9.155978 238 under-trained
151 dense 14336 4096 3.5000 6.072684 0.018358 -7.546484 130 under-trained
152 dense 4096 1024 4.0000 3.700024 0.019035 -5.740414 110
153 dense 4096 1024 4.0000 7.844663 0.034302 -16.976805 36 under-trained
154 dense 4096 4096 1.0000 3.863428 0.034320 -4.529142 147
155 dense 4096 4096 1.0000 5.874178 0.036981 -9.806572 109
156 dense 4096 1024 4.0000 3.616459 0.020486 -5.896235 117
157 dense 4096 1024 4.0000 5.761651 0.041967 -12.298245 65
158 dense 4096 4096 1.0000 3.718664 0.025263 -4.280025 154
159 dense 4096 4096 1.0000 4.670316 0.023219 -7.541554 157
160 dense 32000 4096 7.8125 3.551029 0.020486 1.977653 921
161 dense 14336 4096 3.5000 6.813025 0.014354 -8.946463 200 under-trained
162 dense 14336 4096 3.5000 5.583195 0.017797 -5.928029 133
163 dense 14336 4096 3.5000 5.968308 0.021762 -7.340988 109
164 dense 14336 4096 3.5000 6.621385 0.013749 -8.544071 195 under-trained
165 dense 14336 4096 3.5000 5.439013 0.017991 -5.746452 193
166 dense 14336 4096 3.5000 5.823464 0.017094 -7.132495 175
167 dense 4096 1024 4.0000 3.444972 0.017162 -5.399339 135
168 dense 4096 4096 1.0000 4.420530 0.023132 -6.971146 153
169 dense 4096 4096 1.0000 3.596863 0.024488 -4.343360 170
170 dense 4096 1024 4.0000 4.801187 0.023650 -9.856588 71
171 dense 14336 4096 3.5000 6.140395 0.012851 -7.732334 189 under-trained
172 dense 14336 4096 3.5000 5.587528 0.017882 -5.868976 198
173 dense 14336 4096 3.5000 5.757164 0.019982 -6.851183 211
174 dense 4096 1024 4.0000 3.425845 0.023793 -5.559778 144
175 dense 4096 4096 1.0000 4.467873 0.023143 -6.692573 146
176 dense 4096 4096 1.0000 3.499592 0.022992 -4.483253 199
177 dense 4096 1024 4.0000 4.641585 0.047641 -9.865183 107
178 dense 14336 4096 3.5000 5.511128 0.017886 -6.764376 275
179 dense 14336 4096 3.5000 5.622801 0.020531 -5.879127 177
180 dense 14336 4096 3.5000 5.748471 0.019392 -6.764202 234
181 dense 4096 1024 4.0000 3.165435 0.016298 -4.725016 147
182 dense 4096 4096 1.0000 4.093010 0.020685 -5.654925 160
183 dense 4096 4096 1.0000 3.202533 0.017328 -3.865524 216
184 dense 4096 1024 4.0000 4.733941 0.028688 -9.710080 71
185 dense 14336 4096 3.5000 5.077879 0.020808 -6.090038 276
186 dense 14336 4096 3.5000 5.484108 0.019419 -5.546530 235
187 dense 14336 4096 3.5000 5.643645 0.025335 -6.689638 243
188 dense 4096 1024 4.0000 3.235005 0.022751 -5.167254 146
189 dense 4096 4096 1.0000 3.919611 0.023088 -5.499852 215
190 dense 4096 4096 1.0000 3.337977 0.020038 -3.869241 157
191 dense 4096 1024 4.0000 4.479108 0.039172 -9.207070 88
192 dense 14336 4096 3.5000 4.695458 0.020361 -5.107541 350
193 dense 14336 4096 3.5000 5.088336 0.024015 -4.692186 334
194 dense 14336 4096 3.5000 5.335235 0.027865 -6.248519 295
195 dense 4096 1024 4.0000 3.145066 0.016505 -4.734409 129
196 dense 4096 4096 1.0000 3.600949 0.017327 -4.715834 217
197 dense 4096 4096 1.0000 3.142532 0.016166 -3.461133 205
198 dense 4096 1024 4.0000 4.117184 0.030500 -7.734162 73
199 dense 14336 4096 3.5000 4.877753 0.014128 -5.371757 309
200 dense 14336 4096 3.5000 4.806522 0.023304 -3.794448 367
201 dense 14336 4096 3.5000 5.039186 0.026591 -5.592293 334
202 dense 4096 1024 4.0000 3.210359 0.025058 -5.178729 144
203 dense 4096 4096 1.0000 3.767712 0.023181 -4.775271 195
204 dense 4096 4096 1.0000 3.399962 0.017615 -3.394928 167
205 dense 4096 1024 4.0000 4.384362 0.024143 -8.141528 95
206 dense 14336 4096 3.5000 4.677241 0.019330 -5.230111 310
207 dense 14336 4096 3.5000 4.564665 0.023660 -2.765675 370
208 dense 14336 4096 3.5000 4.763687 0.025832 -5.208917 375
209 dense 4096 1024 4.0000 3.386104 0.015814 -5.239175 112
210 dense 4096 4096 1.0000 3.665512 0.030830 -4.363615 244
211 dense 4096 4096 1.0000 3.293236 0.017409 -3.307497 176
212 dense 4096 1024 4.0000 3.986018 0.019462 -7.660615 142
213 dense 14336 4096 3.5000 4.228320 0.022313 -4.653216 394
214 dense 14336 4096 3.5000 4.382216 0.019726 -1.935589 389
215 dense 14336 4096 3.5000 4.617360 0.026497 -4.706417 370
216 dense 4096 1024 4.0000 3.324960 0.022030 -5.211592 139
217 dense 4096 4096 1.0000 3.201048 0.019408 -3.307935 219
218 dense 4096 4096 1.0000 3.360872 0.012564 -2.866209 155
219 dense 4096 1024 4.0000 3.710214 0.013428 -6.830825 129
220 dense 14336 4096 3.5000 3.792722 0.023439 -4.379633 384
221 dense 14336 4096 3.5000 4.144129 0.026828 -0.687487 339
222 dense 14336 4096 3.5000 4.400037 0.026049 -3.813555 316
223 dense 4096 1024 4.0000 3.121175 0.023252 -4.419103 141
224 dense 4096 4096 1.0000 3.312122 0.020897 -3.406388 191
225 dense 4096 4096 1.0000 3.298996 0.011616 -2.687362 168
226 dense 4096 1024 4.0000 3.601490 0.030986 -5.935367 158
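
Reading the table: Q is the aspect ratio N / M of each layer's weight matrix, and the warning column appears to follow alpha alone. In the rows above, every layer with alpha below 2 is marked over-trained and every layer with alpha above 6 is marked under-trained. The sketch below re-derives both columns from a few rows copied from the table. The cutoffs 2.0 and 6.0 are an assumption inferred from this table rather than something stated on the page, and the helper names (parse_row, expected_warning) are hypothetical.

```python
# Sample rows copied verbatim from the table above (ids 5, 6, 51, 160).
SAMPLE = """\
5 dense 4096 1024 4.0000 1.542214 0.040778 -1.278025 360 over-trained
6 dense 4096 4096 1.0000 2.171973 0.013668 -2.000766 424
51 dense 4096 1024 4.0000 6.695783 0.033565 -14.273389 43 under-trained
160 dense 32000 4096 7.8125 3.551029 0.020486 1.977653 921
"""

# Assumed cutoffs, inferred from the rows of this table (not stated in the source).
OVER_TRAINED_BELOW = 2.0
UNDER_TRAINED_ABOVE = 6.0


def parse_row(line):
    """Split one whitespace-separated table row into a dict of typed fields."""
    parts = line.split()
    return {
        "id": int(parts[0]),
        "layer_type": parts[1],
        "N": int(parts[2]),
        "M": int(parts[3]),
        "Q": float(parts[4]),
        "alpha": float(parts[5]),
        "D": float(parts[6]),
        "alpha_hat": float(parts[7]),
        "num_spikes": int(parts[8]),
        # The warning column is empty for most rows.
        "warning": parts[9] if len(parts) > 9 else "",
    }


def expected_warning(alpha):
    """Return the warning implied by alpha under the assumed cutoffs."""
    if alpha < OVER_TRAINED_BELOW:
        return "over-trained"
    if alpha > UNDER_TRAINED_ABOVE:
        return "under-trained"
    return ""


rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]

for row in rows:
    q = row["N"] / row["M"]  # aspect ratio, should match the Q column
    flag = expected_warning(row["alpha"])
    print(row["id"], f"Q={q:.4f}", f"alpha={row['alpha']:.3f}", flag or "-")
```

The same two functions can be run over the full table to list the flagged layers or to check that the Q column is consistent with N and M.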