Mistral-7B-v0.1


Mistral-7B-v0.1 Model Set Plots



Mistral-7B-v0.1 Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 32000 4096 7.8125 3.453652 0.021347 4.743494 856
2 dense 14336 4096 3.5000 4.065165 0.031473 0.883362 747
3 dense 14336 4096 3.5000 2.381453 0.020189 1.418909 894
4 dense 14336 4096 3.5000 3.239790 0.047676 0.649902 489
5 dense 4096 1024 4.0000 1.452506 0.023921 1.650277 422 over-trained
6 dense 4096 4096 1.0000 3.418772 0.038004 -0.312641 106
7 dense 4096 4096 1.0000 1.425975 0.026688 2.474365 672 over-trained
8 dense 4096 1024 4.0000 4.336253 0.040110 -2.955510 58
9 dense 4096 4096 1.0000 2.966444 0.052971 2.691267 37
10 dense 4096 4096 1.0000 3.622495 0.024081 0.055981 339
11 dense 4096 1024 4.0000 2.961721 0.038766 1.762481 59
12 dense 4096 1024 4.0000 4.516537 0.112309 -4.656512 166
13 dense 14336 4096 3.5000 2.689852 0.019618 2.031800 945
14 dense 14336 4096 3.5000 5.253410 0.023398 0.175921 125
15 dense 14336 4096 3.5000 3.412001 0.058564 0.100180 718
16 dense 14336 4096 3.5000 3.950052 0.030296 2.170732 605
17 dense 4096 1024 4.0000 7.465837 0.109631 -8.422969 109 under-trained
18 dense 4096 4096 1.0000 3.847189 0.026225 1.920217 137
19 dense 4096 4096 1.0000 4.069051 0.018543 -0.234692 145
20 dense 4096 1024 4.0000 5.752737 0.032088 1.153627 58
21 dense 14336 4096 3.5000 9.367025 0.015316 -1.595798 112 under-trained
22 dense 14336 4096 3.5000 4.378603 0.011728 0.190440 527
23 dense 4096 1024 4.0000 5.991279 0.058980 -6.170054 125
24 dense 4096 4096 1.0000 2.917581 0.029092 2.025290 235
25 dense 4096 4096 1.0000 4.015053 0.028547 -0.565645 276
26 dense 4096 1024 4.0000 3.925765 0.045827 1.507907 53
27 dense 14336 4096 3.5000 3.533310 0.020997 1.798453 844
28 dense 14336 4096 3.5000 4.326046 0.054034 0.019480 596
29 dense 14336 4096 3.5000 7.867653 0.073073 -1.257325 284 under-trained
30 dense 14336 4096 3.5000 5.720345 0.019311 0.060182 171
31 dense 14336 4096 3.5000 3.218833 0.025281 2.186844 241
32 dense 14336 4096 3.5000 7.620396 0.027914 -1.549166 275 under-trained
33 dense 4096 1024 4.0000 4.839544 0.063871 0.642546 43
34 dense 4096 4096 1.0000 3.482508 0.041964 -0.946113 296
35 dense 4096 4096 1.0000 3.279083 0.059951 1.681968 171
36 dense 4096 1024 4.0000 5.345091 0.085136 -5.452285 145
37 dense 14336 4096 3.5000 7.531810 0.015163 -1.044341 206 under-trained
38 dense 4096 1024 4.0000 8.355038 0.038937 -9.110954 53 under-trained
39 dense 4096 4096 1.0000 2.966308 0.059193 1.456511 354
40 dense 4096 4096 1.0000 4.353909 0.033211 -1.752739 216
41 dense 14336 4096 3.5000 3.260264 0.014785 2.568199 945
42 dense 14336 4096 3.5000 4.576713 0.022255 -0.276857 446
43 dense 4096 1024 4.0000 2.672633 0.101050 0.340762 322
44 dense 4096 4096 1.0000 3.882287 0.023917 2.079085 112
45 dense 14336 4096 3.5000 4.335738 0.012601 0.506101 292
46 dense 14336 4096 3.5000 6.813491 0.010191 -0.744688 248 under-trained
47 dense 4096 1024 4.0000 5.159509 0.031877 1.141429 50
48 dense 4096 4096 1.0000 4.040646 0.031846 -1.849664 226
49 dense 4096 1024 4.0000 5.495225 0.055012 -5.591256 118
50 dense 14336 4096 3.5000 3.770993 0.008890 2.747745 552
51 dense 14336 4096 3.5000 4.504961 0.025413 0.018129 362
52 dense 4096 1024 4.0000 6.290220 0.072807 -6.366179 113 under-trained
53 dense 4096 4096 1.0000 2.934296 0.062655 1.371796 240
54 dense 14336 4096 3.5000 3.370460 0.017560 2.653654 784
55 dense 4096 1024 4.0000 2.782193 0.072741 0.438617 175
56 dense 14336 4096 3.5000 6.565010 0.023946 -0.619235 192 under-trained
57 dense 4096 4096 1.0000 3.792190 0.042244 -1.330963 247
58 dense 14336 4096 3.5000 4.164940 0.043066 0.332439 479
59 dense 14336 4096 3.5000 6.159954 0.014804 -0.454111 224 under-trained
60 dense 4096 1024 4.0000 3.952086 0.080255 0.665800 112
61 dense 4096 4096 1.0000 5.385848 0.032754 -2.134116 98
62 dense 4096 4096 1.0000 3.424584 0.058981 1.591460 183
63 dense 4096 1024 4.0000 3.632115 0.071168 -3.625067 275
64 dense 14336 4096 3.5000 3.788581 0.011629 2.896575 414
65 dense 4096 1024 4.0000 5.610551 0.063293 -5.520452 82
66 dense 4096 1024 4.0000 4.083817 0.068829 0.194225 143
67 dense 4096 4096 1.0000 3.519288 0.042645 1.227092 163
68 dense 4096 4096 1.0000 3.709412 0.054745 -1.655144 197
69 dense 14336 4096 3.5000 5.017330 0.019779 -0.423187 324
70 dense 14336 4096 3.5000 3.950693 0.047003 0.331326 371
71 dense 14336 4096 3.5000 3.597138 0.013396 2.809712 519
72 dense 14336 4096 3.5000 4.142180 0.032358 0.653613 319
73 dense 14336 4096 3.5000 3.650178 0.020620 2.617467 465
74 dense 14336 4096 3.5000 5.003618 0.030468 -0.573169 247
75 dense 4096 1024 4.0000 3.422553 0.098880 0.502640 166
76 dense 4096 4096 1.0000 4.232669 0.041921 -1.930309 180
77 dense 4096 4096 1.0000 5.112238 0.042896 2.273357 22
78 dense 4096 1024 4.0000 8.013867 0.043838 -8.138253 56 under-trained
79 dense 4096 1024 4.0000 8.390121 0.076161 -7.706020 45 under-trained
80 dense 4096 4096 1.0000 4.287437 0.038897 -1.686889 121
81 dense 14336 4096 3.5000 4.409746 0.033372 0.560758 168
82 dense 14336 4096 3.5000 3.695771 0.023342 2.735422 396
83 dense 14336 4096 3.5000 4.768246 0.031904 -0.415385 275
84 dense 4096 1024 4.0000 2.013414 0.116113 0.099744 426
85 dense 4096 4096 1.0000 2.827092 0.069725 0.952681 320
86 dense 14336 4096 3.5000 4.212177 0.030958 0.692659 237
87 dense 4096 4096 1.0000 3.567448 0.066769 -1.698087 268
88 dense 14336 4096 3.5000 3.813526 0.023168 2.892866 277
89 dense 14336 4096 3.5000 4.912457 0.029302 -0.486270 254
90 dense 4096 1024 4.0000 2.232701 0.118687 0.094196 376
91 dense 4096 4096 1.0000 2.312417 0.078781 0.803702 611
92 dense 4096 1024 4.0000 7.710415 0.093188 -7.968436 68 under-trained
93 dense 4096 1024 4.0000 3.364656 0.078453 -3.118782 321
94 dense 4096 4096 1.0000 2.965849 0.063444 1.241536 277
95 dense 4096 1024 4.0000 4.730274 0.093592 0.777061 61
96 dense 4096 4096 1.0000 3.482480 0.035300 -1.383186 246
97 dense 14336 4096 3.5000 3.891171 0.023005 2.769839 415
98 dense 14336 4096 3.5000 3.571339 0.029463 0.605055 560
99 dense 14336 4096 3.5000 5.035522 0.029905 -0.405505 274
100 dense 14336 4096 3.5000 4.368585 0.008317 0.576080 391
101 dense 14336 4096 3.5000 3.766537 0.019196 3.050174 378
102 dense 14336 4096 3.5000 4.632315 0.025298 -0.336658 398
103 dense 4096 1024 4.0000 2.083136 0.114432 0.096061 426
104 dense 4096 4096 1.0000 3.409219 0.084361 -1.490142 358
105 dense 4096 4096 1.0000 2.286744 0.075756 0.819542 630
106 dense 4096 1024 4.0000 4.443237 0.092594 -4.087350 247
107 dense 4096 1024 4.0000 2.999088 0.092321 0.184673 162
108 dense 4096 1024 4.0000 6.231267 0.113212 -6.448474 215 under-trained
109 dense 4096 4096 1.0000 2.832030 0.050686 0.859085 249
110 dense 4096 4096 1.0000 4.382318 0.023489 -1.628940 201
111 dense 14336 4096 3.5000 4.015400 0.024751 3.327450 296
112 dense 14336 4096 3.5000 4.555730 0.010142 -0.042108 486
113 dense 14336 4096 3.5000 5.592558 0.036287 -0.604474 174
114 dense 14336 4096 3.5000 4.554184 0.013122 0.129308 536
115 dense 14336 4096 3.5000 4.455586 0.018130 4.014289 205
116 dense 14336 4096 3.5000 6.036842 0.026193 -0.493554 211 under-trained
117 dense 4096 1024 4.0000 2.743224 0.095170 0.300732 185
118 dense 4096 4096 1.0000 3.898180 0.031746 -1.022514 147
119 dense 4096 4096 1.0000 2.257951 0.072147 0.891256 708
120 dense 4096 1024 4.0000 15.950829 0.114767 -17.264122 52 under-trained
121 dense 14336 4096 3.5000 4.658224 0.011810 -0.013999 511
122 dense 14336 4096 3.5000 4.817181 0.019087 4.132277 170
123 dense 14336 4096 3.5000 6.017769 0.010647 -0.556226 280 under-trained
124 dense 4096 4096 1.0000 3.461459 0.053868 -1.049307 346
125 dense 4096 4096 1.0000 2.728824 0.074099 1.429254 456
126 dense 4096 1024 4.0000 12.716698 0.098928 -11.658202 40 under-trained
127 dense 4096 1024 4.0000 4.741488 0.086538 0.837691 92
128 dense 14336 4096 3.5000 5.602403 0.019923 -0.196594 457
129 dense 14336 4096 3.5000 4.650981 0.020731 4.261582 217
130 dense 14336 4096 3.5000 6.002625 0.010141 0.030160 314 under-trained
131 dense 4096 1024 4.0000 4.807713 0.119899 0.226031 90
132 dense 4096 4096 1.0000 5.465012 0.037863 -0.988551 123
133 dense 4096 4096 1.0000 2.330275 0.068441 0.973134 601
134 dense 4096 1024 4.0000 7.246114 0.107182 -6.414276 174 under-trained
135 dense 4096 1024 4.0000 5.809264 0.046766 -5.070202 140
136 dense 4096 4096 1.0000 2.785123 0.054369 1.201917 418
137 dense 4096 1024 4.0000 2.997752 0.065101 0.329626 183
138 dense 14336 4096 3.5000 4.771653 0.017485 4.409426 278
139 dense 14336 4096 3.5000 6.680979 0.030687 -0.242522 397 under-trained
140 dense 14336 4096 3.5000 6.504103 0.014222 0.149455 258 under-trained
141 dense 4096 4096 1.0000 3.623508 0.025610 -0.203964 426
142 dense 4096 4096 1.0000 4.532999 0.045114 -0.945727 206
143 dense 14336 4096 3.5000 8.658853 0.035426 -0.873374 273 under-trained
144 dense 14336 4096 3.5000 4.753277 0.013617 4.443002 353
145 dense 14336 4096 3.5000 7.045105 0.020131 0.034106 251 under-trained
146 dense 4096 1024 4.0000 2.507083 0.095205 0.235579 331
147 dense 4096 1024 4.0000 4.928525 0.125795 -4.841174 284
148 dense 4096 4096 1.0000 2.568081 0.060616 1.208047 618
149 dense 4096 1024 4.0000 4.572821 0.056896 -3.849136 233
150 dense 4096 4096 1.0000 3.126071 0.051076 1.559397 372
151 dense 14336 4096 3.5000 4.825974 0.013304 4.684679 343
152 dense 14336 4096 3.5000 8.407626 0.031210 -0.180089 199 under-trained
153 dense 4096 1024 4.0000 3.375858 0.056814 0.248556 184
154 dense 4096 4096 1.0000 6.484951 0.035495 -1.766629 105 under-trained
155 dense 14336 4096 3.5000 8.628307 0.042378 -1.423876 282 under-trained
156 dense 14336 4096 3.5000 5.241843 0.018032 4.765576 264
157 dense 14336 4096 3.5000 8.688255 0.038182 -0.122972 211 under-trained
158 dense 14336 4096 3.5000 7.400242 0.032157 -0.836438 297 under-trained
159 dense 4096 1024 4.0000 3.848752 0.121181 -3.286236 294
160 dense 32000 4096 7.8125 4.584203 0.021245 11.074229 753
161 dense 4096 1024 4.0000 4.219729 0.085736 -0.161546 118
162 dense 4096 4096 1.0000 5.596872 0.030579 -1.130491 136
163 dense 4096 4096 1.0000 4.596440 0.030407 2.186925 55
164 dense 4096 4096 1.0000 4.294613 0.019408 2.264753 52
165 dense 4096 4096 1.0000 4.275867 0.062025 -0.221747 306
166 dense 4096 1024 4.0000 4.876291 0.026095 0.337908 75
167 dense 4096 1024 4.0000 13.569907 0.124125 -11.129812 55 under-trained
168 dense 14336 4096 3.5000 5.314026 0.027914 4.835165 334
169 dense 14336 4096 3.5000 5.975580 0.033901 -0.473650 454
170 dense 14336 4096 3.5000 9.373217 0.045144 -0.494534 202 under-trained
171 dense 14336 4096 3.5000 4.427773 0.035442 0.427996 111
172 dense 14336 4096 3.5000 5.481159 0.033323 5.215676 345
173 dense 14336 4096 3.5000 9.825033 0.052874 -0.576350 219 under-trained
174 dense 4096 1024 4.0000 5.459371 0.036923 -0.022412 37
175 dense 4096 4096 1.0000 5.494573 0.020604 -2.233284 177
176 dense 4096 4096 1.0000 3.513566 0.028636 1.785171 342
177 dense 4096 1024 4.0000 8.181651 0.029209 -6.146068 64 under-trained
178 dense 4096 4096 1.0000 4.490688 0.031715 2.076158 56
179 dense 4096 1024 4.0000 7.339449 0.041932 -5.639296 73 under-trained
180 dense 4096 1024 4.0000 5.120632 0.051605 -0.131346 41
181 dense 4096 4096 1.0000 6.052966 0.026592 -2.001672 81 under-trained
182 dense 14336 4096 3.5000 5.652226 0.038166 5.400175 364
183 dense 14336 4096 3.5000 4.400363 0.030990 0.465758 663
184 dense 14336 4096 3.5000 9.892211 0.058919 -0.094495 237 under-trained
185 dense 4096 1024 4.0000 4.568444 0.101101 -2.880076 221
186 dense 4096 4096 1.0000 4.960086 0.025379 -0.061624 202
187 dense 4096 1024 4.0000 4.901539 0.035109 0.117100 48
188 dense 4096 4096 1.0000 3.460357 0.029249 1.760781 365
189 dense 14336 4096 3.5000 5.681792 0.039536 5.361919 364
190 dense 14336 4096 3.5000 3.906791 0.028974 0.061031 757
191 dense 14336 4096 3.5000 9.927828 0.062160 0.377457 234 under-trained
192 dense 4096 4096 1.0000 4.585753 0.026564 2.289088 62
193 dense 14336 4096 3.5000 3.615591 0.040068 0.208689 997
194 dense 14336 4096 3.5000 5.594393 0.029200 5.350676 380
195 dense 14336 4096 3.5000 9.159689 0.067858 1.649830 283 under-trained
196 dense 4096 1024 4.0000 5.092237 0.026480 0.402357 102
197 dense 4096 4096 1.0000 4.261335 0.011401 0.245223 201
198 dense 4096 1024 4.0000 6.866542 0.034417 -3.825947 94 under-trained
199 dense 4096 4096 1.0000 3.338860 0.030275 1.815721 309
200 dense 4096 1024 4.0000 3.226660 0.062128 -1.306500 364
201 dense 4096 4096 1.0000 5.610492 0.020395 -0.384767 173
202 dense 14336 4096 3.5000 7.466650 0.053668 2.946760 373 under-trained
203 dense 14336 4096 3.5000 4.870738 0.030150 4.636444 585
204 dense 14336 4096 3.5000 4.363918 0.024421 -0.503818 635
205 dense 4096 1024 4.0000 3.796333 0.098892 0.046356 171
206 dense 4096 1024 4.0000 7.277830 0.084446 -4.541960 139 under-trained
207 dense 4096 4096 1.0000 6.301174 0.045851 -0.503135 186 under-trained
208 dense 4096 1024 4.0000 3.803149 0.028616 0.042533 76
209 dense 4096 4096 1.0000 3.226491 0.020777 1.739532 371
210 dense 14336 4096 3.5000 4.323790 0.033763 3.837365 748
211 dense 14336 4096 3.5000 7.321840 0.062352 -0.217687 262 under-trained
212 dense 14336 4096 3.5000 6.007769 0.044302 3.274354 458 under-trained
213 dense 4096 4096 1.0000 3.980580 0.031548 2.342541 44
214 dense 14336 4096 3.5000 6.874627 0.024979 0.868471 163 under-trained
215 dense 14336 4096 3.5000 3.669217 0.023163 3.099351 139
216 dense 14336 4096 3.5000 5.215590 0.043744 4.483207 558
217 dense 4096 1024 4.0000 4.491950 0.046289 -0.012260 42
218 dense 4096 4096 1.0000 4.824039 0.016144 1.374688 233
219 dense 4096 1024 4.0000 3.352536 0.057732 -0.885618 315
220 dense 4096 4096 1.0000 3.680682 0.066349 3.647850 367
221 dense 4096 1024 4.0000 3.806870 0.042079 0.059967 60
222 dense 4096 4096 1.0000 2.630984 0.040904 1.650408 485
223 dense 14336 4096 3.5000 3.478744 0.024005 2.962974 87
224 dense 14336 4096 3.5000 5.770203 0.032857 2.938153 354
225 dense 14336 4096 3.5000 4.154680 0.027501 5.479383 629
226 dense 4096 1024 4.0000 2.814196 0.046181 -0.670767 437
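
The table can be read programmatically. The sketch below is only an illustration, not the tool that produced this page: it parses a few rows copied from the table, checks that Q equals N / M, and re-derives the warning column from alpha. The cut-offs used (alpha below 2 flagged as over-trained, alpha above 6 flagged as under-trained) are an assumption inferred from the rows above, which appear consistent with them. The names `LayerRow`, `parse_row`, and `infer_warning` are hypothetical helpers introduced for this example.

```python
from typing import NamedTuple


class LayerRow(NamedTuple):
    layer_id: int
    layer_type: str
    n: int
    m: int
    q: float
    alpha: float
    d: float
    alpha_hat: float
    num_spikes: int
    warning: str  # empty string when the row carries no warning


def parse_row(line: str) -> LayerRow:
    """Split one whitespace-separated table row into typed fields."""
    parts = line.split()
    # The warning column is optional, so a row has either 9 or 10 fields.
    warning = parts[9] if len(parts) > 9 else ""
    return LayerRow(
        int(parts[0]), parts[1], int(parts[2]), int(parts[3]),
        float(parts[4]), float(parts[5]), float(parts[6]),
        float(parts[7]), int(parts[8]), warning,
    )


def infer_warning(alpha: float, low: float = 2.0, high: float = 6.0) -> str:
    """Re-derive the warning label from alpha.

    The thresholds are assumptions inferred from this table,
    not values confirmed by the page itself.
    """
    if alpha < low:
        return "over-trained"
    if alpha > high:
        return "under-trained"
    return ""


# A handful of rows copied verbatim from the table above.
SAMPLE = [
    "5 dense 4096 1024 4.0000 1.452506 0.023921 1.650277 422 over-trained",
    "6 dense 4096 4096 1.0000 3.418772 0.038004 -0.312641 106",
    "130 dense 14336 4096 3.5000 6.002625 0.010141 0.030160 314 under-trained",
    "160 dense 32000 4096 7.8125 4.584203 0.021245 11.074229 753",
    "169 dense 14336 4096 3.5000 5.975580 0.033901 -0.473650 454",
]

rows = [parse_row(line) for line in SAMPLE]

for row in rows:
    # Q is the aspect ratio of the weight matrix, N / M.
    assert abs(row.q - row.n / row.m) < 1e-4
    label = infer_warning(row.alpha) or "ok"
    print(f"layer {row.layer_id}: alpha={row.alpha:.3f} -> {label}")
```

Rows 130 and 169 sit just on either side of the assumed upper cut-off, which is why they are included: they show how narrowly a layer can fall in or out of the under-trained group under this reading.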