Hermes-3-Llama-3.1-8B




Hermes-3-Llama-3.1-8B Model Set Plots


Hermes Compared to Base Model Plots



Hermes-3-Llama-3.1-8B Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 14336 4096 3.5 5.295381 0.023768 -8.961477 211
2 dense 14336 4096 3.5 3.749053 0.015799 -5.177400 161
3 dense 14336 4096 3.5 3.850816 0.018723 -5.578046 130
4 dense 4096 1024 4.0 1.906396 0.022809 -1.806081 132 over-trained
5 dense 4096 4096 1.0 2.885221 0.018861 -5.092109 117
6 dense 4096 4096 1.0 2.027878 0.025635 -1.609230 190
7 dense 4096 1024 4.0 3.049897 0.018744 -7.406732 115
8 dense 14336 4096 3.5 4.469266 0.026553 -5.661319 416
9 dense 14336 4096 3.5 4.035014 0.027664 -4.807630 195
10 dense 14336 4096 3.5 4.137212 0.029561 -5.241747 167
11 dense 4096 1024 4.0 2.011462 0.024780 -3.361220 151
12 dense 4096 4096 1.0 3.797735 0.031956 -6.674923 38
13 dense 4096 4096 1.0 1.968793 0.028600 -2.859889 418 over-trained
14 dense 4096 1024 4.0 4.811203 0.030059 -11.900788 62
15 dense 14336 4096 3.5 4.823202 0.020209 -7.260412 237
16 dense 14336 4096 3.5 3.613068 0.016578 -4.270818 467
17 dense 14336 4096 3.5 3.560802 0.014222 -4.416297 392
18 dense 4096 1024 4.0 2.524005 0.023186 -4.833287 175
19 dense 4096 4096 1.0 3.444337 0.028093 -6.244886 111
20 dense 4096 4096 1.0 2.417484 0.023682 -3.658886 383
21 dense 4096 1024 4.0 4.313898 0.041185 -10.481958 40
22 dense 14336 4096 3.5 4.799795 0.014479 -7.400042 211
23 dense 14336 4096 3.5 3.411124 0.006304 -3.818208 316
24 dense 14336 4096 3.5 3.536764 0.014298 -4.297893 140
25 dense 4096 1024 4.0 2.462758 0.022063 -4.588002 204
26 dense 4096 4096 1.0 4.059398 0.042848 -7.696440 68
27 dense 4096 4096 1.0 2.341629 0.020253 -3.622516 393
28 dense 4096 1024 4.0 4.386874 0.043476 -11.118332 63
29 dense 14336 4096 3.5 4.788579 0.017086 -7.357868 164
30 dense 14336 4096 3.5 3.402233 0.006695 -3.748934 284
31 dense 14336 4096 3.5 3.644814 0.013909 -4.653372 152
32 dense 4096 1024 4.0 2.281594 0.020035 -4.221965 196
33 dense 4096 4096 1.0 3.689105 0.039121 -6.597667 69
34 dense 4096 4096 1.0 2.234375 0.020122 -3.355302 345
35 dense 4096 1024 4.0 4.089047 0.040213 -9.675225 48
36 dense 14336 4096 3.5 5.373904 0.013314 -8.429750 91
37 dense 14336 4096 3.5 3.337372 0.012026 -3.724101 200
38 dense 14336 4096 3.5 3.791230 0.021328 -4.824885 93
39 dense 4096 1024 4.0 2.466029 0.020251 -4.668459 133
40 dense 4096 4096 1.0 3.051712 0.092129 -5.921456 289
41 dense 4096 4096 1.0 2.403789 0.025693 -3.521654 254
42 dense 4096 1024 4.0 3.523286 0.085276 -8.610021 174
43 dense 14336 4096 3.5 5.398245 0.019440 -8.583476 80
44 dense 14336 4096 3.5 3.221460 0.010875 -3.545466 247
45 dense 14336 4096 3.5 3.832565 0.021802 -4.929305 76
46 dense 4096 1024 4.0 2.444893 0.025000 -4.679215 175
47 dense 4096 4096 1.0 3.784717 0.082675 -7.536325 154
48 dense 4096 4096 1.0 2.436747 0.025694 -3.540902 216
49 dense 4096 1024 4.0 4.694763 0.040155 -11.650903 55
50 dense 14336 4096 3.5 4.959810 0.018475 -7.731927 103
51 dense 14336 4096 3.5 3.231561 0.016167 -3.378356 138
52 dense 14336 4096 3.5 3.795106 0.024363 -4.742485 81
53 dense 4096 1024 4.0 2.537985 0.034051 -4.767893 107
54 dense 4096 4096 1.0 2.926605 0.082933 -5.779194 265
55 dense 4096 4096 1.0 2.482558 0.031690 -3.626147 206
56 dense 4096 1024 4.0 4.077308 0.087736 -10.355821 135
57 dense 14336 4096 3.5 4.673904 0.018027 -7.395448 111
58 dense 14336 4096 3.5 3.216751 0.018513 -3.220733 85
59 dense 14336 4096 3.5 3.625075 0.016573 -4.327149 98
60 dense 4096 1024 4.0 2.518867 0.032899 -4.944197 134
61 dense 4096 4096 1.0 3.284561 0.091187 -6.413781 219
62 dense 4096 4096 1.0 2.467733 0.037801 -3.666043 253
63 dense 4096 1024 4.0 5.616464 0.043787 -13.971823 25
64 dense 14336 4096 3.5 3.508327 0.021308 -4.137800 150
65 dense 4096 4096 1.0 4.183744 0.029471 -7.080364 61
66 dense 4096 1024 4.0 4.709620 0.022489 -11.465989 52
67 dense 14336 4096 3.5 3.099162 0.016386 -2.975167 151
68 dense 14336 4096 3.5 4.781310 0.020075 -7.446955 109
69 dense 4096 1024 4.0 2.663426 0.032408 -5.080724 122
70 dense 4096 4096 1.0 2.439581 0.039395 -3.455519 279
71 dense 14336 4096 3.5 3.066240 0.020192 -2.865299 116
72 dense 14336 4096 3.5 4.877982 0.034690 -7.942179 78
73 dense 4096 4096 1.0 2.513236 0.048455 -3.576271 248
74 dense 14336 4096 3.5 3.541649 0.019071 -4.129910 98
75 dense 4096 1024 4.0 2.637184 0.035337 -5.219599 103
76 dense 4096 4096 1.0 5.161923 0.036922 -10.115914 42
77 dense 4096 1024 4.0 5.256280 0.034913 -12.788531 38
78 dense 14336 4096 3.5 3.423771 0.019824 -3.824997 96
79 dense 14336 4096 3.5 4.639916 0.046407 -7.576742 104
80 dense 14336 4096 3.5 2.854320 0.020820 -2.565286 304
81 dense 4096 1024 4.0 3.192914 0.107914 -7.760805 238
82 dense 4096 1024 4.0 2.618578 0.031802 -5.086369 118
83 dense 4096 4096 1.0 4.462091 0.043962 -8.154761 51
84 dense 4096 4096 1.0 2.449249 0.058202 -3.520593 334
85 dense 4096 1024 4.0 2.676222 0.039914 -4.982531 83
86 dense 14336 4096 3.5 4.796209 0.027068 -6.964505 45
87 dense 14336 4096 3.5 2.950898 0.016525 -2.722372 234
88 dense 14336 4096 3.5 3.428180 0.017997 -3.879729 104
89 dense 4096 4096 1.0 2.554714 0.052874 -3.625638 199
90 dense 4096 4096 1.0 1.660019 0.094809 -3.270242 1633 over-trained
91 dense 4096 1024 4.0 5.460925 0.044302 -13.902647 41
92 dense 4096 4096 1.0 2.585932 0.100135 -5.157975 416
93 dense 14336 4096 3.5 4.472110 0.033231 -6.480888 103
94 dense 14336 4096 3.5 3.113830 0.020711 -2.742413 106
95 dense 14336 4096 3.5 3.472100 0.014152 -3.761916 97
96 dense 4096 1024 4.0 2.920156 0.038613 -5.901863 84
97 dense 4096 1024 4.0 5.372351 0.083885 -13.945572 66
98 dense 4096 4096 1.0 2.371923 0.065077 -3.234848 375
99 dense 4096 4096 1.0 3.036737 0.052283 -4.190779 98
100 dense 14336 4096 3.5 4.438798 0.030532 -6.307525 76
101 dense 14336 4096 3.5 3.081226 0.018766 -2.802701 205
102 dense 14336 4096 3.5 3.505570 0.016455 -3.927277 123
103 dense 4096 1024 4.0 2.866710 0.030385 -5.230228 80
104 dense 4096 4096 1.0 4.665673 0.095209 -10.027520 108
105 dense 4096 1024 4.0 4.162923 0.097391 -10.603458 150
106 dense 4096 1024 4.0 5.380621 0.053809 -13.775430 59
107 dense 14336 4096 3.5 4.488730 0.044281 -6.744297 103
108 dense 14336 4096 3.5 3.342923 0.020390 -3.089751 89
109 dense 14336 4096 3.5 3.792009 0.015362 -4.373311 97
110 dense 4096 1024 4.0 2.702514 0.031906 -5.254390 118
111 dense 4096 4096 1.0 4.621530 0.032428 -8.607382 74
112 dense 4096 4096 1.0 2.656258 0.045121 -3.417236 177
113 dense 4096 1024 4.0 2.706005 0.033164 -4.991454 112
114 dense 14336 4096 3.5 5.055124 0.022957 -7.301909 65
115 dense 14336 4096 3.5 3.241772 0.015897 -2.987007 96
116 dense 14336 4096 3.5 3.558214 0.017217 -4.071745 183
117 dense 4096 4096 1.0 2.631035 0.040533 -3.481965 190
118 dense 4096 4096 1.0 4.834619 0.036133 -8.609447 43
119 dense 4096 1024 4.0 4.836959 0.104641 -12.055250 114
120 dense 14336 4096 3.5 4.607659 0.055123 -6.789639 182
121 dense 4096 4096 1.0 3.829558 0.031693 -5.553322 103
122 dense 14336 4096 3.5 3.305722 0.016057 -2.980220 95
123 dense 14336 4096 3.5 3.661368 0.015076 -4.040036 122
124 dense 4096 1024 4.0 2.689401 0.027887 -5.131115 120
125 dense 4096 1024 4.0 5.134865 0.036760 -12.419101 36
126 dense 4096 4096 1.0 2.520513 0.032989 -3.194654 222
127 dense 14336 4096 3.5 3.266457 0.012814 -3.080145 158
128 dense 14336 4096 3.5 4.128108 0.058352 -5.794711 277
129 dense 4096 4096 1.0 2.671257 0.093106 -4.869029 492
130 dense 4096 1024 4.0 2.677008 0.026771 -4.969790 111
131 dense 4096 1024 4.0 4.801226 0.050415 -11.889345 61
132 dense 4096 4096 1.0 2.634032 0.038641 -3.516497 204
133 dense 14336 4096 3.5 3.652908 0.011240 -4.199337 124
134 dense 14336 4096 3.5 3.809864 0.052521 -5.445156 373
135 dense 14336 4096 3.5 3.274169 0.012724 -2.961226 125
136 dense 14336 4096 3.5 3.565761 0.013941 -3.862885 162
137 dense 4096 4096 1.0 2.450398 0.030958 -2.983069 243
138 dense 4096 4096 1.0 3.609998 0.025653 -5.623656 94
139 dense 4096 1024 4.0 4.078540 0.046096 -9.741083 75
140 dense 4096 1024 4.0 2.601951 0.023483 -4.776648 113
141 dense 4096 4096 1.0 3.404036 0.085196 -5.993436 240
142 dense 4096 1024 4.0 2.624618 0.021682 -4.859868 136
143 dense 4096 4096 1.0 2.520515 0.036427 -3.038705 228
144 dense 14336 4096 3.5 3.341182 0.013666 -3.066720 130
145 dense 4096 1024 4.0 4.577442 0.032080 -10.597634 45
146 dense 14336 4096 3.5 4.203556 0.050745 -6.096876 228
147 dense 14336 4096 3.5 3.655248 0.014880 -4.030108 121
148 dense 14336 4096 3.5 4.447389 0.029775 -6.599878 169
149 dense 14336 4096 3.5 3.316091 0.013858 -3.042093 125
150 dense 14336 4096 3.5 3.655463 0.014130 -4.004596 125
151 dense 4096 1024 4.0 2.600749 0.024528 -4.269374 179
152 dense 4096 4096 1.0 5.160411 0.033519 -9.821305 44
153 dense 4096 4096 1.0 2.495013 0.028418 -2.917683 244
154 dense 4096 1024 4.0 4.304993 0.026859 -9.969208 74
155 dense 14336 4096 3.5 4.639598 0.020459 -6.887632 181
156 dense 14336 4096 3.5 3.338361 0.019322 -3.162353 184
157 dense 14336 4096 3.5 3.697756 0.017387 -4.108070 145
158 dense 4096 1024 4.0 2.717831 0.030230 -5.100702 121
159 dense 4096 4096 1.0 3.130608 0.090630 -5.825553 328
160 dense 4096 4096 1.0 2.468472 0.041645 -3.089152 304
161 dense 4096 1024 4.0 4.637516 0.060256 -11.140906 42
162 dense 14336 4096 3.5 4.657364 0.017644 -6.967848 187
163 dense 14336 4096 3.5 3.311814 0.018769 -3.102706 221
164 dense 14336 4096 3.5 3.643139 0.016529 -4.020860 149
165 dense 4096 1024 4.0 2.536772 0.026238 -4.820742 164
166 dense 4096 4096 1.0 4.435510 0.029606 -7.576730 83
167 dense 4096 4096 1.0 2.479078 0.026539 -3.210805 231
168 dense 4096 1024 4.0 4.111923 0.028323 -9.235118 50
169 dense 14336 4096 3.5 4.670594 0.018152 -6.880405 170
170 dense 14336 4096 3.5 3.371728 0.015059 -3.230821 197
171 dense 14336 4096 3.5 3.642431 0.015860 -3.984515 146
172 dense 4096 1024 4.0 2.633770 0.021177 -5.031448 138
173 dense 4096 4096 1.0 4.199692 0.034726 -7.392704 60
174 dense 4096 4096 1.0 2.497205 0.031163 -3.263328 276
175 dense 4096 1024 4.0 3.911139 0.037202 -8.863387 63
176 dense 14336 4096 3.5 4.533842 0.017820 -6.664560 168
177 dense 14336 4096 3.5 3.460404 0.016824 -3.209949 122
178 dense 14336 4096 3.5 3.667961 0.017027 -3.909915 124
179 dense 4096 1024 4.0 2.559537 0.019599 -4.519321 158
180 dense 4096 4096 1.0 3.899594 0.020284 -6.227884 114
181 dense 4096 4096 1.0 2.358182 0.023039 -3.147862 337
182 dense 4096 1024 4.0 3.981119 0.023617 -8.362549 60
183 dense 14336 4096 3.5 4.370814 0.022979 -6.277787 165
184 dense 14336 4096 3.5 3.338333 0.013330 -3.004651 251
185 dense 14336 4096 3.5 3.607203 0.015150 -3.753975 141
186 dense 4096 1024 4.0 2.524459 0.019156 -4.498052 126
187 dense 4096 4096 1.0 3.550164 0.021739 -5.239901 130
188 dense 4096 4096 1.0 2.399148 0.022280 -2.858006 239
189 dense 4096 1024 4.0 3.645461 0.020157 -8.126813 80
190 dense 14336 4096 3.5 4.046305 0.024385 -6.016987 243
191 dense 14336 4096 3.5 3.336334 0.008579 -2.859515 258
192 dense 14336 4096 3.5 3.547832 0.014206 -3.512564 175
193 dense 4096 1024 4.0 2.545339 0.018230 -4.463473 150
194 dense 4096 4096 1.0 3.553313 0.030251 -5.439368 135
195 dense 4096 4096 1.0 2.377979 0.020152 -3.035110 337
196 dense 4096 1024 4.0 3.770207 0.021635 -8.154721 83
197 dense 14336 4096 3.5 4.030360 0.025017 -5.939592 189
198 dense 14336 4096 3.5 3.307970 0.012216 -2.696810 230
199 dense 14336 4096 3.5 3.506419 0.016220 -3.382304 231
200 dense 4096 1024 4.0 2.476122 0.023864 -4.639355 171
201 dense 4096 4096 1.0 4.534254 0.026977 -8.433124 63
202 dense 4096 4096 1.0 2.483038 0.023241 -3.021310 231
203 dense 4096 1024 4.0 4.292503 0.025505 -9.677990 62
204 dense 14336 4096 3.5 4.121265 0.019575 -5.664195 75
205 dense 14336 4096 3.5 3.200874 0.010310 -2.444110 196
206 dense 14336 4096 3.5 3.374703 0.012154 -3.121649 219
207 dense 4096 1024 4.0 2.381858 0.022804 -4.043409 182
208 dense 4096 4096 1.0 3.399582 0.014635 -4.781116 126
209 dense 4096 4096 1.0 2.262158 0.018267 -2.617378 330
210 dense 4096 1024 4.0 3.259577 0.015642 -6.777980 118
211 dense 14336 4096 3.5 3.971001 0.027708 -5.203513 61
212 dense 14336 4096 3.5 3.129455 0.011711 -2.315122 343
213 dense 14336 4096 3.5 3.326523 0.015747 -2.898843 329
214 dense 4096 1024 4.0 2.439302 0.022294 -4.241211 138
215 dense 4096 4096 1.0 3.244950 0.029639 -4.015278 148
216 dense 4096 4096 1.0 2.424478 0.022486 -2.608761 241
217 dense 4096 1024 4.0 3.454955 0.021956 -7.384321 95
218 dense 14336 4096 3.5 3.047675 0.015291 -2.311090 258
219 dense 14336 4096 3.5 3.188403 0.017626 -2.762698 289
220 dense 4096 1024 4.0 2.226761 0.017711 -3.397122 174
221 dense 4096 4096 1.0 3.166066 0.021212 -4.566527 152
222 dense 4096 4096 1.0 2.216537 0.018061 -1.952136 289
223 dense 4096 1024 4.0 3.257628 0.021969 -6.773057 120
224 dense 14336 4096 3.5 2.885491 0.011159 -2.871362 401
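Two regularities are visible in the rows above: Q equals N / M in every row, and the three rows marked over-trained (ids 4, 13, 90) are exactly the ones with alpha below 2. The following minimal sketch parses rows in this whitespace-delimited layout and checks both. The 2.0 cutoff is an assumption inferred from the table, not a value stated on this page, and the names LayerRow, parse_row, expected_warning, and check_row are hypothetical helpers introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LayerRow:
    layer_id: int
    layer_type: str
    n: int
    m: int
    q: float
    alpha: float
    d: float
    alpha_hat: float
    num_spikes: int
    warning: Optional[str]


def parse_row(line: str) -> LayerRow:
    """Parse one table row; the trailing warning column is optional."""
    parts = line.split()
    if len(parts) not in (9, 10):
        raise ValueError(f"unexpected column count: {len(parts)}")
    warning = parts[9] if len(parts) == 10 else None
    return LayerRow(
        int(parts[0]), parts[1], int(parts[2]), int(parts[3]),
        float(parts[4]), float(parts[5]), float(parts[6]),
        float(parts[7]), int(parts[8]), warning,
    )


# Assumption: inferred from the flagged rows, not stated in the source page.
ALPHA_OVER_TRAINED = 2.0


def expected_warning(alpha: float) -> Optional[str]:
    """Return the warning the assumed cutoff would produce for this alpha."""
    return "over-trained" if alpha < ALPHA_OVER_TRAINED else None


def check_row(row: LayerRow) -> bool:
    """True if Q matches N / M and the warning matches the assumed cutoff."""
    q_ok = abs(row.q - row.n / row.m) < 1e-9
    return q_ok and row.warning == expected_warning(row.alpha)


# Three rows copied verbatim from the table above.
sample = [
    "3 dense 14336 4096 3.5 3.850816 0.018723 -5.578046 130",
    "4 dense 4096 1024 4.0 1.906396 0.022809 -1.806081 132 over-trained",
    "90 dense 4096 4096 1.0 1.660019 0.094809 -3.270242 1633 over-trained",
]

rows = [parse_row(line) for line in sample]
flagged = [r.layer_id for r in rows if r.warning]
consistent = all(check_row(r) for r in rows)
print(flagged, consistent)
```

Running the same check over all 224 rows is a quick way to spot transcription errors if the table is copied elsewhere.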