Hermes-2-Theta-Llama-3-8B




Hermes-2-Theta-Llama-3-8B Model Set Plots


Hermes Compared to Base Model Plots



Hermes-2-Theta-Llama-3-8B Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 14336 4096 3.5 5.957163 0.032739 -9.516301 186
2 dense 14336 4096 3.5 3.647221 0.015361 -4.614641 363
3 dense 14336 4096 3.5 4.030530 0.021233 -5.341693 128
4 dense 4096 1024 4.0 1.922321 0.019342 -2.349946 150 over-trained
5 dense 4096 4096 1.0 2.733021 0.022314 -3.936042 169
6 dense 4096 4096 1.0 2.075975 0.026605 -1.701898 220
7 dense 4096 1024 4.0 1.405196 0.084559 -1.245597 948 over-trained
8 dense 14336 4096 3.5 1.318892 0.093132 0.127021 4080 over-trained
9 dense 14336 4096 3.5 4.252375 0.029060 -4.824080 154
10 dense 14336 4096 3.5 4.408560 0.026533 -5.361875 130
11 dense 4096 1024 4.0 2.056844 0.024649 -3.147956 154
12 dense 4096 4096 1.0 3.849565 0.023648 -6.479472 72
13 dense 4096 4096 1.0 2.055462 0.028000 -2.091554 373
14 dense 4096 1024 4.0 2.884047 0.048658 -2.656899 267
15 dense 14336 4096 3.5 5.208352 0.021929 -7.305076 220
16 dense 14336 4096 3.5 3.746501 0.016440 -4.545556 407
17 dense 14336 4096 3.5 3.744410 0.011877 -4.781128 336
18 dense 4096 1024 4.0 2.669780 0.025187 -4.970363 151
19 dense 4096 4096 1.0 3.326921 0.072659 -5.984538 298
20 dense 4096 4096 1.0 2.663013 0.021792 -3.748748 279
21 dense 4096 1024 4.0 5.117645 0.047535 -12.288982 41
22 dense 14336 4096 3.5 5.366543 0.020547 -7.799214 201
23 dense 14336 4096 3.5 3.456977 0.007263 -4.120393 378
24 dense 14336 4096 3.5 3.557370 0.018353 -4.616504 240
25 dense 4096 1024 4.0 2.524456 0.021094 -4.636013 194
26 dense 4096 4096 1.0 3.415699 0.081107 -6.344085 256
27 dense 4096 4096 1.0 2.500943 0.023645 -3.593613 318
28 dense 4096 1024 4.0 4.653987 0.046727 -11.319416 75
29 dense 14336 4096 3.5 5.252094 0.022397 -7.918607 194
30 dense 14336 4096 3.5 3.523704 0.008123 -4.273948 280
31 dense 14336 4096 3.5 3.886005 0.019107 -5.333653 112
32 dense 4096 1024 4.0 2.352245 0.017214 -4.110186 185
33 dense 4096 4096 1.0 4.073351 0.038551 -7.284115 64
34 dense 4096 4096 1.0 2.432466 0.024530 -3.344850 213
35 dense 4096 1024 4.0 4.324075 0.032810 -9.975130 44
36 dense 14336 4096 3.5 5.600412 0.015394 -8.910494 154
37 dense 14336 4096 3.5 3.455170 0.016099 -4.300872 207
38 dense 14336 4096 3.5 4.112966 0.018995 -5.728200 72
39 dense 4096 1024 4.0 2.507369 0.026588 -4.753304 154
40 dense 4096 4096 1.0 4.033011 0.085639 -7.736097 147
41 dense 4096 4096 1.0 2.561864 0.035281 -3.637386 234
42 dense 4096 1024 4.0 3.986734 0.093294 -9.838802 158
43 dense 14336 4096 3.5 5.357660 0.020408 -8.756458 132
44 dense 14336 4096 3.5 3.360525 0.014771 -4.108002 265
45 dense 14336 4096 3.5 4.025613 0.022804 -5.581858 104
46 dense 4096 1024 4.0 2.569440 0.019737 -4.936012 148
47 dense 4096 4096 1.0 3.777960 0.093468 -7.327690 196
48 dense 4096 4096 1.0 2.611926 0.038202 -3.619471 211
49 dense 4096 1024 4.0 5.479682 0.030017 -13.102387 37
50 dense 14336 4096 3.5 5.096997 0.015128 -8.127706 117
51 dense 14336 4096 3.5 3.341561 0.018475 -3.777566 194
52 dense 14336 4096 3.5 4.012666 0.019282 -5.230559 77
53 dense 4096 1024 4.0 2.456757 0.033857 -4.783833 197
54 dense 4096 4096 1.0 3.452179 0.086904 -6.722154 199
55 dense 4096 4096 1.0 2.599681 0.046537 -3.535409 236
56 dense 4096 1024 4.0 5.399696 0.049305 -13.647471 53
57 dense 14336 4096 3.5 4.734130 0.017198 -7.452848 141
58 dense 14336 4096 3.5 3.352270 0.017038 -3.621473 135
59 dense 14336 4096 3.5 3.833233 0.015954 -4.862757 99
60 dense 4096 1024 4.0 2.545374 0.032519 -4.938880 170
61 dense 4096 4096 1.0 5.494582 0.036385 -10.500319 47
62 dense 4096 4096 1.0 2.586565 0.048621 -3.814261 250
63 dense 4096 1024 4.0 4.697856 0.094011 -11.480021 122
64 dense 14336 4096 3.5 3.766566 0.014836 -4.661041 106
65 dense 4096 4096 1.0 4.493260 0.028773 -7.818333 81
66 dense 4096 1024 4.0 4.441197 0.024446 -9.825900 70
67 dense 14336 4096 3.5 3.208489 0.018637 -3.238176 183
68 dense 14336 4096 3.5 4.951099 0.014540 -7.730366 107
69 dense 4096 1024 4.0 2.759534 0.028159 -5.221372 94
70 dense 4096 4096 1.0 2.439795 0.051140 -3.344279 378
71 dense 14336 4096 3.5 3.258280 0.019457 -3.225565 109
72 dense 14336 4096 3.5 4.800373 0.028879 -7.515161 65
73 dense 4096 4096 1.0 2.678531 0.049775 -3.759027 208
74 dense 14336 4096 3.5 3.675408 0.014119 -4.417948 119
75 dense 4096 1024 4.0 2.640982 0.028400 -4.984359 130
76 dense 4096 4096 1.0 5.347047 0.036160 -9.960629 42
77 dense 4096 1024 4.0 5.338837 0.025805 -12.211340 45
78 dense 14336 4096 3.5 3.395128 0.024750 -3.886869 160
79 dense 14336 4096 3.5 4.176824 0.027165 -6.523372 140
80 dense 14336 4096 3.5 3.018469 0.023613 -2.773939 183
81 dense 4096 1024 4.0 5.328021 0.034461 -12.789717 34
82 dense 4096 1024 4.0 2.676838 0.026333 -5.168788 129
83 dense 4096 4096 1.0 5.023318 0.033210 -9.390169 41
84 dense 4096 4096 1.0 2.462148 0.060474 -3.455987 391
85 dense 4096 1024 4.0 2.411170 0.051951 -4.856386 216
86 dense 14336 4096 3.5 4.114486 0.029929 -6.203921 107
87 dense 14336 4096 3.5 3.055360 0.018238 -2.862454 182
88 dense 14336 4096 3.5 3.417916 0.014558 -3.871161 132
89 dense 4096 4096 1.0 2.630231 0.065573 -3.569258 257
90 dense 4096 4096 1.0 1.726172 0.100740 -3.439593 1544 over-trained
91 dense 4096 1024 4.0 4.393605 0.109716 -11.035530 156
92 dense 4096 4096 1.0 5.887809 0.046679 -12.020845 41
93 dense 14336 4096 3.5 4.059830 0.029241 -6.148284 161
94 dense 14336 4096 3.5 3.153897 0.024719 -2.988880 160
95 dense 14336 4096 3.5 3.578521 0.019544 -4.016086 105
96 dense 4096 1024 4.0 2.886817 0.042077 -5.690984 76
97 dense 4096 1024 4.0 5.729954 0.035553 -14.586414 43
98 dense 4096 4096 1.0 2.338342 0.066709 -3.203271 457
99 dense 4096 4096 1.0 2.736664 0.062600 -3.823635 245
100 dense 14336 4096 3.5 4.251366 0.031055 -6.327261 77
101 dense 14336 4096 3.5 3.315305 0.027203 -3.264269 134
102 dense 14336 4096 3.5 3.633840 0.020060 -4.268727 154
103 dense 4096 1024 4.0 2.867023 0.034338 -5.548482 80
104 dense 4096 4096 1.0 4.928481 0.108670 -10.582252 139
105 dense 4096 1024 4.0 4.802842 0.089534 -12.134527 117
106 dense 4096 1024 4.0 5.792980 0.040270 -14.539122 31
107 dense 14336 4096 3.5 4.447523 0.034443 -6.790458 92
108 dense 14336 4096 3.5 3.866706 0.027089 -3.871009 33
109 dense 14336 4096 3.5 4.045045 0.023629 -4.923228 71
110 dense 4096 1024 4.0 2.741247 0.031328 -5.375114 114
111 dense 4096 4096 1.0 4.462989 0.026106 -8.378847 73
112 dense 4096 4096 1.0 2.666877 0.053133 -3.438770 231
113 dense 4096 1024 4.0 2.823390 0.031583 -5.487721 86
114 dense 14336 4096 3.5 4.852466 0.029747 -7.411042 77
115 dense 14336 4096 3.5 3.532929 0.024598 -3.646928 120
116 dense 14336 4096 3.5 3.986952 0.019074 -4.917610 106
117 dense 4096 4096 1.0 2.596253 0.059408 -3.436754 308
118 dense 4096 4096 1.0 4.856519 0.032206 -9.705155 64
119 dense 4096 1024 4.0 5.502449 0.047707 -13.833707 46
120 dense 14336 4096 3.5 4.215714 0.064641 -6.573421 345
121 dense 4096 4096 1.0 4.056780 0.020538 -7.276928 100
122 dense 14336 4096 3.5 3.628602 0.026476 -3.671095 101
123 dense 14336 4096 3.5 4.126775 0.015914 -5.033578 79
124 dense 4096 1024 4.0 2.758359 0.032180 -5.173127 107
125 dense 4096 1024 4.0 5.156567 0.028011 -12.524475 47
126 dense 4096 4096 1.0 2.623976 0.036299 -3.253193 226
127 dense 14336 4096 3.5 3.694313 0.024415 -3.974818 77
128 dense 14336 4096 3.5 4.707387 0.033847 -6.946672 110
129 dense 4096 4096 1.0 4.129001 0.028167 -7.704074 71
130 dense 4096 1024 4.0 2.611257 0.031624 -4.860466 149
131 dense 4096 1024 4.0 4.418172 0.095738 -11.207103 128
132 dense 4096 4096 1.0 2.571651 0.047660 -3.534892 304
133 dense 14336 4096 3.5 4.125328 0.020718 -5.240735 82
134 dense 14336 4096 3.5 4.035383 0.040377 -6.159809 275
135 dense 14336 4096 3.5 3.705411 0.021687 -3.897234 75
136 dense 14336 4096 3.5 4.071136 0.016567 -5.038270 88
137 dense 4096 4096 1.0 2.437323 0.037203 -2.836765 305
138 dense 4096 4096 1.0 3.595043 0.022689 -5.864740 111
139 dense 4096 1024 4.0 4.541698 0.054971 -11.084673 57
140 dense 4096 1024 4.0 2.566443 0.027369 -4.477069 120
141 dense 4096 4096 1.0 3.848584 0.081650 -6.885815 182
142 dense 4096 1024 4.0 2.624843 0.027081 -4.715782 123
143 dense 4096 4096 1.0 2.471347 0.042071 -2.922973 293
144 dense 14336 4096 3.5 3.687042 0.016586 -3.855428 75
145 dense 4096 1024 4.0 3.369346 0.091987 -7.835680 226
146 dense 14336 4096 3.5 4.023614 0.036989 -6.193098 280
147 dense 14336 4096 3.5 4.096328 0.022189 -5.045361 79
148 dense 14336 4096 3.5 4.398837 0.025051 -6.863149 222
149 dense 14336 4096 3.5 3.635033 0.015504 -3.763745 85
150 dense 14336 4096 3.5 4.050098 0.016096 -4.919818 82
151 dense 4096 1024 4.0 2.673254 0.023220 -4.118611 118
152 dense 4096 4096 1.0 4.920418 0.023949 -8.958295 46
153 dense 4096 4096 1.0 2.457679 0.044624 -2.730056 309
154 dense 4096 1024 4.0 4.832072 0.025819 -11.243217 41
155 dense 14336 4096 3.5 4.669886 0.018986 -7.294513 189
156 dense 14336 4096 3.5 3.657891 0.020765 -3.946563 111
157 dense 14336 4096 3.5 4.098196 0.020903 -5.206244 82
158 dense 4096 1024 4.0 2.725626 0.030742 -4.734949 90
159 dense 4096 4096 1.0 4.025946 0.092568 -7.640873 195
160 dense 4096 4096 1.0 2.409804 0.053251 -2.939734 418
161 dense 4096 1024 4.0 3.815157 0.089055 -9.218732 182
162 dense 14336 4096 3.5 4.632877 0.016147 -7.234752 227
163 dense 14336 4096 3.5 3.687682 0.017033 -4.018909 102
164 dense 14336 4096 3.5 4.130732 0.020048 -5.234490 75
165 dense 4096 1024 4.0 2.591589 0.023375 -4.699351 146
166 dense 4096 4096 1.0 5.686624 0.039369 -10.600514 44
167 dense 4096 4096 1.0 2.470140 0.042305 -3.191385 306
168 dense 4096 1024 4.0 4.448506 0.030361 -10.045736 48
169 dense 14336 4096 3.5 4.666673 0.012531 -7.129236 193
170 dense 14336 4096 3.5 3.701308 0.018713 -4.024874 85
171 dense 14336 4096 3.5 4.013771 0.018653 -5.045371 93
172 dense 4096 1024 4.0 2.568684 0.022839 -4.541762 159
173 dense 4096 4096 1.0 4.056677 0.025741 -7.098972 81
174 dense 4096 4096 1.0 2.459752 0.039289 -3.116696 324
175 dense 4096 1024 4.0 3.739862 0.035777 -8.585863 88
176 dense 14336 4096 3.5 4.597638 0.012848 -6.776590 188
177 dense 14336 4096 3.5 3.713913 0.017908 -3.921366 82
178 dense 14336 4096 3.5 4.033208 0.019179 -4.895406 67
179 dense 4096 1024 4.0 2.595888 0.020635 -4.496308 116
180 dense 4096 4096 1.0 3.850624 0.018729 -6.665052 112
181 dense 4096 4096 1.0 2.476501 0.025789 -3.228618 280
182 dense 4096 1024 4.0 4.015984 0.030906 -8.503636 49
183 dense 14336 4096 3.5 4.615660 0.018782 -6.685734 102
184 dense 14336 4096 3.5 3.604203 0.014354 -3.637801 93
185 dense 14336 4096 3.5 3.735274 0.022366 -4.402260 200
186 dense 4096 1024 4.0 2.482785 0.020436 -4.309703 141
187 dense 4096 4096 1.0 3.686235 0.017017 -6.354006 127
188 dense 4096 4096 1.0 2.377753 0.031865 -2.561932 299
189 dense 4096 1024 4.0 3.475396 0.032096 -7.684174 122
190 dense 14336 4096 3.5 4.495831 0.019932 -6.653161 81
191 dense 14336 4096 3.5 3.518953 0.012619 -3.357120 133
192 dense 14336 4096 3.5 3.736934 0.011122 -4.069835 154
193 dense 4096 1024 4.0 2.564571 0.019521 -4.454157 133
194 dense 4096 4096 1.0 3.701296 0.027139 -5.750935 144
195 dense 4096 4096 1.0 2.409214 0.026765 -2.859307 341
196 dense 4096 1024 4.0 3.572334 0.020909 -7.707836 107
197 dense 14336 4096 3.5 4.027933 0.025021 -5.737824 263
198 dense 14336 4096 3.5 3.462123 0.010609 -3.025308 183
199 dense 14336 4096 3.5 3.699154 0.010521 -3.872052 182
200 dense 4096 1024 4.0 2.499961 0.027360 -4.333710 136
201 dense 4096 4096 1.0 4.980196 0.027002 -8.655975 46
202 dense 4096 4096 1.0 2.586686 0.035654 -2.722124 166
203 dense 4096 1024 4.0 4.584627 0.026524 -10.490426 56
204 dense 14336 4096 3.5 4.069199 0.021871 -5.365790 93
205 dense 14336 4096 3.5 3.302401 0.009607 -2.650692 159
206 dense 14336 4096 3.5 3.486789 0.011025 -3.349629 212
207 dense 4096 1024 4.0 2.488924 0.022909 -4.005733 136
208 dense 4096 4096 1.0 3.552263 0.017234 -5.457797 134
209 dense 4096 4096 1.0 2.375678 0.022880 -2.643841 249
210 dense 4096 1024 4.0 3.319430 0.019782 -6.819321 136
211 dense 14336 4096 3.5 3.814198 0.022403 -4.824579 78
212 dense 14336 4096 3.5 3.221006 0.011246 -2.370037 289
213 dense 14336 4096 3.5 3.414938 0.012919 -3.000312 308
214 dense 4096 1024 4.0 2.289505 0.021268 -3.367802 202
215 dense 4096 4096 1.0 3.225673 0.022560 -4.158966 172
216 dense 4096 4096 1.0 2.410559 0.029677 -2.070359 197
217 dense 4096 1024 4.0 3.424031 0.018226 -7.422996 104
218 dense 14336 4096 3.5 3.088008 0.013438 -2.173780 273
219 dense 14336 4096 3.5 3.247513 0.015614 -2.544166 289
220 dense 4096 1024 4.0 2.186345 0.023342 -2.670831 162
221 dense 4096 4096 1.0 3.352293 0.022146 -4.951964 120
222 dense 4096 4096 1.0 2.235232 0.025409 -1.510760 241
223 dense 4096 1024 4.0 3.265499 0.023360 -6.099895 125
224 dense 14336 4096 3.5 2.962964 0.011329 -3.042833 410
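
The table above can be sanity-checked mechanically. In every row, Q equals N / M, and the only rows marked over-trained in this table are the four with alpha below 2. The following is a minimal sketch under those two assumptions: the threshold of 2.0 is inferred from the rows shown here, not quoted from the tool that produced them, and the names `LayerRow`, `parse_rows`, and `recheck` are hypothetical helpers introduced for illustration. The sample rows are copied verbatim from the table.

```python
from dataclasses import dataclass

# A few rows copied verbatim from the table above (header included).
SAMPLE = """\
id layer_type N M Q alpha D alpha-hat num_spikes warning
3 dense 14336 4096 3.5 4.030530 0.021233 -5.341693 128
4 dense 4096 1024 4.0 1.922321 0.019342 -2.349946 150 over-trained
8 dense 14336 4096 3.5 1.318892 0.093132 0.127021 4080 over-trained
90 dense 4096 4096 1.0 1.726172 0.100740 -3.439593 1544 over-trained
"""

# Assumed threshold, inferred from which rows carry the warning in this table.
ALPHA_OVER_TRAINED = 2.0


@dataclass
class LayerRow:
    layer_id: int
    layer_type: str
    N: int
    M: int
    Q: float
    alpha: float
    D: float
    alpha_hat: float
    num_spikes: int
    warning: str  # empty string when the row carries no warning


def parse_rows(text):
    """Parse whitespace-delimited rows, skipping the header and blank lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 9 or not parts[0].isdigit():
            continue
        rows.append(LayerRow(
            layer_id=int(parts[0]),
            layer_type=parts[1],
            N=int(parts[2]),
            M=int(parts[3]),
            Q=float(parts[4]),
            alpha=float(parts[5]),
            D=float(parts[6]),
            alpha_hat=float(parts[7]),
            num_spikes=int(parts[8]),
            warning=" ".join(parts[9:]),
        ))
    return rows


def recheck(rows):
    """Return (layer_id, message) pairs for rows that break either assumption."""
    problems = []
    for r in rows:
        if abs(r.N / r.M - r.Q) > 1e-9:
            problems.append((r.layer_id, "Q does not equal N / M"))
        expected = "over-trained" if r.alpha < ALPHA_OVER_TRAINED else ""
        if r.warning != expected:
            problems.append((r.layer_id, "warning does not match alpha threshold"))
    return problems


rows = parse_rows(SAMPLE)
flagged = [r.layer_id for r in rows if r.warning]
```

Pasting the full table into `SAMPLE` runs the same check over all 224 layers. An empty list from `recheck` means both assumptions hold for every row supplied.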