OLMo-7B-Instruct-hf




OLMo-7B-Instruct-hf Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 11008 4096 2.6875 2.870099 0.013890 -4.995752 309
2 dense 11008 4096 2.6875 2.830372 0.031093 -5.080016 219
3 dense 11008 4096 2.6875 2.774968 0.024044 -4.475009 231
4 dense 4096 4096 1.0000 1.686795 0.027535 -1.609198 475 over-trained
5 dense 4096 4096 1.0000 2.425616 0.010851 -4.151335 225
6 dense 4096 4096 1.0000 1.957195 0.026237 -1.797027 112 over-trained
7 dense 4096 4096 1.0000 2.177086 0.013562 -3.240998 172
8 dense 11008 4096 2.6875 3.302397 0.025836 -6.049888 150
9 dense 11008 4096 2.6875 2.775847 0.031069 -5.364885 416
10 dense 11008 4096 2.6875 3.227779 0.029306 -6.257103 90
11 dense 4096 4096 1.0000 1.843908 0.017283 -2.229719 384 over-trained
12 dense 4096 4096 1.0000 2.729006 0.017545 -5.819385 161
13 dense 4096 4096 1.0000 1.937361 0.020063 -2.298488 230 over-trained
14 dense 4096 4096 1.0000 2.511168 0.016754 -4.634241 106
15 dense 11008 4096 2.6875 3.256819 0.016089 -6.394822 162
16 dense 11008 4096 2.6875 2.831637 0.014197 -5.421003 404
17 dense 11008 4096 2.6875 2.921129 0.014334 -5.568607 104
18 dense 4096 4096 1.0000 2.007431 0.021468 -3.251720 294
19 dense 4096 4096 1.0000 2.591263 0.023502 -5.896521 215
20 dense 4096 4096 1.0000 1.987366 0.019148 -3.248692 313 over-trained
21 dense 4096 4096 1.0000 2.341956 0.024353 -4.912963 349
22 dense 11008 4096 2.6875 3.285258 0.011907 -6.690897 160
23 dense 11008 4096 2.6875 2.870323 0.016566 -5.651185 121
24 dense 11008 4096 2.6875 2.959848 0.019960 -6.031633 90
25 dense 4096 4096 1.0000 1.991538 0.022051 -2.838577 363 over-trained
26 dense 4096 4096 1.0000 2.717729 0.032307 -6.726814 238
27 dense 4096 4096 1.0000 1.961364 0.022441 -3.094433 335 over-trained
28 dense 4096 4096 1.0000 2.626639 0.029881 -5.935054 166
29 dense 11008 4096 2.6875 3.412908 0.012043 -7.318534 158
30 dense 11008 4096 2.6875 2.833998 0.008510 -5.668740 204
31 dense 11008 4096 2.6875 2.876893 0.009712 -5.859413 153
32 dense 4096 4096 1.0000 2.034087 0.018544 -3.833205 341
33 dense 4096 4096 1.0000 3.357843 0.040393 -8.455975 33
34 dense 4096 4096 1.0000 2.024140 0.023628 -3.844620 256
35 dense 4096 4096 1.0000 2.992911 0.033034 -7.071689 49
36 dense 4096 4096 1.0000 1.890875 0.023883 -3.399640 331 over-trained
37 dense 11008 4096 2.6875 2.863328 0.012719 -5.825273 263
38 dense 11008 4096 2.6875 3.598236 0.011873 -8.334150 145
39 dense 4096 4096 1.0000 1.893907 0.023785 -3.113947 371 over-trained
40 dense 11008 4096 2.6875 2.899697 0.010515 -5.994937 222
41 dense 4096 4096 1.0000 2.878145 0.027516 -6.867211 72
42 dense 4096 4096 1.0000 2.837192 0.035890 -7.216951 167
43 dense 11008 4096 2.6875 3.043523 0.018257 -6.262518 175
44 dense 11008 4096 2.6875 4.172201 0.016544 -10.590708 119
45 dense 4096 4096 1.0000 1.909140 0.025466 -3.086290 350 over-trained
46 dense 11008 4096 2.6875 3.085367 0.019700 -6.472252 163
47 dense 4096 4096 1.0000 1.907567 0.024045 -3.480795 321 over-trained
48 dense 4096 4096 1.0000 3.067694 0.037640 -8.335465 173
49 dense 4096 4096 1.0000 2.764151 0.024878 -7.054037 205
50 dense 11008 4096 2.6875 3.071569 0.023264 -6.501795 249
51 dense 11008 4096 2.6875 3.107993 0.021494 -6.485707 188
52 dense 11008 4096 2.6875 4.326353 0.013957 -11.039759 134
53 dense 4096 4096 1.0000 1.746276 0.028931 -2.913477 336 over-trained
54 dense 4096 4096 1.0000 3.347217 0.027522 -9.176551 110
55 dense 4096 4096 1.0000 2.988440 0.018958 -7.547648 120
56 dense 4096 4096 1.0000 1.809692 0.034348 -3.097365 259 over-trained
57 dense 4096 4096 1.0000 1.820121 0.031719 -3.353327 196 over-trained
58 dense 4096 4096 1.0000 2.784002 0.025566 -7.150534 193
59 dense 4096 4096 1.0000 1.711601 0.029609 -2.975714 230 over-trained
60 dense 11008 4096 2.6875 3.135541 0.019281 -6.706806 244
61 dense 11008 4096 2.6875 3.080493 0.015184 -6.439683 251
62 dense 11008 4096 2.6875 4.424613 0.014667 -11.351475 211
63 dense 4096 4096 1.0000 2.547824 0.025257 -6.120445 236
64 dense 11008 4096 2.6875 4.545532 0.015920 -11.936458 161
65 dense 4096 4096 1.0000 1.837154 0.024515 -3.586525 161 over-trained
66 dense 4096 4096 1.0000 2.478291 0.039184 -6.070393 210
67 dense 11008 4096 2.6875 3.136551 0.015722 -6.584624 254
68 dense 11008 4096 2.6875 3.187856 0.016192 -6.863911 264
69 dense 4096 4096 1.0000 1.881736 0.030008 -3.792818 168 over-trained
70 dense 4096 4096 1.0000 2.281286 0.040251 -5.880940 355
71 dense 11008 4096 2.6875 4.662669 0.016466 -12.407244 139
72 dense 11008 4096 2.6875 3.138569 0.015219 -6.628014 313
73 dense 4096 4096 1.0000 2.081680 0.056451 -5.509162 462
74 dense 11008 4096 2.6875 3.237035 0.016059 -7.017595 280
75 dense 4096 4096 1.0000 2.150075 0.046285 -5.185836 422
76 dense 4096 4096 1.0000 1.744263 0.024048 -3.208610 144 over-trained
77 dense 4096 4096 1.0000 1.793876 0.025906 -3.655395 118 over-trained
78 dense 4096 4096 1.0000 2.100638 0.059426 -5.639389 396
79 dense 4096 4096 1.0000 1.839552 0.035043 -3.971100 125 over-trained
80 dense 4096 4096 1.0000 1.732215 0.040185 -3.642086 109 over-trained
81 dense 11008 4096 2.6875 4.786839 0.016872 -12.762893 113
82 dense 11008 4096 2.6875 3.170143 0.015822 -6.758442 329
83 dense 11008 4096 2.6875 3.217625 0.016447 -7.037262 343
84 dense 4096 4096 1.0000 2.442250 0.050820 -5.925394 152
85 dense 11008 4096 2.6875 4.419100 0.022724 -11.514913 157
86 dense 11008 4096 2.6875 3.154547 0.017596 -6.826741 376
87 dense 11008 4096 2.6875 3.237685 0.016375 -7.180756 340
88 dense 4096 4096 1.0000 1.718393 0.040005 -3.490033 93 over-trained
89 dense 4096 4096 1.0000 2.330085 0.052521 -5.610869 226
90 dense 4096 4096 1.0000 1.850082 0.049404 -3.937376 74 over-trained
91 dense 4096 4096 1.0000 2.078105 0.059422 -5.492286 450
92 dense 11008 4096 2.6875 4.439312 0.026491 -11.677480 117
93 dense 11008 4096 2.6875 3.136169 0.012715 -6.797414 385
94 dense 11008 4096 2.6875 3.203103 0.014613 -7.091699 361
95 dense 4096 4096 1.0000 1.733624 0.024835 -3.438476 180 over-trained
96 dense 4096 4096 1.0000 2.080763 0.049779 -4.913376 414
97 dense 4096 4096 1.0000 1.846157 0.025708 -3.945677 141 over-trained
98 dense 4096 4096 1.0000 1.983869 0.058506 -5.087669 546 over-trained
99 dense 11008 4096 2.6875 4.290035 0.020642 -10.914127 99
100 dense 11008 4096 2.6875 3.061813 0.011308 -6.577549 417
101 dense 11008 4096 2.6875 3.119773 0.011589 -6.877469 346
102 dense 4096 4096 1.0000 1.673994 0.032156 -3.230986 182 over-trained
103 dense 4096 4096 1.0000 2.410842 0.052422 -5.778472 159
104 dense 4096 4096 1.0000 1.891993 0.022230 -4.514046 151 over-trained
105 dense 4096 4096 1.0000 2.330958 0.071837 -6.058123 181
106 dense 11008 4096 2.6875 4.092177 0.015313 -10.099028 117
107 dense 11008 4096 2.6875 2.892687 0.008264 -6.086166 470
108 dense 11008 4096 2.6875 2.952651 0.008070 -6.374049 375
109 dense 4096 4096 1.0000 1.928111 0.028139 -4.195415 141 over-trained
110 dense 4096 4096 1.0000 3.089512 0.040023 -7.095318 39
111 dense 4096 4096 1.0000 1.957473 0.021369 -4.401799 193 over-trained
112 dense 4096 4096 1.0000 2.235932 0.056358 -5.612925 277
113 dense 11008 4096 2.6875 4.278066 0.027385 -10.647475 64
114 dense 11008 4096 2.6875 2.744849 0.007690 -5.645314 509
115 dense 11008 4096 2.6875 2.851357 0.015925 -6.097900 281
116 dense 4096 4096 1.0000 1.902806 0.019223 -3.451470 273 over-trained
117 dense 4096 4096 1.0000 2.945106 0.034552 -6.934566 65
118 dense 4096 4096 1.0000 1.950364 0.018548 -4.030336 221 over-trained
119 dense 4096 4096 1.0000 2.420918 0.048000 -6.214981 201
120 dense 4096 4096 1.0000 1.799291 0.018997 -3.271042 340 over-trained
121 dense 4096 4096 1.0000 2.777378 0.030340 -6.616476 107
122 dense 4096 4096 1.0000 1.903636 0.020043 -3.560366 277 over-trained
123 dense 4096 4096 1.0000 2.628367 0.031208 -5.723642 96
124 dense 11008 4096 2.6875 3.748208 0.015533 -8.737135 100
125 dense 11008 4096 2.6875 2.711317 0.009654 -5.551818 380
126 dense 11008 4096 2.6875 2.841499 0.019447 -6.101858 228
127 dense 11008 4096 2.6875 3.821275 0.025327 -8.992942 61
128 dense 11008 4096 2.6875 2.608254 0.013222 -5.196165 436
129 dense 11008 4096 2.6875 2.707714 0.023029 -5.643959 307
130 dense 4096 4096 1.0000 1.892856 0.020718 -3.247737 308 over-trained
131 dense 4096 4096 1.0000 3.048290 0.021433 -6.677705 67
132 dense 4096 4096 1.0000 1.959259 0.025835 -3.623686 205 over-trained
133 dense 4096 4096 1.0000 2.806834 0.026580 -5.341950 61
134 dense 11008 4096 2.6875 3.548631 0.018798 -7.761536 81
135 dense 11008 4096 2.6875 2.590664 0.013665 -5.052746 368
136 dense 11008 4096 2.6875 2.815272 0.024860 -5.766285 150
137 dense 4096 4096 1.0000 1.903843 0.024030 -3.639762 377 over-trained
138 dense 4096 4096 1.0000 2.652858 0.049627 -6.756267 207
139 dense 4096 4096 1.0000 1.941937 0.024427 -3.855533 277 over-trained
140 dense 4096 4096 1.0000 2.767848 0.040458 -6.617633 108
141 dense 11008 4096 2.6875 3.520189 0.026413 -7.674527 63
142 dense 11008 4096 2.6875 2.590090 0.017123 -5.001036 410
143 dense 11008 4096 2.6875 2.669509 0.025782 -5.378129 339
144 dense 4096 4096 1.0000 2.041288 0.021448 -3.568588 324
145 dense 4096 4096 1.0000 2.825844 0.066267 -7.490769 252
146 dense 4096 4096 1.0000 2.011917 0.029601 -3.885634 281
147 dense 4096 4096 1.0000 2.274320 0.067625 -5.691143 543
148 dense 11008 4096 2.6875 3.946251 0.026421 -9.355094 69
149 dense 11008 4096 2.6875 2.763422 0.015118 -5.376197 290
150 dense 11008 4096 2.6875 2.837021 0.021153 -5.795126 308
151 dense 4096 4096 1.0000 2.026834 0.016974 -3.167734 296
152 dense 4096 4096 1.0000 3.713752 0.035620 -9.899997 44
153 dense 4096 4096 1.0000 2.038241 0.020502 -3.407963 215
154 dense 4096 4096 1.0000 3.235470 0.030892 -8.034944 58
155 dense 11008 4096 2.6875 4.339064 0.018582 -10.530303 76
156 dense 11008 4096 2.6875 2.838116 0.021100 -5.733315 254
157 dense 11008 4096 2.6875 3.092919 0.021699 -6.522074 116
158 dense 4096 4096 1.0000 1.888739 0.020483 -3.439856 403 over-trained
159 dense 4096 4096 1.0000 3.610575 0.026429 -9.306142 58
160 dense 4096 4096 1.0000 1.934064 0.023168 -3.770357 259 over-trained
161 dense 4096 4096 1.0000 2.948783 0.030263 -7.134445 104
162 dense 4096 4096 1.0000 2.076760 0.024886 -4.520714 138
163 dense 4096 4096 1.0000 2.898076 0.044187 -6.852162 61
164 dense 4096 4096 1.0000 2.096617 0.028771 -4.803294 94
165 dense 4096 4096 1.0000 2.228133 0.053649 -5.327880 286
166 dense 11008 4096 2.6875 4.419454 0.028730 -11.028649 129
167 dense 11008 4096 2.6875 2.875642 0.020052 -5.914632 289
168 dense 11008 4096 2.6875 3.096847 0.020586 -6.607660 140
169 dense 11008 4096 2.6875 4.663761 0.018671 -11.837716 90
170 dense 11008 4096 2.6875 2.923140 0.015858 -6.105750 306
171 dense 11008 4096 2.6875 3.062585 0.017087 -6.607951 237
172 dense 4096 4096 1.0000 1.833310 0.024520 -3.647048 110 over-trained
173 dense 4096 4096 1.0000 2.461357 0.046634 -5.971242 194
174 dense 4096 4096 1.0000 1.879925 0.025549 -4.333906 123 over-trained
175 dense 4096 4096 1.0000 2.310814 0.064252 -5.612011 266
176 dense 11008 4096 2.6875 4.504361 0.019177 -11.320286 157
177 dense 11008 4096 2.6875 2.979902 0.012284 -6.300844 271
178 dense 11008 4096 2.6875 3.090816 0.012853 -6.754721 260
179 dense 4096 4096 1.0000 1.915380 0.042290 -4.250204 57 over-trained
180 dense 4096 4096 1.0000 2.606767 0.050588 -6.122869 123
181 dense 4096 4096 1.0000 1.899241 0.029908 -4.521691 175 over-trained
182 dense 4096 4096 1.0000 2.153635 0.062804 -5.219498 328
183 dense 11008 4096 2.6875 4.328677 0.013553 -10.358717 133
184 dense 11008 4096 2.6875 3.000821 0.012329 -6.347640 276
185 dense 11008 4096 2.6875 3.128033 0.013443 -6.821358 249
186 dense 4096 4096 1.0000 1.662474 0.041020 -3.671410 314 over-trained
187 dense 4096 4096 1.0000 2.903638 0.045365 -6.634003 60
188 dense 4096 4096 1.0000 2.012241 0.036154 -4.300365 77
189 dense 4096 4096 1.0000 2.297397 0.058155 -5.455487 211
190 dense 11008 4096 2.6875 4.159836 0.018237 -10.133519 123
191 dense 11008 4096 2.6875 2.997025 0.010388 -6.321261 288
192 dense 11008 4096 2.6875 3.122413 0.009941 -6.827129 288
193 dense 4096 4096 1.0000 1.908934 0.021868 -4.139902 190 over-trained
194 dense 4096 4096 1.0000 2.880601 0.028842 -6.542797 66
195 dense 4096 4096 1.0000 1.927170 0.023063 -4.394874 291 over-trained
196 dense 4096 4096 1.0000 2.391609 0.038814 -5.471714 212
197 dense 11008 4096 2.6875 3.841764 0.019830 -8.607470 119
198 dense 11008 4096 2.6875 2.874922 0.012871 -5.916301 375
199 dense 11008 4096 2.6875 3.029232 0.016152 -6.457456 265
200 dense 4096 4096 1.0000 1.875416 0.021975 -3.394998 262 over-trained
201 dense 4096 4096 1.0000 3.001369 0.013993 -7.395793 127
202 dense 4096 4096 1.0000 1.915435 0.026907 -3.642370 248 over-trained
203 dense 4096 4096 1.0000 2.653955 0.015417 -5.805553 200
204 dense 4096 4096 1.0000 2.071744 0.020678 -3.763901 266
205 dense 4096 4096 1.0000 3.672441 0.029715 -9.387901 52
206 dense 4096 4096 1.0000 1.983860 0.027949 -3.694276 369 over-trained
207 dense 4096 4096 1.0000 2.620906 0.032807 -5.941037 274
208 dense 11008 4096 2.6875 3.500014 0.022352 -7.080026 122
209 dense 11008 4096 2.6875 2.822575 0.010427 -5.523040 439
210 dense 11008 4096 2.6875 2.972585 0.012502 -6.101843 318
211 dense 11008 4096 2.6875 3.379570 0.015085 -7.277295 110
212 dense 11008 4096 2.6875 2.796176 0.009739 -5.373582 458
213 dense 11008 4096 2.6875 2.979802 0.014435 -6.091731 291
214 dense 4096 4096 1.0000 2.235753 0.026417 -4.842850 244
215 dense 4096 4096 1.0000 2.456073 0.071347 -6.117200 462
216 dense 4096 4096 1.0000 2.079337 0.027740 -4.181240 393
217 dense 4096 4096 1.0000 2.707269 0.029961 -6.542436 178
218 dense 11008 4096 2.6875 2.475608 0.036342 -4.848039 266
219 dense 11008 4096 2.6875 3.060899 0.009982 -5.795979 246
220 dense 11008 4096 2.6875 3.460575 0.013710 -7.314842 148
221 dense 4096 4096 1.0000 2.124295 0.018080 -3.841337 332
222 dense 4096 4096 1.0000 3.065491 0.029797 -7.111696 70
223 dense 4096 4096 1.0000 2.044756 0.023471 -4.059434 285
224 dense 4096 4096 1.0000 2.774261 0.028714 -6.810288 217
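
The columns above can also be read programmatically. The following is a minimal sketch, not part of the original page, that parses four rows copied from the table and checks two patterns visible in the data: Q equals N / M, and the "over-trained" warning appears on exactly those rows where alpha is below 2. The 2.0 threshold is an assumption inferred from the rows shown here, not something the page states; `ROWS` and `parse_row` are illustrative names.

```python
# Four rows copied verbatim from the table above.
# The trailing "warning" column is empty for most layers.
ROWS = """\
1 dense 11008 4096 2.6875 2.870099 0.013890 -4.995752 309
4 dense 4096 4096 1.0000 1.686795 0.027535 -1.609198 475 over-trained
18 dense 4096 4096 1.0000 2.007431 0.021468 -3.251720 294
98 dense 4096 4096 1.0000 1.983869 0.058506 -5.087669 546 over-trained
"""


def parse_row(line):
    """Split one whitespace-delimited table row into a dict of typed fields."""
    parts = line.split()
    return {
        "id": int(parts[0]),
        "layer_type": parts[1],
        "N": int(parts[2]),
        "M": int(parts[3]),
        "Q": float(parts[4]),
        "alpha": float(parts[5]),
        "D": float(parts[6]),
        "alpha_hat": float(parts[7]),
        "num_spikes": int(parts[8]),
        # Anything after the ninth field is the (optional) warning text.
        "warning": " ".join(parts[9:]),
    }


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

for r in rows:
    # Q is the aspect ratio of the weight matrix: 11008 / 4096 = 2.6875.
    assert abs(r["Q"] - r["N"] / r["M"]) < 1e-4
    # Assumed rule, inferred from the table: flagged if and only if alpha < 2.
    assert (r["warning"] == "over-trained") == (r["alpha"] < 2.0)

flagged = [r["id"] for r in rows if r["warning"]]
print(flagged)  # [4, 98]
```

To run the same checks on the full table, replace `ROWS` with all 224 data rows; the parser relies only on the column order given in the header line.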