gpt-oss-120b



gpt-oss-120b Model Set Plots



gpt-oss-120b Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 4096 2880 1.422222 2.208469 0.012764 4.321921 571
2 dense 2880 512 5.625000 3.472631 0.065679 8.765082 93
3 dense 2880 512 5.625000 3.269970 0.083865 4.979458 200
4 dense 4096 2880 1.422222 5.553284 0.041825 11.105858 55
5 dense 2880 128 22.500000 3.852214 0.038861 3.783559 46
6 dense 2880 512 5.625000 2.645160 0.062126 5.975363 79
7 dense 4096 2880 1.422222 3.175964 0.050770 5.504852 87
8 dense 4096 2880 1.422222 2.045456 0.023390 5.043705 475
9 dense 2880 512 5.625000 5.160952 0.076134 6.076600 81
10 dense 2880 128 22.500000 3.373129 0.037766 2.329203 50
11 dense 2880 512 5.625000 7.144880 0.052377 8.386267 41 under-trained
12 dense 2880 512 5.625000 3.279578 0.037811 4.758593 37
13 dense 4096 2880 1.422222 3.170620 0.046415 4.986395 117
14 dense 4096 2880 1.422222 2.092214 0.020858 5.653739 573
15 dense 2880 128 22.500000 3.084761 0.032720 1.762433 55
16 dense 4096 2880 1.422222 3.254060 0.032990 10.640579 38
17 dense 2880 512 5.625000 3.237801 0.067844 3.898329 219
18 dense 4096 2880 1.422222 2.464871 0.049778 3.959622 241
19 dense 2880 512 5.625000 2.944561 0.043867 5.654103 32
20 dense 2880 128 22.500000 3.148655 0.041372 1.872311 58
21 dense 2880 128 22.500000 3.241244 0.037749 1.934718 47
22 dense 2880 512 5.625000 3.763706 0.091890 4.242364 205
23 dense 2880 512 5.625000 3.228186 0.036520 4.661916 58
24 dense 4096 2880 1.422222 3.196315 0.044045 4.390031 155
25 dense 4096 2880 1.422222 3.044016 0.032111 9.067866 47
26 dense 4096 2880 1.422222 3.108985 0.032759 10.015843 55
27 dense 2880 512 5.625000 3.744358 0.046973 5.538288 19
28 dense 4096 2880 1.422222 3.436349 0.041209 6.492551 42
29 dense 2880 128 22.500000 3.456888 0.039824 1.954955 47
30 dense 2880 512 5.625000 12.373127 0.105708 12.324962 31 under-trained
31 dense 4096 2880 1.422222 2.526757 0.019100 7.735397 145
32 dense 4096 2880 1.422222 4.075093 0.038599 5.472925 62
33 dense 2880 512 5.625000 2.794283 0.054154 3.885393 102
34 dense 2880 512 5.625000 9.795358 0.056763 10.642065 36 under-trained
35 dense 2880 128 22.500000 3.265615 0.038312 1.891294 39
36 dense 2880 512 5.625000 5.385046 0.070512 6.404582 61
37 dense 4096 2880 1.422222 3.156772 0.027263 4.948909 102
38 dense 2880 128 22.500000 2.735562 0.045712 1.564596 50
39 dense 2880 512 5.625000 2.842376 0.089045 4.210396 87
40 dense 4096 2880 1.422222 2.380051 0.018818 7.760562 176
41 dense 4096 2880 1.422222 2.645166 0.014982 8.790014 236
42 dense 4096 2880 1.422222 3.229234 0.056745 4.313408 97
43 dense 2880 512 5.625000 3.760304 0.023834 4.505846 42
44 dense 2880 128 22.500000 2.613116 0.038740 1.341844 46
45 dense 2880 512 5.625000 9.847806 0.047890 11.940559 44 under-trained
46 dense 4096 2880 1.422222 2.849062 0.014680 11.301986 206
47 dense 2880 512 5.625000 1.883901 0.043648 3.150082 163 over-trained
48 dense 4096 2880 1.422222 2.797735 0.045072 4.090750 129
49 dense 2880 128 22.500000 2.596757 0.046623 1.003669 52
50 dense 2880 512 5.625000 2.925060 0.054998 4.073273 220
51 dense 2880 512 5.625000 4.764659 0.045677 6.450162 134
52 dense 2880 128 22.500000 2.408473 0.031262 0.925361 60
53 dense 2880 512 5.625000 2.485813 0.024541 3.491191 60
54 dense 4096 2880 1.422222 2.739747 0.020324 10.217447 329
55 dense 4096 2880 1.422222 2.781694 0.042916 3.566225 155
56 dense 4096 2880 1.422222 3.172356 0.024286 12.566761 139
57 dense 4096 2880 1.422222 2.334839 0.039938 3.818264 205
58 dense 2880 128 22.500000 2.411270 0.034860 1.021684 53
59 dense 2880 512 5.625000 17.726207 0.111209 24.206874 35 under-trained
60 dense 2880 512 5.625000 2.014822 0.062062 3.258083 68
61 dense 4096 2880 1.422222 3.233591 0.019563 12.232655 132
62 dense 2880 512 5.625000 4.570913 0.085681 6.722337 135
63 dense 2880 512 5.625000 2.894662 0.041843 1.690166 35
64 dense 4096 2880 1.422222 2.807132 0.039251 4.039771 112
65 dense 2880 128 22.500000 2.358074 0.033051 0.842026 57
66 dense 4096 2880 1.422222 2.489314 0.045615 4.355668 115
67 dense 2880 512 5.625000 4.005401 0.106053 5.826358 149
68 dense 2880 512 5.625000 2.004800 0.053028 3.636198 49
69 dense 2880 128 22.500000 2.334667 0.041312 0.553002 51
70 dense 4096 2880 1.422222 3.494062 0.023007 14.573604 129
71 dense 4096 2880 1.422222 3.610729 0.035367 13.019531 69
72 dense 4096 2880 1.422222 2.481507 0.038454 4.310733 277
73 dense 2880 128 22.500000 2.252607 0.045581 0.364231 62
74 dense 2880 512 5.625000 1.870277 0.061759 2.481666 33 over-trained
75 dense 2880 512 5.625000 2.941862 0.065377 4.972544 227
76 dense 4096 2880 1.422222 3.109918 0.072122 13.440607 281
77 dense 2880 512 5.625000 5.664013 0.093290 8.252137 76
78 dense 2880 128 22.500000 2.276654 0.034881 0.439930 58
79 dense 4096 2880 1.422222 2.321264 0.046096 3.795860 217
80 dense 2880 512 5.625000 1.694113 0.084582 2.741511 148 over-trained
81 dense 2880 512 5.625000 2.134803 0.031990 2.181559 79
82 dense 2880 128 22.500000 2.306871 0.038158 0.259992 61
83 dense 4096 2880 1.422222 3.827929 0.055312 14.843723 113
84 dense 4096 2880 1.422222 2.587669 0.029204 4.024792 137
85 dense 2880 512 5.625000 3.808527 0.059146 6.231956 213
86 dense 4096 2880 1.422222 2.292899 0.031649 3.880986 198
87 dense 2880 512 5.625000 4.309792 0.107894 6.597479 129
88 dense 2880 512 5.625000 1.340080 0.074254 2.660212 266 over-trained
89 dense 4096 2880 1.422222 2.926646 0.041338 13.717326 232
90 dense 2880 128 22.500000 2.436084 0.039443 0.330406 51
91 dense 2880 512 5.625000 1.967432 0.022988 2.488288 182 over-trained
92 dense 4096 2880 1.422222 3.988188 0.076939 16.321480 133
93 dense 2880 512 5.625000 3.510953 0.040514 7.502812 128
94 dense 2880 128 22.500000 2.331693 0.036395 0.249339 57
95 dense 4096 2880 1.422222 2.483042 0.028246 4.269229 172
96 dense 4096 2880 1.422222 3.135838 0.040728 15.635954 268
97 dense 2880 512 5.625000 1.236920 0.108677 3.024176 382 over-trained
98 dense 2880 128 22.500000 2.426283 0.033791 0.204888 43
99 dense 4096 2880 1.422222 2.642628 0.042648 4.618855 61
100 dense 2880 512 5.625000 2.918114 0.049294 7.241564 245
101 dense 2880 128 22.500000 2.545452 0.037564 0.063275 53
102 dense 2880 512 5.625000 1.814614 0.038576 2.657049 139 over-trained
103 dense 4096 2880 1.422222 2.125475 0.053168 3.075680 416
104 dense 2880 512 5.625000 2.697242 0.047467 7.617617 294
105 dense 4096 2880 1.422222 2.992130 0.085815 13.756604 396
106 dense 2880 512 5.625000 2.883635 0.020175 6.854456 201
107 dense 2880 128 22.500000 2.866494 0.037087 0.208354 48
108 dense 4096 2880 1.422222 2.532174 0.023105 13.465440 399
109 dense 4096 2880 1.422222 2.122508 0.030785 3.689537 351
110 dense 2880 512 5.625000 2.484014 0.069946 6.616925 25
111 dense 4096 2880 1.422222 3.368990 0.032658 16.133272 179
112 dense 2880 128 22.500000 3.087381 0.042853 0.218905 39
113 dense 2880 512 5.625000 2.143398 0.030824 2.279270 41
114 dense 4096 2880 1.422222 2.311490 0.039692 4.496668 233
115 dense 2880 512 5.625000 2.662200 0.030442 7.436658 76
116 dense 2880 512 5.625000 3.683417 0.057791 7.629082 79
117 dense 4096 2880 1.422222 2.173600 0.041741 3.583650 329
118 dense 2880 128 22.500000 3.369590 0.051162 0.239795 26
119 dense 4096 2880 1.422222 2.879379 0.031102 15.316405 343
120 dense 2880 512 5.625000 1.613783 0.072539 3.469254 155 over-trained
121 dense 2880 128 22.500000 3.518675 0.044204 -0.210370 30
122 dense 2880 512 5.625000 2.509762 0.026340 0.615255 119
123 dense 4096 2880 1.422222 2.830377 0.034624 14.718370 275
124 dense 4096 2880 1.422222 3.256459 0.040774 6.883906 29
125 dense 2880 512 5.625000 5.711049 0.048416 10.157567 64
126 dense 2880 512 5.625000 2.324256 0.064240 5.357340 220
127 dense 4096 2880 1.422222 2.968097 0.083546 16.346735 459
128 dense 4096 2880 1.422222 2.315388 0.031141 3.973542 232
129 dense 2880 128 22.500000 3.616528 0.059002 0.137758 35
130 dense 2880 512 5.625000 1.388835 0.072804 3.257407 282 over-trained
131 dense 4096 2880 1.422222 3.766056 0.017121 18.802509 213
132 dense 2880 128 22.500000 3.442542 0.035075 0.138813 38
133 dense 2880 512 5.625000 4.483635 0.093100 8.235314 205
134 dense 2880 512 5.625000 2.436476 0.045751 -0.239513 100
135 dense 4096 2880 1.422222 2.487972 0.041338 5.256039 225
136 dense 2880 512 5.625000 2.165771 0.082463 4.593466 292
137 dense 2880 512 5.625000 1.576554 0.034587 2.678060 191 over-trained
138 dense 4096 2880 1.422222 2.114761 0.040321 3.577608 500
139 dense 4096 2880 1.422222 5.016062 0.050016 27.858958 173
140 dense 2880 128 22.500000 3.614442 0.039833 0.491034 33
141 dense 4096 2880 1.422222 3.498729 0.032285 18.983121 148
142 dense 2880 512 5.625000 3.417551 0.035932 7.025308 200
143 dense 4096 2880 1.422222 2.819032 0.030818 5.451648 122
144 dense 2880 512 5.625000 2.788916 0.031368 -0.057381 63
145 dense 2880 128 22.500000 3.838345 0.045362 0.190672 27
146 dense 4096 2880 1.422222 5.714257 0.080708 31.890082 177
147 dense 2880 128 22.500000 3.933552 0.056198 0.698021 22
148 dense 2880 512 5.625000 1.806956 0.062571 4.425388 367 over-trained
149 dense 4096 2880 1.422222 1.957556 0.024243 3.788884 553 over-trained
150 dense 2880 512 5.625000 1.461017 0.047780 2.718899 200 over-trained
151 dense 2880 512 5.625000 2.640404 0.038308 2.630096 69
152 dense 2880 512 5.625000 4.585972 0.022900 9.789456 125
153 dense 4096 2880 1.422222 4.414175 0.026949 22.760216 196
154 dense 2880 128 22.500000 3.565906 0.045476 0.460714 33
155 dense 4096 2880 1.422222 2.328391 0.043005 5.133580 280
156 dense 2880 128 22.500000 3.918509 0.050094 0.505303 23
157 dense 2880 512 5.625000 1.470758 0.060527 2.213420 223 over-trained
158 dense 2880 512 5.625000 2.892210 0.058664 7.116892 316
159 dense 4096 2880 1.422222 1.829911 0.116658 10.043039 1077 over-trained
160 dense 4096 2880 1.422222 1.990689 0.030991 3.734381 533 over-trained
161 dense 2880 128 22.500000 3.894220 0.030175 0.363120 32
162 dense 2880 512 5.625000 5.890095 0.028133 12.039012 75
163 dense 2880 512 5.625000 2.993145 0.025516 -0.924741 57
164 dense 4096 2880 1.422222 2.559268 0.034801 5.722252 136
165 dense 4096 2880 1.422222 4.489798 0.041447 23.385952 167
166 dense 4096 2880 1.422222 4.785425 0.081182 27.544275 141
167 dense 2880 512 5.625000 2.613914 0.060290 2.031349 26
168 dense 4096 2880 1.422222 1.993876 0.031229 4.207472 594 over-trained
169 dense 2880 128 22.500000 3.040466 0.029907 0.236922 47
170 dense 2880 512 5.625000 4.707666 0.082566 12.012274 176
171 dense 2880 128 22.500000 2.820777 0.047052 0.012583 43
172 dense 4096 2880 1.422222 4.240654 0.028765 23.181634 203
173 dense 2880 512 5.625000 2.122419 0.029837 -0.082330 113
174 dense 2880 512 5.625000 5.030785 0.106522 10.108551 239
175 dense 4096 2880 1.422222 2.535780 0.019578 5.508312 109
176 dense 4096 2880 1.422222 2.513204 0.079145 14.295196 17
177 dense 2880 128 22.500000 2.302707 0.028471 0.542856 53
178 dense 2880 512 5.625000 4.940892 0.100657 11.684000 109
179 dense 2880 512 5.625000 1.874893 0.027702 2.318483 123 over-trained
180 dense 4096 2880 1.422222 2.121072 0.024997 4.195084 597
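
Two of the columns above can be recomputed from the others. In every row, Q equals N / M, and the warning labels are consistent with alpha below 2 being marked over-trained and alpha above 6 being marked under-trained. The Python sketch below reproduces both columns on that basis. The function names `aspect_ratio` and `warning_for` and the two threshold constants are hypothetical: they are inferred from the rows in this table, not taken from the tool that produced it.

```python
# Thresholds inferred from the rows in this table (an assumption, not a documented setting).
OVER_TRAINED_BELOW = 2.0
UNDER_TRAINED_ABOVE = 6.0


def aspect_ratio(n: int, m: int) -> float:
    """Return Q = N / M, rounded to the six decimals shown in the table."""
    return round(n / m, 6)


def warning_for(alpha: float) -> str:
    """Return the warning label implied by alpha, or an empty string if none applies."""
    if alpha < OVER_TRAINED_BELOW:
        return "over-trained"
    if alpha > UNDER_TRAINED_ABOVE:
        return "under-trained"
    return ""


# A few rows copied from the table above: (id, N, M, alpha).
rows = [
    (1, 4096, 2880, 2.208469),
    (5, 2880, 128, 3.852214),
    (11, 2880, 512, 7.144880),
    (47, 2880, 512, 1.883901),
]

for layer_id, n, m, alpha in rows:
    print(layer_id, aspect_ratio(n, m), warning_for(alpha) or "-")
```

The boundaries are only as precise as the data allows: the largest unflagged alpha in the table is 5.890095 and the smallest flagged as under-trained is 7.144880, so any cutoff between those values would fit these rows equally well.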