Falcon3-7B-Instruct




Falcon3-7B-Instruct Model Set Plots


Falcon Compared to Base Model Plots



Falcon3-7B-Instruct Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 23040 3072 7.5 4.174017 0.023438 -3.135534 68
2 dense 23040 3072 7.5 2.271795 0.026630 -0.811024 766
3 dense 23040 3072 7.5 2.254143 0.026522 -0.983154 821
4 dense 3072 1024 3.0 1.587088 0.025671 -0.653193 161 over-trained
5 dense 3072 3072 1.0 2.155970 0.029129 -1.307895 84
6 dense 3072 3072 1.0 1.739562 0.042144 0.676740 166 over-trained
7 dense 3072 1024 3.0 2.207940 0.018180 -2.506704 178
8 dense 23040 3072 7.5 3.906160 0.019248 -3.397715 111
9 dense 23040 3072 7.5 2.529267 0.019514 -0.606330 652
10 dense 23040 3072 7.5 2.509093 0.027892 -0.729978 592
11 dense 3072 1024 3.0 2.288377 0.017010 -2.851818 191
12 dense 3072 3072 1.0 2.672867 0.014934 -2.152222 126
13 dense 3072 3072 1.0 2.318969 0.019391 -2.220481 182
14 dense 3072 1024 3.0 2.647608 0.023578 -3.367785 127
15 dense 23040 3072 7.5 3.470335 0.019703 -3.062827 261
16 dense 23040 3072 7.5 2.913978 0.008350 -0.644229 514
17 dense 23040 3072 7.5 2.891341 0.010120 -0.736033 440
18 dense 3072 1024 3.0 2.327500 0.019636 -3.173374 143
19 dense 3072 3072 1.0 2.742032 0.021857 -2.934820 124
20 dense 3072 3072 1.0 2.415911 0.018332 -2.740000 224
21 dense 3072 1024 3.0 2.815827 0.018529 -3.990481 106
22 dense 23040 3072 7.5 3.132290 0.029020 -2.493414 351
23 dense 23040 3072 7.5 2.974510 0.011369 -0.634005 535
24 dense 23040 3072 7.5 2.901764 0.020637 -1.026243 244
25 dense 3072 1024 3.0 2.320164 0.014699 -3.185093 159
26 dense 3072 3072 1.0 2.739824 0.019056 -3.014442 105
27 dense 3072 3072 1.0 2.369681 0.013987 -2.613829 222
28 dense 3072 1024 3.0 2.916323 0.022168 -4.602250 87
29 dense 23040 3072 7.5 3.304891 0.026871 -2.560530 257
30 dense 23040 3072 7.5 3.112761 0.017084 -0.699413 675
31 dense 23040 3072 7.5 2.991348 0.011843 -1.119093 166
32 dense 3072 1024 3.0 2.349008 0.021247 -3.257460 161
33 dense 3072 3072 1.0 2.497096 0.024802 -2.655289 143
34 dense 3072 3072 1.0 2.398762 0.017270 -2.575684 192
35 dense 3072 1024 3.0 2.581255 0.032256 -3.772604 158
36 dense 23040 3072 7.5 3.438022 0.021280 -2.719194 229
37 dense 23040 3072 7.5 2.947362 0.016695 -0.499489 609
38 dense 23040 3072 7.5 2.889070 0.011448 -0.897865 250
39 dense 3072 1024 3.0 2.450700 0.015708 -3.326092 104
40 dense 3072 3072 1.0 2.628458 0.033059 -2.971118 198
41 dense 3072 3072 1.0 2.395984 0.014638 -2.554522 201
42 dense 3072 1024 3.0 2.926586 0.031806 -4.567773 91
43 dense 23040 3072 7.5 3.563258 0.023952 -2.792617 179
44 dense 23040 3072 7.5 2.863747 0.015130 -0.507359 654
45 dense 23040 3072 7.5 2.830498 0.007834 -0.748511 323
46 dense 3072 1024 3.0 2.373888 0.027557 -3.426003 120
47 dense 3072 3072 1.0 3.162339 0.031131 -3.828933 52
48 dense 3072 3072 1.0 2.371870 0.024968 -2.577733 200
49 dense 3072 1024 3.0 2.882061 0.033240 -4.697047 122
50 dense 23040 3072 7.5 3.356307 0.023788 -2.354015 257
51 dense 23040 3072 7.5 2.804828 0.008726 -0.368759 524
52 dense 23040 3072 7.5 2.807098 0.008054 -0.568682 223
53 dense 3072 1024 3.0 2.414898 0.029508 -3.570991 133
54 dense 3072 3072 1.0 3.117440 0.034346 -3.657194 50
55 dense 3072 3072 1.0 2.424314 0.029188 -2.682009 162
56 dense 3072 1024 3.0 2.879189 0.034398 -4.502551 101
57 dense 23040 3072 7.5 2.747935 0.006804 -0.249664 466
58 dense 23040 3072 7.5 2.747603 0.008138 -0.518566 222
59 dense 3072 1024 3.0 2.395177 0.019521 -3.109451 119
60 dense 3072 3072 1.0 3.090964 0.032505 -3.173722 61
61 dense 3072 3072 1.0 2.394673 0.018706 -2.457117 176
62 dense 3072 1024 3.0 2.871818 0.027307 -3.995425 96
63 dense 23040 3072 7.5 3.400641 0.016332 -2.630946 146
64 dense 23040 3072 7.5 3.254413 0.018080 -2.347022 221
65 dense 3072 1024 3.0 3.029581 0.028987 -5.107125 60
66 dense 3072 3072 1.0 2.271929 0.027084 -2.229889 284
67 dense 3072 1024 3.0 2.446193 0.032393 -3.529169 100
68 dense 3072 3072 1.0 2.716144 0.029099 -3.154742 125
69 dense 23040 3072 7.5 2.697521 0.007627 -0.410216 314
70 dense 23040 3072 7.5 2.710332 0.007377 -0.212798 519
71 dense 23040 3072 7.5 3.413119 0.020250 -2.571597 165
72 dense 23040 3072 7.5 2.726834 0.009276 -0.206581 517
73 dense 23040 3072 7.5 2.777696 0.007970 -0.495351 215
74 dense 3072 1024 3.0 2.369816 0.024777 -3.535152 134
75 dense 3072 3072 1.0 2.995445 0.028938 -3.596238 69
76 dense 3072 3072 1.0 2.350342 0.024983 -2.284440 184
77 dense 3072 1024 3.0 2.955237 0.029213 -5.006589 102
78 dense 23040 3072 7.5 3.398581 0.023331 -2.407848 166
79 dense 23040 3072 7.5 2.672661 0.010185 -0.060189 566
80 dense 23040 3072 7.5 2.708447 0.006926 -0.268339 306
81 dense 3072 1024 3.0 2.513248 0.026466 -3.760347 128
82 dense 3072 3072 1.0 3.231811 0.033170 -4.184159 63
83 dense 3072 3072 1.0 2.449472 0.030218 -2.457145 159
84 dense 3072 1024 3.0 2.983784 0.037205 -5.089857 105
85 dense 23040 3072 7.5 3.310742 0.027319 -2.520331 265
86 dense 23040 3072 7.5 2.680820 0.008170 -0.095476 480
87 dense 23040 3072 7.5 2.755551 0.007406 -0.375437 252
88 dense 3072 1024 3.0 2.563220 0.029535 -3.856611 104
89 dense 3072 3072 1.0 3.189246 0.033993 -3.999008 64
90 dense 3072 3072 1.0 2.518013 0.025845 -2.547874 138
91 dense 3072 1024 3.0 3.186499 0.036871 -5.495749 70
92 dense 23040 3072 7.5 3.482322 0.034251 -2.855088 176
93 dense 23040 3072 7.5 2.668889 0.007577 -0.133254 496
94 dense 23040 3072 7.5 2.726631 0.008728 -0.384168 243
95 dense 3072 1024 3.0 2.384694 0.024958 -3.095738 110
96 dense 3072 3072 1.0 3.048348 0.023833 -3.728746 98
97 dense 3072 3072 1.0 2.407142 0.026332 -2.340327 167
98 dense 3072 1024 3.0 3.016200 0.032263 -4.388074 60
99 dense 23040 3072 7.5 3.806767 0.031245 -3.207701 77
100 dense 23040 3072 7.5 2.665799 0.007423 -0.247742 449
101 dense 23040 3072 7.5 2.742528 0.008538 -0.565825 240
102 dense 3072 1024 3.0 2.387648 0.030237 -3.605303 196
103 dense 3072 3072 1.0 2.187215 0.083924 -2.795916 490
104 dense 3072 3072 1.0 2.443910 0.032400 -2.631134 219
105 dense 3072 1024 3.0 2.938536 0.027664 -4.772885 88
106 dense 23040 3072 7.5 3.783000 0.032568 -3.204585 120
107 dense 23040 3072 7.5 2.672636 0.006746 -0.238458 423
108 dense 23040 3072 7.5 2.755878 0.009902 -0.480469 216
109 dense 3072 1024 3.0 2.568390 0.037353 -3.749986 115
110 dense 3072 3072 1.0 2.565766 0.086937 -3.477770 323
111 dense 3072 3072 1.0 2.447806 0.028438 -2.536266 200
112 dense 3072 1024 3.0 3.144754 0.047264 -5.474131 116
113 dense 23040 3072 7.5 3.761859 0.032339 -3.136975 129
114 dense 23040 3072 7.5 2.724392 0.006677 -0.400714 388
115 dense 23040 3072 7.5 2.844873 0.011571 -0.705644 177
116 dense 3072 1024 3.0 2.776015 0.034033 -4.190903 68
117 dense 3072 3072 1.0 3.610579 0.041822 -4.808700 50
118 dense 3072 3072 1.0 2.555529 0.030577 -2.774341 144
119 dense 3072 1024 3.0 3.251877 0.034877 -5.567966 88
120 dense 23040 3072 7.5 3.451303 0.038408 -2.666789 191
121 dense 23040 3072 7.5 2.737719 0.008034 -0.401425 375
122 dense 23040 3072 7.5 2.825285 0.010934 -0.647031 220
123 dense 3072 1024 3.0 2.646795 0.037261 -4.080244 79
124 dense 3072 3072 1.0 3.303528 0.039343 -4.321020 66
125 dense 3072 3072 1.0 2.478474 0.037382 -2.360231 134
126 dense 3072 1024 3.0 3.330945 0.047604 -6.039403 94
127 dense 23040 3072 7.5 4.056938 0.024033 -3.149658 77
128 dense 23040 3072 7.5 2.803533 0.009366 -0.502429 338
129 dense 3072 3072 1.0 2.521085 0.037062 -2.758677 139
130 dense 3072 3072 1.0 3.473057 0.035936 -4.400398 52
131 dense 3072 1024 3.0 2.784000 0.034688 -4.194098 60
132 dense 3072 1024 3.0 3.473498 0.046622 -6.079702 70
133 dense 23040 3072 7.5 2.904126 0.011504 -0.878043 238
134 dense 3072 3072 1.0 2.571683 0.038435 -2.628883 135
135 dense 3072 1024 3.0 2.806685 0.041709 -4.317839 105
136 dense 3072 3072 1.0 2.296094 0.090051 -3.349680 428
137 dense 3072 1024 3.0 2.934223 0.076996 -5.434597 191
138 dense 23040 3072 7.5 4.364860 0.037592 -3.543730 58
139 dense 23040 3072 7.5 2.810325 0.008524 -0.508219 337
140 dense 23040 3072 7.5 2.922486 0.010130 -0.819943 240
141 dense 23040 3072 7.5 4.293302 0.025738 -3.789586 91
142 dense 23040 3072 7.5 2.731193 0.014176 -0.301772 319
143 dense 23040 3072 7.5 2.879842 0.015465 -0.749787 221
144 dense 3072 1024 3.0 2.443354 0.027942 -2.630185 112
145 dense 3072 3072 1.0 3.062594 0.026657 -3.552818 67
146 dense 3072 3072 1.0 2.457293 0.032290 -2.483135 162
147 dense 3072 1024 3.0 3.100743 0.040200 -5.254985 66
148 dense 23040 3072 7.5 4.344294 0.019330 -3.953906 108
149 dense 23040 3072 7.5 2.668722 0.011851 -0.351940 388
150 dense 23040 3072 7.5 2.804549 0.015064 -0.716921 300
151 dense 3072 1024 3.0 2.702124 0.026513 -4.002981 122
152 dense 3072 3072 1.0 3.768076 0.023483 -4.598893 54
153 dense 3072 3072 1.0 2.576803 0.027524 -2.743284 131
154 dense 3072 1024 3.0 2.960824 0.036856 -4.814428 121
155 dense 23040 3072 7.5 4.202515 0.015822 -3.695911 126
156 dense 23040 3072 7.5 2.579073 0.008251 -0.176049 510
157 dense 23040 3072 7.5 2.685727 0.008692 -0.431078 404
158 dense 3072 1024 3.0 2.491168 0.043802 -3.586987 105
159 dense 3072 3072 1.0 2.717345 0.028730 -2.894614 112
160 dense 3072 3072 1.0 2.400712 0.027411 -2.176524 149
161 dense 3072 1024 3.0 2.565140 0.043732 -3.780053 131
162 dense 23040 3072 7.5 3.655302 0.018029 -3.222799 255
163 dense 23040 3072 7.5 2.531756 0.007302 -0.072598 572
164 dense 23040 3072 7.5 2.671837 0.009374 -0.322887 399
165 dense 3072 1024 3.0 2.799528 0.036119 -4.423668 96
166 dense 3072 3072 1.0 2.406399 0.082638 -3.320964 452
167 dense 3072 3072 1.0 2.421174 0.026916 -2.320707 214
168 dense 3072 1024 3.0 2.738491 0.026646 -4.021626 135
169 dense 23040 3072 7.5 3.489867 0.016248 -3.066115 269
170 dense 23040 3072 7.5 2.539542 0.011354 0.014746 837
171 dense 23040 3072 7.5 2.649999 0.009733 -0.212434 686
172 dense 3072 1024 3.0 2.789053 0.031524 -4.190909 68
173 dense 3072 3072 1.0 2.199779 0.080162 -2.812306 642
174 dense 3072 3072 1.0 2.359235 0.020748 -2.148905 212
175 dense 3072 1024 3.0 2.621690 0.017263 -3.631760 166
176 dense 23040 3072 7.5 3.510605 0.018681 -1.796717 189
177 dense 23040 3072 7.5 2.547523 0.015771 0.028751 973
178 dense 23040 3072 7.5 2.644229 0.015290 -0.147519 824
179 dense 3072 1024 3.0 2.565782 0.022608 -3.880491 107
180 dense 3072 3072 1.0 2.865941 0.020451 -3.070139 160
181 dense 3072 3072 1.0 2.390469 0.020128 -2.350409 189
182 dense 3072 1024 3.0 2.628667 0.017313 -3.730873 140
183 dense 23040 3072 7.5 3.382078 0.029066 0.197756 194
184 dense 23040 3072 7.5 2.434046 0.014110 0.167662 180
185 dense 23040 3072 7.5 2.557667 0.012111 0.099665 195
186 dense 3072 1024 3.0 2.576955 0.026210 -3.939957 143
187 dense 3072 3072 1.0 2.863272 0.014062 -2.299233 135
188 dense 3072 3072 1.0 2.402167 0.022433 -2.319328 231
189 dense 3072 1024 3.0 2.669187 0.015417 -3.582698 172
190 dense 23040 3072 7.5 3.077685 0.026591 0.852336 240
191 dense 23040 3072 7.5 2.699097 0.011718 0.697797 733
192 dense 23040 3072 7.5 2.759390 0.011619 1.019190 624
193 dense 3072 1024 3.0 2.485417 0.026919 -3.318720 78
194 dense 3072 3072 1.0 2.430920 0.013426 -0.147941 182
195 dense 3072 3072 1.0 2.295837 0.024090 -1.635690 217
196 dense 3072 1024 3.0 2.347496 0.020125 -1.483108 244
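
The rows above are whitespace-separated, with an optional trailing warning field, so they can be checked mechanically. The sketch below is only an illustration: it parses a few rows copied from the table, confirms that Q matches N / M, and tests whether the "over-trained" warning coincides with alpha < 2. That threshold is an assumption inferred from the two flagged rows (ids 4 and 6), not something this page states. The names LayerRow, parse_row, and check_row are hypothetical helpers, not part of any library.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the "over-trained" warning corresponds to alpha < 2.
# This is inferred from the flagged rows in the table, not stated by the source.
ALPHA_OVERTRAINED_THRESHOLD = 2.0


@dataclass
class LayerRow:
    layer_id: int
    layer_type: str
    n: int            # larger weight-matrix dimension (column N)
    m: int            # smaller weight-matrix dimension (column M)
    q: float          # aspect ratio as reported (column Q)
    alpha: float
    d: float
    alpha_hat: float
    num_spikes: int
    warning: str      # empty string when the table shows no warning


def parse_row(line: str) -> LayerRow:
    """Parse one whitespace-separated table row (9 fields, or 10 with a warning)."""
    parts = line.split()
    if len(parts) not in (9, 10):
        raise ValueError(f"expected 9 or 10 fields, got {len(parts)}: {line!r}")
    warning = parts[9] if len(parts) == 10 else ""
    return LayerRow(
        int(parts[0]), parts[1], int(parts[2]), int(parts[3]),
        float(parts[4]), float(parts[5]), float(parts[6]),
        float(parts[7]), int(parts[8]), warning,
    )


def check_row(row: LayerRow) -> List[str]:
    """Return a list of inconsistencies found in a row (empty if none)."""
    problems = []
    if abs(row.n / row.m - row.q) > 1e-9:
        problems.append(f"layer {row.layer_id}: Q={row.q} but N/M={row.n / row.m}")
    expected = "over-trained" if row.alpha < ALPHA_OVERTRAINED_THRESHOLD else ""
    if row.warning != expected:
        problems.append(
            f"layer {row.layer_id}: warning {row.warning!r}, expected {expected!r}"
        )
    return problems


# A few rows copied verbatim from the table above.
SAMPLE_ROWS = [
    "1 dense 23040 3072 7.5 4.174017 0.023438 -3.135534 68",
    "4 dense 3072 1024 3.0 1.587088 0.025671 -0.653193 161 over-trained",
    "5 dense 3072 3072 1.0 2.155970 0.029129 -1.307895 84",
    "6 dense 3072 3072 1.0 1.739562 0.042144 0.676740 166 over-trained",
]

if __name__ == "__main__":
    for line in SAMPLE_ROWS:
        row = parse_row(line)
        issues = check_row(row)
        print(row.layer_id, "ok" if not issues else issues)
```

Run over the full table, the same check would show whether any layer falls under the assumed threshold without carrying a warning, or the reverse.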