Falcon3-1B-Instruct


[Plots: Falcon3-1B-Instruct Model Set]

[Plots: Falcon Compared to Base Model]

Falcon3-1B-Instruct Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 8192 2048 4.0 3.489293 0.032072 -0.812385 132
2 dense 8192 2048 4.0 3.130161 0.032227 -1.440753 76
3 dense 8192 2048 4.0 2.223478 0.058688 -0.974162 603
4 dense 2048 1024 2.0 1.482273 0.011010 0.507154 336 over-trained
5 dense 2048 2048 1.0 2.051005 0.043427 -1.329527 98
6 dense 2048 2048 1.0 1.663247 0.027677 1.171422 68 over-trained
7 dense 2048 1024 2.0 1.980521 0.028978 -1.781271 133 over-trained
8 dense 8192 2048 4.0 3.668017 0.016932 -2.579155 103
9 dense 8192 2048 4.0 3.280774 0.033765 -1.319149 63
10 dense 8192 2048 4.0 3.243105 0.036650 -1.225822 65
11 dense 2048 1024 2.0 2.003319 0.017717 -0.346262 196
12 dense 2048 2048 1.0 2.591737 0.024143 -1.024270 116
13 dense 2048 2048 1.0 2.178383 0.018554 1.335651 157
14 dense 2048 1024 2.0 2.451120 0.023025 -1.577329 113
15 dense 8192 2048 4.0 3.014527 0.021388 -1.257848 92
16 dense 8192 2048 4.0 3.551696 0.023153 -2.557240 59
17 dense 2048 1024 2.0 2.470302 0.023207 -1.935828 80
18 dense 8192 2048 4.0 2.975661 0.024453 -1.109379 112
19 dense 2048 2048 1.0 2.625514 0.021937 -1.090440 78
20 dense 2048 2048 1.0 2.300276 0.018771 -0.556302 124
21 dense 2048 1024 2.0 2.135625 0.028392 -1.679376 134
22 dense 8192 2048 4.0 3.371499 0.023635 -2.215697 107
23 dense 8192 2048 4.0 3.039844 0.022085 -1.351387 73
24 dense 8192 2048 4.0 2.865433 0.017385 -1.008670 228
25 dense 2048 1024 2.0 2.088283 0.015148 -0.758482 140
26 dense 2048 2048 1.0 2.460525 0.031570 -1.461714 149
27 dense 2048 2048 1.0 2.188794 0.021775 0.316035 149
28 dense 2048 1024 2.0 2.612667 0.022953 -2.341542 72
29 dense 8192 2048 4.0 3.459546 0.022592 -2.375554 93
30 dense 8192 2048 4.0 2.839043 0.016181 -0.994345 270
31 dense 2048 2048 1.0 2.380009 0.031865 -1.222982 79
32 dense 2048 2048 1.0 2.586758 0.041932 -2.513072 78
33 dense 2048 1024 2.0 2.219217 0.030205 -2.500586 113
34 dense 2048 1024 2.0 2.373355 0.050224 -2.892618 131
35 dense 8192 2048 4.0 2.906797 0.017138 -1.425530 100
36 dense 8192 2048 4.0 2.802553 0.021651 -1.350255 165
37 dense 8192 2048 4.0 3.614085 0.030995 -2.636473 68
38 dense 8192 2048 4.0 2.834389 0.013288 -1.028334 231
39 dense 2048 2048 1.0 3.053924 0.034563 -2.355871 29
40 dense 2048 1024 2.0 2.264131 0.020320 -1.862723 103
41 dense 2048 2048 1.0 2.340481 0.021245 -0.516897 105
42 dense 2048 1024 2.0 2.748535 0.028671 -3.066923 74
43 dense 8192 2048 4.0 3.668876 0.025181 -2.647391 64
44 dense 8192 2048 4.0 2.787916 0.009976 -0.847365 246
45 dense 8192 2048 4.0 2.836173 0.017531 -1.191716 120
46 dense 2048 1024 2.0 2.210403 0.028534 -2.143417 112
47 dense 2048 2048 1.0 2.269968 0.022872 -0.896259 131
48 dense 2048 1024 2.0 2.844284 0.036775 -3.561300 55
49 dense 2048 2048 1.0 3.070602 0.029268 -3.377628 66
50 dense 8192 2048 4.0 2.864960 0.018168 -1.164414 88
51 dense 2048 2048 1.0 3.174018 0.038050 -2.669912 41
52 dense 2048 1024 2.0 2.265923 0.026909 -1.529223 102
53 dense 8192 2048 4.0 3.897280 0.035382 -2.643913 32
54 dense 8192 2048 4.0 2.757373 0.009390 -0.880037 191
55 dense 2048 1024 2.0 3.017574 0.029086 -3.148844 47
56 dense 2048 2048 1.0 2.337794 0.026157 -0.252198 119
57 dense 2048 1024 2.0 2.310586 0.031298 -1.671560 96
58 dense 8192 2048 4.0 2.670759 0.023889 -1.043953 182
59 dense 2048 2048 1.0 2.344505 0.021798 -0.023997 110
60 dense 2048 2048 1.0 3.215266 0.031191 -3.333609 63
61 dense 8192 2048 4.0 2.666839 0.015557 -0.812617 205
62 dense 8192 2048 4.0 3.568591 0.028363 -2.593125 62
63 dense 2048 1024 2.0 3.391961 0.044870 -4.336812 30
64 dense 8192 2048 4.0 2.799842 0.017844 -1.175442 128
65 dense 2048 2048 1.0 2.316326 0.038234 -1.412222 128
66 dense 2048 1024 2.0 2.798173 0.043192 -3.879859 93
67 dense 8192 2048 4.0 2.661209 0.017933 -0.838413 238
68 dense 2048 2048 1.0 3.022624 0.047039 -3.145906 58
69 dense 2048 1024 2.0 2.325655 0.049209 -2.517078 102
70 dense 8192 2048 4.0 3.616098 0.036860 -3.003109 82
71 dense 8192 2048 4.0 2.715827 0.015572 -0.776075 151
72 dense 8192 2048 4.0 3.666509 0.038117 -2.944534 80
73 dense 2048 2048 1.0 2.478645 0.045312 -1.816308 106
74 dense 2048 2048 1.0 3.373780 0.038165 -3.588007 44
75 dense 2048 1024 2.0 2.537653 0.042877 -2.871057 77
76 dense 2048 1024 2.0 3.102864 0.044591 -4.451550 57
77 dense 8192 2048 4.0 2.820106 0.017934 -1.111667 117
78 dense 8192 2048 4.0 2.838416 0.018466 -1.129601 111
79 dense 8192 2048 4.0 3.778270 0.039447 -2.824307 73
80 dense 8192 2048 4.0 2.711902 0.017170 -0.893758 188
81 dense 2048 1024 2.0 3.079267 0.047355 -3.488924 72
82 dense 2048 2048 1.0 2.455902 0.043652 -1.934491 116
83 dense 2048 2048 1.0 2.314263 0.091921 -1.819558 312
84 dense 2048 1024 2.0 2.713508 0.051266 -3.408110 56
85 dense 2048 1024 2.0 2.393062 0.040832 -2.092054 106
86 dense 8192 2048 4.0 2.904779 0.017441 -1.279888 128
87 dense 8192 2048 4.0 2.818354 0.014731 -0.940884 144
88 dense 8192 2048 4.0 4.024854 0.031537 -3.249632 61
89 dense 2048 2048 1.0 2.684262 0.032345 -0.226622 37
90 dense 2048 1024 2.0 3.553054 0.050787 -5.267314 43
91 dense 2048 2048 1.0 3.448691 0.046159 -3.877489 43
92 dense 2048 2048 1.0 2.658421 0.097048 -3.255634 235
93 dense 8192 2048 4.0 4.244179 0.043521 -3.622770 68
94 dense 2048 1024 2.0 3.106375 0.048903 -4.138615 42
95 dense 8192 2048 4.0 2.945850 0.019825 -1.230928 166
96 dense 8192 2048 4.0 2.893838 0.012470 -1.036074 179
97 dense 2048 1024 2.0 3.121291 0.049409 -4.517219 87
98 dense 2048 2048 1.0 2.942613 0.035053 -2.142299 42
99 dense 2048 2048 1.0 2.559329 0.033991 -1.134863 68
100 dense 2048 2048 1.0 3.377485 0.036613 -3.803176 63
101 dense 8192 2048 4.0 2.935270 0.020803 -0.854829 92
102 dense 8192 2048 4.0 4.219069 0.029641 -3.498181 88
103 dense 2048 1024 2.0 2.528141 0.037252 -1.797565 103
104 dense 8192 2048 4.0 3.037404 0.026433 -1.031843 84
105 dense 2048 1024 2.0 2.972174 0.042444 -4.044705 80
106 dense 2048 1024 2.0 2.407638 0.042330 -2.784160 124
107 dense 2048 2048 1.0 2.536674 0.020094 -2.058341 108
108 dense 2048 2048 1.0 2.320524 0.046963 -1.813376 149
109 dense 8192 2048 4.0 2.626809 0.019732 -0.226061 211
110 dense 8192 2048 4.0 2.518449 0.019723 -0.096235 283
111 dense 8192 2048 4.0 4.086552 0.044553 -3.304955 48
112 dense 2048 1024 2.0 2.330171 0.064135 -2.720169 133
113 dense 2048 1024 2.0 2.991907 0.041439 -2.625975 30
114 dense 2048 1024 2.0 2.490948 0.024802 -2.472037 110
115 dense 2048 2048 1.0 2.315854 0.037086 -1.220598 178
116 dense 2048 2048 1.0 2.784694 0.025362 -2.266580 74
117 dense 8192 2048 4.0 2.485242 0.015450 -0.093914 327
118 dense 8192 2048 4.0 2.597477 0.013269 -0.300353 309
119 dense 8192 2048 4.0 3.775657 0.033342 -2.550453 48
120 dense 8192 2048 4.0 2.972630 0.020799 -0.840243 125
121 dense 8192 2048 4.0 2.792068 0.017644 -0.451161 239
122 dense 8192 2048 4.0 2.660383 0.009028 -0.157859 234
123 dense 2048 2048 1.0 2.214264 0.037291 -1.059473 138
124 dense 2048 1024 2.0 2.414389 0.031903 -2.383323 129
125 dense 2048 1024 2.0 2.164207 0.035027 -1.802935 122
126 dense 2048 2048 1.0 2.576801 0.022676 -1.727957 81
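
Two columns of the table above can be re-derived from the others, which is a useful sanity check when reading it. In every row, Q equals N / M, and the "over-trained" warning appears only on rows whose alpha is below 2. The sketch below reproduces both from a few rows copied from the table. The names `LayerRow`, `aspect_ratio`, and `warning_for` are hypothetical, and the upper threshold of 6.0 for an "under-trained" label is an assumption: no row in this table carries that warning, so it cannot be confirmed from the data shown here.

```python
from dataclasses import dataclass


@dataclass
class LayerRow:
    """One row of the details table (only the columns used here)."""
    layer_id: int
    n: int        # larger weight-matrix dimension (column N)
    m: int        # smaller weight-matrix dimension (column M)
    alpha: float  # fitted power-law exponent (column alpha)


def aspect_ratio(row: LayerRow) -> float:
    """Column Q in the table: N divided by M."""
    return row.n / row.m


def warning_for(alpha: float, low: float = 2.0, high: float = 6.0) -> str:
    """Return the warning label for a layer.

    low=2.0 matches the rows flagged in the table above.
    high=6.0 is an ASSUMPTION; the table has no row that would exercise it.
    """
    if alpha < low:
        return "over-trained"
    if alpha > high:
        return "under-trained"
    return ""


# A few rows copied verbatim from the table (ids 1, 4, 6, 7).
rows = [
    LayerRow(1, 8192, 2048, 3.489293),
    LayerRow(4, 2048, 1024, 1.482273),
    LayerRow(6, 2048, 2048, 1.663247),
    LayerRow(7, 2048, 1024, 1.980521),
]

for row in rows:
    print(row.layer_id, aspect_ratio(row), warning_for(row.alpha))
```

Rows 4, 6, and 7 are the only layers in the table with alpha below 2, and they are the only ones carrying a warning, which is consistent with the threshold used in this sketch.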