Falcon3-3B-Instruct



Falcon3-3B-Instruct Model Set Plots


Falcon Compared to Base Model Plots



Falcon3-3B-Instruct Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 9216 3072 3.0 2.938126 0.082193 -3.436596 523
2 dense 9216 3072 3.0 2.309451 0.058064 -1.429594 840
3 dense 9216 3072 3.0 2.410196 0.040109 -1.457316 564
4 dense 3072 1024 3.0 1.588596 0.020792 -0.468300 176 over-trained
5 dense 3072 3072 1.0 1.954505 0.056821 -1.440538 158 over-trained
6 dense 3072 3072 1.0 1.854203 0.035021 0.726273 65 over-trained
7 dense 3072 1024 3.0 2.004117 0.041240 -2.328329 121
8 dense 9216 3072 3.0 4.227237 0.032752 -5.055814 64
9 dense 9216 3072 3.0 2.651094 0.052665 -1.563946 367
10 dense 9216 3072 3.0 2.498963 0.061378 -1.764972 494
11 dense 3072 1024 3.0 2.319071 0.018802 -2.816909 130
12 dense 3072 3072 1.0 2.886947 0.022015 -2.815254 94
13 dense 3072 3072 1.0 2.419355 0.029779 -2.254409 139
14 dense 3072 1024 3.0 3.008295 0.028770 -4.498747 81
15 dense 9216 3072 3.0 3.663077 0.034545 -4.282535 145
16 dense 9216 3072 3.0 3.429998 0.025740 -2.288995 61
17 dense 3072 1024 3.0 2.358057 0.021292 -3.126475 144
18 dense 3072 3072 1.0 2.941910 0.029390 -3.245582 90
19 dense 3072 3072 1.0 2.441468 0.024840 -2.742530 177
20 dense 3072 1024 3.0 2.967904 0.027030 -4.494571 101
21 dense 9216 3072 3.0 2.893190 0.023238 -1.667437 332
22 dense 9216 3072 3.0 3.750567 0.034225 -4.019865 77
23 dense 9216 3072 3.0 3.236622 0.018134 -2.324970 81
24 dense 3072 1024 3.0 2.263618 0.016397 -3.093636 168
25 dense 3072 3072 1.0 2.772729 0.024187 -3.062257 126
26 dense 9216 3072 3.0 3.024075 0.010755 -1.548225 253
27 dense 3072 1024 3.0 3.056853 0.024124 -5.125485 89
28 dense 3072 3072 1.0 2.313170 0.018402 -2.287754 190
29 dense 9216 3072 3.0 3.019818 0.010434 -1.591497 330
30 dense 9216 3072 3.0 3.435916 0.038018 -3.521992 193
31 dense 9216 3072 3.0 3.112450 0.022184 -2.210608 94
32 dense 3072 1024 3.0 2.271802 0.028703 -3.252398 180
33 dense 3072 3072 1.0 2.388231 0.026579 -2.424785 141
34 dense 3072 3072 1.0 2.576868 0.040021 -2.748058 111
35 dense 3072 1024 3.0 2.675270 0.038444 -4.054186 119
36 dense 9216 3072 3.0 3.824334 0.029718 -3.957784 85
37 dense 9216 3072 3.0 3.060734 0.017565 -2.122147 112
38 dense 9216 3072 3.0 2.990383 0.013146 -1.604823 318
39 dense 3072 3072 1.0 3.202821 0.036120 -3.588744 42
40 dense 3072 3072 1.0 2.389846 0.019226 -2.483363 158
41 dense 3072 1024 3.0 2.980705 0.025395 -4.606597 92
42 dense 3072 1024 3.0 2.399499 0.021603 -3.245128 113
43 dense 9216 3072 3.0 2.952679 0.010105 -1.596789 337
44 dense 9216 3072 3.0 3.057856 0.015194 -2.004796 125
45 dense 3072 1024 3.0 2.423857 0.028264 -3.406192 127
46 dense 9216 3072 3.0 3.933790 0.033771 -4.044607 84
47 dense 3072 3072 1.0 3.403715 0.030906 -4.166937 73
48 dense 3072 3072 1.0 2.487440 0.033125 -2.643288 116
49 dense 3072 1024 3.0 3.115074 0.024400 -4.758236 81
50 dense 3072 1024 3.0 2.471108 0.030707 -3.574269 115
51 dense 9216 3072 3.0 2.868903 0.006487 -1.420859 257
52 dense 3072 3072 1.0 3.348531 0.032251 -4.098917 62
53 dense 9216 3072 3.0 3.914412 0.039765 -3.951747 80
54 dense 9216 3072 3.0 3.059238 0.022192 -1.863063 103
55 dense 3072 3072 1.0 2.366250 0.035626 -2.451017 236
56 dense 3072 1024 3.0 3.098097 0.045455 -5.052725 83
57 dense 9216 3072 3.0 3.679297 0.023723 -3.773919 94
58 dense 3072 1024 3.0 2.549017 0.032403 -3.335108 69
59 dense 9216 3072 3.0 2.870830 0.014529 -1.628496 155
60 dense 3072 3072 1.0 2.435949 0.024875 -2.469726 153
61 dense 3072 3072 1.0 3.368100 0.031059 -3.792395 63
62 dense 9216 3072 3.0 2.782295 0.009154 -1.317523 235
63 dense 3072 1024 3.0 3.117297 0.044934 -4.739215 83
64 dense 9216 3072 3.0 3.758835 0.030800 -3.790649 86
65 dense 9216 3072 3.0 2.813476 0.008452 -1.219045 248
66 dense 3072 3072 1.0 2.954118 0.031883 -3.414964 109
67 dense 3072 1024 3.0 2.513954 0.033709 -3.392703 91
68 dense 3072 1024 3.0 3.386765 0.044182 -5.746891 47
69 dense 3072 3072 1.0 2.389296 0.032013 -2.150796 146
70 dense 9216 3072 3.0 2.942526 0.014622 -1.635195 134
71 dense 9216 3072 3.0 2.742309 0.008440 -1.075762 276
72 dense 9216 3072 3.0 3.662097 0.038979 -3.541024 112
73 dense 3072 3072 1.0 2.439609 0.039204 -2.332491 181
74 dense 3072 3072 1.0 3.180210 0.039996 -3.855463 81
75 dense 3072 1024 3.0 2.550970 0.035687 -3.673786 114
76 dense 3072 1024 3.0 2.904344 0.052898 -4.855945 132
77 dense 9216 3072 3.0 2.929163 0.019333 -1.516344 88
78 dense 9216 3072 3.0 2.905327 0.014075 -1.505855 121
79 dense 9216 3072 3.0 4.079590 0.036003 -3.827234 63
80 dense 9216 3072 3.0 2.734296 0.007729 -1.145931 262
81 dense 3072 1024 3.0 3.465132 0.047237 -5.705144 63
82 dense 3072 3072 1.0 2.484033 0.037354 -2.497343 155
83 dense 3072 3072 1.0 3.652394 0.036903 -4.583628 52
84 dense 3072 1024 3.0 2.579968 0.036520 -3.757553 110
85 dense 3072 1024 3.0 2.594158 0.034552 -3.773081 91
86 dense 9216 3072 3.0 2.884069 0.012415 -1.526944 193
87 dense 9216 3072 3.0 2.784688 0.008413 -1.195431 262
88 dense 9216 3072 3.0 4.178343 0.035397 -3.936133 64
89 dense 3072 3072 1.0 2.451472 0.037222 -2.364067 151
90 dense 3072 1024 3.0 3.724537 0.045846 -6.087421 47
91 dense 3072 3072 1.0 3.529390 0.038487 -4.227279 57
92 dense 9216 3072 3.0 3.001188 0.011770 -1.632439 151
93 dense 3072 3072 1.0 2.555782 0.094827 -3.374145 332
94 dense 3072 1024 3.0 3.139279 0.043306 -4.570784 36
95 dense 9216 3072 3.0 4.495818 0.042638 -4.427112 51
96 dense 9216 3072 3.0 2.847527 0.008986 -1.262917 223
97 dense 3072 1024 3.0 4.061307 0.047674 -6.978818 36
98 dense 3072 3072 1.0 2.671826 0.036724 -2.444219 101
99 dense 3072 1024 3.0 3.085210 0.036918 -4.408943 43
100 dense 3072 3072 1.0 2.064893 0.094283 -2.462988 601
101 dense 3072 3072 1.0 2.563492 0.032774 -2.245654 145
102 dense 9216 3072 3.0 4.356687 0.039623 -4.353011 82
103 dense 9216 3072 3.0 2.874102 0.008697 -1.306875 237
104 dense 9216 3072 3.0 3.059124 0.014878 -1.730022 143
105 dense 3072 1024 3.0 3.353293 0.060176 -5.494198 67
106 dense 3072 1024 3.0 3.554785 0.042782 -5.690843 64
107 dense 3072 3072 1.0 3.421921 0.033609 -3.786574 72
108 dense 3072 3072 1.0 2.470257 0.039332 -2.344517 155
109 dense 9216 3072 3.0 2.977617 0.012571 -1.559942 183
110 dense 9216 3072 3.0 2.858635 0.008717 -1.177227 260
111 dense 9216 3072 3.0 4.510955 0.036249 -4.586552 72
112 dense 3072 1024 3.0 2.720097 0.038871 -3.805965 71
113 dense 3072 3072 1.0 2.579717 0.031157 -2.422335 118
114 dense 3072 1024 3.0 3.479394 0.047637 -5.260274 58
115 dense 9216 3072 3.0 2.863037 0.017569 -1.100681 143
116 dense 3072 3072 1.0 4.086282 0.030572 -5.049111 49
117 dense 3072 1024 3.0 2.917247 0.032763 -4.280622 56
118 dense 9216 3072 3.0 2.979158 0.016325 -1.403283 135
119 dense 9216 3072 3.0 4.634343 0.033254 -4.991092 73
120 dense 9216 3072 3.0 4.605542 0.020005 -4.995464 79
121 dense 3072 1024 3.0 2.993928 0.031123 -4.520275 90
122 dense 9216 3072 3.0 2.728551 0.011220 -0.920564 260
123 dense 3072 1024 3.0 2.609464 0.024208 -3.084575 108
124 dense 3072 3072 1.0 2.466050 0.034821 -2.324092 198
125 dense 3072 3072 1.0 3.042681 0.024902 -3.165496 88
126 dense 9216 3072 3.0 2.602468 0.014854 -0.634151 286
127 dense 9216 3072 3.0 2.606818 0.012110 -0.770955 349
128 dense 3072 3072 1.0 2.576024 0.029026 -2.300523 134
129 dense 9216 3072 3.0 2.713352 0.011430 -0.980286 317
130 dense 3072 1024 3.0 2.983094 0.032607 -4.131338 86
131 dense 3072 3072 1.0 3.599315 0.023337 -4.042191 76
132 dense 3072 1024 3.0 2.850538 0.030090 -3.931380 69
133 dense 9216 3072 3.0 4.228554 0.036377 -4.473150 122
134 dense 9216 3072 3.0 3.612722 0.031589 -3.689490 166
135 dense 9216 3072 3.0 2.595085 0.008835 -0.616086 360
136 dense 9216 3072 3.0 2.712767 0.008540 -0.740544 248
137 dense 3072 1024 3.0 2.529270 0.046278 -3.428062 116
138 dense 3072 3072 1.0 2.699638 0.017059 -2.385504 119
139 dense 3072 3072 1.0 2.326658 0.032600 -1.814143 205
140 dense 3072 1024 3.0 2.566064 0.030427 -3.338534 127
141 dense 9216 3072 3.0 3.900630 0.027074 -2.923331 78
142 dense 9216 3072 3.0 2.564981 0.009531 -0.618597 408
143 dense 9216 3072 3.0 2.727109 0.009794 -0.954048 316
144 dense 3072 1024 3.0 2.681631 0.036646 -3.620317 95
145 dense 3072 3072 1.0 2.909569 0.022239 -2.599099 98
146 dense 3072 3072 1.0 2.432593 0.036793 -1.968423 199
147 dense 3072 1024 3.0 2.752660 0.020374 -3.328204 112
148 dense 9216 3072 3.0 3.199742 0.024138 -0.515372 163
149 dense 9216 3072 3.0 2.779285 0.007001 -0.660413 298
150 dense 9216 3072 3.0 2.959441 0.014547 -0.781393 267
151 dense 3072 1024 3.0 2.704498 0.033089 -3.696344 72
152 dense 3072 3072 1.0 2.620920 0.018747 -1.769866 131
153 dense 3072 3072 1.0 2.392206 0.029048 -1.769497 166
154 dense 3072 1024 3.0 2.560565 0.015794 -2.880018 133
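
Two patterns in the table above can be checked directly: Q equals N divided by M in every row, and the three rows flagged over-trained (ids 4, 5, 6) are the only rows with alpha below 2. The sketch below is a minimal illustration of those checks on a few rows copied from the table. The threshold of 2.0 is an assumption inferred from the flagged rows, not a rule stated in this page, and the tool that produced the table may apply other criteria. The names `parse_row`, `check_row`, and `ALPHA_THRESHOLD` are hypothetical and exist only for this example.

```python
# Rows copied verbatim from the table above (ids 1, 4, 5, 6, 7, 8).
ROWS = """\
1 dense 9216 3072 3.0 2.938126 0.082193 -3.436596 523
4 dense 3072 1024 3.0 1.588596 0.020792 -0.468300 176 over-trained
5 dense 3072 3072 1.0 1.954505 0.056821 -1.440538 158 over-trained
6 dense 3072 3072 1.0 1.854203 0.035021 0.726273 65 over-trained
7 dense 3072 1024 3.0 2.004117 0.041240 -2.328329 121
8 dense 9216 3072 3.0 4.227237 0.032752 -5.055814 64
"""

# Assumption: inferred from which rows carry the warning, not stated in the source.
ALPHA_THRESHOLD = 2.0


def parse_row(line):
    """Split one whitespace-delimited table row into a dict."""
    parts = line.split()
    return {
        "id": int(parts[0]),
        "layer_type": parts[1],
        "N": int(parts[2]),
        "M": int(parts[3]),
        "Q": float(parts[4]),
        "alpha": float(parts[5]),
        "D": float(parts[6]),
        "alpha_hat": float(parts[7]),
        "num_spikes": int(parts[8]),
        # The warning column is empty for most rows.
        "warning": parts[9] if len(parts) > 9 else "",
    }


def check_row(row):
    """Return (q_matches, warning_matches) for one parsed row."""
    q_matches = abs(row["Q"] - row["N"] / row["M"]) < 1e-9
    expected_warning = "over-trained" if row["alpha"] < ALPHA_THRESHOLD else ""
    return q_matches, row["warning"] == expected_warning


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
for row in rows:
    q_ok, warn_ok = check_row(row)
    print(
        row["id"],
        "Q ok" if q_ok else "Q mismatch",
        "warning ok" if warn_ok else "warning mismatch",
    )
```

To run the same checks on the full table, paste all 154 rows into `ROWS`. Any row that prints a mismatch would show where the assumed threshold does not explain the warning column.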