Llama-Guard-3-1B




Llama-Guard-3-1B Model Set Plots


Llama-Guard Compared to Base Model Plots



Llama-Guard-3-1B Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
1 dense 8192 2048 4.0 3.063091 0.013881 -6.555683 183
2 dense 8192 2048 4.0 2.960670 0.019702 -5.633894 220
3 dense 8192 2048 4.0 2.951743 0.013051 -5.631252 182
4 dense 2048 512 4.0 1.970617 0.046053 -3.337725 229 over-trained
5 dense 2048 2048 1.0 2.127588 0.017628 -3.856350 171
6 dense 2048 2048 1.0 1.958300 0.016543 -2.814505 476 over-trained
7 dense 2048 512 4.0 2.046598 0.033327 -4.680246 143
8 dense 2048 2048 1.0 2.039755 0.021846 -3.530673 378
9 dense 2048 2048 1.0 2.386313 0.013179 -4.540098 109
10 dense 2048 512 4.0 2.135515 0.038966 -4.500155 195
11 dense 2048 512 4.0 2.468680 0.027534 -6.355821 123
12 dense 8192 2048 4.0 2.952558 0.020302 -4.771293 247
13 dense 8192 2048 4.0 2.615609 0.023613 -5.432277 252
14 dense 8192 2048 4.0 2.818899 0.012687 -4.563860 209
15 dense 2048 512 4.0 2.400000 0.019103 -5.894289 117
16 dense 2048 2048 1.0 2.067225 0.023893 -3.979747 405
17 dense 2048 2048 1.0 2.307205 0.017525 -4.749407 128
18 dense 2048 512 4.0 2.208857 0.037051 -5.191809 177
19 dense 8192 2048 4.0 2.582433 0.013716 -4.168376 185
20 dense 8192 2048 4.0 2.790007 0.010338 -4.490462 253
21 dense 8192 2048 4.0 2.571060 0.028118 -5.269981 129
22 dense 2048 512 4.0 2.524495 0.021958 -6.650423 72
23 dense 8192 2048 4.0 2.189314 0.032789 -3.344319 346
24 dense 8192 2048 4.0 2.525482 0.015697 -3.720161 204
25 dense 8192 2048 4.0 2.275251 0.035752 -4.367534 217
26 dense 2048 2048 1.0 2.034487 0.018783 -3.673224 364
27 dense 2048 512 4.0 2.140218 0.027680 -5.171302 189
28 dense 2048 2048 1.0 2.251387 0.034548 -4.811740 116
29 dense 2048 512 4.0 2.519178 0.032186 -6.643337 34
30 dense 8192 2048 4.0 2.063143 0.030318 -3.802625 336
31 dense 8192 2048 4.0 2.293002 0.023438 -3.216395 323
32 dense 8192 2048 4.0 1.962320 0.030920 -3.008396 559 over-trained
33 dense 2048 512 4.0 2.078015 0.019959 -4.902627 193
34 dense 2048 2048 1.0 2.037525 0.041947 -4.277874 165
35 dense 2048 2048 1.0 1.959881 0.012842 -3.442555 380 over-trained
36 dense 8192 2048 4.0 1.987070 0.027353 -3.500036 324 over-trained
37 dense 8192 2048 4.0 2.191149 0.022562 -3.105966 303
38 dense 8192 2048 4.0 1.928201 0.028523 -2.930660 526 over-trained
39 dense 2048 512 4.0 2.068321 0.018060 -5.040571 180
40 dense 2048 2048 1.0 1.817777 0.043253 -3.919894 332 over-trained
41 dense 2048 512 4.0 2.217457 0.043509 -5.688210 74
42 dense 2048 2048 1.0 1.955386 0.014729 -3.455625 292 over-trained
43 dense 2048 512 4.0 1.953328 0.050194 -5.208535 187 over-trained
44 dense 8192 2048 4.0 2.084993 0.019833 -2.889196 443
45 dense 8192 2048 4.0 1.929710 0.023940 -2.874765 390 over-trained
46 dense 2048 512 4.0 2.056005 0.016978 -5.071654 171
47 dense 2048 2048 1.0 1.849498 0.039632 -3.932249 227 over-trained
48 dense 8192 2048 4.0 1.852397 0.024743 -3.094534 641 over-trained
49 dense 2048 2048 1.0 1.916079 0.013863 -3.263448 333 over-trained
50 dense 2048 512 4.0 2.039546 0.017635 -4.953906 191
51 dense 8192 2048 4.0 1.863291 0.017627 -2.541572 686 over-trained
52 dense 2048 2048 1.0 1.738410 0.026350 -3.400788 488 over-trained
53 dense 8192 2048 4.0 1.832503 0.022236 -2.827990 510 over-trained
54 dense 2048 512 4.0 1.910959 0.044072 -5.057718 164 over-trained
55 dense 2048 2048 1.0 1.904491 0.011077 -3.187308 417 over-trained
56 dense 8192 2048 4.0 2.042811 0.015364 -2.665780 460
57 dense 8192 2048 4.0 1.738271 0.013466 -2.702129 1271 over-trained
58 dense 8192 2048 4.0 1.809049 0.015231 -2.333807 988 over-trained
59 dense 2048 512 4.0 2.015010 0.017675 -4.807331 232
60 dense 2048 2048 1.0 1.681306 0.024705 -3.413387 536 over-trained
61 dense 2048 2048 1.0 1.911223 0.007252 -2.804352 427 over-trained
62 dense 2048 512 4.0 1.898604 0.043219 -5.175941 176 over-trained
63 dense 8192 2048 4.0 1.995272 0.014047 -2.417374 551 over-trained
64 dense 8192 2048 4.0 1.779237 0.010284 -2.753048 1138 over-trained
65 dense 8192 2048 4.0 1.931309 0.006027 -2.162215 796 over-trained
66 dense 8192 2048 4.0 1.808836 0.006202 -2.237733 1013 over-trained
67 dense 2048 512 4.0 1.980091 0.020940 -4.592732 241 over-trained
68 dense 2048 2048 1.0 1.730088 0.032033 -3.592165 494 over-trained
69 dense 2048 512 4.0 1.891348 0.047711 -5.049599 163 over-trained
70 dense 2048 2048 1.0 1.922437 0.013486 -2.566590 507 over-trained
71 dense 2048 512 4.0 1.895275 0.036781 -4.969357 139 over-trained
72 dense 2048 2048 1.0 1.695497 0.037414 -3.489436 607 over-trained
73 dense 2048 512 4.0 1.961862 0.011817 -4.399616 203 over-trained
74 dense 2048 2048 1.0 1.865661 0.012195 -2.818788 512 over-trained
75 dense 8192 2048 4.0 1.949044 0.007700 -2.191472 744 over-trained
76 dense 8192 2048 4.0 1.799416 0.019423 -2.438914 1052 over-trained
77 dense 8192 2048 4.0 1.826380 0.009239 -2.314603 889 over-trained
78 dense 8192 2048 4.0 1.859565 0.014149 -2.184872 816 over-trained
79 dense 8192 2048 4.0 1.928510 0.006573 -2.180686 829 over-trained
80 dense 8192 2048 4.0 1.834301 0.008139 -2.275265 962 over-trained
81 dense 2048 2048 1.0 1.732060 0.033799 -3.418087 579 over-trained
82 dense 2048 2048 1.0 1.898396 0.010195 -3.021769 517 over-trained
83 dense 2048 512 4.0 1.899506 0.045179 -4.947970 175 over-trained
84 dense 2048 512 4.0 2.077934 0.015501 -5.110080 196
85 dense 2048 2048 1.0 1.741858 0.025575 -3.289040 551 over-trained
86 dense 8192 2048 4.0 1.841246 0.011345 -1.887331 944 over-trained
87 dense 8192 2048 4.0 1.926907 0.006188 -2.243325 850 over-trained
88 dense 8192 2048 4.0 1.867245 0.007620 -2.235489 839 over-trained
89 dense 2048 512 4.0 2.051393 0.017474 -5.104428 211
90 dense 2048 2048 1.0 1.893019 0.015673 -3.400241 550 over-trained
91 dense 2048 512 4.0 1.933650 0.032876 -4.761910 142 over-trained
92 dense 8192 2048 4.0 1.828591 0.010342 -1.765303 1154 over-trained
93 dense 8192 2048 4.0 1.911582 0.007190 -2.285539 898 over-trained
94 dense 2048 512 4.0 2.033850 0.020530 -5.005129 188
95 dense 2048 2048 1.0 1.771744 0.013733 -3.186756 608 over-trained
96 dense 2048 2048 1.0 1.886663 0.012187 -3.088871 540 over-trained
97 dense 2048 512 4.0 1.877793 0.018273 -4.665623 199 over-trained
98 dense 8192 2048 4.0 1.871191 0.006188 -2.308024 899 over-trained
99 dense 8192 2048 4.0 1.766974 0.021444 -1.738004 379 over-trained
100 dense 8192 2048 4.0 1.914787 0.013321 -2.008385 1043 over-trained
101 dense 8192 2048 4.0 1.868480 0.010119 -2.001875 1067 over-trained
102 dense 2048 512 4.0 1.966041 0.016702 -4.189829 221 over-trained
103 dense 2048 2048 1.0 1.809624 0.026354 -3.202992 256 over-trained
104 dense 2048 2048 1.0 1.884955 0.013365 -2.978362 532 over-trained
105 dense 2048 512 4.0 1.889134 0.012409 -4.544430 242 over-trained
106 dense 2048 2048 1.0 1.863459 0.019358 -2.971158 560 over-trained
107 dense 8192 2048 4.0 1.806102 0.023707 -1.573331 297 over-trained
108 dense 8192 2048 4.0 1.776957 0.021407 -1.416306 313 over-trained
109 dense 2048 512 4.0 1.942283 0.020119 -4.252569 241 over-trained
110 dense 2048 2048 1.0 1.700779 0.033243 -2.792576 142 over-trained
111 dense 8192 2048 4.0 1.540058 0.064710 -1.429913 144 over-trained
112 dense 2048 512 4.0 1.848361 0.028537 -4.421692 103 over-trained
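
The `warning` column in the table above appears to follow a simple threshold on `alpha`: every row flagged `over-trained` has `alpha` below 2, and every unflagged row has `alpha` of 2 or more. Likewise, `Q` matches `N / M` in every row. The sketch below checks both relationships on a few rows copied from the table. It assumes whitespace-separated columns in the header's order and a cutoff of 2.0 inferred from this table; `parse_row` and `expected_warning` are hypothetical helper names for illustration, not part of any library.

```python
from typing import NamedTuple, Optional


class LayerRow(NamedTuple):
    layer_id: int
    layer_type: str
    N: int
    M: int
    Q: float
    alpha: float
    D: float
    alpha_hat: float
    num_spikes: int
    warning: Optional[str]


def parse_row(line: str) -> LayerRow:
    """Parse one whitespace-separated row; the warning cell may be absent."""
    parts = line.split()
    warning = parts[9] if len(parts) > 9 else None
    return LayerRow(
        int(parts[0]), parts[1], int(parts[2]), int(parts[3]),
        float(parts[4]), float(parts[5]), float(parts[6]),
        float(parts[7]), int(parts[8]), warning,
    )


def expected_warning(alpha: float, low: float = 2.0) -> Optional[str]:
    """Assumed rule, inferred from the table: alpha below `low` is flagged."""
    return "over-trained" if alpha < low else None


# A few rows copied verbatim from the table above.
SAMPLE_ROWS = [
    "1 dense 8192 2048 4.0 3.063091 0.013881 -6.555683 183",
    "4 dense 2048 512 4.0 1.970617 0.046053 -3.337725 229 over-trained",
    "63 dense 8192 2048 4.0 1.995272 0.014047 -2.417374 551 over-trained",
    "111 dense 8192 2048 4.0 1.540058 0.064710 -1.429913 144 over-trained",
]

rows = [parse_row(r) for r in SAMPLE_ROWS]

for row in rows:
    # Q is the aspect ratio of the weight matrix.
    assert abs(row.Q - row.N / row.M) < 1e-9
    # The flag in the table agrees with the assumed alpha threshold.
    assert expected_warning(row.alpha) == row.warning

flagged = [row.layer_id for row in rows if row.warning == "over-trained"]
print(flagged)  # [4, 63, 111]
```

The same loop can be run over all 112 rows to list the flagged layers or to confirm that no row departs from the assumed threshold.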