gemma-7b-it




gemma-7b-it Model Set Plots


Gemma Compared to Base Model Plots



gemma-7b-it Model Selected Details
id layer_type N M Q alpha D alpha-hat num_spikes warning
8 dense 24576 3072 8.000000 13.204637 0.071747 -5.927451 170 under-trained
9 dense 24576 3072 8.000000 9.407434 0.071397 0.131956 314 under-trained
10 dense 24576 3072 8.000000 11.139997 0.056180 -2.488992 221 under-trained
11 dense 4096 3072 1.333333 4.828747 0.046589 -0.938114 279
12 dense 4096 3072 1.333333 6.618802 0.053016 -2.892642 174 under-trained
1 dense 24576 3072 8.000000 5.180245 0.027473 -0.404991 190
2 dense 24576 3072 8.000000 5.223154 0.034106 2.660176 416
3 dense 24576 3072 8.000000 3.401054 0.033436 4.048122 674
4 dense 4096 3072 1.333333 2.682806 0.012706 3.233310 302
5 dense 4096 3072 1.333333 3.624086 0.036326 -0.395872 311
6 dense 4096 3072 1.333333 2.730183 0.029942 3.185442 384
7 dense 4096 3072 1.333333 2.918944 0.018709 0.599816 254
13 dense 4096 3072 1.333333 4.413043 0.058609 -0.351628 366
14 dense 4096 3072 1.333333 6.644915 0.044488 -2.603577 143 under-trained
15 dense 24576 3072 8.000000 16.684788 0.121132 -9.015546 198 under-trained
16 dense 24576 3072 8.000000 17.175797 0.058791 -5.126229 117 under-trained
17 dense 24576 3072 8.000000 14.666933 0.076802 -3.060876 173 under-trained
18 dense 4096 3072 1.333333 8.551310 0.031835 -5.152714 39 under-trained
19 dense 4096 3072 1.333333 7.880493 0.048577 -6.348927 123 under-trained
20 dense 4096 3072 1.333333 5.958587 0.036181 -3.125761 161
21 dense 4096 3072 1.333333 8.707095 0.046033 -6.242550 95 under-trained
22 dense 24576 3072 8.000000 13.312775 0.071876 -6.646644 202 under-trained
23 dense 24576 3072 8.000000 17.423985 0.068692 -6.321120 132 under-trained
24 dense 24576 3072 8.000000 16.788655 0.076076 -5.319194 142 under-trained
25 dense 4096 3072 1.333333 6.325770 0.038349 -2.565324 142 under-trained
26 dense 4096 3072 1.333333 6.616444 0.053572 -4.879926 173 under-trained
27 dense 4096 3072 1.333333 6.005698 0.043917 -2.162414 171 under-trained
28 dense 4096 3072 1.333333 6.981795 0.041191 -5.054562 142 under-trained
29 dense 24576 3072 8.000000 10.914789 0.068456 -5.302527 270 under-trained
30 dense 24576 3072 8.000000 11.831374 0.054257 -3.003014 219 under-trained
31 dense 24576 3072 8.000000 14.028278 0.059334 -6.072773 172 under-trained
32 dense 4096 3072 1.333333 6.109448 0.032875 -2.296519 146 under-trained
33 dense 4096 3072 1.333333 8.039530 0.062200 -7.158806 136 under-trained
34 dense 4096 3072 1.333333 5.839769 0.034703 -1.521372 161
35 dense 4096 3072 1.333333 11.037766 0.054188 -11.363460 82 under-trained
36 dense 24576 3072 8.000000 11.408626 0.063044 -6.125920 257 under-trained
37 dense 24576 3072 8.000000 13.014428 0.063277 -2.860776 192 under-trained
38 dense 24576 3072 8.000000 12.681865 0.063479 -1.972998 206 under-trained
39 dense 4096 3072 1.333333 6.255097 0.027944 -2.834877 137 under-trained
40 dense 4096 3072 1.333333 6.636301 0.059953 -6.006800 199 under-trained
41 dense 4096 3072 1.333333 5.871632 0.035398 -1.799211 163
42 dense 4096 3072 1.333333 8.914550 0.021573 -9.436340 79 under-trained
43 dense 4096 3072 1.333333 6.197443 0.033585 -2.153169 144 under-trained
44 dense 4096 3072 1.333333 8.563891 0.060450 -9.276530 139 under-trained
45 dense 4096 3072 1.333333 5.100835 0.040072 -0.783224 215
46 dense 4096 3072 1.333333 13.241647 0.040277 -15.868850 40 under-trained
47 dense 24576 3072 8.000000 14.195889 0.072723 -7.976391 191 under-trained
48 dense 24576 3072 8.000000 13.288081 0.056667 -6.021244 185 under-trained
49 dense 24576 3072 8.000000 11.055204 0.050487 -1.436680 224 under-trained
50 dense 4096 3072 1.333333 8.115021 0.045952 -4.319027 118 under-trained
51 dense 24576 3072 8.000000 10.414555 0.060626 0.266023 254 under-trained
52 dense 4096 3072 1.333333 9.764129 0.066776 -10.217694 105 under-trained
53 dense 24576 3072 8.000000 13.720573 0.073490 -7.562076 215 under-trained
54 dense 24576 3072 8.000000 12.417700 0.062982 -3.636889 202 under-trained
55 dense 4096 3072 1.333333 17.303952 0.109575 -21.789567 64 under-trained
56 dense 4096 3072 1.333333 6.050280 0.047205 -1.599543 177 under-trained
57 dense 24576 3072 8.000000 12.491102 0.061768 -1.245694 209 under-trained
58 dense 24576 3072 8.000000 17.805112 0.067348 -9.665398 122 under-trained
59 dense 4096 3072 1.333333 11.561723 0.045117 -7.141097 66 under-trained
60 dense 4096 3072 1.333333 10.101271 0.079689 -8.835811 117 under-trained
61 dense 4096 3072 1.333333 9.028049 0.060795 -4.307781 106 under-trained
62 dense 24576 3072 8.000000 14.915284 0.061531 -8.865932 169 under-trained
63 dense 4096 3072 1.333333 14.670196 0.114482 -18.530373 81 under-trained
64 dense 24576 3072 8.000000 16.166027 0.044574 -10.782348 124 under-trained
65 dense 4096 3072 1.333333 8.198981 0.059479 -3.948592 123 under-trained
66 dense 4096 3072 1.333333 10.158609 0.073529 -11.886079 119 under-trained
67 dense 24576 3072 8.000000 14.396701 0.063426 -4.408299 162 under-trained
68 dense 4096 3072 1.333333 12.591238 0.113709 -15.198147 110 under-trained
69 dense 4096 3072 1.333333 8.576819 0.046177 -3.814329 111 under-trained
70 dense 24576 3072 8.000000 18.965968 0.070370 -11.652865 123 under-trained
71 dense 24576 3072 8.000000 12.053267 0.056882 -2.540708 180 under-trained
72 dense 24576 3072 8.000000 17.724349 0.059477 -10.904854 134 under-trained
73 dense 4096 3072 1.333333 6.427865 0.056708 -1.256859 180 under-trained
74 dense 4096 3072 1.333333 11.417160 0.068156 -12.818244 101 under-trained
75 dense 4096 3072 1.333333 6.963597 0.039664 -2.524043 123 under-trained
76 dense 4096 3072 1.333333 8.534623 0.118138 -11.445176 184 under-trained
77 dense 24576 3072 8.000000 15.276051 0.055366 -6.422125 135 under-trained
78 dense 24576 3072 8.000000 12.805932 0.089597 -8.609008 185 under-trained
79 dense 24576 3072 8.000000 17.361188 0.069024 -8.984600 137 under-trained
80 dense 24576 3072 8.000000 11.445580 0.052686 -2.165320 131 under-trained
81 dense 4096 3072 1.333333 26.362752 0.114794 -37.595970 36 under-trained
82 dense 4096 3072 1.333333 9.031662 0.060551 -5.491933 104 under-trained
83 dense 4096 3072 1.333333 10.454732 0.123211 -12.540151 160 under-trained
119 dense 4096 3072 1.333333 4.256852 0.102261 -5.766150 411
84 dense 4096 3072 1.333333 9.968822 0.056394 -5.964082 95 under-trained
85 dense 4096 3072 1.333333 11.460747 0.061886 -7.745840 54 under-trained
86 dense 24576 3072 8.000000 9.107242 0.099935 -5.644136 317 under-trained
87 dense 24576 3072 8.000000 9.714690 0.058381 -1.079814 149 under-trained
88 dense 24576 3072 8.000000 15.872604 0.077857 -8.377246 166 under-trained
89 dense 4096 3072 1.333333 6.189842 0.108635 -8.531773 259 under-trained
90 dense 4096 3072 1.333333 9.237783 0.060225 -5.059788 94 under-trained
91 dense 4096 3072 1.333333 6.599207 0.127169 -7.297609 309 under-trained
92 dense 24576 3072 8.000000 16.117582 0.085289 -10.035349 69 under-trained
93 dense 4096 3072 1.333333 8.044406 0.122381 -9.808868 217 under-trained
94 dense 4096 3072 1.333333 7.441985 0.110869 -3.784165 186 under-trained
95 dense 24576 3072 8.000000 5.373591 0.093115 -1.368193 611
96 dense 24576 3072 8.000000 16.906133 0.070211 -9.556732 131 under-trained
97 dense 4096 3072 1.333333 9.202629 0.052793 -5.505592 88 under-trained
98 dense 4096 3072 1.333333 4.991469 0.106457 -6.662127 328
99 dense 24576 3072 8.000000 17.571769 0.061502 -9.707955 120 under-trained
100 dense 24576 3072 8.000000 4.650532 0.095389 -1.127250 821
101 dense 24576 3072 8.000000 5.987098 0.095596 -3.781400 577
102 dense 4096 3072 1.333333 8.555648 0.105002 -6.244373 141 under-trained
103 dense 4096 3072 1.333333 9.360369 0.122561 -11.175862 173 under-trained
104 dense 4096 3072 1.333333 12.336617 0.042937 -9.622290 43 under-trained
105 dense 4096 3072 1.333333 7.398483 0.096062 -10.227805 159 under-trained
106 dense 4096 3072 1.333333 10.697750 0.091507 -11.914940 118 under-trained
107 dense 4096 3072 1.333333 11.120218 0.043095 -7.992190 62 under-trained
108 dense 4096 3072 1.333333 9.822712 0.113899 -13.636581 136 under-trained
109 dense 4096 3072 1.333333 10.476131 0.117071 -7.713990 113 under-trained
110 dense 24576 3072 8.000000 17.685307 0.065288 -11.034616 113 under-trained
111 dense 24576 3072 8.000000 10.725446 0.048543 -2.263563 97 under-trained
112 dense 24576 3072 8.000000 12.335317 0.093203 -8.004306 130 under-trained
113 dense 24576 3072 8.000000 16.486079 0.067096 -9.155776 136 under-trained
114 dense 24576 3072 8.000000 6.439190 0.096553 -1.811248 519 under-trained
115 dense 24576 3072 8.000000 14.402360 0.031874 -8.426503 54 under-trained
116 dense 4096 3072 1.333333 7.550683 0.104431 -5.248409 157 under-trained
117 dense 4096 3072 1.333333 9.032708 0.115210 -11.657835 171 under-trained
118 dense 4096 3072 1.333333 8.785475 0.040813 -5.588665 77 under-trained
120 dense 24576 3072 8.000000 18.093398 0.063816 -11.629522 131 under-trained
121 dense 24576 3072 8.000000 10.085261 0.055440 -2.744147 126 under-trained
122 dense 24576 3072 8.000000 12.773765 0.037877 -6.897089 88 under-trained
123 dense 4096 3072 1.333333 9.880403 0.064219 -6.236671 81 under-trained
124 dense 4096 3072 1.333333 11.640971 0.114528 -16.088255 116 under-trained
125 dense 4096 3072 1.333333 8.521147 0.100167 -5.649375 137 under-trained
126 dense 4096 3072 1.333333 7.652605 0.106882 -10.869922 161 under-trained
127 dense 24576 3072 8.000000 19.679279 0.062083 -13.896251 100 under-trained
128 dense 24576 3072 8.000000 9.678847 0.051382 -2.017840 181 under-trained
129 dense 24576 3072 8.000000 12.825131 0.039029 -6.770712 120 under-trained
130 dense 4096 3072 1.333333 10.552083 0.064406 -7.239141 82 under-trained
131 dense 4096 3072 1.333333 9.799946 0.124375 -11.632234 171 under-trained
132 dense 4096 3072 1.333333 7.603319 0.115549 -5.612161 195 under-trained
133 dense 4096 3072 1.333333 4.943308 0.099164 -6.861132 347
134 dense 24576 3072 8.000000 23.144797 0.055074 -19.248337 71 under-trained
135 dense 24576 3072 8.000000 9.232115 0.050233 -2.226254 277 under-trained
136 dense 24576 3072 8.000000 11.320355 0.046918 -4.683092 193 under-trained
137 dense 4096 3072 1.333333 6.189729 0.123001 -3.192766 291 under-trained
138 dense 4096 3072 1.333333 8.755920 0.117418 -11.468685 172 under-trained
139 dense 4096 3072 1.333333 7.132659 0.120974 -3.993149 218 under-trained
140 dense 4096 3072 1.333333 22.043473 0.106426 -30.376538 39 under-trained
141 dense 24576 3072 8.000000 23.986319 0.047038 -21.184777 68 under-trained
142 dense 24576 3072 8.000000 9.697232 0.053171 -1.324193 262 under-trained
143 dense 24576 3072 8.000000 11.174339 0.057407 -3.529942 223 under-trained
144 dense 4096 3072 1.333333 8.402452 0.064322 -4.125130 118 under-trained
145 dense 4096 3072 1.333333 10.741204 0.113655 -13.013484 135 under-trained
146 dense 4096 3072 1.333333 6.094833 0.113977 -3.251683 271 under-trained
147 dense 4096 3072 1.333333 20.479848 0.119810 -27.496265 51 under-trained
148 dense 24576 3072 8.000000 26.073204 0.043994 -22.660535 53 under-trained
149 dense 24576 3072 8.000000 10.362370 0.050044 -1.060650 220 under-trained
150 dense 24576 3072 8.000000 12.795236 0.059982 -4.866445 176 under-trained
151 dense 4096 3072 1.333333 9.036874 0.070701 -4.126474 121 under-trained
152 dense 4096 3072 1.333333 11.951687 0.065784 -15.081608 94 under-trained
153 dense 4096 3072 1.333333 7.870646 0.118015 -4.272420 196 under-trained
154 dense 4096 3072 1.333333 13.081190 0.111640 -17.095696 102 under-trained
155 dense 24576 3072 8.000000 13.801858 0.037522 -11.072957 152 under-trained
156 dense 24576 3072 8.000000 10.974947 0.046516 -1.294818 200 under-trained
157 dense 24576 3072 8.000000 13.605822 0.044746 -6.075320 142 under-trained
158 dense 4096 3072 1.333333 9.078428 0.064931 -4.221869 105 under-trained
159 dense 4096 3072 1.333333 15.723812 0.053597 -19.271871 57 under-trained
160 dense 4096 3072 1.333333 9.532343 0.071268 -5.296937 95 under-trained
161 dense 4096 3072 1.333333 9.778812 0.125249 -11.787545 164 under-trained
162 dense 24576 3072 8.000000 12.980312 0.040828 -9.871774 150 under-trained
163 dense 24576 3072 8.000000 11.864702 0.055583 -2.682106 195 under-trained
164 dense 24576 3072 8.000000 12.553634 0.047090 -5.042727 172 under-trained
165 dense 4096 3072 1.333333 7.371136 0.111758 -3.888498 209 under-trained
166 dense 4096 3072 1.333333 9.882683 0.048224 -10.848115 95 under-trained
167 dense 4096 3072 1.333333 9.713087 0.054544 -5.203713 81 under-trained
168 dense 4096 3072 1.333333 15.656777 0.063844 -17.068735 56 under-trained
169 dense 4096 3072 1.333333 7.548668 0.052585 -3.868171 130 under-trained
170 dense 4096 3072 1.333333 7.212494 0.053271 -6.266802 173 under-trained
171 dense 4096 3072 1.333333 7.688330 0.057651 -4.080901 119 under-trained
172 dense 4096 3072 1.333333 9.650592 0.057128 -8.599274 114 under-trained
173 dense 24576 3072 8.000000 6.888565 0.059327 -2.725393 533 under-trained
174 dense 24576 3072 8.000000 11.829818 0.061658 -3.094247 226 under-trained
175 dense 24576 3072 8.000000 12.117547 0.057376 -4.151599 218 under-trained
176 dense 24576 3072 8.000000 7.110298 0.046913 -0.723421 441 under-trained
177 dense 24576 3072 8.000000 9.143261 0.067639 -0.771558 342 under-trained
178 dense 24576 3072 8.000000 6.445925 0.047915 0.201111 22 under-trained
179 dense 4096 3072 1.333333 9.310068 0.042007 -5.218459 93 under-trained
180 dense 4096 3072 1.333333 4.996093 0.027426 -3.581415 228
181 dense 4096 3072 1.333333 8.350576 0.043761 -5.159379 102 under-trained
182 dense 4096 3072 1.333333 5.129648 0.053937 -2.330954 277
183 dense 24576 3072 8.000000 6.898444 0.044651 -0.523026 444 under-trained
184 dense 24576 3072 8.000000 7.928312 0.054151 1.094327 385 under-trained
185 dense 24576 3072 8.000000 5.775951 0.033986 -0.666577 44
186 dense 4096 3072 1.333333 3.526949 0.029532 0.026421 383
187 dense 4096 3072 1.333333 3.086879 0.036378 -1.119379 51
188 dense 4096 3072 1.333333 3.370159 0.036849 0.311190 457
189 dense 4096 3072 1.333333 3.542124 0.051364 0.090979 492
190 dense 24576 3072 8.000000 6.826081 0.065984 -1.640947 470 under-trained
191 dense 24576 3072 8.000000 3.984349 0.061575 1.884990 29
192 dense 24576 3072 8.000000 5.145938 0.050730 0.086875 39
193 dense 4096 3072 1.333333 3.196460 0.034577 2.613299 376
194 dense 4096 3072 1.333333 2.886366 0.030415 -0.602724 461
195 dense 4096 3072 1.333333 3.063616 0.034627 2.688421 430
196 dense 4096 3072 1.333333 3.080685 0.034139 -0.198767 413
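
The rows above follow two patterns that can be checked mechanically: Q equals N / M (24576 / 3072 = 8 and 4096 / 3072 ≈ 1.333333), and the "under-trained" warning appears exactly on the rows whose alpha is greater than 6. The Python sketch below parses a few rows copied from the table and verifies both patterns. The threshold of 6.0 is an assumption inferred from the rows themselves (for example, alpha 5.958587 carries no warning while 6.005698 does), not a value stated on this page; the names LayerRow, parse_row, and expected_warning are hypothetical helpers introduced only for illustration.

```python
from typing import NamedTuple


class LayerRow(NamedTuple):
    """One row of the details table (hypothetical helper type)."""
    layer_id: int
    layer_type: str
    N: int
    M: int
    Q: float
    alpha: float
    D: float
    alpha_hat: float
    num_spikes: int
    warning: str  # empty string when the warning column is blank


# Assumption: inferred from the table, where every row with alpha > 6
# is marked "under-trained" and no row at or below 6 is.
ALPHA_UNDER_TRAINED = 6.0


def parse_row(line: str) -> LayerRow:
    """Split one whitespace-separated table row into typed fields."""
    parts = line.split()
    warning = parts[9] if len(parts) > 9 else ""
    return LayerRow(
        int(parts[0]), parts[1], int(parts[2]), int(parts[3]),
        float(parts[4]), float(parts[5]), float(parts[6]),
        float(parts[7]), int(parts[8]), warning,
    )


def expected_warning(alpha: float) -> str:
    """Return the warning the inferred alpha rule would assign."""
    return "under-trained" if alpha > ALPHA_UNDER_TRAINED else ""


# Four rows copied verbatim from the table (ids 8, 11, 20, 27).
SAMPLE_ROWS = """\
8 dense 24576 3072 8.000000 13.204637 0.071747 -5.927451 170 under-trained
11 dense 4096 3072 1.333333 4.828747 0.046589 -0.938114 279
20 dense 4096 3072 1.333333 5.958587 0.036181 -3.125761 161
27 dense 4096 3072 1.333333 6.005698 0.043917 -2.162414 171 under-trained
"""

rows = [parse_row(line) for line in SAMPLE_ROWS.splitlines() if line.strip()]

for row in rows:
    # Q is the aspect ratio N / M, printed to six decimal places.
    q_ok = abs(row.Q - row.N / row.M) < 1e-5
    # The warning column should agree with the inferred alpha rule.
    warning_ok = row.warning == expected_warning(row.alpha)
    print(row.layer_id, row.alpha, repr(row.warning), q_ok, warning_ok)
```

To check the whole table, the same loop can be run over every data row instead of the four-row sample; any row where either check prints False would indicate that the inferred rule does not hold for that layer.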