The following parameter counts are for Bloom-1.3B with embedding-and-MADX-adapters (replace strategy) and the default bottleneck reduction size of 16.
Total frozen parameters: 1,208,602,624
Total trainable parameters: 24,979,456
Total emb parameters: 20,488,192
Total MAD-X adapter parameters: 4,491,264
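As a quick sanity check on these figures, the sketch below confirms that the embedding and MAD-X adapter counts add up to the reported trainable total, and works out what share of the whole model is trainable. The counts are copied from the report; the function name `summarize` and the dictionary keys are made up for illustration and do not come from the training code.

```python
# Counts copied verbatim from the report above.
FROZEN = 1_208_602_624
TRAINABLE = 24_979_456
EMB = 20_488_192
MADX_ADAPTER = 4_491_264


def summarize(frozen: int, trainable: int, emb: int, adapter: int) -> dict:
    """Hypothetical helper: cross-check the reported parameter counts."""
    total = frozen + trainable
    return {
        # All parameters in the model, frozen plus trainable.
        "total": total,
        # True if embeddings + adapters account for every trainable parameter.
        "trainable_matches_parts": emb + adapter == trainable,
        # Share of the model that receives gradient updates, in percent.
        "trainable_pct": 100 * trainable / total,
    }


summary = summarize(FROZEN, TRAINABLE, EMB, MADX_ADAPTER)
print(f"total parameters:      {summary['total']:,}")
print(f"emb + adapter matches: {summary['trainable_matches_parts']}")
print(f"trainable share:       {summary['trainable_pct']:.2f}%")
```

With these numbers the two components sum exactly to the trainable total, and roughly 2% of the model's parameters are updated during training.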