Dolly Models


Dolly, from Databricks, was one of the first truly open-source instruction fine-tuned LLMs. Dolly 2.0 is a 12B parameter language model based on the Pythia models, and fine-tuned on a crowdsourced dataset (built by Databricks employees).

The Dolly models show somewhat smaller median alphas for their size (when compared to, say, the BLOOM models). For example, the Dolly 7b median alpha is 3.26 (with a Dks just below 0.25), whereas for the LLAMA 7b model, the median alpha is 3.77 (with a slightly larger Dks). For comparison, the BLOOM 7b median alpha is around 3.5, which is not as large as LLAMA's, but the Dks is much larger (above 0.3).
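These per-layer alpha and Dks (Kolmogorov-Smirnov distance) values can be reproduced with the open-source weightwatcher package. The snippet below is a minimal sketch, assuming weightwatcher, transformers, and torch are installed and that there is enough memory to load the checkpoint; the databricks/dolly-v2-7b model ID is used here for illustration, and any of the Dolly v2 checkpoints can be analyzed the same way.

```python
# Minimal sketch: computing per-layer alpha and Dks metrics with weightwatcher.
# Assumes: pip install weightwatcher transformers torch
import weightwatcher as ww
from transformers import AutoModelForCausalLM

# Load the Dolly 7b checkpoint (illustrative choice; other sizes work the same way).
model = AutoModelForCausalLM.from_pretrained("databricks/dolly-v2-7b")

# Analyze every layer; returns a pandas DataFrame of spectral metrics.
watcher = ww.WeightWatcher(model=model)
details = watcher.analyze()

# 'alpha' is the fitted power-law exponent, 'D' is the KS distance of the fit.
print("median alpha:", details["alpha"].median())
print("median Dks:  ", details["D"].median())
```

The median is taken over all analyzed layers, which is how the per-model numbers quoted above are summarized.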

Overall, the Dolly models appear to have been well trained, having captured more information from their datasets compared to other LLM models of their size.

Curiously, however, the Dolly models are not considered very reliable (having a low TruthfulQA metric).


Primary Reference: https://github.com/databrickslabs/dolly
Secondary Reference: https://huggingface.co/databricks
Paper: N/A

Dolly Models Included

Dolly Model Set Plots