Great AI models have well-trained layers

WeightWatcher is like an oscilloscope for AI models; it provides a wide range of layer diagnostics and plots to help you determine if your model layers are well-trained, over-trained, or under-trained.

The best-performing deep learning models have well-shaped layers, and they look like the plot on the right. They have a simple shape (linear on a log-log plot), with the unique weightwatcher alpha metric near 2 (or at least between 2 and 6). But don't take our word for it; see for yourself.
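To make "linear on a log-log plot" concrete: a straight line on log-log axes is the signature of a power law, p(x) ~ x^(-alpha), and alpha is the exponent of that line. The sketch below is not the weightwatcher implementation. It is a minimal, standard-library illustration that assumes the values being fit are already available (here they are synthetic samples, not real layer data), uses the textbook maximum-likelihood estimator for a continuous power law, and takes the cutoff `xmin` as given. The helper names `estimate_alpha` and `in_well_shaped_range` are hypothetical; only the 2 to 6 range comes from the text above.

```python
import math
import random


def estimate_alpha(values, xmin):
    """Maximum-likelihood exponent for a continuous power law
    p(x) ~ x^(-alpha) on x >= xmin (textbook estimator, xmin assumed known)."""
    tail = [x for x in values if x >= xmin]
    if not tail:
        raise ValueError("no values at or above xmin")
    log_sum = sum(math.log(x / xmin) for x in tail)
    if log_sum == 0.0:
        raise ValueError("all tail values equal xmin; exponent is undefined")
    return 1.0 + len(tail) / log_sum


def in_well_shaped_range(alpha, low=2.0, high=6.0):
    """True if alpha falls in the 2 to 6 range quoted on this page."""
    return low <= alpha <= high


# Synthetic stand-in data: random.paretovariate(a) draws from a density
# proportional to x^(-(a + 1)) on x >= 1, so a = 2 corresponds to alpha = 3.
rng = random.Random(0)
true_alpha = 3.0
samples = [rng.paretovariate(true_alpha - 1.0) for _ in range(20000)]

alpha_hat = estimate_alpha(samples, xmin=1.0)
print(f"estimated alpha: {alpha_hat:.2f}")
print("within 2 to 6:", in_well_shaped_range(alpha_hat))
```

On real models the tool reports a per-layer alpha for you; this sketch only shows what kind of number that is and how the 2 to 6 check reads in code.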

We have applied weightwatcher to a wide range of open-source models. Click below for the results.

Explore weightwatcher quality metrics and reports on the most popular open-source Deep Learning models.

An old LLM Leaderboard

Check out the latest work on Grokking

Instruction Fine-Tuned open-source models

LLAMA 3.1

LLAMA 3.1 models

LLAMA 3.2

LLAMA 3.2 models

Qwen2 small

Qwen2 small models

Qwen2.5 small

Qwen2.5 small models

Gemma

Gemma models

Falcon 1&2

Falcon 1&2 models

Qwen2.0

Qwen2.0 models

Qwen2.5

Qwen2.5 models

Mistral 7B

Mistral7B v0.1-v0.3 models

Bielik

Bielik models

SmolLM

SmolLM models

OLMo

OLMo models

Special Cases: Good and Bad cases of overfitting

SAM

Segment Anything models

Llama Guard

Llama Guard Models

For more details, see these blog posts:
What’s instructive about Instruct Fine-Tuning: a weightwatcher analysis
Fine-Tuned Llama3.2: Bad Instructions ?

Older, popular open-source models

LLAMA

LLAMA models

Dolly

Guanaco

Falcon

VGG

VGG11 - VGG19

ResNet

ResNet50 - ResNet152

BERT/XLNet

BERT vs XLNet

GPT

GPT vs GPT2

ALBERT

ALBERT models

FlanT5

FlanT5 small - large

CLIP

CLIP-ViT

BLOOM

BLOOM models

About

The weightwatcher tool was developed by Calculation Consulting. We provide consulting to companies looking to implement Data Science, Machine Learning, and/or AI solutions. Reach out today to learn how to get started with your own AI project. Email: Info@CalculationConsulting.com