Post by @leading • Hey

Training loss of Llama 2 models at different model sizes
