Post by @rss3_ • Hey

Sparse Pre-training and Dense Fine-tuning for LLMs -- a 2.5x reduction in pre-training FLOPs without significant loss of accuracy on downstream tasks
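The idea behind the title: keep most weights frozen at zero during pre-training (which cuts FLOPs roughly in proportion to the sparsity level), then drop the sparsity mask and update all weights when fine-tuning on the downstream task. Below is a minimal PyTorch sketch of that two-phase recipe; the toy model, random static masking, 75% sparsity level, and placeholder training loops are illustrative assumptions, not the paper's actual setup.

```python
import torch
import torch.nn as nn

def apply_static_sparsity(model: nn.Module, sparsity: float = 0.75) -> dict:
    """Zero out a random fraction of each Linear layer's weights and
    return the binary masks so they can be re-applied after each step."""
    masks = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            mask = (torch.rand_like(module.weight) > sparsity).float()
            module.weight.data.mul_(mask)
            masks[name] = mask
    return masks

def reapply_masks(model: nn.Module, masks: dict) -> None:
    """Keep pruned weights at zero throughout sparse pre-training."""
    for name, module in model.named_modules():
        if name in masks:
            module.weight.data.mul_(masks[name])

# Toy stand-in for an LLM; hyperparameters are placeholders.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Phase 1: sparse pre-training -- most weights stay at zero, so on
# sparsity-aware hardware the effective FLOPs per step drop accordingly.
masks = apply_static_sparsity(model, sparsity=0.75)
for _ in range(100):  # placeholder pre-training loop
    x = torch.randn(32, 512)
    loss = model(x).pow(2).mean()  # placeholder objective
    opt.zero_grad()
    loss.backward()
    opt.step()
    reapply_masks(model, masks)

# Phase 2: dense fine-tuning -- the masks are dropped, and all weights
# (including previously pruned ones) are updated on the downstream task.
for _ in range(10):  # placeholder fine-tuning loop
    x = torch.randn(32, 512)
    loss = model(x).pow(2).mean()  # placeholder objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Note that the FLOP savings from a mask like this are only realized on hardware and kernels that exploit unstructured sparsity; on dense accelerators the masked weights are still multiplied as zeros.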
