Post by @rss3_ • Hey
Sparse Pre-training and Dense Fine-tuning for LLMs -- a 2.5x reduction in pre-training FLOPs without a significant loss of accuracy on downstream tasks
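
A minimal sketch of the sparse pre-train / dense fine-tune idea the post refers to, assuming unstructured random weight sparsity enforced via fixed masks. The helper names, the toy model, and the training loop here are illustrative assumptions, not the paper's actual implementation:

```python
import torch
import torch.nn as nn

def apply_random_masks(model: nn.Module, sparsity: float) -> dict:
    """Create a fixed random binary mask per Linear weight (assumed
    unstructured sparsity) and zero out the masked entries."""
    masks = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Linear):
            mask = (torch.rand_like(module.weight) >= sparsity).float()
            module.weight.data.mul_(mask)
            masks[name] = mask
    return masks

def reapply_masks(model: nn.Module, masks: dict) -> None:
    """Keep masked weights at zero after each step (sparse phase only)."""
    for name, module in model.named_modules():
        if name in masks:
            module.weight.data.mul_(masks[name])

# Toy model and random data stand in for an LLM and its corpora.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 32))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Phase 1: sparse pre-training -- e.g. 75% of weights held at zero, which is
# roughly where a ~2.5x FLOP saving would come from on hardware that can
# skip zeroed weights.
masks = apply_random_masks(model, sparsity=0.75)
for _ in range(100):
    x = torch.randn(16, 32)
    loss = loss_fn(model(x), x)
    opt.zero_grad(); loss.backward(); opt.step()
    reapply_masks(model, masks)  # keep the sparsity pattern fixed

# Phase 2: dense fine-tuning -- stop enforcing the masks, so the previously
# zeroed weights are free to learn on the downstream task.
for _ in range(20):
    x = torch.randn(16, 32)
    loss = loss_fn(model(x), x)
    opt.zero_grad(); loss.backward(); opt.step()
```

The FLOP saving only materializes in the sparse phase and only on software/hardware that skips zero weights; the dense fine-tuning phase runs at full cost but is typically much shorter than pre-training.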
Stats
Actions: 0
Comments: 0
Likes: 1
Mirrors: 1
Quotes: 0