is also an underestimate, as the 13B model should fit on 2 GPUs, while you would need many more just to fit the 175B model into VRAM. Since scaling to multiple GPUs adds a ridiculous amount of engineering complexity, overhead, and cost, we should prefer the smaller model even more. We should train the model much longer to get an order-of-magnitude decrease in inference cost and to optimize the TCO of the system.
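
To make the VRAM gap concrete, here is a rough back-of-the-envelope sketch in Python. It assumes fp16 weights (2 bytes per parameter), 80 GB A100-class GPUs, and ~20% headroom for the KV cache and activations; all of these figures are illustrative assumptions, not measurements from this post.

```python
import math

BYTES_PER_PARAM = 2   # fp16 weights
GPU_VRAM_GB = 80      # assumed A100-80GB-class accelerator (illustrative)
OVERHEAD = 1.2        # assumed ~20% headroom for KV cache / activations

def gpus_to_hold_weights(params_billions: float) -> int:
    """Minimum GPUs needed just to fit the weights, ignoring serving headroom."""
    weights_gb = params_billions * BYTES_PER_PARAM  # 1e9 params * 2 B ≈ GB
    return math.ceil(weights_gb * OVERHEAD / GPU_VRAM_GB)

for size_b in (13, 175):
    print(f"{size_b}B params -> at least {gpus_to_hold_weights(size_b)} GPU(s)")
# 13B  -> 1 x 80 GB card (or 2 smaller 40 GB cards)
# 175B -> 6+ GPUs before leaving any room for batching
```

With smaller 40 GB or 24 GB cards the 13B model lands on the 2 GPUs mentioned above, while the 175B model forces multi-node tensor or pipeline parallelism, which is where the extra engineering complexity and cost come from.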
