Hacker News

matt_d
Understanding Inference Scaling for LLMs: Bottlenecks, Trade-Offs, and Perf (arxiv.org)
