Hacker News

monax
A complete Llama2 inference engine that fits in 1356 bytes of x86 assembly (github.com)
