Taalas Achieves 17,000 Tokens/Second with Llama 3.1 8B

TECH February 22, 2026

Canadian hardware startup Taalas has announced its first product: a custom hardware implementation of the Llama 3.1 8B model capable of running at 17,000 tokens per second. The development marks a significant leap in local AI inference performance.