Speculative Decoding

KnapSpec: Self-Speculative Decoding via Adaptive Layer Selection as a Knapsack Problem

A training-free self-speculative decoding method that formulates adaptive layer selection as a knapsack problem, accounting for context-dependent attention overhead - ___[ICML 2026](http://icml.cc/Conferences/2026)___