The nice thing about DeepSeek with weights streamed from disk rather than held in memory is that you ought to be able to batch multiple sessions in parallel. Each individual session would slow down, since every step has to stream a somewhat larger set of active weights from disk, but your total tok/s would ultimately be limited only by compute. Other models have trouble doing this because their KV cache takes up too much RAM even at fairly limited context lengths (and storing it on disk instead adds write wear).
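To put rough numbers on that intuition, here is a back-of-the-envelope sketch in Python. It is a toy model, not a measurement: it assumes each token picks its experts uniformly and independently, that nothing is cached in RAM between steps, that disk reads overlap perfectly with compute, and it ignores KV cache memory entirely. Every figure in `HW` is a made-up placeholder, not a DeepSeek spec or a benchmark result.

```python
def expected_distinct_experts(num_experts: int, top_k: int, batch: int) -> float:
    """Expected number of distinct experts touched in one layer by `batch` tokens.

    Assumes (hypothetically) that every token picks `top_k` of `num_experts`
    uniformly at random, independently of the other tokens.
    """
    miss = 1.0 - top_k / num_experts  # chance one token skips a given expert
    return num_experts * (1.0 - miss ** batch)


def step_time_s(batch, num_experts, top_k, layers, expert_bytes,
                disk_bytes_per_s, compute_s_per_token):
    """Seconds for one decode step that produces one token per session."""
    streamed = layers * expected_distinct_experts(num_experts, top_k, batch) * expert_bytes
    stream_s = streamed / disk_bytes_per_s
    compute_s = batch * compute_s_per_token
    # Assumes streaming and compute overlap perfectly, so the slower one dominates.
    return max(stream_s, compute_s)


def total_tok_per_s(batch, **hw):
    """Aggregate tokens per second across all sessions in the batch."""
    return batch / step_time_s(batch, **hw)


# Placeholder numbers for illustration only.
HW = dict(
    num_experts=256,           # routed experts per layer
    top_k=8,                   # experts activated per token
    layers=58,                 # layers with routed experts
    expert_bytes=20e6,         # bytes per expert per layer
    disk_bytes_per_s=5e9,      # sustained read bandwidth
    compute_s_per_token=0.02,  # compute time per token per step
)

if __name__ == "__main__":
    for batch in (1, 8, 64, 512, 4096):
        total = total_tok_per_s(batch, **HW)
        print(f"batch={batch:5d}  total={total:7.2f} tok/s  per-session={total / batch:.4f} tok/s")
```

In this model the per-session rate drops as the batch grows, because each step streams a larger union of experts, while the aggregate rate keeps climbing until the streamed set saturates at the full expert count and the step becomes compute-bound. The ceiling is then `1 / compute_s_per_token`.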