Flash Indexer for Dynamo
I worked on the Flash Indexer, a high-throughput global KV-cache indexer for NVIDIA Dynamo. The problem is to track which inference workers hold which KV blocks, and then answer routing queries fast enough that the indexer itself does not become the bottleneck. ...
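To make the bookkeeping concrete, here is a minimal single-threaded sketch of the two operations described above: recording which workers hold which blocks, and scoring workers for a routing query. It is a toy illustration, not the Flash Indexer's actual design. The class `ToyKvIndex`, its method names, and the assumptions that blocks are identified by hashable IDs and that a query is an ordered list of block IDs are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set


class ToyKvIndex:
    """Hypothetical in-memory index: block ID -> set of workers holding it."""

    def __init__(self) -> None:
        self._holders: Dict[Hashable, Set[str]] = defaultdict(set)

    def store(self, worker: str, block_ids: Iterable[Hashable]) -> None:
        """Record that `worker` now holds each block in `block_ids`."""
        for block_id in block_ids:
            self._holders[block_id].add(worker)

    def remove(self, worker: str, block_ids: Iterable[Hashable]) -> None:
        """Record that `worker` evicted each block in `block_ids`."""
        for block_id in block_ids:
            holders = self._holders.get(block_id)
            if holders is None:
                continue
            holders.discard(worker)
            if not holders:
                # Drop empty entries so the map does not grow without bound.
                del self._holders[block_id]

    def match_prefix(self, block_ids: List[Hashable]) -> Dict[str, int]:
        """For each worker, count how many leading blocks of the query it holds."""
        scores: Dict[str, int] = {}
        candidates = None  # workers that hold every block seen so far
        for depth, block_id in enumerate(block_ids, start=1):
            holders = self._holders.get(block_id, set())
            candidates = set(holders) if candidates is None else candidates & holders
            if not candidates:
                break  # no worker holds the prefix up to this block
            for worker in candidates:
                scores[worker] = depth
        return scores


index = ToyKvIndex()
index.store("worker-a", ["b0", "b1", "b2"])
index.store("worker-b", ["b0"])

# worker-a holds the first two blocks of the query, worker-b only the first.
scores = index.match_prefix(["b0", "b1", "b3"])
assert scores == {"worker-a": 2, "worker-b": 1}
```

A router could then prefer the worker with the highest score, since that worker can reuse the longest cached prefix. A real high-throughput indexer would also need concurrency control and a more compact representation, which this sketch deliberately leaves out.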