Vault 2016 has ended
Thursday, April 21 • 11:30am - 12:20pm
Huge Indexes: Algorithms to Track Objects in Cache Tiers - Dan Lambright, Red Hat


A storage cache must implement an index to quickly locate the objects it holds. The index's design depends on the storage medium. For example, a memory cache's requirements differ from those of a cache built using storage tiers. In the former, an in-memory hash table or balanced tree may suffice. In the latter, those structures may stumble, because the metadata required to track such a large number of objects won't fit in memory. In such cases, the challenge is to find an index that scales. A further consideration is whether to track elements in LRU order, in which case a sorting mechanism is called for. This talk contrasts three cache tiering implementations in Linux that have tackled this problem: GlusterFS, Ceph, and DMcache. Solutions vary from Bloom filters to SQLite databases. We will explore their relative pros and cons along the dimensions of performance, overhead, complexity, and more.
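
As a rough illustration of why a Bloom filter is attractive when per-object metadata won't fit in memory, here is a minimal sketch. It is not taken from GlusterFS, Ceph, or DMcache; the class name `BloomFilter`, its parameters, and the object keys are hypothetical and chosen only for this example. The filter answers "might this object be in the tier?" using a fixed number of bits, at the cost of occasional false positives.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; names and default sizes are assumptions."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # Fixed-size bit array: memory use does not grow with object count.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive several bit positions from one SHA-256 digest
        # using double hashing: position_i = h1 + i * h2.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


cache_index = BloomFilter()
cache_index.add("object-42")
print(cache_index.might_contain("object-42"))  # True: no false negatives
```

A filter like this cannot enumerate its contents or order them by recency, which is one reason a design might instead pay the overhead of a database-backed index when LRU ordering is needed.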

Dan Lambright

Software Engineer, Red Hat
Dan Lambright is a principal software engineer at Red Hat, where he works on distributed storage systems. Prior to Red Hat, he worked at EMC, Dell, and several storage startups. He also teaches as an adjunct professor at the University of Massachusetts, Lowell.

Thursday April 21, 2016 11:30am - 12:20pm PDT
State Ballroom A