Title: Efficient Data Mappings for Parity-Declustered Data Layouts
Published: July 2002
Authors: Eric J. Schwabe, Ian M. Sutherland
Abstract: The joint demands of high performance and fault tolerance in a large array of disks can be satisfied by a parity-declustered data layout -- an arrangement of data and redundant information that allows the rapid reconstruction of lost data while the array continues to operate. A data layout is typically generated by partitioning the data units on the disks into stripes and choosing one or more units per stripe to hold redundant information. Such a data layout can be represented as a table of stripes. The data mapping problem is the problem of translating a data address in a linear address space (the file system's view) into a disk identifier and an offset on the disk where the data is stored. Typically, the disk and offset are obtained from the data layout using table lookups, but recent work has yielded mappings that compute (disk, offset) pairs directly from data addresses without the need to store tables. In this paper, we show that parity-declustered data layouts based on commutative rings yield mappings with improved computational efficiency. These layouts also apply to a wider range of array configurations than other known layouts that do not use table lookup.
Full Paper: [postscript, pdf]
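
Illustration: the table-lookup approach to the data mapping problem described in the abstract can be sketched in a few lines of Python. This is only a toy under stated assumptions, not the ring-based construction of the paper: the layout takes every k-subset of v disks as a stripe, parity is placed by a naive rotation that is not claimed to be balanced, and the names build_layout and map_address are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

Unit = Tuple[int, int]             # (disk, offset)
Stripe = Tuple[List[Unit], Unit]   # (data units, parity unit)


def build_layout(v: int, k: int) -> List[Stripe]:
    """Build a toy table of stripes over v disks with k units per stripe.

    Assumption: one stripe per k-subset of disks, so k < v gives a
    declustered layout. Offsets are handed out per disk in stripe order.
    """
    next_offset = [0] * v
    layout: List[Stripe] = []
    for s, disks in enumerate(combinations(range(v), k)):
        units = []
        for d in disks:
            units.append((d, next_offset[d]))
            next_offset[d] += 1
        parity_idx = s % k  # naive rotation of the parity position
        data = units[:parity_idx] + units[parity_idx + 1:]
        layout.append((data, units[parity_idx]))
    return layout


def map_address(layout: List[Stripe], addr: int) -> Unit:
    """Translate a linear data address into (disk, offset) by table lookup."""
    data_per_stripe = len(layout[0][0])
    stripe, idx = divmod(addr, data_per_stripe)
    return layout[stripe][0][idx]
```

With v = 4 and k = 3 the table holds four stripes of two data units each, and map_address(build_layout(4, 3), 3) returns (3, 0). The table grows with the number of stripes, which is the storage cost that directly computed mappings avoid.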