Optimizing Apportionment of Redundancies in Hierarchical RAID
Large disk arrays are organized into storage nodes (SNs), or bricks, each with its own cached RAID controller for multiple disks. Erasure coding at the SN level is attained via parity or Reed-Solomon codes. Hierarchical RAID (HRAID) provides an additional level of coding across SNs, e.g., check strips P and Q at the intra-SN level and R at the inter-SN level. Failed disks and SNs are not replaced; rebuild is accomplished by restriping, e.g., overwriting P and Q for disk failures and R for an SN failure. For a given total redundancy level, we use an approximate reliability analysis method and Monte Carlo simulation to determine the better apportionment of check blocks between intra-SN and inter-SN redundancy. Our study indicates that a higher MTTDL (Mean Time to Data Loss) is attained by assigning higher reliability to the intra-SN level rather than the inter-SN level, which is contrary to the conclusion of an IBM study.
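As a rough illustration of the kind of Monte Carlo estimate mentioned above, the following sketch estimates the MTTDL of a single SN under strong simplifying assumptions that are not taken from the paper: disk lifetimes are independent and exponentially distributed, restriping is instantaneous, controller and SN failures are ignored, and data is lost when one more disk fails than there are check strips to overwrite. All function and parameter names are hypothetical. Under these assumptions the result can be checked against the closed-form mean of an exponential order statistic.

```python
import random


def simulate_time_to_loss(n_disks, tolerated, fail_rate, rng):
    """Return the time of the (tolerated + 1)-th disk failure in one SN.

    Assumption (not from the paper): i.i.d. exponential disk lifetimes,
    no disk replacement, and instantaneous restriping, so data is lost
    exactly when the number of failures exceeds the check strips.
    """
    lifetimes = sorted(rng.expovariate(fail_rate) for _ in range(n_disks))
    return lifetimes[tolerated]


def estimate_mttdl(n_disks, tolerated, fail_rate, trials=20000, seed=0):
    """Monte Carlo estimate of MTTDL as the mean simulated time to loss."""
    rng = random.Random(seed)
    total = sum(
        simulate_time_to_loss(n_disks, tolerated, fail_rate, rng)
        for _ in range(trials)
    )
    return total / trials


def analytic_mttdl(n_disks, tolerated, fail_rate):
    """Closed-form mean of the (tolerated + 1)-th exponential order statistic."""
    return sum(
        1.0 / ((n_disks - i) * fail_rate) for i in range(tolerated + 1)
    )


if __name__ == "__main__":
    # Hypothetical example: 12 disks, 2 check strips (P and Q),
    # failure rate of 1e-6 per hour per disk.
    print(estimate_mttdl(12, 2, 1e-6))
    print(analytic_mttdl(12, 2, 1e-6))
```

Comparing intra-SN against inter-SN apportionment would additionally require modeling SN failures and the inter-SN check strips, which this single-node sketch omits.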