Title: Hypercube Network Fault Tolerance: A Probabilistic Approach
Published: January 2002
Authors: Jianer Chen, Iyad A. Kanj, Guojun Wang
Abstract: Extensive experiments and experience have shown that the well-known hypercube networks are highly fault tolerant. On the other hand, most proposed fault tolerance models for hypercube networks are only able to characterize very rare extreme situations thus significantly underestimating the power of hypercube network fault tolerance. In this paper, we develop a new scheme that enables us to derive lower bounds for the probability of hypercube network fault tolerance in terms of node failure probability. Our results provide formal proofs that the hypercube networks can sustain an extremely large number of node failures. Two typical examples are that a 10-cube network of 1024 nodes can sustain up to 10% faulty nodes (i.e., over 100 faulty nodes) while still keeping the non-faulty nodes connected with probability 99%, and that if the failure probability of each individual node is bounded by 0.1%, then all hypercube networks of practical size (e.g., up to a trillion nodes) are able to keep their non-faulty nodes connected with probability 99.9%. A simple routing algorithm, which is local-information-based, is proposed based on this scheme that, with roughly the same high probability, runs in optimal time and constructs a routing path of length bounded by a small constant plus twice the Hamming distance between any two given non-faulty nodes. Our scheme is also applicable to the study of other hierarchical network structures and of other network communication problems.
Full Paper: [postscript]