RPC-V: Toward Fault-Tolerant RPC for Internet Connected Desktop Grids with Volatile Nodes

Abstract: RPC is one of the programming models envisioned for the Grid. In Internet-connected large-scale Grids such as Desktop Grids, node and network failures are not rare events. This paper provides several contributions, examining the feasibility and limits of fault-tolerant RPC on these platforms. First, we characterize these Grids by their fundamental features and demonstrate that their application scope should be safely restricted to stateless services. Second, we present a new fault-tolerant RPC protocol based on an original combination of a three-tier architecture, passive replication and message logging. We describe RPC-V, an implementation of the proposed protocol within the XtremWeb Desktop Grid middleware. Third, we evaluate the performance of RPC-V and the impact of faults on the execution time, using a real-life application on a Desktop Grid testbed assembling nodes in France and the USA. We demonstrate that RPC-V allows applications to continue their execution even when key system components fail.
S. Djilali, T. Herault, Oleg Lodygensky, T. Morlier, Gilles Fedak, et al. RPC-V: Toward Fault-Tolerant RPC for Internet Connected Desktop Grids with Volatile Nodes. SuperComputing 2004 (SC2004), Nov 2004, Pittsburgh, United States. pp.39-39, ⟨10.1109/SC.2004.51⟩. ⟨in2p3-00457039⟩
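
To make the combination named in the abstract more concrete, the following is a minimal illustrative sketch, not the RPC-V protocol itself. It assumes a client tier that logs each request before sending it, a middle tier of coordinators where a backup is used only when the primary is unreachable, and a stateless service that can safely be re-executed. All names (`Coordinator`, `Client`, `NodeDown`, `square`) are hypothetical and do not come from the paper or from XtremWeb.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class NodeDown(Exception):
    """Raised when a simulated coordinator is unreachable."""


@dataclass
class Coordinator:
    """Hypothetical middle-tier node that runs a stateless service on request."""
    name: str
    service: Callable[[int], int]
    alive: bool = True
    results: Dict[int, int] = field(default_factory=dict)  # request id -> result

    def submit(self, request_id: int, arg: int) -> int:
        if not self.alive:
            raise NodeDown(self.name)
        # Re-execution is harmless here because the service keeps no state.
        if request_id not in self.results:
            self.results[request_id] = self.service(arg)
        return self.results[request_id]


@dataclass
class Client:
    """Hypothetical client tier that logs requests and fails over on errors."""
    coordinators: List[Coordinator]  # primary first, then backups
    log: Dict[int, int] = field(default_factory=dict)  # request id -> argument
    next_id: int = 0

    def call(self, arg: int) -> int:
        request_id = self.next_id
        self.next_id += 1
        self.log[request_id] = arg  # log the message before sending it
        return self.replay(request_id)

    def replay(self, request_id: int) -> int:
        """Resend a logged request to the first reachable coordinator."""
        arg = self.log[request_id]
        for coordinator in self.coordinators:
            try:
                return coordinator.submit(request_id, arg)
            except NodeDown:
                continue  # try the next replica
        raise NodeDown("all coordinators unreachable")


def square(x: int) -> int:
    return x * x


primary = Coordinator("primary", square)
backup = Coordinator("backup", square)
client = Client([primary, backup])

first = client.call(3)      # served by the primary
primary.alive = False       # simulate a coordinator failure
second = client.call(4)     # transparently served by the backup
replayed = client.replay(0) # logged request re-executed on the backup
```

In this sketch the client-side log stands in for any state transfer between replicas: because the service is stateless, replaying a logged request on the backup yields the same result the primary would have returned. A real protocol would also need failure detection, logging on the other tiers, and recovery of restarted nodes, none of which is modeled here.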



