Citation
Salvador Perea, Rubén and Otero Marnotes, Andres and Mora, Javier and Torre Arnanz, Eduardo de la and Sekanina, Lukás and Riesgo Alcaide, Teresa
(2011).
Fault Tolerance Analysis and Self-Healing Strategy of Autonomous, Evolvable Hardware Systems.
In: "2011 International Conference on Reconfigurable Computing and FPGAs (ReConFig)", 30/11/2011 - 02/12/2011, Cancún, México. ISBN 978-1-4577-1734-5. pp. 164-169.
Abstract
This paper presents an analysis of the fault tolerance achieved by an autonomous, fully embedded evolvable hardware system, which uses a combination of partial dynamic reconfiguration and an evolutionary algorithm (EA). It demonstrates that the system may self-recover from both transient and cumulative permanent faults. This self-adaptive system, based on a 2D array of 16 (4×4) Processing Elements (PEs), is tested with an image filtering application. Results show that it may properly recover from faults in up to 3 PEs, that is, more than 18% cumulative permanent faults. Two fault models are used for testing purposes, at PE and CLB levels. Two self-healing strategies are also introduced, depending on whether fault diagnosis is available or not. They are based on scrubbing, fitness evaluation, dynamic partial reconfiguration and in-system evolutionary adaptation. Since most of these adaptability features are already available on the system for its normal operation, the resource cost for self-healing is very low (only some code additions in the internal microprocessor core).
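The abstract names the building blocks of the self-healing strategies (fitness evaluation, scrubbing, in-system evolution) but not how they are sequenced. The following is a minimal toy sketch of one plausible ordering for the case where no fault diagnosis is available; it is not the authors' implementation. The class `ToyArray`, the function `self_heal`, the error metric, and the ordering of the steps are all assumptions made for illustration. The limit of 3 bypassable PEs simply mirrors the figure reported in the abstract.

```python
from dataclasses import dataclass, field

ROWS, COLS = 4, 4          # 4x4 PE array, as stated in the abstract
MAX_BYPASSABLE = 3         # toy limit mirroring the reported result (assumption)


@dataclass
class ToyArray:
    """Hypothetical stand-in for the PE array, not a hardware model."""
    transient: set = field(default_factory=set)  # PEs with corrupted configuration
    permanent: set = field(default_factory=set)  # PEs with damaged fabric
    bypassed: set = field(default_factory=set)   # PEs the evolved circuit avoids

    def fitness_error(self) -> int:
        # Toy metric: count of faulty PEs the current circuit still relies on.
        return len((self.transient | self.permanent) - self.bypassed)

    def scrub(self) -> None:
        # Rewriting the configuration clears transient faults only.
        self.transient.clear()

    def evolve(self) -> None:
        # Placeholder for the EA: assume it finds a circuit that works around
        # the damaged PEs, but only while few enough of them are damaged.
        if len(self.permanent) <= MAX_BYPASSABLE:
            self.bypassed |= self.permanent


def fault_fraction(n_faulty: int) -> float:
    """Fraction of the array affected by n_faulty PEs."""
    return n_faulty / (ROWS * COLS)


def self_heal(array: ToyArray, threshold: int = 0) -> str:
    """Cheapest remedy first: check fitness, then scrub, then re-evolve."""
    if array.fitness_error() <= threshold:
        return "healthy"
    array.scrub()
    if array.fitness_error() <= threshold:
        return "recovered by scrubbing"      # fault was transient
    array.evolve()
    if array.fitness_error() <= threshold:
        return "recovered by evolution"      # fault was permanent but tolerable
    return "unrecovered"
```

In this sketch, `fault_fraction(3)` gives 3/16 = 0.1875, which matches the "more than 18%" figure in the abstract. Trying scrubbing before evolution reflects the assumption that rewriting the configuration is far cheaper than running the EA again.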