ASPCS
 
Paper: Software Fault Tolerance for Low-to-Moderate Radiation Environments
Volume: 238, Astronomical Data Analysis Software and Systems X
Page: 257
Authors: Sengupta, R.; Offenberg, J. D.; Fixsen, D. J.; Katz, D. S.; Springer, P. L.; Stockman, H. S.; Nieto-Santisteban, M. A.; Hanisch, R. J.; Mather, J. C.
Abstract: The primary goal of NASA's Remote Exploration and Experimentation (REE) project is to use commercial off-the-shelf, scalable, low-power, fault-tolerant, high-performance computing in space. Most of the faults caused by the radiation environments in the regions of space of interest to REE (Deep Space, Low Earth Orbit) are transient single event effects. Some of these faults can cause errors at different application levels. System and application software can potentially detect and correct at least some of these errors. We discuss software fault tolerance approaches such as replication, voting, and masking, with a focus on algorithm-based fault tolerance. We also discuss combined software and hardware approaches such as fault avoidance, redundancy, masking, and reconfiguration. These approaches allow trade-offs between reliability, power, cost, and computational performance for spacecraft in a low-to-moderate radiation environment.
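As an illustration of the algorithm-based fault tolerance mentioned in the abstract, the following is a minimal sketch of the classic checksum scheme for matrix multiplication. It is not taken from the paper, and the abstract does not say which algorithms the authors protect; all function names here are hypothetical, and integer matrices are assumed so that checksums can be compared exactly (floating-point data would need a tolerance).

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def with_column_checksum(a):
    """Append one extra row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append one extra column holding the sum of each row of b."""
    return [row + [sum(row)] for row in b]


def find_and_fix_single_error(cf):
    """Check a full-checksum product in place.

    cf is the product of a column-checksum matrix and a row-checksum
    matrix, so its last row and last column should equal the column and
    row sums of the data block. Returns None if the checksums agree, or
    the (row, col) of a single corrupted data element after repairing it.
    Assumption: at most one data element is wrong; other patterns raise.
    """
    n, m = len(cf) - 1, len(cf[0]) - 1
    bad_rows = [i for i in range(n) if sum(cf[i][:m]) != cf[i][m]]
    bad_cols = [j for j in range(m)
                if sum(cf[i][j] for i in range(n)) != cf[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        i, j = bad_rows[0], bad_cols[0]
        others = sum(cf[i][:m]) - cf[i][j]
        cf[i][j] = cf[i][m] - others  # rebuild the element from the row checksum
        return (i, j)
    raise ValueError("error pattern not correctable by this sketch")


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
cf = matmul(with_column_checksum(a), with_row_checksum(b))
cf[1][0] ^= 8  # simulate a transient single-bit upset in one result element
location = find_and_fix_single_error(cf)
```

The redundancy costs one extra row and one extra column of arithmetic rather than a full second or third copy of the computation, which is the kind of reliability-versus-computation trade-off the abstract refers to.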