Jun 06, 2025; Talk
Echtzeit-AG
Fault Tolerant Memory Allocation Interface for Software-based Checksums
Fault tolerance is essential for system reliability: it prevents data corruption and critical failures in applications where errors can have serious consequences. Periodically verified checksums are a key building block of fault-tolerant systems. Generic Object Protection introduces differential checksums into object-level data organization, automatically wrapping each access function with checksum verification and, where needed, recalculation.
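To illustrate the idea of a differential checksum, here is a minimal sketch in C. It is not taken from Generic Object Protection; the type `protected_obj` and the functions `protected_write` and `protected_verify` are hypothetical names, and a plain XOR checksum is assumed purely for brevity. The point it shows is that a write can update the checksum from the old and new value of the modified word alone, without re-reading the whole object.

```c
#include <stddef.h>
#include <stdint.h>

#define OBJ_WORDS 8

/* Hypothetical protected object: payload words plus an XOR checksum. */
typedef struct {
    uint32_t words[OBJ_WORDS];
    uint32_t checksum;
} protected_obj;

/* Full recomputation: XOR over all payload words. */
static uint32_t full_checksum(const uint32_t *w, size_t n) {
    uint32_t c = 0;
    for (size_t i = 0; i < n; i++) {
        c ^= w[i];
    }
    return c;
}

/* Differential update: XOR out the old value and XOR in the new one,
 * touching only the word that is written. */
static void protected_write(protected_obj *o, size_t i, uint32_t v) {
    o->checksum ^= o->words[i] ^ v;
    o->words[i] = v;
}

/* Verification: compare the stored checksum against a full recomputation.
 * Returns 1 if the object is consistent, 0 otherwise. */
static int protected_verify(const protected_obj *o) {
    return full_checksum(o->words, OBJ_WORDS) == o->checksum;
}
```

With this scheme the cost of a write is independent of the object size, while verification still requires a pass over the whole object. A production checksum would likely use a stronger code than XOR, which cannot detect two flips of the same bit position in different words.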
This thesis aims to implement software-based checksumming for fault tolerance at both page and allocation granularity, protecting any data regardless of its organization. It will develop a memory allocation interface that enforces checksumming on allocated memory regions and explore transparent and semi-transparent applications of these checksums. The focus will be on the benefits of differential checksums, evaluating their effectiveness in detecting and preventing faults using the FAIL* simulation-based fault injection framework. Finally, the thesis will analyze the efficiency of the system, particularly the computational and resource overhead introduced by the checksumming process, to assess the practicality and scalability of the proposed solution in real-world applications.
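One way such an allocation interface could look at allocation granularity is sketched below. This is an illustration under stated assumptions, not the interface developed in the thesis: the names `ck_alloc`, `ck_write`, `ck_verify`, and `ck_free` are hypothetical, the checksum is a simple byte sum modulo 2^32, and the variant shown is the semi-transparent one in which writes go through an explicit function so that the checksum can be updated differentially.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header stored directly in front of each allocation.
 * The union pads the header so the payload keeps malloc's alignment. */
typedef union {
    struct {
        size_t size;        /* payload size in bytes */
        uint32_t checksum;  /* sum of all payload bytes, modulo 2^32 */
    } meta;
    max_align_t align_;
} ck_header;

static ck_header *ck_hdr(void *p) {
    return (ck_header *)p - 1;
}

/* Allocate a zeroed, checksummed region; the sum of zero bytes is 0. */
void *ck_alloc(size_t size) {
    if (size > SIZE_MAX - sizeof(ck_header)) {
        return NULL;
    }
    ck_header *h = calloc(1, sizeof(ck_header) + size);
    if (h == NULL) {
        return NULL;
    }
    h->meta.size = size;
    h->meta.checksum = 0;
    return h + 1;
}

/* Checked write: copy len bytes to offset off and update the checksum
 * differentially (add the new byte, subtract the old one).
 * Returns 0 on success, -1 if the range is out of bounds. */
int ck_write(void *p, size_t off, const void *src, size_t len) {
    ck_header *h = ck_hdr(p);
    if (off > h->meta.size || len > h->meta.size - off) {
        return -1;
    }
    unsigned char *dst = (unsigned char *)p + off;
    const unsigned char *s = src;
    for (size_t i = 0; i < len; i++) {
        h->meta.checksum += (uint32_t)s[i] - (uint32_t)dst[i];
        dst[i] = s[i];
    }
    return 0;
}

/* Recompute the byte sum and compare it with the stored checksum.
 * Returns 1 if they match, 0 if a corruption was detected. */
int ck_verify(const void *p) {
    const ck_header *h = (const ck_header *)p - 1;
    const unsigned char *bytes = p;
    uint32_t sum = 0;
    for (size_t i = 0; i < h->meta.size; i++) {
        sum += bytes[i];
    }
    return sum == h->meta.checksum;
}

void ck_free(void *p) {
    if (p != NULL) {
        free(ck_hdr(p));
    }
}
```

In this sketch, a write that bypasses `ck_write` is indistinguishable from a fault and will make `ck_verify` fail. A fully transparent variant would instead have to intercept ordinary stores, which is outside what this example attempts. Note also that the header itself is unprotected here.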