More and more systems (e.g., aircraft, machinery, cars) rely heavily on software, which performs safety-critical operations. Assuring software safety though traditional V&V has become a tremendous, if not impossible task, given the growing size and complexity of the software. We propose that iSWHM (Integrated SoftWare Health Management) can increase safety and reliability of high-assurance software systems. iSWHM uses advanced techniques from the area of system health management in order to continuously monitor the behavior of the software during operation, quickly detect anomalies and perform automatic and reliable root-cause analysis, while not replacing traditional V&V. Information provided by the iSWHM system can be used for automatic mitigation mechanisms (e.g., recovery, dynamic reconfiguration) or presented to a human operator. iSWHM’s prognostic capabilities will further improve reliability and availability as it provides information about soon-to-occur failures or looming performance bottlenecks. In this paper, we will discuss challenges and future potential and describe how Bayesian networks (BN) could be used for iSWHM modeling.
- Software engineering,
- Software bugs,
- Bayesian networks,
- system health management
Available at: http://works.bepress.com/ole_mengshoel/12/