Fast Abstracts Archives

Limits of Engineering Judgement for Safety-Critical Software-Based Embedded Systems

Francesca Saglietti¹
Institute for Safety Technology (ISTec) Garching (D)
saf@istecmuc.grs.de

 

In recent years the software engineering community has invested considerable effort in attempts to formalise the decision process underlying licensing procedures for safety-critical embedded systems based on software-intensive components. As for more conventional technologies based on mechanical or electrical devices, the licensing process first requires an evaluation of the risk introduced by automation, as compared with the inherent risks of the uncontrolled application. The main outcomes of this analysis are the identification and classification of failure criticality; based on this insight, minimum reliability demands are determined, which must then be demonstrated for the purpose of certification.

Product Reliability

According to most international standards the quantitative notion of software reliability refers to the probability of operational survival; i.e., to the probability of correct performance under the expected application-specific operational demand profile. The source of randomness justifying a probabilistic approach in spite of the deterministic nature of software faults lies in the input selection, which is subject to unpredictable physical state transitions of the underlying technical process. Thus, a quantitative evaluation of software reliability is achievable by

operational evidence: the execution of a large number of independent and representative scenarios is followed by an estimation of maximum failure probability based on statistical sampling theory. Assuming an accurate performance of this testing phase, it provides the rationale for deciding whether:

  • the system fulfils ultrahigh reliability (F), or
  • the system does not fulfil ultrahigh reliability (N).

The number of test cases required to demonstrate ultrahigh reliability targets and the difficulty of anticipating the operational profile often pose serious problems for a statistically significant reliability evaluation of safety-critical software.
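To give a feel for the test effort involved, the following minimal sketch computes how many independent, failure-free test cases are needed to bound the failure probability per demand. It assumes the standard zero-failure bound from sampling theory, (1 - p)^n <= 1 - c; the function name, the reliability targets and the 99% confidence level are illustrative assumptions and are not taken from this article.

```python
import math


def required_failure_free_tests(max_failure_prob: float, confidence: float) -> int:
    """Smallest n such that n independent, failure-free demands support the
    claim 'failure probability per demand < max_failure_prob' at the given
    confidence level (zero-failure bound: (1 - p)**n <= 1 - confidence)."""
    if not (0.0 < max_failure_prob < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    # log1p(-p) = ln(1 - p), computed accurately for very small p
    n = math.log(1.0 - confidence) / math.log1p(-max_failure_prob)
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical targets, chosen only to show how the effort scales.
    for target in (1e-3, 1e-4, 1e-5):
        n = required_failure_free_tests(target, confidence=0.99)
        print(f"target {target:g}: {n} failure-free tests")
```

Under these assumptions the required number of tests grows roughly in inverse proportion to the reliability target, and every test case must in addition be representative of the operational profile.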

Expert Judgement

For this reason, and in order to support early fault detection by transparent and sound design methods, independent assessors usually also include in their quantitative judgement qualitative aspects based on

non-operational evidence: this refers to the life-cycle phases preceding operation (among others: development process, safety culture, documentation, resources invested, inspections, static analysis, non-operational tests).

The crucial problem in engineering judgement is the integration of heterogeneous (operational and non-operational) evidence into a single probabilistic statement. The validity of such an integration mainly depends on the available expertise, which may be of

frequentist nature: factual knowledge of process impact on product quality is available on the basis of large populations of identical or comparable (typically physical) devices developed by standardised manufacturing procedures; or rather of

subjective nature: personal expectation of process impact on product quality on the basis of individual observations of "similar" development projects.

The assessment process has one of two outcomes:

  • acceptance of the software-based system (A), or
  • rejection of the software-based system (R).

In order to provide a formal framework for the integration of heterogeneous evidence, diagnosis activities in several technical fields are supported by

Bayesian Belief Networks (BBNs): the assessment process is modelled by a network reflecting the updating of prior probabilities based on historical experience in the light of present factual knowledge.

It is not the intention of this article to detract from the qualitative merit of BBNs in providing a transparent assessment structure (see [Dah97], [Del97], [Nei96]). Rather, it aims at an analysis of the statistical significance of the quantitative outcome of BBNs in the special case of ultrahigh software reliability assessment. The main limit of engineering judgement pointed out here concerns

stable development: the determination of sound a priori values based on historical, non-operational evidence assumes homogeneous expertise in a stable programming environment; variance in the development process limits the validity of subjective judgement. In particular, assuming project homogeneity implies stability in the reliability targets to be achieved and demonstrated. At present, this assumption is not generally fulfilled; on the contrary, experience with ultrahigh-reliability systems usually represents only a small fraction of the overall set of applications considered:

P(F) << P(N)

In fact, most subjective opinions originate from lessons learnt from past faults rather than from successful experience. In other words, reasoning on process effectiveness relies more often on falsification than on verification. Considering five safety integrity levels (as in IEC 65A), the frequency of applications experienced is likely to follow a normal distribution, in which the highest reliability class may represent just one out of ten cases:

P(F) / [P(F) + P(N)] = ca. 0.1.

Expert Reliability. Safety assessment based on subjective judgement evidently relies on expert quality. Due to the above, probabilities for justified acceptance / rejection are not likely to be better than:

P(A under condition F) = P(R under condition N) = ca. 0.9.

Certification Reliability. For the purpose of licensing, more crucial than expert reliability is the probability of successful certification (S), i.e. of ultrahigh system reliability given acceptance. In analogy to the Harvard medical study reported in [Gra98], applying Bayes’ theorem to the examples considered above yields a surprisingly low probability of successful certification:

P(A) = P(A under condition N) P(N) + P(A under condition F) P(F) = ca. 0.18

P(S) = P(F under condition A) = P(A under condition F) P(F) / P(A) = ca. 0.5.
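The two formulas above can be reproduced with a few lines of code. The sketch below plugs in the values used in this article (P(F) = 0.1, P(A under condition F) = P(R under condition N) = 0.9); the function name and the additional expert-reliability values in the loop are illustrative assumptions added here to show the sensitivity of the result.

```python
from typing import Tuple


def certification_reliability(prior_f: float,
                              p_accept_given_f: float,
                              p_reject_given_n: float) -> Tuple[float, float]:
    """Return (P(A), P(F under condition A)) by Bayes' theorem."""
    prior_n = 1.0 - prior_f                      # P(N)
    p_accept_given_n = 1.0 - p_reject_given_n    # P(A under condition N)
    # Total probability of acceptance
    p_accept = p_accept_given_n * prior_n + p_accept_given_f * prior_f
    if p_accept == 0.0:
        raise ValueError("acceptance has probability zero; posterior undefined")
    # Probability of successful certification, P(S)
    p_success = p_accept_given_f * prior_f / p_accept
    return p_accept, p_success


if __name__ == "__main__":
    # Values from the text
    p_a, p_s = certification_reliability(0.1, 0.9, 0.9)
    print(f"P(A) = {p_a:.2f}, P(S) = {p_s:.2f}")

    # Hypothetical expert reliabilities, for comparison only
    for r in (0.9, 0.95, 0.99):
        _, p_s = certification_reliability(0.1, r, r)
        print(f"expert reliability {r}: P(S) = {p_s:.3f}")
```

With a prior of only 0.1 for ultrahigh reliability, wrongly accepted systems from the large class N are as numerous as correctly accepted systems from the small class F, which is why P(S) comes out at only about 0.5.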

Conclusion. This article aims at demonstrating a substantial weakness in the quantitative assessment of ultrahigh software reliability based on subjective judgement. To begin with, the still unstable state of today’s software engineering limits the extrapolation of past experience to new projects; this severely restricts expert reliability. Moreover, the high variance in the reliability levels achieved and demonstrated so far contributes to a dramatic decrease in the probability of successful certification. However, a transparent description of the assessment process undoubtedly offers qualitative support, which might in future be extended to include sound quantification.

References

[Dah97] G. Dahll: "Safety Assessment of Software Based Systems", Safecomp’97, Springer-Verlag 1997

[Del97] K. Delic, F. Mazzanti, L. Strigini: "Formalising Engineering Judgement on Software Dependability via Belief Networks", DCCA’97, Springer-Verlag 1997

[Gra98] T. Grams: "Bedienfehler und ihre Ursachen", Automatisierungstechnik Praxis 40, Nr. 3, 4, Oldenbourg-Verlag 1998

[Nei96] M. Neil, B. Littlewood, N. Fenton: "Applying Bayesian Belief Networks to System Dependability Assessment", Safety Critical Systems Club Symposium, Springer-Verlag 1996


¹ Author contact: Institute for Safety Technology (ISTec) GmbH, Forschungsgelände, 85748 Garching, Germany, phone: 49 89 32004 - 539, fax: - 300, e-mail: saf@istecmuc.grs.de.