The present invention pertains to a computer-implemented method for resolving closed loops in automatic fault tree analysis of a multi-component system. The present invention further pertains to a device comprising a processor configured to perform such a method.
1. Technical field
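The present invention relates to a computer-implemented method for resolving closed loops in automatic fault tree analysis of a multi-component system. Further, the invention relates to a corresponding computing unit and a corresponding computer program product.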
2. Prior art
The importance of safety-critical systems in many application domains of embedded systems, such as aerospace, railway, health care, automotive and industrial automation, is continuously growing. Thus, along with the growing system complexity, the need for safety assurance as well as its effort is increasing in order to guarantee the high quality demands in these application domains. The aim of safety assurance is to ensure that systems do not lead to hazardous situations which may harm people or endanger the environment. In the application domains of safety-critical systems, the safety assurance is defined by means of standards, see, e.g., International Electrotechnical Commission (IEC) 61508, "Functional Safety of Electrical/Electronic/Programmable Electronic Safety-related Systems," 1998.
Traditionally, the assessment of a system in terms of safety is based on a bottom-up safety analysis approach, such as Failure Mode and Effect Analysis (FMEA), see IEC 60812, "Analysis Techniques for System Reliability - Procedure for Failure Mode and Effects Analysis (FMEA)," 1991. Alternatively, the assessment of a system according to reference implementations is based on top-down approaches such as Fault Tree Analysis (FTA), see, e.g., Vesely et al., "Fault Tree Handbook," US Nuclear Regulatory Commission, 1981. By such techniques, it is possible to identify system failure states, their causes and effects with impact on the system safety.
Architectures of systems often contain loops. An example of a loop is a closed-loop controller (PID). Closed-loop control refers to the process in which a physical variable, e.g., an ambient temperature, is to be brought to a particular value while being stabilized against disturbances. A feedback - obtained based on measuring an observable indicative of the physical variable - is used to set operation of an actuator influencing the physical variable. The controller is the component that acquires the actual value and derives a control signal from the difference between the set point and the actual value. The controller then activates a final controlling element, e.g., a heater, that compensates for the control deviation.
Since failure propagation models often use Boolean logic, e.g., to drive a fault tree (FT), (closed) loops or ring closures are problematic. Because Boolean logic cannot contain loops in general, there are techniques to prevent loops in such models, e.g., as described in Höfig et al., "Streamlining Architectures for Integrated Safety Analysis Using Design Structure Matrices (DSMS)," Safety and Reliability: Methodology and Applications, 2014. For applications where failure propagation models are composed automatically, e.g., when the architecture is generated, such preventive technologies cannot help. Such loops often cannot be prevented, as they simply develop during the composition of a system from existing components and existing parts of failure propagation models. Therefore, a technique is required that is able to deal with loops in failure propagation models that use Boolean logic.
Another approach can be found in Lim et al., "Systematic Treatment of Circular Logics in a Fault Tree Analysis," Nuclear Engineering and Design, 245 (Supplement C):172 - 179, 2012, where the initial condition of a system is investigated to treat circular logic. An initial condition of a circular logic is the point where the loop is closed. If the next gate is of the type "fails to run" or "fails to start", the circular logic is treated differently. These conditions indicate whether a supporting system is in standby and needs to be started to fulfill its function or if a system is running and fails to perform its operation. This knowledge is required for all gates of a fault tree logic that close a loop to automatically treat the circular logic using the algorithm. This limits the ability to fully automate the process of removing circular logic from automatically generated fault trees.
Against this background, a need exists for advanced techniques of analyzing safety-critical systems. In particular, a need exists for advanced techniques of identifying and resolving loops in fault trees.
It is therefore an objective of the invention to provide a computer-implemented method for resolving closed loops in automatic fault tree analysis of a multi-component system in an efficient and reliable manner.
3. Summary of the invention
According to one aspect of the invention, this problem is solved by a computer-implemented method for resolving closed loops in automatic fault tree analysis of a multi-component system, the method comprising the steps:
- a. modeling the multi-component system using a fault tree, the fault tree comprising elements associated with components of the multi-component system and interconnections between the elements associated with functional dependencies between the components;
- b. back-tracing failure propagation paths from an output element of the fault tree via the interconnections towards one or more input elements of the fault tree;
- c. checking, for all failure propagation paths, if the respective failure propagation path contains a closed loop by identifying a downstream element of the respective failure propagation path having a dependency of its output value on an output value of an upstream element (4c) of the failure propagation path;
- d. setting the input value corresponding to a loop interconnection of each such downstream element to Boolean TRUE;
- e. identifying any Boolean AND-gate having, independently of the specific values of the input elements, no Boolean TRUE as output value, and cutting off any Boolean TRUE input to these identified Boolean AND-gates between the respective downstream element and the respective upstream element; and
- f. setting the input value of each respective downstream element corresponding to the loop interconnection to Boolean FALSE.
One idea of the present invention is to provide a method of resolving closed loops by following an approach inspired by fixed-point iteration, i.e. by a method of computing solutions of equations that can be written in the form x = f(x). Given such a function defined on real numbers with real values and given a starting point x_0 in the domain of f, one can show that the sequence x_{n+1} = f(x_n), n = 0, 1, 2, ... converges to a solution x of x = f(x) under specific circumstances. In the present case, taking into account the underlying Boolean logic, only two values are possible for variables, namely Boolean TRUE and FALSE. A fault tree or failure propagation paths within the fault tree may be regarded as some form of equation or system of coupled equations. The present invention now considers Boolean TRUE as starting value for all failure propagation paths where a closed loop has been discovered, such closed loops being found by iteratively going through the fault tree from the output to one or more inputs. Based on that, certain properties of the fault tree are evaluated and the fault tree is amended in a specific way to remove any closed loop present in the fault tree. Subsequently, Boolean FALSE is inserted as second starting value to render the remaining fault tree analyzable.
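The role of the starting value can be illustrated with a minimal Python sketch. The looped gate x = a OR (b AND x) and the helper names `iterate_to_fixed_point` and `make_loop` are hypothetical and chosen only for illustration; they are not taken from the embodiments.

```python
from typing import Callable


def iterate_to_fixed_point(f: Callable[[bool], bool], start: bool,
                           max_steps: int = 10) -> bool:
    """Iterate x_{n+1} = f(x_n) until the value no longer changes."""
    x = start
    for _ in range(max_steps):
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt
    raise ValueError("no fixed point reached")


def make_loop(a: bool, b: bool) -> Callable[[bool], bool]:
    # Hypothetical looped gate: its own output x feeds back into an AND input.
    return lambda x: a or (b and x)


# With a = FALSE and b = TRUE the equation reduces to x = x, so both
# Boolean values solve it and the starting value decides the outcome.
from_true = iterate_to_fixed_point(make_loop(False, True), start=True)
from_false = iterate_to_fixed_point(make_loop(False, True), start=False)
print(from_true, from_false)
```

In this sketch the two starting values yield different solutions of the same equation, which is why the choice between Boolean TRUE and Boolean FALSE matters for a looped fault tree.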
The specifics of the method according to the invention will become more apparent further below with reference to exemplary embodiments depicted in the drawings as appended. In short, simple fault trees may already be solved by simply setting any loop interconnection to Boolean TRUE.
In case a tautology arises, which means that the output element of the fault tree gives Boolean TRUE independent of the values of the input elements, the loop interconnections have to be set to Boolean FALSE in a subsequent step, which then may render the fault tree analyzable, that is without any remaining loops.
However, for specific complex fault trees, problems may arise, which make it necessary to identify any Boolean AND-gate having, independently of the specific values of the input elements, no Boolean TRUE as output value. These inputs, namely the Boolean TRUE input to any Boolean AND-gate, are cut. Setting any loop interconnection to Boolean FALSE in the next step then renders any fault tree analyzable, which means that the remaining fault tree does not contain loops anymore and hence may be expressed as and/or expanded into a definite Boolean expression.
The method according to the invention results in a lower bound for the fault tree analysis. This means that the result of the fault tree analysis is either equal to or larger than the exact result of the fault tree.
The advantage is that determining a lower bound in addition to an upper bound enables, in most cases, a clear judgment as to whether safety requirements are fulfilled or not.
The solution according to the invention is highly effective compared to conventional methods. The method of the invention particularly features linear complexity O(n) and thus is much faster than any method known so far. The method may enable automated optimization of technical products and/or systems with regard to reliability, availability, maintainability and/or safety (RAMS requirements). Moreover, such RAMS requirements may be taken into consideration for the optimization of further technical system properties such as efficiency. The invention particularly provides an advanced technique for analyzing safety-critical systems.
According to another aspect, the fault tree is expressed within Boolean algebra by iteratively expanding the fault tree into Boolean expressions at the elements.
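The expansion of a loop-free fault tree into a Boolean expression can be sketched as follows in Python. The dictionary representation, the function name `cut_sets` and the example tree are assumptions made for illustration only; the expression is returned in the form of minimal cut sets, i.e. as a disjunction of conjunctions of basic events.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Set, Tuple

# name -> (kind, inputs); any name that is not a key is a basic event
Gates = Dict[str, Tuple[str, List[str]]]


def cut_sets(gates: Gates, name: str) -> Set[FrozenSet[str]]:
    """Expand the element `name` of a loop-free tree into minimal cut sets."""
    if name not in gates:
        return {frozenset([name])}          # a basic event is its own cut set
    kind, inputs = gates[name]
    child_sets = [cut_sets(gates, child) for child in inputs]
    if kind == "OR":
        result = set().union(*child_sets)   # any child cut set is sufficient
    else:
        # AND: one cut set from every child has to occur together
        result = {frozenset().union(*combo) for combo in product(*child_sets)}
    # drop every set that strictly contains another one (absorption)
    return {s for s in result if not any(t < s for t in result)}


# Hypothetical example: out = (b1 AND (b1 OR b2)) OR b3
example: Gates = {
    "out": ("OR", ["g1", "b3"]),
    "g1": ("AND", ["b1", "g2"]),
    "g2": ("OR", ["b1", "b2"]),
}
print(cut_sets(example, "out"))
```

The recursion only terminates for trees without closed loops, which is the property the method is meant to establish beforehand.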
According to another aspect, the closed loop of the fault tree is associated with a closed-loop control circuitry of the multi-component system.
A further aspect of the invention is a device comprising a processor configured to perform the aforementioned method.
4. Short description of the drawings
In the following detailed description, presently preferred embodiments of the invention are further described with reference to the following figures:
Fig. 1 shows a device with a processor performing a method according to the invention, resulting in a lower bound for the fault tree analysis.
Figs. 2-4 show a fault tree analyzed with the device of Fig. 1.
Figs. 5 and 6 show fault trees according to an alternative embodiment, resulting in an upper bound for the fault tree analysis.
Figs. 7 and 8 show fault trees according to an embodiment resulting in a lower bound for the fault tree analysis.
Figs. 9 and 10 show fault trees according to an alternative embodiment, resulting in an upper bound for the fault tree analysis.
Figs. 11 and 12 show fault trees according to an embodiment resulting in a lower bound for the fault tree analysis.
5. Detailed description of preferred embodiments
Although specific embodiments are illustrated and described herein, it will be appreciated by those of ordinary skill in the art that a variety of alternate and/or equivalent implementations may be substituted for the specific embodiments shown and described without departing from the scope of the present invention. Generally, this application is intended to cover any adaptations or variations of the specific embodiments discussed herein.
Herein, techniques are described to reliably and computationally inexpensively detect closed loops and/or ring closures in fault trees (FT). For this, a plurality of failure propagation paths are back-traced from an output of the FT towards one or more inputs of the FT. Then, for each failure propagation path, a check can be made if the respective failure propagation path forms a closed loop. Then, if a closed loop is identified, it may be possible to take an appropriate countermeasure to mitigate negative effects of the closed loop on the analyzability of the FT.
Generally, the techniques described herein may find application in various kinds and types of safety-critical systems. For example, the techniques described herein may find application in multi-component systems, e.g. control or actuator systems. Such control or actuator systems may provide control functionality or activation functionality for certain machines. Some elements of multi-component safety-critical systems may be implemented as hardware while some components may alternatively or additionally be implemented using software. It is possible that the safety-critical systems for which the techniques are employed include an output which provides an actuator force or a control signal for actuating or controlling one or more machines. Specific examples of safety-critical systems which may benefit from the techniques described herein include, but are not limited to, electronic circuitry including active and/or passive electronic components such as transistors, coils, capacitors, resistors, etc.; drivetrains for vehicles such as trains or passenger cars or airplanes; assembly lines including conveyor belts, robots, movable parts, control sections, test sections for inspecting manufactured goods (backend testing); medical systems such as imaging systems including magnetic resonance imaging or computer tomography, particle therapy systems; power plants; etc.
As a general rule, in the various examples described herein, different kinds and types of FTs may be used. An example implementation of an FT that may be relied upon in the techniques described herein includes a component FT (CFT). For the sake of simplicity, hereinafter, various examples are described in the context of CFTs - while, generally, also an FT may be employed.
CFTs are described, e.g., in Kaiser et al., "A new component concept for FTs," Proceedings of the 8th Australian Workshop on Safety Critical Systems and Software, Volume 33, pp. 37-46, 2003. CFTs provide a model- and component-based methodology for FT analysis, which supports a modular and compositional safety analysis strategy. The CFT includes a plurality of elements. The elements are associated with components of the system. The CFT also includes a plurality of interconnections between the elements. The interconnections are associated with functional dependencies between components of the system. Such functional dependencies may model input/output of control signals or flow of forces. The CFT may model an error behavior of the system. The error behavior of the system may be modeled by the CFT using approaches of hierarchical decomposition. Here, the overall behavior of the system can be predicted based on the individual behavior of components. In other words, the causal chain leading to an overall system behavior may be modeled by a causal chain of errors of components. The CFT may include Boolean interconnections between adjacent elements to model propagation of errors throughout the system. The CFT may model the system using a graph; here nodes of the graph may correspond to the elements and edges of the graph may correspond to the interconnections.
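A graph of this kind may, for example, be held in a simple data structure. The following Python sketch is purely illustrative: the class `Element`, the function `failed` and the element names are hypothetical and do not reproduce any figure. Nodes are elements, and the `inputs` list of each element holds its incoming interconnections.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    """Node of the graph: a Boolean gate or a basic event."""
    name: str
    kind: str                                        # "AND", "OR" or "EVENT"
    inputs: List[str] = field(default_factory=list)  # incoming edges


def failed(cft: Dict[str, Element], name: str, events: Dict[str, bool]) -> bool:
    """Propagate basic-event failures through the gates up to `name`."""
    element = cft[name]
    if element.kind == "EVENT":
        return events[name]
    values = [failed(cft, child, events) for child in element.inputs]
    return all(values) if element.kind == "AND" else any(values)


# Hypothetical system: the output fails if b1 and b2 both fail, or if b3 fails.
cft = {e.name: e for e in [
    Element("out", "OR", ["g1", "b3"]),
    Element("g1", "AND", ["b1", "b2"]),
    Element("b1", "EVENT"),
    Element("b2", "EVENT"),
    Element("b3", "EVENT"),
]}
print(failed(cft, "out", {"b1": True, "b2": True, "b3": False}))
```

Such a recursive evaluation presupposes an acyclic graph; if an element depended on its own output, the recursion would not terminate.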
Various techniques described herein are based on the finding that CFTs modeling a system using Boolean logic expressions can malfunction if they include closed loops and/or ring closures. A closed loop may generally be present if an input value of an element of the CFT is derived from an output having an associated Boolean logic expression, which includes that input value.
Figure 1 shows a device 10 with a processor 6 performing a method M according to the invention for resolving closed loops in automatic fault tree analysis of a multi-component system (not depicted). The multi-component system may be, for example, a safety critical system or the like, which may comprise closed-loop control circuitry of a closed-loop controller (PID). The PID may for example be configured to control a component of the multi-component system on basis of a closed control loop. The PID may for example control a physical variable like a temperature, a pressure, a force and so on.
The method M will be explained in detail with reference to Figs. 2 to 4 for one particular example of a fault tree 1. The fault tree 1 models a multi-component system and comprises a plurality of elements 4 associated with components of the multi-component system and interconnections 2 between the elements 4 associated with functional dependencies between the components. Accordingly, the method M comprises under M1 modeling the multi-component system using the fault tree 1. The fault tree 1 comprises one output element 4a and four input elements 4b. Each element 4 (labeled as X_i with i = 1...9) is associated with either a gate 3 or an event 5. In this particular example, there are three Boolean OR-gates 3b and two Boolean AND-gates 3a. Further, there are different basic events b_i. As can be seen in Fig. 2, the gates X_5 and X_6 both have inputs stemming from gates upstream in the fault tree, namely from X_3 and X_2, respectively. Hence, these two gates X_5 and X_6 cause loops within the fault tree 1, which make it problematic to automatically analyze the fault tree 1, as no meaningful Boolean expression can be readily assigned to the fault tree 1 due to the loops.
The method M further comprises under M2 back-tracing failure propagation paths 11 from the output element 4a of the fault tree 1 via the interconnections 2 towards the input elements 4b of the fault tree 1. This back-tracing is illustrated in Fig. 3, where it can be seen that the fault tree 1 is basically decomposed into two failure propagation paths 11, each of which features one closed loop 7. Or, to describe it differently, the fault tree 1 is "unrolled". The interconnection of each loop 7 to the respective failure propagation path 11 is labeled ψ_i in the following. Hence, the failure propagation path 11 on the left in Fig. 3 has one closed loop 7 connecting one input of element X_6 with the output of element X_2 at loop interconnection ψ_1. Correspondingly, the failure propagation path 11 on the right in Fig. 3 has one closed loop 7 connecting one input of element X_5 with the output of element X_3 at loop interconnection ψ_2.
Such loop-causing gates may be identified in a general manner by checking for all failure propagation paths 11 if the respective failure propagation path 11 contains a downstream element 4d having a dependency of its output value on an output value of an upstream element 4c of the failure propagation path 11. Accordingly, the method M comprises under M3 checking, for all failure propagation paths 11, if the respective failure propagation path 11 contains a closed loop 7 by identifying a downstream element 4d of the respective failure propagation path 11 having a dependency of its output value on an output value of an upstream element 4c of the failure propagation path 11.
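The check for loop-causing elements may be sketched as a depth-first back-trace that remembers the gates on the current failure propagation path. In the following Python sketch the function name and the wiring of the example tree are assumptions: the tree merely mimics the element count and the two loops described above and is not the tree of Fig. 2.

```python
from typing import Dict, List, Tuple

# name -> (kind, inputs); any name that is not a key is a basic event
Gates = Dict[str, Tuple[str, List[str]]]


def find_loop_interconnections(gates: Gates, output: str) -> List[Tuple[str, str]]:
    """Return (downstream element, upstream element) pairs that close a loop."""
    loops: List[Tuple[str, str]] = []

    def back_trace(name: str, path: Tuple[str, ...]) -> None:
        if name not in gates:
            return                          # basic event: end of the path
        path = path + (name,)
        for child in gates[name][1]:
            if child in path:
                # the input of `name` depends on a gate already passed
                loops.append((name, child))
            else:
                back_trace(child, path)

    back_trace(output, ())
    return loops


# Hypothetical wiring with three OR-gates, two AND-gates and four basic events
tree: Gates = {
    "X1": ("OR", ["X2", "X3"]),
    "X2": ("OR", ["b1", "X6"]),
    "X3": ("OR", ["b2", "X5"]),
    "X6": ("AND", ["b3", "X2"]),   # input fed by upstream X2
    "X5": ("AND", ["b4", "X3"]),   # input fed by upstream X3
}
print(find_loop_interconnections(tree, "X1"))
```

Because every failure propagation path is traced separately, an element shared by several paths may be reported once per path in this sketch.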
Next, the method M removes these two closed loops 7 in the fault tree 1. To this end, the method M comprises under M4 setting the input value corresponding to the loop interconnection ψ_i of each such downstream element 4d to Boolean TRUE. Or, in other words, the problematic element turning up in a corresponding Boolean expression at this point is replaced by the expression ψ_i. Further, the method comprises under M5 identifying any Boolean AND-gate 3a having, independently of the specific values of the input elements 4b, not Boolean TRUE as output. With reference to Fig. 4, it can be seen that two AND-gates 4da can be found that fulfill these criteria and, thus, two Boolean AND-gates 4da are identified. The method M further comprises under M5 cutting off any Boolean TRUE input to the identified Boolean AND-gate 3a remaining between the respective downstream element 4d and the upstream element 4c. As can be seen in Fig. 4, for both X_6 and X_5 one respective input is cut off, namely the loop interconnections ψ_1 and ψ_2 (denoted as cut interconnections 8 in Fig. 4). Finally, the method M comprises under M6 setting the input value of each respective downstream element 4d corresponding to the loop interconnection ψ_i to Boolean FALSE. In this particular example, the loop interconnections ψ_i are cut off anyway, hence this method step has no consequence (cf., however, the examples in Figs. 5 to 8). As can be seen in Fig. 4, the closed loops 7 have been removed, i.e. they have been cut off the failure propagation paths 11. Only well-defined Boolean gates 3 and basic events b_i remain in the fault tree 1. Hence, the fault tree 1 in Fig. 2 can now be evaluated, that is, it can be iteratively expanded into definite Boolean expressions at the elements 4, proceeding from the output element 4a via the interconnections 2 towards the input elements 4b or vice versa. The fault tree 1 thus can be expressed as:
The method according to the invention can be summarized as follows with the pseudo code:
- 1. Identify all loops within the fault tree (using depth-first or breadth-first search).
- 2. Break each loop by removing the connection between the loop causing gate and its predecessor and add a substitute basic event X. Set X = true.
- 3. For all loops:
Check for each of the AND gates within the loop, if the AND gate = true (tautology) or not
- i. If AND gate = true then do nothing
- ii. If AND gate is not true then remove all connections to its children which are true
- iii. Set all substitute basic events to false
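The pseudo code may be turned into a small runnable Python sketch. This is one possible reading under stated assumptions and not a definitive implementation: the tree contains only AND- and OR-gates, so that a gate is TRUE independently of the basic events exactly when it is TRUE with all basic events FALSE; all names (`resolve_loops`, `evaluate`, `psi1`, the example wiring) are hypothetical; and the repeated recursive evaluation is chosen for brevity and does not reflect the linear complexity stated above.

```python
from typing import Dict, List, Tuple

# name -> (kind, inputs); any name that is not a key is a basic event
Gates = Dict[str, Tuple[str, List[str]]]


def resolve_loops(gates: Gates, output: str) -> Tuple[Gates, List[str]]:
    tree: Gates = {n: (k, list(i)) for n, (k, i) in gates.items()}
    substitutes: List[str] = []     # one substitute basic event per loop
    segments: List[List[str]] = []  # gates lying inside each loop

    def back_trace(name: str, path: List[str]) -> None:
        # Steps 1 and 2: find each loop and break it with a substitute event.
        if name not in tree:
            return
        path = path + [name]
        inputs = tree[name][1]
        for index, child in enumerate(inputs):
            if child in path:
                substitutes.append(f"psi{len(substitutes) + 1}")
                segments.append(path[path.index(child):])
                inputs[index] = substitutes[-1]
            else:
                back_trace(child, path)

    def value(name: str) -> bool:
        # Value with substitutes TRUE and all basic events FALSE.
        if name in substitutes:
            return True
        if name not in tree:
            return False
        kind, inputs = tree[name]
        values = [value(child) for child in inputs]
        return all(values) if kind == "AND" else any(values)

    back_trace(output, [])
    # Step 3 i/ii: AND-gates that are not TRUE lose their TRUE children.
    for segment in segments:
        for name in segment:
            kind, inputs = tree[name]
            if kind == "AND" and not value(name):
                tree[name] = (kind, [c for c in inputs if not value(c)])
    return tree, substitutes


def evaluate(tree: Gates, substitutes: List[str], name: str,
             events: Dict[str, bool]) -> bool:
    if name in substitutes:
        return False                # step 3 iii: substitutes are set to FALSE
    if name not in tree:
        return events[name]
    kind, inputs = tree[name]
    values = [evaluate(tree, substitutes, c, events) for c in inputs]
    return all(values) if kind == "AND" else any(values)


# Hypothetical looped tree (not the tree of Fig. 2)
looped: Gates = {
    "X1": ("OR", ["X2", "X3"]),
    "X2": ("OR", ["b1", "X6"]),
    "X3": ("OR", ["b2", "X5"]),
    "X6": ("AND", ["b3", "X2"]),
    "X5": ("AND", ["b4", "X3"]),
}
resolved, subs = resolve_loops(looped, "X1")
print(resolved["X6"], resolved["X5"], subs)
```

For this example both loop interconnections are cut at the AND-gates, so setting the substitute events to FALSE afterwards has no further effect and the resolved tree can be evaluated for any assignment of the basic events.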
The method can be used to determine the lower bound for the fault tree analysis.
Alternatively, the method step M5 can be modified as follows:
- Checking M5 any Boolean AND-gate 3a having, independently of the specific values of the input elements 4b, Boolean TRUE as output value nothing has to be done
- the step of replacing M5 any Boolean AND-gate 3a having, independently of the specific values of the input elements 4b, Boolean TRUE as output value with a Boolean OR-gate 3b between the respective downstream element 4d and the respective upstream element 4c.
This alternative method step results in an upper bound of the fault tree analysis, as explained in detail in document EP 579 074 A1.
The two bounds, i.e. the upper bound and the lower bound, can be used to judge the fault tree analysis result.
This can lead to the following exemplary use cases:
- 1. The upper and lower bounds of the fault tree analysis are equal, according to Figures 6 and 8, denoted as minimal cut set = A v B1 v B2 v C. Cut sets are the unique combinations of component failures that can cause system failure. Specifically, a cut set is said to be a minimal cut set if, when any basic event is removed from the set, the remaining events collectively are no longer a cut set. Hence, in this case, the loop breaking step according to Figures 5 and 7 leads to the same and also exact result. This result can be compared to the Tolerable Hazard Rate (THR) to judge whether the requirement is fulfilled.
- 2. The upper and lower bounds of the fault tree analysis are different. In other words, in this case, the loop breaking according to Figures 9 and 11 leads to distinct results or bounds, denoted with distinct minimal cut sets. The minimal cut sets are shown in Figures 10 and 12, respectively.
In this case, distinct sub cases are possible:
- a. Upper bound is lower than the THR, the requirement is fulfilled
- b. Lower bound is higher than the THR, the requirement is not fulfilled
- c. The THR is between the lower and the upper bound. This means that it is not certain whether the requirement is fulfilled. Hence, the fault tree can be post-processed. For example, the fault tree is reviewed and/or checked, by an expert or automatically, to determine whether loops can be removed by changing or reworking the fault tree. The removal of loops can be performed manually or automatically. Afterwards, the fault tree analysis must be executed again and the lower and upper bounds must be compared to the THR.
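The comparison of the two bounds with the THR can be sketched in Python as follows. The function name `judge`, the returned strings and the numeric values are hypothetical. The sketch assumes that the requirement is violated only when even the lower bound exceeds the THR, and it treats a bound that exactly equals the THR as undecided.

```python
def judge(lower: float, upper: float, thr: float) -> str:
    """Compare lower and upper bound of the analysis result with the THR."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if upper < thr:
        return "fulfilled"        # even the pessimistic bound stays below the THR
    if lower > thr:
        return "not fulfilled"    # even the optimistic bound exceeds the THR
    return "undecided"            # THR lies between the bounds: rework the tree


# Hypothetical rates per hour
print(judge(1e-9, 1e-8, 1e-7))
print(judge(1e-6, 1e-5, 1e-7))
print(judge(1e-8, 1e-6, 1e-7))
```

When both bounds coincide, as in the first use case, the same function applies with `lower == upper`.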