[0001] Operational technology (OT) systems, such as those used in traditional factories,
have typically been highly deterministic in their operation, and their security
has been implemented at least partially by means of isolation. However, modern
industry 4.0 manufacturing systems are characterized by a convergence of information
technology (IT) and OT infrastructure, displaying increasingly complex activity patterns
and network topologies and making extensive use of autonomous systems and components
(e.g., robots or AI-enabled software applications) that constantly interact with one
another. This new paradigm has the potential to substantially improve the flexibility,
reliability, and efficiency of manufacturing but also poses new cybersecurity challenges
and demands a high degree of domain-specific knowledge from analysts assessing potential
integrity issues or indications of security compromises.
[0002] Most commercial software requires establishing a baseline behavior by configuring
the network and letting it run without any security events for a period of time. It then
measures deviations from the baseline behavior. A drawback is that this approach does not
deal well with topology changes and requires baselining activities beforehand.
[0003] Traditional methods for anomaly detection in the cybersecurity domain mostly focus
on separate data streams (e.g., only network events) and often suffer from producing
an unmanageable number of alerts (of which the majority are false positives) that
are hard to interpret for human security analysts.
[0006] It is an object of the invention to provide an alternative to the state of the art.
[0007] The objectives of the invention are solved by the independent claims. Further advantageous
arrangements and embodiments of the invention are set forth in the respective dependent
claims.
[0008] According to the method for anomaly detection in a network, the following operations
are performed by components, wherein the components are software components executed
by one or more processors and/or hardware components:
- representing, by a temporal knowledge graph stored in a graph database, a network
including interactions between network modules with a set of entities, a set of relations,
and a set of timestamps,
- with each entity representing data about the network as a node in the temporal knowledge
graph,
- with at least some entities representing network modules consisting of hardware and/or
software,
- with at least some of the relations representing a type of interaction between network
modules, and
- with at least some edges between nodes representing interactions between entities
that have been observed as events, with each of these edges having a timestamp, encoding
when the interaction occurred, and being an instance of one of the relations,
- sampling, in a first step, temporal random walks from the temporal knowledge graph,
- transforming, in a second step, the temporal random walks into temporal logical rules,
- observing, by a monitoring component, an event in the network or a different network,
and
- classifying, in a third step, the observed event with regard to an anomaly, using
the temporal logical rules.
[0009] The system for anomaly detection in a network comprises the following components:
- a graph database storing a temporal knowledge graph representing a network including
interactions between network modules with a set of entities, a set of relations, and
a set of timestamps,
- with each entity representing data about the network as a node in the temporal knowledge
graph,
- with at least some entities representing network modules consisting of hardware and/or
software,
- with at least some of the relations representing a type of interaction between network
modules, and
- with at least some edges between nodes representing interactions between entities
that have been observed as events, with each of these edges having a timestamp, encoding
when the interaction occurred, and being an instance of one of the relations,
- a temporal random walks component configured for sampling temporal random walks from
the temporal knowledge graph,
- a rule learning component configured for transforming the temporal random walks into
temporal logical rules,
- a monitoring component, configured for observing an event in the network or a different
network, and
- a rule matching component, configured for classifying the observed event with regard
to an anomaly, using the temporal logical rules.
[0010] The following advantages and explanations are not necessarily the result of the object
of the independent claims. Rather, they may be advantages and explanations that only
apply to certain embodiments or variants.
[0011] The graph database can be stored in RAM, on a hard disk or on a solid-state drive,
for example. The graph database can be implemented by simply storing a set of datasets,
or by any other kind of database, for example a relational database or a dedicated
graph database.
[0012] In connection with the invention, unless otherwise stated in the description, the
terms "training", "generating", "computer-aided", "calculating", "determining", "reasoning",
"retraining" and the like relate preferably to actions and/or processes and/or processing
steps that change and/or generate data and/or convert the data into other data, the
data in particular being or being able to be represented as physical quantities, for
example as electrical impulses.
[0013] The term "computer" should be interpreted as broadly as possible, in particular to
cover all electronic devices with data processing properties. Computers can thus,
for example, be personal computers, servers, clients, programmable logic controllers
(PLCs), handheld computer systems, pocket PC devices, mobile radio devices, smartphones,
devices or any other communication devices that can process data with computer support,
processors and other electronic devices for data processing. Computers can in particular
comprise one or more processors and memory units.
[0014] In connection with the invention, a "memory", "memory unit" or "memory module" and
the like can mean, for example, a volatile memory in the form of random-access memory
(RAM) or a permanent memory such as a hard disk or a Disk.
[0015] The temporal knowledge graph is used as a stream-based data structure to extract
rules that identify typical temporal behavior of the network and can be used to identify
anomalies in a human-interpretable way.
[0016] The method and system, or at least some of their embodiments, frame the anomaly detection
task as a quadruple classification problem, using the temporal logical rules and their
respective groundings in the temporal knowledge graph to support the classification.
[0017] The method and system, or at least some of their embodiments, use a scalable rule-based
quadruple classification framework for interpretable prediction.
[0018] Existing approaches that rephrase anomaly detection as a link prediction task in
knowledge graphs only focus on static knowledge graph methods and are not suitable
to capture the temporal dependencies that arise from complex activity streams in modern
manufacturing systems.
[0019] On the other hand, graph-based algorithms that rely on embeddings do not provide
human-interpretable explanations of the output.
[0020] With the temporal logical rules, the method and system, or at least some of their embodiments,
provide features that are comprehensible and interpretable for human analysts and that
can help analysts identify the problems. Moreover, the learned temporal logical rules
can be transferred to a new network setup if the underlying data model is consistent,
so that it is not necessary to learn the rules from scratch.
[0021] In contrast to previous methods, the method and system, or at least some of their embodiments,
extract human-interpretable heuristic rules that, depending on the training procedure
and application, describe the typical behavior of network modules and can be used
to detect and analyze network anomalies (or uncover temporal patterns of anomalies,
which can have multiple stages over an extended period of time, such as prolonged
scanning of the network, followed by migration of the applications
and stealing of credentials). The temporal logical rules and corresponding paths (rule
groundings) in the temporal knowledge graph serve as human-interpretable explanations
so that the domain expert can understand the predictions and act accordingly.
[0022] Moreover, in contrast to embedding-based approaches for knowledge graphs that could
be applied in the context of cybersecurity, at least some of the embodiments can operate
in an inductive setting. That means the temporal logical rules that are extracted
from one knowledge graph can be employed on another knowledge graph (i.e., one with other
entities but the same relations, or interaction types) without having to retrain
the learning model. In particular, new networks need no baselining of activities,
unlike existing software. Compared with approaches that integrate rules defined manually
by domain experts, at least some of the embodiments do not need explicit background
information and scale better to large datasets, resulting in reduced time and
effort to construct the rules.
[0023] In an embodiment of the method and system, the first step is performed by a temporal
random walks component, which samples temporal random walks for each relation in the
temporal knowledge graph, wherein each temporal random walk is a sequence of edges
with decreasing timestamps starting with an edge for the respective relation.
[0024] In another embodiment of the method and system, the second step is performed by a
rule learning component, wherein each temporal logical rule consists of a rule body
and a rule head, with each rule head specifying one of the relations, each temporal
random walk is stored as a rule grounding of the respective temporal logical rule,
and a confidence value is computed for each temporal logical rule, indicating a probability
for the correctness of the respective rule.
[0025] In another embodiment of the method and system, the sequence of edges of each temporal
random walk is transformed into a temporal logical rule by replacing the entities
and the timestamps in the sequence of edges by variables, forming the rule head based
on the first edge in the sequence of edges, and forming the rule body by reordering
the remaining sequence of edges. The confidence values for the temporal logical rules
are computed by sampling rule bodies and counting for how many rule bodies a rule
head exists with a future timestamp.
[0026] In another embodiment of the method and system, the observed event is represented
with a query containing a query subject, a query relation, a query object, and a query
timestamp. The third step is performed by a rule matching component, taking as input
the query, the temporal knowledge graph and the temporal logical rules with the confidence
values, and computing a classification score based on matching rules among the temporal
logical rules, wherein the matching rules predict the observed event, and wherein
the classification score indicates a probability to observe the observed event.
[0027] In another embodiment of the method and system, the third step consists of retrieving
temporal random walks in the temporal knowledge graph between the query subject and
query object, determining the matching rules by picking temporal logical rules if
the query relation matches the relation specified in the rule head and one of the
retrieved walks matches the rule body, and calculating the classification score based
on the confidence values of the matching rules.
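The matching procedure of this embodiment can be illustrated with a minimal Python sketch. It is not the claimed implementation: the representation of a rule as a (head relation, body relations, confidence) triple, the relation names, the integer timestamps, and the function names `find_groundings` and `match_rules` are assumptions made for illustration. As a simplification, the sketch searches for walks guided by each rule body instead of first retrieving all walks and matching them afterwards, and it assumes that inverse edges are already contained in the edge list.

```python
def find_groundings(edges, body_relations, subject, obj, query_time):
    """Find walks subject -> obj that follow body_relations in order,
    with strictly increasing timestamps that all lie before query_time."""
    results = []

    def extend(entity, depth, last_time, path):
        if depth == len(body_relations):
            if entity == obj:
                results.append(list(path))
            return
        for edge in edges:
            s, r, o, t = edge
            if s == entity and r == body_relations[depth] and last_time < t < query_time:
                path.append(edge)
                extend(o, depth + 1, t, path)
                path.pop()

    extend(subject, 0, float("-inf"), [])
    return results


def match_rules(edges, rules, query):
    """Return (confidence, body_relations, groundings) for every rule whose
    head relation equals the query relation and whose body has a grounding."""
    subject, relation, obj, query_time = query
    matches = []
    for head_relation, body_relations, confidence in rules:
        if head_relation != relation:
            continue
        groundings = find_groundings(edges, body_relations, subject, obj, query_time)
        if groundings:
            matches.append((confidence, body_relations, groundings))
    return matches


# Hypothetical example data (timestamps are plain integers).
edges = [
    ("App A", "inv_runs", "Server C", 10),
    ("Server C", "inv_hosted_on", "Variable B", 20),
    ("App A", "reads", "Sensor D", 15),
]
rules = [
    ("accesses", ["inv_runs", "inv_hosted_on"], 0.8),
    ("accesses", ["reads"], 0.4),
    ("reads", ["inv_runs"], 0.9),
]
matches = match_rules(edges, rules, ("App A", "accesses", "Variable B", 30))
```

In this example only the first rule matches: the second rule has no walk that ends at the query object, and the third rule has a different head relation.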
[0028] In another embodiment of the method and system, the observed event is classified
with regard to a set of classes based on the classification score, with at least one
of the classes indicating an anomaly.
[0029] An embodiment of the method and system comprises an additional fourth step of outputting,
by an explainable prediction component and a user interface, the result of the classification
and the rule groundings for the matching rules to a user.
[0030] This embodiment provides the rule groundings as a human-understandable explanation
for the classification result to the user.
[0031] In another embodiment of the method and system, the result of the classification
and the rule groundings are transformed into natural language output.
[0032] An embodiment of the method and system comprises the additional step of automatically
disconnecting at least one network module from the network or limiting its connections
on certain ports, based on the classification result.
[0033] In another embodiment of the method and system, the network is an industrial automation
system, in particular a manufacturing system.
[0034] In another embodiment of the system, the system is implemented as a monitoring device,
in particular an edge device, wherein the monitoring device is part of the network,
and the network is an industrial automation system, in particular a manufacturing
system.
[0035] The computer program product has program instructions for carrying out the method.
[0036] The provision device stores and/or provides the computer program product.
[0037] The foregoing and other aspects of the present invention are best understood from
the following detailed description when read in connection with the accompanying drawings.
For the purpose of illustrating the invention, there are shown in the drawings embodiments
that are presently preferred, it being understood, however, that the invention is
not limited to the specific instrumentalities disclosed. The embodiments may be combined
with each other. Furthermore, the embodiments may be combined with any of the features
described above. Unless stated otherwise, identical reference signs denote the same
features or functionally identical elements between drawings. Included in the drawings
are the following Figures:
- Fig. 1 shows a first embodiment,
- Fig. 2 shows another embodiment,
- Fig. 3 shows an example of a cyber-physical system and its representation as a knowledge
graph with dynamic edges,
- Fig. 4 shows an anomaly detection framework,
- Fig. 5 shows a flowchart of a possible exemplary embodiment of a method for monitoring a
network.
[0038] In the following description, various aspects of the present invention and embodiments
thereof will be described. However, it will be understood by those skilled in the
art that embodiments may be practiced with only some or all aspects thereof. For purposes
of explanation, specific numbers and configurations are set forth in order to provide
a thorough understanding. However, it will also be apparent to those skilled in the
art that the embodiments may be practiced without these specific details.
[0039] The described components can each be hardware components or software components.
For example, a software component can be a software module such as a software library;
an individual procedure, subroutine, or function; or, depending on the programming
paradigm, any other portion of software code that implements the function of the software
component. A combination of hardware components and software components can occur,
in particular, if some of the effects according to the invention are preferably exclusively
implemented by special hardware (e.g., a processor in the form of an ASIC or FPGA)
and some other part by software.
[0040] Fig. 1 shows one sample structure for computer-implementation of the invention which
comprises:
- (101) computer system
- (102) processor
- (103) memory
- (104) computer program (product)
- (105) user interface
[0041] In this embodiment of the invention the computer program product 104 comprises program
instructions for carrying out the invention. The computer program 104 is stored in
the memory 103 which renders, among others, the memory and/or its related computer
system 101 a provisioning device for the computer program product 104. The system
101 may carry out the invention by executing the program instructions of the computer
program 104 by the processor 102. Results of the invention may be presented on the user
interface 105. Alternatively, they may be stored in the memory 103 or on another suitable
means for storing data.
[0042] Fig. 2 shows another sample structure for computer-implementation of the invention
which comprises:
- (201) provisioning device
- (202) computer program (product)
- (203) computer network / Internet
- (204) computer system
- (205) mobile device / smartphone
[0043] In this embodiment the provisioning device 201 stores a computer program 202 which
comprises program instructions for carrying out the invention. The provisioning device
201 provides the computer program 202 via a computer network / Internet 203. By way
of example, a computer system 204 or a mobile device / smartphone 205 may load the
computer program 202 and carry out the invention by executing the program instructions
of the computer program 202.
[0044] Preferably, the embodiments shown in Figs. 4 and 5 can be implemented with a structure
as shown in Fig. 1 or Fig. 2.
[0045] Fig. 3 shows an example of a network, here a cyber-physical system, and its representation
as a temporal knowledge graph G with dynamic edges. The cyber-physical system is,
for example, a modern industry 4.0 manufacturing system, or any other kind of industrial
automation system. In the following, the cyber-physical system will be simply referred
to as "network". However, the term "network" shall also include any other kind of
computer network, as the embodiments can be applied to any kind of (computer) network,
including industrial automation systems, intranets, etc.
[0046] The embodiments shown in Figs. 4 and 5 consist of a rule-based method that monitors
a network, for example a modern manufacturing system, and detects potentially malicious
activities so that automated actions can be taken based on the outputs. For these embodiments,
the network is first represented as a dynamic (temporal) knowledge graph as depicted
in Fig. 3, i.e., modules of the network (like servers, applications, hardware modules,
system settings, and properties like IP addresses) are represented as nodes, while
interactions between modules (e.g., network events) are represented as edges between
nodes that are flagged with a timestamp - encoding when the interaction occurred.
This representation allows a seamless and context-conserving integration of data from
different domains like engineering data and network data. The stream-based temporal
knowledge graph G is then used to extract rules that identify typical temporal behavior
of the network and can be used to identify anomalies in a human-interpretable way.
[0047] A hallmark of the embodiment is that extracted temporal logical rules capture behavioral
patterns of the network. This is important because whether an action (or setting)
of a network module is suspicious or expected often depends on the prior behavior
of the module (and other modules), i.e., the prior and current state of the network
that precedes this action (or setting).
[0048] For instance, a software application hosted on an edge computer (i.e., situated in
a factory and used to, e.g., automate or optimize production) might exhibit different
patterns of reading data (variables) from sensors in a manufacturing system, where
- depending on the calculation the application is currently doing - different variables
are read from the manufacturing system in a certain order. An application might have
several of these patterns (with variations thereof), where the validity of data accesses
depends on what the application has accessed in the network beforehand.
[0049] Another example is network connections, where certain connectivity patterns might
occur in the network (e.g., a module with IP address IP1 always accesses IP3 after accessing
IP2) or certain co-occurrences/patterns of network connections (e.g., high data volumes
are only observed over a specific port, or a download from one IP is always followed
by an upload to another IP).
[0050] The embodiment makes it possible to automatically capture such dynamic behavior of the network
and to trigger automated actions based on the output. For example, if the algorithm finds
that a network device is acting maliciously, then it can be taken offline or limited
in its connections on certain ports to mitigate damage until a human can look further
into it.
[0051] The following embodiments take a temporal knowledge graph G, for example like the
one shown in Fig. 3, as input and learn temporal logical rules by extracting temporal
random walks from the graph for anomaly detection. The anomaly detection task is framed
as a quadruple classification problem and temporal logical rules and respective groundings
in the temporal knowledge graph are used to support the classification.
[0052] Fig. 4 shows an embodiment of the method and system in the form of an anomaly detection
framework.
[0053] The data about the network modules and their interactions are multi-relational and
represented by a temporal knowledge graph G ⊆ V × R × V × T, where V denotes the set of
entities (e.g., servers, applications), R the set of relations (e.g., reads, accesses), and
T the set of timestamps (e.g., 2019-12-10;11:10). A link or edge is represented by
a quadruple (subject, relation, object, timestamp) ∈ G, where the granularity of the
timestamps could be hours, minutes, or any other reasonable
unit. A quadruple stands for an observed event in the network.
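As an illustration of this data model, the following minimal Python sketch stores a temporal knowledge graph as a set of quadruples. The class names `Quadruple` and `TemporalKnowledgeGraph`, the integer timestamps, and the example entities are assumptions made for this sketch; an embodiment may use any graph database instead.

```python
from collections import defaultdict
from typing import NamedTuple


class Quadruple(NamedTuple):
    """One observed event: (subject, relation, object, timestamp)."""
    subject: str
    relation: str
    obj: str
    timestamp: int  # assumed unit, e.g. minutes since a reference time


class TemporalKnowledgeGraph:
    """Minimal in-memory store for a set of quadruples."""

    def __init__(self):
        self.edges = set()
        self.outgoing = defaultdict(list)  # subject -> edges starting there

    def add(self, subject, relation, obj, timestamp):
        edge = Quadruple(subject, relation, obj, timestamp)
        if edge not in self.edges:  # the graph is a set, so duplicates are ignored
            self.edges.add(edge)
            self.outgoing[subject].append(edge)
        return edge

    def entities(self):
        return {e.subject for e in self.edges} | {e.obj for e in self.edges}

    def relations(self):
        return {e.relation for e in self.edges}


tkg = TemporalKnowledgeGraph()
tkg.add("App A", "reads", "Variable B", 10)
tkg.add("App A", "accesses", "Server C", 12)
tkg.add("App A", "reads", "Variable B", 10)  # same event again, stored once
```

The per-subject index `outgoing` is a design choice of the sketch: it lets a random walker look up the edges that leave a given entity without scanning the whole edge set.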
[0054] The anomaly detection task can be formulated as a quadruple classification problem
on the temporal knowledge graph G, where the context (i.e., the graph neighborhood)
and history of entities serve as features. Concretely, the task is to classify an
edge (corresponding to an event) in the temporal knowledge graph G as one of several classes,
such as "expected", "unexpected", "suspicious", and "highly suspicious".
[0055] The embodiments use a scalable rule-based quadruple classification framework for
interpretable prediction that first extracts temporal random walks from the temporal
knowledge graph G and then transforms these walks into temporal logical rules TLR that generalize to
new data points. For a query Q in the form of a query quadruple (subject, relation, object, timestamp),
the learned temporal logical rules TLR are matched with the neighborhood of the subject
and object, and a classification score CS is calculated depending on the rule groundings
in the temporal knowledge graph G. The rule groundings also serve as explicit, human-interpretable explanations for
the prediction. More precisely, the framework consists of the following four components
as shown in Fig. 4:
- a temporal random walks component TRW performing a first step 1,
- a rule learning component RL performing a second step 2,
- a rule matching component RM performing a third step 3, and
- an explainable prediction component EP performing a fourth step 4.
Temporal random walks component TRW
[0056] This component takes as input the temporal knowledge graph G, containing information
about the network modules and interactions (observed events) in the network.
[0057] The outputs of this component are temporal random walks.
[0058] Given information about the network and observed events in the form of the temporal knowledge
graph G, temporal random walks are extracted by the temporal random walks component TRW. Starting
with an edge $(x_1, h, x_{n+1}, t_{n+1})$ for a fixed relation $h \in R$, a random walker
samples adjacent edges iteratively until a walk of length $n+1$ is obtained. The sampling
distribution could either be uniform or exponentially distributed
with respect to the timestamps (e.g., to favor events that are close to the previous
event). Thereby, the embodiment imposes the constraint that the timestamps in the
sequence of sampled edges should be decreasing, i.e., the walker goes backward in time.
In the last step $n+1$, an edge is sampled that connects to the first entity $x_1$ of the
walk if such an edge exists. For all relations in the graph, a fixed number
of temporal random walks is sampled.
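[0059] The resulting temporal random walk is a sequence of edges

$$\big((x_1, h, x_{n+1}, t_{n+1}),\ (x_{n+1}, b_n, x_n, t_n),\ \ldots,\ (x_2, b_1, x_1, t_1)\big) \qquad (1)$$

where $t_{n+1} > t_n > \cdots > t_1$.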
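The sampling of a single temporal random walk can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: edges are plain (subject, relation, object, timestamp) tuples with integer timestamps, sampling is uniform only (the exponential weighting is omitted), a walk that cannot be closed at the first entity is discarded, and the function name `sample_temporal_walk` and the example relations are hypothetical.

```python
import random


def sample_temporal_walk(edges, head_relation, n, rng):
    """Sample one walk of n + 1 edges with strictly decreasing timestamps.

    The first edge has the relation head_relation; the last edge must end at
    the subject of the first edge. Returns None if the walker gets stuck.
    """
    starts = [e for e in edges if e[1] == head_relation]
    if not starts:
        return None
    first = rng.choice(starts)
    walk = [first]
    x1, current, t_prev = first[0], first[2], first[3]
    for step in range(1, n + 1):
        # Adjacent edges that lie strictly earlier in time (walking backward).
        candidates = [e for e in edges if e[0] == current and e[3] < t_prev]
        if step == n:
            # Final edge: keep only edges that connect back to the first entity.
            candidates = [e for e in candidates if e[2] == x1]
        if not candidates:
            return None
        nxt = rng.choice(candidates)  # uniform sampling
        walk.append(nxt)
        current, t_prev = nxt[2], nxt[3]
    return walk


# Hypothetical example graph with exactly one possible walk for "accesses".
edges = [
    ("App A", "accesses", "Variable B", 30),
    ("Variable B", "hosted_on", "Server C", 20),
    ("Server C", "runs", "App A", 10),
]
walk = sample_temporal_walk(edges, "accesses", 2, random.Random(0))
```

In practice the function would be called a fixed number of times per relation, and `None` results would simply be skipped.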
Rule learning component RL
[0060] This component takes as input the temporal random walks that have been computed by
the temporal random walks component TRW.
[0061] The outputs of this component are temporal logical rules TLR with confidence values.
[0062] Given a temporal random walk of the form

$$\big((x_1, h, x_{n+1}, t_{n+1}),\ (x_{n+1}, b_n, x_n, t_n),\ \ldots,\ (x_2, b_1, x_1, t_1)\big), \qquad (1)$$

a temporal logical rule TLR

$$(X_1, h, X_{n+1}, T_{n+1}) \leftarrow \bigwedge_{i=1}^{n} (X_i, b_i^{-1}, X_{i+1}, T_i) \qquad (2)$$

is created by reordering the edges and replacing the entities and timestamps by variables.
$(X_1, h, X_{n+1}, T_{n+1})$ is referred to as the rule head and
$\bigwedge_{i=1}^{n} (X_i, b_i^{-1}, X_{i+1}, T_i)$ as the rule body. The rule body consists
of a conjunction of body atoms. Here, $b_i^{-1}$ represents the inverse of relation $b_i$
(the inverse of the event $(x_1, b_i, x_2, t)$ is $(x_2, b_i^{-1}, x_1, t)$). The temporal
random walk (1) is called a grounding of the temporal logical rule (2). A temporal logical
rule implies that if the rule body holds with $T_1 < \cdots < T_n$, then the rule head holds
as well for a future timestamp $T_{n+1} > T_n$. Since there are rarely rules that are always
true, the rule learning component RL estimates for each temporal logical rule TLR a
confidence value, i.e., a probability for the rule's correctness. The rule learning
component RL uses the standard confidence
$\mathrm{conf} = \frac{\text{rule support}}{\text{body support}}$, where it adapts the body
and rule support to account for different timestamp value combinations. The confidence
value is estimated by sampling rule bodies and counting
for how many rule bodies a rule head exists with a future timestamp.
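The transformation of a walk into a rule can be sketched in Python as follows. The sketch rests on assumptions made for illustration: variables are assigned purely by position, inverse relations are written with a hypothetical `inv_` prefix, and the confidence is shown only as the plain ratio of two given counts, without the sampling procedure that produces them. The function names are hypothetical.

```python
def walk_to_rule(walk):
    """Turn a walk [(x1, h, x_{n+1}, t_{n+1}), (x_{n+1}, b_n, x_n, t_n), ...,
    (x2, b_1, x1, t_1)] into a (head, body) pair over variables."""
    head_edge, body_edges = walk[0], walk[1:]
    n = len(body_edges)
    head = ("X1", head_edge[1], f"X{n + 1}", f"T{n + 1}")
    body = []
    # Reverse the walk so the body runs forward in time, and invert each edge
    # so that it points from X_i to X_{i+1}.
    for i, (_, relation, _, _) in enumerate(reversed(body_edges), start=1):
        body.append((f"X{i}", f"inv_{relation}", f"X{i + 1}", f"T{i}"))
    return head, body


def format_rule(head, body):
    """Render a rule as 'head <- atom AND atom ...'."""
    def atom(a):
        return "(" + ", ".join(a) + ")"
    return atom(head) + " <- " + " AND ".join(atom(a) for a in body)


def confidence(rule_support, body_support):
    """Standard confidence: fraction of body groundings that also satisfy the head."""
    return rule_support / body_support if body_support else 0.0


# Hypothetical walk with strictly decreasing timestamps 30 > 20 > 10.
walk = [
    ("App A", "accesses", "Variable B", 30),
    ("Variable B", "hosted_on", "Server C", 20),
    ("Server C", "runs", "App A", 10),
]
head, body = walk_to_rule(walk)
rule_text = format_rule(head, body)
```

For the example walk, `rule_text` reads `(X1, accesses, X3, T3) <- (X1, inv_runs, X2, T1) AND (X2, inv_hosted_on, X3, T2)`, and the walk itself is a grounding of that rule.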
Rule matching component RM
[0063] This component takes as input the temporal logical rules TLR with the confidence
values that have been computed by the rule learning component RL, as well as the temporal
knowledge graph G, and a query Q.
[0064] The output of this component is a classification score CS as well as rule groundings
in the temporal knowledge graph G.
[0065] Given the query Q of the form (subject, relation, object, timestamp) representing a
query event, e.g., (App A, accesses, Variable B, 2022-02-05;12:18), the rule matching
component RM retrieves all walks in the temporal knowledge graph
G up to a specified length between the query subject and query object (or a subset
of walks if retrieving all walks is not feasible). The rule matching component RM
matches these walks to the rule bodies of the learned rules in which the query relation
appears in the rule head. After matching, the rule matching component RM obtains a
list of matching rules (with their confidence values) that could be applied to predict
the event given by the query Q. To calculate the classification score CS, which could
be interpreted as the probability of observing the query event, the rule matching component
RM combines the confidence values from the resulting matching rule list via a noisy-or
aggregation. Instead of directly taking the confidence values, it is also possible
to apply a more complex scoring function, e.g., a function that takes the explicit
timestamp values into account. One possibility for a more complex scoring function
is to combine the confidence values with an exponential function depending on the
time difference $t_{n+1}(rg) - t_1(rg)$ of the rule grounding $rg$ for a rule $r$.
[0066] The main idea is that events that are supported by matching rules with high confidence
values are more likely to happen and events that are only supported by few matching
rules or matching rules with low confidence values are more likely to be an anomaly.
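A noisy-or aggregation of confidence values $c_1, \ldots, c_k$ is commonly computed as $1 - \prod_{i=1}^{k} (1 - c_i)$. The following minimal Python sketch shows this basic variant only; the function name `noisy_or` is hypothetical, and returning a score of 0 when no rule matches is an assumption of the sketch.

```python
import math


def noisy_or(confidences):
    """Combine rule confidences: 1 - product of (1 - c) over all matching rules.

    An empty list yields 0.0, i.e. an event without any supporting rule
    receives the lowest possible score.
    """
    return 1.0 - math.prod(1.0 - c for c in confidences)


score_two_rules = noisy_or([0.5, 0.5])
score_no_rules = noisy_or([])
```

The aggregation reflects the idea stated above: each additional matching rule can only raise the score, and a single rule with high confidence already yields a high score.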
[0067] Depending on the classification score CS and either predefined or learned thresholds,
the query event is classified as one of the classes, e.g., a score of ≥ 0.9 indicates
that the event is expected and a score of ≤ 0.1 implies that the event is highly
suspicious. In this case, either (predefined) automated
actions can be taken, or actions could be derived from the algorithm's output for
further human investigation.
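A threshold-based mapping from the classification score to a class can be sketched as follows. Only the thresholds 0.9 and 0.1 are taken from the example above; the intermediate threshold of 0.5 that separates "unexpected" from "suspicious", as well as the function and parameter names, are assumptions made for this sketch.

```python
def classify(score, expected_min=0.9, unexpected_min=0.5, highly_suspicious_max=0.1):
    """Map a classification score in [0, 1] to one of four classes."""
    if score >= expected_min:
        return "expected"
    if score <= highly_suspicious_max:
        return "highly suspicious"
    if score >= unexpected_min:  # assumed intermediate threshold
        return "unexpected"
    return "suspicious"


labels = [classify(s) for s in (0.95, 0.6, 0.3, 0.05)]
```

Since the thresholds are keyword parameters, learned thresholds can be passed in place of the predefined defaults.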
Explainable prediction component EP
[0068] This component takes as input the classification score CS computed by the rule matching
component RM as well as the rule groundings for the matching rules in the temporal
knowledge graph G.
[0069] The output of this component is a human-understandable explanation for the predictions.
[0070] The classification score CS, together with its human-understandable explanation, namely
the matching rules along with their rule groundings (corresponding paths) in the temporal
knowledge graph G, is displayed to the human operator. The matching
rules and the rule groundings should be presented in a human-readable fashion (verbalization
in natural language) so that the human can understand the reason for the predictions.
The explanations could support the human operator in deciding on the next steps,
and the matching rules provide hints as to what could be the cause of the anomaly.
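One simple way to verbalize a result is a text template, as in the following Python sketch. The wording of the template, the function name `verbalize`, and the example relations and entities are assumptions made for illustration; the specification does not prescribe a particular verbalization.

```python
def verbalize(query, score, label, matches):
    """Render a classification result and its rule groundings as plain text.

    matches: list of (confidence, grounding) pairs, where a grounding is a
    list of (subject, relation, object, timestamp) edges.
    """
    subject, relation, obj, timestamp = query
    lines = [
        f"Event: {subject} {relation} {obj} at {timestamp}.",
        f"Classification: {label} (score {score:.2f}).",
    ]
    if not matches:
        lines.append("No learned rule supports this event.")
    for confidence, grounding in matches:
        steps = "; then ".join(f"{s} {r} {o} at {t}" for s, r, o, t in grounding)
        lines.append(
            f"Supported by a rule with confidence {confidence:.2f}, because {steps}."
        )
    return "\n".join(lines)


query = ("App A", "accesses", "Variable B", "2022-02-05;12:18")
report = verbalize(
    query,
    0.8,
    "unexpected",
    [(0.8, [("App A", "reads", "Variable C", "2022-02-05;12:10"),
            ("Variable C", "stored_with", "Variable B", "2022-02-05;12:15")])],
)
```

Listing each grounding step by step keeps the path through the temporal knowledge graph visible to the operator, which is the part of the output that explains why the event was or was not expected.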
[0071] Fig. 5 shows a flowchart illustrating an embodiment of the method.
[0072] In a first step 1, the temporal random walks component TRW takes as input the temporal
knowledge graph G, containing information about the network modules and interactions
(observed events) in the network, and computes as output temporal random walks.
[0073] In a second step 2, the rule learning component RL takes as input the temporal random
walks that have been computed by the temporal random walks component TRW and computes
as output the temporal logical rules TLR with confidence values.
[0074] In a third step 3, the rule matching component RM takes as input the temporal logical
rules TLR with the confidence values that have been computed by the rule learning
component RL, as well as the temporal knowledge graph G, and the query Q.
[0075] The rule matching component RM computes as output the classification score CS as
well as rule groundings for matching rules in the temporal knowledge graph G.
[0076] In a fourth step 4, the explainable prediction component EP takes as input the classification
score CS computed by the rule matching component RM as well as the rule groundings
for the matching rules in the temporal knowledge graph G.
[0077] The output of the explainable prediction component EP is a human-understandable explanation
for the predictions.
[0078] During training, the temporal logical rules TLR are automatically learned with the
first step 1 and the second step 2 from observed events in the network. Since new
events happen continuously, it makes sense to update the rules and the confidence
values of the rules on a regular basis to incorporate the new information. The learned
temporal logical rules TLR are applied in the third step 3 in real-time for monitoring
the network and detecting anomalies in a timely manner so that actions can be taken.
The explainable predictions generated in the fourth step 4 are presented via a graphical
user interface to the expert who can decide which operations to perform depending
on the classification score CS and the explanations.
[0079] The temporal logical rules TLR learned by this embodiment could be used for the reconfiguration
of an existing network or also be transferred to related networks, e.g., new network
setups where not enough historical information exists. The temporal logical rules
TLR can be transferred and applied directly if both networks fulfill certain conditions,
e.g., share a common vocabulary.
[0080] For example, the method can be executed by one or more processors. Examples of processors
include a microcontroller or a microprocessor, an Application Specific Integrated
Circuit (ASIC), or a neuromorphic microchip, in particular a neuromorphic processor
unit. The processor can be part of any kind of computer, including mobile computing
devices such as tablet computers, smartphones or laptops, or part of a server in a
control room or cloud.
[0081] The above-described method may be implemented via a computer program product including
one or more computer-readable storage media having stored thereon instructions executable
by one or more processors of a computing system. Execution of the instructions causes
the computing system to perform operations corresponding with the acts of the method
described above.
[0082] The instructions for implementing processes or methods described herein may be provided
on non-transitory computer-readable storage media or memories, such as a cache, buffer,
RAM, FLASH, removable media, hard drive, or other computer readable storage media.
Computer readable storage media include various types of volatile and non-volatile
storage media. The functions, acts, or tasks illustrated in the figures or described
herein may be executed in response to one or more sets of instructions stored in or
on computer readable storage media. The functions, acts or tasks may be independent
of the particular type of instruction set, storage media, processor or processing
strategy and may be performed by software, hardware, integrated circuits, firmware,
micro code and the like, operating alone or in combination. Likewise, processing strategies
may include multiprocessing, multitasking, parallel processing, and the like.
[0083] The invention has been described in detail with reference to embodiments thereof
and examples. Variations and modifications may, however, be effected within the spirit
and scope of the invention covered by the claims. The phrase "at least one of A, B
and C", as an alternative expression, means that one or more of A, B and C may
be used.
1. A computer-implemented method for anomaly detection in a network, comprising the following
operations, wherein the operations are performed by components, and wherein the components
are software components executed by one or more processors and/or hardware components:
- representing, by a temporal knowledge graph (G) stored in a graph database, a network
including interactions between network modules with a set of entities, a set of relations,
and a set of timestamps,
- with each entity representing data about the network as a node in the temporal knowledge
graph (G),
- with at least some entities representing network modules consisting of hardware
and/or software,
- with at least some of the relations representing a type of interaction between network
modules, and
- with at least some edges between nodes representing interactions between entities
that have been observed as events, with each of these edges having a timestamp, encoding
when the interaction occurred, and being an instance of one of the relations,
- sampling, in a first step (1), temporal random walks from the temporal knowledge
graph (G),
- transforming, in a second step (2), the temporal random walks into temporal logical
rules (TLR),
- observing, by a monitoring component, an event in the network or a different network,
and
- classifying, in a third step (3), the observed event with regard to an anomaly,
using the temporal logical rules (TLR).
2. The method of claim 1,
- wherein the first step (1) is performed by a temporal random walks component (TRW),
which samples temporal random walks for each relation in the temporal knowledge graph
(G), wherein each temporal random walk is a sequence of edges with decreasing timestamps
starting with an edge for the respective relation.
3. The method of claim 2,
- wherein the second step (2) is performed by a rule learning component (RL), wherein
- each temporal logical rule (TLR) consists of a rule body and a rule head, with each
rule head specifying one of the relations,
- each temporal random walk is stored as a rule grounding of the respective temporal
logical rule (TLR), and
- a confidence value is computed for each temporal logical rule (TLR), indicating
a probability for the correctness of the respective rule.
4. The method of claim 3,
- wherein the sequence of edges of each temporal random walk is transformed into a
temporal logical rule (TLR) by
- replacing the entities and the timestamps in the sequence of edges by variables,
- forming the rule head based on the first edge in the sequence of edges, and
- forming the rule body by reordering the remaining sequence of edges, and
- wherein the confidence values for the temporal logical rules (TLR) are computed
by sampling rule bodies and counting for how many rule bodies a rule head exists with
a future timestamp.
5. The method of claim 3 or 4,
- wherein the observed event is represented with a query (Q) containing a query subject,
a query relation, a query object, and a query timestamp, and
- wherein the third step (3) is performed by a rule matching component (RM),
- taking as input the query (Q), the temporal knowledge graph (G) and the temporal
logical rules (TLR) with the confidence values, and
- computing a classification score (CS) based on matching rules among the temporal
logical rules (TLR), wherein the matching rules predict the observed event, and wherein
the classification score (CS) indicates a probability to observe the observed event.
6. The method of claim 5,
- wherein the third step (3) consists of
- retrieving temporal random walks in the temporal knowledge graph (G) between the
query subject and query object,
- determining the matching rules by picking temporal logical rules (TLR) if the query
relation matches the relation specified in the rule head and one of the retrieved
walks matches the rule body, and
- calculating the classification score (CS) based on the confidence values of the
matching rules.
7. The method of claim 5 or 6,
- wherein the observed event is classified with regard to a set of classes based on
the classification score (CS), with at least one of the classes indicating an anomaly.
8. The method of claim 7,
- with an additional fourth step (4) of outputting, by an explainable prediction component
(EP) and a user interface, the result of the classification and the rule groundings
for the matching rules to a user.
9. The method of claim 8,
- wherein the result of the classification and the rule groundings are transformed
into natural language output.
10. The method according to any of the preceding claims,
- with the additional step of automatically disconnecting at least one network module
from the network or limiting its connections on certain ports, based on the classification
result.
11. The method according to any of the preceding claims,
- wherein the network is an industrial automation system, in particular a manufacturing
system.
12. A system for anomaly detection in a network, comprising:
- a graph database storing a temporal knowledge graph (G) representing a network including
interactions between network modules with a set of entities, a set of relations, and
a set of timestamps,
- with each entity representing data about the network as a node in the temporal knowledge
graph (G),
- with at least some entities representing network modules consisting of hardware
and/or software,
- with at least some of the relations representing a type of interaction between network
modules, and
- with at least some edges between nodes representing interactions between entities
that have been observed as events, with each of these edges having a timestamp, encoding
when the interaction occurred, and being an instance of one of the relations,
- a temporal random walks component (TRW) configured for sampling temporal random
walks from the temporal knowledge graph (G),
- a rule learning component (RL) configured for transforming the temporal random walks
into temporal logical rules (TLR),
- a monitoring component, configured for observing an event in the network or a different
network, and
- a rule matching component (RM), configured for classifying the observed event with
regard to an anomaly, using the temporal logical rules (TLR).
13. The system according to claim 12,
- implemented as a monitoring device, in particular an edge device, wherein
- the monitoring device is part of the network, and
- the network is an industrial automation system, in particular a manufacturing system.
14. Computer program product with program instructions for carrying out a method according
to one of the method claims.
15. Provision device for the computer program product according to the preceding claim,
wherein the provision device stores and/or provides the computer program product.