TECHNICAL FIELD
[0001] The present invention relates to the field of computer technologies, and in particular,
to a streaming application upgrading method, a master node, and a stream computing
system.
BACKGROUND
[0002] With the arrival of the big data era, market demand for real-time processing,
analysis, and decision-making on mass data continues to expand, for example, precise
advertisement push in the telecommunications field, dynamic real-time analysis of
transactions in the finance field, and real-time monitoring in the industrial field.
Against this backdrop, data-intensive applications such as financial services, network
monitoring, and telecommunications data management are increasingly widely applied,
and stream computing systems applicable to such data-intensive applications have also
emerged. Data generated by a data-intensive application is characterized by a large
volume, a high speed, and time variance, and after the data-intensive application is
deployed in the stream computing system, the stream computing system may process the
data of the application immediately upon receiving it, to ensure real-time performance.
As shown in FIG. 1, a stream computing system generally includes a master node (Master)
and multiple worker nodes (Worker), where the master node is mainly responsible for
scheduling and managing each worker node, a worker node is a logical entity carrying
an actual data processing operation, and a worker node processes data specifically by
invoking several process elements (PE, Process Element), where a PE is the physical
carrier of service logic.
[0003] Generally, an application program or a service deployed in the stream computing system
is referred to as a streaming application. In the prior art, when a streaming application
is deployed in the stream computing system, a logical model of the streaming application
needs to be defined in advance, and the logical model of the streaming application
is generally denoted by using a directed acyclic graph (Directed Acyclic Graph, DAG).
As shown in FIG. 2, a PE is a physical carrier carrying an actual data processing
operation, and is also a minimum unit that may be scheduled and executed by the stream
computing system; a stream represents a data stream transmitted between PEs, and an
arrow denotes a direction of a data stream; and a PE may load and execute service
logic dynamically, and process data of the streaming application in real time. As
shown in FIG. 3, a stream computing system deploys PEs on different worker nodes for
execution according to a logical model, and each PE performs computing according to
logic of the PE, and forwards a computing result to a downstream PE. However, when
a user demand or a service scenario changes, the streaming application needs to be
updated or upgraded, and the initial logical model is no longer applicable. Therefore,
first, updating of the streaming application needs to be completed offline, and a
new logical model is defined; then the old application is stopped, an updated streaming
application is deployed in the stream computing system according to the new logical
model, and finally the streaming application is started. It can be seen that, in the
prior art, an original service needs to be interrupted to update the streaming application;
therefore, the streaming application cannot be upgraded online, causing a service
loss.
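The offline upgrade flow of the prior art described above can be sketched as follows; the system object and its methods are hypothetical stand-ins used only to make the ordering of steps explicit, not the API of any real stream computing product:

```python
# A sketch of the prior-art offline upgrade flow described in the
# background: the old application must be stopped before the updated
# application is deployed, so the service is interrupted.
class StubSystem:
    """Records the calls made to it, in order (hypothetical stand-in)."""
    def __init__(self):
        self.log = []
    def stop(self, app):          self.log.append(("stop", app))
    def deploy(self, app, model): self.log.append(("deploy", app, model))
    def start(self, app):         self.log.append(("start", app))

def offline_upgrade(system, app, new_model):
    system.stop(app)               # old service is interrupted here
    system.deploy(app, new_model)  # redeploy per the new logical model
    system.start(app)              # service resumes only after restart

sys_ = StubSystem()
offline_upgrade(sys_, "appA", "D2")
print(sys_.log[0][0])  # "stop" happens first, causing the service loss
```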
SUMMARY
[0004] Embodiments of the present invention provide a streaming application upgrading method,
a master node, and a stream computing system, which are used to upgrade a streaming
application in a stream computing system online without interrupting a service.
[0005] According to a first aspect, an embodiment of the present invention provides a streaming
application upgrading method, where the method is applied to a master node in a stream
computing system, and the stream computing system includes the master node and at
least one worker node, where multiple process elements PE are distributed on one or
more worker nodes of the at least one worker node, and are configured to process data
of a streaming application deployed in the stream computing system, where an initial
logical model of the streaming application is used to denote the multiple PEs processing
the data of the streaming application and a direction of a data stream between the
multiple PEs; and the method includes:
in a case in which the streaming application is updated, obtaining, by the master
node, an updated logical model of the streaming application, and determining a to-be-adjusted
data stream by comparing the initial logical model of the streaming application with
the updated logical model;
generating an upgrading instruction according to the to-be-adjusted data stream; and
delivering the upgrading instruction to a first worker node, where the first worker
node is a worker node at which a PE related to the to-be-adjusted data stream is located,
and the upgrading instruction is used to instruct the first worker node to adjust
a direction of a data stream between PEs distributed on the first worker node.
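The comparison step above can be sketched by treating a logical model as a set of directed edges (source PE, destination PE); this edge representation is an assumption made for illustration, not the model format prescribed by the invention:

```python
# Minimal sketch of determining the to-be-adjusted data streams: the
# streams that differ between the initial and updated logical models.
def diff_streams(initial_model, updated_model):
    initial, updated = set(initial_model), set(updated_model)
    to_add = updated - initial      # streams only in the updated model
    to_delete = initial - updated   # streams only in the initial model
    return to_add, to_delete

d1 = {("PE1", "PE2"), ("PE2", "PE3")}   # initial logical model D1
d2 = {("PE1", "PE2"), ("PE2", "PE4")}   # updated logical model D2
to_add, to_delete = diff_streams(d1, d2)
print(to_add, to_delete)  # {('PE2', 'PE4')} {('PE2', 'PE3')}
```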
[0006] In a first possible implementation manner of the first aspect, the determining a
to-be-adjusted data stream by comparing the initial logical model of the streaming
application with the updated logical model includes:
comparing the initial logical model of the streaming application with the updated
logical model, to determine the to-be-adjusted data stream, where the PEs denoted
by the initial logical model of the streaming application are the same as PEs denoted
by the updated logical model.
[0007] In a second possible implementation manner of the first aspect, the determining a
to-be-adjusted data stream by comparing the initial logical model of the streaming
application with the updated logical model includes:
comparing the initial logical model of the streaming application with the updated
logical model, to determine a to-be-adjusted PE and the to-be-adjusted data stream,
where the PEs denoted by the initial logical model of the streaming application are
not completely the same as PEs denoted by the updated logical model;
the generating an upgrading instruction according to the to-be-adjusted data stream
includes:
generating a first upgrading instruction according to the to-be-adjusted data stream;
and generating a second upgrading instruction according to the to-be-adjusted PE;
and
the delivering the upgrading instruction to a first worker node includes:
delivering the first upgrading instruction to the first worker node, and delivering
the second upgrading instruction to a second worker node, where the second worker
node includes a worker node at which the to-be-adjusted PE is located; and the first
upgrading instruction is used to instruct the first worker node to adjust the direction
of the data stream between the PEs distributed on the first worker node, and the second
upgrading instruction is used to instruct the second worker node to adjust a quantity
of PEs distributed on the second worker node.
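When the PE sets of the two models differ, the second implementation manner generates two kinds of instructions. A sketch under assumed representations (instruction tuples and PE-set inputs are hypothetical):

```python
# Sketch: the first upgrading instruction adjusts stream directions, and
# the second upgrading instruction adjusts the quantity of PEs (create
# the to-be-added PEs, delete the to-be-deleted PEs).
def generate_instructions(initial_pes, updated_pes, to_adjust_streams):
    pes_to_add = updated_pes - initial_pes
    pes_to_delete = initial_pes - updated_pes
    first = [("adjust_stream", s) for s in sorted(to_adjust_streams)]
    second = ([("create_pe", p) for p in sorted(pes_to_add)]
              + [("delete_pe", p) for p in sorted(pes_to_delete)])
    return first, second

first, second = generate_instructions(
    {"PE1", "PE2", "PE3"}, {"PE1", "PE2", "PE4"},
    {("PE2", "PE4"), ("PE2", "PE3")})
print(second)  # [('create_pe', 'PE4'), ('delete_pe', 'PE3')]
```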
[0008] With reference to the first aspect, or either of the first and second possible implementation
manners of the first aspect, in a third possible implementation manner, the method
further includes:
determining, by the master node according to a dependency relationship between an
input stream and an output stream of the PE related to the to-be-adjusted data stream,
a target PE that needs to perform data recovery and a checkpoint for the
target PE to perform data recovery;
delivering a data recovery instruction to a worker node at which the target PE is
located, where the data recovery instruction is used to instruct the target PE to
recover data according to the checkpoint; and
after it is determined that the first worker node completes adjustment, and the PEs
distributed on the first worker node get ready, triggering, by the master node, the
target PE to input the recovered data to a downstream PE of the target PE for processing.
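The recovery step above can be sketched as follows; the checkpoint store and readiness callback are hypothetical simplifications of what a worker node would actually hold:

```python
# Sketch: the target PE recovers data according to the chosen checkpoint,
# but the recovered data is fed to the downstream PE only after the first
# worker node completes adjustment and its PEs are ready.
def recover_and_resume(target_pe, checkpoint, first_worker_ready):
    recovered = target_pe["checkpoints"][checkpoint]  # recover per checkpoint
    if not first_worker_ready():
        return None          # wait: downstream PEs are not ready yet
    return recovered         # now safe to input to the downstream PE

pe = {"checkpoints": {"ckpt-3": ["tuple-7", "tuple-8"]}}
print(recover_and_resume(pe, "ckpt-3", lambda: True))  # ['tuple-7', 'tuple-8']
```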
[0009] With reference to the third possible implementation manner of the first aspect, in
a fourth possible implementation manner, the to-be-adjusted data stream includes:
a to-be-updated data stream and a to-be-deleted data stream; and the determining,
by the master node according to a dependency relationship between an input stream
and an output stream of the PE related to the to-be-adjusted data stream, a target
PE that needs to perform data recovery and a checkpoint for the target
PE to perform data recovery includes:
determining, by the master node according to status data of a PE related to the to-be-updated
data stream and the to-be-deleted data stream, a checkpoint for performing data recovery;
and determining, according to a dependency relationship between an input stream and
an output stream of a PE related to the to-be-updated data stream and the to-be-deleted
data stream, a target PE that needs to perform data recovery, where status data of
each PE is backed up by the PE when being triggered by an output event, and is used
to indicate a status in which the PE processes data.
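The backup rule stated above (status data is backed up when an output event is triggered) can be sketched with a toy PE; the class and its fields are illustrative assumptions:

```python
# Sketch: the PE backs up its status data only when an output event is
# triggered, so each checkpoint corresponds to a fully emitted output.
class SketchPE:
    def __init__(self):
        self.state = 0
        self.checkpoints = []        # backed-up status data snapshots
    def process(self, value):
        self.state += value          # internal processing, no backup yet
    def emit(self):
        self.checkpoints.append(self.state)  # output event -> back up status
        return self.state

pe = SketchPE()
pe.process(1); pe.process(2)
pe.emit()                            # triggers the backup
pe.process(4)                        # later processing is not yet backed up
print(pe.checkpoints)  # [3]
```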
[0010] With reference to any one of the second to fourth possible implementation manners
of the first aspect, in a fifth possible implementation manner, the to-be-adjusted
PE includes a to-be-added PE; the second worker node is a worker node selected by
the master node according to a load status of each worker node in the stream computing
system; and the second upgrading instruction is used to instruct the second worker
node to create the to-be-added PE.
[0011] With reference to any one of the second to fifth possible implementation manners
of the first aspect, in a sixth possible implementation manner, the to-be-adjusted
PE includes a to-be-deleted PE; the second worker node is a worker node at which the
to-be-deleted PE is located; and the second upgrading instruction is used to instruct
the second worker node to delete the to-be-deleted PE.
[0012] With reference to the first aspect, or any one of the first to sixth possible implementation
manners of the first aspect, in a seventh possible implementation manner, the method
further includes:
configuring the multiple PEs according to the initial logical model of the streaming
application to process the data of the streaming application.
[0013] With reference to the first aspect, or any one of the first to seventh possible implementation
manners of the first aspect, in an eighth possible implementation manner, the initial
logical model of the streaming application is denoted by using a directed acyclic
graph DAG.
[0014] According to a second aspect, an embodiment of the present invention provides a master
node in a stream computing system, where the stream computing system includes the
master node and at least one worker node, where multiple process elements PE are distributed
on one or more worker nodes of the at least one worker node, and are configured to
process data of a streaming application deployed in the stream computing system, where
an initial logical model of the streaming application is used to denote the multiple
PEs processing the data of the streaming application and a direction of a data stream
between the multiple PEs; and the master node includes:
an obtaining and comparing module, configured to: in a case in which the streaming
application is updated, obtain an updated logical model of the streaming application,
and determine a to-be-adjusted data stream by comparing the initial logical model
of the streaming application with the updated logical model;
an upgrading instruction generating module, configured to generate an upgrading instruction
according to the to-be-adjusted data stream; and
a sending module, configured to deliver the upgrading instruction to a first worker
node, where the first worker node is a worker node at which a PE related to the to-be-adjusted
data stream is located, and the upgrading instruction is used to instruct the first
worker node to adjust a direction of a data stream between PEs distributed on the
first worker node.
[0015] In a first possible implementation manner of the second aspect, the obtaining and
comparing module is specifically configured to:
compare the initial logical model of the streaming application with the updated logical
model, to determine the to-be-adjusted data stream, where the PEs denoted by the initial
logical model of the streaming application are the same as PEs denoted by the updated
logical model.
[0016] In a second possible implementation manner of the second aspect, the PEs denoted
by the initial logical model of the streaming application are not completely the same
as the PEs denoted by the updated logical model;
the upgrading instruction generating module is specifically configured to: generate
a first upgrading instruction according to the to-be-adjusted data stream; and generate
a second upgrading instruction according to the to-be-adjusted PE; and
the sending module is specifically configured to deliver the first upgrading instruction
to the first worker node, and deliver the second upgrading instruction to a second
worker node, where the second worker node includes a worker node at which the to-be-adjusted
PE is located; and the first upgrading instruction is used to instruct the first worker
node to adjust the direction of the data stream between the PEs distributed on the
first worker node, and the second upgrading instruction is used to instruct the second
worker node to adjust a quantity of PEs distributed on the second worker node.
[0017] With reference to the second aspect, or either of the first and second possible
implementation manners of the second aspect, in a third possible implementation manner, the master
node further includes:
a data recovery module, configured to determine, according to a dependency relationship
between an input stream and an output stream of the PE related to the to-be-adjusted
data stream, a target PE that needs to perform data recovery and a checkpoint
for the target PE to perform data recovery, where
the sending module is further configured to deliver a data recovery instruction to
a worker node at which the target PE is located, where the data recovery instruction
is used to instruct the target PE to recover data according to the checkpoint; and
the master node further includes: an input triggering module, configured to: after
it is determined that the first worker node completes adjustment, and the PEs distributed
on the first worker node get ready, trigger the target PE to input the recovered data
to a downstream PE of the target PE for processing.
[0018] With reference to the third possible implementation manner of the second aspect,
in a fourth possible implementation manner, the to-be-adjusted data stream includes:
a to-be-updated data stream and a to-be-deleted data stream; and the data recovery
module is specifically configured to:
determine, according to status data of a PE related to the to-be-updated
data stream and the to-be-deleted data stream, a checkpoint for performing data recovery;
and determine, according to a dependency relationship between an input stream and
an output stream of a PE related to the to-be-updated data stream and the to-be-deleted
data stream, a target PE that needs to perform data recovery, where status data of
each PE is backed up by the PE when being triggered by an output event, and is used
to indicate a status in which the PE processes data.
[0019] With reference to any one of the second to fourth possible implementation manners
of the second aspect, in a fifth possible implementation manner, the to-be-adjusted
PE includes a to-be-deleted PE; the second worker node is a worker node at which the
to-be-deleted PE is located; and the second upgrading instruction is used to instruct
the second worker node to delete the to-be-deleted PE.
[0020] With reference to any one of the second to fifth possible implementation manners
of the second aspect, in a sixth possible implementation manner, the to-be-adjusted
PE includes a to-be-added PE; the second worker node is a worker node selected by
the master node according to a load status of each worker node in the stream computing
system; and the second upgrading instruction is used to instruct the second worker
node to create the to-be-added PE.
[0021] With reference to the second aspect, or any one of the first to sixth possible implementation
manners of the second aspect, in a seventh possible implementation manner, the master
node further includes: a configuration module, configured to configure the multiple
PEs according to the initial logical model of the streaming application to process
the data of the streaming application.
[0022] According to a third aspect, an embodiment of the present invention provides a stream
computing system, including: a master node and at least one worker node, where multiple
process elements PE are distributed on one or more worker nodes of the at least one
worker node, and are configured to process data of a streaming application deployed
in the stream computing system, where an initial logical model of the streaming application
is used to denote the multiple PEs processing the data of the streaming application
and a direction of a data stream between the multiple PEs; and
the master node is configured to: in a case in which the streaming application is
updated, obtain an updated logical model of the streaming application, and determine
a to-be-adjusted data stream by comparing the initial logical model of the streaming
application with the updated logical model; generate an upgrading instruction according
to the to-be-adjusted data stream; and deliver the upgrading instruction to a first
worker node, where the first worker node is a worker node at which a PE related to
the to-be-adjusted data stream is located, and the upgrading instruction is used to
instruct the first worker node to adjust a direction of a data stream between PEs
distributed on the first worker node; and
the first worker node is configured to receive the upgrading instruction sent by the
master node, and adjust, according to an indication of the upgrading instruction,
the direction of the data stream between the PEs distributed on the first worker node.
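The worker-side behavior described for the third aspect can be sketched as follows; the instruction format (a dict of streams to add and delete) is a hypothetical illustration:

```python
# Sketch: the first worker node receives the upgrading instruction from
# the master node and rewires the data streams between its local PEs.
class Worker:
    def __init__(self, edges):
        self.edges = set(edges)   # (source PE, destination PE) pairs
    def on_upgrade(self, instruction):
        self.edges -= set(instruction["delete"])  # remove old directions
        self.edges |= set(instruction["add"])     # add new directions

w = Worker([("PE1", "PE2")])
w.on_upgrade({"delete": [("PE1", "PE2")], "add": [("PE1", "PE3")]})
print(w.edges)  # {('PE1', 'PE3')}
```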
[0023] In a first possible implementation manner of the third aspect, the PEs denoted
by the initial logical model of the streaming application are the same as the PEs denoted
by the updated logical model.
[0024] In a second possible implementation manner of the third aspect, the PEs denoted
by the initial logical model of the streaming application are not completely the same
as the PEs denoted by the updated logical model; and the master node is specifically
configured to: generate a first upgrading instruction according to the to-be-adjusted
data stream, and generate a second upgrading instruction according to the to-be-adjusted
PE; and
deliver the first upgrading instruction to the first worker node, and deliver the
second upgrading instruction to a second worker node, where the second worker node
includes a worker node at which the to-be-adjusted PE is located;
the first worker node is specifically configured to receive the first upgrading instruction
sent by the master node, and adjust, according to an indication of the first upgrading
instruction, the direction of the data stream between the PEs distributed on the first
worker node; and
the second worker node is specifically configured to receive the second upgrading
instruction sent by the master node, and adjust, according to an indication of the
second upgrading instruction, a quantity of PEs distributed on the second worker node.
[0025] With reference to the third aspect, or either of the first and second possible implementation
manners of the third aspect, in a third possible implementation manner, the master
node is further configured to determine, according to a dependency relationship between
an input stream and an output stream of the PE related to the to-be-adjusted data
stream, a target PE that needs to perform data recovery and a checkpoint
for the target PE to perform data recovery; deliver a data recovery instruction
to a worker node at which the target PE is located, where the data recovery instruction
is used to instruct the target PE to recover data according to the checkpoint; and
after it is determined that the first worker node completes adjustment, and the PEs
distributed on the first worker node get ready, trigger the target PE to input the
recovered data to a downstream PE of the target PE for processing.
[0026] It can be known from the foregoing technical solutions that, according to the streaming
application upgrading method and the stream computing system provided in the embodiments
of the present invention, a logical model of a streaming application is compared with
an updated logical model of the streaming application, to dynamically determine a
to-be-adjusted data stream, and a corresponding upgrading instruction is generated
according to the to-be-adjusted data stream and delivered to a worker node, thereby
upgrading the streaming application in the stream computing system online without
interrupting a service.
BRIEF DESCRIPTION OF DRAWINGS
[0027] To describe the technical solutions in the embodiments of the present invention or
in the prior art more clearly, the following briefly introduces the accompanying drawings
required for describing the embodiments or the prior art. Apparently, the accompanying
drawings in the following description show merely some embodiments of the present
invention, and persons of ordinary skill in the art may still derive other drawings
from these accompanying drawings without creative efforts.
FIG. 1 is a schematic diagram of an architecture of a stream computing system according
to the present invention;
FIG. 2 is a schematic diagram of a logical model of a streaming application according
to an embodiment of the present invention;
FIG. 3 is a schematic diagram of deployment of a streaming application according to
an embodiment of the present invention;
FIG. 4 is a diagram of a working principle of a stream computing system according
to an embodiment of the present invention;
FIG. 5 is a flowchart of a streaming application upgrading method according to an
embodiment of the present invention;
FIG. 6 is a schematic diagram of a change to a logical model of a streaming application
after the streaming application is updated according to an embodiment of the present
invention;
FIG. 7 is a schematic diagram of a change to a logical model of a streaming application
after the streaming application is updated according to an embodiment of the present
invention;
FIG. 8 is a flowchart of a streaming application upgrading method according to an
embodiment of the present invention;
FIG. 9 is a schematic diagram of a logical model of a streaming application according
to an embodiment of the present invention;
FIG. 10 is a schematic diagram of an adjustment of a logical model of a streaming
application according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of PE deployment after a streaming application is upgraded
according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of a dependency relationship between an input stream
and an output stream of a PE according to an embodiment of the present invention;
FIG. 13 is a schematic diagram of a dependency relationship between an input stream
and an output stream of a PE according to an embodiment of the present invention;
FIG. 14 is a schematic diagram of backup of status data of a PE according to an embodiment
of the present invention;
FIG. 15 is a schematic diagram of a master node according to an embodiment of the
present invention;
FIG. 16 is a schematic diagram of a stream computing system according to an embodiment
of the present invention; and
FIG. 17 is a schematic diagram of a master node according to an embodiment of the
present invention.
DESCRIPTION OF EMBODIMENTS
[0028] To make the objectives, technical solutions, and advantages of the present invention
clearer, the following clearly and completely describes the technical solutions of
the present invention with reference to the accompanying drawings in the embodiments
of the present invention. Apparently, the following described embodiments are some
of the embodiments of the present invention. Based on the embodiments of the present
invention, persons of ordinary skill in the art can obtain other embodiments that
can resolve the technical problem of the present invention and implement the technical
effect of the present invention by equivalently altering some or all of the technical
features even without creative efforts. Apparently, the embodiments obtained by means
of alteration do not depart from the scope disclosed in the present invention.
[0029] The technical solutions provided in the embodiments of the present invention may
be typically applied to a stream computing system. FIG. 4 describes a basic structure
of a stream computing system, and the stream computing system includes a master node
(Master) and multiple worker nodes (worker). During cluster deployment, there may
be one or more master nodes and one or more worker nodes, and a master node may be
a physical node separated from a worker node; and during standalone deployment, a
master node and a worker node may be logical units deployed on a same physical node,
where the physical node may be specifically a computer or a server. The master node
is responsible for scheduling a data stream to the worker node for processing. Generally,
one physical node is one worker node. In some cases, one physical node may correspond
to multiple worker nodes, where the quantity of worker nodes corresponding to one physical
node depends on the physical hardware resources of the physical node. One worker node
may be understood as one physical hardware resource. Worker nodes corresponding to
a same physical node communicate with each other by means of process communication,
and worker nodes corresponding to different physical nodes communicate with each other
by means of network communication.
[0030] As shown in FIG. 4, a stream computing system includes a master node, a worker node
1, a worker node 2, and a worker node 3.
[0031] The master node deploys, according to a logical model of a streaming application,
the streaming application in the three worker nodes: the worker node 1, the worker
node 2, and the worker node 3 for processing. The logical model shown in FIG. 3 is
a logical relationship diagram including nine process elements (PE, Process Element):
PE1 to PE9, and directions of data streams between the nine PEs, where the directions
of the data streams between the PEs also embody the dependency relationships between
the input streams and output streams of the PEs. It should be noted that, a data stream
in the embodiments of the present invention is also briefly referred to as a stream
(stream).
[0032] The master node configures PE1, PE2, and PE3 on the worker node 1, PE4, PE7, and
PE9 on the worker node 2, and PE5, PE6, and PE8 on the worker node 3 according to
the logical model of the streaming application to process a data stream of the streaming
application. It can be seen that, after the configuration, a direction of a data stream
between the PEs on the worker nodes 1, 2, and 3 matches the logical model of the streaming
application.
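The configuration in [0032] can be sketched as a placement map from PEs to worker nodes, using the assignment stated above; the lookup helper is a hypothetical illustration:

```python
# Sketch of the PE placement from [0032]: each PE of the logical model
# is placed on a worker node, so stream directions can be wired up.
placement = {
    "worker1": ["PE1", "PE2", "PE3"],
    "worker2": ["PE4", "PE7", "PE9"],
    "worker3": ["PE5", "PE6", "PE8"],
}

def worker_of(pe):
    """Return the worker node at which the given PE is located."""
    return next(w for w, pes in placement.items() if pe in pes)

# A stream from PE3 to PE5 crosses from worker1 to worker3, so it would
# travel over network communication rather than process communication.
print(worker_of("PE3"), "->", worker_of("PE5"))  # worker1 -> worker3
```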
[0033] The logical model of the streaming application in the embodiments of the present
invention may be a directed acyclic graph (Directed Acyclic Graph, DAG), a tree graph,
or a cyclic graph. The logical model of the streaming application may be understood
by referring to FIG. 2. The diagram of a streaming application shown in FIG. 2
includes seven operators, PE1 to PE7, and eight data streams, s1 to s8. FIG. 2
explicitly marks the directions of the data streams; for example,
the data stream s1 is from PE1 to the operator PE5, which denotes that PE5 processes
a stream that is output by PE1, that is, an output of PE5 depends on an input of PE1.
PE5 is generally also referred to as a downstream PE of PE1, and PE1 is an upstream
PE of PE5. It can be understood that, an upstream PE and a downstream PE are determined
according to a direction of a data stream between the PEs, and only two PEs are related
to one data stream: a source PE that outputs the data stream, and a destination PE
to which the data stream is directed (that is, a PE receiving the data stream). Viewed
from a direction of a data stream, a source PE is an upstream PE of a destination
PE, and the destination PE is a downstream PE of the source PE. Further, after a data
stream s2 is input to PE2, and subject to logical processing of PE2, two data streams
s3 and s4 are generated, and enter PE3 and PE4 respectively for logical processing.
Likewise, PE2 is also a downstream PE of PE1, and PE1 is an upstream PE of PE2. The
data stream s4 that is output by PE4 and a data stream s7 that is output by PE3 are
both used as inputs of PE6, that is, an output of PE6 depends on inputs of PE3 and
PE4. It should be noted that, in the embodiments of the present invention, a PE whose
output depends on an input of a single PE is defined as a stateless PE, such as PE5,
PE3, or PE4; and a PE whose output depends on inputs of multiple PEs is defined as
a stateful PE, such as PE6 or PE7. A single data segment in a data stream
is referred to as a tuple, where a tuple may be structured or unstructured data.
Generally, a tuple may denote the status of an object at a specific time point; a PE
in the stream computing system processes a data stream generated by the streaming
application by using a tuple as a unit, and a tuple may also be considered
the minimum granularity for dividing and denoting data in the stream computing
system.
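The stateless/stateful distinction above can be sketched directly on the DAG: a PE whose output depends on a single upstream PE is stateless, and one that depends on multiple upstream PEs is stateful. Only the streams explicitly named in the text are included here:

```python
# Edges (source PE, destination PE) taken from the FIG. 2 description.
streams = [("PE1", "PE5"), ("PE1", "PE2"), ("PE2", "PE3"),
           ("PE2", "PE4"), ("PE3", "PE6"), ("PE4", "PE6")]

def classify(pe):
    """Stateless if the PE has one upstream PE, stateful if several."""
    upstream = {src for src, dst in streams if dst == pe}
    return "stateful" if len(upstream) > 1 else "stateless"

print(classify("PE5"), classify("PE6"))  # stateless stateful
```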
[0034] It should be further noted that, the stream computing system is only a typical application
scenario of the technical solutions of the present invention, and does not constitute
any limitation on application scenarios of the present invention, and the technical
solutions of the embodiments of the present invention are all applicable to other
application scenarios involved in application deployment and upgrading of a distributed
system or a cloud computing system.
[0035] An embodiment of the present invention provides a streaming application upgrading
method, where the method may be typically applied to the stream computing system shown
in FIG. 1 and FIG. 4. Assuming that a streaming application A is deployed in the stream
computing system and an initial logical model of the streaming application A is D1,
the master node of the stream computing system deploys multiple process elements (PE)
according to the initial logical model D1 to process a data stream of the streaming
application A, where the multiple process elements PE are distributed on one or more
worker nodes of the stream computing system. As shown in FIG. 6, after the streaming
application A is upgraded or updated, the logical model of the streaming application
A is correspondingly updated (it is assumed that the updated logical model is D2).
The updating of the logical model is generally completed by a developer, or by a developer
with a development tool, which is not particularly limited in the present invention.
As shown in FIG. 5, a main procedure of the streaming application upgrading method
is described as follows:
S501: In a case in which the streaming application A is updated, the master node of the
stream computing system obtains an updated logical model D2 of the streaming application
A.
S502: The master node determines a to-be-adjusted data stream by comparing the updated
logical model D2 with the initial logical model D1.
S503: The master node generates an upgrading instruction according to the to-be-adjusted
data stream.
S504: The master node delivers the generated upgrading instruction to a first worker
node, where the first worker node is a worker node at which a PE related to the to-be-adjusted
data stream is located, and the upgrading instruction is used to instruct the first
worker node to adjust a direction of a data stream between PEs distributed on the
first worker node.
[0036] It should be noted that, there may be one or more to-be-adjusted data streams in
this embodiment of the present invention, which depends on a specific situation. PEs
related to each to-be-adjusted data stream specifically refer to a source PE and a
destination PE of the to-be-adjusted data stream, where the source PE of the to-be-adjusted
data stream is a PE that outputs the to-be-adjusted data stream, the destination PE
of the to-be-adjusted data stream is a PE receiving the to-be-adjusted data stream
or a downstream PE of the source PE of the to-be-adjusted data stream.
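The S501 to S504 procedure above can be sketched as follows. This is a minimal illustration, assuming a logical model is represented as a set of (source PE, destination PE) stream edges and a `pe_location` map from PE to worker node; the function and data names are not the claimed implementation.

```python
# Hedged sketch of S501-S504. A to-be-updated stream appears in this simple
# edge diff as a delete/add pair that shares a source or destination PE.

def upgrade_streaming_app(d1_edges, d2_edges, pe_location):
    # S502: determine to-be-adjusted data streams by comparing D2 with D1
    to_add = sorted(d2_edges - d1_edges)
    to_delete = sorted(d1_edges - d2_edges)
    # S503: generate an upgrading instruction per to-be-adjusted stream
    instructions = ([("add_stream", e) for e in to_add] +
                    [("delete_stream", e) for e in to_delete])
    # S504: deliver each instruction to the first worker node(s), i.e. the
    # worker nodes at which the source PE and the destination PE are located
    delivery = {}
    for op, (src, dst) in instructions:
        for worker in sorted({pe_location[src], pe_location[dst]}):
            delivery.setdefault(worker, []).append((op, (src, dst)))
    return delivery
```

For the FIG. 6 situation, deleting the PE4-to-PE6 edge and adding PE4-to-PE7 and PE2-to-PE6 edges yields instructions routed only to the workers hosting those PEs.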
[0037] According to the streaming application upgrading method and the stream computing
system that are provided in this embodiment of the present invention, a logical model
of a streaming application is compared with an updated logical model of the streaming
application, to dynamically determine a to-be-adjusted data stream, and a corresponding
upgrading instruction is generated according to the to-be-adjusted data stream and
delivered to a worker node, thereby upgrading the streaming application in the stream
computing system online without interrupting a service.
[0038] In this embodiment of the present invention, the logical model of the streaming application
is used to denote multiple PEs processing data of the streaming application and a
direction of a data stream between the multiple PEs. After the streaming application
is upgraded or updated, the logical model of the streaming application is correspondingly
updated. Generally, a difference between an updated logical model and the initial logical
model is mainly divided into two types: (1) PEs denoted by the initial logical model
are completely the same as PEs denoted by the updated logical model, and only a direction
of a data stream between PEs changes; and (2) the PEs denoted by the initial logical
model are not completely the same as the PEs denoted by the updated logical model,
and a direction of a data stream between PEs also changes. For the foregoing two types
of differences, corresponding processing procedures are described below.
[0039] In a specific embodiment, as shown in FIG. 6, PEs denoted by an initial logical model
of a streaming application are completely the same as PEs denoted by an updated logical
model of the streaming application, and a direction of a data stream between PEs changes.
According to FIG. 6, both the PEs in the logical model of the streaming application
before updating and the PEs in the logical model of the streaming application that
is updated are PE1 to PE7, and are completely the same, but a direction of a data
stream changes, that is, a data stream from PE4 to PE6 becomes a data stream S11 from
PE4 to PE7, and a data stream S12 from PE2 to PE6 is added. In this case, a main procedure
of the streaming application upgrading method is as follows:
Step 1: Determine a to-be-adjusted data stream by comparing an initial logical model
of a streaming application with an updated logical model of the streaming application,
where the to-be-adjusted data stream includes one or more data streams. Specifically,
in an exemplary embodiment, the to-be-adjusted data stream may include at least one
of the following: a to-be-added data stream, a to-be-deleted data stream, and a to-be-updated
data stream, where the to-be-updated data stream refers to a data stream whose destination
node or source node changes after the logical model of the streaming application is
updated. Specifically, as shown in FIG. 6, the to-be-adjusted data stream includes
a to-be-added data stream S12, and a to-be-updated data stream S11.
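One way the comparison in Step 1 could be realized is sketched below; the pairing rule that turns a matching delete/add pair into a single to-be-updated stream, and all names, are assumptions for illustration.

```python
# Classify to-be-adjusted streams into added, deleted, and updated.

def classify_streams(d1_edges, d2_edges):
    added = sorted(d2_edges - d1_edges)
    deleted = sorted(d1_edges - d2_edges)
    updated = []
    for old in list(deleted):
        # A deleted edge paired with an added edge that keeps the same source
        # PE (destination changed) or the same destination PE (source changed)
        # is one to-be-updated stream rather than a separate delete plus add.
        match = (next((e for e in added if e[0] == old[0]), None)
                 or next((e for e in added if e[1] == old[1]), None))
        if match is not None:
            updated.append((old, match))
            added.remove(match)
            deleted.remove(old)
    return added, deleted, updated
```

Applied to the FIG. 6 models, the PE4-to-PE6 edge paired with the new PE4-to-PE7 edge is reported as the to-be-updated stream (S6 to S11), and the PE2-to-PE6 edge as the to-be-added stream (S12).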
Step 2: Generate an upgrading instruction according to the to-be-adjusted data stream,
where the upgrading instruction may include one or more instructions, and the upgrading
instruction is related to a type of the to-be-adjusted data stream. For example, if
the to-be-adjusted data stream includes a to-be-added data stream and a to-be-updated
data stream, the generated upgrading instruction includes an instruction used to add
a data stream and an instruction used to update a data stream, where different types
of upgrading instructions may be separate instructions, or may be integrated into
one instruction, which is not particularly limited in the present invention either.
Specifically, as shown in FIG. 6, the generated upgrading instruction includes an
instruction for adding the data stream S12 and an instruction for updating a data
stream S6 to a data stream S11.
Step 3: Deliver the generated upgrading instruction to a first worker node, where
the first worker node is a worker node at which a PE related to the to-be-adjusted
data stream is located. It can be understood that, there may be one or more first
worker nodes. After receiving the upgrading instruction, a first worker node performs
operations indicated by the upgrading instruction, for example, adding the data stream
S12, and updating the data stream S6 to the data stream S11, so that a direction of
a data stream between PEs distributed on the first worker node is adjusted, and a
direction of a data stream after the adjustment matches the updated logical model.
[0040] Further, when the first worker node adjusts a data stream between PEs distributed
on the first worker node, data that is being processed may be lost, and therefore
the data needs to be recovered. Specifically, in an embodiment, before the first worker
node adjusts a data stream between PEs distributed on the first worker node, a master
node determines, according to a dependency relationship between an input stream and
an output stream of a PE related to the to-be-adjusted data stream, a target PE that
needs to perform data recovery and a checkpoint (checkpoint) for the target PE performing
data recovery; and delivers a data recovery instruction to a worker node at which
the target PE is located, where the data recovery instruction is used to instruct
the target PE to recover data according to the checkpoint; and after the master node
determines that the first worker node completes adjustment, and the PEs distributed
on the first worker node get ready, the master node triggers the target PE to input
the recovered data to a downstream PE of the target PE for processing.
[0041] It should be noted that, the master node may perceive a status of a PE on each worker
node in the stream computing system by actively sending a query message, or a worker
node may report a status of each PE distributed on the worker node to the master node,
where a status of a PE includes a running state, a ready state and a stopped state.
When a channel between a PE and an upstream or downstream PE is established successfully,
it indicates that the PE is in the ready state, and the PE may receive and process
a data stream.
[0042] Optionally, before performing the steps of the foregoing streaming application upgrading
method, the master node may further configure multiple PEs according to the initial
logical model of the streaming application to process data of the streaming application.
[0043] According to the streaming application upgrading method provided in this embodiment
of the present invention, a logical model of a streaming application is compared with
an updated logical model of the streaming application, to dynamically determine a
to-be-adjusted data stream, and a corresponding upgrading instruction is generated
and delivered to a worker node, to complete online upgrading of the streaming application,
thereby ensuring that an original service does not need to be interrupted in an application
upgrading process; and further, data is recovered in the upgrading process, to ensure
that key data is not lost, and service running is not affected.
[0044] In another specific embodiment, as shown in FIG. 7, PEs denoted by an initial logical
model of a streaming application are not completely the same as PEs denoted by an
updated logical model of the streaming application, and a direction of a data stream
between the PEs also changes. According to FIG. 7, a quantity of the PEs in the logical
model of the streaming application before updating is different from a quantity of
the PEs in the logical model of the streaming application that is updated (PE2, PE3,
PE4 and PE6 are deleted, and PE9 to PE13 are added), a direction of a data stream
also changes, that is, original data streams S4, S5, S6, and S7 are deleted; and data
streams S11 to S16 are added, a destination PE of an original data stream S3 is updated,
and a source PE of an original data stream S9 is updated. In this case, as shown in
FIG. 8, a main procedure of the streaming application upgrading method is as follows:
S801: A master node determines a to-be-adjusted PE and a to-be-adjusted data stream
by comparing an initial logical model of a streaming application with an updated logical
model of the streaming application, where the to-be-adjusted PE includes one or more
PEs, and the to-be-adjusted data stream includes one or more data streams. Specifically,
in an exemplary embodiment, the to-be-adjusted PE includes at least one of the following:
a to-be-added PE and a to-be-deleted PE, and the to-be-adjusted data stream may include
at least one of the following: a to-be-added data stream, a to-be-deleted data stream,
and a to-be-updated data stream.
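The determination in S801 can be sketched as follows; the representation of a logical model as a PE set plus an edge set is an assumption for illustration.

```python
# Hedged sketch of S801: the to-be-adjusted PEs are those present in only one
# of the two models, and every input and output stream of those PEs (plus any
# other edge that changed) is a to-be-adjusted stream.

def diff_models(d1_pes, d1_edges, d2_pes, d2_edges):
    pes_to_add = d2_pes - d1_pes
    pes_to_delete = d1_pes - d2_pes
    changed = pes_to_add | pes_to_delete
    streams = (d1_edges ^ d2_edges) | {
        e for e in (d1_edges | d2_edges) if e[0] in changed or e[1] in changed
    }
    return pes_to_add, pes_to_delete, streams
```

On a reduced version of FIG. 7 (PE2 replaced by PE9 between PE1 and PE7), the function reports PE9 as to-be-added, PE2 as to-be-deleted, and all four edges touching them as to-be-adjusted streams.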
[0045] Specifically, as shown in FIG. 9, the master node may determine, by comparing the
logical model of the streaming application before updating with the logical model
of the streaming application that is updated, that the original logical model is the
same as the updated logical model only after a logical submodel including PE2, PE3,
PE4, and PE6 in the original logical model is replaced with a logical submodel including
PE9 to PE13. Therefore, it is determined that PE2, PE3, PE4, PE6, and PE9 to PE13
are to-be-adjusted PEs (where PE2, PE3, PE4 and PE6 are to-be-deleted PEs, and PE9
to PE13 are to-be-added PEs), and it is determined that data streams related to the
to-be-adjusted PEs, that is, all input streams and output streams of the to-be-adjusted
PEs are to-be-adjusted streams. As shown in FIG. 9, a stream indicated by a dashed
line part is a to-be-deleted data stream, a stream indicated by a black bold part
is a to-be-added data stream, and a stream indicated by a light-colored bold part
is a to-be-updated data stream.
[0046] S802: The master node generates a first upgrading instruction according to the to-be-adjusted
data stream; and generates a second upgrading instruction according to the to-be-adjusted
PE, where the first upgrading instruction and the second upgrading instruction may
include one or more instructions each, the first upgrading instruction is related
to a type of the to-be-adjusted data stream, and the second upgrading instruction
is related to a type of the to-be-adjusted PE. For example, if the to-be-adjusted
data stream includes a to-be-added data stream and a to-be-updated data stream, the
generated first upgrading instruction includes an instruction used to add a data stream
and an instruction used to update a data stream; and if the to-be-adjusted PE includes
a to-be-added PE, the generated second upgrading instruction includes an instruction
used to add a PE, where the first upgrading instruction and the second upgrading instruction
may be separate instructions, or may be integrated into one instruction, which is
not particularly limited in the present invention either. Specifically, as shown in
FIG. 7, the generated first upgrading instruction includes an instruction for deleting
a data stream, an instruction for adding a data stream, and an instruction for updating
a data stream, and the second upgrading instruction includes an instruction for adding
a PE, and an instruction for deleting a PE.
[0047] In a specific embodiment, as shown in FIG. 9, after determining the to-be-adjusted
PE and the to-be-adjusted stream by comparing the logical model of the streaming application
before updating with the logical model of the streaming application that is updated,
the master node may further determine an adjustment policy, that is, how to adjust
a PE and a stream, so that PE deployment after the adjustment (including a quantity
of PEs and a dependency relationship between data streams between the PEs) matches
the updated logical model of the streaming application. The adjustment policy includes
two pieces of content: (1) a policy of adjusting a quantity of PEs, that is, which
PEs need to be added and/or which PEs need to be deleted; and (2) a policy of adjusting
a direction of a data stream between PEs, that is, directions of which data streams
between PEs need to be updated, which data streams need to be added, and which data
streams need to be deleted.
[0048] In an exemplary embodiment, the adjustment policy mainly includes at least one of
the following:
- (1) Update a stream: either a destination node or a source node of a data stream changes;
- (2) Delete a stream: a data stream needs to be discarded after an application is updated;
- (3) Add a stream: no data stream originally exists, and a stream is added after an
application is updated;
- (4) Delete a PE: a PE needs to be discarded after an application is updated; and
- (5) Add a PE: a PE is added after an application is updated.
[0049] Specifically, in the logical models shown in FIG. 7 and FIG. 9, it can be seen with
reference to FIG. 10 that, five PEs (PE9 to PE13) need to be added, and data streams
between PE9 to PE13 need to be added; PE2, PE3, PE4, and PE6 need to be deleted, and
data streams between PE2, PE3, PE4, and PE6 need to be deleted. In addition, because
a destination PE of an output stream of PE1 changes (from PE2 to PE9), and an input
stream of PE7 also changes (from an output stream of PE6 to an output stream of PE13,
that is, a source node of a stream changes), an output stream of PE1 and an input
stream of PE7 need to be updated. Based on the foregoing analysis, it may be learned
that the adjustment policy is:
- (1) Add PE9 to PE13;
- (2) Add streams between PE9 to PE13, where directions of data streams between PE9
to PE13 are determined by the updated logical model;
- (3) Delete PE2, PE3, PE4, and PE6;
- (4) Delete streams between PE2, PE3, PE4, and PE6; and
- (5) Change a destination PE of an output stream of PE1 from PE2 to PE9; and change
a source PE of an input stream of PE7 from PE6 to PE13.
[0050] After the adjustment policy is determined, the master node may generate an upgrading
instruction based on the determined adjustment policy, where the upgrading instruction
is used to instruct a worker node (which is specifically a worker node at which a
to-be-adjusted PE is located and a worker node at which a PE related to a to-be-adjusted
data stream is located) to implement the determined adjustment policy. Corresponding
to the adjustment policy, the upgrading instruction includes at least one of the following:
an instruction for adding a PE, an instruction for deleting a PE, an instruction for
updating a stream, an instruction for deleting a stream, and an instruction for adding
a stream. Specifically, in the logical models shown in FIG. 7 and FIG. 9, the upgrading
instruction specifically includes:
- (1) an instruction for adding PE9 to PE13;
- (2) an instruction for adding streams between PE9 to PE13;
- (3) an instruction for deleting PE2, PE3, PE4, and PE6;
- (4) an instruction for deleting streams between PE2, PE3, PE4, and PE6; and
- (5) an instruction for changing a destination PE of an output stream of PE1 from PE2
to PE9; and an instruction for changing a source PE of an input stream of PE7 from
PE6 to PE13.
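The mapping from the adjustment policy to the first and second upgrading instructions can be sketched as follows. The policy encoding, the instruction tuples, and the function name are assumptions; only an illustrative subset of the streams among PE9 to PE13 is listed.

```python
# Hedged sketch of S802 for the FIG. 7 / FIG. 9 adjustment policy.

ADJUSTMENT_POLICY = {
    "add_pes": ["PE9", "PE10", "PE11", "PE12", "PE13"],
    "delete_pes": ["PE2", "PE3", "PE4", "PE6"],
    "add_streams": [("PE9", "PE10"), ("PE10", "PE11")],    # illustrative subset
    "delete_streams": [("PE2", "PE3"), ("PE3", "PE6")],    # illustrative subset
    "update_streams": [(("PE1", "PE2"), ("PE1", "PE9")),   # output stream of PE1
                       (("PE6", "PE7"), ("PE13", "PE7"))], # input stream of PE7
}

def generate_upgrading_instructions(policy):
    # first upgrading instruction: adjusts directions of data streams
    first = ([("add_stream", e) for e in policy["add_streams"]] +
             [("delete_stream", e) for e in policy["delete_streams"]] +
             [("update_stream", old, new)
              for old, new in policy["update_streams"]])
    # second upgrading instruction: adjusts the quantity of PEs
    second = ([("add_pe", pe) for pe in policy["add_pes"]] +
              [("delete_pe", pe) for pe in policy["delete_pes"]])
    return first, second
```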
[0051] S803: The master node delivers the generated first upgrading instruction to a first
worker node, and delivers the generated second upgrading instruction to a second worker
node, where the first worker node is a worker node at which a PE related to the to-be-adjusted
data stream is located, and the second worker node includes a worker node at which
the to-be-adjusted PE is located. It can be understood that, there may be one or more
first worker nodes and one or more second worker nodes, and the first worker nodes
and the second worker nodes may overlap, that is, a worker node may be both a first
worker node and a second worker node; the first
upgrading instruction is used to instruct the first worker node to adjust the direction
of the data stream between the PEs distributed on the first worker node, and the second
upgrading instruction is used to instruct the second worker node to adjust a quantity
of PEs distributed on the second worker node. After receiving the upgrading instruction,
the first worker node and the second worker node perform an operation indicated by
the upgrading instruction, so that PEs distributed on the first worker node and the
second worker node and a direction of a data stream between the PEs are adjusted.
It can be understood that, the adjusting, by the second worker node, a quantity of
PEs distributed on the second worker node may be specifically creating a PE and/or
deleting a created PE.
[0052] Optionally, in a specific embodiment, if the to-be-adjusted PE includes a to-be-deleted
PE, the second worker node includes a worker node at which the to-be-deleted PE is
located; and the second upgrading instruction is used to instruct the second worker
node to delete the to-be-deleted PE.
[0053] Optionally, in another specific embodiment, if the to-be-adjusted PE includes a to-be-added
PE, the second worker node may be a worker node selected by the master node according
to a load status of each worker node in the stream computing system, or may be a worker
node randomly selected by the master node; and the second upgrading instruction is
used to instruct the second worker node to create the to-be-added PE.
[0054] Specifically, in the logical models shown in FIG. 7 and FIG. 9, as shown in FIG.
11, the master node sends, to worker2, an instruction for adding PE9, sends, to worker3,
an instruction for adding PE10, sends, to worker4, an instruction for adding PE11
and PE12, and sends, to worker6, an instruction for adding PE13; sends, to worker3,
an instruction for deleting PE2 and PE3, and sends, to worker4, an instruction for
deleting PE4 and PE6; and sends, to worker3 (the worker node at which PE2 and PE3 are
initially located), an instruction for deleting a stream between PE2 and PE3, and sends,
to worker3 (the worker node at which PE3 is located) and worker4 (the worker node at which
PE6 is located), an instruction for deleting a data stream between PE3 and PE6; the
other instructions can be deduced by analogy, and details are not described herein
again. It should be noted that, each worker node maintains data stream configuration
information of all PEs on the worker node, and data stream configuration information
of each PE includes information such as a source address, a destination address, and
a port number, and therefore deletion and updating of a data stream is essentially
implemented by modifying data stream configuration information.
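A minimal sketch of the per-PE data stream configuration information described above, and of how deletion and updating of a stream reduce to modifying it; the field names and addresses are assumptions.

```python
from dataclasses import dataclass

@dataclass
class StreamConfig:
    source_addr: str
    dest_addr: str
    port: int

def update_stream(config_table, stream_id, new_dest_addr, new_port):
    # Updating a data stream is implemented by modifying its configuration
    # entry, not by restarting the PEs involved.
    entry = config_table[stream_id]
    entry.dest_addr = new_dest_addr
    entry.port = new_port

def delete_stream(config_table, stream_id):
    # Deleting a data stream simply removes its configuration entry.
    config_table.pop(stream_id, None)
```

For example, updating the stream S6 to S11 amounts to rewriting the destination address and port of the entry keyed by the stream identifier.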
[0055] As shown in FIG. 11, according to the upgrading instruction delivered by the master
node, PE9 is added to worker2, PE2 and PE3 are deleted from worker3, PE10 is added
to worker3, PE6 and PE4 are deleted from worker4, PE11 and PE12 are added to worker4,
and PE13 is added to worker6. In addition, worker1 to worker6 also adjust directions
of data streams between PEs by performing operations such as an operation for deleting
a stream, an operation for adding a stream, and an operation for updating a stream.
Specifically, streams between PE9 to PE13 are added, streams between PE2, PE3, PE4,
and PE6 are deleted, and a destination PE of an output stream of PE1 is changed from
PE2 to PE9; and a source PE of an input stream of PE7 is changed from PE6 to PE13.
It can be seen from FIG. 11 that, PE deployment after the adjustment (including a
quantity of PEs and a dependency relationship between data streams between the PEs)
matches the updated logical model of the streaming application A.
[0056] Further, when the first worker node and the second worker node adjust PEs distributed
on the first worker node and the second worker node and a data stream between the
PEs, data that is being processed may be lost, and therefore the data needs to be
recovered. Specifically, in an embodiment, the streaming application upgrading method
further includes:
S804: The master node determines, according to a dependency relationship between an
input stream and an output stream of the PE related to the to-be-adjusted data stream,
a target PE that needs to perform data recovery and a checkpoint (checkpoint) for the
target PE performing data recovery; and delivers a data recovery instruction to a
worker node at which the target PE is located, where the data recovery instruction
is used to instruct the target PE to recover data according to the checkpoint; and
after the master node determines that the first worker node and the second worker
node complete adjustment, and the PEs distributed on the first worker node and the
second worker node get ready, the master node triggers the target PE to input the
recovered data to a downstream PE of the target PE for processing. It should be noted
that, the master node may perceive a status of a PE on each worker node in the stream
computing system by actively sending a query message, or a worker node may report
a status of each PE distributed on the worker node to the master node, where a status
of a PE includes a running state, a ready state and a stopped state. When a channel
between a PE and an upstream or downstream PE is established successfully, it indicates
that the PE is in the ready state, and the PE may receive and process a data stream.
[0057] In a process of updating or upgrading the streaming application, adjustment of a
data stream is generally accompanied by adjustment of PE deployment, and when the PE
deployment is adjusted, some data may still be being processed. Therefore, to ensure
that data is not lost in the upgrading process, it is necessary to determine, according
to a dependency relationship between an original input stream and an original output
stream of the PE related to the to-be-adjusted data stream, a target PE that needs to
perform data recovery and a checkpoint (checkpoint) for the target PE performing data
recovery, to ensure that data that has not been completely processed by a PE before
the application is upgraded can continue to be processed after the upgrading is completed,
where the data that needs to be recovered herein generally refers to a tuple.
[0058] In a specific embodiment, as shown in FIG. 12, an input/output relationship of a
logical submodel including {PE1, PE2, PE3, PE4, PE6, PE7} related to a to-be-adjusted
data stream is as follows: after tuples i1, i2, i3 and i4 are input from PE1 to PE2,
the tuples i1, i2, i3 and i4 are processed by PE2 to obtain tuples k1, k2, k3 and j1;
the tuples k1, k2, and k3 are then input to PE4 and processed to obtain m1; the tuple
j1 is input to PE3 and processed to obtain l1; and PE6 processes m1 to obtain O2, and
processes l1 to obtain O1. Based on the foregoing input/output relationship, a dependency
relationship between an input stream and an output stream of a to-be-adjusted PE may
be obtained by means of analysis, as shown in FIG. 13:
O1 depends on an input l1 of PE6, l1 depends on j1, and j1 depends on i2; therefore, for an entire logical submodel, the output O1 of PE6 depends on the input i2 of PE2; and
O2 depends on the input m1 of PE6, m1 depends on inputs k1, k2 and k3 of PE4, and k1, k2 and k3 also depend on i1, i3 and i4; therefore, for the entire logical submodel, the output O2 of PE6 depends on the inputs i1, i3 and i4 of PE2. It can be known by using the foregoing dependency relationship obtained by
means of analysis that, PE2, PE3, PE4, and PE6 all depend on an output of PE1; therefore,
when the first worker node and the second worker node adjust the PEs distributed on
the first worker node and the second worker node and a data stream between the PEs,
data in PE2, PE3, PE4, and PE6 may not be completely processed, and PE1 needs to
recover the data, that is, PE1 is the target PE.
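The FIG. 13 analysis amounts to walking the tuple dependency edges backwards from an unfinished output to the source tuples that must be recovered, and hence to the PE that emitted them. In the sketch below the exact pairing of k1/k2/k3 with i1/i3/i4 is illustrative, since the text only states the collective dependency.

```python
DEPENDS_ON = {                       # tuple -> tuples it was derived from
    "O1": ["l1"], "l1": ["j1"], "j1": ["i2"],
    "O2": ["m1"], "m1": ["k1", "k2", "k3"],
    "k1": ["i1"], "k2": ["i3"], "k3": ["i4"],
}
EMITTED_BY = {"i1": "PE1", "i2": "PE1", "i3": "PE1", "i4": "PE1"}

def tuples_to_recover(output):
    """Source tuples that an unfinished output transitively depends on."""
    pending, sources = [output], set()
    while pending:
        t = pending.pop()
        if t in DEPENDS_ON:
            pending.extend(DEPENDS_ON[t])
        else:
            sources.add(t)           # no further dependency: a source tuple
    return sources

def target_pes_for(output):
    return {EMITTED_BY[t] for t in tuples_to_recover(output)}
```

For an unfinished O2, the traversal yields i1, i3 and i4, all emitted by PE1, which is therefore the target PE.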
[0059] Further, it may be determined, according to latest status data backed up by a PE
related to a to-be-adjusted data stream when the first worker node and the second
worker node adjust the PEs distributed on the first worker node and the second worker
node and the data stream between the PEs, whether data that is input to the PE related
to the to-be-adjusted data stream has been completely processed and is output to a
downstream PE, and therefore a checkpoint (checkpoint) for the target PE performing
data recovery may be determined. It should be noted that, status data of a PE is used
to denote a status in which a PE processes data, and content specifically included
in the status data is well-known by persons skilled in the art. For example, the status
data may include one or more types of the following: cache data in a tuple receiving
queue, cache data on a message channel, and data generated by a PE in a process of
processing one or more common tuples in a receiving queue of the PE (such as a processing
result of a common tuple that is currently processed and intermediate process data).
It should be noted that, data recovery does not need to be performed on an added data
stream, and therefore when a checkpoint for performing data recovery and a target
PE that needs to perform data recovery are determined, neither status information
of a PE related to a to-be-added data stream, nor a dependency relationship between
an input stream and an output stream of the PE related to the to-be-added data stream
needs to be used. For example, in an embodiment, if the to-be-adjusted data stream
includes a to-be-updated data stream, a to-be-deleted data stream, and a to-be-added
data stream, a checkpoint for performing data recovery may be determined according
to only status data of a PE related to the to-be-updated data stream and the to-be-deleted
data stream, and a target PE that needs to perform data recovery may be determined
according to only a dependency relationship between an input stream and an output
stream of the PE related to the to-be-updated data stream and the to-be-deleted data
stream. Similarly, if the to-be-adjusted data stream includes a to-be-updated data
stream and a to-be-added data stream, a checkpoint for performing data recovery and
a target PE that needs to perform data recovery may be determined according to only
status data of a PE related to the to-be-updated data stream, and a dependency relationship
between an input stream and an output stream of the PE related to the to-be-updated
data stream.
[0060] It should be noted that, in an embodiment of the present invention, status data of
a PE is periodically backed up, that is, the stream computing system periodically
triggers each PE to back up status data of the PE, and after receiving a checkpoint
(checkpoint) event, the PE backs up current status data of the PE, records the checkpoint,
and clears expired data. It can be understood by persons skilled in the art that,
a checkpoint may be understood as a record point of data backup or an index of backup
data, one checkpoint corresponds to one data backup operation, data backed up at different
moments has different checkpoints, and data backed up at a checkpoint may be queried
and obtained by using the checkpoint. In another embodiment of the present invention,
status data may be backed up by using an output triggering mechanism (triggered by
an output of a PE). As shown in FIG. 14, when a PE completes processing on input streams
Input_Stream1 to Input_Stream5, and outputs a processing result Output_Stream1, a
triggering module triggers a status data processing module, and the status data processing
module then starts a new checkpoint to record latest status data of the PE into a
memory or a magnetic disk. Such a triggering manner is precise and effective, has
higher efficiency compared with a periodic triggering manner, and can avoid excessive
resource consumption. Further, the status data processing module may further clear
historical data recorded at a previous checkpoint, thereby reducing intermediate data
and effectively saving storage space.
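The output-triggered backup of FIG. 14 can be sketched as follows: emitting a processing result starts a new checkpoint that records the latest status data and clears the data recorded at the previous checkpoint. The class and method names are assumptions.

```python
import itertools

class OutputTriggeredCheckpointer:
    def __init__(self):
        self._ids = itertools.count(1)
        self._backups = {}            # checkpoint -> status data snapshot
        self.latest = None

    def on_output(self, status_data):
        # called by the triggering module after the PE outputs a result
        checkpoint = next(self._ids)
        self._backups[checkpoint] = dict(status_data)
        if self.latest is not None:
            del self._backups[self.latest]   # clear historical data
        self.latest = checkpoint
        return checkpoint

    def recover(self, checkpoint):
        # data backed up at a checkpoint is queried by using the checkpoint
        return self._backups[checkpoint]
```

Keeping only the latest snapshot mirrors the space saving described above; a periodic scheme would instead call `on_output` from a timer regardless of whether the PE produced a result.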
[0061] By using the situation shown in FIG. 12 as an example, the following describes in
detail a process of determining, according to a dependency relationship between an
input stream and an output stream of a PE and status data, a target PE that needs
to perform data recovery and a checkpoint (checkpoint) for the target PE performing
data recovery. If it is determined, according to status data of {PE1, PE2, PE3, PE4,
PE6, PE7} related to a to-be-adjusted data stream, that PE6 has not completed processing
on a tuple m1, or that O2 obtained after processing is performed on the tuple m1 has
not been sent to PE7, a downstream PE of PE6, it may be determined according to the
foregoing dependency relationship between an input stream and an output stream that
i1, i3, and i4, on which O2 depends, need to be recovered, and that PE1, which outputs
i1, i3, and i4, should complete data recovery; that is, the target PE that needs to
recover data is PE1, and therefore a checkpoint at which i1, i3, and i4 may be recovered
may be determined. In this way, before the first worker node and the second worker
node adjust deployment of PEs on the first worker node and the second worker node,
the target PE may recover the data i1, i3, and i4 according to the determined checkpoint;
and after the first worker node and the second worker node complete adjustment, and
the PEs distributed on the first worker node and the second worker node get ready,
the target PE sends the recovered data i1, i3, and i4 to a downstream PE of the target
PE for processing, thereby ensuring that data loss does not occur in the upgrading
process, and achieving an objective of lossless upgrading.
[0062] Optionally, before performing the steps of the foregoing streaming application upgrading
method, the master node may further configure multiple PEs according to the initial
logical model of the streaming application to process data of the streaming application.
[0063] According to the streaming application upgrading method provided in this embodiment
of the present invention, a logical model of a streaming application is compared with
an updated logical model of the streaming application, to dynamically determine a
to-be-adjusted data stream, and a corresponding upgrading instruction is generated
and delivered to a worker node, to complete online upgrading of the streaming application,
thereby ensuring that an original service does not need to be interrupted in an application
upgrading process; and further, data is recovered in the upgrading process, to ensure
that key data is not lost, and service running is not affected.
[0064] Based on the foregoing method and system embodiments, an embodiment of the present
invention further provides a master node in a stream computing system, where the master
node may be a computer or a server; and the stream computing system further includes
at least one worker node. Assuming that a streaming application A is deployed in the
stream computing system, multiple process elements (PE) are distributed on one or
more worker nodes of the at least one worker node, and are configured to process data
of the streaming application A, where a logical model of the streaming application
A is used to denote the multiple PEs processing the data of the streaming application
and a direction of a data stream between the multiple PEs; and assuming that the initial
logical model of the streaming application A is D1, after the streaming application
A is upgraded or updated, the logical model of the streaming application A is correspondingly
updated (assuming that the updated logical model is D2). As shown in FIG. 15, the master
node 30 includes:
an obtaining and comparing module 301, configured to: in a case in which the streaming
application A is updated, obtain the updated logical model D2 of the streaming application
A, and determine a to-be-adjusted data stream by comparing the updated logical model
D2 with the initial logical model D1;
an upgrading instruction generating module 302, configured to generate an upgrading
instruction according to the to-be-adjusted data stream; and
a sending module 303, configured to deliver the generated upgrading instruction to
a first worker node, so that the first worker node adjusts, according to an indication
of the upgrading instruction, a direction of a data stream between PEs distributed
on the first worker node, where the first worker node is one or more worker nodes
of the at least one worker node included in the stream computing system, and the first
worker node is a worker node at which a PE related to the to-be-adjusted data stream
is located.
[0065] It should be noted that, there may be one or more to-be-adjusted data streams in
this embodiment of the present invention, which depends on a specific situation. PEs
related to each to-be-adjusted data stream specifically refer to a source PE and a
destination PE of the to-be-adjusted data stream, where the source PE of the to-be-adjusted
data stream is a PE that outputs the to-be-adjusted data stream, the destination PE
of the to-be-adjusted data stream is a PE receiving the to-be-adjusted data stream
or a downstream PE of the source PE of the to-be-adjusted data stream.
[0066] According to the master node in the stream computing system that is provided in this
embodiment of the present invention, a logical model of a streaming application is
compared with an updated logical model of the streaming application, to dynamically
determine a to-be-adjusted data stream, and a corresponding upgrading instruction
is generated according to the to-be-adjusted data stream and delivered to a worker
node, thereby upgrading the streaming application in the stream computing system online
without interrupting a service.
[0067] Further, specific processing of the obtaining and comparing module 301 varies with
a type of a difference between the updated logical model and the initial logical model.
For example, in an exemplary embodiment, the obtaining and comparing module 301 is
specifically configured to:
compare the initial logical model of the streaming application A with the updated
logical model, to determine the to-be-adjusted data stream, where the PEs denoted
by the initial logical model D1 of the streaming application A are the same as PEs
denoted by the updated logical model D2.
[0068] In another exemplary embodiment, the obtaining and comparing module 301 is specifically
configured to: compare the initial logical model D1 with the updated logical model
D2, to determine a to-be-adjusted PE and the to-be-adjusted data stream, where the
PEs denoted by the initial logical model D1 of the streaming application A are not
completely the same as PEs denoted by the updated logical model D2. Correspondingly,
in this case, the upgrading instruction generating module 302 is specifically configured
to: generate a first upgrading instruction according to the to-be-adjusted data stream
determined by the obtaining and comparing module 301; and generate a second upgrading
instruction according to the to-be-adjusted PE determined by the obtaining and comparing
module 301. The sending module 303 is specifically configured to: deliver the first
upgrading instruction to a first worker node, and deliver the second upgrading instruction
to a second worker node, where the second worker node includes a worker node at which
the to-be-adjusted PE is located, the first upgrading instruction is used to instruct
the first worker node to adjust the direction of the data stream between the PEs distributed
on the first worker node, and the second upgrading instruction is used to instruct
the second worker node to adjust a quantity of PEs distributed on the second worker
node.
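The split into a first upgrading instruction (stream adjustments for the first worker node) and a second upgrading instruction (PE quantity adjustments for the second worker node) can be sketched as follows; the tuple encoding and the function name are assumptions for illustration, not a prescribed wire format.

```python
# Illustrative sketch: from the comparison result, build a first
# upgrading instruction (stream operations) and a second upgrading
# instruction (PE operations). Operations are encoded here as
# (operation name, argument) tuples purely for exposition.

def build_instructions(diff):
    first = ([("add_stream", s) for s in sorted(diff["streams_to_add"])] +
             [("delete_stream", s) for s in sorted(diff["streams_to_delete"])])
    second = ([("add_pe", p) for p in sorted(diff["pes_to_add"])] +
              [("delete_pe", p) for p in sorted(diff["pes_to_delete"])])
    return first, second

diff = {"streams_to_add": {("PE2", "PE4")},
        "streams_to_delete": {("PE2", "PE3")},
        "pes_to_add": {"PE4"},
        "pes_to_delete": set()}
first, second = build_instructions(diff)
print(first)   # [('add_stream', ('PE2', 'PE4')), ('delete_stream', ('PE2', 'PE3'))]
print(second)  # [('add_pe', 'PE4')]
```

When the PEs of the two models are identical, `second` is empty and only the first upgrading instruction needs to be delivered, matching the simpler case in paragraph [0067].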
[0069] Further, in an exemplary embodiment, the master node 30 further includes:
a data recovery module 304, configured to determine, according to a dependency relationship
between an input stream and an output stream of the PE related to the to-be-adjusted
data stream, a target PE that needs to perform data recovery and a checkpoint for
the target PE performing data recovery, where
the sending module 303 is further configured to: after the data recovery module 304
determines the target PE and the checkpoint, deliver a data recovery instruction to
a worker node at which the target PE is located, where the data recovery instruction
is used to instruct the target PE to recover data according to the checkpoint. It
can be understood that, the data recovery instruction is constructed according to
the target PE and the checkpoint that are determined by the data recovery module 304,
and includes information indicating the checkpoint.
[0070] Correspondingly, the master node 30 further includes an input triggering module 305,
configured to: after it is determined that the first worker node completes adjustment,
and the PEs distributed on the first worker node all get ready, trigger the target
PE determined by the data recovery module 304 to input the recovered data to a downstream
PE of the target PE for processing.
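The recovery sequence of paragraphs [0069] and [0070] — recover data at a checkpoint, hold it until adjustment completes, then replay it downstream — can be sketched as below. The class, the checkpoint store, and the callback names are assumptions introduced for illustration only.

```python
# Hedged sketch of the target PE's role in data recovery: restore data
# recorded at a checkpoint, buffer it, and forward it downstream only
# after the master signals that the worker nodes completed adjustment
# and the PEs are ready.

class TargetPE:
    def __init__(self, checkpoint_store):
        # checkpoint_store: mapping from checkpoint id to buffered data.
        self.checkpoint_store = checkpoint_store
        self.recovered = []

    def on_data_recovery_instruction(self, checkpoint_id):
        # Recover the data recorded at the indicated checkpoint.
        self.recovered = list(self.checkpoint_store[checkpoint_id])

    def on_trigger_input(self, downstream):
        # Invoked only after adjustment completes and PEs are ready:
        # replay the recovered data to the downstream PE, then clear it.
        for item in self.recovered:
            downstream.append(item)
        self.recovered = []

store = {"ckpt-7": ["i1", "i3", "i4"]}
pe = TargetPE(store)
pe.on_data_recovery_instruction("ckpt-7")
downstream = []
pe.on_trigger_input(downstream)
print(downstream)  # ['i1', 'i3', 'i4']
```

Deferring the replay until the trigger arrives is what prevents the recovered data from being sent to a PE that has not yet been created or rewired, which is the lossless-upgrading property the embodiment claims.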
[0071] According to the master node in the stream computing system that is provided in this
embodiment of the present invention, a logical model of a streaming application is
compared with an updated logical model of the streaming application, to dynamically
determine a to-be-adjusted data stream, and a corresponding upgrading instruction
is generated and delivered to a worker node, to complete online upgrading of the streaming
application, thereby ensuring that an original service does not need to be interrupted
in an application upgrading process; and further, data is recovered in the upgrading
process, and therefore key data is not lost, and service running is not affected.
[0072] The master node in the stream computing system that is provided in the present invention
is configured to implement the streaming application upgrading method in the foregoing
method embodiment. For specific implementation of the master node, refer to the foregoing
method embodiment, and details are not described herein again.
[0073] An embodiment of the present invention further provides a stream computing system,
configured to implement a streaming application upgrading method provided in an embodiment
of the present invention. As shown in FIG. 16, the stream computing system includes
a master node 30 and at least one worker node (such as worker nodes 31 to 34 in FIG.
16). The master node 30 configures, according to an initial logical model of a streaming
application, multiple process elements (PE) to process a data stream of the streaming
application, and the initial logical model of the streaming application is used to
denote the multiple PEs processing data of the streaming application and a direction
of a data stream between the multiple PEs. As shown in FIG. 16, the configured multiple
PEs are distributed on one or more worker nodes. The master node 30 is configured
to: in a case in which a streaming application A is updated, obtain an updated logical
model of the streaming application, and determine a to-be-adjusted data stream by
comparing the updated logical model with an initial logical model; generate an upgrading
instruction according to the to-be-adjusted data stream; and deliver the generated upgrading
instruction to a first worker node, where the first worker node is one or more worker
nodes of the at least one worker node included in the stream computing system, and
the first worker node is a worker node at which a PE related to the to-be-adjusted
data stream is located.
[0074] The first worker node is configured to receive the upgrading instruction sent by
the master node 30, and adjust, according to an indication of the upgrading instruction,
the direction of the data stream between the PEs distributed on the first worker node.
[0075] According to the stream computing system provided in this embodiment of the present
invention, a logical model of a streaming application is compared with an updated
logical model of the streaming application, to dynamically determine a to-be-adjusted
data stream, and a corresponding upgrading instruction is generated and delivered
to a worker node, to complete online upgrading of the streaming application, thereby
ensuring that an original service does not need to be interrupted in an application
upgrading process.
[0076] Specifically, in an embodiment, the upgrading instruction includes at least one of
the following: an instruction for adding a PE, an instruction for deleting a PE, an
instruction for updating a stream, an instruction for deleting a stream, and an instruction
for adding a stream. Correspondingly, after receiving the upgrading instruction, the
first worker node performs at least one of the following operations: adding a process
element, deleting a process element, updating a stream, deleting a stream, and adding
a stream, so that PE deployment after the foregoing operation is performed (including
a quantity of PEs and a dependency relationship between data streams between the PEs)
matches the updated logical model of the streaming application A.
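The worker-side handling of the operation types listed above can be sketched as a simple dispatch; the in-memory representation of the worker's state and the encoding of an update-stream operation as an (old edge, new edge) pair are assumptions for illustration.

```python
# Illustrative sketch: a worker node applies the operations carried by
# an upgrading instruction (add/delete a PE, add/delete/update a stream)
# to its local view of PEs and streams.

def apply_upgrading_instruction(state, ops):
    # state: {"pes": set of PE names, "streams": set of (src, dst) edges}
    for op, arg in ops:
        if op == "add_pe":
            state["pes"].add(arg)
        elif op == "delete_pe":
            state["pes"].discard(arg)
        elif op == "add_stream":
            state["streams"].add(arg)
        elif op == "delete_stream":
            state["streams"].discard(arg)
        elif op == "update_stream":
            old, new = arg  # redirect an existing stream to a new edge
            state["streams"].discard(old)
            state["streams"].add(new)
    return state

state = {"pes": {"PE1", "PE2", "PE3"},
         "streams": {("PE1", "PE2"), ("PE2", "PE3")}}
ops = [("add_pe", "PE4"),
       ("update_stream", (("PE2", "PE3"), ("PE2", "PE4"))),
       ("add_stream", ("PE4", "PE3"))]
apply_upgrading_instruction(state, ops)
print(state["streams"] == {("PE1", "PE2"), ("PE2", "PE4"), ("PE4", "PE3")})  # True
```

After the operations are applied, the worker's local PE set and stream edges match the portion of the updated logical model deployed on that worker, which is the matching condition stated above.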
[0077] Preferably, in an embodiment, the master node is specifically configured to: compare
the initial logical model of the streaming application with the updated logical model,
to determine the to-be-adjusted data stream, where the PEs denoted by the initial
logical model of the streaming application are the same as PEs denoted by the updated
logical model.
[0078] Preferably, in another embodiment, the master node is specifically configured to:
compare the initial logical model of the streaming application with the updated logical
model, to determine a to-be-adjusted PE and the to-be-adjusted data stream, where
the PEs denoted by the initial logical model of the streaming application are not
completely the same as PEs denoted by the updated logical model; generate a first
upgrading instruction according to the to-be-adjusted data stream; generate a second
upgrading instruction according to the to-be-adjusted PE; and deliver the first upgrading
instruction to a first worker node, and deliver the second upgrading instruction to
a second worker node, where the first worker node is a worker node at which a PE related
to the to-be-adjusted data stream is located, and the second worker node includes
a worker node at which the to-be-adjusted PE is located. Correspondingly, the first
worker node is specifically configured to receive the first upgrading instruction
sent by the master node 30, and adjust, according to an indication of the first upgrading
instruction, the direction of the data stream between the PEs distributed on the first
worker node; and the second worker node is configured to receive the second upgrading
instruction sent by the master node 30, and adjust, according to an indication of
the second upgrading instruction, a quantity of PEs distributed on the second worker
node.
[0079] Preferably, in another embodiment, the master node 30 is further configured to determine,
according to a dependency relationship between an input stream and an output stream
of the PE related to the to-be-adjusted data stream, a target PE that needs to perform
data recovery and a checkpoint for the target PE performing data recovery;
deliver a data recovery instruction to a worker node at which the target PE is located,
where the data recovery instruction is used to instruct the target PE to recover data
according to the checkpoint; and after it is determined that the first worker node
completes adjustment, and the PEs distributed on the first worker node all get ready,
trigger the target PE to input the recovered data to a downstream PE of the target
PE for processing.
[0080] It should be noted that, the stream computing system provided in the present invention
is configured to implement the streaming application upgrading method in the foregoing
method embodiment. For specific implementation of the stream computing system, refer
to the foregoing method embodiment, and details are not described herein again. A
process element (PE) in this embodiment of the present invention may exist in a form
of software, such as a process, a thread, or a software function module, or may exist
in a form of hardware, such as a processor core, or a logic circuit that has a data
processing capability, and the functions described in this embodiment of the present
invention are implemented by reading executable code or service processing logic in
a memory, which is not particularly limited in the present invention.
[0081] An embodiment of the present invention further provides a master node in a stream
computing system, where the master node may be a computer or a server. FIG. 17 is
a schematic structural diagram of a master node 40 according to an embodiment of the
present invention. The master node 40 may include an input device 410, an output device
420, a processor 430, and a memory 440.
[0082] The master node 40 provided in this embodiment of the present invention is applied
to the stream computing system, the stream computing system further includes a worker
node, and a streaming application is deployed in the stream computing system.
[0083] The memory 440 may include a read-only memory and a random access memory, and provides
an instruction and data to the processor 430. A part of the memory 440 may further
include a non-volatile random access memory (NVRAM).
[0084] The memory 440 stores an operation instruction, an operating system (including various
system programs used to implement various basic services and process hardware-based
tasks), an executable module, or a data structure, or a subset thereof, or an extension
set thereof.
[0085] In this embodiment of the present invention, after the streaming application is updated,
the processor 430 performs the following operations by invoking the operation instruction
stored in the memory 440 (the operation instruction may be stored in the operating
system):
obtaining, by using the input device 410, an updated logical model of a streaming application,
and determining a to-be-adjusted data stream by comparing the updated logical model with
an initial logical model; generating an upgrading instruction according to the to-be-adjusted
stream; and delivering the generated upgrading instruction to a first worker node,
where the first worker node is one or more worker nodes of at least one worker node
included in the stream computing system, and the first worker node is a worker node
at which a PE related to the to-be-adjusted data stream is located.
[0086] According to the master node provided in this embodiment of the present invention,
a logical model of a streaming application is compared with an updated logical model
of the streaming application, to dynamically determine a to-be-adjusted data stream,
and a corresponding upgrading instruction is generated and delivered to a worker node,
to complete online upgrading of the streaming application, thereby ensuring that an
original service does not need to be interrupted in an application upgrading process.
[0087] The processor 430 controls an operation of the master node 40, and
the processor 430 may be further referred to as a CPU (Central Processing Unit, central
processing unit). The memory 440 may include a read-only memory and a random access
memory, and provides an instruction and data to the processor 430. A part of the memory
440 may further include a non-volatile random access memory (NVRAM). In a specific
application, components of the master node 40 are coupled together
by using a bus system 450. In addition to a data bus, the bus system 450 may further
include a power supply bus, a control bus, a status signal bus, and the like. However,
for clear description, various types of buses in the figure are marked as the bus
system 450.
[0088] The method disclosed in the foregoing embodiment of the present invention may be
applied to the processor 430, or be implemented by the processor 430. The processor
430 may be an integrated circuit chip, and has a signal processing capability. During
implementation, the steps of the foregoing method may be implemented by using an integrated
logic circuit of hardware in the processor 430 or implemented by using an instruction
in a software form. The processor 430 may be a general purpose processor, a digital
signal processor (DSP), an application-specific integrated circuit (ASIC), a field
programmable gate array (FPGA) or another programmable logical device, a discrete
gate or a transistor logical device, or a discrete hardware component. The processor
430 may implement or execute methods, steps, and logical block diagrams disclosed
in the embodiments of the present invention. The general purpose processor may be a
microprocessor, or may be any conventional processor or the like. Steps
of the methods disclosed with reference to the embodiments of the present invention
may be directly executed and completed by a hardware decoding processor, or may be
executed and completed by using a combination of hardware and software modules in
the decoding processor. The software module may be located in a mature storage medium
in the field, such as a random access memory, a flash memory, a read-only memory,
a programmable read-only memory, an electrically-erasable programmable memory, or
a register. The storage medium is located in the memory 440, and the processor 430
reads information in the memory 440 and completes the steps in the foregoing methods
in combination with hardware of the processor 430.
[0089] It should be understood that, the master node and the stream computing system disclosed
in several embodiments provided in this application may be further implemented in
other manners. For example, the apparatus embodiments described above are merely exemplary.
[0090] The units described as separate parts may or may not be physically separate, and
parts displayed as units may or may not be physical units, may be located in one position,
or may be distributed on a plurality of network units. Some or all of the units may
be selected according to actual needs to achieve the objectives of the solutions of
the embodiments.
[0091] In addition, functional units in network devices provided by the embodiments of the
present invention may be integrated into one processing unit, or each of the units
may exist alone physically, or two or more units are integrated into one unit. The
integrated unit may be implemented in a form of hardware, or may be implemented in
a form of a software functional unit.
[0092] When the integrated unit is implemented in the form of a software functional unit
and sold or used as an independent product, the integrated unit may be stored in a
computer-readable storage medium. Based on such an understanding, the technical solutions
of the present invention essentially, or the part contributing to the prior art, or
all or some of the technical solutions may be implemented in the form of a software
product. The software product is stored in a storage medium and includes several instructions
for instructing a computer device (which may be a personal computer, a server, or
a network device) to perform all or some of the steps of the methods described in
the embodiments of the present invention. The foregoing storage medium includes any
medium that can store program code, such as a USB flash drive, a removable hard disk,
a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access
Memory), a magnetic disk, or an optical disc.
[0093] Finally, it should be noted that the foregoing embodiments are merely intended for
describing the technical solutions of the present invention, but not for limiting
the present invention. Although the present invention is described in detail with
reference to the foregoing embodiments, persons of ordinary skill in the art should
understand that they may still make modifications to the technical solutions described
in the foregoing embodiments or make equivalent replacements to some technical features
thereof, without departing from the scope of the technical solutions of the embodiments
of the present invention.
1. A streaming application upgrading method, wherein the method is applied to a master
node in a stream computing system, and the stream computing system comprises the master
node and at least one worker node, wherein multiple process elements PE are distributed
on one or more worker nodes of the at least one worker node, and are configured to
process data of a streaming application deployed in the stream computing system, wherein
an initial logical model of the streaming application is used to denote the multiple
PEs processing the data of the streaming application and a direction of a data stream
between the multiple PEs; and the method comprises:
in a case in which the streaming application is updated, obtaining, by the master
node, an updated logical model of the streaming application, and determining a to-be-adjusted
data stream by comparing the initial logical model of the streaming application with
the updated logical model;
generating an upgrading instruction according to the to-be-adjusted data stream; and
delivering the upgrading instruction to a first worker node, wherein the first worker
node is a worker node at which a PE related to the to-be-adjusted data stream is located,
and the upgrading instruction is used to instruct the first worker node to adjust
a direction of a data stream between PEs distributed on the first worker node.
2. The upgrading method according to claim 1, wherein the determining a to-be-adjusted
data stream by comparing the initial logical model of the streaming application with
the updated logical model comprises:
comparing the initial logical model of the streaming application with the updated
logical model, to determine the to-be-adjusted data stream, wherein the PEs denoted
by the initial logical model of the streaming application are the same as PEs denoted
by the updated logical model.
3. The upgrading method according to claim 1, wherein the determining a to-be-adjusted
data stream by comparing the initial logical model of the streaming application with
the updated logical model comprises:
comparing the initial logical model of the streaming application with the updated
logical model, to determine a to-be-adjusted PE and the to-be-adjusted data stream,
wherein the PEs denoted by the initial logical model of the streaming application
are not completely the same as PEs denoted by the updated logical model;
the generating an upgrading instruction according to the to-be-adjusted data stream
comprises:
generating a first upgrading instruction according to the to-be-adjusted data stream;
and generating a second upgrading instruction according to the to-be-adjusted PE;
and
the delivering the upgrading instruction to a first worker node comprises:
delivering the first upgrading instruction to the first worker node, and delivering
the second upgrading instruction to a second worker node, wherein the second worker
node comprises a worker node at which the to-be-adjusted PE is located; and the first
upgrading instruction is used to instruct the first worker node to adjust the direction
of the data stream between the PEs distributed on the first worker node, and the second
upgrading instruction is used to instruct the second worker node to adjust a quantity
of PEs distributed on the second worker node.
4. The upgrading method according to any one of claims 1 to 3, further comprising:
determining, by the master node according to a dependency relationship between an
input stream and an output stream of the PE related to the to-be-adjusted data stream,
a target PE that needs to perform data recovery and a checkpoint for the
target PE performing data recovery;
delivering a data recovery instruction to a worker node at which the target PE is
located, wherein the data recovery instruction is used to instruct the target PE to
recover data according to the checkpoint; and
after it is determined that the first worker node completes adjustment, and the PEs
distributed on the first worker node get ready, triggering, by the master node, the
target PE to input the recovered data to a downstream PE of the target PE for processing.
5. The upgrading method according to claim 4, wherein the to-be-adjusted data stream
comprises: a to-be-updated data stream and a to-be-deleted data stream; and the determining,
by the master node according to a dependency relationship between an input stream
and an output stream of the PE related to the to-be-adjusted data stream, a target
PE that needs to perform data recovery and a checkpoint for the target
PE performing data recovery comprises:
determining, by the master node according to status data of a PE related to the to-be-updated
data stream and the to-be-deleted data stream, a checkpoint for performing data recovery;
and determining, according to a dependency relationship between an input stream and
an output stream of a PE related to the to-be-updated data stream and the to-be-deleted
data stream, a target PE that needs to perform data recovery, wherein status data
of each PE is backed up by the PE when being triggered by an output event, and is
used to indicate a status in which the PE processes data.
6. The upgrading method according to any one of claims 3 to 5, wherein the to-be-adjusted
PE comprises a to-be-added PE; the second worker node is a worker node selected by
the master node according to a load status of each worker node in the stream computing
system; and the second upgrading instruction is used to instruct the second worker
node to create the to-be-added PE.
7. The upgrading method according to any one of claims 3 to 6, wherein the to-be-adjusted
PE comprises a to-be-deleted PE; the second worker node is a worker node at which
the to-be-deleted PE is located; and the second upgrading instruction is used to instruct
the second worker node to delete the to-be-deleted PE.
8. The upgrading method according to any one of claims 1 to 7, further comprising:
configuring the multiple PEs according to the initial logical model of the streaming
application to process the data of the streaming application.
9. The upgrading method according to any one of claims 1 to 8, wherein the initial logical
model of the streaming application is denoted by using a directed acyclic graph DAG.
10. A master node in a stream computing system, wherein the stream computing system comprises
the master node and at least one worker node, wherein multiple process elements PE
are distributed on one or more worker nodes of the at least one worker node, and are
configured to process data of a streaming application deployed in the stream computing
system, wherein an initial logical model of the streaming application is used to denote
the multiple PEs processing the data of the streaming application and a direction
of a data stream between the multiple PEs; and the master node comprises:
an obtaining and comparing module, configured to: in a case in which the streaming
application is updated, obtain an updated logical model of the streaming application,
and determine a to-be-adjusted data stream by comparing the initial logical model
of the streaming application with the updated logical model;
an upgrading instruction generating module, configured to generate an upgrading instruction
according to the to-be-adjusted data stream; and
a sending module, configured to deliver the upgrading instruction to a first worker
node, wherein the first worker node is a worker node at which a PE related to the
to-be-adjusted data stream is located, and the upgrading instruction is used to instruct
the first worker node to adjust a direction of a data stream between PEs distributed
on the first worker node.
11. The master node according to claim 10, wherein the obtaining and comparing module
is specifically configured to:
compare the initial logical model of the streaming application with the updated logical
model, to determine the to-be-adjusted data stream, wherein the PEs denoted by the
initial logical model of the streaming application are the same as PEs denoted by
the updated logical model.
12. The master node according to claim 10, wherein the obtaining and comparing module
is specifically configured to: compare the initial logical model of the streaming
application with the updated logical model, to determine a to-be-adjusted PE and the
to-be-adjusted data stream, wherein the PEs denoted by the initial logical model of
the streaming application are not completely the same as PEs denoted by the updated
logical model;
the upgrading instruction generating module is specifically configured to generate
a first upgrading instruction according to the to-be-adjusted data stream; and generate
a second upgrading instruction according to the to-be-adjusted PE; and
the sending module is specifically configured to deliver the first upgrading instruction
to the first worker node, and deliver the second upgrading instruction to a second
worker node, wherein the second worker node comprises a worker node at which the to-be-adjusted
PE is located; and the first upgrading instruction is used to instruct the first worker
node to adjust the direction of the data stream between the PEs distributed on the
first worker node, and the second upgrading instruction is used to instruct the second
worker node to adjust a quantity of PEs distributed on the second worker node.
13. The master node according to any one of claims 10 to 12, further comprising:
a data recovery module, configured to determine, according to a dependency relationship
between an input stream and an output stream of the PE related to the to-be-adjusted
data stream, a target PE that needs to perform data recovery and a checkpoint
for the target PE performing data recovery, wherein
the sending module is further configured to deliver a data recovery instruction to
a worker node at which the target PE is located, wherein the data recovery instruction
is used to instruct the target PE to recover data according to the checkpoint; and
the master node further comprises: an input triggering module, configured to: after
it is determined that the first worker node completes adjustment, and the PEs distributed
on the first worker node get ready, trigger the target PE to input the recovered data
to a downstream PE of the target PE for processing.
14. The master node according to claim 13, wherein the to-be-adjusted data stream comprises:
a to-be-updated data stream and a to-be-deleted data stream; and the data recovery
module is specifically configured to:
determine, according to status data of a PE related to the to-be-updated
data stream and the to-be-deleted data stream, a checkpoint for performing data recovery;
and determine, according to a dependency relationship between an input stream and
an output stream of a PE related to the to-be-updated data stream and the to-be-deleted
data stream, a target PE that needs to perform data recovery, wherein status data
of each PE is backed up by the PE when the PE is triggered by an output event, and is
used to indicate a status in which the PE processes data.
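Claim 14 selects a checkpoint from the status data that each related PE backed up on output events. One simple policy, assumed here purely for illustration, is to take the newest checkpoint that every related PE has backed up, so that recovery starts from a consistent point:

```python
def choose_checkpoint(status_data, related_pes):
    """Pick the newest checkpoint sequence number that every PE related
    to the to-be-updated and to-be-deleted streams has backed up
    (a simple consistent-cut policy, assumed for illustration)."""
    common = set.intersection(*(set(status_data[pe]) for pe in related_pes))
    return max(common)

# Each PE backs up its status when triggered by an output event,
# so different PEs may have reached different checkpoints.
status = {"PE1": [1, 2, 3], "PE2": [1, 2], "PE3": [1, 2, 3]}
cp = choose_checkpoint(status, ["PE1", "PE2"])
```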
15. The master node according to any one of claims 12 to 14, wherein the to-be-adjusted
PE comprises a to-be-deleted PE; the second worker node is a worker node at which
the to-be-deleted PE is located; and the second upgrading instruction is used to instruct
the second worker node to delete the to-be-deleted PE.
16. The master node according to any one of claims 12 to 15, wherein the to-be-adjusted
PE comprises a to-be-added PE; the second worker node is a worker node selected by
the master node according to a load status of each worker node in the stream computing
system; and the second upgrading instruction is used to instruct the second worker
node to create the to-be-added PE.
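Claim 16 only states that the second worker node is selected according to the load status of each worker node; a least-loaded policy is one plausible concrete reading (names hypothetical):

```python
def select_worker_for_new_pe(load_by_worker):
    """Select the worker node to host a to-be-added PE.  The claim only
    requires selection based on load status; picking the least-loaded
    worker is an assumed policy, for illustration only."""
    return min(load_by_worker, key=load_by_worker.get)

loads = {"worker-1": 0.7, "worker-2": 0.3, "worker-3": 0.5}
chosen = select_worker_for_new_pe(loads)
```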
17. A stream computing system, comprising: a master node and at least one worker node,
wherein multiple process elements (PE) are distributed on one or more worker nodes of
the at least one worker node, and are configured to process data of a streaming application
deployed in the stream computing system, wherein an initial logical model of the streaming
application is used to denote the multiple PEs processing the data of the streaming
application and a direction of a data stream between the multiple PEs; and
the master node is configured to: in a case in which the streaming application is
updated, obtain an updated logical model of the streaming application, and determine
a to-be-adjusted data stream by comparing the initial logical model of the streaming
application with the updated logical model; generate an upgrading instruction according
to the to-be-adjusted data stream; and deliver the upgrading instruction to a first
worker node, wherein the first worker node is a worker node at which a PE related
to the to-be-adjusted data stream is located, and the upgrading instruction is used
to instruct the first worker node to adjust a direction of a data stream between PEs
distributed on the first worker node; and
the first worker node is configured to receive the upgrading instruction sent by the
master node, and adjust, according to an indication of the upgrading instruction,
the direction of the data stream between the PEs distributed on the first worker node.
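The comparison step in claim 17 amounts to diffing two DAG edge sets. The sketch below (hypothetical structure) models each logical model as a set of directed (upstream PE, downstream PE) stream edges and classifies the to-be-adjusted data streams:

```python
def diff_logical_models(initial_edges, updated_edges):
    """Diff two DAG logical models, each given as a set of directed
    (upstream_pe, downstream_pe) stream edges, into the streams that
    must be added and the streams that must be deleted."""
    to_add = updated_edges - initial_edges      # streams only in the updated model
    to_delete = initial_edges - updated_edges   # streams only in the initial model
    return to_add, to_delete

initial = {("PE1", "PE2"), ("PE2", "PE3")}
updated = {("PE1", "PE2"), ("PE2", "PE4")}
to_add, to_delete = diff_logical_models(initial, updated)
```

When the PE sets of both models coincide (claim 18), this diff yields only stream adjustments; otherwise (claim 19) the PE sets are diffed as well.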
18. The stream computing system according to claim 17, wherein in the aspect of determining
a to-be-adjusted data stream by comparing the initial logical model of the streaming
application with the updated logical model, the master node is specifically configured
to: compare the initial logical model of the streaming application with the updated
logical model, to determine the to-be-adjusted data stream, wherein the PEs denoted
by the initial logical model of the streaming application are the same as PEs denoted
by the updated logical model.
19. The stream computing system according to claim 17, wherein the master node is specifically
configured to compare the initial logical model of the streaming application with
the updated logical model, to determine a to-be-adjusted PE and the to-be-adjusted
data stream, wherein the PEs denoted by the initial logical model of the streaming
application are not completely the same as PEs denoted by the updated logical model;
generate a first upgrading instruction according to the to-be-adjusted data stream;
and generate a second upgrading instruction according to the to-be-adjusted PE; and
deliver the first upgrading instruction to the first worker node, and deliver the
second upgrading instruction to a second worker node, wherein the second worker node
comprises a worker node at which the to-be-adjusted PE is located;
the first worker node is specifically configured to receive the first upgrading instruction
sent by the master node, and adjust, according to an indication of the first upgrading
instruction, the direction of the data stream between the PEs distributed on the first
worker node; and
the second worker node is specifically configured to receive the second upgrading
instruction sent by the master node, and adjust, according to an indication of the
second upgrading instruction, a quantity of PEs distributed on the second worker node.
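For claim 19, the master node must also turn the PE-set diff into second upgrading instructions addressed to the right worker nodes. A sketch (hypothetical structure; `locate` and `pick_worker` are assumed helpers, not named in the claim):

```python
def plan_second_instructions(initial_pes, updated_pes, locate, pick_worker):
    """Turn the PE-set diff into second upgrading instructions:
    a delete instruction goes to the worker hosting each removed PE,
    and a create instruction goes to a worker chosen by load status."""
    instructions = []
    for pe in sorted(initial_pes - updated_pes):   # to-be-deleted PEs
        instructions.append((locate[pe], "DELETE_PE", pe))
    for pe in sorted(updated_pes - initial_pes):   # to-be-added PEs
        instructions.append((pick_worker(), "CREATE_PE", pe))
    return instructions

location = {"PE3": "worker-1"}
plan = plan_second_instructions({"PE1", "PE2", "PE3"},
                                {"PE1", "PE2", "PE4"},
                                location,
                                lambda: "worker-2")
```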
20. The stream computing system according to any one of claims 17 to 19, wherein the master
node is further configured to determine, according to a dependency relationship between
an input stream and an output stream of the PE related to the to-be-adjusted data
stream, a target PE that needs to perform data recovery and a checkpoint
for the target PE to perform data recovery; deliver a data recovery instruction to
a worker node at which the target PE is located, wherein the data recovery instruction
is used to instruct the target PE to recover data according to the checkpoint; and
after it is determined that the first worker node completes adjustment, and the PEs
distributed on the first worker node are ready, trigger the target PE to input the
recovered data to a downstream PE of the target PE for processing.
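Claim 20 fixes an ordering: deliver the upgrading and data recovery instructions, wait until the PEs on the first worker node are ready, and only then trigger replay downstream. That ordering can be sketched with callback placeholders (all names hypothetical):

```python
def orchestrate_upgrade(deliver_instructions, pes_ready, trigger_replay, targets):
    """Claim-20 ordering as a sketch: adjust first, confirm readiness,
    then let each target PE feed its recovered data downstream."""
    deliver_instructions()          # upgrading + data recovery instructions
    while not pes_ready():          # wait for the first worker node's PEs
        pass
    for pe in targets:
        trigger_replay(pe)          # target PE inputs recovered data downstream

calls = []
orchestrate_upgrade(lambda: calls.append("deliver"),
                    lambda: True,
                    lambda pe: calls.append("replay:" + pe),
                    ["PE2", "PE3"])
```

The busy-wait loop stands in for whatever readiness notification the worker nodes actually provide; the claim specifies only that replay is triggered after adjustment completes.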