(11) EP 3 142 011 B9
(12) CORRECTED EUROPEAN PATENT SPECIFICATION
Note: Bibliography reflects the latest situation
(15) Correction information:
Corrected version no 1 (W1 B1)
Corrections, see Claims DE
(48) Corrigendum issued on: 29.05.2019 Bulletin 2019/22
(45) Mention of the grant of the patent: 12.12.2018 Bulletin 2018/50
(22) Date of filing: 05.05.2015
(51) International Patent Classification (IPC):
(86) International application number: PCT/CN2015/078248
(87) International publication number: WO 2015/169199 (12.11.2015 Gazette 2015/45)
(54) ANOMALY RECOVERY METHOD FOR VIRTUAL MACHINE IN DISTRIBUTED ENVIRONMENT
ANOMALIENKORREKTURVERFAHREN FÜR VIRTUELLE MASCHINEN IN EINER VERTEILTEN UMGEBUNG
PROCÉDÉ DE RÉCUPÉRATION D'ANOMALIE DESTINÉ À UNE MACHINE VIRTUELLE DANS UN ENVIRONNEMENT DISTRIBUÉ
(84) Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL
NO PL PT RO RS SE SI SK SM TR
(30) Priority: 08.05.2014 CN 201410191655
(43) Date of publication of application: 15.03.2017 Bulletin 2017/11
(73) Proprietor: China Unionpay Co., Ltd
Shanghai 200135 (CN)
(72) Inventors:
- CHAI, Hongfeng
Shanghai 200135 (CN)
- LU, Zhijun
Shanghai 200135 (CN)
- ZU, Lijun
Shanghai 200135 (CN)
- YAN, Yixing
Shanghai 200135 (CN)
(74) Representative: Berkenbrink, Kai-Oliver et al
Patentanwälte Becker & Müller
Turmstrasse 22
40878 Ratingen (DE)
(56) References cited:
CN-A- 102 708 018
CN-A- 102 819 465
CN-A- 103 729 280
US-A1- 2006 080 678
CN-A- 103 559 108
JP-A- 2013 254 354
US-A1- 2011 035 620
Note: Within nine months from the publication of the mention of the grant of the European
patent, any person may give notice to the European Patent Office of opposition to
the European patent granted. Notice of opposition shall be filed in a written reasoned
statement. It shall not be deemed to have been filed until the opposition fee has been
paid. (Art. 99(1) European Patent Convention).
FIELD OF THE INVENTION
[0001] The invention relates to a virtual machine abnormity recovering method, and in particular
to a virtual machine abnormity recovering method in a distributed environment.
BACKGROUND
[0002] As computer and network applications become increasingly widespread and business
types in different fields become increasingly varied, high availability technology
for virtual machines in a distributed environment is becoming more and more important.
A virtual machine is a computer system which runs on a physical machine by way of
software simulation, has a complete hardware system function, and operates in a completely
isolated environment. High availability technology means that when a physical machine
A has a problem such as a breakdown, a virtual machine running on the physical machine
A can start up on a physical machine B without human intervention, so that continuous
running of the virtual machine is ensured.
[0003] In existing technical solutions, high availability of virtual machines in a distributed
environment is typically realized in the following manner: a logical group composed
of a plurality of physical machines is defined as a highly available unit, so that
when any physical machine in this logical group has a breakdown or other problem,
all of the virtual machines running on this physical machine start up on other
physical machines in the same logical group. Moreover, a control node detects the states
of the physical machines by heartbeat or by regularly pinging them; that is, when the
control node cannot detect a certain physical machine, this physical machine is considered
to have a problem.
[0004] However, the existing technical solutions have the following problems: (1) once
a virtual machine is allocated to a highly available group, it is considered by default
to have high availability, whether or not the business running on it is important.
Such a design therefore cannot ensure that a virtual machine which runs important
business is activated preferentially, and it also wastes resources on redundancy;
(2) since only the state of the physical machine is detected, the detection method is
simplistic and one-sided and may cause erroneous determination (e.g., if the ping function
is prohibited on a certain physical machine, the virtual machines on that physical
machine may be transferred to another physical machine even though it is running normally);
(3) since the detection of the state of a physical machine is initiated only from the
control node, the determination of the state of the physical machine is not complete
or accurate enough.
[0005] Therefore, there is a need to provide a virtual machine abnormity recovering method
which can accurately determine and efficiently handle the faults of physical machines
in a distributed environment.
[0006] The following documents are considered as the prior art:
- CN102819465A, which discloses a method for fault recovery in a virtualized environment;
- US2011/0035620A, which discloses a virtual machine infrastructure with storage domain
monitoring;
- CN103559108A, which discloses a method and system for automatically restoring a virtualized
system from a fault;
- CN103729280A, which discloses a high availability mechanism of a virtualized cloud
computing system;
- US2006/0080678A, which discloses a task distribution method for protecting servers
and tasks in a distributed system.
SUMMARY OF THE INVENTION
[0007] In order to address the problems in the prior art solutions, the invention, which
is defined in detail in the appended independent claim 1, proposes a virtual machine
abnormity recovering method which can accurately determine and efficiently handle
the faults of physical machines in a distributed environment.
[0008] The object of the invention is achieved by the following technical solution.
- A virtual machine abnormity recovering method in a distributed environment, comprising
the following steps:
(A1) running an independent computing assembly on each physical machine on which a
virtual machine resides, wherein the computing assembly periodically reports the current
running state of the corresponding physical machine to a state database;
(A2) periodically polling the state database by a highly available controller so as
to check the running state of all the physical machines in a physical machine group
under the control of the highly available controller; and
(A3) ending this checking process if the running states of all the physical machines
in the physical machine group are normal; ending this checking process and sending
an alarm by way of a log if the running states of a plurality of the physical machines
in the physical machine group are abnormal; and executing the subsequent abnormity
processing operation if the running state of only one physical machine in the physical
machine group is abnormal, so as to ensure that the virtual machines on the physical
machine whose running state is abnormal continue running normally.
[0009] In the above disclosed technical solutions, preferably, the abnormity processing
operation comprises: the highly available controller detects the connectivity of the
physical machine whose running state is abnormal to a management network, wherein
the detection is performed in the following two ways: (1) pinging the physical machine;
and (2) monitoring port number 22 of the physical machine.
[0010] In the above disclosed technical solutions, preferably, the abnormity processing
operation further comprises: ending the abnormity processing operation if it is detected
in either of the above ways that the physical machine whose running state is abnormal
has connectivity to the management network; detecting the connectivity of all the
active virtual machines running on the physical machine whose running state is abnormal
to a business network if it is detected in both of the above ways that the physical
machine whose running state is abnormal has no connectivity to the management network;
ending the abnormity processing operation if any of the active virtual machines has
connectivity to the business network; and executing a secondary voting operation if
none of the active virtual machines has connectivity to the business network, so as to
finally confirm whether the physical machine whose running state is abnormal has a fault.
[0011] In the above disclosed technical solutions, preferably, the secondary voting operation
comprises: (1) the highly available controller randomly selects several physical machines
from the physical machine group other than the physical machine whose running state
is abnormal; (2) the highly available controller instructs each selected physical
machine to detect the connectivity of the physical machine whose running state is
abnormal to the management network and/or the business network by pinging the physical
machine whose running state is abnormal and monitoring port number 22 of the physical
machine whose running state is abnormal; (3) if any of the selected physical machines
finds that the physical machine whose running state is abnormal has connectivity to
the management network or the business network, the secondary voting operation is
ended, and the result of the secondary voting operation is "the physical machine whose
running state is abnormal has no fault", whereas if all the selected physical machines
find that the physical machine whose running state is abnormal has no connectivity
to either the management network or the business network, the secondary voting operation
is ended, and the result of the secondary voting operation is "the physical machine
whose running state is abnormal has a fault", and subsequently, a virtual machine
relocating operation is executed.
[0012] In the above disclosed technical solutions, preferably, the virtual machine relocating
operation comprises: (1) the highly available controller sends a shutdown instruction
to the physical machine whose running state is abnormal via an intelligent platform
management interface (IPMI) so as to put the physical machine whose running state
is abnormal into a shutdown state, thereby destroying the virtual machines which
reside in the memory thereof; (2) the highly available controller sends a relocation
scheduling instruction to a scheduling controller; (3) after receiving the relocation
scheduling instruction, the scheduling controller selects the physical machines in
the physical machine group which have idle resources, and then sends a relocating
instruction to the selected physical machines one by one so that all the active virtual
machines that were running on the physical machine whose running state is abnormal are
relocated to the selected physical machines, wherein the virtual machines allocated
to different selected physical machines are different from each other; (4) the computing
assembly running on each of the selected physical machines relocates the virtual machine
allocated to this physical machine to this physical machine via a shared storage device.
[0013] In the above disclosed technical solutions, preferably, a user can configure a high
availability flag for individual virtual machines, and the highly available controller
determines the high availability flags of all the active virtual machines running
on the physical machine whose running state is abnormal before the virtual machine
relocating operation is executed, and executes the subsequent virtual machine relocating
operation only for virtual machines whose high availability flag has the value "activated".
[0014] In the above disclosed technical solutions, preferably, a user can configure a high
availability priority for individual virtual machines, and the highly available controller
relocates the virtual machines to be relocated one by one according to the level of
the high availability priority of each virtual machine to be relocated.
[0015] The virtual machine abnormity recovering method in a distributed environment disclosed
by the invention has the following advantages: (1) it can be ensured that the virtual
machine on which important business is running is activated and recovered preferentially,
thus saving resources; (2) due to the diversity and comprehensiveness of the network
detection, the possibility of erroneous determination is significantly reduced; (3)
since the detection of the state of a physical machine can be initiated not only by
the control node but also by other randomly selected physical machines, the state of
the physical machine can be determined more fully and accurately.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] The technical features and advantages of the invention will be better understood
by those skilled in the art with reference to the accompanying drawings, in which:
[0017] Fig. 1 is a flowchart of the virtual machine abnormity recovering method in a distributed
environment according to an embodiment of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0018] Fig. 1 is a flowchart of the virtual machine abnormity recovering method in a distributed
environment according to an embodiment of the invention. As shown in Fig. 1, the
virtual machine abnormity recovering method in a distributed environment disclosed by
the invention comprises the following steps: (A1) running an independent computing
assembly on each physical machine on which a virtual machine resides, wherein the
computing assembly periodically (e.g., once every minute) reports the current running
state of the corresponding physical machine to a state database; (A2) periodically
(e.g., once every 2 seconds) polling the state database by a highly available controller
so as to check the running state of all the physical machines in a physical machine
group under the control of the highly available controller; (A3) ending this checking
process if the running states of all the physical machines in the physical machine
group are normal; ending this checking process and sending an alarm by way of a log
if the running states of a plurality of the physical machines in the physical machine
group are abnormal; and executing the subsequent abnormity processing operation if the
running state of only one physical machine in the physical machine group is abnormal
(e.g., a certain physical machine does not report its own running state within
1 minute), so as to ensure that the virtual machines on the physical machine whose running
state is abnormal continue running normally.
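By way of illustration only, the three-way decision of step (A3) can be sketched in Python as follows. The names `Action` and `check_group`, the dictionary standing in for the state database, and the 60-second staleness threshold (taken from the 1-minute example above) are assumptions made for this sketch and are not part of the disclosed method.

```python
from enum import Enum


class Action(Enum):
    END = "end"                  # all machines normal: end this check
    ALARM = "alarm"              # several machines abnormal: log an alarm, end
    HANDLE = "handle_abnormity"  # exactly one abnormal: run abnormity processing


def check_group(last_report, now, timeout=60.0):
    """Classify one polling pass over the (hypothetical) state database.

    last_report maps a physical machine name to the timestamp, in seconds,
    of its most recent state report. A machine is treated as abnormal when
    its latest report is older than `timeout` seconds.
    """
    abnormal = sorted(pm for pm, ts in last_report.items() if now - ts > timeout)
    if not abnormal:
        return Action.END, abnormal
    if len(abnormal) > 1:
        return Action.ALARM, abnormal
    return Action.HANDLE, abnormal
```

In such a sketch the highly available controller would call `check_group` on every polling pass and start the abnormity processing operation only for the `Action.HANDLE` result.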
[0019] Preferably, in the virtual machine abnormity recovering method in a distributed environment
disclosed by the invention, the abnormity processing operation comprises: the highly
available controller detects the connectivity of the physical machine whose running
state is abnormal to a management network, wherein the detection is performed in the
following two ways: (1) pinging the physical machine; and (2) monitoring port number
22 of the physical machine.
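By way of illustration only, the two detection ways can be sketched in Python using the standard library. This sketch assumes that monitoring port 22 means testing whether a TCP connection to port 22 can be opened, and that a Linux-style `ping` command accepting the `-c` and `-W` options is available; the function names `ping_ok` and `port_open` are hypothetical.

```python
import socket
import subprocess


def ping_ok(host, timeout_s=1):
    """Return True if a single ICMP echo request to host is answered."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # No ping binary on this system: treat as "not reachable by ping".
        return False
    return result.returncode == 0


def port_open(host, port=22, timeout_s=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Refused, timed out or unroutable all count as "no connectivity".
        return False
```

Using two independent probes means that a host on which ping is prohibited can still be recognized as reachable through its port 22.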
[0020] Preferably, in the virtual machine abnormity recovering method in a distributed environment
disclosed by the invention, the abnormity processing operation further comprises:
ending the abnormity processing operation if it is detected in either of the above ways
that the physical machine whose running state is abnormal has connectivity to the
management network; detecting the connectivity of all the active virtual machines
running on the physical machine whose running state is abnormal to a business network
if it is detected in both of the above ways that the physical machine whose running state
is abnormal has no connectivity to the management network; ending the abnormity processing
operation if any of the active virtual machines has connectivity to the business
network; and executing a secondary voting operation if none of the active virtual
machines has connectivity to the business network, so as to finally confirm whether
the physical machine whose running state is abnormal has a fault.
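By way of illustration only, the decision sequence of this paragraph can be sketched in Python as a pure function over probe results that are assumed to have been collected already. The function name `abnormity_decision` and its return strings are hypothetical. The source does not state what happens when no virtual machine is active on the machine; this sketch proceeds to the secondary voting operation in that case.

```python
def abnormity_decision(ping_ok, port22_ok, vm_business_ok):
    """Decide the next step of the abnormity processing operation.

    ping_ok, port22_ok: results of the two management-network probes.
    vm_business_ok: one boolean per active virtual machine on the suspect
    machine, True if that virtual machine is reachable on the business network.
    Returns "end" or "secondary_vote".
    """
    # Either probe succeeding proves management-network connectivity.
    if ping_ok or port22_ok:
        return "end"
    # Any virtual machine still reachable on the business network
    # means the virtual machines are still serving: do nothing.
    if any(vm_business_ok):
        return "end"
    # Nothing reachable (or no active virtual machines, by assumption).
    return "secondary_vote"
```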
[0021] Preferably, in the virtual machine abnormity recovering method in distributed environment
disclosed by the invention, the secondary voting operation comprises: (1) the highly
available controller randomly selects several physical machines (e.g., three physical
machines) from the physical machine group other than the physical machine whose running
state is abnormal; (2) the highly available controller instructs each selected physical
machine to detect the connectivity of the physical machine whose running state is
abnormal to the management network and/or the business network by pinging the physical
machine whose running state is abnormal and monitoring port number 22 of the physical
machine whose running state is abnormal; (3) if any of the selected physical machines
finds that the physical machine whose running state is abnormal has connectivity to
the management network or the business network, the secondary voting operation is
ended, and the result of the secondary voting operation is "the physical machine whose
running state is abnormal has no fault", whereas if all the selected physical machines
find that the physical machine whose running state is abnormal has no connectivity
to either the management network or the business network, the secondary voting operation
is ended, and the result of the secondary voting operation is "the physical machine
whose running state is abnormal has a fault", and subsequently, a virtual machine
relocating operation is executed.
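By way of illustration only, the secondary voting operation can be sketched in Python as follows. The function name `secondary_vote`, the `probe` callback (assumed to return True when a selected physical machine reaches the suspect machine by ping or via port 22 on either network) and the default of three voters (taken from the example above) are assumptions made for this sketch. Raising an error when no other machine is available to vote is likewise a choice of this sketch, since the source does not address that case.

```python
import random


def secondary_vote(group, suspect, probe, k=3, rng=random):
    """Return "no fault" if any randomly chosen voter reaches the suspect,
    and "fault" only if every chosen voter fails to reach it."""
    candidates = [pm for pm in group if pm != suspect]
    if not candidates:
        raise ValueError("no other physical machine available to vote")
    voters = rng.sample(candidates, min(k, len(candidates)))
    for voter in voters:
        if probe(voter, suspect):
            # One successful probe is enough to clear the suspect.
            return "no fault"
    return "fault"
```

Requiring every voter to fail before declaring a fault biases the procedure against unnecessary relocation, which matches the stated aim of reducing erroneous determination.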
[0022] Preferably, in the virtual machine abnormity recovering method in a distributed environment
disclosed by the invention, the virtual machine relocating operation comprises: (1)
the highly available controller sends a shutdown instruction to the physical machine
whose running state is abnormal via an intelligent platform management interface (IPMI)
so as to put the physical machine whose running state is abnormal into a shutdown
state (i.e., it no longer provides any service to the outside), thereby destroying
the virtual machines which reside in the memory thereof (by way of example, if the
intelligent platform management interface (IPMI) is abnormal, the virtual machine
relocating operation is not stopped, but an alarm is sent in the form of a log);
(2) the highly available controller sends a relocation scheduling instruction to a
scheduling controller; (3) after receiving the relocation scheduling instruction,
the scheduling controller selects physical machines in the physical machine group
which have idle resources, and then sends a relocating instruction to the selected
physical machines one by one so that all the active virtual machines that were running
on the physical machine whose running state is abnormal are relocated to the selected
physical machines, wherein the virtual machines allocated to different selected physical
machines are different from each other; (4) the computing assembly running on each of
the selected physical machines relocates the virtual machine allocated to this physical
machine to this physical machine via a shared storage device. By way of example, in
order to ensure that, at any point in time, only one virtual machine instance is running
in the entire distributed system for each independent virtual machine image file, the
highly available controller can modify the image file saving directory of the virtual
machine so as to prevent the physical machine whose running state is abnormal from
activating the virtual machine instance during the relocating operation.
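By way of illustration only, steps (1) and (3) can be sketched in Python as follows. The source does not specify how the scheduling controller matches virtual machines to physical machines with idle resources; the first-fit rule, the single numeric capacity per machine, and the names `fence` and `plan_relocation` are assumptions made for this sketch. The `power_off` callback stands in for whatever IPMI client is actually used.

```python
import logging

log = logging.getLogger("ha_controller")


def fence(suspect, power_off):
    """Try to shut the suspect down; a failure is logged as an alarm
    and does not stop the relocating operation."""
    try:
        power_off(suspect)
        return True
    except Exception as exc:
        log.warning("IPMI shutdown of %s failed: %s", suspect, exc)
        return False


def plan_relocation(vm_demand, idle):
    """First-fit placement: each virtual machine goes to exactly one
    physical machine with enough remaining idle capacity."""
    free = dict(idle)  # work on a copy so the caller's data is untouched
    plan, unplaced = {}, []
    for vm, need in vm_demand.items():
        host = next((pm for pm, cap in free.items() if cap >= need), None)
        if host is None:
            unplaced.append(vm)
            continue
        plan[vm] = host
        free[host] -= need
    return plan, unplaced
```

Because every virtual machine appears at most once as a key of `plan`, no virtual machine is allocated to two physical machines, which corresponds to the requirement that the virtual machines allocated to different physical machines differ from each other.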
[0023] Preferably, in the virtual machine abnormity recovering method in a distributed environment
disclosed by the invention, a user can configure a high availability flag for individual
virtual machines, and the highly available controller determines the high availability
flags of all the active virtual machines running on the physical machine whose running
state is abnormal before the virtual machine relocating operation is executed, and
executes the subsequent virtual machine relocating operation only for virtual machines
whose high availability flag has the value "activated".
[0024] Preferably, in the virtual machine abnormity recovering method in a distributed environment
disclosed by the invention, a user can configure a high availability priority for individual
virtual machines, and the highly available controller relocates the virtual machines
to be relocated one by one according to the level of the high availability priority
of each virtual machine to be relocated. By way of example, if the high availability
priority of a virtual machine is configured to be "high", this indicates that enough
idle resources are reserved for this virtual machine to ensure that it can be relocated;
if the high availability priority of a virtual machine is configured to be "medium"
or "low", this indicates that the corresponding priority order is observed for this
virtual machine during relocating, but it cannot be ensured that enough idle resources
are reserved.
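By way of illustration only, the combined effect of the high availability flag and the high availability priority on the relocating order can be sketched in Python as follows. The dictionary keys `ha_flag` and `ha_priority`, the flag value "deactivated" and the function name `relocation_order` are assumptions made for this sketch; only the value "activated" and the levels "high", "medium" and "low" come from the text.

```python
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


def relocation_order(vms):
    """Keep only virtual machines whose flag is "activated" and
    order them from the highest to the lowest priority."""
    eligible = [vm for vm in vms if vm["ha_flag"] == "activated"]
    # sorted() is stable, so equal-priority machines keep their input order.
    return sorted(eligible, key=lambda vm: PRIORITY_RANK[vm["ha_priority"]])
```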
[0025] It can be seen from the above that the virtual machine abnormity recovering method
in a distributed environment disclosed by the invention has the following advantages:
(1) it can be ensured that the virtual machine on which important business is running
is activated and recovered preferentially, thus saving resources; (2) due to the diversity
and comprehensiveness of the network detection, the possibility of erroneous determination
is significantly reduced; (3) since the detection of the state of a physical machine
can be initiated not only by the control node but also by other randomly selected
physical machines, the state of the physical machine can be determined more fully
and accurately.
[0026] While the invention has been described through the above described preferred embodiments,
the ways of carrying out the invention are not limited to the above embodiments.
CLAIMS
1. A virtual machine abnormity recovering method in distributed environment, the method
being characterized by comprising the following steps:
(A1) running an independent computing assembly on each physical machine on which a
virtual machine resides, wherein the computing assembly periodically reports the current
running state of the corresponding physical machine to a state database;
(A2) periodically polling the state database by a highly available controller so as
to check the running state of all the physical machines in a physical machine group
under the control of the highly available controller; and
(A3) ending this checking process if the running states of all the physical machines
in the physical machine group are normal; ending this checking process and sending
an alarm by way of a log if the running states of a plurality of the physical machines
in the physical machine group are abnormal; and executing a subsequent abnormity processing
operation if the running state of only one physical machine in the physical machine
group is abnormal, so as to ensure that virtual machines on the physical machine whose
running state is abnormal continue running normally.
2. A virtual machine abnormity recovering method in distributed environment according
to claim 1, wherein the abnormity processing operation comprises: the highly available
controller detects the connectivity of the physical machine whose running state is
abnormal to a management network, wherein the detection is performed in the following
two ways: (1) pinging the physical machine; and (2) monitoring port number 22 of the
physical machine.
3. A virtual machine abnormity recovering method in distributed environment according
to claim 2, wherein the abnormity processing operation further comprises: ending the
abnormity processing operation if it is detected in either of the above ways that the
physical machine whose running state is abnormal has connectivity to the management
network; detecting the connectivity of all the active virtual machines running on
the physical machine whose running state is abnormal to a business network if it is
detected in both of the above ways that the physical machine whose running state is
abnormal has no connectivity to the management network; ending the abnormity processing
operation if any of the active virtual machines has connectivity to the business network;
and executing a secondary voting operation if none of the active virtual machines
has connectivity to the business network, so as to finally confirm whether the physical
machine whose running state is abnormal has a fault.
4. A virtual machine abnormity recovering method in distributed environment according
to claim 3, wherein the secondary voting operation comprises: (1) the highly available
controller randomly selects several physical machines from the physical machine group
other than the physical machine whose running state is abnormal; (2) the highly available
controller instructs each selected physical machine to detect the connectivity of
the physical machine whose running state is abnormal to the management network and/or
the business network by pinging the physical machine whose running state is abnormal
and monitoring port number 22 of the physical machine whose running state is abnormal
respectively; (3) if any of the selected physical machines finds that the physical
machine whose running state is abnormal has connectivity to the management network
or the business network, the secondary voting operation is ended, and the result of
the secondary voting operation is "the physical machine whose running state is abnormal
has no fault", whereas if all the selected physical machines find that the physical
machine whose running state is abnormal has no connectivity to either the management
network or the business network, the secondary voting operation is ended, and the
result of the secondary voting operation is "the physical machine whose running state
is abnormal has a fault", and subsequently, a virtual machine relocating operation
is executed.
5. A virtual machine abnormity recovering method in distributed environment according
to claim 4, wherein the virtual machine relocating operation comprises: (1) the highly
available controller sends a shutdown instruction to the physical machine whose running
state is abnormal via an intelligent platform management interface (IPMI) so as to
put the physical machine whose running state is abnormal into a shutdown state,
thereby destroying the virtual machines which reside in the memory thereof; (2) the
highly available controller sends a relocation scheduling instruction to a scheduling
controller; (3) after receiving the relocation scheduling instruction, the scheduling
controller selects physical machines in the physical machine group which have idle
resources, and then sends a relocating instruction to the selected physical machines
which have idle resources one by one so that all the active virtual machines running
on the physical machine whose running state is abnormal are relocated to the selected
physical machines which have idle resources, wherein the virtual machines to be relocated
and allocated to different physical machines which have idle resources are different
from each other; (4) the computing assembly running on each of the physical machines
which have idle resources relocates the virtual machine to be relocated and allocated
to this physical machine to this physical machine, via a shared storage device.
6. A virtual machine abnormity recovering method in distributed environment according
to claim 5, wherein a user can configure a high availability flag for individual virtual
machines, and the highly available controller determines the high availability flags
of all the active virtual machines running on the physical machine whose running state
is abnormal before the virtual machine relocating operation is executed, and executes
the subsequent virtual machine relocating operation only for the virtual machines whose
high availability flag has the value "activated".
7. A virtual machine abnormity recovering method in distributed environment according
to claim 6, wherein a user can configure a high availability priority for individual
virtual machines, and the highly available controller relocates the virtual machines
to be relocated one by one according to the level of the high availability priority
of each virtual machine to be relocated.
1. Anomaliewiederherstellungsverfahren virtueller Maschinen in einer verteilten Umgebung,
wobei das Verfahren durch folgende Schritte gekennzeichnet ist:
(A1) Ausführen einer unabhängigen Computeranordnung auf jeder physischen Maschine,
auf der sich eine virtuelle Maschine befindet, wobei die Computeranordnung periodisch
den aktuellen Ausführungszustand der physischen Maschine einer Zustandsdatenbank meldet;
(A2) periodisches Abfragen der Zustandsdatenbank durch eine hoch verfügbare Steuereinrichtung,
um den Ausführungszustand sämtlicher physischer Maschinen in einer Gruppe physischer
Maschinen gesteuert durch die hoch verfügbare Steuereinrichtung zu überprüfen; und
(A3) Beenden dieser Überprüfungsverarbeitung, falls die Ausführungszustände sämtlicher
physischer Maschinen in der Gruppe physischer Maschinen normal sind; Beenden dieser
Überprüfungsverarbeitung und Senden eines Warnsignals mittels eines Protokolls, falls
die Ausführungszustände mehrerer physischer Maschinen in der Gruppe physischer Maschinen
anomal sind; und Ausführen eines nachfolgenden Anomalieverarbeitungsvorgangs, falls
der Ausführungszustand auch nur einer physischen Maschine in der Gruppe physischer
Maschinen anormal ist, um zu gewährleisten, dass virtuelle Maschinen auf der physischen
Maschine, deren Ausführungszustand anormal ist, weiter normal arbeiten.
2. Anomaliewiederherstellungsverfahren virtueller Maschinen in einer verteilten Umgebung
nach Anspruch 1, wobei der Anomalieverarbeitungsvorgang umfasst: die hoch verfügbare
Steuereinrichtung erfasst die Konnektivität der physischen Maschine, deren Ausführungszustand
anormal ist, mit einem Verwaltungsnetzwerk, wobei das Erfassen auf den folgenden zwei
Wegen durchgeführt wird: (1) Pingen der physischen Maschine; und (2) Überwachen des
Ports Nummer 22 der physischen Maschine.
3. Anomaliewiederherstellungsverfahren virtueller Maschinen in einer verteilten Umgebung
nach Anspruch 2, wobei der Anomalieverarbeitungsvorgang ferner umfasst: Beenden des
Anomalieverarbeitungsvorgangs, falls auf einem beliebigen der oben erwähnten Wege
erfasst ist, dass die physische Maschine, deren Ausführungszustand anormal ist, Konnektivität
mit dem Verwaltungsnetzwerk aufweist; Erfassen der Konnektivität sämtlicher auf der
physischen Maschine laufenden aktiven virtuellen Maschinen, deren Ausführungszustand
anormal ist, mit einem Geschäftsnetzwerk, falls auf den oben erwähnten zwei Wegen
erfasst wird, dass die physische Maschine, deren Ausführungszustand anormal ist, keine
Konnektivität mit dem Verwaltungsnetzwerk aufweist; Beenden des Anomalieverarbeitungsvorgangs,
falls eine beliebige der aktiven virtuellen Maschinen die Konnektivität mit dem Geschäftsnetzwerk
aufweist; und Ausführen eines sekundären Abstimmungsvorgangs, falls keine der aktiven
virtuellen Maschinen Konnektivität mit dem Geschäftsnetzwerk aufweist, um gegebenenfalls
rückzumelden, ob die physische Maschine, deren Ausführungszustand anormal ist, fehlerhaft
ist.
4. The virtual machine anomaly recovery method in a distributed environment according
to claim 3, wherein the secondary voting operation comprises: (1) the highly available
controller randomly selects a plurality of physical machines from the group of physical
machines other than the physical machine whose running state is abnormal; (2) the
highly available controller instructs each selected physical machine to detect the
connectivity of the physical machine whose running state is abnormal to the management
network and/or the business network by pinging the physical machine whose running state
is abnormal and by monitoring port number 22 of the physical machine whose running
state is abnormal, respectively; (3) if any one of the selected physical machines finds
that the physical machine whose running state is abnormal has connectivity to the
management network or the business network, the secondary voting operation is
terminated, and the result of the secondary voting operation is: "the physical machine
whose running state is abnormal is not faulty", whereas if all of the selected physical
machines find that the physical machine whose running state is abnormal has
connectivity to neither the management network nor the business network, the secondary
voting operation is terminated, and the result of the secondary voting operation is:
"the physical machine whose running state is abnormal is faulty", and a virtual machine
relocation operation is consequently executed.
5. The virtual machine anomaly recovery method in a distributed environment according
to claim 4, wherein the virtual machine relocation comprises: (1) the highly available
controller sends a shutdown command via an Intelligent Platform Management Interface
(IPMI) to the physical machine whose running state is abnormal, so as to put that
physical machine into a shutdown state and thereby destroy the virtual machine residing
in its memory; (2) the highly available controller sends a relocation scheduling
instruction to a scheduling controller; (3) after receiving the relocation scheduling
instruction, the scheduling controller selects, from the group of physical machines,
physical machines that have idle resources, and then sends a relocation instruction to
the selected physical machines that have idle resources one by one, so that all active
virtual machines running on the physical machine whose running state is abnormal are
relocated, wherein the virtual machines to be relocated and allocated to different
physical machines that have idle resources differ from one another; (4) the computing
component running on each of the physical machines that have idle resources relocates
the virtual machine to be relocated and allocated to that physical machine via a shared
storage device.
6. The virtual machine anomaly recovery method in a distributed environment according
to claim 5, wherein a user can configure a high-availability flag for individual
virtual machines, and the highly available controller determines the high-availability
flag of all active virtual machines running on the physical machine whose running state
is abnormal before the virtual machine relocation is executed, and executes the
subsequent virtual machine relocation operation only on those virtual machines whose
high-availability flag value is "enabled".
7. The virtual machine anomaly recovery method in a distributed environment according
to claim 6, wherein the user can configure a high-availability priority for individual
virtual machines, and the highly available controller relocates the individual virtual
machines to be relocated one after another according to the level of the
high-availability priority of each virtual machine to be relocated.
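
The fault-confirmation chain of claims 1 to 4 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the function names, the "normal" state string, the default sample size of three voters, and the injected probe callables are all hypothetical. Only the order of the checks (management-network probes, then business-network probes of the active virtual machines, then a vote by randomly chosen peers) is taken from the claims.

```python
import logging
import random
import socket
from typing import Callable, Dict, Iterable, Optional, Sequence

log = logging.getLogger("ha_controller")  # hypothetical logger name


def tcp_port_open(host: str, port: int = 22, timeout: float = 2.0) -> bool:
    """One of the two probes of claim 2: can a TCP connection to port 22 be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name resolution failed
        return False


def single_abnormal_host(states: Dict[str, str]) -> Optional[str]:
    """Step (A3): return the single abnormal host, or None if there is nothing to handle."""
    abnormal = sorted(h for h, s in states.items() if s != "normal")
    if len(abnormal) > 1:
        # Several abnormal hosts: only raise an alarm through the log.
        log.warning("multiple abnormal hosts, alarm only: %s", abnormal)
        return None
    return abnormal[0] if abnormal else None


def secondary_vote(
    suspect: str,
    peers: Sequence[str],
    probe_from: Callable[[str, str], bool],
    sample_size: int = 3,  # assumption: the claims only say "a plurality"
    rng: Optional[random.Random] = None,
) -> bool:
    """Claim 4: True means every randomly chosen peer sees no connectivity."""
    rng = rng or random.Random()
    candidates = [p for p in peers if p != suspect]
    if not candidates:
        # Assumption: refuse to declare a fault when nobody can vote.
        raise ValueError("no peers available to vote")
    voters = rng.sample(candidates, min(sample_size, len(candidates)))
    return not any(probe_from(voter, suspect) for voter in voters)


def is_faulty(
    suspect: str,
    peers: Sequence[str],
    mgmt_probes: Iterable[Callable[[str], bool]],
    vm_business_probes: Iterable[Callable[[], bool]],
    probe_from: Callable[[str, str], bool],
) -> bool:
    """Claims 2 to 4 chained: any sign of life ends processing before the vote."""
    if any(probe(suspect) for probe in mgmt_probes):
        return False  # host still reachable on the management network
    if any(probe() for probe in vm_business_probes):
        return False  # at least one active VM still reachable on the business network
    return secondary_vote(suspect, peers, probe_from)
```

In a real deployment `mgmt_probes` might hold a ping wrapper together with `tcp_port_open`; the probes are injected here so that the decision logic can be exercised without network access. Raising an error when no peer can vote is a design choice of this sketch, since an empty vote would otherwise count as unanimous.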
1. A virtual machine anomaly recovery method in a distributed environment, the method
being characterized in that it comprises the following steps:
(A1) running an independent computing component on each physical machine on which a
virtual machine resides, wherein the computing component periodically reports the
current running state of the corresponding physical machine to a state database;
(A2) periodically polling the state database by means of a highly available controller
so as to check the running state of all physical machines in a group of physical
machines under the control of the highly available controller; and
(A3) terminating this checking process if the running states of all physical machines
in the group of physical machines are normal; terminating this checking process and
sending an alarm by means of a log if the running states of a plurality of the physical
machines in the group of physical machines are abnormal; and executing a subsequent
anomaly processing operation if the running state of only one physical machine in the
group of physical machines is abnormal, so as to ensure that the virtual machines on
the physical machine whose running state is abnormal continue to operate normally.
2. The virtual machine anomaly recovery method in a distributed environment according
to claim 1, wherein the anomaly processing operation comprises: the highly available
controller detects the connectivity of the physical machine whose running state is
abnormal to a management network, wherein the detection is performed in the following
two ways: (1) pinging the physical machine; and (2) monitoring port number 22 of the
physical machine.
3. The virtual machine anomaly recovery method in a distributed environment according
to claim 2, wherein the anomaly processing operation further comprises: terminating the
anomaly processing operation if it is detected in either of the above-mentioned ways
that the physical machine whose running state is abnormal has connectivity to the
management network; detecting the connectivity of all active virtual machines running
on the physical machine whose running state is abnormal to a business network if it is
detected in both of the above-mentioned ways that the physical machine whose running
state is abnormal has no connectivity to the management network; terminating the
anomaly processing operation if any one of the active virtual machines has connectivity
to the business network; and executing a secondary voting operation if none of the
active virtual machines has connectivity to the business network, so as to confirm,
where applicable, whether or not the physical machine whose running state is abnormal
is faulty.
4. The virtual machine anomaly recovery method in a distributed environment according
to claim 3, wherein the secondary voting operation comprises: (1) the highly available
controller randomly selects a plurality of physical machines from the group of physical
machines other than the physical machine whose running state is abnormal; (2) the
highly available controller instructs each selected physical machine to detect the
connectivity of the physical machine whose running state is abnormal to the management
network and/or the business network by pinging the physical machine whose running state
is abnormal and by monitoring port number 22 of the physical machine whose running
state is abnormal, respectively; (3) if any one of the selected physical machines finds
that the physical machine whose running state is abnormal has connectivity to the
management network or the business network, the secondary voting operation terminates,
and the result of the secondary voting operation is that "the physical machine whose
running state is abnormal is not faulty", whereas if all of the selected physical
machines find that the physical machine whose running state is abnormal has no
connectivity to the management network or the business network, the secondary voting
operation terminates, and the result of the secondary voting operation is that "the
physical machine whose running state is abnormal is faulty", and a virtual machine
relocation operation is subsequently executed.
5. The virtual machine anomaly recovery method in a distributed environment according
to claim 4, wherein the virtual machine relocation operation comprises: (1) the highly
available controller sends a shutdown instruction to the physical machine whose running
state is abnormal via an Intelligent Platform Management Interface (IPMI), so as to put
the physical machine whose running state is abnormal into a shutdown state and thereby
destroy the virtual machine residing in its memory; (2) the highly available controller
sends a relocation scheduling instruction to a scheduling controller; (3) after
receiving the relocation scheduling instruction, the scheduling controller selects the
physical machines in the group of physical machines that have idle resources, and then
sends a relocation instruction to the selected physical machines that have idle
resources one by one, so that all active virtual machines running on the physical
machine whose running state is abnormal are relocated, wherein the virtual machines to
be relocated and allocated to different physical machines that have idle resources
differ from one another; (4) the computing component running on each of the physical
machines that have idle resources relocates the virtual machine to be relocated and
allocated to that physical machine via a shared storage device.
6. The virtual machine anomaly recovery method in a distributed environment according
to claim 5, wherein a user can configure a high-availability flag for individual
virtual machines, and the highly available controller determines the high-availability
flags of all active virtual machines running on the physical machine whose running
state is abnormal before the virtual machine relocation operation is executed, and
executes the subsequent virtual machine relocation operation only on the virtual
machines whose high-availability flag value is "enabled".
7. The virtual machine anomaly recovery method in a distributed environment according
to claim 6, wherein a user can configure a high-availability priority for individual
virtual machines, and the highly available controller relocates the individual virtual
machines to be relocated one by one according to the level of the high-availability
priority of each virtual machine to be relocated.
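
The relocation planning described in claims 5 to 7 can be sketched in a few lines of Python. This is a minimal illustration under assumptions that the claims do not state: idle resources are modelled as a count of free slots per physical machine, a larger priority number means earlier relocation, and the names `VM`, `relocation_order` and `plan_relocation` are hypothetical. The IPMI shutdown and the relocation over shared storage are outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class VM:
    name: str
    ha_enabled: bool      # high-availability flag of claim 6
    ha_priority: int = 0  # priority of claim 7; assumption: larger is relocated earlier


def relocation_order(vms: List[VM]) -> List[VM]:
    """Keep only VMs whose flag is enabled and order them by descending priority."""
    eligible = [vm for vm in vms if vm.ha_enabled]
    # sorted() is stable, so VMs with equal priority keep their input order.
    return sorted(eligible, key=lambda vm: vm.ha_priority, reverse=True)


def plan_relocation(vms: List[VM], idle_slots: Dict[str, int]) -> Dict[str, List[str]]:
    """Assign each eligible VM to exactly one host that has idle capacity.

    idle_slots maps a host name to the number of VMs it can still take
    (a simplification of "idle resources").
    """
    queue = relocation_order(vms)
    plan: Dict[str, List[str]] = {}
    for host, slots in idle_slots.items():
        if not queue:
            break
        if slots > 0:
            plan[host] = [vm.name for vm in queue[:slots]]
            queue = queue[slots:]
    if queue:
        raise RuntimeError(
            "not enough idle capacity for: " + ", ".join(vm.name for vm in queue)
        )
    return plan
```

Because each VM is removed from the queue once it has been assigned, no VM is allocated to two hosts, which matches the requirement that the VMs allocated to different physical machines differ from one another. For example, with VMs `db` (priority 5), `cache` (priority 3) and `web` (priority 1) enabled and hosts `h1` with two free slots and `h3` with four, the plan places `db` and `cache` on `h1` and `web` on `h3`.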

REFERENCES CITED IN THE DESCRIPTION
This list of references cited by the applicant is for the reader's convenience only.
It does not form part of the European patent document. Even though great care has
been taken in compiling the references, errors or omissions cannot be excluded and
the EPO disclaims all liability in this regard.
Patent documents cited in the description