[0001] The present invention relates generally to distributed storage systems and more particularly
to non-intrusive crash consistent copying in a distributed storage system without
client cooperation.
[0002] Techniques have been developed to efficiently store information in a computer network.
For example, the physical storage elements of a group of computer storage servers
can be used to form a logical storage device, commonly referred to as a "virtual disk."
The virtual disk is functionally equivalent to a single physical storage element but
is actually formed of several physical storage elements.
[0003] For reasons analogous to the reasons why information stored at a physical storage
element must be backed-up, the information stored at a virtual disk must also be backed-up.
Because of the distributed nature of a virtual disk, however, special care must be taken
to ensure that the same "version" of the virtual disk is copied from each of the physical
storage elements that form the virtual disk.
[0004] To ensure consistency between the original and back-up copies of a virtual disk,
prior to copying the virtual disk, the entire computer storage system is conventionally
placed into a quiescent state. This basically causes the computer storage system to
become inactive. The copy operation is permitted only after the computer storage system
has reached quiescence. This procedure, although ensuring consistency between copies
of a virtual disk, significantly intrudes on the normal operation of the computer
storage system, including operations unrelated to the making of the copies. Accordingly,
it would be advantageous to be able to copy virtual disks without intruding on normal
computer storage system operations.
[0005] It is, accordingly, an object of the present invention to provide a technique for
copying the contents of a virtual disk without interfering with normal operation of
a distributed storage system.
[0006] It is another object of the present invention to provide further advantages and features,
the details of which shall be described below.
[0007] The present invention advantageously provides an apparatus and an associated method
for controlling access to storage elements within a physical storage device when an
associated logical storage device, which can be formed of a number of physical storage
elements, for example, storage elements distributed across multiple computing nodes
connected by a network, is being copied.
[0008] In accordance with the invention, a memory stores a write-barrier value and a processor
prohibits write operations on an associated element. In operation, a write-barrier
value of a first state, for example 0, is stored in the memory when the logical storage
device is to be copied.
[0009] While the stored write-barrier value is in the first state, the execution of write
operations to the storage element(s) associated with the memory is prohibited by the
processor. Advantageously, the write-barrier value stored in the memory is set to
the first state by the processor upon the receipt of a request to copy the logical
storage device. The copying of the logical storage device may be automatically initiated
by the processor.
[0010] After the write-barrier value is set to 0, portions of the logical storage device
are copied, and finally, the write-barrier value that was previously set to the first
state is set to a second state, for example, to 1. While the write-barrier value is
set to 1, write operations upon the associated storage element can be executed in
a normal fashion, i.e., no longer prohibited by the processor.
[0011] Preferably, the logical storage device can only be copied when the write-barrier
value stored in the memory is in the first state, i.e., 0. The logical storage device
is copied by a copy-on-write technique which avoids the often cumbersome process of
completely copying each file.
[0012] FIGURE 1 illustrates a computer network formed of a plurality of client stations
connected by way of network connections to a plurality of storage servers, each including
hard disk drive assemblies.
[0013] FIGURE 2 illustrates a virtual disk formed of portions of hard disk drive assemblies
of Figure 1, in accordance with the present invention.
[0014] FIGURE 3 illustrates the relationship between the virtual disk and hard disk drive
assemblies at which the virtual disk is stored in accordance with the present invention.
[0015] FIGURE 4 illustrates a simplified functional block diagram of the storage servers
in the computer network shown in Figure 1, in accordance with the present invention.
[0016] Referring first to Figure 1, a computer network, shown generally at 10, provides
distributed processing capability to a plurality of users. The network 10 includes
a plurality of networked computers at client stations 12. The client stations 12 are
connected together by network connections 14. The client stations 12 can be formed
of personal computers, work stations, or other types of processing devices. Each of
the client stations includes bulk storage media, here represented by disk assemblies
16. The computer network 10 is scalable, permitting additional client stations to
be added to the network.
[0017] The computer network further includes a plurality of network storage servers 18,
here servers 18-1, 18-2, 18-3, . . . 18-n. The servers 18 are each also coupled by
way of the network connections 14. The client stations 12 are able to access the servers
18 by way of the network connections 14. Each of the servers 18 includes bulk storage
media 22, here hard disks 22-1, 22-2, 22-3, . . . 22-n. During operation of the computer
network 10, information stored at the hard disks 22 is accessible for executing read
operations and write operations.
[0018] The computer network 10 is illustrated in Figure 2 to include a virtual disk 24.
Portions of the storage media 22 form the virtual disk 24. Further, the servers 18
shown in Figure 1 are also utilized in forming the virtual disk. The virtual disk
24 forms a logical storage device to which a client station 12 can write and from
which a client station 12 can read information.
[0019] Figure 3 illustrates the relationship between the virtual disk 24 and the storage
media 22, shown in Figure 2. The virtual disk 24 is here shown to be formed of a plurality
of consecutively-numbered files numbered 1-7. The files 1-7 are mapped to individual
ones of the storage media 22. The one-to-one relationship between the files 1-7 and
their respective physical storage locations is exemplary only. Although, as shown,
each of the files 1-7 is stored entirely on a respective single hard disk
22 associated with a single server 18, two or more files, e.g., files 1 and 2, might
be commonly stored at a single hard drive 22 or portions of a single file, e.g., file
1, might be distributed across two or more hard disks 22.
[0020] When the contents of the virtual disk 24 are to be copied, a copy-on-write technique
can be utilized. Using this technique, physical copies of all of the files 1-7 of
the disk 24 need not be replicated. Rather, the copy need only be a logical copy of
those portions of the hard disks 22 storing files which are dissimilar to the files
1-7 stored on the virtual disk being replicated. The logical copy is often referred
to as a "snapshot" of the virtual disk 24. The snapshot of the virtual disk includes
a pointer having mapping information to map back to changed portions of the virtual
disk 24 and, in turn, back to the locations on hard disks 22 at which the information
is physically stored.
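By way of illustration only, the copy-on-write behavior described above might be sketched as follows. The class name VirtualDisk and its methods are hypothetical and do not appear in the figures; the sketch assumes that each file can be modeled as a single value and that a snapshot preserves original contents only for files that change after the snapshot is taken.

```python
class VirtualDisk:
    """Toy model of a virtual disk: file number -> contents (hypothetical)."""

    def __init__(self, files):
        self.files = dict(files)  # live contents of the virtual disk
        self.snapshots = []       # each snapshot holds only preserved originals

    def snapshot(self):
        # A new snapshot starts empty: no file is physically copied yet.
        snap = {}
        self.snapshots.append(snap)
        return snap

    def write(self, file_no, data):
        # Copy-on-write: before overwriting, preserve the old contents in
        # every snapshot that has not already saved this file.
        for snap in self.snapshots:
            if file_no not in snap:
                snap[file_no] = self.files.get(file_no)
        self.files[file_no] = data

    def read_snapshot(self, snap, file_no):
        # Changed files are read from the snapshot; unchanged files map
        # back to the live data, which is still identical.
        return snap[file_no] if file_no in snap else self.files.get(file_no)
```

In this sketch, taking a snapshot costs nothing up front, and storage is consumed only for files that are later overwritten.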
[0021] Figure 4 illustrates the computer network 10 in somewhat greater detail. As shown,
the servers 18-1 and 18-n each include a CPU (central processing unit) 32. The CPUs
32 are independently operable to perform functions including the control of the respective
storage server's operation.
[0022] Each of the storage servers 18, of which the servers 18-1 and 18-n are exemplary,
includes a write-barrier storage memory 34. The value stored at the write-barrier
storage memory is determinative of whether a write operation to an associated hard
disk 22 can be executed. For example, when the bit stored at the write-barrier
storage memory 34 is of a first logical value, e.g., zero, a write operation cannot be
performed, and when the bit stored at the write-barrier storage memory 34 is of a
second logical value, e.g., one, the write operation can be executed.
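As a minimal sketch, and not as a description of the actual server software, the gating role of the write-barrier bit might be expressed as below. The names StorageServer, try_write, WRITES_PROHIBITED, and WRITES_PERMITTED are assumptions introduced for illustration; the values 0 and 1 follow the example values given above.

```python
WRITES_PROHIBITED = 0  # first logical value: barrier established
WRITES_PERMITTED = 1   # second logical value: barrier removed


class StorageServer:
    """Hypothetical stand-in for a server 18 and its associated hard disk 22."""

    def __init__(self):
        self.write_barrier = WRITES_PERMITTED  # models write-barrier memory 34
        self.disk = {}                         # block number -> data

    def can_write(self):
        # The stored bit alone determines whether a write may execute.
        return self.write_barrier == WRITES_PERMITTED

    def try_write(self, block, data):
        # Refuse the write while the barrier is in the first state.
        if not self.can_write():
            return False
        self.disk[block] = data
        return True
```

Because each server holds its own bit, the state of one server's barrier has no effect on writes at another server in this sketch.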
[0023] The storage servers 18 are further shown to include a copy initiator 38, here implemented
as programmed instructions executable by the CPU 32. The copy initiator 38 is signaled
by a CPU 32 when a copy of the virtual disk 24 is to be created. Copying is initiated
at preselected intervals, such as every ten minutes, or upon request of a client station
12. Execution of the copy initiator 38 by the CPU 32 causes the write-barrier storage
bit to be set as the first logical value, for example the bit may be flipped from
1 to 0. The CPU 32 also executes the copy initiator 38 to reset, i.e., "tear down",
the write-barrier storage bit to a second logical value, for example, to flip the
bit from 0 to 1, after the contents of the hard disk 22 associated with the flipped
write-barrier storage bit have been replicated.
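The set, copy, and tear-down sequence performed by the copy initiator might be sketched as follows. The function copy_initiator, the class Server, and the replicate callback are hypothetical names used only for this illustration; the sketch further assumes that the barrier should be torn down even if replication fails, which the description above does not address.

```python
class Server:
    """Hypothetical stand-in for a storage server 18."""

    def __init__(self, disk):
        self.disk = disk        # contents of the associated hard disk 22
        self.write_barrier = 1  # 1: writes permitted, 0: writes prohibited


def copy_initiator(server, replicate):
    # Establish the barrier: flip the bit from 1 to 0.
    server.write_barrier = 0
    try:
        # Replicate this server's portion of the virtual disk.
        return replicate(server.disk)
    finally:
        # Tear down the barrier: flip the bit from 0 back to 1
        # (assumed here to happen even when replication raises).
        server.write_barrier = 1
```

A caller might pass `dict` as the replicate argument to obtain a shallow copy, for example `copy_initiator(Server({1: "a"}), dict)`.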
[0024] The storage servers 18 each further include a write request command delayer 42. Each
write request command delayer 42 is implemented by programmed instructions executable
by the CPU 32. The write request command delayer 42 receives requests to write data
to the associated hard disk 22 which forms portions of the virtual disk 24. When a
write request is received at a server 18 and the write-barrier storage bit is set
at the first logical value, the write request is delayed by the write request command
delayer 42 until the copy initiator 38 resets the write-barrier storage bit at the
write-barrier storage memory 34. Thereafter, write operations at the hard disk 22
can be performed without waiting for the associated other hard disks 22 of virtual
disk 24 to be copied.
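One possible way to delay, rather than reject, write requests while the barrier is set is sketched below using an event from the Python standard library. The class WriteRequestDelayer and its method names are assumptions made for illustration and are not taken from the figures; the optional timeout parameter is likewise an addition of the sketch.

```python
import threading


class WriteRequestDelayer:
    """Hypothetical model of the write request command delayer 42."""

    def __init__(self):
        self._permitted = threading.Event()
        self._permitted.set()          # barrier value 1: writes allowed
        self._lock = threading.Lock()  # serializes updates to the disk
        self.disk = {}                 # block number -> data

    def set_barrier(self):
        # Barrier value 0: subsequent write requests are held.
        self._permitted.clear()

    def tear_down_barrier(self):
        # Barrier value 1: all held write requests are released.
        self._permitted.set()

    def write(self, block, data, timeout=None):
        # Block the caller until the barrier is torn down.
        if not self._permitted.wait(timeout):
            raise TimeoutError("write barrier still set")
        with self._lock:
            self.disk[block] = data
```

The sketch shows only the delay behavior. It does not drain writes that were already in flight when the barrier was set, which a practical implementation would also need to handle.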
[0025] The previous description is of preferred examples for implementing the invention,
and the scope of the invention should not necessarily be limited by this description.
The scope of the present invention is defined by the following claims.
1. A method for controlling access to a storage element, the storage element physically
storing information forming a portion of a contents of a logical storage device, the
method comprising the steps of:
setting a write-barrier value to a first value responsive to a request to copy the
contents of the logical storage device;
prohibiting execution of write operations to the storage element based upon the write-barrier
value being set to the first value;
copying the portion of the contents of the logical storage device; and
setting the write-barrier value to a second value responsive to completing the copying.
2. The method of claim 1, further comprising the step of:
automatically initiating the request to copy the contents of the logical storage device.
3. The method of claim 2, wherein the initiating is performed at selected intervals.
4. The method of claim 2, wherein the setting of the write-barrier value to the first
value is performed subsequent to initiating the request to copy.
5. The method of claim 1, wherein the contents are copied using a copy-on-write technique.
6. The method of claim 1, further comprising the step of:
executing the write operations after the write-barrier value is set to the second
value.
7. The method of claim 1, wherein:
the storage element forms a portion of a computer network, the computer network further
including a plurality of client stations; and
requests for the write operations are generated by one or more of the plurality of
client stations.
8. An article of manufacture for controlling access by a plurality of stations within
a network to a storage element which forms a portion of a virtual disk, comprising:
a computer readable storage medium; and
computer programming stored on the storage medium;
wherein the stored computer programming is configured to be readable from the computer
readable storage medium by a computer and thereby cause the computer to operate so as
to:
establish a write-barrier which prohibits write operations at the storage element
during copying of the portion of the virtual disk; and
remove the write-barrier to allow the write operations at the storage element after
completing the copying of the portion of the virtual disk.
9. The article of manufacture in claim 8, wherein the computer programming is further
configured to cause the computer to operate so as to:
set a write-barrier value to a first state to establish the write-barrier.
10. The article of manufacture in claim 9, wherein the computer programming is further
configured to cause the computer to operate so as to:
set the write-barrier value to a second state to remove the write-barrier.
11. The article of manufacture in claim 8, wherein the computer programming is configured
to cause the computer to operate so as to:
automatically initiate the copying of at least the portion of the virtual disk.
12. The article of manufacture in claim 11, wherein the computer programming is configured
to cause the computer to operate so as to:
automatically initiate the copying at selected intervals.
13. The article of manufacture in claim 8, wherein the computer programming is further
configured to cause the computer to operate so as to:
copy the virtual disk using a copy-on-write technique.
14. An apparatus for controlling access to one of a plurality of storage elements within
a physical storage device, the one storage element forming a portion of a logical
storage device, comprising:
a memory configured to store a write-barrier value having at least a first state;
and
a processor, responsive to the write-barrier value, configured to prohibit write operations
at the one storage element without prohibiting write operations at the other of the
plurality of storage elements.
15. The apparatus of claim 14, wherein:
the processor is further configured to copy the portion of the logical storage device
when the write-barrier value is in the first state.
16. The apparatus of claim 15, wherein:
the copy of the portion of the logical storage device is formed by a copy-on-write
technique.
17. The apparatus of claim 15, wherein:
the processor is further configured to automatically initiate copying.
18. The apparatus of claim 14, wherein:
the processor is further configured to set the write-barrier value to the first state
responsive to a request to copy at least the portion of the logical storage device.
19. The apparatus of claim 14, wherein:
the write-barrier value includes a second state; and
the processor is further configured to allow write operations at the one storage element
with the write-barrier value in the second state.
20. The apparatus of claim 19, wherein:
the processor is further configured to set the write-barrier value to the second state
responsive to completion of copying of the portion of the logical storage device.
21. The apparatus of claim 14, wherein the physical storage device is one of a plurality
of physical storage devices within a network.
22. The apparatus of claim 14, wherein:
the processor is further configured to delay execution of a write request.