(19)
(11)EP 1 723 784 B1

(12)EUROPEAN PATENT SPECIFICATION

(45)Mention of the grant of the patent:
29.04.2020 Bulletin 2020/18

(21)Application number: 05708663.9

(22)Date of filing:  03.03.2005
(51)International Patent Classification (IPC): 
H04N 5/262(2006.01)
H04N 19/48(2014.01)
H04N 5/272(2006.01)
(86)International application number:
PCT/IB2005/000553
(87)International publication number:
WO 2005/091621 (29.09.2005 Gazette  2005/39)

(54)

METHOD AND DEVICE FOR TRANSFORM-DOMAIN VIDEO EDITING

VERFAHREN UND EINRICHTUNG ZUR VIDEO-EDITIERUNG IN DER TRANSFORMATIONSDOMÄNE

PROCEDE ET DISPOSITIF D'EDITION VIDEO DANS LE DOMAINE TRANSFORME


(84)Designated Contracting States:
AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU MC NL PL PT RO SE SI SK TR

(30)Priority: 10.03.2004 US 798825

(43)Date of publication of application:
22.11.2006 Bulletin 2006/47

(73)Proprietor: Beijing Xiaomi Mobile Software Co., Ltd.
Beijing 100085 (CN)

(72)Inventors:
  • KURCEREN, Ragip
    Carrollton, Texas 75007 (US)
  • CHEBIL, Fehmi
    Irving, Texas 75062 (US)
  • ISLAM, Asad
    Richardson, Texas 75081 (US)

(74)Representative: dompatent von Kreisler Selting Werner - Partnerschaft von Patent- und Rechtsanwälten mbB 
Deichmannhaus am Dom Bahnhofsvorplatz 1
50667 Köln (DE)


(56)References cited: 
WO-A1-99/18735
US-A- 6 104 441
  
  • SHI-FU CHANG ET AL: 'Manipulation and compositing of MC-DCT compressed video', IEEE Journal on Selected Areas in Communications, vol. 13, no. 1, 1 January 1995, pages 1-11, XP000492740
  • YONEYAMA A. ET AL: 'Fast dissolve operations for MPEG video contents', Proceedings of the 2000 International Conference on Image Processing, vol. 2, 10 September 2000, pages 291-294, XP010529981
  • SMITH B. ET AL: 'Algorithms for manipulating compressed images', IEEE Trans. on Computer Graphics and Applications, September 1993, pages 34-42, XP000562744
  
Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).


Description

Field of the Invention



[0001] The present invention relates generally to video coding and, more particularly, to video editing.

Background of the Invention



[0002] Video editing capability is an increasingly requested feature in video playing and/or capturing devices. Transitional effects between different video sequences, logo insertion and over-layering sequences are among the most widely used operations in editing. Video editing tools enable users to apply a set of effects on their video clips aiming to produce a functionally and aesthetically better representation of their video.

[0003] To apply video editing effects on video sequences, several commercial products exist. These software products are targeted mainly for the PC platform. Because processing power, storage and memory constraints are not an issue in the PC platform today, the techniques utilized in such video-editing products operate on the video sequences mostly in their raw formats in the spatial domain. With such techniques, the compressed video is first decoded and then the editing effects are introduced in the spatial domain. Finally, the video is again encoded. This is known as spatial domain video editing operation.

[0004] For devices with low resources in processing power, storage space, available memory and battery power, decoding a video sequence and re-encoding it are costly operations that take a long time and consume a lot of battery power. Many of the latest communication devices, such as mobile phones, communicators and PDAs, are equipped with video cameras, offering users the capability to shoot video clips and send them over wireless networks. It is advantageous and desirable to allow users of those communication devices to generate quality video at their terminals. The spatial domain video editing operation is not suitable in wireless cellular environments.

[0005] As mentioned above, most video effects are performed in the spatial domain in the prior art. In the case of video blending (transitional effects for fading, etc.) between two or more sequences, for instance, the video clips are first decompressed and then the effects are performed according to the following equation:

V̄(x,y,t) = α1·V1(x,y,t) + α2·V2(x,y,t)        (1)

where V̄(x,y,t) is the edited sequence obtained from the original sequences V1(x,y,t) and V2(x,y,t), and α1 and α2 are two weighting parameters chosen according to the desired effect. Equation (1) is applied in the spatial domain to the various color components of the video sequence, depending on the desired effect.
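As an illustration of Equation (1), the following is a minimal numpy sketch of spatial-domain blending; the function and variable names are illustrative, not taken from the patent:

```python
import numpy as np

def blend(frame1, frame2, a1, a2):
    # Equation (1): Vbar(x, y, t) = a1 * V1(x, y, t) + a2 * V2(x, y, t),
    # applied to one color component, then clipped back to 8-bit range
    out = a1 * frame1.astype(np.float64) + a2 * frame2.astype(np.float64)
    return np.clip(out, 0, 255).astype(np.uint8)

# Cross-fade midpoint with equal weights
v1 = np.full((4, 4), 200, dtype=np.uint8)
v2 = np.full((4, 4), 100, dtype=np.uint8)
print(blend(v1, v2, 0.5, 0.5)[0, 0])  # 150
```

Note that spatial-domain editing must run this on fully decoded pixels for every frame of the transition, which is exactly the cost the compressed-domain method described later avoids.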

[0006] Finally, the resulting edited image sequence is re-encoded. The major disadvantage of this approach is that it is computationally intensive, especially in the encoding part: the typical complexity ratio between generic encoders and decoders is approximately four. Moreover, with this conventional spatial-domain editing approach, all of the video frames coming right after the transition effect in the second sequence must be re-encoded.

[0007] Furthermore, editing operations are usually repeated several times by users before the desired result is achieved. The repetition adds to the complexity of the editing operations and requires more processing power. It is therefore important to develop efficient techniques, functioning in the compressed domain, that minimize the decoding and encoding operations needed to perform such editing effects.

[0008] In order to perform efficiently, video compression techniques exploit spatial redundancy in the frames forming the video. First, the frame data is transformed to another domain, such as the Discrete Cosine Transform (DCT) domain, to decorrelate it. The transformed data is then quantized and entropy coded.

[0009] In addition, compression techniques exploit the temporal correlation between frames: when coding a frame, utilizing the previous, and sometimes the future, frame(s) offers a significant reduction in the amount of data to compress.

[0010] The information representing the changes in areas of a frame can be sufficient to represent a consecutive frame. This is called prediction, and the frames coded in this way are called predicted (P) frames or Inter frames. As the prediction cannot be 100% accurate (unless the changes are described for every pixel), a residual frame representing the errors is also used to compensate for the imperfect prediction.

[0011] The prediction information is usually represented as vectors describing the displacement of objects in the frames. These vectors are called motion vectors. The procedure to estimate these vectors is called motion estimation. The usage of these vectors to retrieve frames is known as motion compensation.

[0012] Prediction is often applied on blocks within a frame. The block sizes vary between algorithms (e.g. 8 x 8 or 16 x 16 pixels, or more generally 2^n x 2^m pixels with n and m positive integers). Some blocks change so significantly between frames that it is better to send all the block data independently of any prior information, i.e. without prediction. These blocks are called Intra blocks.

[0013] In video sequences there are frames, which are fully coded in Intra mode. For example, the first frame of the sequence is usually fully coded in Intra mode, because it cannot be predicted from an earlier frame. Frames that are significantly different from previous ones, such as when there is a scene change, are usually also coded in Intra mode. The choice of the coding mode is made by the video encoder. Figures 1 and 2 illustrate a typical video encoder 410 and decoder 420 respectively.

[0014] The decoder 420 operates on a multiplexed video bit-stream (including video and audio), which is demultiplexed to obtain the compressed video frames. The compressed data comprises entropy-coded quantized prediction-error transform coefficients, coded motion vectors and macroblock type information. The decoded quantized transform coefficients c(x,y,t), where x, y are the coordinates of the coefficient and t stands for time, are inversely quantized to obtain transform coefficients d(x,y,t) according to the following relation:

d(x,y,t) = Q⁻¹(c(x,y,t))        (3)

where Q⁻¹ is the inverse quantization operation. In the case of scalar quantization, equation (3) becomes

d(x,y,t) = QP·c(x,y,t)

where QP is the quantization parameter. In the inverse transform block, the transform coefficients are subjected to an inverse transform to obtain the prediction error Ec(x,y,t):

Ec(x,y,t) = T⁻¹(d(x,y,t))

where T⁻¹ is the inverse transform operation, which in many compression techniques is the inverse DCT.
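The inverse quantization and inverse transform steps can be sketched as follows. This assumes scalar inverse quantization (d = QP·c) and an orthonormal 8x8 DCT as the transform T; both are common but illustrative choices, and the names are not from the patent:

```python
import numpy as np

N = 8
# Orthonormal DCT-II matrix, a stand-in for the codec's transform T
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * np.arange(N)[None, :] + 1)
                              * np.arange(N)[:, None] / (2 * N))
C[0, :] /= np.sqrt(2.0)

def inverse_quantize(c, qp):
    # Scalar inverse quantization: d(x, y, t) = QP * c(x, y, t)
    return qp * c.astype(np.float64)

def inverse_transform(d):
    # Ec = T^-1(d): inverse 2-D DCT via the orthonormal matrix
    return C.T @ d @ C

c = np.zeros((N, N))
c[0, 0] = 8                      # a single quantized DC level
d = inverse_quantize(c, qp=4)    # d[0, 0] == 32
Ec = inverse_transform(d)
print(round(Ec[0, 0], 3))        # DC energy spreads evenly: 32/8 = 4.0
```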

[0015] If the block of data is an intra-type macroblock, the pixels of the block are equal to Ec(x,y,t); in fact, as explained previously, there is no prediction, i.e.:

R(x,y,t) = Ec(x,y,t)

If the block of data is an inter-type macroblock, the pixels of the block are reconstructed by finding the predicted pixel positions on the reference frame R(x,y,t-1), retrieved from the frame memory, using the received motion vectors (Δx, Δy). The obtained predicted frame is:

P(x,y,t) = R(x+Δx, y+Δy, t-1)

The reconstructed frame is

R(x,y,t) = Ec(x,y,t) + R(x+Δx, y+Δy, t-1)

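The intra/inter reconstruction rules above can be sketched as follows, with a hypothetical motion_compensate helper that simply fetches the displaced block from the reference frame (names illustrative, not from the patent):

```python
import numpy as np

def motion_compensate(ref, dx, dy, x0, y0, size):
    # Fetch the predicted block at (x0 + dx, y0 + dy) from the reference frame
    return ref[y0 + dy : y0 + dy + size, x0 + dx : x0 + dx + size]

def reconstruct_inter(Ec, ref, dx, dy, x0, y0):
    # R(x, y, t) = Ec(x, y, t) + R(x + dx, y + dy, t - 1)
    P = motion_compensate(ref, dx, dy, x0, y0, Ec.shape[0])
    return Ec + P

ref = np.arange(64, dtype=np.float64).reshape(8, 8)  # previous frame R(.., t-1)
Ec = np.ones((4, 4))                                 # decoded prediction error
R = reconstruct_inter(Ec, ref, dx=2, dy=1, x0=0, y0=0)
print(R[0, 0])  # ref[1, 2] + 1 = 10.0 + 1 = 11.0
```

For an intra block the same sketch degenerates to R = Ec, with no reference fetch.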


[0016] In general, blending, transitional effects, logo insertion and frame superposition are editing operations which can be achieved by the following operation:

V̄(x,y,t) = Σ_{i=1..N} αi(x,y,t)·Vi(x,y,t)        (9)

where V̄(x,y,t) is the edited sequence obtained from the N original sequences Vi(x,y,t), and t is the time index at which the effect takes place. The parameter αi(x,y,t) represents the modification introduced on Vi(x,y,t) for all pixels (x,y) at the desired time t.
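The general editing operation of Equation (9) is simply a weighted sum over the N input sequences; a minimal numpy sketch (names illustrative):

```python
import numpy as np

def edit(frames, alphas):
    # Equation (9): Vbar(x,y,t) = sum over i of alpha_i(x,y,t) * V_i(x,y,t).
    # alphas may be scalars or per-pixel weight maps of the frame's shape.
    return sum(a * f for a, f in zip(alphas, frames))

v1 = np.full((2, 2), 80.0)
v2 = np.full((2, 2), 40.0)
v3 = np.full((2, 2), 20.0)
out = edit([v1, v2, v3], [0.5, 0.25, 0.25])
print(out[0, 0])  # 0.5*80 + 0.25*40 + 0.25*20 = 55.0
```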

[0017] For the sake of simplicity, we consider the case where N = 2, i.e., the editing is performed using two input sequences. Nevertheless, it is important to stress that all of the following editing discussion can be generalized to N arbitrary input frames producing one edited output frame. For N = 2, Equation (9) can be written as Equation (1):

V̄(x,y,t) = α1·V1(x,y,t) + α2·V2(x,y,t)        (1)
A method for editing video data is disclosed in document D1 (SHI-FU CHANG ET AL: "Manipulation and compositing of MC-DCT compressed Video.", IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, vol. 13, no. 1, 1 January 1995, pages 1-11, XP000492740).

Summary of the Invention



[0018] The present invention provides a method for compressed-domain operation to achieve the desired editing effects, with reduced complexity, starting substantially at any frame (at any time t). The method, according to the present invention, offers the possibility of changing the effect, including regaining the original clip. In the editing device, according to the present invention, transform coefficients of a part of the video sequence are obtained from an encoder so that they can be combined with transform coefficients of another part of the video sequence, transform coefficients of another video sequence, or transform coefficients indicative of a logo, in order to achieve video effects such as blending, sliding transitions and logo insertion.

[0019] Thus, the present invention provides a method for editing a bitstream as claimed in the appended set of claims. The method comprises:

acquiring from the bitstream data indicative of transform coefficients of at least part of the video sequence; and

modifying the acquired data in the transform domain for providing modified data in a modified bitstream in order to achieve a video effect in said at least part of the video sequence.



[0020] According to the present invention, the acquiring step includes:

decoding the bitstream for obtaining a plurality of quantized transform coefficients; and

converting the quantized transform coefficients by inverse quantization for providing the transform coefficients.



[0021] According to the present invention, the modified data contain a plurality of quantized modified transform coefficients, and the modifying step includes changing the transform coefficients for providing a plurality of modified transform coefficients. The method further comprises:
quantizing the modified transform coefficients for providing said plurality of quantized modified transform coefficients.

[0022] According to the present invention, the method further comprises:
obtaining further data indicative of a plurality of further transform coefficients, and the modifying step includes combining the further data with the acquired data for providing the modified data, and the combining step includes:

multiplying the further data by a first weighting parameter for providing a first weighted data;

multiplying the acquired data by a second weighting parameter for providing a second weighted data; and

summing the first weighted data and the second weighted data for providing the modified data.



[0023] According to the present invention, one or both of the first and second weighting parameters are adjusted to achieve a blending effect, or a sliding transitional effect. The further data can be obtained from a memory device via a transform operation, or from the same or a different bitstream.

[0024] According to the present invention, the method further comprises:

decoding the bitstream for obtaining a plurality of quantized transform coefficients;

converting the quantized transform coefficients in an inverse quantization operation for obtaining a plurality of dequantized transform coefficients for use in said modifying;

inversely transforming the dequantized transform coefficients for obtaining information indicative of a prediction error;

combining the prediction error with motion compensation information in the video data for providing further video data indicative of a reference frame;

transforming the further video data for providing transformed reference data; and

combining the transformed reference data with the transform coefficients in said modifying.



[0025] According to the present invention, the method further comprises:

obtaining a plurality of further transform coefficients from a memory device via a transform operation; and

combining the further transform coefficients with the transform coefficients in said modifying.



[0026] The second aspect of the present invention provides a video editing device for editing a bitstream carrying video data indicative of a video sequence. The device comprises:

an acquiring module, responsive to the bitstream, for acquiring data indicative of transform coefficients of at least part of the video sequence; and

a modification module, responsive to the acquired data, for changing the transform coefficients in the transform domain for providing modified data in a modified bitstream in order to achieve a video effect in said at least part of the video sequence.



[0027] According to the present invention, the acquiring module comprises:

a decoding module, responsive to the bitstream, for obtaining a plurality of quantized transform coefficients; and

an inverse quantization module, responsive to the quantized transform coefficients, for providing the transform coefficients.



[0028] According to the present invention, the transform coefficients are changed in the transform domain to become modified transform coefficients by the modification module, and the editing device further comprises:
a quantization module for quantizing the modified transform coefficients for providing a plurality of quantized modified transform coefficients in the modified data.

[0029] According to the present invention, the editing device further comprises:

a further acquiring module for obtaining further data indicative of a plurality of further transform coefficients; and

a combination module, for combining the acquired data and the further data for providing the modified data.



[0030] According to the present invention, the editing device further comprises:

a further acquiring module for obtaining further data indicative of a plurality of further transform coefficients;

an inverse transform module, responsive to the further data, for providing information indicative of a prediction error;

a combination module, responsive to the prediction error and motion compensation information in the video data, for providing reference data indicative of a reference frame; and

a transform module, responsive to the reference data, for providing transformed reference data to the modification module so as to change the transform coefficients based on the transformed reference data.



[0031] The third aspect of the present invention provides a video coding system, which comprises:

a decoder; and

an encoder for receiving a bitstream carrying video data indicative of a video sequence, wherein the encoder comprises a video editing device for editing the bitstream, wherein the editing device comprises:

an acquiring module, responsive to the bitstream, for acquiring data indicative of transform coefficients of at least part of the video sequence; and

a modification module, responsive to the acquired data, for changing the transform coefficients in the transform domain for providing modified data in a modified bitstream in order to achieve a video effect in said at least part of the video sequence, and

wherein the decoder is operable

in a first mode for reconstructing video from the video data carried in the bitstream, and

in a second mode for reconstructing video from the modified data in the modified bitstream.



[0032] The fourth aspect of the present invention provides an electronic device, which comprises:

a video data acquisition module for acquiring a bitstream carrying a video sequence having video data; and

a video editing device for editing the bitstream to achieve a video effect, wherein the editing device comprises:

a first module for obtaining from the bitstream transform coefficients of at least a part of the video sequence;

a second module for modifying the transform coefficients in the transform domain for providing modified transform coefficients; and

a third module for converting the modified transform coefficients into modified video data in a modified bitstream.



[0033] The fifth aspect of the present invention provides a software product for use in a video editing device for editing a bitstream carrying video data indicative of a video sequence. The software product comprises:

a code for extracting from the bitstream data indicative of a plurality of transform coefficients of at least part of the video sequence; and

a code for modifying the transform coefficients for providing modified data indicative of the modified transform coefficients.



[0034] The software product further comprises:
a code for mixing the transform coefficients of said at least part of the video sequence with other transform coefficients.

[0035] According to the present invention, the code for extracting comprises:

a code for decoding the bitstream for obtaining a plurality of quantized transform coefficients; and

a code for converting the quantized transform coefficients by inverse quantization for providing the transform coefficients.



[0036] According to the present invention, the code for modifying comprises:
a code for changing the transform coefficients for providing a plurality of modified transform coefficients, said software product further comprising:
a code for quantizing the modified transform coefficients for providing a plurality of quantized modified transform coefficients in a modified bitstream.

[0037] According to the present invention, the code for mixing comprises:

a code for multiplying the transform coefficients by a first weighting parameter for providing a first weighted data, and multiplying the other transform coefficients by a second weighting parameter for providing a second weighted data; and

a code for summing the first weighted data with the second weighted data for providing the modified data.



[0038] According to the present invention, the software product comprises:

a code for extracting stored data from a memory for providing further data; and

a code for transforming the further data for providing the other transform coefficients.



[0039] According to the present invention, the software product comprises:

a code for decoding the bitstream for obtaining a plurality of quantized transform coefficients;

a code for converting the quantized transform coefficients in an inverse quantization operation for obtaining a plurality of dequantized transform coefficients;

a code for inversely transforming the dequantized transform coefficients for obtaining information indicative of a prediction error;

a code for combining the prediction error with motion compensation information in the video data for providing further video data indicative of a reference frame;

a code for transforming the further video data for providing transformed reference data; and

a code for mixing the transformed reference data with the transform coefficients for providing the modified data.



[0040] The present invention will become apparent upon reading the description taken in conjunction with Figures 3-13.

Brief Description of the Drawings



[0041] 

Figure 1 is a block diagram illustrating a prior art video encoder process.

Figure 2 is a block diagram illustrating a prior art video decoder process.

Figure 3 is a schematic representation showing a typical video-editing channel.

Figure 4 is a block diagram illustrating an embodiment of the compressed domain approach to dissolve effects for intra frames, according to the present invention.

Figure 5 is a block diagram illustrating an embodiment of the compressed domain approach to dissolve effects for inter frames, according to the present invention.

Figure 6 is a block diagram illustrating an embodiment of the compressed domain approach to logo insertion with blending, according to the present invention.

Figure 7 is a block diagram showing an embodiment of the compressed domain approach to logo insertion.

Figure 8 is a block diagram showing an expanded video encoder, which can be used for compressed-domain video editing, according to the present invention.

Figure 9 is a block diagram showing an expanded video decoder, which can be used for compressed-domain video editing, according to the present invention.

Figure 10 is a block diagram showing another expanded video decoder, which can be used for compressed-domain video editing, according to the present invention.

Figure 11a is a block diagram showing an electronic device having a compressed-domain video editing device, according to the present invention.

Figure 11b is a block diagram showing another electronic device having a compressed-domain video editing device, according to the present invention.

Figure 11c is a block diagram showing yet another electronic device having a compressed-domain video editing device, according to the present invention.

Figure 11d is a block diagram showing still another electronic device having a compressed-domain video editing device, according to the present invention.

Figure 12 is a schematic representation showing the software programs for providing the editing effects.

Figure 13 is a schematic representation showing another software program for providing the editing effects.


Detailed Description of the Invention



[0042] The present invention is mainly concerned with transitional effects between different video sequences, logo insertion and overlaying of video sequences while the sequences are in compressed format. As such, the editing effects are applied to the video sequences without requiring full decoding and re-encoding. Thus, the present invention is concerned with blending and logo insertion operations in video editing. Blending is the operation of combining or joining sequences, overlaying for the entire frames or part of the frames in the sequences. Logo insertion is the operation of inserting a logo, which can be an image or graphic at a particular area of the frames in the video sequences.

[0043] Transition-effect editing between two frames can be broken down into performing such operations between the corresponding macroblocks of the two frames. As explained above, macroblocks in compressed video are of two types: Intra and Inter. Hence, there are four different combinations for applying editing effects between macroblocks. We will present how to achieve the above effects for each of these combinations.

[0044] In general, editing operations are performed on a video clip entering a channel at one terminal; the edited video clip is output at the other terminal, as shown in Figure 3. Video editing operations can start at time t. From that time on, the bitstream is modified in order to add the desired effects, as described in the following.

Blending of an intra block with an intra block



[0045] In the spatial domain, this operation is performed as follows:

V̄(x,y,t) = α1(t)·V1(x,y,t) + α2(t)·V2(x,y,t)

For Intra frames, using the steps of the earlier section, we have

V̄(x,y,t) = α1(t)·E1(x,y,t) + α2(t)·E2(x,y,t)

After taking the transform of the frame with the special effects applied, the same operation can be formulated in the compressed domain as:

e(x,y) = α1(t)·d1(x,y) + α2(t)·d2(x,y)        (11)

The transform-domain approach significantly simplifies the blending operation, as can be seen from Figure 4.

[0046] Figure 4 illustrates an embodiment of the present invention providing a compressed-domain solution to dissolve transitional effects for Intra frames. Both compressed bitstreams 100, 100' are partially decoded in the corresponding demultiplexing units 10 to obtain the quantized transform coefficients 110, 110', or c1(i,j) and c2(i,j). The quantized transform coefficients are inverse quantized in the inverse quantization blocks 20 to obtain inverse-quantized transform coefficients 120, or d1(i,j), and 120', or d2(i,j). The coefficients d1(i,j) and d2(i,j) are scaled by α1(t) and α2(t), respectively, in blocks 22 and 22' to become scaled coefficients 122, 122'. The resulting coefficients 122, 122' are then summed by a summing device 24 to produce a weighted sum 124, or d12(i,j) (see Equation 11). The weighted sum 124 is re-quantized in the quantization block 26 to produce quantized coefficients 126. Finally, the quantized coefficients 126 are sent to a multiplexing unit 70, which performs entropy coding and multiplexing with other required information to produce a valid compressed video bitstream 170.
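The equivalence exploited in Figure 4 is the linearity of the transform: scaling and summing DCT coefficients gives the same result as transforming the spatially blended block. A numpy sketch, with an orthonormal 8x8 DCT standing in for the codec's transform (an illustrative assumption):

```python
import numpy as np

N = 8
# Orthonormal DCT-II matrix, a stand-in for the codec's transform T
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * np.arange(N)[None, :] + 1)
                              * np.arange(N)[:, None] / (2 * N))
C[0, :] /= np.sqrt(2.0)
dct2 = lambda b: C @ b @ C.T  # forward 2-D DCT of one 8x8 block

rng = np.random.default_rng(0)
E1 = rng.standard_normal((N, N))  # intra block data of clip 1
E2 = rng.standard_normal((N, N))  # intra block data of clip 2
a1, a2 = 0.7, 0.3

# Transform-domain blend (Figure 4): scale and sum the coefficients ...
d12 = a1 * dct2(E1) + a2 * dct2(E2)
# ... which matches transforming the spatial-domain blend, by linearity of T
print(np.allclose(d12, dct2(a1 * E1 + a2 * E2)))  # True
```

This is why no inverse transform, motion compensation or forward transform is needed for the intra/intra case; only inverse quantization, scaling, summing and re-quantization remain.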

[0047] It should be understood that it is possible to combine the inverse quantization, scaling and quantization blocks or to combine the scaling and quantization blocks into a single coding block.

[0048] This process is repeated for both luminance and chrominance components of the video bitstream.

Blending of an inter block with an inter block



[0049] Inter frames are reconstructed by summing the residual error with the motion-compensated prediction:

V1(x,y,t) = E1(x,y,t) + R1(x+Δx1, y+Δy1, t-1)

and similarly,

V2(x,y,t) = E2(x,y,t) + R2(x+Δx2, y+Δy2, t-1)


[0050] The spatial-domain representation of the dissolve effect is formulated as follows:

V̄(x,y,t) = α1(t)·[E1(x,y,t) + R1(x+Δx1, y+Δy1, t-1)] + α2(t)·[E2(x,y,t) + R2(x+Δx2, y+Δy2, t-1)]




[0051] Note that R̄(x+Δx1, y+Δy1, t-1) is the previously reconstructed frame after the fading effects, and it can be re-written in terms of R1(x+Δx1, y+Δy1, t-1) and R2(x+Δx1, y+Δy1, t-1), which represent the frames that would have been reconstructed if the transitional effects had not been applied:

R̄(x+Δx1, y+Δy1, t-1) = α1(t-1)·R1(x+Δx1, y+Δy1, t-1) + α2(t-1)·R2(x+Δx1, y+Δy1, t-1)


[0052] Then the prediction residual can be calculated by:

Ē(x,y,t) = V̄(x,y,t) - R̄(x+Δx1, y+Δy1, t-1)
         = α1(t)·E1(x,y,t) + α2(t)·E2(x,y,t) - (α1(t-1) - α1(t))·R1(x+Δx1, y+Δy1, t-1) + α2(t)·R2(x+Δx2, y+Δy2, t-1) - α2(t-1)·R2(x+Δx1, y+Δy1, t-1)


[0053] Taking the transform of the new residual data, we have the blending effect of two inter blocks in the transform domain:

e(x,y) = α1(t)·d1(x,y) + α2(t)·d2(x,y) - (α1(t-1) - α1(t))·T(R1(x+Δx1, y+Δy1, t-1)) + α2(t)·T(R2(x+Δx2, y+Δy2, t-1)) - α2(t-1)·T(R2(x+Δx1, y+Δy1, t-1))

Blending of an intra block with an inter block



[0054] The spatial-domain representation of the dissolve effect can be formulated as follows:

V̄(x,y,t) = α1(t)·E1(x,y,t) + α2(t)·[E2(x,y,t) + R2(x+Δx2, y+Δy2, t-1)]


[0055] Since the output is an intra block, i.e., there is no prediction, the transform of the block is given by

e(x,y) = α1(t)·d1(x,y) + α2(t)·[d2(x,y) + T(R2(x+Δx2, y+Δy2, t-1))]        (15)


[0056] Equation (15) gives the result of blending an intra block with an inter block in the transform domain.

Blending of an inter block with an intra block



[0057] The spatial-domain representation of the dissolve effect is then formulated as follows:

V̄(x,y,t) = α1(t)·[E1(x,y,t) + R1(x+Δx1, y+Δy1, t-1)] + α2(t)·V2(x,y,t)

or, since the second block is intra-coded (V2(x,y,t) = E2(x,y,t)),

V̄(x,y,t) = α1(t)·[E1(x,y,t) + R1(x+Δx1, y+Δy1, t-1)] + α2(t)·E2(x,y,t)


[0058] Again, R̄(x+Δx1, y+Δy1, t-1) is the previously reconstructed frame after the fading effects and can be re-written in terms of R1(x+Δx1, y+Δy1, t-1) and R2(x+Δx1, y+Δy1, t-1), which represent the frames that would have been reconstructed if the transition effects had not been applied:

R̄(x+Δx1, y+Δy1, t-1) = α1(t-1)·R1(x+Δx1, y+Δy1, t-1) + α2(t-1)·R2(x+Δx1, y+Δy1, t-1)


[0059] The prediction residual can be calculated by:

Ē(x,y,t) = V̄(x,y,t) - R̄(x+Δx1, y+Δy1, t-1)
         = α1(t)·E1(x,y,t) + α2(t)·E2(x,y,t) - (α1(t-1) - α1(t))·R1(x+Δx1, y+Δy1, t-1) - α2(t-1)·R2(x+Δx1, y+Δy1, t-1)


[0060] Taking the transform of the new residual data, we have the effect of blending an inter block with an intra block:

e(x,y) = α1(t)·d1(x,y) + α2(t)·d2(x,y) - (α1(t-1) - α1(t))·T(R1(x+Δx1, y+Δy1, t-1)) - α2(t-1)·T(R2(x+Δx1, y+Δy1, t-1))        (17)

Blending of an inter block with an intra block for the first intra frame



[0061] This is a special case of blending an inter block with an intra block, applied to the first Intra frame. Note that this case can be expressed by α2(t-1) = 0; the rest of the process follows the preceding analysis. By applying α2(t-1) = 0 to Equation (17), we obtain the final residual coefficients in the transform domain as follows:

e(x,y) = α1(t)·d1(x,y) + α2(t)·d2(x,y) - (α1(t-1) - α1(t))·T(R1(x+Δx1, y+Δy1, t-1))


[0062] These transform coefficients e(x,y) are then quantized and sent to the entropy coder.

[0063] Figure 5 demonstrates an embodiment of the present invention providing a compressed-domain solution to dissolve transitional effects between Inter macroblocks. As shown in Figure 5, the coding device 5' comprises two decoders, which are capable of decoding the two compressed bitstreams 100, 100' into decoded video sequences 132, 132'. Part of each decoder is similar to a conventional decoder for inter-block decoding, as shown in Figure 2. Thus, the decoding of the compressed bitstreams 100, 100' into the decoded video sequences 132, 132' in the spatial domain can be carried out in a conventional fashion. However, the coding device 5' further comprises a number of processing blocks to produce special dissolve effects in an edited bitstream 170, in addition to the decoded video sequences 132, 132'.

[0064] Similar to the process shown in Figure 4, the quantized transform coefficients 110, 110', or c1(i,j) and c2(i,j), are inverse quantized in the inverse quantization blocks 20 to obtain inverse-quantized transform coefficients 120, or d1(i,j), and 120', or d2(i,j). The coefficients d1(i,j) and d2(i,j) are scaled by α1(t) and α2(t), respectively, in blocks 22, 22' to become scaled coefficients 122, 122'. The resulting coefficients are summed by a summing device 24; the summing result d12(i,j) is denoted by reference numeral 124. Meanwhile, the predicted frames 136, or R1(x+Δx1, y+Δy1, t-1), and 136', or R2(x+Δx2, y+Δy2, t-1), are subjected to transform coding in the transform blocks 38, 38'. Furthermore, using the motion vectors of the first video clip and the reconstructed frames of the second video clip, a reference block 137', or R2(x+Δx1, y+Δy1, t-1), is obtained through the motion compensation prediction block 36'. The reference block 137' is also subjected to transform coding by a transform block 39'. After the transform operations, the transform coefficients 138, 138' and 139', respectively, of R1(x+Δx1, y+Δy1, t-1), R2(x+Δx2, y+Δy2, t-1) and R2(x+Δx1, y+Δy1, t-1), are scaled by (α1(t-1) - α1(t)), -α2(t) and α2(t-1), respectively. The scaled transform coefficients are then subtracted from d12(i,j) in the summing block 25. The final resulting coefficients 125, or e(i,j), are then quantized in the quantization block 26. Finally, the quantized coefficients 126 are sent to a multiplexing unit 70, which performs entropy coding and multiplexing with other required information to produce a valid compressed video bitstream 170.

[0065] It should be understood that it is possible to combine the inverse quantization, scaling and quantization blocks or to combine the scaling and quantization blocks into a single coding block.

[0066] This process is repeated for both luminance and chrominance components of the video bitstream.

[0067] In typical applications, the above-described process can be further improved. For example, it is possible to allow only the selected transition frames to go through the method of producing the edited bitstream 170, according to the present invention; for frames that are not transition frames, the operations can be skipped. This improvement can be carried out by setting one of the weighting parameters in the above-described case to 0: α1(t) = 0 or α2(t) = 0. When α2(t) = 0, there is no need to compute the transform coefficients 138' of R2(x+Δx2, y+Δy2, t-1). Likewise, when α2(t-1) = 0, there is no need to compute 137', or R2(x+Δx1, y+Δy1, t-1). When α1(t-1) = α1(t), there is no need to compute the transform coefficients 138 of R1(x+Δx1, y+Δy1, t-1).

[0068] When α2(t-1)=α2(t), the transform coefficients of R2(x+Δx2, y+Δy2, t-1) and R2(x+Δx1, y+Δy1, t-1) need not be computed separately in different coding blocks, but can be computed as follows. After computing both R2(x+Δx2, y+Δy2, t-1) and R2(x+Δx1, y+Δy1, t-1), the block R2(x+Δx2, y+Δy2, t-1) is subtracted from R2(x+Δx1, y+Δy1, t-1). The difference is subjected to transform coding in one of the transform blocks, such as the block 39'. The result is scaled by α2(t-1) or α2(t), and the scaled result is fed to the summing block 25. The remaining steps are identical to the process as described in conjunction with Figure 5 above.
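The saving exploited here rests on the linearity of the block transform: scaling and summing two transformed blocks gives the same result as transforming their pixel-domain difference once and scaling once. The following sketch demonstrates this with a toy 2x2 orthogonal (Hadamard-type) transform standing in for the codec's DCT; the names hadamard2, scale2 and sub2 are illustrative.

```python
def hadamard2(blk):
    """Toy 2x2 orthogonal transform, a stand-in for the codec's linear DCT."""
    a, b = blk[0]
    c, d = blk[1]
    return [[(a + b + c + d) / 2, (a - b + c - d) / 2],
            [(a + b - c - d) / 2, (a - b - c + d) / 2]]

def scale2(blk, alpha):
    return [[alpha * v for v in row] for row in blk]

def sub2(x, y):
    return [[xv - yv for xv, yv in zip(rx, ry)] for rx, ry in zip(x, y)]

# Two motion-compensated reference blocks from the second clip, as in [0068]:
r2_mv1 = [[9, 7], [5, 3]]   # R2 fetched with the first clip's motion vector
r2_mv2 = [[1, 2], [3, 4]]   # R2 fetched with its own motion vector
alpha2 = 0.6                # common weight when alpha2(t-1) == alpha2(t)

# Two transforms and two scalings ...
separate = sub2(scale2(hadamard2(r2_mv1), alpha2),
                scale2(hadamard2(r2_mv2), alpha2))
# ... equal one pixel-domain subtraction, one transform, one scaling:
combined = scale2(hadamard2(sub2(r2_mv1, r2_mv2)), alpha2)
```

Both routes produce identical coefficient blocks (up to floating-point rounding), which is why the single-transform variant is preferred.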

Sliding Transitional Effect



[0069] The sliding transitional effect, also known as the "wipe" effect, makes one video clip slide into the other during the transition. This can be accomplished by assigning appropriate weights α(x,y,t) that depend on the spatial location (x, y) in the frame. For the frames V1(x,y,t), the weights are set to α1(x,y,t) = 0 or α1(x,y,t) = 1 in order to dictate which parts of frame 1 are to be included in the sliding transition. Likewise, setting α2(x,y,t) = 0 or α2(x,y,t) = 1 dictates which parts of frame 2 are to be included.
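For illustration, a left-to-right wipe can be built from a binary weight mask that advances with the transition; this is a sketch under assumed conventions, and the name wipe_weights and the progress parameterization are not the patent's notation.

```python
def wipe_weights(width, height, progress):
    """Binary spatial weights for a left-to-right wipe.

    progress runs from 0.0 (only clip 1 visible) to 1.0 (only clip 2
    visible).  alpha1 + alpha2 == 1 at every pixel, so each location is
    taken from exactly one of the two clips, as required in [0069].
    """
    boundary = int(progress * width)   # columns left of this show clip 2
    alpha2 = [[1 if x < boundary else 0 for x in range(width)]
              for _ in range(height)]
    alpha1 = [[1 - a for a in row] for row in alpha2]
    return alpha1, alpha2
```

Stepping progress from 0 to 1 over the transition frames slides the boundary across the picture, producing the wipe.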

Logo Insertion



[0070] Logo insertion can be accomplished in different ways. One way is logo insertion with blending, as shown in Figure 6. Alternatively, logo insertion can be carried out without blending, as shown in Figure 7.

[0071] In logo insertion with blending, the transform coefficients 120 from one of the decoders (see Figure 5) are replaced by the transform coefficients of the logo stored in a logo memory 40, as shown in Figure 6. As shown, the logo frames or sequence 140 is transformed into transform coefficients 141 by a transform block 41. The transform coefficients 141 and the coefficients 120 are summed by the summing block 24 after scaling. At the same time, the logo frames are processed by a Motion Compensation prediction block 36' to produce the predicted frames 137'. The result is transformed into transform-domain coefficients 139'. The remaining steps are similar to those depicted in Figure 5.
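Because the transform is linear, the blending at the summing block 24 reduces to a per-coefficient weighted sum of the video block and the logo block. A minimal sketch follows; beta is an assumed opacity parameter and blend_logo_block an illustrative name, neither taken from the patent.

```python
def blend_logo_block(d_video, d_logo, beta):
    """Blend co-located transform-coefficient blocks of video and logo.

    beta = 0.0 leaves the video block untouched; beta = 1.0 replaces it
    entirely with the logo's coefficients (logo insertion without blending
    at that block position).
    """
    return [[(1 - beta) * v + beta * l for v, l in zip(rv, rl)]
            for rv, rl in zip(d_video, d_logo)]
```

Only blocks covered by the logo need this treatment; all other blocks of the frame can be passed through unchanged.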

[0072] Logo insertion without blending is shown in Figure 7. As shown, the transform coefficients 141 are mixed with the inverse quantized transform coefficients 120 from the compressed bitstream 100 as well as with the predicted frames based on the edited bitstream 126.

Superposition of multiple sequences or frames



[0073] In the above-described editing processes, the number of input sequences N is set to 2 (Equation 1). Similarly, the number of frames n used in motion prediction is also set to 2. However, the method of transform-domain editing, according to the present invention, can be generalized such that the number of frames can be extended from n=2 to n=N, with N being a positive integer larger than 2.
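Generalized to N inputs, the weighted combination of Equation 1 becomes a sum over N co-located coefficient blocks. The sketch below illustrates this; combine_n is an illustrative name, not the patent's.

```python
def combine_n(coeff_blocks, weights):
    """Weighted superposition of N co-located transform-coefficient blocks.

    For a plain blend the weights would normally sum to 1; nothing here
    enforces that, since transition effects vary the weights over time.
    """
    rows, cols = len(coeff_blocks[0]), len(coeff_blocks[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for blk, alpha in zip(coeff_blocks, weights):
        for i in range(rows):
            for j in range(cols):
                out[i][j] += alpha * blk[i][j]
    return out
```

The N=2 case reduces to the summing device 24 of Figures 4 and 5.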

[0074] The compressed-domain editing modules as shown in Figures 4 to 7 can be incorporated into conventional encoders and decoders as shown in Figures 1 and 2. For example, a conventional encoder 410 can be operatively connected to an editing module 5, 5' or 7 of the present invention. As shown in Figure 8, the expanded encoder 610 has a switch to select which bitstream to be sent to a decoder. Without editing, the original bitstream 100 is sent. With editing, the edited bitstream 170 is sent. As such, the expanded encoder 610 can be used as a typical encoder, or it can be used for compressed-domain video editing.

[0075] Each of the editing modules 5, 5' and 7 can also be incorporated in an expanded decoder 620 as shown in Figure 9. As shown, the decoder 420 can accept an original bitstream 100, or an edited bitstream 170 from the editing module 5, 5' or 7. As such, the expanded decoder 620 can be used as a typical decoder, or it can be used for compressed-domain video editing.

[0076] The editing module 8 of Figure 6 can also be used along with a conventional decoder 420 in an expanded decoder 630. As shown, the decoded video sequences of the original bitstream 100 can be obtained directly from the upper part 6 of the editing module 8 (see Figure 6). Alternatively, the bitstream 100 can be edited by the lower part 5" of the editing module 8.

[0077] The expanded encoder 610 can be integrated into an electronic device 710, 720 or 730 to provide compressed domain video editing capability to the electronic device, as shown separately in Figures 11a to 11c. As shown in Figure 11a, the electronic device 710 comprises an expanded encoder 610 to receive video input. The bitstream from the output of the encoder 610 is provided to a decoder 420 so that the decoded video can be viewed on a display, for example. As shown in Figure 11b, the electronic device 720 comprises a video camera for taking video pictures. The video signal from the video camera is conveyed to an expanded encoder 610, which is operatively connected to a storage medium. The video input from the video camera can be edited to achieve one or more video effects, as discussed previously. As shown in Figure 11c, the electronic device 730 comprises a transmitter to transmit the bitstream from the expanded encoder 610. As shown in Figure 11d, the electronic device 740 comprises a receiver to receive a bitstream containing video data. The video data is conveyed to an expanded decoder 620 or 630. The output from the expanded decoder is conveyed to a display for viewing. The electronic devices 710, 720, 730, 740 can be a mobile terminal, a computer, a personal digital assistant, a video recording system or the like.

[0078] It should be understood that the video effects provided in blocks 22, 22', as shown in Figures 4, 5 and 6, can be achieved by software programs 422, 424, as shown in Figure 12. For example, these software programs have a first code for providing editing data indicative of α(x,y,t) and a second code for applying this editing data to the transform coefficients d(x,y,t) by a multiplication operation. The second code can also have a summing operation to combine the scaled transform coefficients 122, 122', 142. Moreover, the summing operation in both the block 24 and the block 25 (see Figures 5 and 6) can be carried out by a software program 426 in a summing module 28, as shown in Figure 13.

[0079] In sum, the present invention provides a method and device for editing a bitstream carrying video data in a video sequence. The editing procedure includes:
  • decoding the bitstream to obtain quantized transform coefficients of the video sequence;
  • inversely quantizing the quantized coefficients to obtain transform coefficients;
  • modifying the transform coefficients in the transform domain;
  • quantizing the modified transform coefficients.
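These steps can be sketched end to end as follows. The sketch starts from already entropy-decoded quantized coefficients, assumes a uniform scalar quantizer with step QP, and uses a 50/50 blend of two co-located blocks as the modification step; all names are illustrative.

```python
QP = 2.0  # assumed uniform quantization step

def edit_block(c1, c2):
    """Inverse quantize -> modify in the transform domain -> requantize."""
    d1 = [[QP * v for v in row] for row in c1]           # inverse quantization
    d2 = [[QP * v for v in row] for row in c2]
    e = [[0.5 * a + 0.5 * b for a, b in zip(r1, r2)]     # transform-domain edit
         for r1, r2 in zip(d1, d2)]
    return [[round(v / QP) for v in row] for row in e]   # requantization
```

Because no inverse transform or re-encoding is performed, the editing cost is a small number of multiply-accumulate operations per coefficient.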


[0080] The transform coefficients can be modified by combining the transform coefficients with other transform coefficients by way of weighted summation, for example. The other transform coefficients can be obtained from the same video sequence or from a different video sequence. They can also be obtained from a memory via a transform module.

[0081] Many or all of these method steps can be carried out by software codes in a software program.

[0082] Thus, although the invention has been described with respect to a preferred embodiment thereof, it will be understood by those skilled in the art that the foregoing and various other changes, omissions and deviations in the form and detail thereof may be made without departing from the scope of this invention.


Claims

1. A method of editing effects in a bitstream (100) carrying video data indicative of a first video sequence, said method comprising:

- decoding the bitstream (100) and a different bitstream (100') to obtain respectively a first decoded video sequence (132) and a second decoded video sequence (130');

- obtaining, from the bitstream (100) carrying video data indicative of a first video sequence, data (120) indicative of transform coefficients;

- obtaining, from the different bitstream (100') carrying video data indicative of a second video sequence, further data (120') indicative of a plurality of further transform coefficients;

- obtaining predicted video data (137') based on motion vectors from the first decoded video sequence (132) and on a previously reconstructed frame of the second decoded video sequence (132');

- performing a transform coding (39') on the predicted video data (137') to obtain transformed predicted video data (139'); and

- modifying (22, 24, 25) the first video sequence data (120) in the transform domain for providing modified transform coefficients in a valid compressed video bitstream (170), wherein said modifying (22, 24, 25) comprises combining (22, 22', 24, 25) the obtained data (120) with the obtained further data (120') and the transformed predicted video data (139').


 
2. The method of claim 1, characterized in that said acquiring comprises:

demultiplexing the bitstream (100) for obtaining a plurality of quantized transform coefficients (110); and

converting the quantized transform coefficients (110) by inverse quantization for providing the transform coefficients.


 
3. The method of claim 2, characterized in that the modified data contain a plurality of quantized modified transform coefficients (126), and said modifying comprises changing the transform coefficients for providing a plurality of modified transform coefficients (125), said method further characterized by:
quantizing (26) the modified transform coefficients (125) for providing a plurality of quantized modified transform coefficients (126) in the modified bitstream (170).
 
4. The method of any of the preceding claims, characterized in that said combining comprises:

multiplying (22') the further data (120') by a first weighting parameter for providing a first weighted data (122');

multiplying (22) the acquired data (120) by a second weighting parameter for providing a second weighted data (122); and

summing the first weighted data (122') and the second weighted data (122).


 
5. The method of claim 4, characterized in that one or both of the first and second weighting parameters are adjusted to achieve a blending effect.
 
6. The method of claim 4, characterized in that one or both of the first and second weighting parameters are adjusted to achieve a sliding transitional effect.
 
7. The method of claim 1, further characterized by:

demultiplexing the bitstream (100) for obtaining a plurality of quantized transform coefficients (110); and

converting the quantized transform coefficients (110) in an inverse quantization operation (20) for obtaining a plurality of dequantized transform coefficients (120) for use in said modifying.


 
8. The method of claim 7, further characterized by:

inversely transforming (30) the dequantized transform coefficients (120) for obtaining information indicative of a prediction error;

combining (32) the prediction error with motion compensation information in the video data for providing further video data indicative of a reference frame;

transforming the further video data for providing transformed reference data; and

combining the transformed reference data with the acquired data (120), the obtained further data (120') and the transformed predicted video data (139') in said modifying.


 
9. The method of claim 1, said method further comprising:

- obtaining first predicted video data (136) based on motion vectors from the first decoded video sequence (132) and on a previously reconstructed frame of said video sequence (132);

- performing a transform coding (38) on the first predicted video data (136) to obtain transformed first predicted video data (138);

- obtaining second predicted video data (136') based on motion vectors from the different second video sequence (132') and on a previously reconstructed frame of said second video sequence (132');

- performing a transform coding (38') on the second predicted video data (136') to obtain transformed second predicted video data (138'); and wherein said modifying (22, 24, 25) comprises combining (22, 22', 24, 25) the acquired data (120) with the obtained further data (120'), the transformed predicted video data (139'), the transformed first predicted video data (138) and the transformed second predicted video data (138').


 
10. A video editing device comprising:

- a decoding module for decoding the bitstream (100) and a different bitstream (100') to obtain respectively a first decoded video sequence (132) and a second decoded video sequence (130');

- an acquiring module, responsive to a bitstream (100) carrying video data indicative of a first video sequence, for obtaining data (120) indicative of transform coefficients;

- a further acquiring module for obtaining, from the different bitstream (100') carrying video data indicative of a second video sequence, further data (120') indicative of a plurality of further transform coefficients;

- a prediction module for obtaining predicted video data (137') based on motion vectors from the first decoded video sequence (132) and on a previously reconstructed frame of the second decoded video sequence (132');

- a transform module (39') for performing a transform coding on the predicted video data (137') to obtain transformed predicted video data (139'); and

- a modification module, responsive to the first video sequence data (120), for changing the transform coefficients in the transform domain for providing modified transform coefficients in a valid compressed video bitstream (170), wherein said modification module comprises a combination module, for combining at least the obtained data (120), the obtained further data (120') and the transformed predicted video data (139') for providing the modified data.


 
11. The editing device of claim 10, characterized in that the acquiring module comprises:

a demultiplexing module (10), responsive to the bitstream (100), for obtaining a plurality of quantized transform coefficients (110); and

an inverse quantization module (20), responsive to the quantized transform coefficients (110), for providing the transform coefficients.


 
12. The editing device of claim 11, characterized in that the transform coefficients are changed in the transform domain to become modified transform coefficients (125) by the modification module, said editing device further characterized by:
a quantization module (26) for quantizing the modified transform coefficients (125) for providing a plurality of quantized modified transform coefficients (126) in the modified data.
 
13. The editing device of claim 10, further characterized by:

an inverse transform module (30), responsive to the obtained further data, for providing information indicative of a prediction error;

a combination module (32), responsive to the prediction error and motion compensation information in the video data, for providing reference data indicative of a reference frame; and

a transform module, responsive to the reference data, for providing transformed reference data to the modification module so as to change the transform coefficients based on the transformed reference data.


 
14. The editing device of claim 10, further comprising:

- a module for obtaining first predicted video data (136) based on motion vectors from the first decoded video sequence (132) and on a previously reconstructed frame of the said first decoded video sequence (132);

- a module for performing a transform coding (38) on the first predicted video data (136) to obtain transformed first predicted video data (138);

- a module for obtaining second predicted video data (136') based on motion vectors from the different bitstream and on a previously reconstructed frame of the different bitstream;

- a module for performing a transform coding (38') on the second predicted video data (136') to obtain transformed second predicted video data (138'); and wherein in said combining the acquired data (120), the obtained further data (120'), the transformed predicted video data (139'), the transformed first predicted video data (138) and the transformed second predicted video data (138') are combined.


 
15. A video coding system, comprising:

a decoder and an encoder, the encoder comprising a video editing device according to any of claims 10 to 14,

wherein the decoder is operable in a first mode for reconstructing video from the video data carried in the bitstream (100), and in a second mode for reconstructing video from the modified data in the modified bitstream (170).


 
16. A software product embedded in a computer readable medium for use in a video editing device for editing effects in a bitstream (100) carrying video data indicative of a first video sequence, said software product comprising:

- a code for decoding the bitstream (100) and a different bitstream (100') to obtain respectively a first decoded video sequence (132) and a second decoded video sequence (130');

- a code for obtaining, from the bitstream (100) carrying video data indicative of a first video sequence, data (120) indicative of a plurality of transform coefficients;

- a code for obtaining, from the different bitstream (100') carrying video data indicative of a second video sequence, further data (120') indicative of a plurality of further transform coefficients;

- a code for obtaining predicted video data (137') based on motion vectors from the first decoded video sequence (132) and on a previously reconstructed frame of the second decoded video sequence (132');

- a code for performing a transform coding (39') on the predicted video data (137') to obtain transformed predicted video data (139'); and

- a code for modifying (22, 24, 25) the first video sequence data (120) in the transform domain for providing modified transform coefficients in a valid compressed video bitstream (170), wherein said modifying (22, 24, 25) comprises combining (22, 22', 24, 25) the obtained data (120) with the obtained further data (120') and the transformed predicted video data (139').


 
17. The software product of claim 16, further characterized by:
a code for mixing the transform coefficients of said at least part of the first video sequence with other transform coefficients.
 
18. The software product of claim 16, characterized in that the code for acquiring comprises:

a code for demultiplexing the bitstream (100) for obtaining a plurality of quantized transform coefficients (110); and

a code for converting the quantized transform coefficients (110) by inverse quantization for providing the transform coefficients.


 
19. The software product of claim 18, characterized in that the code for modifying comprises:
a code for changing the transform coefficients for providing a plurality of modified transform coefficients (125), said software product further characterized by:
a code for quantizing (26) the modified transform coefficients (125) for providing a plurality of quantized modified transform coefficients (126) in a modified bitstream (170).
 
20. The software product of claim 17, characterized in that the code for mixing comprises:

a code for multiplying (22') the transform coefficients by a first weighting parameter for providing a first weighted data (122'), and multiplying (22) the other transform coefficients by a second weighting parameter for providing a second weighted data (122); and

a code for summing the first weighted data (122') with the second weighted data (122).


 
21. The software product of claim 16, further characterized by:

a code for demultiplexing the bitstream (100) for obtaining a plurality of quantized transform coefficients (110);

a code for converting the quantized transform coefficients (110) in an inverse quantization operation for obtaining a plurality of dequantized transform coefficients (120);

a code for inversely transforming (30) the dequantized transform coefficients (120) for obtaining information indicative of a prediction error;

a code for combining (32) the prediction error with motion compensation information in the video data for providing further video data indicative of a reference frame;

a code for transforming the further video data for providing transformed reference data; and

a code for mixing the transformed reference data with the acquired data (120), the obtained further data (120') and the transformed predicted video data (139').


 
22. The software product of claim 16, further comprising:

- a code for obtaining first predicted video data (136) based on motion vectors from the bitstream and on a previously reconstructed frame of the bitstream;

- a code for performing a transform coding (38) on the first predicted video data (136) to obtain transformed first predicted video data (138);

- a code for obtaining second predicted video data (136') based on motion vectors from the different bitstream and on a previously reconstructed frame of the different bitstream;

- a code for performing a transform coding (38') on the second predicted video data (136') to obtain transformed second predicted video data (138'); and wherein said modifying (22, 24, 25) comprises combining (22, 22', 24, 25) the acquired data (120) with the obtained further data (120'), the transformed predicted video data (139'), the transformed first predicted video data (138) and the transformed second predicted video data (138').


 


Ansprüche

1. Verfahren zur Editierung von Effekten in einem Bitstrom (100), der Videodaten enthält, welche eine erste Videosequenz angeben, wobei das Verfahren aufweist:

- das Decodieren des Bitstroms (100) und eines anderen Bitstroms (100'), um eine erste decodierte Videosequenz (132) beziehungsweise eine zweite decodierte Videosequenz (130') zu erhalten;

- das Erhalten von Daten (120), die Transformationskoeffizienten angeben, aus dem Bitstrom (100), der Videodaten enthält, welche eine erste Videosequenz angeben;

- das Erhalten weiterer Daten (120'), welche mehrere weitere Transformationskoeffizienten angeben, aus dem anderen Bitstrom (100'), welcher Videodaten enthält, die eine zweite Videosequenz angeben;

- das Erhalten vorausgesagter Videodaten (137') auf der Basis von Bewegungsvektoren aus der ersten decodierten Videosequenz (132) und auf der Basis eines zuvor rekonstruierten Frames der zweiten dekodierten Videosequenz (132');

- das Durchführen einer Transformationscodierung (39') an den vorhergesagten Videodaten (137'), um transformierte vorhergesagte Videodaten (139') zu erhalten; und

- das Modifizieren (22, 24, 25) der ersten Videosequenzdaten (120) in der Transformationsdomäne, um modifizierte Transformationskoeffizienten in einem gültigen komprimierten Video-Bitstrom (170) bereitzustellen, wobei das Modifizieren (22, 24, 25) das Kombinieren (22, 22', 24, 25) der erhaltenen Daten (120) mit den erhaltenen weiteren Daten (120') und den transformierten vorhergesagten Videodaten (139') aufweist.


 
2. Verfahren nach Anspruch 1, dadurch gekennzeichnet, dass das Erfassen aufweist:

das Demultiplexieren des Bitstroms (100) zum Erhalten mehrerer quantisierter Transformationskoeffizienten (110); und

das Umwandeln der quantisierten Transformationskoeffizienten (110) durch inverses Quantisieren zur Bereitstellung der Transformationskoeffizienten.


 
3. Verfahren nach Anspruch 2, dadurch gekennzeichnet, dass die modifizierten Daten mehrere quantisierte modifizierte Transformationskoeffizienten (126) enthalten, und dass das Modifizieren das Verändern der Transformationskoeffizienten aufweist, um mehrere modifizierte Transformationskoeffizienten (125) bereitzustellen, wobei das Verfahren ferner gekennzeichnet ist durch:
das Quantisieren (26) der modifizierten Transformationskoeffizienten (125), um mehrere quantisierte modifizierte Transformationskoeffizienten (126) in dem modifizierten Bitstrom (170) bereitzustellen.
 
4. Verfahren nach einem der vorhergehenden Ansprüche, dadurch gekennzeichnet, dass das Kombinieren aufweist:

das Multiplizieren (22') der weiteren Daten (120') mit einem ersten Gewichtungsparameter, um erste gewichtete Daten (122') bereitzustellen;

das Multiplizieren (22) der erfassten Daten (120) mit einem zweiten Gewichtungsparameter, um zweite gewichtete Daten (122) bereitzustellen; und

das Summieren der ersten gewichteten Daten (122') und der zweiten gewichteten Daten (122).


 
5. Verfahren nach Anspruch 4, dadurch gekennzeichnet, dass der erste und/oder der zweite Gewichtungsparameter eingestellt sind, um einen Blending-Effekt zu erzielen.
 
6. Verfahren nach Anspruch 4, dadurch gekennzeichnet, dass der erste und/oder der zweite Gewichtungsparameter eingestellt sind, um einen gleitenden Übergangseffekt zu erzielen.
 
7. Verfahren nach Anspruch 1, ferner gekennzeichnet durch:

das Demultiplexieren des Bitstroms (100) zum Erhalten mehrerer quantisierter Transformationskoeffizienten (110); und

das Umwandeln der quantisierten Transformationskoeffizienten (110) in einer inversen Quantisierungsoperation (20), um mehrere dequantisierte Transformationskoeffizienten (120) zur Verwendung bei der Modifizierung zu erhalten.


 
8. Verfahren nach Anspruch 7, ferner gekennzeichnet durch:

das inverse Transformieren (30) der dequantisierten Transformationskoeffizienten (120), um Informationen zu erhalten, die einen Vorhersagefehler angeben;

das Kombinieren (32) des Vorhersagefehlers mit Bewegungskompensationsinformationen in den Videodaten, um weitere Videodaten bereitzustellen, die einen Referenz-Frame angeben;

das Transformieren der weiteren Videodaten, um transformierte Referenzdaten bereitzustellen; und

das Kombinieren der transformierten Referenzdaten mit den erfassten Daten (120), den erhaltenen weiteren Daten (120') und den transformierten vorhergesagten Videodaten (139') beim Modifizieren.


 
9. Verfahren nach Anspruch 1, bei welchem das Verfahren ferner aufweist:

- das Erhalten erster vorhergesagter Videodaten (136) auf der Basis von Bewegungsvektoren aus der ersten decodierten Videosequenz (132) und auf der Basis eines zuvor rekonstruierten Frames der Videosequenz (132);

- das Durchführen einer Transformationscodierung (38) an den ersten vorhergesagten Videodaten (136), um transformierte erste vorhergesagte Videodaten (138) zu erhalten;

- das Erhalten zweiter vorhergesagter Videodaten (136') auf der Basis von Bewegungsvektoren aus der anderen, zweiten Videosequenz (132') und auf der Basis eines zuvor rekonstruierten Frames der zweiten Videosequenz (132');

- das Durchführen einer Transformationscodierung (38') an den zweiten vorhergesagten Videodaten (136'), um transformierte zweite vorhergesagte Videodaten (138') zu erhalten; und

wobei das Modifizieren (22, 24, 25) das Kombinieren (22, 22', 24, 25) der erfassten Daten (120) mit den erhaltenen weiteren Daten (120'), den transformierten vorhergesagten Videodaten (139'), den transformierten ersten vorhergesagten Videodaten (138) und den transformierten zweiten vorhergesagten Videodaten (138') aufweist.
 
10. Videoeditierungsvorrichtung mit:

- einem Decodiermodul zum Decodieren des Bitstroms (100) und eines anderen Bitstroms (100'), um jeweils eine erste decodierte Videosequenz (132) und eine zweite decodierte Videosequenz (130') zu erhalten;

- einem Erfassungsmodul, das auf einen Bitstrom (100), welcher eine erste Videosequenz angebende Videodaten enthält, reagiert, um Daten (120) zu erhalten, die Transformationskoeffizienten angeben;

- einem weiteren Erfassungsmodul, um mehrere weitere Transformationskoeffizienten angebende weitere Daten (120') aus dem anderen Bitstrom (100') zu erhalten, der eine zweite Videosequenz angebende Videodaten enthält;

- einem Vorhersagemodul zum Erhalten von vorhergesagten Videodaten (137') auf der Basis von Bewegungsvektoren aus der ersten decodierten Videosequenz (132) und auf der Basis eines zuvor rekonstruierten Frames der zweiten decodierten Videosequenz (132');

- einem Transformationsmodul (39') zur Durchführung einer Transformationscodierung an den vorhergesagten Videodaten (137'), um transformierte vorhergesagte Videodaten (139') zu erhalten; und

- einem Modifikationsmodul, das auf die ersten Videosequenzdaten (120) reagiert, um die Transformationskoeffizienten in der Transformationsdomäne zu ändern, um modifizierte Transformationskoeffizienten in einem gültigen komprimierten Video-Bitstrom (170) bereitzustellen, wobei das Modifizierungsmodul ein Kombinationsmodul zum Kombinieren mindestens der erhaltenen Daten (120), der erhaltenen weiteren Daten (120') und der transformierten vorhergesagten Videodaten (139') aufweist, um die modifizierten Daten bereitzustellen.


 
11. Editiervorrichtung nach Anspruch 10, dadurch gekennzeichnet, dass das Erfassungsmodul aufweist:

ein Demultiplexiermodul (10), das auf den Bitstrom (100) reagiert, um mehrere quantisierte Transformationskoeffizienten (110) zu erhalten; und

ein Invers-Quantisiermodul (20), das auf die quantisierten Transformationskoeffizienten (110) reagiert, um die Transformationskoeffizienten bereitzustellen.


 
12. Editiervorrichtung nach Anspruch 11, dadurch gekennzeichnet, dass die Transformationskoeffizienten in der Transformationsdomäne durch das Modifikationsmodul zu modifizierten Transformationskoeffizienten (125) verändert werden, wobei die Editiervorrichtung ferner gekennzeichnet ist durch:
ein Quantisierungsmodul (26) zum Quantisieren der modifizierten Transformationskoeffizienten (125), um mehrere quantisierte modifizierte Transformationskoeffizienten (126) in den modifizierten Daten bereitzustellen.
 
13. Editiervorrichtung nach Anspruch 10, ferner gekennzeichnet durch:

ein Invers-Transformationsmodul (30), das auf die erhaltenen weiteren Daten reagiert, um Informationen bereitzustellen, die einen Vorhersagefehler angeben;

ein Kombinationsmodul (32), das auf den Vorhersagefehler und Bewegungskompensationsinformationen in den Videodaten reagiert, um Referenzdaten bereitzustellen, die ein Referenz-Frame angeben; und

ein Transformationsmodul, das auf die Referenzdaten reagiert, um transformierte Referenzdaten an das Modifikationsmodul zu liefern, um die Transformationskoeffizienten auf der Basis der transformierten Referenzdaten zu ändern.


 
14. Editiervorrichtung nach Anspruch 10, ferner mit:

- einem Modul zum Erhalten erster vorhergesagter Videodaten (136) auf der Basis von Bewegungsvektoren aus der ersten decodierten Videosequenz (132) und auf der Basis eines zuvor rekonstruierten Frames der genannten ersten decodierten Videosequenz (132);

- einem Modul zum Durchführen einer Transformationscodierung (38) an den ersten vorhergesagten Videodaten (136), um transformierte erste vorhergesagte Videodaten (138) zu erhalten;

- einem Modul zum Erhalten von zweiten vorhergesagten Videodaten (136') auf der Basis von Bewegungsvektoren aus dem anderen Bitstrom und auf der Basis eines zuvor rekonstruierten Frames des anderen Bitstroms;

- einem Modul zum Durchführen einer Transformationscodierung (38') an den zweiten vorhergesagten Videodaten (136'), um transformierte zweite vorhergesagte Videodaten (138') zu erhalten; und wobei bei dem Kombinieren die erfassten Daten (120), die erhaltenen weiteren Daten (120'), die transformierten vorhergesagten Videodaten (139'), die transformierten ersten vorhergesagten Videodaten (138) und die transformierten zweiten vorhergesagten Videodaten (138') kombiniert werden.


 
15. Videocodiersystem mit:
einem Decoder und einem Codierer, wobei der Codierer eine Videoeditiervorrichtung nach einem der Ansprüche 10 bis 14 aufweist, wobei der Decoder in einem ersten Modus betreibbar ist, um Video aus den in dem Bitstrom (100) enthaltenen Videodaten zu rekonstruieren, und in einem zweiten Modus, um Video aus den modifizierten Daten in dem modifizierten Bitstrom (170) zu rekonstruieren.
 
16. A software product, embodied in a computer-readable medium, for use in a video editing device for editing effects in a bitstream (100) carrying data indicative of a first video sequence, the software product comprising:

- code for decoding the bitstream (100) and a different bitstream (100') to obtain a first decoded video sequence (132) and a second decoded video sequence (130'), respectively;

- code for obtaining, from the bitstream (100) carrying data indicative of a first video sequence, data (120) indicative of a plurality of transform coefficients;

- code for obtaining, from the different bitstream (100') carrying data indicative of a second video sequence, further data (120') indicative of a plurality of further transform coefficients;

- code for obtaining predicted video data (137') based on motion vectors from the first decoded video sequence (132) and based on a previously reconstructed frame of the second decoded video sequence (132');

- code for performing transform coding (39') on the predicted video data (137') to obtain transformed predicted video data (139'); and

- code for modifying (22, 24, 25) the first video sequence data (120) in the transform domain to provide modified transform coefficients in a valid compressed video bitstream (170), said modifying (22, 24, 25) comprising combining (22, 22', 24, 25) the obtained data (120) with the obtained further data (120') and the transformed predicted video data (139').


 
17. Software product according to claim 16, further characterized by:
code for mixing the transform coefficients of at least a part of the first video sequence with other transform coefficients.

18. Software product according to claim 16, characterized in that the code for acquiring comprises:

code for demultiplexing the bitstream (100) to obtain a plurality of quantized transform coefficients (110); and

code for converting the quantized transform coefficients (110) by inverse quantization to provide the transform coefficients.


 
19. Software product according to claim 18, characterized in that the code for modifying comprises:
code for modifying the transform coefficients to provide a plurality of modified transform coefficients (125), the software product being further characterized by:
code for quantizing (26) the modified transform coefficients (125) to provide a plurality of quantized modified transform coefficients (126) in a modified bitstream (170).

20. Software product according to claim 17, characterized in that the code for mixing comprises:

code for multiplying (22') the transform coefficients by a first weighting parameter to provide first weighted data (122'), and for multiplying (22) the other transform coefficients by a second weighting parameter to provide second weighted data (122); and

code for summing the first weighted data (122') with the second weighted data (122).
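The weighted mixing of claim 20 works because block transforms such as the DCT are linear: a pixel-domain blend α·A + β·B equals the same weighted sum applied directly to the transform coefficients, so no inverse transform is needed. A minimal NumPy sketch (the function name and the 8×8 block shape are illustrative assumptions, not taken from the specification):

```python
import numpy as np

def mix_transform_blocks(coeffs_a, coeffs_b, alpha, beta):
    """Mix two blocks of transform coefficients by weighting each
    block and summing -- the transform-domain counterpart of a
    pixel-domain blend, valid because the DCT is linear."""
    weighted_a = alpha * coeffs_a   # first weighted data
    weighted_b = beta * coeffs_b    # second weighted data
    return weighted_a + weighted_b

# Sanity check: transform-domain mixing equals the pixel-domain
# weighted sum of the same blocks.
rng = np.random.default_rng(0)
block_a = rng.standard_normal((8, 8))
block_b = rng.standard_normal((8, 8))
mixed = mix_transform_blocks(block_a, block_b, 0.7, 0.3)
assert np.allclose(mixed, 0.7 * block_a + 0.3 * block_b)
```

The saving over a naive editor is that neither input sequence has to be inverse-transformed to pixels before mixing.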


 
21. Software product according to claim 16, further characterized by:

code for demultiplexing the bitstream (100) to obtain a plurality of quantized transform coefficients (110); and

code for converting the quantized transform coefficients (110) in an inverse quantization operation to obtain a plurality of dequantized transform coefficients (120);

code for inverse transforming (30) the dequantized transform coefficients (120) to obtain information indicative of a prediction error;

code for combining (32) the prediction error with motion compensation information in the video data to provide further video data indicative of a reference frame;

code for transforming the further video data to provide transformed reference data; and

code for mixing the transformed reference data with the acquired data (120), with the obtained further data (120') and the transformed predicted video data (139').


 
22. Software product according to claim 16, further comprising:

- code for obtaining first predicted video data (136) based on motion vectors from the bitstream and based on a previously reconstructed frame of the bitstream;

- code for performing transform coding (38) on the first predicted video data (136) to obtain transformed first predicted video data (138);

- code for obtaining second predicted video data (136') based on motion vectors from the different bitstream and based on a previously reconstructed frame of the different bitstream;

- code for performing transform coding (38') on the second predicted video data (136') to obtain transformed second predicted video data (138'); and

wherein said modifying (22, 24, 25) comprises combining (22, 22', 24, 25) the acquired data (120) with the obtained further data (120'), the transformed predicted video data (139'), the transformed first predicted video data (138) and the transformed second predicted video data (138').
 


Claims

1. A method of editing effects in a bitstream (100) carrying video data indicative of a first video sequence, the method comprising:

- decoding the bitstream (100) and a different bitstream (100') to obtain a first decoded video sequence (132) and a second decoded video sequence (130'), respectively;

- obtaining, from the bitstream (100) carrying the video data indicative of a first video sequence, data (120) indicative of a plurality of transform coefficients;

- obtaining, from the different bitstream (100') carrying video data indicative of a second video sequence, further data (120') indicative of a plurality of further transform coefficients;

- obtaining predicted video data (137') based on motion vectors from the first decoded video sequence (132) and based on a previously reconstructed frame of the second decoded video sequence (132');

- performing transform coding (39') on the predicted video data (137') to obtain transformed predicted video data (139'); and

- modifying (22, 24, 25) the first video sequence data (120) in the transform domain to provide modified transform coefficients in a valid compressed video bitstream (170), said modifying (22, 24, 25) comprising combining (22, 22', 24, 25) the obtained data (120) with the obtained further data (120') and the transformed predicted video data (139').


 
2. Method according to claim 1, characterized in that said acquiring comprises:

demultiplexing the bitstream (100) to obtain a plurality of quantized transform coefficients (110); and

converting the quantized transform coefficients (110) by inverse quantization to provide the transform coefficients.
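The inverse quantization step of claim 2 can be sketched as follows. A uniform reconstruction rule is assumed here for illustration; real codecs (e.g. H.263, MPEG-4) use slightly different, standard-specific reconstruction formulas, and the function name is a hypothetical one:

```python
import numpy as np

def dequantize(levels, qstep):
    """Inverse quantization (20): map the quantized integer levels
    (110) recovered by the demultiplexer back to approximate
    transform-coefficient values, assuming a uniform quantizer with
    step size qstep."""
    return levels.astype(np.float64) * qstep

levels = np.array([[4, -2], [0, 1]])
coeffs = dequantize(levels, qstep=8.0)  # -> [[32., -16.], [0., 8.]]
```

The resulting dequantized coefficients (120) are what the later claims modify in the transform domain.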


 
3. Method according to claim 2, characterized in that the modified data contain a plurality of quantized modified transform coefficients (126), and in that said modifying comprises modifying the transform coefficients to provide a plurality of modified transform coefficients (125), the method being further characterized by:
quantizing (26) the modified transform coefficients (125) to provide a plurality of quantized modified transform coefficients (126) in the modified bitstream (170).
 
4. Method according to any one of the preceding claims, characterized in that said combining comprises:

multiplying (22') the further data (120') by a first weighting parameter to provide first weighted data (122');

multiplying (22) the acquired data (120) by a second weighting parameter to provide second weighted data (122); and

summing the first weighted data (122') and the second weighted data (122).


 
5. Method according to claim 4, characterized in that one or both of the first and second weighting parameters are adjusted to achieve a blending effect.

6. Method according to claim 4, characterized in that one or both of the first and second weighting parameters are adjusted to achieve a sliding transitional effect.
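Claims 5 and 6 differ only in how the two weighting parameters evolve: ramping them over time gives a cross-dissolve (blending), while switching them per spatial region along a moving edge gives a sliding (wipe-style) transition. A hedged sketch of both schedules, with hypothetical function names not taken from the specification:

```python
def dissolve_weights(frame_idx, n_frames):
    """Ramp the weighting parameters over the transition interval:
    the outgoing sequence fades from 1 to 0 while the incoming one
    rises from 0 to 1 (a cross-dissolve)."""
    beta = frame_idx / (n_frames - 1)   # weight of incoming sequence
    alpha = 1.0 - beta                  # weight of outgoing sequence
    return alpha, beta

def wipe_weights(block_x, edge_x):
    """Per-block binary weights for a left-to-right sliding
    transition: blocks left of the moving edge already show the
    incoming sequence."""
    return (0.0, 1.0) if block_x < edge_x else (1.0, 0.0)

assert dissolve_weights(0, 5) == (1.0, 0.0)   # transition start
assert dissolve_weights(4, 5) == (0.0, 1.0)   # transition end
```

Either schedule feeds the same weighted-sum combination of transform coefficients described in claim 4.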
 
7. Method according to claim 1, further characterized by:

demultiplexing the bitstream (100) to obtain a plurality of quantized transform coefficients (110); and

converting the quantized transform coefficients (110) in an inverse quantization operation (20) to obtain a plurality of dequantized transform coefficients (120) for use in said modifying.


 
8. Method according to claim 7, further characterized by:

inverse transforming (30) the dequantized transform coefficients (120) to obtain information indicative of a prediction error;

combining (32) the prediction error with motion compensation information in the video data to provide further video data indicative of a reference frame;

transforming the further video data to provide transformed reference data; and

combining the transformed reference data with the acquired data (120), the obtained further data (120') and the transformed predicted video data (139') in said modifying.
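The chain in claim 8 can be sketched per block: inverse-transform the dequantized coefficients into a prediction error, add the motion-compensated prediction to recover the reference data, then forward-transform the result. This sketch assumes an orthonormal 2-D DCT via SciPy and a block for which the motion-compensated prediction is already available; the function name is illustrative:

```python
import numpy as np
from scipy.fft import dctn, idctn  # type-II 2-D DCT and its inverse

def transformed_reference(dequant_coeffs, mc_prediction):
    """Recover a reference block and re-express it in the transform
    domain: inverse transform (30) the dequantized coefficients (120)
    to a prediction error, combine (32) it with the motion-compensated
    prediction, then forward-transform the reconstructed block."""
    pred_error = idctn(dequant_coeffs, norm='ortho')
    reference = mc_prediction + pred_error       # reference-frame data
    return dctn(reference, norm='ortho')         # transformed reference data
```

When the prediction error is zero, the output is simply the DCT of the motion-compensated prediction, which is a quick consistency check on the chain.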


 
9. Method according to claim 1, the method further comprising:

- obtaining first predicted video data (136) based on motion vectors from the first decoded video sequence (132) and based on a previously reconstructed frame of said video sequence (132);

- performing transform coding (38) on the first predicted video data (136) to obtain transformed first predicted video data (138);

- obtaining second predicted video data (136') based on motion vectors from the different second video sequence (132') and based on a previously reconstructed frame of said second video sequence (132');

- performing transform coding (38') on the second predicted video data (136') to obtain transformed second predicted video data (138'); and

wherein said modifying (22, 24, 25) comprises combining (22, 22', 24, 25) the acquired data (120) with the obtained further data (120'), the transformed predicted video data (139'), the transformed first predicted video data (138) and the transformed second predicted video data (138').
 
10. A video editing device, comprising:

- a decoding module for decoding the bitstream (100) and a different bitstream (100') to obtain a first decoded video sequence (132) and a second decoded video sequence (130'), respectively;

- an acquisition module, responsive to a bitstream (100) carrying video data indicative of a first video sequence, for obtaining data (120) indicative of transform coefficients;

- a further acquisition module for obtaining, from the different bitstream (100') carrying video data indicative of a second video sequence, further data (120') indicative of a plurality of further transform coefficients;

- a prediction module for obtaining predicted video data (137') based on motion vectors from the first decoded video sequence (132) and based on a previously reconstructed frame of the second decoded video sequence (132');

- a transform module (39') for performing transform coding on the predicted video data (137') to obtain transformed predicted video data (139'); and

- a modification module, responsive to the first video sequence data (120), for modifying the transform coefficients in the transform domain to provide modified transform coefficients in a valid compressed video bitstream (170), wherein said modification module comprises a combination module for combining at least the obtained data (120), the obtained further data (120') and the transformed predicted video data (139') to provide the modified data.


 
11. Editing device according to claim 10, characterized in that the acquisition module comprises:

a demultiplexing module (10), responsive to the bitstream (100), for obtaining a plurality of quantized transform coefficients (110); and

an inverse quantization module (20), responsive to the quantized transform coefficients (110), for providing the transform coefficients.

12. Editing device according to claim 11, characterized in that the transform coefficients are modified in the transform domain into modified transform coefficients (125) by the modification module, the editing device being further characterized by:
a quantization module (26) for quantizing the modified transform coefficients (125) to provide a plurality of quantized modified transform coefficients (126) in the modified data.
 
13. Editing device according to claim 10, further characterized by:

an inverse transform module (30), responsive to the obtained further data, for providing information indicative of a prediction error;

a combination module (32), responsive to the prediction error information and motion compensation information in the video data, for providing reference data indicative of a reference frame; and

a transform module, responsive to the reference data, for providing transformed reference data to the modification module so as to modify the transform coefficients based on the transformed reference data.


 
14. Editing device according to claim 10, further comprising:

- a module for obtaining first predicted video data (136) based on motion vectors from the first decoded video sequence (132) and based on a previously reconstructed frame of said first decoded video sequence (132);

- a module for performing transform coding (38) on the first predicted video data (136) to obtain transformed first predicted video data (138);

- a module for obtaining second predicted video data (136') based on motion vectors from the different bitstream and based on a previously reconstructed frame of the different bitstream;

- a module for performing transform coding (38') on the second predicted video data (136') to obtain transformed second predicted video data (138'); and

wherein in said combining, the acquired data (120), the obtained further data (120'), the transformed predicted video data (139'), the transformed first predicted video data (138) and the transformed second predicted video data (138') are combined.
 
15. A video coding system, comprising:

a decoder and an encoder, the encoder comprising a video editing device according to any one of claims 10 to 14,

wherein the decoder is operable in a first mode to reconstruct video from the video data carried in the bitstream (100), and in a second mode to reconstruct video from the modified data in the modified bitstream (170).


 
16. A software product, embodied in a computer-readable medium, for use in a video editing device for editing effects in a bitstream (100) carrying video data indicative of a first video sequence, the software product comprising:

- code for decoding the bitstream (100) and a different bitstream (100') to obtain a first decoded video sequence (132) and a second decoded video sequence (130'), respectively;

- code for obtaining, from the bitstream (100) carrying the video data indicative of a first video sequence, data (120) indicative of a plurality of transform coefficients;

- code for obtaining, from the different bitstream (100') carrying video data indicative of a second video sequence, further data (120') indicative of a plurality of further transform coefficients;

- code for obtaining predicted video data (137') based on motion vectors from the first decoded video sequence (132) and based on a previously reconstructed frame of the second decoded video sequence (132');

- code for performing transform coding (39') on the predicted video data (137') to obtain transformed predicted video data (139'); and

- code for modifying (22, 24, 25) the first video sequence data (120) in the transform domain to provide modified transform coefficients in a valid compressed video bitstream (170), said modifying (22, 24, 25) comprising combining (22, 22', 24, 25) the obtained data (120) with the obtained further data (120') and the transformed predicted video data (139').


 
17. Software product according to claim 16, further characterized by:
code for mixing the transform coefficients of at least a part of the first video sequence with other transform coefficients.

18. Software product according to claim 16, characterized in that the code for acquiring comprises:

code for demultiplexing the bitstream (100) to obtain a plurality of quantized transform coefficients (110); and

code for converting the quantized transform coefficients (110) by inverse quantization to provide the transform coefficients.


 
19. Software product according to claim 18, characterized in that the code for modifying comprises:
code for modifying the transform coefficients to provide a plurality of modified transform coefficients (125), the software product being further characterized by:
code for quantizing (26) the modified transform coefficients (125) to provide a plurality of quantized modified transform coefficients (126) in a modified bitstream (170).
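The requantization of claim 19 is what keeps the output (170) a valid compressed bitstream: the modified coefficients must be mapped back to integer levels before entropy coding. A minimal sketch assuming a uniform round-to-nearest quantizer (real codecs apply standard-specific rules with dead zones; the function name is illustrative):

```python
import numpy as np

def requantize(modified_coeffs, qstep):
    """Quantize (26) the modified transform coefficients (125) back
    to integer levels (126) for the modified bitstream (170),
    assuming a uniform quantizer with step size qstep."""
    return np.rint(modified_coeffs / qstep).astype(np.int32)

coeffs = np.array([[31.0, -17.0], [3.0, 0.0]])
levels = requantize(coeffs, qstep=8.0)  # -> [[4, -2], [0, 0]]
```

Pairing this with the inverse quantization of claim 18 closes the dequantize → modify → requantize loop that the transform-domain editor runs per block.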
 
20. Software product according to claim 17, characterized in that the code for mixing comprises:

code for multiplying (22') the transform coefficients by a first weighting parameter to provide first weighted data (122'), and for multiplying (22) the other transform coefficients by a second weighting parameter to provide second weighted data (122); and

code for summing the first weighted data (122') with the second weighted data (122).


 
21. Software product according to claim 16, further characterized by:

code for demultiplexing the bitstream (100) to obtain a plurality of quantized transform coefficients (110); and

code for converting the quantized transform coefficients (110) in an inverse quantization operation to obtain a plurality of dequantized transform coefficients (120);

code for inverse transforming (30) the dequantized transform coefficients (120) to obtain information indicative of a prediction error;

code for combining (32) the prediction error with motion compensation information in the video data to provide further video data indicative of a reference frame;

code for transforming the further video data to provide transformed reference data; and

code for mixing the transformed reference data with the acquired data (120), with the obtained further data (120') and the transformed predicted video data (139').


 
22. Software product according to claim 16, further comprising:

- code for obtaining first predicted video data (136) based on motion vectors from the bitstream and based on a previously reconstructed frame of the bitstream;

- code for performing transform coding (38) on the first predicted video data (136) to obtain transformed first predicted video data (138);

- code for obtaining second predicted video data (136') based on motion vectors from the different bitstream and based on a previously reconstructed frame of the different bitstream;

- code for performing transform coding (38') on the second predicted video data (136') to obtain transformed second predicted video data (138'); and

wherein said modifying (22, 24, 25) comprises combining (22, 22', 24, 25) the acquired data (120) with the obtained further data (120'), the transformed predicted video data (139'), the transformed first predicted video data (138) and the transformed second predicted video data (138').
 




Drawing
REFERENCES CITED IN THE DESCRIPTION



This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

Non-patent literature cited in the description