(19) European Patent Office
(11)EP 3 197 157 B1

(12)EUROPEAN PATENT SPECIFICATION

(45)Mention of the grant of the patent:
15.12.2021 Bulletin 2021/50

(21)Application number: 17155127.8

(22)Date of filing:  07.09.2006
(51)International Patent Classification (IPC): 
H04N 19/105(2014.01)
H04N 19/33(2014.01)
(52)Cooperative Patent Classification (CPC):
H04N 19/98; H04N 19/105; H04N 19/33; H04N 19/186; H04N 19/187; H04N 19/59; H04N 19/63; H04N 19/184

(54)

HIGH DYNAMIC RANGE CODECS

HOCHDYNAMIKBEREICHSCODECS

CODECS À PLAGE DYNAMIQUE ÉTENDUE


(84)Designated Contracting States:
AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

(30)Priority: 23.01.2006 US 761510 P

(43)Date of publication of application:
26.07.2017 Bulletin 2017/30

(62)Application number of the earlier application in accordance with Art. 76 EPC:
15179312.2 / 2988499
06791913.4 / 1989882

(73)Proprietor: Max-Planck-Gesellschaft zur Förderung der Wissenschaften E.V.
80537 München (DE)

(72)Inventors:
  • EFREMOV, Alexander
    684093 Viluchinsk (RU)
  • MANTIUK, Rafal
    Vancouver, British Columbia 203-2715 (CA)
  • KRAWCZYK, Grzegorz
    70-781 Szczecin (PL)
  • MYSZKOWSKI, Karol
    66133 Saarbrücken (DE)
  • SEIDEL, Hans-Peter
    66386 St. Ingbert (DE)

(74)Representative: Dolby International AB Patent Group Europe 
Apollo Building, 3E Herikerbergweg 1-35
1101 CN Amsterdam Zuidoost (NL)


(56)References cited:
US-A1- 2005 259 729
US-B1- 6 335 983
  
      
    Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).


    Description

    Technical Field



    [0001] The invention relates to decoding image data at an image decoder.

    Background



    [0002] Dynamic range is a measure of the relative brightness of the brightest and darkest parts of an image. Until recently, most televisions, computer monitors and other display devices have been capable of reproducing dynamic ranges of only a few hundred to one. This is far less than the dynamic range that can be appreciated by the human eye. Display devices having greater dynamic ranges are becoming available. Such high dynamic range display devices can provide images that are much more natural and realistic than the images produced by conventional "low dynamic range" display devices.

    [0003] High dynamic range display devices are beneficial in a wide range of applications. For example, high dynamic range display devices may be used for displaying realistic video images ranging from movies and game visuals, to visual displays in simulators such as flight simulators. High dynamic range display devices also have application in demanding imaging applications such as medical imaging.

    [0004] Many current image data formats specify pixel values using 24 or fewer bits per pixel. These bits specify both the brightness and color for the pixel. 24 bits is too few to specify both a full range of colors and a brightness that can be varied smoothly over the range that a high dynamic range display is capable of reproducing. To obtain full benefit from a high dynamic range display it is necessary to provide image data capable of specifying a wide range of pixel values. Various high dynamic range data formats that provide larger numbers of bits per pixel have been developed or proposed. Such high dynamic range data formats are not typically backward compatible with prior lower dynamic range data formats.

    [0005] For example, HDRV ("Perception-motivated HDR Video Encoding") is described in R. Mantiuk, G. Krawczyk, K. Myszkowski, and H.-P. Seidel, Perception-motivated high dynamic range video encoding, ACM Transactions on Graphics (Proc. of SIGGRAPH 2004), 23(3):730-738, 2004.

    [0006] HDRV is a lossy HDR video compression method which does not offer backward compatibility. The method encodes HDR pixels using 11 bits for luminance and twice 8 bits for chrominance. The resulting video stream does not contain any information on LDR frames.

    [0007] JPEG HDR is described in Greg Ward and Maryann Simmons, Subband encoding of high dynamic range imagery, in APGV '04: Proceedings of the 1st Symposium on Applied Perception in Graphics and Visualization, pages 83-90, New York, NY, USA, 2004, ACM Press. This method involves sub-sampling a subband layer, which can lead to the loss of high frequencies. To prevent this loss, the method suggests three approaches: pre-correction of the LDR layer, which encodes within this layer the high frequencies that could be lost due to sub-sampling; post-correction, which tries to restore high frequencies that have been lost rather than modifying the LDR image; and full-sampling, in which no sub-sampling is performed.

    [0008] Further, US 2005/259729 A1 describes a method of coding a quality scalable video sequence. An N-bit input frame is converted to an M-bit input frame, where M is an integer between 1 and N. To be backwards compatible with existing 8-bit video systems, M would be selected to be 8. The M-bit input frame would be encoded to produce a base-layer output bitstream. An M-bit output frame would be reconstructed from the base-layer output bitstream and converted to an N-bit output frame. The N-bit output frame would be compared to the N-bit input frame to derive an N-bit image residual that could be encoded to produce an enhancement layer bitstream.

    [0009] And US 6,335,983 B1 describes a method for representing a digital image having color values with an extended color gamut in a storage color space having a limited color gamut including providing a color adjustment function; adjusting the color values of the extended color gamut digital image to fit within the limited color gamut to form a limited color gamut digital image using the color adjustment function; representing the limited color gamut digital image in the storage color space; and associating information about the color adjustment function with the limited color gamut digital image in the storage color space such that the associated information and the limited color gamut digital image are adapted to produce a reconstructed extended color gamut digital image.

    [0010] There remains a need for practical methods and apparatus for encoding and decoding HDR images, especially HDR video images. There remains a particular need for such methods and apparatus that provide backward compatibility with existing hardware for reproducing lower-dynamic-range images.

    Summary of the Invention



    [0011] This invention provides methods and apparatus for decoding image data at an image decoder according to the independent claims.

    [0012] A method for encoding a high dynamic range image may comprise obtaining a lower dynamic range image corresponding to the high dynamic range image; identifying groups of pixels in the high dynamic range image for which corresponding pixels in the lower dynamic range image all have the same pixel value; generating a prediction function based at least in part on pixel values of the pixels in the high dynamic range image belonging to each of a plurality of the groups; applying the prediction function to the lower dynamic range image to obtain a predicted image; computing a residual image representing differences between pixel values in the predicted image and corresponding pixel values in the high dynamic range image; and, encoding and storing data representing the lower dynamic range image, the prediction function and the residual image.

    [0013] The invention provides a method to decode image data at an image decoder. The method comprises receiving image coded data, the image coded data comprising a compressed low dynamic range image, a compressed residual image, and a prediction function, wherein for input pixel values in a low dynamic range the prediction function provides corresponding predicted pixel values in a high dynamic range; decompressing the compressed low dynamic range image to generate a reconstructed low dynamic range image; decompressing the compressed residual image to generate a decoded residual image, wherein the decoded residual image is based on a reference high dynamic range image representing the same scene as the reconstructed low dynamic range image and the low dynamic range image is generated by an encoder by applying a tone-mapping function to the reference high dynamic range image; applying the prediction function to the reconstructed low dynamic range image to generate a predicted high dynamic range image; and combining the predicted high dynamic range image with the decoded residual image data to generate an output high dynamic range image, wherein the method is performed by one or more processors, wherein the prediction function is based on the reconstructed low dynamic range image and the reference high dynamic range image, and wherein the prediction function is based at least in part on pixel values of the pixels in the high dynamic range image for which corresponding pixels in the lower dynamic range image all have the same pixel value, and exploits statistical relationships between pixel values of the pixels in the lower dynamic range image and corresponding pixel values in the high dynamic range image, wherein for each pixel value represented in the lower dynamic range image, the corresponding pixel values in the high dynamic range image are the pixel values of those pixels in the high dynamic range image that correspond to pixels in a respective set of pixels in the lower dynamic range image that all have the respective pixel value in the lower dynamic range image.

    [0014] Further aspects of the invention and features of specific embodiments of the invention are described below.

    Brief Description of the Drawings



    [0015] In drawings which illustrate non-limiting embodiments of the invention,

    Figure 1 is a flow chart illustrating an encoding method according to an embodiment of the invention;

    Figure 1A is a histogram of pixel values from a high dynamic range image for which corresponding pixels in a lower dynamic range version of the image all have the same pixel value;

    Figure 2 is a flow chart illustrating a decoding method according to the invention;

    Figure 3 is a flow chart illustrating an MPEG encoding method according to a specific embodiment;

    Figures 4A through 4F show the relationship between luma values in corresponding HDR and LDR images for various tone mapping algorithms;

    Figure 5 shows a method for filtering residual image data according to an embodiment of the invention; and,

    Figure 6 is a plot illustrating bit rate as a function of an image quality parameter for a prototype encoding system.


    Description



    [0016] Throughout the following description, specific details are set forth in order to provide a more thorough understanding of the invention. However, the invention may be practiced without these particulars. In other instances, well known elements have not been shown or described in detail to avoid unnecessarily obscuring the invention. Accordingly, the specification and drawings are to be regarded in an illustrative, rather than a restrictive, sense.

    [0017] Figure 1 shows a method 10 for encoding a frame of image data according to a basic embodiment of the invention. Method 10 encodes both high dynamic range (HDR) data 12 and lower dynamic range (LDR) data 14 into encoded image data 38. As described below, encoded image data 38 can be decoded to reconstruct both LDR data and HDR data.

    [0018] By way of example only, HDR data 12 may be represented in a color space such as the absolute CIE XYZ (2° standard observer) color space in which the color and brightness of each pixel are specified by three floating point numbers. LDR data 14 may be represented in a color space such as the sRGB color space in which the color and brightness of each pixel are specified by three bytes. In some embodiments, LDR data 14 is derived from HDR data 12 (or a precursor of HDR data 12) by a suitable dynamic range reduction process 16.

    [0019] Dynamic range reduction may comprise tone mapping and/or gamut mapping, for example. Any suitable tone mapping or gamut mapping operator may be used. For example, a tone mapping operator may be chosen to saturate both luminance and color, change color values and enhance local contrast. Such changes may result in a lower compression ratio, but both LDR and HDR frames will be preserved in the resulting video stream.

    [0020] In block 18, method 10 establishes a prediction function 19. Prediction function 19 provides as an output a predicted pixel value for a pixel in HDR data 12 based upon the pixel value for the corresponding pixel in LDR data 14. Since the goal is to be able to reproduce HDR data 12 and LDR data 14 from encoded image data 38 it is preferable to base prediction function 19 on a version of LDR data 14 that can be reconstructed from encoded image data 38.

    [0021] Where LDR data 14 is encoded and compressed using a lossy algorithm, it is not possible to guarantee that the reconstructed version of LDR data 14 will be identical to the original LDR data 14. For this reason, Figure 1 shows that block 18 receives as an input reconstructed LDR data 26. Reconstructed LDR data 26 is obtained by encoding/compressing LDR data 14 in block 20 to provide encoded compressed LDR data 22 and then decoding/decompressing encoded compressed LDR data 22 in block 24. Encoded compressed LDR data 22 is included in encoded image data 38. Line 15 illustrates a less accurate alternative in which block 18 uses LDR data 14 directly to establish prediction function 19.

    [0022] Prediction function 19 preferably exploits statistical relationships between pixel values in reconstructed LDR data 26 and corresponding pixel values in HDR data 12. In general, if one takes all pixels in reconstructed LDR image 26 that have the same particular pixel value, the corresponding pixels in HDR image data 12 will not all have the same pixel value. That is, in general, there is a one-to-many relationship between LDR pixel values and HDR pixel values.

    [0023] Figure 1A is a histogram in which the horizontal axis ranges over all possible HDR pixel values and the vertical axis indicates how many pixels in the image represented by HDR image data 12 have each value. There may be a significant number of pixel values for which the image has no pixels. The shaded bars in Figure 1A represent values of pixels in HDR image data 12 for which the corresponding pixels in reconstructed LDR image data 26 all have the same pixel value X_LDR. The HDR pixel values that correspond to LDR pixel value X_LDR range from A to B. All of the HDR pixel values for pixels that correspond to the same pixel value in reconstructed LDR image data 26 may be called a bin. It is typical, but not mandatory, that different bins do not overlap.

    [0024] A prediction function 19 for an image may be obtained from HDR image data 12 and reconstructed LDR image data 26 by collecting HDR pixel values into bins and statistically analyzing each of the bins. Collecting the HDR pixel values into bins may comprise:
    • taking reconstructed LDR image data 26, and for each pixel value represented in reconstructed LDR image data 26 identifying the set of all pixels that have that pixel value;
    • for each of the sets of pixels identifying the corresponding pixels in HDR data 12 and determining the pixel values of those corresponding pixels to yield a set of all of the HDR pixel values that corresponds to each LDR pixel value.


    [0025] Prediction function 19 may be obtained by any of:
    • finding the arithmetic mean of the HDR pixel values in each bin;
    • finding the median of the HDR pixel values in each bin;
    • finding the average of the values A and B that delimit the bin;
    • some combination of these; or
    • the like.


    [0026] It is believed that for many applications the arithmetic mean provides a good combination of accuracy and computational efficiency.
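    By way of illustration only, the binning and averaging described in paragraphs [0024] to [0026] can be sketched in numpy as follows; the function name and the interpolation used to fill empty bins are illustrative assumptions, not part of the described method:

```python
import numpy as np

def build_prediction_lut(ldr, hdr, n_bins=256):
    """Bin HDR pixel values by the value of the co-located LDR pixel
    and take the arithmetic mean of each bin (one bin per LDR value)."""
    ldr = ldr.ravel().astype(np.int64)
    hdr = hdr.ravel().astype(np.float64)
    sums = np.bincount(ldr, weights=hdr, minlength=n_bins)
    counts = np.bincount(ldr, minlength=n_bins)
    lut = np.zeros(n_bins)
    filled = counts > 0
    lut[filled] = sums[filled] / counts[filled]
    # Empty bins (LDR values absent from this frame) are interpolated
    # so that the prediction function is defined for all 256 inputs.
    if not filled.all():
        lut[~filled] = np.interp(np.flatnonzero(~filled),
                                 np.flatnonzero(filled), lut[filled])
    return lut
```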

    [0027] Given a prediction function 19 it is only necessary to encode differences between the values predicted by prediction function 19 and the actual values from HDR image data 12. Such differences are usually close to zero and therefore can be efficiently compressed in residual frames.

    [0028] Prediction function 19 needs to be defined only for the possible pixel values in the LDR data 14 (256 values in the case where pixel values are represented by an 8-bit number). Prediction function 19 may comprise a lookup table that is indexed by valid values for LDR pixels, with one output value corresponding to each index value. For example, where LDR pixels have 8-bit values, the lookup table may comprise 256 different values indexed by the integers 0 to 255. Prediction function 19 does not need to be continuous since its major role is to make the values of residual frames as small as possible. In the alternative, prediction function 19 may be partly or entirely represented by a suitably parameterized continuous curve.

    [0029] In block 28 method 10 obtains a predicted HDR image by applying prediction function 19 to reconstructed LDR data 26. The pixel value for each pixel of reconstructed LDR data 26 is applied as an input to prediction function 19 and the pixel value is replaced with the resulting output from prediction function 19 to yield a predicted HDR image 29.

    [0030] Block 30 computes a difference between predicted HDR image 29 and the image of HDR data 12 to provide a residual image 32. Residual image 32 is encoded/compressed in block 34 and output as residual image data 35 to encoded image data 38. Block 34 may comprise filtering and quantizing residual image 32 to remove information that will not have a discernible effect (or, with more aggressive filtering and/or quantization, an excessively deleterious effect) on the fidelity of an HDR image reconstructed from encoded image data 38.
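    Blocks 28 and 30 then reduce to a per-pixel table lookup and a subtraction; a minimal sketch continuing the hypothetical names above:

```python
def predict_and_residual(lut, ldr_recon, hdr):
    """Block 28: apply prediction function 19 to reconstructed LDR data 26.
    Block 30: residual image 32 is the HDR frame minus the prediction."""
    predicted = lut[ldr_recon]                     # predicted HDR image 29
    residual = hdr.astype(np.float64) - predicted  # residual image 32
    return predicted, residual
```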

    [0031] Figure 2 shows a method 40 for decoding encoded image data 38. LDR data 22 may be extracted from encoded image data 38 and decoded/decompressed in block 42 to yield LDR data 43 that is output as an LDR data output 44. If LDR data output 44 is all that is required then no further processing need be done.

    [0032] If an HDR data output 56 is also required then prediction function 37 is decoded at block 46 to yield prediction function 47 and residual image data 35 is decoded / decompressed at block 50 to yield residual image 52.

    [0033] In block 48, prediction function 47 is applied to LDR data 43 to yield a predicted HDR image 49. In block 54 the predicted HDR image 49 is combined with residual image 52 to yield HDR data output 56. A decoder that operates as shown in Figure 2 can be backwards-compatible with systems and devices that require LDR data output 44 while providing high quality HDR data at HDR data output 56.
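    The decoder side of Figure 2 is the same lookup followed by an addition; a sketch under the same assumptions (decoding of the LDR and residual streams themselves is not shown):

```python
def reconstruct_hdr(lut, ldr_decoded, residual_decoded):
    """Block 48: apply prediction function 47 to LDR data 43.
    Block 54: combine with residual image 52 to yield HDR output 56."""
    predicted = lut[ldr_decoded]         # predicted HDR image 49
    return predicted + residual_decoded  # HDR data output 56
```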

    [0034] Methods 10 and 40 may be performed by:
    • programmed data processors, which may comprise one or more microprocessors, digital signal processors, some combination thereof, or the like executing software that causes the data processors to implement the methods;
    • hardware circuits, for example circuits that include functional blocks that cooperate to implement the method - the circuits may comprise, for example, suitably configured field-programmable gate arrays ("FPGAs") or application-specific integrated circuits ("ASICs"); or,
    • performing some parts of the methods in programmed data processors and other parts of the methods in suitable hardware circuits.


    [0035] Figure 3 shows a method 70 according to a more specific example embodiment. Method 70 encodes video frames in a way that complies with the Moving Picture Experts Group (MPEG) standards. Method 70 receives two incoming streams of video data. A stream containing HDR frames 74 is received at input 72. A stream containing LDR frames 76 is received at input 78. LDR frames 76 may be derived from HDR frames 74 or some precursor of HDR frames 74 upstream from input 78.

    [0036] An encoder operating as shown in Figure 3 produces three compressed streams: an LDR stream 80, which may be fully compatible with MPEG; a residual stream 82, which contains differences between LDR frames 76 and the corresponding HDR frames 74; and an auxiliary stream 84, which contains auxiliary data for reconstructing HDR frames 74. The best performance can be achieved when residual stream 82 and auxiliary stream 84 do not duplicate the information encoded in LDR stream 80.

    [0037] LDR frames 76 are encoded in block 88 using a suitable encoder. For example, block 88 may use an MPEG video encoder compatible with the ISO/IEC 14496-2 standard. Other video encoders may be used in the alternative. The resulting video stream may be encapsulated in a suitable media container format, such as Audio Video Interleave (AVI) or QuickTime, so that it can be recognized and played back by existing software.

    [0038] In block 90 the MPEG encoded LDR frames are decoded. To minimize computation, the decoding in block 90 may be performed by the MPEG encoder used in block 88. MPEG encoders typically decode frames internally for use in estimating motion vectors. Block 90 may comprise accessing the decoded frames generated by the MPEG encoder. In the alternative, block 90 may be performed independently from block 88.

    [0039] The output from block 90 will generally differ from the input to block 88 because MPEG is a lossy compression method. LDR frames that are MPEG encoded and then decoded are not exactly the same as the original LDR frames but contain compression artifacts.

    [0040] In blocks 92A and 92B the color spaces of one or both of LDR frames 76 and HDR frames 74 are transformed, if necessary, to provide LDR frames and HDR frames that are represented in mutually compatible color spaces. Which transformations, if any, are performed in blocks 92A and 92B depends upon the color spaces of incoming LDR frames 76 and HDR frames 74. In some cases blocks 92A and 92B are not required. In other cases, only one of blocks 92A and 92B is required.

    [0041] HDR and LDR color spaces are compatible when color channels of both the LDR and HDR color spaces represent approximately the same information. It is also desirable that the HDR and LDR color spaces be perceptually uniform. Perceptual uniformity facilitates the estimation of color differences according to perceivable, rather than arithmetic, differences. It is also desirable that the HDR color space preserve a broad color gamut, ideally the full visible color gamut, even though the full visible color gamut cannot be displayed on the existing displays.

    [0042] The inventors consider that a good color space for use in representing HDR image data is a combination of the CIE 1976 Uniform Chromaticity Scales (u', v') with the gamma correction of the sRGB color space. Other color spaces could also be used. In one example, incoming LDR frames 76 are represented in the sRGB color space while incoming HDR frames 74 are represented in the CIE XYZ (2° standard observer) color space. In this case, block 92A comprises converting LDR pixels from the sRGB color space to the l_ldr u_ldr v_ldr space. This can be done by computing the CIE XYZ color coordinates and then computing luma and u' and v' color coordinates from the XYZ values. The XYZ values may be determined using the sRGB conversion formulas given in IEC 61966-2-1:1999, Multimedia systems and equipment - Colour measurement and management - Part 2-1: Colour management - Default RGB colour space - sRGB, International Electrotechnical Commission, 1999. For example, for the red channel, where R_8-bit is the 8-bit color coordinate:

$$R' = \frac{R_{8\text{-bit}}}{255} \qquad (1)$$

$$R = \begin{cases} R'/12.92, & R' \le 0.04045 \\ \left( \dfrac{R' + 0.055}{1.055} \right)^{2.4}, & \text{otherwise} \end{cases} \qquad (2)$$
    [0043] The G_8-bit and B_8-bit color coordinates may be converted to floating point values similarly and then X, Y, and Z may be determined from:

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} 0.4124 & 0.3576 & 0.1805 \\ 0.2126 & 0.7152 & 0.0722 \\ 0.0193 & 0.1192 & 0.9505 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} \qquad (3)$$
    [0044] The example matrix in Equation (3) assumes the white point D65. Luma can be computed for each LDR pixel by using suitably gamma-corrected color values. For example, luma may be given by:

$$l_{ldr} = 0.299\,R_{8\text{-bit}} + 0.587\,G_{8\text{-bit}} + 0.114\,B_{8\text{-bit}} \qquad (4)$$

    where l_ldr is the luma value for an LDR pixel. Luma is the weighted sum of the non-linear R'G'B' components after gamma correction has been applied.

    [0045] The chromaticities u' and v' may be obtained from:

$$u' = \frac{4X}{X + 15Y + 3Z} \qquad (5)$$

    and

$$v' = \frac{9Y}{X + 15Y + 3Z} \qquad (6)$$

    then 8-bit numbers u_ldr and v_ldr can be obtained by multiplying each of u' and v' by a suitable scaling factor such as:

$$u_{ldr} = \left[ 410 \cdot u' \right] \qquad (7)$$

    and

$$v_{ldr} = \left[ 410 \cdot v' \right] \qquad (8)$$
    [0046] In the transformed color space, each pixel of the LDR data is represented by the pixel values l_ldr, u_ldr, v_ldr.
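    A numpy sketch of the block 92A conversion, following Equations (1) to (8) as reconstructed above; the input is assumed to be an (H, W, 3) array of 8-bit sRGB values:

```python
import numpy as np

# sRGB-to-XYZ matrix for white point D65 (IEC 61966-2-1), Equation (3).
SRGB_TO_XYZ = np.array([[0.4124, 0.3576, 0.1805],
                        [0.2126, 0.7152, 0.0722],
                        [0.0193, 0.1192, 0.9505]])

def ldr_to_luv(rgb8):
    """Convert 8-bit sRGB pixels to (l_ldr, u_ldr, v_ldr)."""
    c = rgb8.astype(np.float64) / 255.0                    # Equation (1)
    lin = np.where(c <= 0.04045,                           # Equation (2)
                   c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    x, y, z = np.moveaxis(lin @ SRGB_TO_XYZ.T, -1, 0)      # Equation (3)
    # Luma from the non-linear (gamma-corrected) 8-bit components, Eq. (4).
    l_ldr = (0.299 * rgb8[..., 0] + 0.587 * rgb8[..., 1]
             + 0.114 * rgb8[..., 2])
    denom = x + 15.0 * y + 3.0 * z + 1e-12   # guard against black pixels
    u_ldr = np.round(410.0 * 4.0 * x / denom)  # Equations (5) and (7)
    v_ldr = np.round(410.0 * 9.0 * y / denom)  # Equations (6) and (8)
    return l_ldr, u_ldr, v_ldr
```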

    [0047] Block 92B may transform color values of HDR frames 74 in substantially the same manner as described above for the LDR pixel values. Ordinary gamma correction typically cannot be used for the range of luminance values that may be specified in an HDR frame. Therefore, some embodiments use a perceptually uniform luminance representation that has been derived from the contrast detection measurements for human observers. This space has properties similar to a space in which LDR pixel values are gamma corrected but can encode the full visible range of luminance (using 11-12 bits for example).

    [0048] In an example embodiment, HDR luminance, y, is transformed into 12-bit HDR luma, l_hdr, by the formula:

$$l_{hdr}(y) = \begin{cases} a \cdot y, & y < y_l \\ b \cdot y^{c} + d, & y_l \le y < y_h \\ e \cdot \ln(y) + f, & y \ge y_h \end{cases} \qquad (9)$$

    The inverse transformation is given by:

$$y(l_{hdr}) = \begin{cases} a' \cdot l_{hdr}, & l_{hdr} < l_l \\ b' \cdot (l_{hdr} + d')^{c'}, & l_l \le l_{hdr} < l_h \\ e' \cdot e^{f' \cdot l_{hdr}}, & l_{hdr} \ge l_h \end{cases} \qquad (10)$$

    where the various constants used in Equations (9) and (10) are listed in Table I below.

    TABLE I - Example Constants for Equations (9) and (10)
      a   = 17.554      b   = 826.81       c   = 0.10013
      d   = -884.17     e   = 209.16       f   = -731.28
      y_l = 5.6046      y_h = 10469

      a'  = 0.056968    b'  = 7.3014e-30   c'  = 9.9872
      d'  = 884.17      e'  = 32.994       f'  = 0.00478
      l_l = 98.381      l_h = 1204.7
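    A direct numpy transcription of Equations (9) and (10) with the Table I constants (valid for luminance y > 0):

```python
import numpy as np

# Constants from Table I.
A, B, C, D, E, F = 17.554, 826.81, 0.10013, -884.17, 209.16, -731.28
YL, YH = 5.6046, 10469.0
A2, B2, C2, D2, E2, F2 = 0.056968, 7.3014e-30, 9.9872, 884.17, 32.994, 0.00478
LL, LH = 98.381, 1204.7

def luminance_to_luma(y):
    """Equation (9): 12-bit perceptually uniform HDR luma l_hdr from y."""
    y = np.asarray(y, dtype=np.float64)
    return np.where(y < YL, A * y,
                    np.where(y < YH, B * y ** C + D, E * np.log(y) + F))

def luma_to_luminance(l):
    """Equation (10): luminance y recovered from HDR luma l_hdr."""
    l = np.asarray(l, dtype=np.float64)
    return np.where(l < LL, A2 * l,
                    np.where(l < LH, B2 * (l + D2) ** C2,
                             E2 * np.exp(F2 * l)))
```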


    [0049] Block 94 generates a prediction function for the HDR image data. The prediction function attempts to predict a pixel value for a pixel in the HDR image data based upon a corresponding pixel value for the corresponding pixel in the LDR image data. The prediction function is ideally chosen to minimize the number of pixels in the HDR image data that have values that differ significantly from the values predicted by the prediction function. The prediction function is preferably non-linear in the logarithmic domain.

    [0050] In cases where pixel values representing chromaticity in the HDR data are almost the same as the corresponding pixel values in the LDR image data, it is not beneficial to compute a prediction function for the chromaticity pixel values (e.g. u' and v'). In such cases, it is only necessary to provide a prediction function for the brightness values (e.g. luma, luminance or the like).

    [0051] Since LDR frames 76 and HDR frames 74 contain similar information, these frames are strongly correlated. Where LDR frames 76 are obtained by applying a tone mapping algorithm to HDR frames 74, the particular nature of the correlation depends upon what tone mapping algorithm was used.

    [0052] Figures 4A through 4F show how the luma values of an LDR frame relate to the luma values of a corresponding HDR frame. Each of these figures applies a different tone mapping function to derive an LDR image from an example HDR image. These tone mapping functions provide generally linear relationships between l_ldr and l_hdr at lower values. There is more variation between the tone mapping functions at higher luminance values. In each of Figures 4A to 4F, LDR luma values are plotted on the horizontal axis and HDR luma values are plotted on the vertical axis. The points marked X indicate the pixel values of corresponding pixels in the LDR and HDR images.

    [0053] Figures 4A to 4F correspond respectively to the tone mapping functions disclosed in:

    [0054] The prediction function may be generated as described above. Where the prediction function is defined as the arithmetic mean of the values of all HDR pixels falling into a corresponding bin, the prediction can be written as:

$$\tilde{l}_{hdr}(l) = \frac{1}{|\Omega_l|} \sum_{i \in \Omega_l} l_{hdr}(i) \qquad (11)$$

    where $\Omega_l = \{\, i = 1 \ldots N \mid l_{ldr}(i) = l \,\}$ and $l = 0 \ldots 255$.

    [0055] N is the number of pixels in a frame and l_ldr(i) and l_hdr(i) are the luma values for the i-th pixel in the LDR and HDR frames respectively. The prediction function is preferably updated for each frame.

    [0056] In Figures 4A to 4F, the prediction functions are shown as solid lines. The prediction functions will depend on the image content as well as on the tone-mapping function used. Figures 4A through 4F show prediction functions for typical HDR images. Figures 4A to 4F show that typical prediction functions tend to change slowly, with an increasing slope, over significant portions of their range. Therefore, in some embodiments, instead of encoding the values of the prediction function for every bin, the differences between the prediction function values for two consecutive bins are encoded. To further reduce the number of bits, these differences can be compressed, for example, using an adaptive Huffman algorithm as indicated in block 95. The size of auxiliary data stream 84 is 1% or less of the total stream size in some embodiments. Therefore the storage overhead of a prediction function can be almost insignificant. Prediction functions or parts of prediction functions may also be represented in other ways, for example, as parameterized polynomial curves, spline curves, or other parameterized functions.
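    A sketch of this differential representation of the prediction function; the entropy coding stage (for example, the adaptive Huffman algorithm of block 95) is left to any standard coder:

```python
def delta_encode_lut(lut):
    """Store the first bin value plus the differences between consecutive
    bins; a slowly varying prediction function yields small differences
    that compress well under a Huffman coder (block 95)."""
    q = np.round(lut).astype(np.int32)
    return int(q[0]), np.diff(q)

def delta_decode_lut(first, diffs):
    """Rebuild the lookup table from the differential representation."""
    return np.concatenate(([first], first + np.cumsum(diffs)))
```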

    [0057] In block 96 residual frames are computed. The pixel values in the residual frame each represent the difference between the pixel value for the corresponding pixel of the HDR frame and the pixel value for that pixel predicted by applying the prediction function to the pixel value of the corresponding pixel of the LDR frame. Block 96 may be performed separately for each pixel value (l, u and v in this example). For luminance values, each pixel r_l(i) in the residual frame may be calculated as:

$$r_l(i) = l_{hdr}(i) - \tilde{l}_{hdr}\left( l_{ldr}(i) \right) \qquad (12)$$

    For chromatic values, the prediction function may be an identity function, in which case:

$$r_u(i) = u_{hdr}(i) - u_{ldr}(i) \qquad (13)$$

    and

$$r_v(i) = v_{hdr}(i) - v_{ldr}(i) \qquad (14)$$

    [0058] A properly chosen prediction function can significantly reduce the amount of data needed to encode HDR frames. Despite this saving, residual frames can still contain a significant amount of noise that does not visibly improve the quality of reconstructed HDR images. The compression ratio can be improved without causing a noticeable reduction in image quality by filtering the residual frames to reduce or eliminate this noise. Block 98 filters the residual frames.

    [0059] The signal in residual frames is often relatively close to the visibility threshold. Therefore, filtering can result in significant reduction of data without significant degradation in the quality of HDR images reconstructed from the data.

    [0060] An output of block 98 is a residual frame in which high frequencies have been attenuated in those regions where they are not visible. Figure 5 shows a method 110 that may be applied for filtering residual frames. Method 110 may be performed in the context of an encoding method according to the invention but also has application in other contexts where it is desired to reduce the amount of data representing an image without introducing visible artifacts into the image.

    [0061] The description that follows describes processing that is done on a luma channel. The same processing may also be applied to chroma channels. The chroma channels may be subsampled, for example to half of their original resolution, to reduce processing. This reduction approximately accounts for differences between the luminance and chrominance contrast sensitivity functions (CSFs).

    [0062] Method 110 receives a residue frame 112 and an HDR frame 114 that masks the residue frame. In blocks 116 and 118 a Discrete Wavelet Transform (DWT) is applied to split each of masking frame 114 and residue frame 112 into several frequency- and orientation-selective channels. Other suitable transforms, such as the cortex transform described in A.B. Watson, The cortex transform: Rapid computation of simulated neural images, Computer Vision, Graphics, and Image Processing, 39:311-327, 1987, may be applied in place of the DWT. The cortex transform can be very computationally intensive and so is practical only if sufficient computational resources are available.

    [0063] A prototype embodiment uses the CDF 9/7 discrete wavelet basis (which is used also for the lossy compression of images according to the JPEG-2000 standard). This wavelet basis gives a good trade-off between smoothness and computational efficiency. In the prototype, only the three finest scales of the wavelet decomposition are used since filtering of lower spatial frequencies at coarser scales could lead to noticeable artifacts.
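    A sketch of the three-scale decomposition of blocks 116 and 118, assuming PyWavelets, in which the CDF 9/7 basis is available as 'bior4.4':

```python
import numpy as np
import pywt

def three_scale_dwt(frame):
    """Decompose a frame with the CDF 9/7 wavelet; only the three finest
    detail scales are filtered, the approximation is passed through."""
    coeffs = pywt.wavedec2(frame.astype(np.float64), 'bior4.4', level=3)
    return coeffs[0], coeffs[1:]   # approximation, details (coarsest first)

def inverse_dwt(approx, details):
    """Transform filtered coefficients back to the image domain."""
    return pywt.waverec2([approx] + list(details), 'bior4.4')
```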

    [0064] In block 120 a function such as a contrast sensitivity function (CSF) is applied to account for the lower sensitivity of the human visual system for high spatial frequencies. Applying the CSF involves weighting each band of wavelet coefficients by a constant value. Example weighting factors for a viewing distance of 1700 pixels are given in Table 2.
    TABLE 2 - CSF coefficients
      Scale   LH         HL         HH
      1       0.275783   0.275783   0.090078
      2       0.837755   0.837755   0.701837
      3       0.999994   0.999994   0.999988
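    Applying the CSF then amounts to scaling each detail band by the corresponding Table 2 constant; a sketch continuing the PyWavelets conventions above (since the LH and HL weights are equal, only the diagonal band's mapping onto HH matters):

```python
# CSF weights from Table 2, finest scale first, as (LH, HL, HH).
CSF_WEIGHTS = [(0.275783, 0.275783, 0.090078),
               (0.837755, 0.837755, 0.701837),
               (0.999994, 0.999994, 0.999988)]

def apply_csf(details):
    """Weight each wavelet detail band by its CSF constant. `details`
    is ordered coarsest-to-finest, as returned by pywt.wavedec2."""
    weighted = []
    for (ch, cv, cd), (lh, hl, hh) in zip(details[::-1], CSF_WEIGHTS):
        weighted.append((ch * lh, cv * hl, cd * hh))
    return weighted[::-1]
```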


    [0065] Human visual channels have limited phase sensitivity. This provides a further opportunity to discard information without noticeable degradation of reconstructed images. A masking signal affects not only the regions where the values of wavelet coefficients are highest, but may also affect neighboring regions. Phase uncertainty also reduces the effect of masking at edges, as opposed to textures, which show higher amounts of masking.

    [0066] Phase uncertainty may be modelled with the L0.2 norm, which is also used in JPEG-2000 image compression. The L0.2 norm is given by:

$$\tilde{L}_{CSF} = \left( \frac{1}{|\theta|} \sum_{k \in \theta} \left| L_{CSF}(k) \right|^{0.2} \right)^{1/0.2} \qquad (15)$$

    and mathematical equivalents thereof, where θ denotes the neighbourhood of a coefficient (a 13x13 box is used as the neighbourhood in the prototype implementation), L_CSF is a wavelet coefficient that has been weighted by applying a CSF factor, and $\tilde{L}_{CSF}$ is the CSF-weighted wavelet coefficient after taking phase uncertainty into account.

    [0067] Block 124 predicts how threshold contrast changes in the presence of the masking signal from original HDR frame 114. To model contrast masking, one can employ a threshold elevation function. The threshold elevation function may, for example, have the form:

$$T_e\left( \tilde{L}_{CSF} \right) = a \cdot \tilde{L}_{CSF}^{\,b} + c \qquad (16)$$

    In the prototype embodiment, the constants in Equation (16) are given by a = 0.093071, b = 1.0299 and c = 11.535.

    [0068] Each CSF-weighted coefficient for the residual frame, R_CSF, is compared to the value of the corresponding threshold elevation T_e calculated from original HDR frame 114. If R_CSF is smaller than the visibility threshold T_e from Equation (16), the coefficient may be set to zero without introducing noticeable changes in the eventual reconstructed image. This may be expressed by:

$$R_{filt} = \begin{cases} 0, & \left| R_{CSF} \right| < T_e \\ R_{CSF}, & \text{otherwise} \end{cases} \qquad (17)$$
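    A numpy/scipy sketch of Equations (15) to (17) applied to one wavelet band; the closed form of the threshold elevation function follows the reconstruction of Equation (16) above and should be read as an assumption:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def phase_uncertainty(l_csf, size=13):
    """Equation (15): L0.2 norm over a 13x13 neighbourhood."""
    return uniform_filter(np.abs(l_csf) ** 0.2, size=size) ** 5.0

def threshold_elevation(masking, a=0.093071, b=1.0299, c=11.535):
    """Equation (16): threshold elevation from the masking signal."""
    return a * masking ** b + c

def filter_band(r_csf, l_csf):
    """Equation (17): zero residual coefficients that fall below the
    visibility threshold elevated by the HDR masking signal."""
    te = threshold_elevation(phase_uncertainty(l_csf))
    return np.where(np.abs(r_csf) < te, 0.0, r_csf)
```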



    [0069] Finally, the filtered wavelet coefficients, R_filt, are transformed back to the image domain. The prefiltering method presented above can substantially reduce the size of a residual stream. The filtering is a reasonable trade-off between computational efficiency and accuracy of the visual model. Filtering as described herein typically increases encoding time by no more than about 80%. Filtering during encoding does not increase decoding times.

    [0070] Returning to Figure 3, block 100 quantizes the filtered residual frames. Although the magnitudes of the differences encoded in residual frames are usually small, they can take values in the range of -4095 to 4095 (for 12-bit HDR luma encoding). Such values cannot be encoded using an 8-bit MPEG encoder. Although the MPEG standard provides an extension for encoding luma values using 12 bits, such an extension is rarely implemented, especially in hardware.

    [0071] Quantization block 100 permits the magnitude of residual values to be reduced, preferably sufficiently that those values can be encoded using a standard 8-bit MPEG encoder. Various quantization schemes may be used. For example, some embodiments apply a non-linear quantization, where large absolute values of residual are heavily quantized, while small values are preserved with maximum accuracy. Since very few pixels contain a residual having a large magnitude, most pixels are not affected by the strong quantization.

    [0072] Strong quantization can result in some images having poor visual quality. This is because even a few pixels that have large quantization errors can stand out in a way that detracts from perceived image quality.

    [0073] A simple clamping of residual values (for example, to an 8-bit range) can produce visually better results at the cost of losing details in very bright or dark regions. Furthermore, in typical images, with suitably chosen prediction functions, only a very few pixels have residual values that exceed an 8-bit range.

    [0074] In some embodiments, to reduce clamping at the cost of a stronger quantization, residual values are divided by a constant quantization factor. The factor can be chosen based upon a trade-off between errors due to clamping and errors due to quantization. Such quantization factors can be set separately for each bin, based on the maximum magnitude of the residual for all pixels that belong to that bin. Therefore, the residual values after quantization can be computed as:

$$\hat{r}_l(i) = \left[ \frac{r_l(i)}{q\left( l_{ldr}(i) \right)} \right]_{-127 \div 127} \qquad (18)$$

    where:
    • the operator $[\,\cdot\,]_{-127 \div 127}$ rounds the value inside the square brackets to the closest integer and then clamps the value if it is larger than 127 or smaller than -127;
    • q(l) is a quantization factor that is selected separately for each bin Ω_l. The quantization factor may be given by:

$$q(l) = \max\left( q_{min},\; \frac{\max_{i \in \Omega_l} \left| r_l(i) \right|}{127} \right) \qquad (19)$$

      where q_min is a minimum quantization factor which may be, for example, 1 or 2.
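    A sketch of Equations (18) and (19) in numpy, reusing the hypothetical array names from the earlier sketches (ldr_recon holds the reconstructed 8-bit LDR luma used for binning):

```python
def quantize_residual(r_l, ldr_recon, q_min=2.0, n_bins=256):
    """Equation (19): per-bin quantization factors from the maximum
    residual magnitude; Equation (18): round and clamp to [-127, 127]."""
    max_abs = np.zeros(n_bins)
    np.maximum.at(max_abs, ldr_recon.ravel(), np.abs(r_l).ravel())
    q = np.maximum(q_min, max_abs / 127.0)
    r_hat = np.clip(np.round(r_l / q[ldr_recon]), -127, 127)
    return r_hat.astype(np.int8), q
```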


    [0075] The quantization factors q(l) may be stored together with the prediction function in auxiliary data stream 84. This data may be first compressed as in block 95. In most cases, most of quantization factors q(l) will have the value qmin. Therefore, run-length encoding followed by Huffman encoding is an effective way to compress the data representing the quantization factors.

    [0076] In block 102 the residual values are encoded. When the residual values are 8-bit values they can be encoded using ordinary MPEG compression (e.g. MPEG-4 compression). In a prototype embodiment, the quantized residual values, r̂_l, and the chroma residual values r_u and r_v are MPEG encoded after rounding them to the nearest integer value. Note that the operations applied to obtain residual values are approximately linear in cases where the prediction function is nearly linear and the effect of the adaptive quantization of Equation (18) is minimal. In such cases, the visual information of a residual frame is in the same frequency bands as the original HDR frame, and the DCT quantization of the residual has a similar effect as for the original HDR pixel values. Therefore, a standard DCT quantization matrix can be used for encoding the residual frames.

    [0077] Since the MPEG encoding in blocks 88 and 102 are independent, it is possible to separately set MPEG quality parameters for each of blocks 88 and 102. In most applications, setting two sets of MPEG quality parameters is neither intuitive nor convenient. In preferred embodiments, a single quality control sets quality parameters for both of blocks 88 and 102. It has been found that setting the quality parameters in blocks 88 and 102 to be equal to one another provides generally satisfactory results.

    [0078] Some quality settings for blocks 88 and 102 may give better compression results than others. To achieve the best quality HDR images, block 102 should comprise encoding using the best quality. The quality settings in block 88 primarily affect the quality of LDR images reconstructed from stream 80 but may have some impact on the quality of HDR images also.

    [0079] Some embodiments of the invention exploit the fact that both LDR and HDR frames contain the same scenes. Therefore the optical flow should be the same for both of them. In such embodiments, the same motion vectors computed for LDR frames are also used for residual frames. Data structure 38 may include only one set of motion vectors. In alternative embodiments of the invention, motion vectors are computed separately for LDR and residual frames and both sets of motion vectors are stored in encoded image data 38.

    [0080] Software for performing methods according to the invention may be implemented in various ways. In a prototype embodiment, the software is implemented as a dynamic library to simplify integration with external software. A separate set of command line tools permits encoding and decoding of video streams from and to HDR image files.

    [0081] Since HDR video playback involves decoding two MPEG streams, 80 and 82, achieving an acceptable frame rate is more challenging than in the case of ordinary LDR video playback. Playback frame rate can be boosted by performing some parts of the decoding process using graphics hardware. For example, both color space conversion and up-sampling of color channels can be computationally expensive when executed on a CPU and yet can be performed extremely efficiently in a graphics processor (GPU) as fragment programs. Additionally, some color conversion functions can be significantly accelerated with the use of fixed point arithmetic and lookup tables.

    [0082] Figure 6 illustrates the performance of the prototype embodiment as a function of the quality setting. The lower points correspond to LDR stream 80 whereas the upper points correspond to the sum of LDR stream 80 and residual stream 82. It can be seen that for lower values of the qscale quality parameter (i.e. for higher quality images) the percentage of the overall data stream made up by residual stream 82 is smaller than it is for higher values of the quality parameter (corresponding to lower-quality LDR images).

    [0083] Codecs as described herein may be used to encode and decode both individual images and video sequences. Such codecs may be used to encode and decode movies to be stored on media such as DVDs, or other storage media that may become common for storing movies in future.

    [0084] Some aspects of the invention provide media players that include an output for HDR images to which a HDR display device is connected or can be connected. The media players include hardware, software, or a combination of hardware and software that implement decoding methods as shown in Figure 2 for example.

    [0085] Certain implementations of the invention comprise computer processors which execute software instructions which cause the processors to perform a method of the invention. For example, one or more processors in a data processing system may implement the encoding methods of Figures 1 or 3 or the decoding method of Figure 2 by executing software instructions stored in a memory accessible to the processors. The invention may also be provided in the form of a program product. The program product may comprise any medium which carries a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to execute a method of the invention. Program products according to the invention may be in any of a wide variety of forms. The program product may comprise, for example, physical media such as magnetic data storage media including floppy diskettes, hard disk drives, optical data storage media including CD ROMs, DVDs, electronic data storage media including ROMs, flash RAM, or the like. The computer-readable signals on the program product may optionally be compressed or encrypted.

    [0086] Where a component (e.g. a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, reference to that component (including a reference to a "means") should be interpreted as including as equivalents of that component any component which performs the function of the described component (i.e., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure which performs the function in the illustrated exemplary embodiments of the invention.


    Claims

    1. A method to decode image data at an image decoder, the method comprising:

    receiving image coded data, the image coded data comprising a compressed low dynamic range image, a compressed residual image, and a prediction function, wherein for input pixel values in a low dynamic range the prediction function provides corresponding predicted pixel values in a high dynamic range;

    decompressing the compressed low dynamic range image to generate a reconstructed low dynamic range image;

    decompressing the compressed residual image to generate a decoded residual image, wherein the decoded residual image is based on a reference high dynamic range image representing the same scene as the reconstructed low dynamic range image and the low dynamic range image is generated by an encoder by applying a tone-mapping function to the reference high dynamic range image;

    applying the prediction function to the reconstructed low dynamic range image to generate a predicted high dynamic range image; and

    combining the predicted high dynamic range image with the decoded residual image data to generate an output high dynamic range image,

    wherein the method is performed by one or more processors,
    wherein the prediction function is based on the reconstructed low dynamic range image and the reference high dynamic range image, and wherein the prediction function is based at least in part on pixel values of the pixels in the high dynamic range image for which corresponding pixels in the lower dynamic range image all have the same pixel value, and exploits statistical relationships between pixel values of the pixels in the lower dynamic range image and corresponding pixel values in the high dynamic range image,

    wherein for each pixel value represented in the lower dynamic range image, the corresponding pixel values in the high dynamic range image are the pixel values of those pixels in the high dynamic range image that correspond to pixels in a respective set of pixels in the lower dynamic range image that all have the respective pixel value in the lower dynamic range image.


     
    2. The method of claim 1, wherein the prediction function comprises a look-up table that is indexed by pixel values in a low dynamic range.
     
    3. The method of any previous claim, wherein the prediction function comprises a representation of a parameterized polynomial curve.
     
    4. The method of any previous claim, wherein the reconstructed low dynamic range image is based on a tone-mapped representation of the reference high dynamic range image.
     
    5. The method of any previous claim, further comprising receiving a quantization factor associated with the residual image.
     
    6. The method of claim 5, further comprising:

    applying the quantization factor to the decoded residual image data to generate dequantized residual image data; and

    combining the predicted high dynamic range image with the dequantized residual image data to generate the output high dynamic range image.


     
    7. The method of any previous claim, wherein the prediction function determines for each pixel value represented in the lower dynamic range image a predicted pixel value represented in the high dynamic range image based at least in part on the pixel values belonging to a respective bin among a plurality of bins.
     
    8. The method of claim 7, wherein generating a bin in the plurality of bins comprises:

    identifying groups of pixels in a reference high dynamic range image to which the low dynamic range image corresponds for which corresponding pixels in the lower dynamic range image all have the same pixel value within the lower dynamic range image; and

    for each of the plurality of groups, collecting the pixel values of the pixels in the respective group into a respective bin.


     
    9. The method of any previous claim, wherein the lower dynamic range image is obtainable from the high dynamic range image by a dynamic range reduction process and contains the same scene as the high dynamic range image.
     
    10. An apparatus for image decoding, the apparatus comprising:

    a processor; and

    a memory storing instructions that, when executed by the processor, configure the apparatus to:

    receive image coded data, the image coded data comprising a compressed low dynamic range image, a compressed residual image, and a prediction function, wherein for input pixel values in a low dynamic range the prediction function provides corresponding predicted pixel values in a high dynamic range;

    decompress the compressed low dynamic range image to generate a reconstructed low dynamic range image;

    decompress the compressed residual image to generate a decoded residual image, wherein the decoded residual image is based on a reference high dynamic range image representing the same scene as the reconstructed low dynamic range image and the low dynamic range image is generated by an encoder by applying a tone-mapping function to the reference high dynamic range image;

    apply the prediction function to the reconstructed low dynamic range image to generate a predicted high dynamic range image; and

    combine the predicted high dynamic range image with the decoded residual image data to generate an output high dynamic range image,
    wherein the prediction function is based on the reconstructed low dynamic range image and the reference high dynamic range image, and wherein the prediction function is based at least in part on pixel values of the pixels in the high dynamic range image for which corresponding pixels in the lower dynamic range image all have the same pixel value, and exploits statistical relationships between pixel values of the pixels in the lower dynamic range image and corresponding pixel values in the high dynamic range image,

    wherein for each pixel value represented in the lower dynamic range image, the corresponding pixel values in the high dynamic range image are the pixel values of those pixels in the high dynamic range image that correspond to pixels in a respective set of pixels in the lower dynamic range image that all have the respective pixel value in the lower dynamic range image.


     
    11. The apparatus of claim 10, wherein instructions further configure the apparatus to receive a quantization factor associated with the residual image.
     
    12. The apparatus of claim 11, wherein instructions further configure the apparatus to:

    apply the quantization factor to the decoded residual image to generate a dequantized residual image; and

    combine the predicted high dynamic range image with the dequantized residual image to generate the output high dynamic range image.


     
    13. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for executing with one or more processors a method comprising:

    receiving image coded data, the image coded data comprising a compressed low dynamic range image, a compressed residual image, and a prediction function, wherein for input pixel values in a low dynamic range the prediction function provides corresponding predicted pixel values in a high dynamic range;

    decompressing the compressed low dynamic range image to generate a reconstructed low dynamic range image;

    decompressing the compressed residual image to generate a decoded residual image, wherein the decoded residual image is based on a reference high dynamic range image representing the same scene as the reconstructed low dynamic range image;

    applying the prediction function to the reconstructed low dynamic range image to generate a predicted high dynamic range image; and

    combining the predicted high dynamic range image with the decoded residual image data to generate an output high dynamic range image,

    wherein the prediction function is based on the reconstructed low dynamic range image and the reference high dynamic range image, and wherein the prediction function is based at least in part on pixel values of the pixels in the high dynamic range image for which corresponding pixels in the lower dynamic range image all have the same pixel value, and exploits statistical relationships between pixel values of the pixels in the lower dynamic range image and corresponding pixel values in the high dynamic range image,

    wherein for each pixel value represented in the lower dynamic range image, the corresponding pixel values in the high dynamic range image are the pixel values of those pixels in the high dynamic range image that correspond to pixels in a respective set of pixels in the lower dynamic range image that all have the respective pixel value in the lower dynamic range image.


     


    Ansprüche

    1. Verfahren zum Decodieren von Bilddaten bei einem Bilddecodierer, wobei das Verfahren Folgendes umfasst:

    Empfangen von bildcodierten Daten, wobei die bildcodierten Daten ein komprimiertes Bild mit niedrigem Dynamikumfang, ein komprimiertes Restbild und eine Vorhersagefunktion umfassen, wobei für Eingangspixelwerte in einem niedrigem Dynamikumfang die Vorhersagefunktion entsprechende vorhergesagte Pixelwerte in einem hohen Dynamikumfang bereitstellt;

    Dekomprimieren des komprimierten Bilds mit niedrigem Dynamikumfang, um ein rekonstruiertes Bild mit niedrigem Dynamikumfang zu erzeugen;

    Dekomprimieren des komprimierten Restbilds, um ein decodiertes Restbild zu erzeugen, wobei das decodierte Restbild auf einem Referenzbild mit hohem Dynamikumfang basiert, das dieselbe Szene wie das rekonstruierte Bild mit niedrigem Dynamikumfang darstellt, und das Bild mit niedrigem Dynamikumfang von einem Codierer erzeugt wird, indem eine Dynamikkompressionsfunktion auf das Referenzbild mit hohem Dynamikumfang angewendet wird;

    Anwenden der Vorhersagefunktion an dem rekonstruierten Bild mit niedrigem Dynamikumfang, um ein vorhergesagtes Bild mit hohem Dynamikumfang zu erzeugen; und

    Kombinieren des vorhergesagten Bilds mit hohem Dynamikumfang mit den decodierten Restbilddaten, um ein Ausgangsbild mit hohem Dynamikumfang zu erzeugen,

    wobei das Verfahren von einem oder mehreren Prozessoren durchgeführt wird,

    wobei die Vorhersagefunktion auf dem rekonstruierten Bild mit niedrigem Dynamikumfang und dem Referenzbild mit hohem Dynamikumfang basiert und wobei die Vorhersagefunktion zumindest teilweise auf Pixelwerten der Pixel in dem Bild mit hohem Dynamikumfang basiert, für das entsprechende Pixel in dem Bild mit niedrigerem Dynamikumfang denselben Pixelwert aufweisen, und statistische Beziehungen zwischen Pixelwerten der Pixel in dem Bild mit niedrigerem dynamischen Bildumfang und entsprechenden Pixelwerten in dem Bild mit hohem Dynamikumfang ausnutzt,

    wobei für jeden Pixelwert, der in dem Bild mit niedrigerem Dynamikumfang dargestellt wird, die entsprechenden Pixelwerte in dem Bild mit hohem Dynamikumfang die Pixelwerte jener Pixel in dem Bild mit hohem Dynamikumfang sind, die Pixeln in einem jeweiligen Satz von Pixeln in dem Bild mit niedrigerem Dynamikumfang entsprechen, die alle den jeweiligen Pixelwert in dem Bild mit niedrigerem Dynamikumfang aufweisen.


     
    2. Verfahren nach Anspruch 1, wobei die Vorhersagefunktion eine Nachschlagetabelle umfasst, die von Pixelwerten in einem niedrigem Dynamikumfang besetzt ist.
     
    3. Verfahren nach einem vorstehenden Anspruch, wobei die Vorhersagefunktion eine Darstellung einer parametrisierten Polynomkurve umfasst.
     
    4. Verfahren nach einem vorstehenden Anspruch, wobei das rekonstruierte Bild mit niedrigem Dynamikumfang auf einer dynamikkomprimierten Darstellung des Referenzbilds mit hohem Dynamikumfang basiert.
     
    5. Verfahren nach einem vorstehenden Anspruch, weiter umfassend Empfangen eines Quantisierungsfaktors, der mit dem Restbild verknüpft ist.
     
    6. The method of claim 5, further comprising:

    applying the quantization factor to the decoded residual image data to generate dequantized residual image data; and

    combining the predicted high dynamic range image with the dequantized residual image data to generate the output high dynamic range image.
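
    A minimal sketch of claims 5 and 6, assuming the quantization factor is a single scalar received with the residual image (its exact form is not fixed by the claims):

        import numpy as np

        # Stand-ins for values produced by the earlier decoding steps.
        predicted_hdr = np.array([[120.0, 950.0], [3200.0, 410.0]])
        decoded_residual = np.array([[6.0, -24.0], [50.0, -8.0]])
        quantization_factor = 0.5   # received with the residual image

        dequantized_residual = quantization_factor * decoded_residual
        output_hdr = predicted_hdr + dequantized_residual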


     
    7. The method of any preceding claim, wherein the prediction function determines, for each pixel value represented in the lower dynamic range image, a predicted pixel value represented in the high dynamic range image based at least in part on the pixel values belonging to a respective bin among a plurality of bins.
     
    8. The method of claim 7, wherein generating a bin in the plurality of bins comprises:

    identifying groups of pixels in a reference high dynamic range image, to which the low dynamic range image corresponds, for which corresponding pixels in the lower dynamic range image all have the same pixel value within the lower dynamic range image; and

    for each of the plurality of groups, collecting the pixel values of the pixels in the respective group into a respective bin.
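
    One possible realization of this binning, assuming each bin is reduced to its mean and that lower dynamic range values absent from the image are filled by interpolation; both choices are illustrative, not required by the claims:

        import numpy as np

        def build_prediction_lut(ldr, hdr, levels=256):
            # For each LDR pixel value, collect the HDR values of the
            # corresponding pixels into a bin, then reduce the bin to a
            # single predicted value (here: the mean).
            lut = np.full(levels, np.nan)
            for value in range(levels):
                bin_values = hdr[ldr == value]
                if bin_values.size:
                    lut[value] = bin_values.mean()
            # Fill LDR values absent from the image by interpolation.
            missing = np.isnan(lut)
            if missing.any() and not missing.all():
                lut[missing] = np.interp(np.flatnonzero(missing),
                                         np.flatnonzero(~missing),
                                         lut[~missing])
            return lut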


     
    9. The method of any preceding claim, wherein the lower dynamic range image is obtainable from the high dynamic range image by a dynamic range reduction process and contains the same scene as the high dynamic range image.
     
    10. An apparatus for image decoding, the apparatus comprising:

    a processor; and

    a memory storing instructions which, when executed by the processor, configure the apparatus to:

    receive encoded image data, wherein the encoded image data comprises a compressed low dynamic range image, a compressed residual image and a prediction function, wherein, for input pixel values in a low dynamic range, the prediction function provides corresponding predicted pixel values in a high dynamic range;

    decompress the compressed low dynamic range image to generate a reconstructed low dynamic range image;

    decompress the compressed residual image to generate a decoded residual image, wherein the decoded residual image is based on a reference high dynamic range image representing the same scene as the reconstructed low dynamic range image, and the low dynamic range image is generated by an encoder by applying a tone mapping function to the reference high dynamic range image;

    apply the prediction function to the reconstructed low dynamic range image to generate a predicted high dynamic range image; and

    combine the predicted high dynamic range image with the decoded residual image data to generate an output high dynamic range image,

    wherein the prediction function is based on the reconstructed low dynamic range image and the reference high dynamic range image, and wherein the prediction function is based at least in part on pixel values of those pixels in the high dynamic range image for which corresponding pixels in the lower dynamic range image all have the same pixel value, and exploits statistical relationships between pixel values of pixels in the lower dynamic range image and corresponding pixel values in the high dynamic range image,

    wherein for each pixel value represented in the lower dynamic range image, the corresponding pixel values in the high dynamic range image are the pixel values of those pixels in the high dynamic range image that correspond to pixels in a respective set of pixels in the lower dynamic range image that all have the respective pixel value in the lower dynamic range image.


     
    11. The apparatus of claim 10, wherein the instructions further configure the apparatus to receive a quantization factor associated with the residual image.
     
    12. The apparatus of claim 11, wherein the instructions further configure the apparatus to:

    apply the quantization factor to the decoded residual image to generate a dequantized residual image; and

    combine the predicted high dynamic range image with the dequantized residual image to generate the output high dynamic range image.


     
    13. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for performing, with one or more processors, a method comprising:

    receiving encoded image data, wherein the encoded image data comprises a compressed low dynamic range image, a compressed residual image and a prediction function, wherein, for input pixel values in a low dynamic range, the prediction function provides corresponding predicted pixel values in a high dynamic range;

    decompressing the compressed low dynamic range image to generate a reconstructed low dynamic range image;

    decompressing the compressed residual image to generate a decoded residual image, wherein the decoded residual image is based on a reference high dynamic range image representing the same scene as the reconstructed low dynamic range image;

    applying the prediction function to the reconstructed low dynamic range image to generate a predicted high dynamic range image; and

    combining the predicted high dynamic range image with the decoded residual image data to generate an output high dynamic range image,

    wherein the prediction function is based on the reconstructed low dynamic range image and the reference high dynamic range image, and wherein the prediction function is based at least in part on pixel values of those pixels in the high dynamic range image for which corresponding pixels in the lower dynamic range image all have the same pixel value, and exploits statistical relationships between pixel values of pixels in the lower dynamic range image and corresponding pixel values in the high dynamic range image,

    wherein for each pixel value represented in the lower dynamic range image, the corresponding pixel values in the high dynamic range image are the pixel values of those pixels in the high dynamic range image that correspond to pixels in a respective set of pixels in the lower dynamic range image that all have the respective pixel value in the lower dynamic range image.


     


    Claims

    1. A method of decoding image data at an image decoder, the method comprising:

    receiving encoded image data, wherein the encoded image data comprises a compressed low dynamic range image, a compressed residual image and a prediction function, wherein, for input pixel values in a low dynamic range, the prediction function provides corresponding predicted pixel values in a high dynamic range;

    decompressing the compressed low dynamic range image to generate a reconstructed low dynamic range image;

    decompressing the compressed residual image to generate a decoded residual image, wherein the decoded residual image is based on a reference high dynamic range image representing the same scene as the reconstructed low dynamic range image, and the low dynamic range image is generated by an encoder by applying a tone mapping function to the reference high dynamic range image;

    applying the prediction function to the reconstructed low dynamic range image to generate a predicted high dynamic range image; and

    combining the predicted high dynamic range image with the decoded residual image data to generate an output high dynamic range image,

    wherein the method is performed by one or more processors,

    wherein the prediction function is based on the reconstructed low dynamic range image and the reference high dynamic range image, and wherein the prediction function is based at least in part on pixel values of those pixels in the high dynamic range image for which corresponding pixels in the lower dynamic range image all have the same pixel value, and exploits statistical relationships between pixel values of pixels in the lower dynamic range image and corresponding pixel values in the high dynamic range image,

    wherein for each pixel value represented in the lower dynamic range image, the corresponding pixel values in the high dynamic range image are the pixel values of those pixels in the high dynamic range image that correspond to pixels in a respective set of pixels in the lower dynamic range image that all have the respective pixel value in the lower dynamic range image.


     
    2. The method of claim 1, wherein the prediction function comprises a lookup table indexed by pixel values in a low dynamic range.
     
    3. The method of any preceding claim, wherein the prediction function comprises a representation of a parameterized polynomial curve.
     
    4. The method of any preceding claim, wherein the reconstructed low dynamic range image is based on a tone mapped representation of the reference high dynamic range image.
     
    5. The method of any preceding claim, further comprising receiving a quantization factor associated with the residual image.
     
    6. The method of claim 5, further comprising:

    applying the quantization factor to the decoded residual image data to generate dequantized residual image data; and

    combining the predicted high dynamic range image with the dequantized residual image data to generate the output high dynamic range image.


     
    7. The method of any preceding claim, wherein the prediction function determines, for each pixel value represented in the lower dynamic range image, a predicted pixel value represented in the high dynamic range image based at least in part on the pixel values belonging to a respective bin among a plurality of bins.
     
    8. The method of claim 7, wherein generating a bin in the plurality of bins comprises:

    identifying groups of pixels in a reference high dynamic range image, to which the low dynamic range image corresponds, for which corresponding pixels in the lower dynamic range image all have the same pixel value within the lower dynamic range image; and

    for each of the plurality of groups, collecting the pixel values of the pixels in the respective group into a respective bin.


     
    9. The method of any preceding claim, wherein the lower dynamic range image is obtainable from the high dynamic range image by a dynamic range reduction process and contains the same scene as the high dynamic range image.
     
    10. An apparatus for image decoding, the apparatus comprising:

    a processor; and

    a memory storing instructions which, when executed by the processor, configure the apparatus to:

    receive encoded image data, wherein the encoded image data comprises a compressed low dynamic range image, a compressed residual image and a prediction function, wherein, for input pixel values in a low dynamic range, the prediction function provides corresponding predicted pixel values in a high dynamic range;

    decompress the compressed low dynamic range image to generate a reconstructed low dynamic range image;

    decompress the compressed residual image to generate a decoded residual image, wherein the decoded residual image is based on a reference high dynamic range image representing the same scene as the reconstructed low dynamic range image, and the low dynamic range image is generated by an encoder by applying a tone mapping function to the reference high dynamic range image;

    apply the prediction function to the reconstructed low dynamic range image to generate a predicted high dynamic range image; and

    combine the predicted high dynamic range image with the decoded residual image data to generate an output high dynamic range image,

    wherein the prediction function is based on the reconstructed low dynamic range image and the reference high dynamic range image, and wherein the prediction function is based at least in part on pixel values of those pixels in the high dynamic range image for which corresponding pixels in the lower dynamic range image all have the same pixel value, and exploits statistical relationships between pixel values of pixels in the lower dynamic range image and corresponding pixel values in the high dynamic range image,

    wherein for each pixel value represented in the lower dynamic range image, the corresponding pixel values in the high dynamic range image are the pixel values of those pixels in the high dynamic range image that correspond to pixels in a respective set of pixels in the lower dynamic range image that all have the respective pixel value in the lower dynamic range image.


     
    11. The apparatus of claim 10, wherein the instructions further configure the apparatus to receive a quantization factor associated with the residual image.
     
    12. The apparatus of claim 11, wherein the instructions further configure the apparatus to:

    apply the quantization factor to the decoded residual image to generate a dequantized residual image; and

    combine the predicted high dynamic range image with the dequantized residual image to generate the output high dynamic range image.


     
    13. Support de stockage non transitoire lisible par ordinateur sur lequel sont stockées des instructions exécutables par ordinateur pour exécuter avec un ou plusieurs processeurs un procédé comprenant les étapes consistant à :

    recevoir des données codées d'images, les données codées d'images comprenant une image à gamme dynamique basse compressée, une image résiduelle compressée et une fonction de prédiction, dans lequel la fonction de prédiction fournit des valeurs de pixel prédites correspondantes dans une gamme dynamique élevée pour des valeurs de pixel d'entrée dans une gamme dynamique basse ;

    décompresser l'image à gamme dynamique basse compressée pour générer une image à gamme dynamique basse reconstruite ;

    décompresser l'image résiduelle compressée pour générer une image résiduelle décodée, dans lequel l'image résiduelle décodée est basée sur une image à gamme dynamique élevée de référence représentant la même scène que l'image à gamme dynamique basse reconstruite,

    appliquer la fonction de prédiction à l'image à gamme dynamique basse reconstruite pour générer une image à gamme dynamique élevée prédite ; et

    combiner l'image à gamme dynamique élevée prédite avec les données d'image résiduelle décodées pour générer une image à gamme dynamique élevée de sortie,

    dans lequel la fonction de prédiction est basée sur l'image à gamme dynamique basse reconstruite et l'image à gamme dynamique élevée de référence, et dans lequel la fonction de prédiction est basée au moins en partie sur des valeurs de pixel des pixels dans l'image à gamme dynamique élevée pour lesquels des pixels correspondants dans l'image à gamme dynamique inférieure présentent tous la même valeur de pixel, et exploite des relations statistiques entre des valeurs de pixel des pixels dans l'image à gamme dynamique inférieure et des valeurs de pixel correspondantes dans l'image à gamme dynamique élevée,

    dans lequel pour chaque valeur de pixel représentée dans l'image à gamme dynamique inférieure, les valeurs de pixel correspondantes dans l'image à gamme dynamique élevée sont les valeurs de pixel de ces pixels dans l'image à gamme dynamique élevée qui correspondent à des pixels dans un ensemble respectif de pixels dans l'image à gamme dynamique inférieure qui présentent tous la valeur de pixel respective dans l'image à gamme dynamique inférieure.


     



