(19)
(11)EP 3 136 733 B1

(12)EUROPEAN PATENT SPECIFICATION

(45)Mention of the grant of the patent:
04.11.2020 Bulletin 2020/45

(21)Application number: 15783156.1

(22)Date of filing:  20.04.2015
(51)International Patent Classification (IPC): 
H04N 19/31(2014.01)
H04N 21/2362(2011.01)
H04N 21/84(2011.01)
H04N 19/156(2014.01)
H04N 19/127(2014.01)
H04N 19/172(2014.01)
H04N 19/70(2014.01)
H04N 21/4402(2011.01)
H04N 21/434(2011.01)
H04N 19/157(2014.01)
H04N 19/187(2014.01)
(86)International application number:
PCT/JP2015/061937
(87)International publication number:
WO 2015/163267 (29.10.2015 Gazette  2015/43)

(54)

TRANSMISSION DEVICE, TRANSMISSION METHOD, RECEPTION DEVICE, AND RECEPTION METHOD

ÜBERTRAGUNGSVORRICHTUNG, ÜBERTRAGUNGSVERFAHREN, EMPFANGSVORRICHTUNG UND EMPFANGSVERFAHREN

DISPOSITIF D'ÉMISSION, PROCÉDÉ D'ÉMISSION, DISPOSITIF DE RÉCEPTION ET PROCÉDÉ DE RÉCEPTION


(84)Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

(30)Priority: 25.04.2014 JP 2014092083

(43)Date of publication of application:
01.03.2017 Bulletin 2017/09

(73)Proprietor: Sony Corporation
Tokyo 108-0075 (JP)

(72)Inventor:
  • TSUKAGOSHI, Ikuo
    Tokyo 108-0075 (JP)

(74)Representative: Witte, Weller & Partner Patentanwälte mbB 
Postfach 10 54 62
70047 Stuttgart (DE)


(56)References cited:
WO-A1-2014/050597
  
  • CHEN Y ET AL: "Carriage of HEVC multi-layer extension streams over MPEG-2 Systems", 107. MPEG MEETING; 13-1-2014 - 17-1-2014; SAN JOSE; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. m32295, 8 January 2014 (2014-01-08), XP030060747,
  • HAQUE M ET AL: "On HEVC descriptors for Temporal sub-streams with a single PID in a MPEG-2 Transport Stream", 101. MPEG MEETING; 16-7-2012 - 20-7-2012; STOCKHOLM; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. m26186, 11 July 2012 (2012-07-11), XP030054519,
  • "WD of ISO/IEC 13818-1:2013 AMD 7 Carriage of Layered HEVC", 107. MPEG MEETING;13-1-2014 - 17-1-2014; SAN JOSE; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. N14118, 13 February 2014 (2014-02-13), XP030020856,
  • JIANLE CHEN: 'Qualcomm, High efficiency video coding (HEVC) scalable extension Draft 6' JCTVC-Q1008_V5, [Online] vol. 90, pages 4 - 7, XP055232994 Retrieved from the Internet: <URL:http:// phenix.it-sudparis.eu/jct/doc_end_user/ documents/17_Valencia/wgll/JCTVC-Q1008-v5.z ip>
  • 'Service Information for Digital Broadcasting System' ARIB STD-B10 31 July 2014, pages 199 - 201, XP055233001
  • YE-KUI WANG ET AL.: 'Vinod Kumar Malamal Vadakital' TEXT OF ISO/IEC 14996-15:2013 PDAM 1 CARRIAGE OF LAYERED HEVC, [Online] July 2014, pages 18 - 21, XP055233026 Retrieved from the Internet: <URL:http://mpeg. chiariglione.org/sites/default/files/files/ standards/parts/docs/w14727-v2-w14727.zip>
  • HAQUE M ET AL: "On HEVC descriptors for Temporal sub-streams with multiple PIDs in a MPEG-2 Transport Stream", 101. MPEG MEETING; 16-7-2012 - 20-7-2012; STOCKHOLM; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11),, no. m26187, 16 July 2012 (2012-07-16), XP030054520,
  • "HEVC / MPEG-H part 2 or ITU-T Rec. H.265", 3GPP DRAFT; S4-121322 SYSTEM LAYER INTEGRATION OF HEVC, 3RD GENERATION PARTNERSHIP PROJECT (3GPP), MOBILE COMPETENCE CENTRE ; 650, ROUTE DES LUCIOLES ; F-06921 SOPHIA-ANTIPOLIS CEDEX ; FRANCE , 31 October 2012 (2012-10-31), XP050687116, Retrieved from the Internet: URL:http://www.3gpp.org/ftp/tsg_sa/WG4_COD EC/TSGS4_71/Docs/ [retrieved on 2012-10-31]
  
Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).


Description

TECHNICAL FIELD



[0001] The present technology relates to a transmission apparatus, a transmission method, a reception apparatus, and a reception method, and in particular to a transmission apparatus or the like which hierarchically encodes and transmits the image data of each picture constituting moving picture data.

BACKGROUND ART



[0002] When a compressed moving picture service is provided by broadcasting, the internet, or the like, the upper limit of the reproducible frame frequency is determined by the decoding capability of receivers. Thus, taking the reproduction capability of widespread receivers into consideration, service providers need to either restrict themselves to services having a low frame frequency or simultaneously provide a plurality of services having high and low frame frequencies.

[0003] A receiver capable of dealing with a service having a high frame frequency is costly, which is an obstruction factor to the spread of such receivers. Moreover, when service providers start a service having a high frame frequency after inexpensive receivers dedicated to services having a low frame frequency have become widespread, the new service cannot be viewed at all without a new receiver, which is an obstruction factor to the spread of the service.

[0004] For example, in High Efficiency Video Coding (HEVC), temporal (time direction) scalability achieved by hierarchically encoding the image data of each picture constituting moving picture data has been proposed (Non Patent Document 1). Based on the temporal ID (temporal_id) inserted in the header of each network abstraction layer (NAL) unit, a receiving side can identify the layer of each picture and can selectively decode up to the layer corresponding to its decoding capability.

[0005] Non Patent Document 2, according to its abstract, relates to a design for the carriage of multi-layer HEVC extension bitstreams, including SHVC and MV-HEVC bitstreams, over MPEG-2 Systems.

[0006] Non Patent Documents 3 and 4, according to their abstracts, relate to transport of HEVC video over MPEG-2 Systems. HEVC video in its version-1 specification supports temporal scalability with multiple temporal layers using hierarchical B-pictures, etc. For a constrained HEVC bitstream with suitable temporal scalability and GOP structures, it is quite possible to separate as well as concatenate the temporal layers of such an HEVC video bitstream, thus generating a set of single- to multi-layer sub-streams with different frame rates.

[0007] Non Patent Document 5, according to its abstract, relates to the integration of HEVC into end-to-end multimedia systems, formats, and protocols such as Real-Time Transport Protocol (RTP), the Transport Stream of the MPEG-2 standard suite and Dynamic Adaptive Streaming over the Hypertext Transfer Protocol (DASH). The document further relates to applications of video transport and delivery such as broadcast, IPTV (television over the Internet Protocol), Internet streaming, video conversation and storage as provided by the different system layers.

CITATION LIST


NON PATENT DOCUMENT



[0008] 

Non Patent Document 1: Gary J. Sullivan, Jens-Rainer Ohm, Woo-Jin Han, Thomas Wiegand, "Overview of the High Efficiency Video Coding (HEVC) Standard", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 22, NO. 12, pp. 1649-1668, DECEMBER 2012

Non Patent Document 2: Ying Chen, Ye-Kui Wang, Hendry, "Carriage of HEVC multi-layer extension streams over MPEG-2 Systems", ISO/IEC JTC1/SC29/WG11/M32295, January 8, 2014, XP030060747

Non Patent Document 3: Munsi Haque, Kazushi Sato, Teruhiko Suzuki, "On HEVC descriptors for temporal sub-streams with a single PID in a MPEG-2 Transport Stream", ISO/IEC JTC1/SC29/WG11, MPEG2012/m26186, July 11, 2012, XP030054519

Non Patent Document 4: Munsi Haque, Kazushi Sato, Teruhiko Suzuki, "On HEVC descriptors for temporal sub-streams with multiple PIDs in a MPEG-2 Transport Stream", ISO/IEC JTC1/SC29/WG11, MPEG2012/m26187, July 16, 2012, XP030054520

Non Patent Document 5: Thomas Schierl, Miska M. Hannuksela, Ye-Kui Wang, Stephan Wenger, "System Layer Integration of HEVC", HEVC / MPEG-H part 2 or ITU-T Rec. H.265, 3GPP DRAFT, S4-121322, October 31, 2012, XP050687116


SUMMARY OF THE INVENTION


PROBLEMS TO BE SOLVED BY THE INVENTION



[0009] A purpose of the present technology is to enable a receiving side to perform excellent decode processing according to decoding capability.

SOLUTIONS TO PROBLEMS



[0010] Therefore, there are provided a transmission apparatus, a transmission method, a reception apparatus, and a reception method according to the independent claims.

EFFECTS OF THE INVENTION



[0011] According to the present technology, it is possible for a receiving side to perform excellent decode processing according to decoding capability. Note that, effects are not necessarily limited to the effects described here and may be any one of the effects described in the present disclosure.

BRIEF DESCRIPTION OF DRAWINGS



[0012] 

Fig. 1 is a block diagram illustrating a configuration example of a transceiver system of an embodiment.

Figs. 2(a) and 2(b) are a diagram illustrating a structure example of a NAL unit header and a diagram illustrating semantics of main parameters in the structure example.

Figs. 3(a) to 3(c) are diagrams illustrating an example of hierarchical encoding.

Fig. 4 is a diagram illustrating "level_idc" and an example of the value.

Fig. 5 is a block diagram illustrating a configuration example of a transmission apparatus.

Fig. 6 is a diagram illustrating a structure example of an element-added HEVC descriptor.

Fig. 7 is a diagram illustrating a structure example of a layer/signaling/descriptor.

Fig. 8 is a diagram illustrating semantics of main information in the structure example of the layer/signaling/descriptor.

Figs. 9(a) to 9(c) are diagrams explaining a configuration example of a profile layer.

Fig. 10 is a diagram illustrating a configuration example of a transport stream TS.

Fig. 11 is a block diagram illustrating a configuration example of a reception apparatus.

Fig. 12 is a diagram schematically illustrating a configuration of a demultiplexer and a decoder.

Fig. 13 is a diagram schematically illustrating behavior of a temporal ID analysis unit.

Fig. 14 is an example of a processing flow of a decoder and a post-processing unit.

Figs. 15(a) and 15(b) are diagrams illustrating a configuration example of an MPEG-DASH-based stream distribution system.

Figs. 16(a) to 16(d) are diagrams illustrating an example of a relation between structures hierarchically arranged in an MPD file.

Fig. 17 is a diagram illustrating a configuration example of a FragmentedMP4 stream in association with an MPD file.

Fig. 18 is a diagram illustrating an implementation of an MPD file.


MODE FOR CARRYING OUT THE INVENTION



[0013] Hereinafter, a mode for carrying out the invention (hereinafter referred to as an "embodiment") is described. Note that the description is made in the following order:
  1. Embodiment
  2. Modified example

<1. Embodiment>


[Transceiver system]



[0014] Fig. 1 illustrates a configuration example of a transceiver system 10 of an embodiment. The transceiver system 10 includes a transmission apparatus 100 and a reception apparatus 200.

[0015] The transmission apparatus 100 transmits a transport stream TS as a container on a broadcast wave. The transport stream TS contains video data having encoded image data of the picture in each of a plurality of layers, obtained by classifying the image data of the pictures constituting moving picture data into the plurality of layers and encoding the image data. In this case, by performing encoding such as H.264/AVC or H.265/HEVC, for example, the image data is encoded so that a referred picture belongs to the own layer and/or a layer lower than the own layer.

[0016] Layer identification information is added to the encoded image data of the picture of each of the layers in order to identify the layer to which each picture belongs. In this embodiment, the layer identification information ("nuh_temporal_id_plus1" indicating temporal_id) is arranged in the header part of the NAL unit (nal_unit) of each picture. By adding the layer identification information in this manner, it is possible for a receiving side to selectively extract the encoded image data in layers lower than a predetermined layer and perform decode processing.

[0017] Fig. 2(a) illustrates a structure example (Syntax) of a NAL unit header, and Fig. 2(b) illustrates semantics of main parameters in the structure example. A 1-bit field of "Forbidden_zero_bit" is necessarily 0. A 6-bit field of "Nal_unit_type" indicates a NAL unit type. A 6-bit field of "Nuh_layer_id" is assumed to be 0 here. A 3-bit field of "Nuh_temporal_id_plus1" indicates temporal_id, and takes a value obtained by adding 1 to temporal_id (1 to 7).
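The bit layout described above can be illustrated with a short parser. The following is a non-normative sketch that extracts the fields of the 2-byte NAL unit header; the function name and the returned tuple are chosen for illustration only.

```python
def parse_hevc_nal_header(data: bytes):
    """Parse the 2-byte HEVC NAL unit header of Fig. 2(a)."""
    if len(data) < 2:
        raise ValueError("an HEVC NAL unit header is 2 bytes long")
    hdr = (data[0] << 8) | data[1]
    forbidden_zero_bit = (hdr >> 15) & 0x01   # 1 bit, necessarily 0
    nal_unit_type = (hdr >> 9) & 0x3F         # 6 bits
    nuh_layer_id = (hdr >> 3) & 0x3F          # 6 bits, assumed 0 here
    nuh_temporal_id_plus1 = hdr & 0x07        # 3 bits, takes values 1 to 7
    if forbidden_zero_bit != 0 or nuh_temporal_id_plus1 == 0:
        raise ValueError("malformed NAL unit header")
    # temporal_id (the layer identification information) is the field minus 1
    return nal_unit_type, nuh_layer_id, nuh_temporal_id_plus1 - 1
```

For example, the header bytes 0x02 0x04 carry nal_unit_type=1, nuh_layer_id=0, and temporal_id=3.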

[0018] The transport stream TS contains a single video stream. Furthermore, a level designation value of a bit stream and information on a layer range in each of a plurality of layer ranges having a different maximum layer are inserted in the transport stream TS. The information is inserted, for example, under a program map table (PMT).

[0019] Figs. 3(a) to 3(c) illustrate an example of hierarchical encoding, and each rectangular frame indicates a picture. Fig. 3(a) is an example in which the uppermost layer is 3. For example, a level designation value of a bit stream of a layer range whose maximum layer is layer 3 is "level5.2" corresponding to a service of 2160/100P, and the information on the layer range indicates 0 to 3. Furthermore, for example, a level designation value of a bit stream of a layer range whose maximum layer is layer 2 is "level5.1" corresponding to a service of 2160/50P, and the information on the layer range indicates 0 to 2.

[0020] Fig. 3(b) is an example in which the uppermost layer is 3. For example, a level designation value of a bit stream of a layer range whose maximum layer is layer 3 is "level5.2" corresponding to a service of 2160/100P, and the information on the layer range indicates 0 to 3. Furthermore, for example, a level designation value of a bit stream of a layer range whose maximum layer is layer 2 is "level5.1" corresponding to a service of 2160/50P, and the information on the layer range indicates 0 to 2.

[0021] Fig. 3(c) is an example in which the uppermost layer is 4. For example, a level designation value of a bit stream of a layer range whose maximum layer is layer 4 is "level5.2" corresponding to a service of 2160/100P, and the information on the layer range indicates 0 to 4. Furthermore, for example, a level designation value of a bit stream of a layer range whose maximum layer is layer 3 is "level5.1" corresponding to a service of 2160/50P, and the information on the layer range indicates 0 to 3. Furthermore, for example, a level designation value of a bit stream of a layer range whose maximum layer is layer 2 is "level5" corresponding to a service of 2160/25P, and the information on the layer range indicates 0 to 2.

[0022] Here, the value of "level_idc" is described. Fig. 4 illustrates "level_idc" and examples of its value. For example, "level_idc" corresponding to a service of 1080/50P is "level4.1", and the value is "123" in decimal and "0x7b" in hexadecimal. To indicate this "level4.1", "b", which is the lower 4 bits, is to be described as "ls4b_sublayer_level_idc" in an HEVC descriptor, which will be described later. Furthermore, for example, "level_idc" corresponding to a service of 1080/100P is "level4.2", and the value is "126" in decimal and "0x7e" in hexadecimal.

[0023] Furthermore, for example, "level_idc" corresponding to a service of 2160/50P is "level5.1", and the value is "153" in decimal and "0x99" in hexadecimal. To indicate this "level5.1", "9", which is the lower 4 bits, is to be described as "ls4b_sublayer_level_idc" in an HEVC descriptor, which will be described later. Furthermore, for example, "level_idc" corresponding to a service of 2160/100P is "level5.2", and the value is "156" in decimal and "0x9c" in hexadecimal.

[0024] Furthermore, for example, "level_idc" corresponding to a service of 4320/50P is "level6.1", and the value is "183" in decimal and "0xb7" in hexadecimal. To indicate this "level6.1", "7", which is the lower 4 bits, is to be described as "ls4b_sublayer_level_idc" in an HEVC descriptor, which will be described later. Furthermore, for example, "level_idc" corresponding to a service of 4320/100P is "level6.2", and the value is "186" in decimal and "0xba" in hexadecimal.
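The correspondence between "level_idc" and its lower 4 bits can be summarized as follows. This is an illustrative sketch that merely restates the values of paragraphs [0022] to [0024]; the table name and function name are chosen for illustration.

```python
# "level_idc" values given in paragraphs [0022] to [0024]
LEVEL_IDC = {
    "level4.1": 0x7B,  # 123, 1080/50P
    "level4.2": 0x7E,  # 126, 1080/100P
    "level5":   0x96,  # 150, 2160/25P
    "level5.1": 0x99,  # 153, 2160/50P
    "level5.2": 0x9C,  # 156, 2160/100P
    "level6.1": 0xB7,  # 183, 4320/50P
    "level6.2": 0xBA,  # 186, 4320/100P
}

def ls4b_sublayer_level_idc(level_idc: int) -> int:
    """Lower 4 bits of "level_idc", as described in the HEVC descriptor."""
    return level_idc & 0x0F
```

For example, for "level5.1" (0x99) the function yields "9", and for "level6.1" (0xb7) it yields "7", matching the descriptions above.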

[0025] The reception apparatus 200 receives the above described transport stream TS transmitted from the transmission apparatus 100 on a broadcast wave or on an internet packet. The reception apparatus 200 extracts, from the video data contained in the transport stream TS, the encoded image data of the pictures in layers lower than the predetermined layer and decodes the encoded image data according to its own decoding capability. At this time, the reception apparatus 200 performs decoding on the basis of the level designation value of the bit stream and the information on the layer range in each of the layer ranges having a different maximum layer, which are inserted in the transport stream TS as described above.

"Configuration of the transmission apparatus"



[0026] Fig. 5 illustrates a configuration example of the transmission apparatus 100. The transmission apparatus 100 includes a central processing unit (CPU) 101, an encoder 102, a compressed data buffer (cpb: coded picture buffer) 103, a multiplexer 104, and a transmission unit 105. The CPU 101 is a control unit, and controls the operations of the units of the transmission apparatus 100.

[0027] The encoder 102 receives uncompressed moving picture data VD and hierarchically encodes the data. The encoder 102 classifies the image data of each picture constituting the moving picture data VD into a plurality of layers. Then, the encoder 102 encodes the classified image data of the picture in each of the layers, and generates a video stream (video data) having the encoded image data of the picture in each of the layers.

[0028] The encoder 102 performs encoding such as H.264/AVC or H.265/HEVC. At this time, the encoder 102 performs encoding so that a picture to be referred to (a referred picture) belongs to the own layer and/or a layer lower than the own layer. The coded picture buffer (cpb) 103 temporarily stores the video stream generated by the encoder 102 and containing the encoded image data of the picture in each of the layers.

[0029] The multiplexer 104 reads and PES-packetizes the video stream stored in the compressed data buffer 103, multiplexes the video stream by transport-packetizing it, and obtains the transport stream TS as a multiplexed stream. The transport stream TS contains a single video stream as described above. The multiplexer 104 inserts, in the container layer, the level designation value of the bit stream and the information on the layer range in each of the layer ranges (level layers) having a different maximum layer. The transmission unit 105 transmits the transport stream TS obtained by the multiplexer 104 to the reception apparatus 200 on a broadcast wave or an internet packet.

[Insertion of information]



[0030] The insertion of information by the multiplexer 104 is further described. To insert the information, an existing HEVC descriptor (HEVC_descriptor) or a newly defined layer/signaling/descriptor (Layer_signaling descriptor) is used.

[0031] Fig. 6 illustrates a structure example (Syntax) in the case of an existing HEVC descriptor to which elements are added. An 8-bit field of "descriptor_tag" indicates a descriptor type, and indicates an HEVC descriptor here. An 8-bit field of "descriptor_length" indicates the length (size) of the descriptor, and indicates the following number of bytes as the length of a descriptor.

[0032] An 8-bit field of "profile_idc" indicates a profile of a bit stream. An 8-bit field of "level_idc" indicates a level designation value of a bit stream in the uppermost layer. A 4-bit field of "ls4b_sublayer_level_idc" indicates a level designation value of a bit stream in a layer lower than the uppermost layer (for example, the layer one level below the uppermost layer). In this case, lower 4 bits are arranged in hexadecimal.

[0033] Furthermore, in the case of "temporal_layer_subset_flag=1", there exists a 3-bit field for each of "temporal_id_min", "temporal_id_max", "temporal_id_sublayer_min", and "temporal_id_sublayer_max". "temporal_id_max" indicates the value of temporal_id of the uppermost layer of the layer range whose maximum layer is the uppermost layer, and "temporal_id_min" indicates the value of temporal_id of the lowest layer of that layer range. Furthermore, "temporal_id_sublayer_max" indicates the value of temporal_id of the uppermost layer of the layer range whose maximum layer is lower than the uppermost layer (normally, the layer one level below the uppermost layer), and "temporal_id_sublayer_min" indicates the value of temporal_id of the lowest layer of that layer range.

[0034] For example, a specific example of each field description in the hierarchical encoding example illustrated in Fig. 3(a) is described. "0x9c", which is the value of "level5.2", is described in the 8-bit field of "level_idc". "9" of "0x99", which is the value of "level5.1", is described in the 4-bit field of "ls4b_sublayer_level_idc".

[0035] "011" indicating temporal_id=3 is described in the 3-bit field of "temporal_id_max", and "000" indicating temporal_id=0 is described in the 3-bit field of "temporal_id_min". Furthermore, "010" indicating temporal_id=2 is described in the 3-bit field of "temporal_id_sublayer_max", and "000" indicating temporal_id=0 is described in the 3-bit field of "temporal_id_sublayer_min".
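The field values of this specific example can be gathered as follows. This is a sketch of the field values only, not of the normative bit-serialized descriptor layout, and the function name is an illustrative assumption.

```python
def hevc_descriptor_fields_fig3a() -> dict:
    """Field values of the element-added HEVC descriptor (Fig. 6)
    for the hierarchical encoding example of Fig. 3(a)."""
    return {
        "level_idc": 0x9C,                  # level5.2, layer range 0 to 3
        "ls4b_sublayer_level_idc": 0x9,     # lower 4 bits of 0x99 (level5.1)
        "temporal_id_max": 0b011,           # temporal_id = 3
        "temporal_id_min": 0b000,           # temporal_id = 0
        "temporal_id_sublayer_max": 0b010,  # temporal_id = 2
        "temporal_id_sublayer_min": 0b000,  # temporal_id = 0
    }
```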

[0036] Fig. 7 illustrates a structure example (Syntax) of a layer/signaling/descriptor (Layer_signaling descriptor). Furthermore, Fig. 8 illustrates semantics of main information in the structure example. By the above described HEVC descriptor (Fig. 6), a level designation value of a bit stream and information on a layer range in each of two layer ranges are transmitted. However, by the layer/signaling/descriptor, a level designation value of a bit stream and information on a layer range in each of three or more layer ranges can be transmitted.

[0037] The 8-bit field of "descriptor_tag" indicates a descriptor type, and indicates a layer/signaling/descriptor here. An 8-bit field of "descriptor_length" indicates the length (size) of the descriptor, and indicates the following number of bytes as the length of a descriptor.

[0038] An 8-bit field of "overall_profile_idc" indicates a profile of the maximum range related to scalable encoding tools. An 8-bit field of "highest_level_idc" indicates the maximum level of a scalable range. An 8-bit field of "number_of_profile_layers" indicates the number of profile layers having a scalable function. An 8-bit field of "number_of_level_layers" indicates the number of level layers.

[0039] An 8-bit field of "layer_profile_idc[i]" indicates a profile of each profile layer. An 8-bit field of "layer_level_idc[i][j]" indicates a level of each level layer. An 8-bit field of "temporal_id_layer_min[i][j]" indicates a value of the minimum temporal_id in each level layer. An 8-bit field of "temporal_id_layer_max[i][j]" indicates a value of the maximum temporal_id in each level layer.

[0040] For example, a specific example of each field description related to a level in the hierarchical encoding example illustrated in Fig. 3(c) is described. "0x9c", which is the value of "level5.2", is described in the 8-bit field of "highest_level_idc". When three layer ranges in which each of layers 4, 3, and 2 is the maximum layer are set, "3" indicating that the number of level layers is three is described in the 8-bit field of "number_of_level_layers".

[0041] Then, with regard to a first level layer, "0x9c", which is the value of "level5.2", is described in the 8-bit field of "layer_level_idc[i][j]", "100" indicating temporal_id=4 is described in the 8-bit field of "temporal_id_layer_max[i][j]", and "000" indicating temporal_id=0 is described in the 8-bit field of "temporal_id_layer_min[i][j]".

[0042] Furthermore, with regard to a second level layer, "0x99", which is the value of "level5.1", is described in the 8-bit field of "layer_level_idc[i][j]", "011" indicating temporal_id=3 is described in the 8-bit field of "temporal_id_layer_max[i][j]", and "000" indicating temporal_id=0 is described in the 8-bit field of "temporal_id_layer_min[i][j]".

[0043] Furthermore, with regard to a third level layer, "0x96", which is the value of "level5", is described in the 8-bit field of "layer_level_idc[i][j]", "010" indicating temporal_id=2 is described in the 8-bit field of "temporal_id_layer_max[i][j]", and "000" indicating temporal_id=0 is described in the 8-bit field of "temporal_id_layer_min[i][j]".
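The three level layers described in paragraphs [0041] to [0043] can be expressed as follows. This is a non-normative sketch listing the per-level-layer values; the function name and dictionary keys are chosen for illustration.

```python
def level_layers_fig3c() -> dict:
    """Level-layer entries of the layer/signaling/descriptor (Fig. 7)
    for the hierarchical encoding example of Fig. 3(c), as tuples of
    (layer_level_idc, temporal_id_layer_max, temporal_id_layer_min)."""
    layers = [
        (0x9C, 4, 0),  # level5.2, layers 0 to 4, 2160/100P
        (0x99, 3, 0),  # level5.1, layers 0 to 3, 2160/50P
        (0x96, 2, 0),  # level5,   layers 0 to 2, 2160/25P
    ]
    return {
        "highest_level_idc": max(lvl for lvl, _, _ in layers),
        "number_of_level_layers": len(layers),
        "level_layers": layers,
    }
```

Unlike the HEVC descriptor of Fig. 6, which carries two layer ranges, this structure carries three (or more) such entries.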

[0044] Here, a configuration example of a profile layer is described with reference to Figs. 9(a) to 9(c). Fig. 9(a) is an example in which an 8-bit image is encoded with two streams: a base stream of the main profile with "Profile_layering_id=0", and difference data encoded as an extended stream of the scalable main profile with "Profile_layering_id=1". The image quality is improved compared with the base stream alone.

[0045] Fig. 9(b) is an example in which a 10-bit image is encoded with two streams: a base stream of the main10 profile with "Profile_layering_id=0", and difference data encoded as an extended stream of the scalable main10 profile with "Profile_layering_id=1". The image quality is improved compared with the base stream alone.

[0046] Fig. 9(c) is an example in which an 8-bit image is encoded with a base stream of the main profile with "Profile_layering_id=0" and difference data encoded as an extended stream 1 of the scalable main profile with "Profile_layering_id=1", improving the image quality compared with the base stream alone. Furthermore, the difference data between 10 bits and 8 bits is encoded as an extended stream 2 of the scalable main profile with "Profile_layering_id=2", yielding an image having an accuracy of 10 bits.

[0047] Note that the added value provided by displaying the scalable extended streams is not limited to the image quality improvement described above; scalable extension also applies to an increase of spatial resolution, expansion of a color gamut, and expansion of a luminance level. With regard to these streams, by analyzing the packets from the decoder input buffer of a receiver and appropriately discriminating them, desired decoding can be performed.

[Configuration of the transport stream TS]



[0048] Fig. 10 illustrates a configuration example of the transport stream TS. The transport stream TS contains one video stream. In other words, in this configuration example, there exists a PES packet "video PES1" of a video stream having encoded image data of pictures of a plurality of layers encoded by, for example, HEVC.

[0049] In the encoded image data of each picture, there exist NAL units such as VPS, SPS, PPS, SLICE, and SEI. As described above, the layer identification information on the picture ("nuh_temporal_id_plus1" indicating temporal_id) is arranged in the header of each NAL unit. "general_level_idc", which is the level designation value of the bit stream, is inserted in the SPS.

[0050] Furthermore, the transport stream TS contains a program map table (PMT) as program specific information (PSI). The PSI indicates the program to which each elementary stream contained in the transport stream belongs.

[0051] In the PMT, there exists a program loop (Program loop) describing information related to an entire program. Furthermore, there exists an elementary loop having information related to each elementary stream in the PMT. In this configuration example, there exists a video elementary loop (video ES1 loop).

[0052] In the video elementary loop, information such as a stream type and a packet identifier (PID) corresponding to the video stream (video PES1) is arranged, together with descriptors describing information related to the video stream. As one of the descriptors, the above described HEVC descriptor (HEVC_descriptor) or layer/signaling/descriptor (Layer_signaling descriptor) is inserted. Note that the layer/signaling/descriptor is not inserted when the element-added HEVC descriptor illustrated in Fig. 6 is used.

[0053] The operations of the transmission apparatus 100 illustrated in Fig. 5 are briefly described. Uncompressed moving picture data VD is input to the encoder 102. The encoder 102 hierarchically encodes the moving picture data VD. In other words, the encoder 102 classifies the image data of each picture constituting the moving picture data VD into a plurality of layers, encodes the image data, and generates a video stream (video data) having the encoded image data of the picture in each of the layers. In this case, the image data is encoded so that a picture to be referred to belongs to the own layer and/or a layer lower than the own layer.

[0054] The video stream containing the encoded data of the picture in each of the layers and generated by the encoder 102 is supplied to the compressed data buffer (cpb) 103 and temporarily stored. By the multiplexer 104, the video stream stored in the compressed data buffer 103 is read, PES-packetized, and multiplexed by being transport-packetized, and the transport stream TS as a multiplexed stream is obtained. The transport stream TS contains a single video stream.

[0055] When the transport stream TS is generated by the multiplexer 104 in this manner, the level designation value of the bit stream and the information on the layer range in the layer ranges having a different maximum layer are inserted in the layer of the container. For example, the element-added HEVC descriptor (see Fig. 6) or the layer/signaling/descriptor (see Fig. 7) is inserted under the program map table (PMT). The transport stream TS generated by the multiplexer 104 is transmitted to the transmission unit 105. The transport stream TS is transmitted to the reception apparatus 200 on a broadcast wave or an internet packet by the transmission unit 105.

"Configuration of the reception apparatus"



[0056] Fig. 11 illustrates a configuration example of the reception apparatus 200. The reception apparatus 200 includes a central processing unit (CPU) 201, a reception unit 202, a demultiplexer 203, and a compressed data buffer (cpb: coded picture buffer) 204. Furthermore, the reception apparatus 200 includes a decoder 205, a decompressed data buffer (dpb: decoded picture buffer) 206, a post-processing unit 207, and a display unit 208. The CPU 201 constitutes a control unit, and controls the operations of the units of the reception apparatus 200.

[0057] The reception unit 202 receives the transport stream TS transmitted from the transmission apparatus 100 on a broadcast wave or on an internet packet. The demultiplexer 203 extracts, from the transport stream TS, the TS packets constituting the video stream by filtering them with a PID filter, and transmits them to the compressed data buffer (cpb: coded picture buffer) 204.

[0058] Furthermore, the demultiplexer 203 extracts section data from the transport stream TS, and transmits the section data to the CPU 201. The section data contains the above described HEVC descriptor (HEVC_descriptor) and layer/signaling/descriptor (Layer_signaling descriptor). From the level designation value of the bit stream and the information on the layer range indicated by these descriptors, the CPU 201 determines the layer range which the decoder 205 can decode, and transmits the information on the temporal IDs (temporal_id) of that layer range to the decoder 205.
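The determination made by the CPU 201 can be sketched as follows: among the signaled level layers, the highest layer range whose level designation value does not exceed the decoder's capability is selected. The function and its arguments are illustrative assumptions, not part of the specification.

```python
def select_decodable_layer_range(decoder_level_idc: int, level_layers):
    """Pick the highest signaled layer range the decoder can handle.

    level_layers: iterable of (layer_level_idc, tid_max, tid_min) tuples,
    as signaled by the layer/signaling/descriptor.
    Returns (temporal_id_min, temporal_id_max), or None if nothing fits.
    """
    candidates = [t for t in level_layers if t[0] <= decoder_level_idc]
    if not candidates:
        return None
    level_idc, tid_max, tid_min = max(candidates)  # highest decodable level
    return tid_min, tid_max
```

For the hierarchical encoding example of Fig. 3(c), a decoder capable of "level5.1" (0x99) would be handed temporal IDs 0 to 3, while a "level5.2" (0x9c) decoder would be handed temporal IDs 0 to 4.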

[0059] Furthermore, the demultiplexer 203 extracts a program clock reference (PCR) from the TS packet containing the PCR, and transmits the PCR to the CPU 201. Furthermore, the demultiplexer 203 extracts time stamps (DTS and PTS) inserted in a PES header for each picture, and transmits the time stamps to the CPU 201.

[0060] The compressed data buffer (cpb) 204 temporarily stores the encoded image data of each picture carried by the TS packets transferred from the demultiplexer 203. The decoder 205 reads and decodes the encoded image data of each picture stored in the compressed data buffer 204 at the decode timing supplied by the decoding time stamp (DTS) of the picture, and transmits the decoded image data to the decompressed data buffer (dpb: decoded picture buffer) 206. At this time, the decoder 205 selectively decodes only the encoded image data of the pictures contained in the decodable layer range on the basis of the information on the temporal ID (temporal_id) of the decodable layer range supplied by the CPU 201.

[0061] Fig. 12 schematically illustrates a configuration example of the demultiplexer 203 and the decoder 205. The demultiplexer 203 includes a video multiplexing buffer 203a and a section data buffer 203b. The TS packet corresponding to a program number equivalent to a broadcasting service channel and extracted from the transport stream TS with the PID filter is temporarily stored in the video multiplexing buffer 203a, and then, is transferred to the compressed data buffer 204. Furthermore, the section data extracted from the transport stream TS with the PID filter is temporarily stored in the section data buffer 203b, and then, is transmitted to the CPU 201.

[0062] Furthermore, the decoder 205 includes a temporal ID analysis unit 205a and a decode processing unit 205b. The temporal ID analysis unit 205a sequentially reads the encoded data of each picture stored in the compressed data buffer 204 at the decode timing, and analyzes the information on the temporal ID (temporal_id) inserted in the NAL unit header. Then, the temporal ID analysis unit 205a transfers the encoded data to the decode processing unit 205b when it determines that the encoded data is within the decodable layer range, and discards the encoded data without transferring it to the decode processing unit 205b when it determines that the encoded data is not within the decodable layer range. Note that, the information on the temporal ID (temporal_id) of the decodable layer range is supplied to the temporal ID analysis unit 205a by the CPU 201.
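
The behavior of the temporal ID analysis unit 205a amounts to a simple filter. In HEVC, the NAL unit header carries nuh_temporal_id_plus1 in the low 3 bits of its second byte; the list-based picture representation below is otherwise an illustrative assumption.

```python
# Sketch of the temporal ID analysis unit: forward pictures whose temporal_id
# falls within the decodable layer range, discard the rest.
def temporal_id_of(nal_header_byte2):
    # temporal_id = nuh_temporal_id_plus1 - 1 (low 3 bits of the second
    # NAL unit header byte, per the HEVC NAL unit header syntax)
    return (nal_header_byte2 & 0x07) - 1

def filter_pictures(pictures, max_decodable_tid):
    """pictures: list of (temporal_id, encoded_data) in decode order.
    Returns only the encoded data to be passed to the decode processing unit."""
    return [data for tid, data in pictures if tid <= max_decodable_tid]

pictures = [(0, "I"), (2, "B1"), (3, "B2"), (1, "P")]
```

With a maximum decodable temporal ID of 2, the picture in layer 3 is discarded, which mirrors the "tid=high" / "tid=low" behavior of Fig. 13.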

[0063] For example, the case of the hierarchical encoding example of Fig. 3(a) is described. When the decoder 205 is a 100p decoder, "0 to 3" is supplied to the temporal ID analysis unit 205a as the information on the temporal ID of the decodable layer range by the CPU 201. Thus, the temporal ID analysis unit 205a transmits the encoded image data of the pictures in the layers 0 to 3 to the decode processing unit 205b.

[0064] On the other hand, when the decoder 205 is a 50p decoder, "0 to 2" is supplied to the temporal ID analysis unit 205a as the information on the temporal ID of the decodable layer range by the CPU 201. Thus, the temporal ID analysis unit 205a transmits the encoded image data of the pictures in the layers 0 to 2 to the decode processing unit 205b, and discards the encoded image data of the pictures in the layer 3 without transmitting it to the decode processing unit 205b. Fig. 13 schematically illustrates such behavior of the temporal ID analysis unit 205a. Note that, "tid=high" indicates the layer 3, and "tid=low" indicates the layers 0 to 2.

[0065] Returning to Fig. 11, the decompressed data buffer (dpb) 206 temporarily stores the image data of each picture decoded by the decoder 205. The post-processing unit 207 performs processing on the image data of each picture, sequentially read from the decompressed data buffer (dpb) 206 at the display timing supplied by the presentation time stamp (PTS), to adjust the frame rate to the display capability.

[0066] For example, when the frame rate of the image data of each picture after decoding is 50 fps and the display capability is 100 fps, the post-processing unit 207 performs interpolation processing on the image data of each picture after decoding so that the time-direction resolution is doubled, and transmits the image data of 100 fps to the display unit 208.
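
The rate-doubling interpolation can be sketched with toy one-dimensional frames; a real post-processing unit would typically use motion-compensated interpolation rather than the plain averaging assumed here.

```python
# Sketch of doubling the time-direction resolution: insert one interpolated
# frame between each pair of decoded frames. Frames are toy 1-D pixel lists;
# plain averaging stands in for real motion-compensated interpolation.
def double_frame_rate(frames):
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        out.append([(x + y) / 2 for x, y in zip(a, b)])  # interpolated frame
    out.append(frames[-1])  # last frame has no successor to interpolate toward
    return out

frames_50p = [[0, 0], [100, 200]]
frames_100p = double_frame_rate(frames_50p)
```

Going the other way (a 50p display fed a 100p stream) would instead thin the pictures, as described for the flow of Fig. 14.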

[0067] The display unit 208 is constituted by, for example, a liquid crystal display (LCD), an organic electro-luminescence (EL) panel, or the like. Note that, the display unit 208 may be an external device connected to the reception apparatus 200.

[0068] Fig. 14 illustrates an example of a processing flow of the decoder 205 and the post-processing unit 207. In step ST1, the processing is started, and then moves to the processing in step ST2. In step ST2, the decoder 205 reads, from the compressed data buffer (cpb) 204, the video stream to be decoded at the decode timing (the encoded image data of the picture to be decoded), and detects the temporal ID (temporal_id).

[0069] Next, in step ST3, the decoder 205 determines whether the temporal ID (temporal_id) detected in step ST2 is within the decodable range. When the temporal ID is not within the decodable range, the decoder 205 does not perform the decode processing, and returns to the processing in step ST2. On the other hand, when the temporal ID is within the decodable range, the decoder 205 moves to the processing in step ST4. In step ST4, the decoder 205 performs the decode processing, and transfers the image data of the picture after decoding to the decompressed data buffer (dpb) 206.

[0070] Next, in step ST5, the post-processing unit 207 reads, from the decompressed data buffer (dpb) 206, the image data of the picture to be displayed at the display timing. Next, in step ST6, the post-processing unit 207 determines whether a display frequency and a read frequency from the decompressed data buffer (dpb) 206 are different. When the frequencies are different, in step ST7, the post-processing unit 207 adjusts the read frequency to the display frequency by performing frame interpolation or thinning of the pictures. After the processing in step ST7, the processing is terminated in step ST8. Furthermore, when the frequencies are not different in step ST6, the processing is immediately terminated in step ST8.
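
The flow of Fig. 14 can be condensed into a short sketch: decode only the in-range pictures (steps ST2 to ST4), then match the read frequency to the display frequency by interpolation or thinning (steps ST5 to ST7). The data structures and rate values are illustrative assumptions.

```python
# Compact sketch of the Fig. 14 processing flow.
def decode_and_display(cpb, max_tid, read_hz, display_hz):
    """cpb: list of (temporal_id, picture) in decode order."""
    dpb = [pic for tid, pic in cpb if tid <= max_tid]  # ST2-ST4: decode in-range only
    if display_hz == read_hz:                          # ST6: frequencies already match
        return dpb
    if display_hz < read_hz:                           # ST7: thin pictures
        step = read_hz // display_hz
        return dpb[::step]
    factor = display_hz // read_hz                     # ST7: interpolate (repetition
    return [pic for pic in dpb for _ in range(factor)] # stands in for interpolation)

cpb = [(0, "p0"), (1, "p1"), (3, "p3"), (2, "p2")]
```

Integer rate ratios (50/100 fps) are assumed; arbitrary ratios would need resampling.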

[0071] The operations of the reception apparatus 200 illustrated in Fig. 11 are briefly described. The transport stream TS transmitted from the transmission apparatus 100 on a broadcast wave or on an internet packet is received by the reception unit 202. The transport stream TS is transmitted to the demultiplexer 203. The TS packet corresponding to a service is extracted, from the transport stream TS, on the basis of the PID information contained in the PMT by the demultiplexer 203. The TS packet is transmitted to the compressed data buffer (cpb) 204, and temporarily stored.

[0072] Furthermore, the section data is extracted from the transport stream TS, and transmitted to the CPU 201 by the demultiplexer 203. The CPU 201 determines, from the layer ranges described by the HEVC descriptor or the layer/signaling/descriptor, the layer range which the decoder 205 can decode, using the level designation value of the bit stream and the information on the layer range, and transmits the information on the temporal ID (temporal_id) of that layer range to the decoder 205.

[0073] The encoded image data of each picture stored in the compressed data buffer 204 is decoded at the decode timing of the picture by the decoder 205, transmitted to the decompressed data buffer (dpb) 206, and temporarily stored. In this case, the decoder 205 selectively decodes only the encoded image data of the pictures within the decodable layer range on the basis of the information on the temporal ID (temporal_id) of the decodable layer range supplied by the CPU 201.

[0074] The image data of each picture stored in the decompressed data buffer (dpb) 206 is sequentially read at the display timing, and transmitted to the post-processing unit 207. Interpolation, subsampling, or thinning is performed on the image data of each picture by the post-processing unit 207 to adjust the frame rate to the display capability. The image data of each picture processed by the post-processing unit 207 is supplied to the display unit 208, and the moving picture is displayed with the image data of each picture.

[0075] As described above, in the transceiver system 10 illustrated in Fig. 1, the element-added HEVC descriptor (see Fig. 6) or the layer/signaling/descriptor (see Fig. 7) is inserted in the layer of the container of the transport stream TS by the transmission apparatus 100. These descriptors have the level designation value of the bit stream and the information on the layer range in each of the layer ranges having a different maximum layer. Thus, for example, it is possible for the receiving side to decode the encoded image data of the picture in the layer range according to the decoding performance on the basis of the information.

<2. Modified example>


[Application to the MPEG-DASH-based stream distribution system]



[0076] Note that, in the above described embodiment, the example in which the container is a transport stream (MPEG-2 TS) has been described. However, the present technology can be similarly applied to a system having a configuration in which a stream is distributed to a reception terminal using a network, such as the internet. In internet distribution, a stream is mainly distributed in a container of MP4 or another format.

[0077] Fig. 15(a) illustrates a configuration example of an MPEG-DASH-based stream distribution system 30. The stream distribution system 30 has a configuration in which a DASH stream file server 31 and a DASH MPD server 32 are connected to N number of receivers 33-1, 33-2, ..., and 33-N through a content delivery network (CDN) 34.

[0078] The DASH stream file server 31 generates, on the basis of media data (video data, audio data, subtitle data, or the like) of predetermined content, a stream segment conforming to DASH (hereinafter, appropriately referred to as a "DASH segment"), and transmits the segment in response to an HTTP request from the receiver. The DASH stream file server 31 may be a streaming-dedicated server, or used as a web server.

[0079] Furthermore, the DASH stream file server 31 transmits, in response to a request for a segment of a predetermined stream transmitted from the receiver 33 (33-1, 33-2, ..., and 33-N) through the CDN 34, the segment of the stream to the receiver, which is the request source, through the CDN 34. In this case, the receiver 33 performs the request by referring to the value of the rate described in a media presentation description (MPD) file and selecting a stream of an optimal rate according to the network environment where the client is placed.

[0080] The DASH MPD server 32 is a server that generates an MPD file used to acquire the DASH segments generated by the DASH stream file server 31. The MPD file is generated on the basis of content metadata from a content management server (not illustrated) and the address (url) of each segment generated by the DASH stream file server 31.

[0081] In the MPD format, the attributes of each stream, such as video and audio, are described using a Representation element. For example, by separating the Representations for a plurality of video data streams having different rates, each rate is described in the MPD file. The receiver 33 can select an optimal stream according to the conditions of the network environment where the receiver 33 is placed by referring to the value of the rate as described above.
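
The receiver-side rate selection can be sketched as follows; the dict-based representation structure loosely mirrors the MPD's bandwidth attribute but is an illustrative assumption, not an MPD parser.

```python
# Sketch of rate selection over MPD Representations: pick the highest-rate
# Representation not exceeding the measured network throughput.
def select_representation(representations, available_bps):
    """representations: list of dicts with 'id' and 'bandwidth' (bits/s)."""
    candidates = [r for r in representations if r["bandwidth"] <= available_bps]
    if not candidates:
        # Fall back to the lowest rate when the network is below every rate.
        return min(representations, key=lambda r: r["bandwidth"])
    return max(candidates, key=lambda r: r["bandwidth"])

# Hypothetical rates for the 50p sub-stream and the full 100p stream.
reps = [{"id": "50p", "bandwidth": 8_000_000},
        {"id": "100p", "bandwidth": 16_000_000}]
```

Re-running this selection per segment request is what enables the seamless stream switching between Representations within one AdaptationSet described below.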

[0082] Fig. 15(b) also illustrates a configuration example of an MPEG-DASH-based stream distribution system 30A. The stream distribution system 30A is constituted by a broadcast transmission system 36 with which the DASH stream file server 31 and the DASH MPD server 32 are connected, and M number of receivers 35-1, 35-2, ..., and 35-M. In the case of the stream distribution system 30A, the broadcast transmission system 36 transmits a stream segment conforming to DASH (DASH segment) generated by the DASH stream file server 31 and the MPD file generated by the DASH MPD server 32 on a broadcast wave.

[0083] Figs. 16(a) to 16(d) illustrate an example of a relation between structures hierarchically arranged in the above described MPD file. As illustrated in Fig. 16(a), a plurality of periods divided at time intervals exists in the media presentation, which is the MPD file as a whole. For example, a first period starts from 0 seconds, the next period starts from 100 seconds, and so on.

[0084] As illustrated in Fig. 16(b), a plurality of representations exists in each period. The representations are grouped into stream attributes by AdaptationSet; for example, a group of representations associated with video data streams having different rates but the same content forms one AdaptationSet.

[0085] As illustrated in Fig. 16(c), the representation contains SegmentInfo. As illustrated in Fig. 16(d), the SegmentInfo contains an initialization segment and a plurality of media segments, in which information on the segments into which the period is further divided is described. The media segment contains information such as an address (url) for actually acquiring segment data of video, audio, or the like.

[0086] Note that, stream switching can be freely performed between the representations grouped by AdaptationSet. Thus, it is possible to select a stream of an optimal rate according to the conditions of the network environment where an IPTV client is placed, and perform seamless moving picture distribution.

[0087] Fig. 17 illustrates a configuration example of a FragmentedMP4 stream in association with an MPD file. The FragmentedMP4 stream contains FragmentedMP4 obtained by packetizing a video stream. Encoded video data of a predetermined number of pictures of the video stream, for example, one GOP, is inserted in the "mdat" part of FragmentedMP4.
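
The top-level box structure of such a FragmentedMP4 stream can be walked with a few lines: each ISO BMFF box begins with a 32-bit big-endian size and a four-character type. The 64-bit size and size-0 cases of the format are omitted in this sketch, and the toy segment below is an assumption for illustration.

```python
import struct

# Sketch of listing top-level boxes (e.g. "moof"/"mdat") of an MP4 fragment.
def list_boxes(data):
    boxes, pos = [], 0
    while pos + 8 <= len(data):
        # Each box header: 32-bit big-endian size, then a 4-character type.
        size, btype = struct.unpack_from(">I4s", data, pos)
        boxes.append((btype.decode("ascii"), size))
        pos += size
    return boxes

# Toy segment: an empty "moof" box followed by an "mdat" box whose 4 payload
# bytes stand in for the encoded video data of the GOP.
segment = (struct.pack(">I4s", 8, b"moof")
           + struct.pack(">I4s", 12, b"mdat") + b"\x00" * 4)
```

A receiver locating the "mdat" payload this way would then hand its contents to the same temporal-ID filtering and decoding path as in the transport stream case.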

[0088] The present technology can be applied to the stream distribution systems 30 and 30A illustrated in Figs. 15(a) and (b). In this case, the video stream has been hierarchically encoded similarly to the above described embodiment. Furthermore, the contents equivalent to the description contents of, for example, the above described layer/signaling/descriptor (see Fig. 7) are inserted in the MPD file, and transmitted to the receiving side. Fig. 18 illustrates an implementation of an MPD file.

[0089] Furthermore, the transceiver system 10 constituted by the transmission apparatus 100 and the reception apparatus 200 has been described in the above described embodiment; however, the transceiver system to which the present technology can be applied is not limited to this. For example, the part of the reception apparatus 200 may be a set-top box and a monitor connected by a digital interface, such as High-Definition Multimedia Interface (HDMI). Note that, "HDMI" is a registered trademark.

[0090] The main feature of the present technology is that, by transmitting a level designation value of a bit stream and information on a layer range for each of a plurality of layer ranges having a different maximum layer when hierarchically encoded video data is transmitted, it is possible for a receiving side to easily decode the encoded image data of the pictures of the layer range according to its decoding performance (see Fig. 10, Fig. 6, and Fig. 7).

REFERENCE SIGNS LIST



[0091] 
10
Transceiver system
30, 30A
MPEG-DASH-based stream distribution system
31
DASH stream file server
32
DASH MPD server
33-1, 33-2, ..., 33-N, 35-1, 35-2, ..., 35-M
Receiver
34
Content delivery network (CDN)
36
Broadcast transmission system
100
Transmission apparatus
101
CPU
102
Encoder
103
Compressed data buffer (cpb)
104
Multiplexer
105
Transmission unit
200
Reception apparatus
201
CPU
202
Reception unit
203
Demultiplexer
203a
Video multiplexing buffer
203b
Section data buffer
204
Compressed data buffer (cpb)
205
Decoder
205a
Temporal ID analysis unit
205b
Decode processing unit
206
Decompressed data buffer (dpb)
207
Post-processing unit
208
Display unit



Claims

1. A transmission apparatus (100) comprising:

an encoder (102) configured to receive moving picture data, classify image data of each picture of the moving picture data into a plurality of layers, hierarchically encode the classified image data of the picture in each of the plurality of layers, and generate a video stream having the encoded image data of the picture in each of the plurality of layers;

a multiplexer (104) configured to generate a transport stream comprising the video stream; and

a transmission unit (105) configured to transmit the transport stream,

wherein the multiplexer is configured to insert a layer signaling descriptor under a program map table of the transport stream, wherein the layer signaling descriptor comprises a field indicating a number of profile layers having scalable function, a field indicating a number of layer ranges, a field indicating a profile of each profile layer, a field indicating a level of each layer range for each profile layer, a field indicating a value of a minimum temporal id in each layer range for each profile layer, and a field indicating a value of a maximum temporal id in each layer range for each profile layer, wherein each layer range has a different maximum layer.


 
2. A transmission method comprising:

an image encoding step for receiving moving picture data, classifying image data of each picture of the moving picture data into a plurality of layers, hierarchically encoding the classified image data of the picture in each of the plurality of layers, and generating a video stream having the encoded image data of the picture in each of the plurality of layers;

a multiplexing step for generating a transport stream; and

a data transmitting step for transmitting the transport stream,

wherein, in the multiplexing step, a layer signaling descriptor is inserted under a program map table of the transport stream, wherein the layer signaling descriptor comprises a field indicating a number of profile layers having scalable function, a field indicating a number of layer ranges, a field indicating a profile of each profile layer, a field indicating a level of each layer range for each profile layer, a field indicating a value of a minimum temporal id in each layer range for each profile layer, and a field indicating a value of a maximum temporal id in each layer range for each profile layer, wherein each layer range has a different maximum layer.


 
3. A reception apparatus (200) comprising:

a reception unit (202) configured to receive a transport stream comprising a video stream and a layer signaling descriptor inserted under a program map table of the transport stream, the video stream having encoded image data of a picture in each of a plurality of layers obtained by classifying image data of each picture of moving picture data into the plurality of layers and hierarchically encoding the image data, the layer signaling descriptor comprising a field indicating a number of profile layers having scalable function, a field indicating a number of layer ranges, a field indicating a profile of each profile layer, a field indicating a level of each layer range for each profile layer, a field indicating a value of a minimum temporal id in each layer range for each profile layer, and a field indicating a value of a maximum temporal id in each layer range for each profile layer, wherein each layer range has a different maximum layer;

a demultiplexer (203) configured to extract from the transport stream the video stream and the layer signaling descriptor;

a processing unit (201) configured to determine the layer range which is decodable by a decoder (205) from the layer ranges indicated by the layer signaling descriptor; and

the decoder (205) configured to selectively decode the extracted encoded image data of the video stream in the decodable layer range on the basis of the received information on the temporal id.


 
4. A reception method comprising:

a data receiving step for receiving a transport stream comprising a video stream and a layer signaling descriptor inserted under a program map table of the transport stream, the video stream having encoded image data of a picture in each of a plurality of layers obtained by classifying image data of each picture of moving picture data into the plurality of layers and encoding the image data, the layer signaling descriptor comprising a field indicating a number of profile layers having scalable function, a field indicating a number of layer ranges, a field indicating a profile of each profile layer, a field indicating a level of each layer range for each profile layer, a field indicating a value of a minimum temporal id in each layer range for each profile layer, and a field indicating a value of a maximum temporal id in each layer range for each profile layer, wherein each layer range has a different maximum layer;

a processing step for determining the layer range which is decodable by a decoder (205) from the respective layer ranges indicated by the layer signaling descriptor; and

a decoding step for selectively decoding, by the decoder (205), the extracted encoded image data of the video stream in the decodable layer range on the basis of the received information on the temporal id.


 


Ansprüche

1. Übertragungsvorrichtung (100), Folgendes umfassend:

einen Codierer (102), der eingerichtet ist, um bewegte Bilddaten zu empfangen, Bilddaten jedes Bilds der bewegten Bilddaten in mehrere Schichten zu klassifizieren, die klassifizierten Bilddaten des Bilds in jeder der mehreren Schichten hierarchisch zu codieren und einen Videostrom zu erzeugen, der die codierten Bilddaten des Bilds in jeder der mehreren Schichten aufweist;

einen Multiplexer (104), der eingerichtet ist, um einen Transportstrom zu erzeugen, der den Videostrom umfasst; und

eine Übertragungseinheit (105), die eingerichtet ist, um den Transportstrom zu übertragen,

wobei der Multiplexer eingerichtet ist, um einen Schichtsignalisierungsdeskriptor unter einer Programmabbildungstabelle des Transportstroms einzufügen, wobei der Schichtsignalisierungsdeskriptor ein Feld, das eine Anzahl Profilschichten angibt, die eine skalierbare Funktion aufweisen, ein Feld, das eine Anzahl Schichtbereiche angibt, ein Feld, das ein Profil jeder Profilschicht angibt, ein Feld, das ein Niveau jedes Schichtbereichs für jede Profilschicht angibt, ein Feld, das einen Wert einer minimalen zeitlichen ID in jedem Schichtbereich für jede Profilschicht angibt, und ein Feld umfasst, das einen Wert einer maximalen zeitlichen ID in jedem Schichtbereich für jede Profilschicht angibt, wobei jeder Schichtbereich eine verschiedene maximale Schicht aufweist.


 
2. Übertragungsverfahren, Folgendes umfassend:

einen Bildcodierschritt zum Empfangen von bewegten Bilddaten, zum Klassifizieren von Bilddaten jedes Bilds der bewegten Bilddaten in mehrere Schichten, hierarchisches Codieren der klassifizierten Bilddaten des Bilds in jeder der mehreren Schichten und Erzeugen eines Videostroms, der die codierten Bilddaten des Bilds in jeder der mehreren Schichten aufweist;

einen Multiplexierschritt zum Erzeugen eines Transportstroms; und

einen Datenübertragungsschritt zum Übertragen des Transportstroms,

wobei bei dem Multiplexierschritt ein Schichtsignalisierungsdeskriptor unter einer Programmabbildungstabelle des Transportstroms eingefügt wird, wobei der Schichtsignalisierungsdeskriptor ein Feld, das eine Anzahl Profilschichten angibt, die eine skalierbare Funktion aufweisen, ein Feld, das eine Anzahl Schichtbereiche angibt, ein Feld, das ein Profil jeder Profilschicht angibt, ein Feld, das ein Niveau jedes Schichtbereichs für jede Profilschicht angibt, ein Feld, das einen Wert einer minimalen zeitlichen ID in jedem Schichtbereich für jede Profilschicht angibt, und ein Feld umfasst, das einen Wert einer maximalen zeitlichen ID in jedem Schichtbereich für jede Profilschicht angibt, wobei jeder Schichtbereich eine verschiedene maximale Schicht aufweist.


 
3. Empfangsvorrichtung (200), Folgendes umfassend:

eine Empfangseinheit (202), die eingerichtet ist, um einen Transportstrom zu empfangen, der einen Videostrom und einen Schichtsignalisierungsdeskriptor umfasst, der unter einer Programmabbildungstabelle des Transportstroms eingefügt ist, wobei der Videostrom codierte Bilddaten eines Bilds in jeder von mehreren Schichten aufweist, die durch Klassifizieren von Bilddaten jedes Bilds der bewegten Bilddaten in die mehreren Schichten erhalten werden, und die Bilddaten hierarchisch zu codieren, wobei der Schichtsignalisierungsdeskriptor ein Feld, das eine Anzahl Profilschichten mit skalierbarer Funktion angibt, ein Feld, das eine Anzahl von Schichtbereichen angibt, ein Feld, das ein Profil jeder Profilschicht angibt, ein Feld, das ein Niveau jedes Schichtbereichs für jede Profilschicht angibt, ein Feld, das einen Wert einer minimalen zeitlichen ID in jedem Schichtbereich für jede Profilschicht angibt, und ein Feld umfasst, das einen Wert einer maximalen zeitlichen ID in jedem Schichtbereich für jede Profilschicht angibt, wobei jeder Schichtbereich eine verschiedene maximale Schicht aufweist;

einen Demultiplexer (203), der eingerichtet ist, um den Videostrom und den Schichtsignalisierungsdeskriptor aus dem Transportstrom zu extrahieren;

eine Verarbeitungseinheit (201), die eingerichtet ist, um den Schichtbereich, der durch einen Decodierer (205) decodierbar ist, aus den Schichtbereichen zu bestimmen, die durch den Schichtsignalisierungsdeskriptor angegeben sind; und

den Decodierer (205), der eingerichtet ist, um die extrahierten codierten Bilddaten des Videostroms in dem decodierbaren Schichtbereich auf der Grundlage der empfangenen Informationen über die zeitliche ID wahlweise zu decodieren.


 
4. Empfangsverfahren, Folgendes umfassend:

einen Datenempfangsschritt zum Empfangen eines Transportstroms, der einen Videostrom und einen Schichtsignalisierungsdeskriptor umfasst, der unter einer Programmabbildungstabelle des Transportstroms eingefügt ist, wobei der Videostrom codierte Bilddaten eines Bilds in jeder von mehreren Schichten aufweist, die durch Klassifizieren von Bilddaten jedes Bilds der bewegten Bilddaten in die mehreren Schichten erhalten werden, und zum Codieren der Bilddaten, wobei der Schichtsignalisierungsdeskriptor ein Feld, das eine Anzahl Profilschichten mit skalierbarer Funktion angibt, ein Feld, das eine Anzahl von Schichtbereichen angibt, ein Feld, das ein Profil jeder Profilschicht angibt, ein Feld, das ein Niveau jedes Schichtbereichs für jede Profilschicht angibt, ein Feld, das einen Wert einer minimalen zeitlichen ID in jedem Schichtbereich für jede Profilschicht angibt, und ein Feld umfasst, das einen Wert einer maximalen zeitlichen ID in jedem Schichtbereich für jede Profilschicht angibt, wobei jeder Schichtbereich eine verschiedene maximale Schicht aufweist;

einen Verarbeitungsschritt zum Bestimmen des Schichtbereichs, der durch einen Decodierer (205) decodierbar ist, aus den jeweiligen Schichtbereichen, die durch den Schichtsignalisierungsdeskriptor angegeben sind; und

einen Decodierschritt zum wahlweisen Decodieren, durch den Decodierer (205), der extrahierten codierten Bilddaten des Videostroms in dem decodierbaren Schichtbereich auf der Grundlage der empfangenen Informationen über die zeitliche ID.


 


Revendications

1. Appareil de transmission (100) comprenant :

un encodeur (102) configuré pour recevoir des données d'images animées, classer les données d'image de chaque image des données d'images animées dans une pluralité de couches, coder hiérarchiquement les données d'image classées de l'image dans chacune de la pluralité de couches, et générer un flux vidéo ayant les données d'image codées de l'image dans chacune de la pluralité de couches ;

un multiplexeur (104) configuré pour générer un flux de transport comprenant le flux vidéo ; et

une unité de transmission (105) configurée pour transmettre le flux de transport,

dans lequel le multiplexeur est configuré pour insérer un descripteur de signalisation de couches sous une table de correspondance des programmes du flux de transport, le descripteur de signalisation de couches comprenant un champ indiquant un nombre de couches de profil ayant une fonction extensible, un champ indiquant un nombre de gammes de couches, un champ indiquant un profil de chaque couche de profil, un champ indiquant un niveau de chaque gamme de couches pour chaque couche de profil, un champ indiquant une valeur d'un id temporel minimal dans chaque gamme de couches pour chaque couche de profil, et un champ indiquant une valeur d'un id temporel maximal dans chaque gamme de couches pour chaque couche de profil, chaque gamme de couches ayant une couche maximale différente.


 
2. Procédé de transmission comprenant :

une étape de codage d'images destinée à recevoir des données d'images animées, classer les données d'image de chaque image des données d'images animées dans une pluralité de couches, coder hiérarchiquement les données d'image classées de l'image dans chacune de la pluralité de couches, et générer un flux vidéo ayant les données d'image codées de l'image dans chacune de la pluralité de couches ;

une étape de multiplexage destinée à générer un flux de transport ; et

une étape de transmission de données destinée à transmettre le flux de transport,

dans lequel, à l'étape de multiplexage, un descripteur de signalisation de couches est inséré sous une table de correspondance des programmes du flux de transport, le descripteur de signalisation de couches comprenant un champ indiquant un nombre de couches de profil ayant une fonction extensible, un champ indiquant un nombre de gammes de couches, un champ indiquant un profil de chaque couche de profil, un champ indiquant un niveau de chaque gamme de couches pour chaque couche de profil, un champ indiquant une valeur d'un id temporel minimal dans chaque gamme de couches pour chaque couche de profil, et un champ indiquant une valeur d'un id temporel maximal dans chaque gamme de couches pour chaque couche de profil, chaque gamme de couches ayant une couche maximale différente.


 
3. Reception apparatus (200) comprising:

a reception unit (202) configured to receive a transport stream including a video stream and a layer signaling descriptor inserted under a program map table of the transport stream, the video stream having encoded image data of a picture in each of a plurality of layers obtained by classifying image data of each picture of moving image data into the plurality of layers and hierarchically encoding the image data, the layer signaling descriptor including a field indicating a number of profile layers having a scalable function, a field indicating a number of layer ranges, a field indicating a profile of each profile layer, a field indicating a level of each layer range for each profile layer, a field indicating a value of a minimum temporal id in each layer range for each profile layer, and a field indicating a value of a maximum temporal id in each layer range for each profile layer, each layer range having a different maximum layer;

a demultiplexer (203) configured to extract, from the transport stream, the video stream and the layer signaling descriptor;

a processing unit (201) configured to determine the layer range that is decodable by a decoder (205) from the layer ranges indicated by the layer signaling descriptor; and

the decoder (205) configured to selectively decode the encoded image data extracted from the video stream in the decodable layer range on the basis of the received information on the temporal id.


 
4. Reception method comprising:

a data reception step of receiving a transport stream including a video stream and a layer signaling descriptor inserted under a program map table of the transport stream, the video stream having encoded image data of a picture in each of a plurality of layers obtained by classifying image data of each picture of moving image data into the plurality of layers and encoding the image data, the layer signaling descriptor including a field indicating a number of profile layers having a scalable function, a field indicating a number of layer ranges, a field indicating a profile of each profile layer, a field indicating a level of each layer range for each profile layer, a field indicating a value of a minimum temporal id in each layer range for each profile layer, and a field indicating a value of a maximum temporal id in each layer range for each profile layer, each layer range having a different maximum layer;

a processing step of determining the layer range that is decodable by a decoder (205) from the respective layer ranges indicated by the layer signaling descriptor; and

a decoding step of selectively decoding, by the decoder (205), the encoded image data extracted from the video stream in the decodable layer range on the basis of the received information on the temporal id.
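A receiver-side sketch of the processing and decoding steps recited in claims 3 and 4. The capability model (a single maximum decodable level value per decoder) and the picture representation are assumptions made for illustration; the claims only require selecting a decodable layer range from the signaled ranges and then decoding pictures whose temporal id falls inside that range.

```python
def select_decodable_range(ranges, decoder_max_level):
    """Pick the highest layer range whose level the decoder supports.

    ranges: list of (level, min_temporal_id, max_temporal_id) tuples,
    ordered from lowest to highest maximum layer, as signaled by the
    layer signaling descriptor.
    """
    decodable = [r for r in ranges if r[0] <= decoder_max_level]
    if not decodable:
        raise ValueError("no layer range decodable at this level")
    return decodable[-1]

def decode_selectively(pictures, layer_range):
    """Keep only pictures whose temporal_id lies inside the range."""
    _, min_tid, max_tid = layer_range
    return [p for p in pictures if min_tid <= p["temporal_id"] <= max_tid]
```

With two signaled ranges, a decoder whose capability sits between the two levels selects the lower range and discards the pictures of the higher temporal sublayers, which is the selective decoding the claims describe.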


 




Drawing

REFERENCES CITED IN THE DESCRIPTION



This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

Non-patent literature cited in the description