[Technical Field]
[0001] The present invention relates to an audio reproducing device which decodes an encoded
audio signal and reproduces the decoded audio signal.
[Background Art]
[0002] There is a conventional audio reproducing device which receives a low-band audio
signal and bandwidth extension information, and generates an extended high-band audio
signal by using a spectral band replication (hereinafter, referred to as SBR) technique.
SBR reconstructs the high-band of the received signal by predicting the high-band
with reference to side information included in the bandwidth extension information.
Here, only a small amount of side information is necessary; thus, SBR enhances
the sound quality of the encoded audio signal at low bit rates.
[0003] Two types of SBR are defined, which are high-quality SBR (hereinafter, referred to
as HQ-SBR) and low-power SBR (hereinafter, referred to as LP-SBR).
[0004] HQ-SBR performs complex arithmetic for overall processing of sub-band analysis, high-band
generation, and sub-band synthesis. Thus, HQ-SBR is suitable for enhancing sound quality,
but requires a large amount of computation.
[0005] LP-SBR performs real number operations instead of the complex arithmetic of HQ-SBR.
LP-SBR is designed to reduce the aliasing distortion generated by the real number operations.
Thus, LP-SBR is capable of significantly reducing the amount of computation, and achieving,
at low bit rates, the sound quality equivalent to that of HQ-SBR. It is known that
LP-SBR requires only approximately half the amount of processing that is required
in HQ-SBR (See Non-Patent Literature (NPL) 1).
[0006] SBR is used in combination with Advanced Audio Coding (AAC), and the combined configuration
is referred to as the High-Efficiency AAC (HE-AAC) profile. In combination with AAC, it
is known that AAC+LP-SBR requires only approximately 70% of the processing amount
that is required in AAC+HQ-SBR (see NPL 1).
[0007] There is also a conventional reproducing device which receives a monaural audio signal
and stereo information, and performs stereo processing on the monaural audio signal
based on the stereo information to generate a stereo audio signal. The stereo processing
is known as Parametric Stereo (hereinafter, referred to as PS), and used in combination
with SBR. PS commonly uses a complex Quadrature Mirror Filter (QMF) with SBR for stereo
processing (see NPL 2).
[0008] It is known that PS is used in combination with AAC and SBR, and the combined configuration
is referred to as HE-AACv2 profile. PS needs to be used in combination with HQ-SBR
which uses the complex QMF (see Non-Patent Literatures 2 and 3). When there is no
PS data, AAC may be used in combination with either HQ-SBR or LP-SBR.
[0009] The HE-AAC profile and HE-AACv2 profile have a concept of levels. The higher
the level, the greater the variety of signal types that can be decoded. Examples of the types
here include maximum sampling frequency or maximum number of channels of an encoded
input audio signal, and maximum sampling frequency of a decoded output audio signal
(see NPL 3).
[Citation List]
[Non Patent Literature]
[0010]
[NPL 1] Mitsutoshi HATORI, "One segment broadcasting textbook", Impress Japan, June 15, 2005
[NPL 2] Toshiyuki NOMURA, "Latest trends and applications of MPEG audio", [online], University
of the Ryukyus Computing and Networking center publication No. 5, April 2008, [searched
on September 17, 2008], Internet <URL: http://www.cc.u-ryukyu-ac.jp/news/kouhoujNo5/2-5.pdf>
[NPL 3] ISO/IEC 14496-3:2005/FDAM2, "Information technology-Coding of audio-visual objects-Part
3: Audio, AMENDMENT 2: Audio Lossless Coding (ALS), new audio profiles and BSAC extensions",
August 2005
[Summary of Invention]
[Technical Problem]
[0011] However, in order for the conventional techniques of decoding encoded audio signals
to comply with HE-AACv2 profile and higher levels, HQ-SBR needs to be used which requires
a large amount of computation. As a result, for example, in the case where an encoded
input audio signal is a multi-channel signal, computation amount (processing amount)
significantly increases. Furthermore, attempts to solve the problem by using the conventional
techniques result in generating abnormal sounds in the decoded audio signal. Details
are described below.
[0012] According to NPL 3, as described above, in the case where the technique of decoding
encoded audio signals complies with HE-AACv2 profile and there is PS data, processing
needs to be performed in combination with HQ-SBR. However, in the case where there
is no PS data, processing may be performed in combination with either HQ-SBR or LP-SBR.
[0013] For example, in view of NPL 3, for preventing the increase in the computation amount,
such a method can be considered which switches SBR depending on the state of the stream
to be decoded. More specifically, when HQ-SBR needs to be used, that is, when there
is PS data, HQ-SBR is used. In other cases, that is, when there is no PS data, LP-SBR
is used for reducing the increase in the computation amount.
[0014] Here, in the case where a stream includes a plurality of pieces of normal PS data but
a piece of PS data is missing at some point in the stream, HQ-SBR is switched to LP-SBR.
Alternatively, LP-SBR is switched to HQ-SBR in the following case: a stream includes
PS data without any missing pieces, but SBR and stereo processing cannot be executed
because the SBR header has not yet been obtained; then the state changes to one in which
the SBR header is obtained.
[0015] As described earlier, HQ-SBR performs complex arithmetic for QMF filtering, and LP-SBR
performs real number operations for QMF filtering. Thus, HQ-SBR and LP-SBR have different
formats of delay information, which does not allow HQ-SBR and LP-SBR to share the
delay information of the QMF filtering. As a result, delay information of the QMF
filtering becomes discontinuous at the time of switching of SBR, thereby generating
abnormal sounds.
[0016] In FIG. 7, (a) shows an output audio signal for a single channel in the case where
SBR is switched at times t0 and t2. It is shown that abnormal sounds are generated
during the periods between t0 and t1 and between t2 and t3 because delay information
cannot be used due to the switching of SBR (in FIG. 7, (b) shows a normal audio signal).
In such a manner, attempts to prevent the increase in computation amount by switching
SBR result in generating abnormal sounds at the time of switching of SBR.
[0017] The present invention has been conceived in order to solve the problem, and has an
object to provide an audio reproducing device and an audio reproducing method which
prevent occurrence of abnormal sounds without significantly increasing the computation
amount even when an encoded input audio signal is a multi-channel signal.
[Solution to Problem]
[0018] In order to solve the problem, an audio reproducing device according to an aspect
of the present invention is an audio reproducing device which reproduces a stream
including a basic codec that is an encoded audio signal. The audio reproducing device
includes: a stream separating unit which separates, on a frame basis, the stream into
the basic codec and bandwidth extension information that is used for extending a band
of the basic codec; a basic codec information analyzing unit which analyzes the basic
codec separated by the stream separating unit, to generate analysis information indicating
a type of the basic codec; a basic codec decoding unit which decodes the basic codec
in accordance with the analysis information generated by the basic codec information
analyzing unit, to generate a decoded basic codec signal; a first bandwidth extension
processing unit which executes first processing which extends, by using the bandwidth
extension information, a frequency band of the decoded basic codec signal generated
by the basic codec decoding unit; a second bandwidth extension processing unit which
executes second processing which extends, by using the bandwidth extension information,
the frequency band of the decoded basic codec signal generated by the basic codec
decoding unit, the second processing being executed with an accuracy higher than an
accuracy of the first processing; and a switching unit which switches between the
first bandwidth extension processing unit and the second bandwidth extension processing
unit based on the analysis information.
[0019] According to the structure, two separate processing operations having different
processing amounts are switched based on the analysis information indicating the type of the basic codec.
As a result, more appropriate processing can be selected. Thus, for example, even
when an input encoded audio signal is a multi-channel signal, the computation amount
(processing amount) does not significantly increase. In addition, processing is switched
based on the analysis information; and thus, processing is not switched while the
type of the basic codec is the same. As a result, it is possible to prevent abnormal
sounds which may occur at the time of switching of processing.
[0020] It may also be that the stream separating unit separates, on the frame basis, the
stream into the basic codec, the bandwidth extension information, and stereo extension
information that is used for performing stereo processing on the basic codec, and
that the audio reproducing device further includes: a stereo extension processing
unit which performs, by using the stereo extension information, stereo processing
on the decoded basic codec signal having the frequency band extended by the second
bandwidth extension processing unit.
[0021] Accordingly, when the basic codec is a monaural audio signal, proper stereo processing
can be performed.
[0022] It may also be that the basic codec information analyzing unit analyzes the basic
codec separated by the stream separating unit, to generate analysis information including
at least one of channel information and sampling frequency information, the channel
information indicating the number of channels of the basic codec, the sampling frequency
information indicating a sampling frequency of the basic codec, and that the switching
unit determines at least one of (i) whether the number of channels indicated by the
channel information is greater than a predetermined first threshold and (ii) whether
the sampling frequency indicated by the sampling frequency information is greater
than a predetermined second threshold, and selects the first bandwidth extension processing
unit when at least one of the following is determined: (i) the number of channels
is greater than the predetermined first threshold and (ii) the sampling frequency
is greater than the predetermined second threshold.
[0023] According to the structure, in the case where the basic codec has a large number
of channels, that is, where the basic codec is multi-channel, a first processing is
selected which requires less processing amount but produces lower accuracy. As a result,
it is possible to prevent processing amount from significantly increasing compared
to a single channel signal. Alternatively, when the sampling frequency of the basic
codec is high, the first processing is also selected which requires less processing
amount but produces lower accuracy. Thus, similarly, it is possible to prevent the
processing amount from significantly increasing compared to the case where the basic
codec with lower sampling frequency is processed.
[0024] It may also be that the audio reproducing device further includes a buffer which
stores stereo extension information of a first frame, wherein the stereo extension
processing unit performs stereo processing on a decoded basic codec signal of a second
frame by using the stereo extension information stored in the buffer, the second frame
being a frame after the first frame and being a frame in which the stereo extension
information is missing.
[0025] Accordingly, stereo extension information used for stereo processing is stored in
a buffer, and the stereo extension information stored in the buffer is used when stereo
extension information cannot be obtained. Thus, even when a stream includes a frame
in which stereo extension data is missing, stereo processing can be properly performed
on the frame.
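The buffering described above can be sketched as follows. This is a minimal illustration, not part of the specification; the class and method names are hypothetical, and the stereo extension information is modeled simply as any non-None value.

```python
class StereoExtensionBuffer:
    """Buffer holding the most recently obtained stereo extension information.

    A minimal sketch of the buffer described above; names are illustrative
    and not taken from the specification.
    """

    def __init__(self):
        # Stereo extension info of the latest frame that carried it (first frame).
        self._last_info = None

    def info_for_frame(self, stereo_info):
        """Return the current frame's stereo extension information; when it is
        missing (None), fall back to the copy buffered from an earlier frame."""
        if stereo_info is not None:
            self._last_info = stereo_info  # frame has PS data: store it
            return stereo_info
        return self._last_info             # second frame, data missing: reuse it
```

With this fallback, a frame whose stereo extension data is missing is still processed with the parameters of the most recent valid frame rather than dropping stereo processing entirely.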
[0026] It also may be that the second bandwidth extension processing unit generates a high-frequency
component signal from the decoded basic codec signal by using the bandwidth extension
information, the stereo extension processing unit performs, by using the stereo extension
information, stereo processing on the decoded basic codec signal and the high-frequency
component signal generated by the second bandwidth extension processing unit, to generate
a decoded basic codec signal and a high-frequency component signal for a first channel
and a decoded basic codec signal and a high-frequency component signal for a second
channel, and the second bandwidth extension processing unit further includes a band
synthesis filter for synthesizing the high-frequency component signal and the decoded
basic codec signal that have been generated, and synthesizes bands of the second channel
by using delay information that is stored in the band synthesis filter of the first
channel, as delay information stored in the band synthesis filter of the second channel,
when the stereo extension information is missing.
[0027] According to the structure, even when delay information only for a single channel
is obtained, the obtained delay information is used as delay information for the other
channel. As a result, bands of the respective signals of two channels can be properly
synthesized.
[0028] It may also be that the basic codec is an audio signal encoded according to Advanced
Audio Coding (AAC) scheme, the bandwidth extension information is Spectral Band Replication
(SBR) information generated according to SBR scheme, the stereo extension information
is Parametric Stereo (PS) information generated according to PS scheme, the first
bandwidth extension processing unit extends a frequency band of the decoded basic
codec signal according to Low Power-SBR (LP-SBR) scheme, and the second bandwidth
extension processing unit extends a frequency band of the decoded basic codec signal
according to High Quality-SBR (HQ-SBR) scheme.
[0029] The present invention may be implemented not only as an audio reproducing device,
but also as an audio reproducing method which includes processing units of the audio
reproducing device as steps. The present invention may be also implemented as a program
causing a computer to execute these steps. Furthermore, the present invention may
be implemented as a computer-readable recording medium, such as a Compact Disc-Read
Only Memory (CD-ROM), which records the program therein, and as information, data,
or signals indicating the program. Such program, information, data, and signals may
be distributed over a communication network such as the Internet.
[0030] In addition, part or all of the elements included in each audio reproducing device
above may be in the form of a single system large scale integration (LSI). The system
LSI is an ultra-multifunctional LSI which is produced by integrating a plurality of
constitutional units on a single chip. More specifically, the system LSI is a computer
system including, for example, a microprocessor, a ROM, and a Random Access Memory
(RAM).
[Advantageous Effects of Invention]
[0031] According to the present invention, it is possible to prevent occurrence of abnormal
sounds without significantly increasing the computation amount even when an encoded
input audio signal is a multi-channel signal.
[Brief Description of Drawings]
[0032]
[FIG. 1]
FIG. 1 is a block diagram illustrating an example of a structure of an audio reproducing
device according to Embodiment 1.
[FIG. 2]
FIG. 2 is a flowchart of an example of operations of the audio reproducing device
according to Embodiment 1.
[FIG. 3]
FIG. 3 is a flowchart of a specific example of operations of a switching unit according
to Embodiment 1.
[FIG. 4]
FIG. 4 is a diagram illustrating an example of an input stream which includes stereo
extension data.
[FIG. 5]
FIG. 5 is a diagram illustrating an example of an input stream which does not include
stereo extension data.
[FIG. 6]
FIG. 6 is a diagram illustrating an example of an input stream including a frame in
which stereo extension data is missing.
[FIG. 7]
FIG. 7 is a diagram illustrating an example of waveforms of output audio signals.
[FIG. 8]
FIG. 8 is a block diagram illustrating an example of a structure of an audio reproducing
device according to Embodiment 2.
[FIG. 9]
FIG. 9 is a flowchart of an example of operations of a stereo extension processing
unit according to Embodiment 2.
[FIG. 10]
FIG. 10 is a diagram illustrating an example of waveforms of stereo audio signals
to be output.
[FIG. 11]
FIG. 11 is an external view of an example of an audio reproducing apparatus incorporating
an audio reproducing device according to the present invention.
[Description of Embodiments]
[0033] Hereinafter, embodiments of an audio reproducing device according to the present
invention will be described with reference to the drawings.
(Embodiment 1)
[0034] An audio reproducing device according to Embodiment 1 is characterized by switching
between two types of bandwidth extension processing having different characteristics,
based on an analysis result of the basic codec, regardless of the validity of stereo
extension information used for performing stereo processing on a monaural audio signal.
The two types of bandwidth extension processing are: processing which requires a larger
processing amount but produces higher accuracy, that is, processing for outputting an
audio signal with excellent sound quality; and processing which requires a smaller
processing amount but produces lower accuracy.
[0035] FIG. 1 is a block diagram illustrating an example of a structure of an audio reproducing
device 100 according to Embodiment 1. The audio reproducing device 100 in FIG. 1 includes:
a stream separating unit 101; a basic codec analyzing unit 102; a basic codec decoding
unit 103; a bandwidth extension data analyzing unit 104; a stereo extension data analyzing
unit 105; a first bandwidth extension processing unit 106; a second bandwidth extension
processing unit 107; a stereo extension processing unit 108; and a switching unit
109.
[0036] The stream separating unit 101 separates an input stream into basic codec, bandwidth
extension data, and stereo extension data. When an input stream includes no stereo
extension data, the stream separating unit 101 separates the stream into basic codec
and bandwidth extension data. The stream separating unit 101 then transmits the separated
basic codec to the basic codec analyzing unit 102, transmits the bandwidth extension
data to the bandwidth extension data analyzing unit 104, and transmits the stereo
extension data to the stereo extension data analyzing unit 105.
[0037] Here, the stream input to the audio reproducing device 100 is, for example, a stream
having HE-AACv2 profile. The basic codec is an encoded audio signal, and is, for example,
an audio signal encoded in accordance with AAC scheme. The bandwidth extension data
is data used for extending bandwidth of the basic codec, and is, for example, SBR
data. The stereo extension data is data used for performing stereo processing on a
monaural audio signal, and is, for example, PS data.
[0038] The basic codec analyzing unit 102 generates basic codec analysis information by
analyzing the basic codec transmitted from the stream separating unit 101. The basic
codec analysis information includes, for example, channel information representing
the number of channels (CH) of the basic codec, and sampling frequency information
representing the sampling frequency (FS) of the basic codec. The basic codec analyzing
unit 102 transmits the generated basic codec analysis information to the basic codec
decoding unit 103. Of the basic codec analysis information, the basic codec analyzing
unit 102 also transmits the channel information and the sampling frequency information
to the switching unit 109.
[0039] The basic codec decoding unit 103 decodes the basic codec by using the basic codec
analysis information transmitted from the basic codec analyzing unit 102, and generates
a decoded basic codec signal. The basic codec decoding unit 103 then transmits the
decoded basic codec signal to the switching unit 109.
[0040] The bandwidth extension data analyzing unit 104 analyzes the bandwidth extension
data transmitted from the stream separating unit 101 to generate bandwidth extension
information, and transmits the generated bandwidth extension information to the switching
unit 109. The bandwidth extension information includes, for example, side information
used for prediction for reconstruction of high band of the decoded basic codec signal
using the SBR technique.
[0041] The stereo extension data analyzing unit 105 analyzes the stereo extension data transmitted
from the stream separating unit 101 to generate stereo extension information, and
transmits the generated stereo extension information to the stereo extension processing
unit 108. The stereo extension information is, for example, information used for performing
stereo extension processing (also referred to as stereo processing) on a monaural
audio signal using the PS technique.
[0042] The first bandwidth extension processing unit 106 extends the frequency band of the
decoded basic codec signal by using the bandwidth extension information transmitted
from the switching unit 109 to output an audio signal. More specifically, the first
bandwidth extension processing unit 106 predicts and generates high frequency components
by using the bandwidth extension information, and synthesizes the bands of the generated
high frequency component signal and the decoded basic codec signal to output an audio
signal.
[0043] Here, the first bandwidth extension processing unit 106 has an advantage over the
second bandwidth extension processing unit 107 in that the first bandwidth extension
processing unit 106 requires a smaller processing amount for processing the same signal.
However, the sound quality of the audio signal output by the first bandwidth extension
processing unit 106 is lower than that of the audio signal output by the second bandwidth
extension processing unit 107. The first bandwidth extension processing unit 106 performs,
for example, bandwidth extension based on the LP-SBR scheme.
[0044] The second bandwidth extension processing unit 107 extends the frequency band of
the decoded basic codec signal by using the bandwidth extension information transmitted
from the switching unit 109 to output an audio signal. More specifically, the second
bandwidth extension processing unit 107 predicts and generates high frequency components
by using the bandwidth extension information, and synthesizes the bands of the generated
high frequency component signal and the decoded basic codec signal to output an audio
signal.
[0045] Here, the sound quality of the audio signal output by the second bandwidth extension
processing unit 107 is higher than that of the audio signal output by the first bandwidth
extension processing unit 106. However, the second bandwidth extension processing
unit 107 requires a processing amount larger than that of the first bandwidth extension
processing unit 106. The second bandwidth extension processing unit 107 performs,
for example, bandwidth extension based on the HQ-SBR scheme.
[0046] Generally, when encoding an audio signal (that is, when generating basic codec),
high frequency components are removed to reduce encoding amount. Thus, the decoded
basic codec signal is an audio signal mainly including low frequency components. The
bandwidth extension performed by the first bandwidth extension processing unit 106
and the second bandwidth extension processing unit 107 is processing in which the
removed high frequency components are predicted and generated by using bandwidth extension
information.
[0047] More specifically, the first bandwidth extension processing unit 106 and the second
bandwidth extension processing unit 107 each includes a band synthesis filter. The
first and second bandwidth extension processing units 106 and 107 reconstruct an output
audio signal that is close to an original sound by synthesizing the bands of the decoded
basic codec signal generated by the basic codec decoding unit 103 and the high frequency
component signal reconstructed based on the decoded basic codec signal by using the
bandwidth extension information.
[0048] The stereo extension processing unit 108 uses stereo extension information transmitted
from the stereo extension data analyzing unit 105 to perform stereo processing on
the monaural audio signal having a frequency band extended by the second bandwidth
extension processing unit 107. More specifically, the stereo extension processing
unit 108 performs, by using the stereo extension information, stereo processing on
the decoded basic codec signal that is a monaural audio signal and the high frequency
component signal generated by the second bandwidth extension processing unit 107,
to generate a decoded basic codec signal and a high frequency component signal for
left (L) channel, and a decoded basic codec signal and a high frequency component
signal for right (R) channel. The stereo extension processing unit 108 performs, for
example, stereo processing based on the PS scheme. Here, the stereo extension processing
unit 108 has to be used in combination with the second bandwidth extension processing
unit 107. In other words, the stereo extension processing unit 108 shares the complex
QMF with the second bandwidth extension processing unit 107.
[0049] The second bandwidth extension processing unit 107 synthesizes the bands of the generated
L-channel signals and the bands of the generated R-channel signals. In the band synthesis
processing of the second bandwidth extension processing unit 107, when an input stream
includes a frame in which stereo extension data is missing, the delay information
of the L channel is copied to the delay information of the R channel. When stereo
extension data is obtained, band synthesis of R channel is performed using the delay
information of the L channel copied for a previous frame, as the delay information
of R channel. The delay information of the L channel is information held over frames
in the band synthesis filter of the band synthesis processing.
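The copy of delay information from the L channel to the R channel described above can be sketched as follows. This is a hedged illustration only: the delay information is modeled here as a plain list of filter-state samples, whereas the real band synthesis filter state is more elaborate, and the function name is hypothetical.

```python
import copy

def delay_for_r_channel(l_delay, r_delay, stereo_data_missing):
    """Choose the delay information to use for R-channel band synthesis.

    A sketch of the copy described above; `l_delay` and `r_delay` stand in
    for the states held over frames in the band synthesis filters.
    """
    if stereo_data_missing:
        # Stereo extension data is missing in this frame: copy the
        # L-channel delay information so that, when stereo extension data
        # is obtained again, R-channel synthesis resumes from a state
        # consistent with the L channel.
        return copy.deepcopy(l_delay)
    # Normal case: the R channel keeps its own delay information.
    return r_delay
```

The deep copy matters: the R channel must not alias the L-channel filter state, since both filters continue to update independently once stereo extension data is available again.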
[0050] The switching unit 109 determines whether the outputs of the basic codec decoding
unit 103 and the bandwidth extension data analyzing unit 104 are connected to terminal
A or terminal B, based on the number of channels CH and the sampling frequency FS
transmitted from the basic codec analyzing unit 102. The determination procedure will
be specifically described later with reference to FIG. 3. The switching unit 109 transmits
the decoded basic codec signal transmitted from the basic codec decoding unit 103
and the bandwidth extension information transmitted from the bandwidth extension data
analyzing unit 104, to the first bandwidth extension processing unit 106 or the second
bandwidth extension processing unit 107 depending on the determination result.
[0051] As described above, the audio reproducing device 100 according to Embodiment 1 includes
the switching unit 109 which selects one of two types of bandwidth extension processing
having different characteristics, based on the analysis result of the basic codec.
The two types are: a first processing which requires a smaller processing amount but
produces lower sound quality; and a second processing which requires a larger processing
amount but produces higher sound quality.
[0052] Next, operations of the audio reproducing device 100 according to Embodiment 1 are
described.
[0053] FIG. 2 is a flowchart of the operations of the audio reproducing device 100 according
to Embodiment 1. The following operations are performed on a frame basis.
[0054] First, the stream separating unit 101 separates an input stream into basic codec,
bandwidth extension data, and stereo extension data (S101). The basic codec is transmitted
to the basic codec analyzing unit 102. The bandwidth extension data is transmitted
to the bandwidth extension data analyzing unit 104. The stereo extension data is transmitted
to the stereo extension data analyzing unit 105.
[0055] Next, each piece of separated data is analyzed (S102). More specifically, the basic codec
analyzing unit 102 analyzes the basic codec to generate basic codec analysis information.
The bandwidth extension data analyzing unit 104 analyzes the bandwidth extension data
to generate bandwidth extension information. The stereo extension data analyzing unit
105 analyzes the stereo extension data to generate stereo extension information. In
the case where stereo extension information cannot be generated, such as the case
where stereo extension data is missing, the stereo extension data analyzing unit 105
transmits, to the stereo extension processing unit 108, information indicating that
there is no stereo extension information.
[0056] Next, the basic codec decoding unit 103 decodes the basic codec in accordance with
the basic codec analysis information (S103). The decoded basic codec signal is transmitted
to the switching unit 109.
[0057] The switching unit 109 determines the connection destination of the transmission
path of the decoded basic codec signal based on the basic codec analysis information,
and switches between the terminal A and the terminal B based on the determination
result (S104). For example, the switching unit 109 refers to the channel information
included in the basic codec analysis information, and selects the terminal A when
the number of channels CH of the basic codec is greater than a predetermined threshold.
Alternatively, the switching unit 109 refers to the sampling frequency information
included in the basic codec analysis information, and selects the terminal A when
the sampling frequency FS of the basic codec is higher than a predetermined
threshold. In other cases, the switching unit 109 selects the terminal B.
[0058] When the terminal A is selected ("A" in S105), the decoded basic codec signal and
the bandwidth extension information are transmitted to the first bandwidth extension
processing unit 106. The first bandwidth extension processing unit 106 extends the
frequency band of the decoded basic codec signal to generate an output audio signal
(S106). The first bandwidth extension processing unit 106 executes processing in accordance
with the LP-SBR scheme or the like, which requires a smaller processing amount but generates
an audio signal with lower sound quality.
[0059] When the terminal B is selected ("B" in S105), the decoded basic codec signal and
the bandwidth extension information are transmitted to the second bandwidth extension
processing unit 107. The second bandwidth extension processing unit 107 extends the
frequency band of the decoded basic codec signal to generate an output audio signal
(S107). The second bandwidth extension processing unit 107 executes processing in
accordance with the HQ-SBR scheme or the like, which requires a larger processing amount
but generates an audio signal with higher sound quality.
[0060] Here, when there is stereo extension information, the stereo extension processing
unit 108 performs stereo processing on the decoded basic codec signal (monaural audio
signal) having the frequency band extended by the second bandwidth extension processing
unit 107.
[0061] Lastly, the audio signal generated by the first bandwidth extension processing unit
106 or the second bandwidth extension processing unit 107 is output (S108).
[0062] In such a manner, it is possible to generate an output audio signal that is close
to an original sound by predicting and reconstructing the high frequency components
of the decoded basic codec signal. Here, processing is selected based on the basic
codec analysis information representing the type of the basic codec. Accordingly,
for example, in the case where processing amount is increased due to multi-channel
or higher sampling frequency, it is possible to prevent an increase in the processing
amount by selecting the first bandwidth extension processing unit 106 which requires
less processing amount.
[0063] Next, reference is made to a specific example of the determination processing of
the connection destination (S104).
[0064] FIG. 3 is a flowchart of a specific example of the operations of the switching unit
109 according to Embodiment 1.
[0065] First, it is determined whether or not the number of channels CH and the sampling
frequency FS of an input basic codec meets a predetermined condition (S201). Here,
it is determined whether the CH is 1 and also the FS is at most equal to 24 kHz.
[0066] In the case where the number of channels CH is 2 or more, or the sampling frequency
FS is higher than 24 kHz (No in S201), the transmission path is connected to the terminal
A, and the input bandwidth extension information and the decoded basic codec signal
are transmitted to the first bandwidth extension processing unit 106 (S202). In the
case where the number of channels CH is 1, and the sampling frequency FS is 24 kHz
or lower (Yes in S201), the transmission path is connected to the terminal B, and
the input bandwidth extension information and the decoded basic codec signal are transmitted
to the second bandwidth extension processing unit 107 (S203).
[0067] In the following, reference is made to the operations of the audio reproducing device
100 according to Embodiment 1 with a specific example of a stream.
[0068] FIG. 4 is a diagram illustrating an example of an input stream which includes stereo
extension data.
[0069] For example, when the audio reproducing device 100 receives a stream as shown in
FIG. 4, the basic codec analyzing unit 102 analyzes the basic codec, and transmits
the number of channels CH (=1) and the sampling frequency information FS (=24 kHz)
to the switching unit 109. Since the condition shown in FIG. 3 is met (Yes in S201),
the switching unit 109 connects the transmission path to the terminal B, and transmits
the decoded basic codec signal and the bandwidth extension information to the second
bandwidth extension processing unit 107 (S203). The second bandwidth extension processing
unit 107 extends the band of the decoded basic codec signal transmitted from the switching
unit 109, by using the bandwidth extension information. Here, at the same time, the
stereo extension processing unit 108 performs stereo extension processing by using
the stereo extension information, and outputs the stereo audio signal.
[0070] As shown in FIG. 4, when the stereo extension data is included, the number of channels
CH is 1. The stereo extension data is information used for performing stereo processing
on a monaural audio signal. When the number of channels is 1, it represents that the
decoded basic codec signal is a monaural audio signal.
[0071] FIG. 5 is a diagram illustrating an example of an input stream which includes no
stereo extension data. When the audio reproducing device 100 receives a stream as
shown in FIG. 5, the basic codec analyzing unit 102 analyzes the basic codec, and
transmits the number of channels CH (= 5.1) and the sampling frequency information
FS (= 24 kHz) to the switching unit 109. Since the condition shown in FIG. 3 is not
met (No in S201), the switching unit 109 connects the transmission path to the terminal
A, and transmits the decoded basic codec signal and the bandwidth extension information
to the first bandwidth extension processing unit 106 (S202). The first bandwidth extension
processing unit 106 extends the band of the decoded basic codec signal transmitted
from the switching unit 109, by using the bandwidth extension information, and outputs
an audio signal.
[0072] Next, reference is made to the case where the audio reproducing device 100 receives
a stream in which stereo extension data is missing in a frame at some point within
the stream, and stereo extension data reappears in the subsequent frames.
[0073] FIG. 6 is a diagram illustrating an example of an input stream including a frame
in which stereo extension data is missing. As shown in FIG. 6, stereo extension data
is included in frames 201 and 203, but stereo extension data is missing in a frame
202. However, there is no change in the basic codec analysis information generated
by the analysis of the basic codec included in the frames 201, 202, and 203. More
specifically, each of the number of channels CH of the basic codec of the frames 201,
202, 203 is 1, and the sampling frequency is 24 kHz.
[0074] Thus, the switching unit 109 determines that each frame meets the condition shown
in FIG. 3 (Yes in S201), and connects the transmission path to the terminal B (S203).
The second bandwidth extension processing unit 107 performs bandwidth extension on
each frame.
[0075] FIG. 7 is a diagram illustrating an example of waveforms of output audio signals.
In FIG. 7, (a) shows a conventional waveform of an output audio signal of the case
where HQ-SBR is switched to LP-SBR at time t0 and LP-SBR is switched to HQ-SBR at
time t2 due to the missing PS data of the frame 202. Conventionally, such switching
of the processing causes abnormal sounds because delay information is not available
during the periods between times t0 and t1 and between times t2 and t3.
[0076] On the other hand, as described above, the audio reproducing device 100 according
to Embodiment 1 determines the first bandwidth extension processing unit 106 or the
second bandwidth extension processing unit 107 for performing processing, independently
of the existence of the stereo extension data within a stream. More specifically,
in the case where the respective frames have the same analysis information of the
basic codec, the same processing unit is used for extending the band of the decoded
basic codec signal of each frame. Thus, discontinuity of the delay information does
not occur, thereby preventing abnormal sounds as shown in FIG. 7(b).
[0077] As described above, in the audio reproducing device 100 according to Embodiment 1,
the second bandwidth extension processing unit 107 performs bandwidth extension on
a stream including stereo extension data (that is, the stream having CH = 1); and
thus, it is possible to perform stereo extension processing without any problems.
Furthermore, the first bandwidth extension processing unit 106 performs bandwidth
extension on a stream that is multi-channel and includes no stereo extension data;
and thus, it is possible to reduce processing amount (computation amount).
[0078] As a result, for example, it is possible to reproduce an audio signal with which
a stream having HE-AACv2 profile properly decoded, without increasing the computation
amount required when reproducing a multi-channel audio signal. Here, it is possible
to reproduce audio signals without abnormal sounds even in the case where no PS data
is input and then PS data is input.
(Embodiment 2)
[0079] An audio reproducing device according to Embodiment 2 includes a buffer for storing
stereo extension information. For example, when there is missing stereo extension
data under the influences of broadcast receiving, stereo processing is performed by
using the stereo extension information stored in the buffer.
[0080] FIG. 8 is a block diagram illustrating an example of a structure of an audio reproducing
device 300 according to Embodiment 2. The audio reproducing device 300 shown in FIG.
8 differs from the audio reproducing device 100 shown in FIG. 1 in that a stereo extension
processing unit 308 is included instead of the stereo extension processing unit 108,
and that a buffer 310 is further included. In the following, only the differences
from Embodiment 1 are described, and the descriptions of the same points are omitted.
[0081] In addition to the processing performed by the stereo extension processing unit 108,
the stereo extension processing unit 308 stores, in the buffer 310, stereo extension
information used for the stereo processing. More specifically, the stereo extension
processing unit 308 performs stereo processing on the decoded basic codec signal having
the frequency band extended by the second bandwidth extension processing unit 107,
by using the stereo extension information transmitted from the stereo extension data
analyzing unit 105. The stereo extension information used here is stored in the buffer
310. For example, each time stereo extension information is obtained, the stereo extension
processing unit 308 updates the stereo extension information stored in the buffer
310.
[0082] Furthermore, in the case where there is no stereo extension information such as the
case where the stereo extension information in a frame is missing, the stereo extension
processing unit 308 reads stereo extension information from the buffer 310, and performs
stereo processing on the decoded basic codec signal (monaural audio signal) of the
frame by using the read stereo extension information.
[0083] The buffer 310 stores the stereo extension information transmitted from the stereo
extension data analyzing unit 105. The buffer 310 not only stores newest stereo extension
information, but also may store a plurality pieces of stereo extension information.
In the case where the buffer 310 stores a plurality pieces of stereo extension information,
the stereo extension processing unit 308, for example, refers to the basic codec extension
information and uses the stereo extension information used for the stereo processing
of a previous decoded basic codec signal similar to the current decoded basic codec
signal.
[0084] As described, the audio reproducing device 300 according to Embodiment 1 includes
the buffer 310 for storing stereo extension information. In the case where there is
no stereo extension information, the audio reproducing device 300 performs stereo
processing on the decoded basic codec signal by using the stereo extension information
stored in the buffer 310.
[0085] Next, of the operations of the audio reproducing device 300 according to Embodiment
2, the operations of the stereo extension processing unit 308 are described. The audio
reproducing device 300 decodes input streams in accordance with the flowcharts shown
in FIG. 2 and FIG. 3. The stereo extension processing unit 308 according to Embodiment
2 performs processing when the second bandwidth extension processing unit 107 performs
bandwidth extension (S107).
[0086] FIG. 9 is a flowchart of the operations of the stereo extension processing unit 308
according to Embodiment 2.
[0087] First, the stereo extension processing unit 308 determines whether or not a stream
includes stereo extension data, that is, whether or not stereo extension information
is transmitted from the stereo extension data analyzing unit 105 (S301). In the case
where the stereo extension information is transmitted (Yes in S301), stereo extension
processing is performed by using the stereo extension information (S302). The stereo
extension processing unit 308 further stores the stereo extension information used
here (S303).
[0088] In the case where the stereo extension information is not transmitted (No in S301),
it is determined whether or not stereo extension processing has been performed for
decoding previous frames (S304). In the case where the stereo extension processing
has been performed (Yes in S304), stereo extension processing is performed by using
the stereo extension information stored when decoding previous frames (S305). In the
case where no stereo extension processing has been performed (No in S304), the processing
ends here.
[0089] In such a manner, the stereo extension processing unit 308 according to Embodiment
2 stores, in the buffer 310, the stereo extension information used for decoding previous
frames. When stereo extension data is missing in a subsequent frame, stereo processing
is performed on the decoded basic codec signal by using the stereo extension information
stored in the buffer 310.
[0090] In the following, reference is made to the operations performed by the audio reproducing
device 300 according to Embodiment 2 when a stream as shown in FIG. 6 is input.
[0091] According to Embodiment 2, in the case where a stream is input which includes a frame
in which stereo extension data is missing at some point within the stream as shown
in FIG. 6, the switching unit 109 connects the transmission path to the terminal B
because all of the frames 201 to 203 has a single channel and sampling frequency FS
that is equal to or higher than 24 kHz. The decoded basic codec signal and bandwidth
extension information are transmitted to the second bandwidth extension processing
unit 107. In such a manner, the bandwidth extension processing on all of the frames
201 to 203 is performed by the second bandwidth extension processing unit 107, which
allows continuity of delay information.
[0092] FIG. 10 is a diagram illustrating an example of waveforms of output stereo audio
signals. Conventionally, stereo extension processing is not performed during a period
of the frame in which stereo extension data is missing(period between t4 and t5).
As shown in (a) in FIG. 10, R-channel audio signal is not output, which gives a listener
a feeling of strangeness. In order to overcome such a strange feeling, and properly
output R-channel audio signal as shown in (b) in FIG. 10, the stereo extension processing
unit 308 performs the following operations.
[0093] Since the frame 201 includes stereo extension data (Yes in S301), the stereo extension
processing unit 308 performs stereo extension processing (S302), and stores the stereo
extension information used here (S303).
[0094] Next, the frame 202 in which stereo extension data is missing is input. Since stereo
extension data is missing in the frame 202 (No in S301) and the stereo extension processing
is performed at the time of decoding of the frame 201 (Yes in S304), the stereo extension
processing unit 308 performs stereo extension processing on the frame 202 by using
the stereo extension information of the frame 201.
[0095] Subsequently, the next frame 203 with stereo extension data is input. Since the frame
203 includes stereo extension data (Yes S301), the stereo extension processing unit
308 performs stereo extension processing on the frame 203 by using the stereo extension
information extracted from the frame 203 (S302).
[0096] In such a manner, the audio reproducing device 300 according to Embodiment 2 is capable
of keeping continuity of an output sound, and also performing stereo extension even
on a frame in which stereo extension data is missing.
[0097] As a result, for example, it is possible to reproduce an audio signal with which
a stream having HE-AACv2 profile properly decoded, without increasing the computation
amount required when reproducing a multi-channel audio signal. Here, it is possible
to reproduce audio signals without abnormal sounds even in the case where no PS data
is input and then PS data is input. Alternatively, in the case where PS data is input
in a preceding frame and then no PS data is input in a subsequent frame, a stereo
audio signal can be reproduced by using the previous PS data.
[0098] FIG. 11 is an external view of an example of an audio reproducing apparatus incorporating
an audio reproducing device according to the present invention. FIG. 11 illustrates
a recording medium 401, an audio reproducing apparatus 402, and earphones 403.
[0099] The recording medium 401 is a recording medium which is capable of recording compressed
audio streams. FIG. 11 shows the recording medium 401 as a medium, such as a secure
digital (SD) card, removable from an apparatus; however, the recording medium 401
may also be implemented as an optical disk, a hard disk drive (HDD) incorporated in
the apparatus, or the like.
[0100] The audio reproducing apparatus 402 is an apparatus which reproduces compressed audio
streams, and includes at least one of the audio reproducing devices 100 and 300 according
to Embodiments 1 and 2.
[0101] The earphones 403 are loud speaker apparatus which output audio signals output from
the audio reproducing apparatus 402 to outside. FIG. 11 illustrates earphones which
are inserted into the ears of a user; however, the earphones may be headphones which
are put on the head of the user, or desktop loudspeakers.
[0102] According to such structure of the audio reproducing apparatus 402, it is possible
to obtain an output audio signal without causing abnormal sounds even when a stream
includes a frame in which stereo extension data is missing.
[0103] The audio reproducing device and the audio reproducing method according to the present
invention have been described based on the embodiments; however, the present invention
is not limited to these embodiments. Those skilled in the art will readily appreciate
that many modifications are possible in the exemplary embodiments without materially
departing from the novel teachings and advantages of this invention. Accordingly,
all such modifications are intended to be included within the scope of this invention.
[0104] For example, the switching unit 109 makes the determination based on the determination
condition that the number of channels is 1 and the sampling frequency is 24 kHz or
lower; however, the determination condition is not limitative. For example, it may
be that the switching unit 109 determines to use the second bandwidth extension processing
unit 107 (connect to the terminal B) only when the number of channels is two or less.
In this case, when a stream having the basic codec with 1 or 2 channels is input,
bandwidth extension is performed by the second bandwidth extension processing unit
107 which generates higher sound quality but requires larger processing amount.
[0105] In the case where a stream of 3 or more channels is input, it is possible to perform
bandwidth extension by using the first bandwidth extension processing unit 106 which
requires less processing amount but generates lower sound quality, to reduce the overall
processing amount. In such a manner, it is possible to provide high-quality sound
even for multi-channel processing as long as the processing capability and memory
resources permit.
[0106] The present invention may be implemented not only as an audio reproducing device
and an audio reproducing method as described above, but also as a program causing
a computer to execute an audio reproducing method according to the embodiments. The
present invention may also be implemented as a recording medium, such as a computer
readable CD-ROM, which stores the program. Furthermore, the present invention may
be implemented as information, data, or a signal indicating the program. Such program,
information, data, and signal may be distributed over a communication network such
as the Internet.
[0107] Furthermore, portion or all of the constituent elements of the audio reproducing
device according to the present invention may be structured as a single system LSI.
The system LSI is a super multi-functional LSI manufactured by integrating a plurality
of structural units onto a single chip. Specifically, it is a computer system including
a microprocessor, a ROM, a RAM, and the like.
[Industrial Applicability]
[0108] The present invention prevents a significant increase in processing amount, and also
prevents occurrence of abnormal sounds. The present invention may be used for, for
example, an audio reproducing device. For example, the present invention may be used
for an audio reproducing apparatus, such as a portable music player, which has limited
processor capability and limited memory resources.
[Reference Signs List]
[0109]
100, 300 |
Audio reproducing device |
101 |
Stream separating unit |
102 |
Basic codec analyzing unit |
103 |
Basic codec decoding unit |
104 |
Bandwidth extension data analyzing unit |
105 |
stereo extension data analyzing unit |
106 |
First bandwidth extension processing unit |
107 |
Second bandwidth extension processing unit |
108, 308 |
Stereo extension processing unit |
109 |
Switching unit |
201, 202, 203 |
Frame |
310 |
Buffer |
401 |
Recording medium |
402 |
Audio reproducing apparatus |
403 |
Earphones |
1. An audio reproducing device which reproduces a stream including a basic codec that
is an encoded audio signal, said audio reproducing device comprising:
a stream separating unit configured to separate, on a frame basis, the stream into
the basic codec and bandwidth extension information that is used for extending a band
of the basic codec;
a basic codec information analyzing unit configured to analyze the basic codec separated
by said stream separating unit, to generate analysis information indicating a type
of the basic codec;
a basic codec decoding unit configured to decode the basic codec in accordance with
the analysis information generated by said basic codec information analyzing unit,
to generate a decoded basic codec signal;
a first bandwidth extension processing unit configured to execute first processing
which extends, by using the bandwidth extension information, a frequency band of the
decoded basic codec signal generated by said basic codec decoding unit;
a second bandwidth extension processing unit configured to execute second processing
which extends, by using the bandwidth extension information, the frequency band of
the decoded basic codec signal generated by said basic codec decoding unit, the second
processing being executed with an accuracy higher than an accuracy of the first processing;
and
a switching unit configured to switch between said first bandwidth extension processing
unit and said second bandwidth extension processing unit based on the analysis information,
wherein said first bandwidth extension processing unit is configured to execute the
first processing by performing real number operations, and
said second bandwidth extension processing unit is configured to execute the second
processing by performing complex arithmetic.
2. The audio reproducing device according to Claim 1,
wherein said stream separating unit is configured to separate, on the frame basis,
the stream into the basic codec, the bandwidth extension information, and stereo extension
information that is used for performing stereo processing on the basic codec,
said audio reproducing device further comprises:
a stereo extension processing unit configured to perform, by using the stereo extension
information, stereo processing on the decoded basic codec signal having the frequency
band extended by said second bandwidth extension processing unit.
3. The audio reproducing device according to Claim 2,
wherein said basic codec information analyzing unit is configured to analyze the basic
codec separated by said stream separating unit, to generate analysis information including
at least one of channel information and sampling frequency information, the channel
information indicating the number of channels of the basic codec, the sampling frequency
information indicating a sampling frequency of the basic codec, and
said switching unit is configured to determine at least one of (i) whether the number
of channels indicated by the channel information is greater than a predetermined first
threshold and (ii) whether the sampling frequency indicated by the sampling frequency
information is greater than a predetermined second threshold, and select said first
bandwidth extension processing unit when at least one of the following is determined:
(i) the number of channels is greater than the predetermined first threshold and (ii)
the sampling frequency is greater than the predetermined second threshold.
4. The audio reproducing device according to Claim 2 or Claim 3, further comprising
a buffer which stores stereo extension information of a first frame,
wherein said stereo extension processing unit is configured to perform stereo processing
on a decoded basic codec signal of a second frame by using the stereo extension information
stored in said buffer, the second frame being a frame after the first frame and being
a frame in which the stereo extension information is missing.
5. The audio reproducing device according to Claim 2 or Claim 3, wherein said second
bandwidth extension processing unit is configured to generate a high-frequency component
signal from the decoded basic codec signal by using the bandwidth extension information,
said stereo extension processing unit is configured to perform, by using the stereo
extension information, stereo processing on the decoded basic codec signal and the
high-frequency component signal generated by said second bandwidth extension processing
unit, to generate a decoded basic codec signal and a high-frequency component signal
for a first channel and a decoded basic codec signal and a high-frequency component
signal for a second channel, and
said second bandwidth extension processing unit further includes a band synthesis
filter for synthesizing the high-frequency component signal and the decoded basic
codec signal that have been generated, and is configured to synthesize bands of the
second channel by using delay information that is stored in said band synthesis filter
of the first channel, as delay information stored in said band synthesis filter of
the second channel, when the stereo extension information is missing.
6. The audio reproducing device according to one of Claims 2 to 5, wherein the basic
codec is an audio signal encoded according to Advanced Audio Coding (AAC) scheme,
the bandwidth extension information is Spectral Band Replication (SBR) information
generated according to SBR scheme,
the stereo extension information is Parametric Stereo (PS) information generated according
to PS scheme,
said first bandwidth extension processing unit is configured to extend a frequency
band of the decoded basic codec signal according to Low Power-SBR (LP-SBR) scheme,
and
said second bandwidth extension processing unit is configured to extend a frequency
band of the decoded basic codec signal according to High Quality-SBR (HQ-SBR) scheme.
7. An audio reproducing apparatus comprising the audio reproducing device according to
one of Claims 1 to 6.
8. An audio reproducing method which reproducing a stream including a basic codec that
is an encoded audio signal, said audio reproducing method comprising:
separating, on a frame basis, the stream into the basic codec and bandwidth extension
information that is used for extending a band of the basic codec;
analyzing the basic codec separated in said separating, to generate analysis information
indicating a type of the basic codec;
decoding the basic codec in accordance with the analysis information generated in
said analyzing, to generate a decoded basic codec signal;
switching between first processing and second processing based on the analysis information,
the second processing being executed with an accuracy higher than an accuracy of the
first processing;
executing, when the first processing is selected in said switching, the first processing
which extends, by using the bandwidth extension information, a frequency band of the
decoded basic codec signal generated in said decoding; and
executing, when the second processing is selected in said switching, the second processing
which extends, by using the bandwidth extension information, the frequency band of
the decoded basic codec signal generated in said decoding,
wherein real number operations are performed in the first processing, and
complex arithmetic is performed in the second processing.
9. An integrated circuit which reproduces a stream including a basic codec that is an
encoded audio signal, said integrated circuit comprising:
a stream separating unit configured to separate, on a frame basis, the stream into
the basic codec and bandwidth extension information that is used for extending a band
of the basic codec;
a basic codec information analyzing unit configured to analyze the basic codec separated
by said stream separating unit, to generate analysis information indicating a type
of the basic codec;
a basic codec decoding unit configured to decode the basic codec in accordance with
the analysis information generated by said basic codec information analyzing unit,
to generate a decoded basic codec signal;
a first bandwidth extension processing unit configured to execute first processing
which extends, by using the bandwidth extension information, a frequency band of the
decoded basic codec signal generated by said basic codec decoding unit;
a second bandwidth extension processing unit configured to execute second processing
which extends, by using the bandwidth extension information, the frequency band of
the decoded basic codec signal generated by said basic codec decoding unit, the second
processing being executed with an accuracy higher than an accuracy of the first processing;
and
a switching unit configured to switch between said first bandwidth extension processing
unit and said second bandwidth extension processing unit based on the analysis information,
wherein said first bandwidth extension processing unit is configured to execute the
first processing by performing real number operations, and
said second bandwidth extension processing unit is configured to execute the second
processing by performing complex arithmetic.
10. A program causing a computer to execute an audio reproducing method for reproducing
a stream including a basic codec that is an encoded audio signal, said program comprising:
separating, on a frame basis, the stream into the basic codec and bandwidth extension
information that is used for extending a band of the basic codec;
analyzing the basic codec separated in said separating, to generate analysis information
indicating a type of the basic codec;
decoding the basic codec in accordance with the analysis information generated in
said analyzing, to generate a decoded basic codec signal;
switching between first processing and second processing based on the analysis information,
the second processing being executed with an accuracy higher than an accuracy of the
first processing;
executing, when the first processing is selected in said switching, the first processing
which extends, by using the bandwidth extension information, a frequency band of the
decoded basic codec signal generated in said decoding; and
executing, when the second processing is selected in said switching, the second processing
which extends, by using the bandwidth extension information, the frequency band of
the decoded basic codec signal generated in said decoding,
wherein real number operations are performed in the first processing, and
complex arithmetic is performed in the second processing.