A METHOD AND AN APPARATUS FOR ADJUSTING QUANTIZATION QUALITY IN ENCODER AND DECODER

(11)	EP 2 104 095 A1

(12)	EUROPEAN PATENT APPLICATION
	published in accordance with Art. 153(4) EPC

(43)	Date of publication:
	23.09.2009 Bulletin 2009/39

(21)	Application number: 07855801.2

(22)	Date of filing: 26.12.2007

(51)	International Patent Classification (IPC):
	G10L 19/00 (2006.01)
	G10L 19/02 (2006.01)

(86)	International application number:
	PCT/CN2007/003799

(87)	International publication number:
	WO 2008/064577 (05.06.2008 Gazette 2008/23)

(84)	Designated Contracting States:
	AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE SI SK TR

(30)	Priority: 01.12.2006 CN 200610164330

(71)	Applicant: Huawei Technologies Co., Ltd.
	Shenzhen, Guangdong Province 518129 (CN)

(72)	Inventors:

LI, Wei
Shenzhen, Guangdong Province 518129 (CN)
XU, Lijing
Shenzhen, Guangdong Province 518129 (CN)
ZHANG, Qing
Shenzhen, Guangdong Province 518129 (CN)
XU, Jianfeng
Shenzhen, Guangdong Province 518129 (CN)
SANG, Shenghu
Shenzhen, Guangdong Province 518129 (CN)
DU, Zhengzhong
Shenzhen, Guangdong Province 518129 (CN)
ZOU, Yao
Shanghai Jiaotong University 200025 (CN)
LIU, Peilin
Shanghai Jiaotong University 200025 (CN)

(74)	Representative: Pfenning, Meinig & Partner GbR
	Patent- und Rechtsanwälte Theresienhöhe 13 80339 München (DE)

(54)	A METHOD AND AN APPARATUS FOR ADJUSTING QUANTIZATION QUALITY IN ENCODER AND DECODER

(57) A method for adjusting quality of quantization for encoding is disclosed according to embodiments of the present invention. The method includes adjusting a first group of sample values for encoding with at least two scale factors; quantizing the adjusted first group of sample values to obtain quantized sample values; eliminating the impact of the scale factors from the quantized sample values to obtain a second group of sample values; obtaining a global gain with the first group of sample values and the second group of sample values; outputting the quantized sample values, information of the at least two scale factors and the obtained global gain as an encoding stream. Embodiments of the present invention further disclose a method for adjusting quality of quantization for decoding, and apparatuses for adjusting quality of quantization for encoding and decoding. The methods and apparatuses disclosed by the present invention may greatly reduce the implementation complexity and may well adjust the quality of quantization at critical bands so that a better encoding performance may be achieved.

Description

FIELD OF THE INVENTION

[0001] The present invention relates to encoding technology, and more specifically, to a method and apparatus for adjusting quality of quantization for encoding/decoding.

BACKGROUND

[0002] With the development of communication technology and the expansion of multimedia services, encoding such as digital audio coding or digital video encoding requires not only a higher encoding efficiency and real-time operation, but also a further extended encoding bandwidth. For digital audio encoding, techniques meeting the requirements of low bit rate and high audio encoding quality mainly include AAC+, EAAC+ and AMR-WB+. AAC+ and EAAC+ evolved from an audio encoder with a high bit rate, while AMR-WB+ is a hybrid encoding method obtained by extending audio encoding with a low bit rate.

[0003] In normal audio encoding, to better exploit the characteristics of the human auditory system, a time-frequency transform is typically first performed on the samples, and rounding, weighting and quantization are then performed on the spectrum coefficients based on auditory characteristics. The quantized spectrum coefficients are transmitted by way of entropy encoding. A major distortion in the encoding comes from the quantization of various parameters. Therefore, to accommodate different requirements, the encoder needs to adjust the quality of quantization based on a specified encoding rate. In an encoding scheme with a high bit rate above 24 kbps, a good encoder may achieve a transparent sound, i.e., the human ear may not perceive the noise introduced in the encoding and quantizing process. In an encoding scheme with a low bit rate, since the number of bits is insufficient, perfect sound transparency may not be achieved. Therefore, one may only pursue a minimum subjective distortion.

[0004] A common scheme for adjusting quality of quantization is to use a scale factor or a gain. First, a coefficient to be coded is divided by a scale factor or multiplied with a gain. Then, the scaled coefficient is quantized. The optimal scale factor may both satisfy the requirement of bit rate and minimize the quantization error. Therefore, when the bit rate is high, a smaller scale factor is chosen such that the quantized coefficient may have a larger dynamic range and a relatively refined quantization. When the bit rate is low, a bigger scale factor is chosen such that the quantized coefficient may have a smaller dynamic range and a relatively coarse quantization.
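
The effect described above can be illustrated with a minimal sketch. The uniform rounding quantizer, the function names and the coefficient values are assumptions chosen for illustration only and do not come from any of the cited codecs.

```python
def quantize_with_scale(coeffs, scale):
    """Divide each coefficient by the scale factor, then round to the nearest integer."""
    return [round(c / scale) for c in coeffs]


def dequantize(indices, scale):
    """Undo the scaling so the result is comparable with the input."""
    return [q * scale for q in indices]


def mean_squared_error(original, restored):
    return sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)


# Illustrative coefficients only; they are not taken from the application.
coeffs = [0.9, -3.2, 7.6, 12.1, -0.4]

for scale in (0.5, 4.0):
    indices = quantize_with_scale(coeffs, scale)
    error = mean_squared_error(coeffs, dequantize(indices, scale))
    print(f"scale={scale}: indices={indices}, "
          f"largest index={max(abs(i) for i in indices)}, mse={error:.4f}")
```

With the smaller scale factor the quantization indices span a larger range and the reconstruction error is lower; with the larger scale factor the indices span a smaller range and the error is higher.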

[0005] Figure 1 illustrates a block diagram of an MPEG1-LAYER3 audio encoding algorithm. In the MPEG1-LAYER3 audio encoding algorithm, before the time-frequency transform, the whole encoding band is divided into 32 sub-bands, each of which is assigned a scale factor. The whole band is assigned a global scale factor. Before quantization, the global scale factor is adjusted using a closed-loop search algorithm such that the number of quantization bits is controlled within the range allowed by the current bit rate. At the same time, scale factors for sub-bands are adjusted such that the quantization noise is kept under the masking threshold of the human auditory system. That is, the human ear may not perceive the presence of the quantization noise. Finally, the quantized coefficient stream is transmitted by way of Huffman encoding.

[0006] The method of encoding with multiple sub-band scale factors in the MPEG1-LAYER3 encoding algorithm has the following defects:

(1) the division into sub-bands requires 32 sub-band analysis filters, resulting in a highly complicated computation.
(2) the scale factor of each sub-band needs to be transmitted by way of Huffman encoding, which occupies a large number of bits and is not appropriate for low bit rate encoding.

[0007] Figure 2 illustrates a partial flowchart of Transform Coded Excitation (TCX) of the AMR-WB+ audio encoding algorithm. In AMR-WB+ audio encoding, a global scale factor is used. Due to the limitation of using one scale factor, a specific frequency band cannot be finely tuned. Moreover, considering the encoding requirement of a low bit rate, the frequency domain samples in the spectrum which have a low energy may be lost during vector quantization. However, since the human auditory system has different sensitivities over different frequency bands, it is desired that the frequency domain samples with low energy at critical frequency bands can still be quantized during encoding. Therefore, in AMR-WB+ audio encoding, spectrum pre-rectification and spectrum inverse rectification are employed. For TCX of the AMR-WB+ audio encoding algorithm, critical frequency bands in the whole spectrum are first pre-rectified to raise the energy at these specific bands, and then a global scale factor is used for the whole frequency band.

[0008] Since the human auditory system has a high resolution for low frequency bands, the above mentioned critical frequency bands typically refer to low frequency bands. For the spectrum pre-rectification scheme in AMR-WB+ audio encoding, every 8 frequency domain samples in the first quarter of the spectrum are treated as a block. The energy E_m of each block is calculated, where m denotes a block index number. Then, a maximum block energy E_max is determined and R_m = (E_max / E_m)^(1/4) is computed for each block. A gain factor G_m for each block is obtained based on R_m such that the gain factor G_m possesses a monotonically decreasing property. Finally, the frequency domain sample values in each block are multiplied with the gain factor G_m associated with the block. In AMR-WB+ audio encoding, the gain factor obtained from the spectrum pre-rectification is not transmitted in encoding streams. Instead, according to the spectrum inverse rectification method, a gain factor G_m of each block is calculated based on the sample values in frequency domain, and the original sample values in frequency domain are then restored by dividing the sample values in frequency domain of each block by the gain factor of the corresponding block.
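
The block-energy computation described above may be sketched as follows. The text does not state how G_m is derived from R_m beyond the monotonic property, so the running minimum used here is an assumption, as are the function names and the small floor that guards against division by zero. This is not the AMR-WB+ reference implementation.

```python
def block_energies(spectrum, block_size=8):
    """Energy E_m of each block of `block_size` samples in the first quarter of the spectrum."""
    first_quarter = spectrum[: len(spectrum) // 4]
    return [
        sum(v * v for v in first_quarter[start:start + block_size])
        for start in range(0, len(first_quarter), block_size)
    ]


def energy_ratios(energies, floor=1e-12):
    """R_m = (E_max / E_m) ** (1/4); `floor` is a hypothetical guard against division by zero."""
    e_max = max(energies)
    return [(e_max / max(e, floor)) ** 0.25 for e in energies]


def monotone_gains(ratios):
    """ASSUMPTION: obtain non-increasing gains G_m by taking a running minimum over R_m."""
    gains, current = [], float("inf")
    for r in ratios:
        current = min(current, r)
        gains.append(current)
    return gains


def pre_rectify(spectrum, block_size=8):
    """Multiply each block of the first quarter by its gain; leave the other samples untouched."""
    quarter = len(spectrum) // 4
    gains = monotone_gains(energy_ratios(block_energies(spectrum, block_size)))
    out = list(spectrum)
    for m, g in enumerate(gains):
        for k in range(m * block_size, min((m + 1) * block_size, quarter)):
            out[k] *= g
    return out
```

For a 64-sample spectrum, `pre_rectify` touches only the first 16 samples (two blocks of 8) and raises the weaker block toward the level of the strongest one.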

[0009] It is discovered that the global scale factor algorithm for the TCX portion of the existing AMR-WB+ audio encoding algorithm has the following defects.

(1) Since only one scale factor is used for the whole frequency band, the quality of quantization may be adjusted only on a whole band basis. As a result, some critical frequency bands cannot be emphasized.
(2) Although a spectrum pre-rectification technique and a spectrum inverse rectification technique are used for improving the quality of quantization at low frequencies, the quality of quantization at other frequencies is thereby sacrificed.
(3) Spectrum pre-rectification and inverse rectification techniques can only be applied to narrower frequency bands; otherwise, the global scale factor will increase significantly and the effect of the quantization as a whole may be reduced.
(4) Since the gain factors for pre-rectification at the encoding stage are not recorded in the encoding streams, the accumulation of errors introduced by the quantization may be reflected in the reduction factors during the inverse rectification process.

SUMMARY

[0010] According to one embodiment of the present invention, a method for adjusting quality of quantization for encoding is provided to reduce the implementation complexity.

[0011] According to one embodiment of the present invention, a method for adjusting quality of quantization for decoding is provided to guarantee the quality of quantization.

[0012] According to one embodiment of the present invention, an apparatus for adjusting quality of quantization for encoding is provided to reduce the implementation complexity.

[0013] According to one embodiment of the present invention, an apparatus for adjusting quality of quantization for decoding is provided to guarantee the quality of quantization.

[0014] A method for adjusting quality of quantization for encoding is provided according to one embodiment of the present invention. The method includes: adjusting a first group of sample values for encoding with at least two scale factors; quantizing the adjusted first group of sample values to obtain the quantized sample values; eliminating the impact of the scale factors from the quantized sample values to obtain a second group of sample values; obtaining a global gain with the first group of sample values and the second group of sample values; and outputting the quantized sample values, information of the two or more scale factors and the obtained global gain as an encoding stream.

[0015] A method for adjusting quality of quantization for decoding is provided according to one embodiment of the present invention where an encoding stream output by an encoder is decoded as a decoding stream. The method includes: acquiring quantized sample values, information of two or more scale factors and a global gain from the decoding stream; utilizing the information of the two or more scale factors to eliminate the impact of the scale factors from the quantized sample values to obtain sample values; and multiplying the sample values with the global gain.

[0016] An apparatus for adjusting quality of quantization for encoding is provided according to one embodiment of the present invention. The apparatus includes: a multiple scale factors control unit, a quantization unit, a gain balancing unit, and a global gain computing unit. The multiple scale factors control unit is configured to receive a first group of sample values, configure two or more scale factors for the first group of sample values, adjust the first group of sample values with the scale factors, and output the first group of adjusted sample values to the quantization unit. The quantization unit is configured to quantize the received first group of sample values, obtain quantized sample values and output the quantized sample values to the gain balancing unit. The gain balancing unit is configured to receive the quantized sample values, eliminate the impact of the scale factors from the quantized sample values, obtain a second group of sample values, and output the second group of sample values to the global gain computing unit. The global gain computing unit is configured to receive the first group of sample values and the second group of sample values, and obtain the global gain by using the first group of sample values and the second group of sample values.

[0017] An apparatus for adjusting quality of quantization for decoding is provided according to one embodiment of the present invention. The apparatus includes: a gain balancing unit, and a global gain balancing unit. The gain balancing unit is configured to receive the quantized sample values and reduction factors, utilize the received reduction factors to eliminate the impact of the scale factors from the quantized sample values and obtain sample values, and output the sample values to the global gain balancing unit. The global gain balancing unit is configured to receive a global gain and the sample values, multiply the sample values with the global gain and output the multiplications.

[0018] Unlike the prior art scheme where filters are utilized, methods and apparatuses for adjusting quality of quantization according to various embodiments of the present invention directly divide the sample values into a plurality of portions and configure a scale factor for each portion. Therefore, the present invention may greatly reduce the implementation complexity. Moreover, compared with the prior art scheme using one global scale factor, since a plurality of scale factors are introduced, the present invention may better adjust the quality of quantization at critical bands and achieve a better encoding performance.

BRIEF DESCRIPTION OF THE DRAWINGS

[0019] Figure 1 illustrates a conventional block diagram of an MPEG1-LAYER3 audio encoding algorithm;

[0020] Figure 2 illustrates a conventional flowchart of TCX part in AMR-WB+ audio encoding algorithm;

[0021] Figure 3 illustrates a block diagram of an encoder for adjusting quality of quantization according to Embodiment 1;

[0022] Figure 4 illustrates a block diagram of a decoder for adjusting quality of quantization according to Embodiment 1;

[0023] Figure 5 illustrates a flowchart of adjusting quality of quantization at the encoder by using a plurality of scale factors according to Embodiment 1;

[0024] Figure 6 illustrates a flowchart of selecting a plurality of scale factors and finely tuning the frequency domain sample values on the whole frequency band according to Embodiment 1;

[0025] Figure 7 illustrates a flowchart of adjusting quality of quantization at the decoder by using a plurality of scale factors according to Embodiment 1;

[0026] Figure 8 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 2;

[0027] Figure 9 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 2;

[0028] Figure 10 illustrates a schematic diagram of peak pre-rectification according to Embodiment 2;

[0029] Figure 11 illustrates a schematic diagram of peak inverse rectification according to Embodiment 2;

[0030] Figure 12 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 3;

[0031] Figure 13 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 3;

[0032] Figure 14 illustrates a block diagram of an apparatus for adjusting quality of quantization at an encoder according to Embodiment 4; and

[0033] Figure 15 illustrates a block diagram of an apparatus for adjusting quality of quantization at a decoder according to Embodiment 4.

DETAILED DESCRIPTION

[0034] The purposes, technical solutions and advantages concerning the embodiments of the present invention will become more readily appreciated by reference to the following description of the embodiments.

[0035] The main idea of adjusting quality of quantization according to embodiments of the present invention is to utilize a plurality of scale factors or further utilize the spectrum rectification technique to adjust quality of quantization during an encoding process. An encoding process where a time-frequency transform has been performed is illustrated below. Embodiments of the present invention also apply to an encoding process where time-frequency transform has not been performed.

[0036] Embodiment 1:

[0037] Embodiment 1 provides a method for adjusting quality of quantization with a plurality of scale factors.

[0038] Figure 3 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 1. In the encoding process, sample values in time domain (time domain sample values) are first transformed to frequency domain by a time-frequency transform operation. Then, after the control of a plurality of scale factors, these sample values are quantized and the quantized sample values are output. An optimal global gain is calculated by performing gain balancing and inverse time-frequency transform on the output quantized sample values. Scale factors, quantized sample values in frequency domain (frequency domain sample values) and a global gain need to be transmitted in encoding streams.

[0039] Figure 4 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 1. In a decoding process, after the quantized sample values in frequency domain are gain-balanced and inversely transformed from frequency domain to time domain, sample values in time domain are obtained. Finally, these sample values are multiplied with the global gain to form restored sample values in time domain.

[0040] Figure 5 illustrates steps of adjusting quality of quantization at the encoder with a plurality of scale factors according to Embodiment 1. The steps are as follows.

[0041] Step 501: Time domain sample values x(n) are transformed to frequency domain sample values X(k) by virtue of a time-frequency transform.

[0042] Time-frequency transform herein may include a Discrete Fourier Transform (DFT), a Discrete Cosine Transform (DCT, MDCT, IDCT), a Discrete Wavelet Transform (DWT), etc. In a time-frequency transform process, Fast Fourier Transform (FFT) can also be applied for reducing computational complexity.

[0043] Step 502: A plurality of scale factors is used to control frequency domain sample values X(k). In particular, a plurality of proper scale factors are selected and used to finely tune the frequency domain sample values on the whole frequency band.

[0044] In the present embodiment, assume that m scale factors are applied to frequency domain sample values X(k) (k=0,1,...,N) on the whole frequency band and assume that the maximum allowable number of bits during encoding process is b_max. Referring to the flowchart illustrated in Figure 6, steps of selecting a plurality of proper scale factors and finely tuning the frequency domain sample values are described hereinafter.

[0045] Step 601: The whole frequency band is divided into m portions [0,n₁],[n₁+1,n₂],···,[n_{m-1}+1,N], and frequency domain sample values X(0,1,···,n₁),X(n₁+1,n₁+2,···,n₂),···,X(n_{m-1}+1,n_{m-1}+2,···,N) for the m portions are obtained. The scale factors for the portions are denoted as g₁,g₂,···,g_m.

[0046] In one embodiment, the plurality of scale factors can be used for a direct division of the whole frequency band after a time-frequency transform is performed, thereby eliminating the necessity of first using a group of filters for dividing the spectrum into several bands and then configuring a scale factor for each band. Compared with prior art, the present invention may significantly reduce the implementation complexity.

[0047] Step 602: A criteria value g₀ is selected for estimating m scale factors. The criteria g₀ for scale factors is selected in such a way that the estimation of the number of consumed bits b₀ is less than the maximum allowable number of bits b_max.

[0048] In the present embodiment, the number of consumed bits b is associated with frequency domain sample values X, the number of frequency domain samples N and the scale factor g, which can be expressed as a function of b = cons(X,N,g). Therefore, in step 602, when the criteria is selected as g₀, the estimation of the number of consumed bits is b₀ = cons(X,N,g₀) where b₀ < b_max.

[0049] Step 603: m scale factors g₁,g₂,···,g_m are adjusted around g₀.

[0050] In step 603, m scale factors are adjusted in such a way as to decrease scale factors at more critical bands and increase scale factors at less critical bands. Here, the more critical bands refer to low frequency bands while the less critical bands refer to high frequency bands. Because g₁ ∼ g_m correspond to bands from low frequency to high frequency, the adjusted m scale factors g^'₁,g^'₂,···,g^'_m increase gradually. With such adjustment, the quality of quantization at more critical bands is relatively good and the quality of quantization at less critical bands is relatively lower. Consequently, the quality of quantization at the whole frequency band can be optimized.

[0051] Step 604: It is determined whether the estimated number of consumed bits is no more than the total number of bits. If not, the process returns to step 603 and the scale factors are adjusted again. If so, m scale factors which satisfy the number of consumed bits are denoted as g^'₁,g^'₂,···,g^'_m.

[0052] Step 605: Quantization perception distortion is computed based on m adjusted scale factors g^'₁,g^'₂,···,g^'_m.

[0053] In the present embodiment, the quantization perception distortion C is related to frequency domain sample values X and m scale factors g₁,g₂,···,g_m, which can be expressed as a function C=f(X,g₁,g₂,···,g_m). The quantization perception distortion C indicates a distortion due to the difference between the original frequency domain sample values X and the sample values which come from the frequency domain sample values X adjusted by m scale factors g₁,g₂,···,g_m. In step 605, the quantization perception distortion computed based on m adjusted scale factors g^'₁,g^'₂,···,g^'_m is denoted as C=f(X,g^'₁,g^'₂,···,g^'_m).

[0054] Step 606: It is determined whether the quantization perception distortion is within an imperceptible range. If so, m scale factors obtained from the current adjustment are regarded as the optimal scale factors, which are denoted as g_1opt,g_2opt,···,g_mopt. Then, the process proceeds to step 607; otherwise, the process returns to step 603.

[0055] If the perception distortion is within the imperceptible range, people may not perceive the quantization noise introduced by the encoder. For instance, for audio encoding, human ear may not perceive the quantization noise introduced by the encoder. For video encoding, human eye may not perceive the quantization noise introduced by the encoder. The specific imperceptible range herein is a specific value interval where distortion is tolerated. The method for determining whether the quantization perception distortion is within an imperceptible range includes determining whether the quantization perception distortion computed at step 605 is within a value interval where distortion is tolerated. If the quantization perception distortion computed at step 605 is within a value interval where distortion is tolerated, the quantization perception distortion is regarded as imperceptible; otherwise, the quantization perception distortion is regarded as perceptible.

[0056] In the present embodiment, according to the determination result at step 606, in the case where the quantization perception distortion can be perceived, if the quantization perception distortion can still be perceived after the adjusting step described above is repeated M times, the closed-loop selection is terminated and a set of scale factors which contribute to a minimum perception distortion is selected from the scale factors obtained during the repetitive adjustment procedure as the optimal scale factors. Then, the process proceeds to step 607. In practice, the number M of closed-loop selection iterations may be determined based on the actual situation.
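
The control flow of steps 603 to 606, including the fallback after M iterations, may be sketched as follows. The bit estimate cons and the distortion function f are not specified in the text, so they are passed in as callables here; the names `estimate_bits`, `distortion`, `tolerance` and `max_rounds`, and the idea of supplying the adjusted scale-factor sets as a sequence of candidates, are assumptions made for illustration.

```python
def select_scale_factors(candidates, estimate_bits, distortion,
                         b_max, tolerance, max_rounds):
    """Closed-loop selection of a scale-factor set.

    candidates    -- iterable of scale-factor tuples, one per adjustment round (step 603)
    estimate_bits -- hypothetical stand-in for b = cons(X, N, g)
    distortion    -- hypothetical stand-in for C = f(X, g1, ..., gm)
    tolerance     -- upper end of the interval in which distortion is tolerated
    max_rounds    -- the iteration limit M
    """
    best, best_distortion = None, float("inf")
    for round_index, factors in enumerate(candidates):
        if round_index >= max_rounds:
            break
        # Step 604: reject sets whose estimated bit consumption exceeds the budget.
        if estimate_bits(factors) > b_max:
            continue
        # Step 605: compute the perception distortion for this set.
        c = distortion(factors)
        # Step 606: accept immediately if the distortion is tolerated.
        if c <= tolerance:
            return factors, c
        # Otherwise remember the set with the smallest distortion seen so far.
        if c < best_distortion:
            best, best_distortion = factors, c
    # Fallback after M rounds: the set that contributed the minimum distortion.
    return best, best_distortion
```

If no candidate fits the bit budget, the sketch returns `(None, inf)`; how an encoder should react in that case is not addressed by the text.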

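[0057] Step 607: The m optimal scale factors g_1opt,g_2opt,···,g_mopt obtained are used to finely tune the frequency domain sample values X. That is, the frequency domain sample values of each portion are divided by the optimal scale factor corresponding to that portion. The finely tuned spectrum X' obtained can be expressed as

$$X' = \left\{ \frac{X(0,1,\cdots,n_1)}{g_{1opt}},\ \frac{X(n_1+1,n_1+2,\cdots,n_2)}{g_{2opt}},\ \cdots,\ \frac{X(n_{m-1}+1,n_{m-1}+2,\cdots,N)}{g_{mopt}} \right\}$$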

[0058] The finely tuned frequency domain sample values X' obtained at steps 601∼607 are fed into encoder.
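
The division into portions at step 601 and the fine tuning at step 607 may be sketched as follows. The function names and the representation of the portion limits as a list `boundaries` = [n₁, n₂, ..., n_{m-1}] of inclusive upper indices are assumptions made for illustration.

```python
def portion_ranges(boundaries, length):
    """Return (start, end) pairs for [0,n1],[n1+1,n2],...,[n_{m-1}+1,N], ends inclusive."""
    starts = [0] + [b + 1 for b in boundaries]
    ends = list(boundaries) + [length - 1]
    return list(zip(starts, ends))


def fine_tune(X, boundaries, factors):
    """Divide the samples of each portion by the scale factor of that portion."""
    if len(factors) != len(boundaries) + 1:
        raise ValueError("need exactly one scale factor per portion")
    tuned = []
    for (start, end), g in zip(portion_ranges(boundaries, len(X)), factors):
        tuned.extend(x / g for x in X[start:end + 1])
    return tuned
```

For example, `fine_tune([2, 4, 6, 8, 10, 12], [1, 3], [1, 2, 4])` splits six samples into three portions of two samples each and divides them by 1, 2 and 4 respectively. No filter bank is involved; the split is a plain index range over the transformed samples.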

[0059] Considering the fact that scale factors are needed for restoring data during decoding, scale factors should be transmitted in the encoding streams. A variety of methods of transmitting scale factors are introduced below, respectively.

[0060] A first method for transmitting scale factors is to encode the m scale factors g_1opt,g_2opt,···,g_mopt which are used to finely tune the sample values in frequency domain. Thus, the data can be restored more correctly when being decoded.

[0061] A second method for transmitting scale factors is to select a scale factor as a criteria scale factor from the m scale factors g_1opt,g_2opt,···,g_mopt which are used to finely tune the sample values in frequency domain, compute the ratios of the remaining m-1 scale factors to the criteria scale factor, and encode these m-1 ratios. For instance, if g_1opt is selected as the criteria scale factor, only

$$\frac{g_{2opt}}{g_{1opt}},\ \frac{g_{3opt}}{g_{1opt}},\ \cdots,\ \frac{g_{mopt}}{g_{1opt}}$$

need to be coded, thereby reducing the number of consumed bits.

[0062] A third method for transmitting scale factors is to select a scale factor as a criteria scale factor from the m scale factors g_1opt,g_2opt,···,g_mopt which are used to finely tune the sample values in frequency domain, compute the ratios of the remaining m-1 scale factors to the criteria scale factor, and encode the criteria scale factor and these m-1 ratios. For instance, if g_1opt is selected as the criteria scale factor, only g_1opt and

$$\frac{g_{2opt}}{g_{1opt}},\ \frac{g_{3opt}}{g_{1opt}},\ \cdots,\ \frac{g_{mopt}}{g_{1opt}}$$

need to be encoded. Therefore, not only can the number of consumed bits be reduced, but the data can also be restored more correctly, since the decoder can compute g_1opt,g_2opt,···,g_mopt from g_1opt and these ratios.

[0063] To avoid consuming a large number of bits when a plurality of scale factors are used, an optimal number of scale factors may be selected in accordance with the requirements of encoding bit rate and quality of quantization. For instance, 2∼3 scale factors may be selected for low bit rate encoding.
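
The ratio-based representations used by the second and third transmission methods may be sketched as follows. The function names and the `ref_index` parameter are assumptions made for illustration, and the actual coding of the values into the stream is not shown.

```python
def to_ratios(factors, ref_index=0):
    """Ratios of the remaining m-1 scale factors to the criteria scale factor."""
    ref = factors[ref_index]
    return [g / ref for i, g in enumerate(factors) if i != ref_index]


def from_reference_and_ratios(ref, ratios, ref_index=0):
    """Third method at the decoder: rebuild all m scale factors."""
    factors = [ref * r for r in ratios]
    factors.insert(ref_index, ref)
    return factors


factors = [2.0, 3.0, 5.0]
ratios = to_ratios(factors)                      # second method sends only these
restored = from_reference_and_ratios(factors[0], ratios)  # third method also sends factors[0]
```

Under the second method the decoder knows the scale factors only up to the common factor g_1opt, whereas under the third method it can recover each of them.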

[0064] Step 503: Frequency domain sample values X' obtained by controlling a plurality of scale factors are quantized and quantized frequency domain sample values X_q are output.

[0065] In step 503, other quantization approaches may be employed in accordance with the encoding requirements, such as multistage vector quantization, split vector quantization, tree-structured vector quantization and trellis coded vector quantization.

[0066] Step 504: The impact imposed by the scale factors is eliminated from the quantized frequency domain sample values X_q obtained from step 503 and original frequency domain sample values X_balance can thus be restored. That is, X_balance can be obtained by performing a gain balance on the quantized frequency domain sample values X_q.

[0067] The gain balancing method varies with the method used for transmitting scale factors.

[0068] If the method for transmitting scale factors is the first method or the third method, the scale factors g_1opt,g_2opt,···,g_mopt selected according to step 502 may be used for gain balancing. In particular, quantized frequency domain sample values X_q are also divided into m portions in accordance with the method for dividing frequency bands as described in step 601. Then, X_q(0,1,···,n₁),X_q(n₁+1,n₁+2,···,n₂),···,X_q(n_{m-1}+1,n_{m-1}+2,···,N) are obtained. Quantized frequency domain sample values for each portion are multiplied with the scale factor of the corresponding portion. X_balance can be expressed as

$$X_{balance} = \left\{ X_q(0,1,\cdots,n_1)\cdot g_{1opt},\ X_q(n_1+1,n_1+2,\cdots,n_2)\cdot g_{2opt},\ \cdots,\ X_q(n_{m-1}+1,n_{m-1}+2,\cdots,N)\cdot g_{mopt} \right\}$$

[0069] If the method for transmitting scale factors is the second method, the ratios of the plurality of scale factors can be used for gain balancing. In particular, quantized frequency domain sample values X_q are also divided into m portions in accordance with the method for dividing frequency bands as described in step 601. Then, X_q(0,1,···,n₁),X_q(n₁+1,n₁+2,···,n₂),···,X_q(n_{m-1}+1,n_{m-1}+2,···,N) are obtained. The frequency domain sample values of the portion to which the criteria scale factor corresponds are multiplied with 1. The quantized frequency domain sample values of the remaining portions are multiplied with the ratios of the scale factors for the remaining portions to the criteria scale factor. If scale factor g_1opt of the first portion is adopted as the criteria scale factor, X_balance can be expressed as

$$X_{balance} = \left\{ X_q(0,1,\cdots,n_1)\cdot 1,\ X_q(n_1+1,n_1+2,\cdots,n_2)\cdot \frac{g_{2opt}}{g_{1opt}},\ \cdots,\ X_q(n_{m-1}+1,n_{m-1}+2,\cdots,N)\cdot \frac{g_{mopt}}{g_{1opt}} \right\}$$
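
Gain balancing with either full scale factors or ratios may be sketched as follows. The function names and the `boundaries` list of inclusive upper indices [n₁, ..., n_{m-1}] are assumptions made for illustration; the sketch also assumes that the first portion holds the criteria scale factor when ratios are used.

```python
def gain_balance(Xq, boundaries, multipliers):
    """Multiply the quantized samples of each portion by the multiplier of that portion."""
    if len(multipliers) != len(boundaries) + 1:
        raise ValueError("need exactly one multiplier per portion")
    starts = [0] + [b + 1 for b in boundaries]
    ends = list(boundaries) + [len(Xq) - 1]
    balanced = []
    for start, end, g in zip(starts, ends, multipliers):
        balanced.extend(q * g for q in Xq[start:end + 1])
    return balanced


def multipliers_from_ratios(ratios):
    """Ratio-only case: the criteria portion (assumed to be the first) is multiplied with 1."""
    return [1.0] + list(ratios)


Xq = [2, 4, 3, 4, 2, 3]
with_factors = gain_balance(Xq, [1, 3], [2.0, 4.0, 8.0])
with_ratios = gain_balance(Xq, [1, 3], multipliers_from_ratios([2.0, 4.0]))
```

The two results differ only by the constant factor of the criteria scale factor (2.0 in this example). A constant factor of this kind can be absorbed by a single gain applied to the whole signal afterwards.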

[0070] Step 505: Inverse time-frequency transform is performed on X_balance, which is obtained through gain balancing. The restored frequency domain sample values X_balance are transformed to the restored time domain sample values x_q(n).

[0071] Step 506: The original time domain sample values x(n) and the restored time domain sample values x_q(n) are used to compute an optimal global gain g_gopt.

[0072] Here, a global gain g_g is selected as an optimal global gain g_gopt such that the variance between the original time domain sample values and the restored time domain sample values is at its minimum, i.e., the optimal global gain g_gopt renders

$$\sum_{n} \left[ x(n) - g_g\, x_q(n) \right]^2$$

at its minimum. Thus, the optimal global gain can be computed from

$$g_{gopt} = \frac{\sum_{n} x(n)\, x_q(n)}{\sum_{n} x_q(n)^2}$$

[0073] The optimal global gain g_gopt may also require an encoding transmission so that the optimal global gain g_gopt can be used for data recovery.
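
The computation at step 506 may be sketched as follows, assuming that the criterion is the sum of squared differences between x(n) and g_g·x_q(n). Under that assumption, setting the derivative with respect to g_g to zero gives the ratio of the cross-correlation of x and x_q to the energy of x_q. The function names and the zero-energy guard are assumptions made for illustration.

```python
def optimal_global_gain(x, xq):
    """Least-squares gain g minimizing sum((x[n] - g * xq[n]) ** 2)."""
    energy = sum(v * v for v in xq)
    if energy == 0:
        # Hypothetical guard: an all-zero restored signal carries no gain information.
        return 0.0
    return sum(a * b for a, b in zip(x, xq)) / energy


def squared_error(x, xq, gain):
    return sum((a - gain * b) ** 2 for a, b in zip(x, xq))


x = [2.0, 4.0, 6.5]
xq = [1.0, 2.0, 3.0]
g_opt = optimal_global_gain(x, xq)
```

Any gain other than `g_opt` yields a squared error that is at least as large, which can be checked numerically with `squared_error`.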

[0074] The foregoing is a procedure for adjusting quality of quantization at the encoder by using a plurality of scale factors. Corresponding to adjusting quality of quantization during encoding, the process of restoring the sample values in time domain at the decoder based on the decoded quantized sample values in frequency domain is illustrated in Figure 7. The process includes the following steps.

[0075] Step 701: Scale factors obtained from the encoding streams are used for gain balancing of the quantized sample values in frequency domain. The implementation is similar to the method described in step 504 and is omitted herein for brevity. It should be noted that the gain balancing method may vary with the method used for transmitting scale factors. In addition, the gain balancing method at the encoder and the gain balancing method at the decoder should be consistent with each other.

[0076] Step 702: Inverse time-frequency transform is performed on the sample values in frequency domain which have been gain balanced and the sample values in time domain are thus obtained.

[0077] Step 703: Restored sample values in time domain are obtained by multiplying the sample values in time domain with the global gain obtained from the coded streams.
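
Steps 701 to 703 may be sketched as a single function. The inverse time-frequency transform is passed in as a callable because the text allows several transforms; the function name `decode`, its parameters and the `boundaries` list of inclusive upper indices are assumptions made for illustration.

```python
def decode(Xq, boundaries, multipliers, global_gain, inverse_transform):
    """Restore time domain samples from decoded quantized frequency domain samples.

    multipliers       -- scale factors (or ratios with 1 for the criteria portion), one per portion
    inverse_transform -- callable mapping a list of frequency samples to time samples
    """
    if len(multipliers) != len(boundaries) + 1:
        raise ValueError("need exactly one multiplier per portion")
    # Step 701: gain balancing, portion by portion.
    starts = [0] + [b + 1 for b in boundaries]
    ends = list(boundaries) + [len(Xq) - 1]
    balanced = []
    for start, end, g in zip(starts, ends, multipliers):
        balanced.extend(q * g for q in Xq[start:end + 1])
    # Step 702: inverse time-frequency transform.
    x_time = inverse_transform(balanced)
    # Step 703: multiply with the global gain taken from the stream.
    return [global_gain * v for v in x_time]


# Identity transform used only to keep the example short; a real decoder would
# use the inverse of the transform applied at the encoder.
restored = decode([1, 2, 3, 4], [1], [2.0, 4.0], 0.5, lambda s: list(s))
```

Passing an identity function as the transform also corresponds to the case where no time-frequency transform is performed and the portions are time intervals.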

[0078] The scheme of multiple scale factors control adopted in Embodiment 1 may be applied directly to sample values in time domain, which means that the scheme may be applied to the case where no time-frequency transform is performed. Accordingly, no inverse time-frequency transform is involved during the computation of the global gain. In this case, when a plurality of scale factors are being configured, the sample values in time domain can be divided by time intervals. When adjusting the plurality of scale factors, scale factors associated with more critical time intervals are decreased and the scale factors associated with less critical time intervals are increased.

[0079] Embodiment 2:

[0080] Embodiment 2 provides a method for adjusting quality of quantization with a plurality of scale factors and spectrum rectification.

[0081] Figure 8 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 2. In the encoding process, sample values in time domain are first transformed to frequency domain by a time-frequency transform operation. Then, after the spectrum pre-rectification and the control of a plurality of scale factors, these samples are quantized and the quantized sample values are output. An optimal global gain is calculated by performing gain balancing, inverse spectrum rectification and inverse time-frequency transform on the output quantized sample values. Scale factors, quantized sample values in frequency domain and a global gain need to be transmitted in an encoding stream.

[0082] Figure 9 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 2. In a decoding process, after the quantized sample values in frequency domain go through a gain-balancing, inverse spectrum rectification and inverse time-frequency transform, sample values in time domain are obtained. Finally, these sample values are multiplied with the global gain to form restored sample values in time domain.

[0083] In addition to the process illustrated in Figure 5 according to Embodiment 1, the method for adjusting quality of quantization with a plurality of scale factors and peak rectification according to Embodiment 2 may further include a spectrum pre-rectification step between the time-frequency transform at step 501 and the control of scale factors at step 502 and may further include an inverse spectrum rectification step between the gain balancing at step 504 and the inverse time-frequency transform at step 505. The spectrum pre-rectification and inverse spectrum rectification are now detailed below.

[0084] Figure 10 illustrates a schematic diagram of spectrum pre-rectification which is implemented by the following steps.

[0085] Step 1001: A spectrum rectification area is determined, and within this spectrum rectification area a peak set of the sample values in frequency domain obtained at step 501 is marked as {p_m,m=1,···,M}.

[0086] The spectrum rectification area herein refers to a spectrum area at more critical bands. For instance, for audio data, since the human auditory system has a high resolution at low frequencies, the low frequencies are considered as more critical bands. For another instance, for data such as videos and images, since most of the data information is distributed at low frequencies, the low frequencies are considered as more critical bands. Therefore, the spectrum rectification area may take the front part of the whole band, such as the first quarter of the band.

[0087] A peak p_m may be defined as a local maximum value among the amplitudes in the spectrum rectification area. If X(i)>X(j), ∀j∈[i-Δ,i+Δ], j≠i, then X(i) is a local maximum value among the 2Δ+1 points within [i-Δ,i+Δ], where the local area can be selected arbitrarily.
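The local maximum test above may be illustrated with the following minimal Python sketch. It assumes that amplitudes are compared by absolute value and that the window [i-Δ,i+Δ] is clipped at the edges of the spectrum; both choices, as well as the name mark_peaks and its parameters, are assumptions made for illustration only.

```python
def mark_peaks(X, delta, area_start, area_end):
    """Return the indices i in [area_start, area_end) whose amplitude is
    strictly larger than every other amplitude within [i-delta, i+delta]."""
    amp = [abs(v) for v in X]
    peaks = []
    for i in range(area_start, area_end):
        lo = max(0, i - delta)              # assumption: clip at the edges
        hi = min(len(amp) - 1, i + delta)
        if all(amp[i] > amp[j] for j in range(lo, hi + 1) if j != i):
            peaks.append(i)
    return peaks
```

For example, the rectification area could be chosen as the first quarter of the band by passing area_start=0 and area_end=len(X)//4.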

[0088] Step 1002: Reference p_ref for spectrum pre-rectification is computed.

[0089] The principle of selecting the reference is to keep the value of the reference unchanged before and after spectrum rectification. In step 1002, the maximum value in peak set {p_m,m=1,···,M} is regarded as reference p_ref. Alternatively, a local maximum energy value can be regarded as reference p_ref. Considering the impact of the quantization error, a characteristic parameter of a block of data can be regarded as reference p_ref in order to lessen the impact of the quantization error on the reference. Preferably, the energy level at a data point close to the maximum peak in the peak set {p_m,m=1,···,M} may be selected as reference p_ref. Alternatively, the average energy of the data points close to the maximum peak in the peak set {p_m,m=1,···,M} may be selected as reference p_ref.

[0090] Step 1003: Gain factor R_m of each peak p_m in the peak set {p_m,m=1,···,M} is computed as

R_m = C_m · (p_ref / p_m)^k, k∈(0,1),

where parameters C_m and k may be selected according to actual situation.

[0091] Step 1004: The computed gain factors of the peaks are used to amplify the peaks.

[0092] To ensure that reference p_ref is constant, apart from peaks which are used to calculate reference p_ref , the remaining peaks p_m should be multiplied with their corresponding gain factors R_m. The amplified peaks are expressed as p'_m = p_m · R_m.
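A minimal Python sketch of steps 1002 to 1004 is given below for illustration. It assumes the gain factor stated in claim 29, that is, C_m times the k-th power of the ratio of the reference to the peak, takes the maximum peak amplitude as the reference, and uses one common C for all peaks. The function name pre_rectify and the default values C=1.0 and k=0.5 are hypothetical.

```python
def pre_rectify(X, peaks, C=1.0, k=0.5):
    """Amplify the marked peaks of X, leaving the reference peak unchanged.

    X     : frequency-domain sample values
    peaks : indices of the marked peaks within the rectification area
    """
    if not peaks:
        return list(X)
    # Step 1002: the maximum peak amplitude serves as reference p_ref.
    ref_idx = max(peaks, key=lambda i: abs(X[i]))
    p_ref = abs(X[ref_idx])
    out = list(X)
    for m in peaks:
        if m == ref_idx or X[m] == 0:
            continue                      # the reference itself stays constant
        # Step 1003: gain factor R_m = C * (p_ref / p_m) ** k, 0 < k < 1.
        R_m = C * (p_ref / abs(X[m])) ** k
        # Step 1004: amplified peak p'_m = p_m * R_m.
        out[m] = X[m] * R_m
    return out
```

With C=1.0 every gain factor is at least 1 and no amplified peak exceeds the reference, so the maximum peak remains the maximum after rectification.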

[0093] Considering the fact that the human auditory system may have a high resolution at low frequencies, the peak energies can be captured by a quantizer by simply amplifying the peak energies at low frequencies. Therefore, in Embodiment 2, only a few frequency points, or peaks, need to be amplified. In the present embodiment, the spectrum pre-rectification technique may also be referred to as peak pre-rectification. With such a peak pre-rectification technique, there is little impact on the global gain increase, and the quantization error caused by the global gain increase may be negligible. For a better outcome of the spectrum rectification, the frequency points neighboring the peaks can also be amplified. For instance, in addition to amplifying a local peak among 2Δ+1 points, 2Δ points or fewer adjacent to the peak may be amplified by corresponding gain factors.

[0094] With the above described spectrum pre-amplification, the peaks of frequency domain sample values at more critical bands are enhanced, thereby reducing quantization error at peaks of sample values in frequency domain at more critical bands and reducing the possibility of the loss of the spectrum peaks at more critical bands during quantization.

[0095] In an encoder, to calculate the optimal global gain, the sample values in time domain have to be restored from the quantized sample values in frequency domain. If spectrum pre-rectification is employed, inverse spectrum rectification needs to be performed on X_balance, which has been gain balanced at step 504. The process is illustrated in Figure 11 and includes the following steps.

[0096] Step 1101: In X_balance obtained from step 504, peak set {q_m,m=1,···,M} for sample values in frequency domain is marked in the spectrum rectification area. The spectrum rectification area and the peak marking principle during inverse spectrum rectification are the same as those in the process of spectrum pre-rectification.

[0097] Step 1102: Reference q_ref for inverse spectrum rectification is computed. The principle of computing reference value during inverse spectrum rectification is also the same as that in the process of spectrum pre-rectification. For instance, if the energy, during spectrum pre-rectification process, at a data point adjacent to the maximum peak in the peak set {p_m,m=1,···,M} is regarded as the reference value, the energy, during inverse spectrum rectification process, at the data point adjacent to the maximum peak in the peak set {q_m,m=1,···,M} should also be regarded as the reference value.

[0098] Step 1103: Reduction factor r_m for each peak q_m in the peak set {q_m,m=1,···,M} is computed as

r_m = (C_m · (q_ref / q_m)^k)^(1/(1-k)), k∈(0,1),

where C_m and k have to be consistent with the parameters in the spectrum pre-rectification.

[0099] The principle of computing reduction factor r_m in the inverse spectrum rectification is described as follows: in the spectrum pre-rectification process, the gain factor is

R = C · (p_ref / p)^k, k∈(0,1).

If the value of a certain peak is p, the amplified peak is

p' = p · R = C · p_ref^k · p^(1-k).

According to this equation, p can be expressed as

p = (p' / (C · p_ref^k))^(1/(1-k)).

[0100] As can be seen from the foregoing principle of computing the reduction factor in the inverse spectrum rectification, there is no need to transmit the reference value for inverse spectrum rectification in the encoding streams. The same principle can also be applied at the decoder: the reference for inverse spectrum rectification can be computed from the characteristics of the sample values themselves at the decoder, and the reduction factor of each corresponding peak can then be calculated. Therefore, no extra bits are consumed.

[0101] Step 1104: The computed reduction factors of the peaks are used to decrease the peaks. In inverse spectrum rectification, the peaks which have been amplified in spectrum pre-rectification should be decreased. If other peaks in addition to the peaks used to compute the reference value are amplified in the spectrum pre-rectification process, the other peaks in addition to the peaks used to compute the reference value should also be decreased in the inverse spectrum rectification. That is, in addition to the peaks for computing the reference value q_ref, the remaining peaks q_m are divided by corresponding reduction factors r_m. The decreased peaks are expressed as q'_m = q_m/r_m.
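Steps 1102 to 1104 may be sketched in Python as follows. The sketch assumes a gain factor of the form C·(p_ref/p)^k as worded in claim 29, so that the amplified peak is C·p_ref^k·p^(1-k); solving this for p suggests a reduction factor r = (C·(q_ref/q)^k)^(1/(1-k)). This closed form is a derivation under the assumptions that the reference is unchanged by rectification and that quantization error is ignored; it is not quoted from the embodiments. The name inverse_rectify and its default parameters are hypothetical.

```python
def inverse_rectify(Q, peaks, C=1.0, k=0.5):
    """Decrease the marked peaks of Q, leaving the reference peak unchanged.

    Q     : gain-balanced frequency-domain sample values
    peaks : indices of the peaks marked in the rectification area
    C, k  : must match the values used in spectrum pre-rectification
    """
    if not peaks:
        return list(Q)
    # Step 1102: the reference is recomputed from the data itself,
    # so it does not have to be transmitted.
    ref_idx = max(peaks, key=lambda i: abs(Q[i]))
    q_ref = abs(Q[ref_idx])
    out = list(Q)
    for m in peaks:
        if m == ref_idx or Q[m] == 0:
            continue
        # Step 1103 (assumed form): r_m = (C * (q_ref / q_m)**k) ** (1/(1-k)).
        r_m = (C * (q_ref / abs(Q[m])) ** k) ** (1.0 / (1.0 - k))
        # Step 1104: decreased peak q'_m = q_m / r_m.
        out[m] = Q[m] / r_m
    return out
```

As a quick check under these assumptions, a peak of 2 next to a reference of 8 is amplified to 4 with C=1 and k=0.5, and the function above maps 4 back to approximately 2.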

[0102] After inverse spectrum rectification is performed according to the above described steps, the sample values in frequency domain obtained from the inverse spectrum rectification are transformed from frequency domain to time domain at step 505.

[0103] In Embodiment 2, since spectrum pre-rectification is performed between the time-frequency transform process and the process of the control of scale factors, the decoder may need to perform, accordingly, an inverse spectrum rectification between the gain balance process and the inverse time-frequency transform process. The detailed implementation is similar to that of the method of inverse spectrum rectification in the above encoding process, which is omitted herein for brevity.

[0104] In the above Embodiment 2, the spectrum pre-rectification is performed before the scale factors are controlled. Alternatively, in the encoding process, the scale factors may be controlled before the spectrum pre-rectification. Accordingly, in the process of restoring the original sample values during encoding and in the decoding process, inverse spectrum rectification may be performed prior to gain balancing. This case is not described in further detail.

[0105] Embodiment 3:

[0106] Embodiment 3 provides a method for adjusting quality of quantization by spectrum rectification.

[0107] Figure 12 illustrates a schematic diagram of an encoder for adjusting quality of quantization according to Embodiment 3. In the encoding process, sample values in time domain are first transformed to frequency domain by a time-frequency transform operation. Then, after the spectrum pre-rectification, these samples are quantized and the quantized sample values are output. An optimal global gain is calculated by performing inverse spectrum rectification and inverse time-frequency transform on the output quantized sample values. Quantized sample values in frequency domain and a global gain need to be transmitted in the encoding streams.

[0108] Figure 13 illustrates a schematic diagram of a decoder for adjusting quality of quantization according to Embodiment 3. In a decoding process, after the quantized sample values in frequency domain go through an inverse spectrum rectification and inverse time-frequency transform, sample values in time domain are obtained. Finally, these sample values are multiplied with the global gain to form restored sample values in time domain.

[0109] In Embodiment 3, the methods of spectrum pre-rectification and the inverse spectrum rectification and the technical effects thereof are the same as those in Embodiment 2, which are omitted herein for brevity.

[0110] Embodiment 4:

[0111] An apparatus for adjusting quality of quantization according to Embodiment 4 is provided.

[0112] Corresponding to methods in Embodiment 2, Figure 14 illustrates a block diagram of an apparatus for adjusting quality of quantization at an encoder according to Embodiment 4. As illustrated in Figure 14, the apparatus for adjusting quality of quantization at the encoder may include a time-frequency transform unit, a spectrum pre-rectification unit, a multiple scale factors control unit, a quantization unit, a gain balancing unit, an inverse spectrum rectification unit, an inverse time-frequency transform unit, and a global gain computing unit. The time-frequency transform unit receives a first group of sample values, performs a time-frequency transform on the first group of sample values and outputs to the spectrum pre-rectification unit. The spectrum pre-rectification unit receives the first group of sample values output from the time-frequency transform unit, performs a spectrum pre-rectification on the first group of sample values and outputs to the multiple scale factors control unit. The multiple scale factors control unit receives the first group of sample values, configures two or more scale factors for the first group of sample values, adjusts the first group of sample values with the scale factors, and outputs the adjusted first group of sample values to the quantization unit. The quantization unit quantizes the received adjusted first group of sample values, obtains quantized sample values and outputs the quantized sample values to the gain balancing unit. The gain balancing unit receives the quantized sample values, eliminates the influence imposed by the scale factors on the quantized sample values, obtains a second group of sample values, and outputs the second group of sample values to the inverse spectrum rectification unit. The inverse spectrum rectification unit receives the second group of sample values output from the gain balancing unit, performs an inverse spectrum rectification on the second group of sample values and outputs to the inverse time-frequency transform unit. The inverse time-frequency transform unit receives the second group of sample values from the inverse spectrum rectification unit, performs an inverse time-frequency transform on the second group of sample values and outputs to the global gain computing unit. The global gain computing unit receives the first group of sample values and the second group of sample values, and obtains the global gain by using the first group of sample values and the second group of sample values.

[0113] The multiple scale factors control unit includes a scale factor configuration unit and a sample value adjusting unit. The scale factor configuration unit is configured to configure two or more scale factors for the first group of sample values and outputs the configured scale factor to the sample value adjusting unit. The sample value adjusting unit is configured to receive scale factors and adjust the first group of sample values with the scale factors.

[0114] The scale factor configuration unit includes a criteria setting unit, a scale factor adjusting unit, a unit for estimating the number of consumed bits, and a perception distortion computing unit. The criteria setting unit is configured to set criteria for the scale factors and output the criteria to the scale factor adjusting unit. The scale factor adjusting unit is configured to adjust the scale factors based on the criteria and output the adjusted scale factors to the unit for estimating the number of consumed bits and to the perception distortion computing unit. The unit for estimating the number of consumed bits is configured to estimate the number of consumed bits based on the scale factors, determine whether the number of consumed bits is less than the total number of bits allowable by an encoding process, and transmit a determination result to the scale factor adjusting unit. The perception distortion computing unit is configured to calculate perception distortion based on the scale factors, determine whether the perception distortion is within an imperceptible range, and transmit the determination result to the scale factor adjusting unit.
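The interaction of these units may be illustrated by the following Python sketch of the adjustment loop, which also follows the wording of claims 14 and 17. The callables adjust, estimate_bits and distortion stand in for the scale factor adjusting unit, the unit for estimating the number of consumed bits and the perception distortion computing unit; their signatures, the threshold parameter and the max_rounds limit are assumptions for illustration only.

```python
def configure_scale_factors(initial, adjust, estimate_bits, distortion,
                            bit_budget, threshold, max_rounds=10):
    """Search for scale factors that fit the bit budget and keep the
    perception distortion within an imperceptible range.

    Returns the first scale factors meeting both conditions. If none is
    found within max_rounds, returns the in-budget candidate with the
    smallest distortion, or None if no candidate met the bit budget.
    """
    best, best_d = None, float("inf")
    sf = list(initial)
    for _ in range(max_rounds):
        sf = adjust(sf)                      # scale factor adjusting unit
        if estimate_bits(sf) >= bit_budget:  # bit condition not met:
            continue                         # keep adjusting
        d = distortion(sf)                   # perception distortion unit
        if d <= threshold:                   # imperceptible: accept
            return sf
        if d < best_d:                       # remember the best fallback
            best, best_d = sf, d
    return best
```

In practice, adjust would decrease the scale factors at critical bands and increase them at uncritical bands based on the criteria; the sketch leaves that policy to the caller.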

[0115] The spectrum pre-rectification unit includes a peak marking unit, a reference computing unit, a gain factor computing unit and a pre-rectification unit. The peak marking unit is configured to receive the first group of sample values, mark a peak among the first group of sample values within the spectrum rectification area, and output the peak to the reference computing unit. The reference computing unit is configured to compute based on the peak a reference for spectrum pre-rectification and output the reference to the gain factor computing unit. The gain factor computing unit is configured to compute based on the reference a gain factor for each marked peak and output the gain factor to the pre-rectification unit. The pre-rectification unit is configured to pre-rectify the spectrum with the gain factor.

[0116] The inverse spectrum rectification unit includes a peak marking unit, a reference computing unit, a reduction factor computing unit and an inverse rectification unit. The peak marking unit is configured to receive the sample values, mark peaks among the sample values within the spectrum rectification area, and output the marked peaks to the reference computing unit. The reference computing unit is configured to compute based on the peaks the reference for inverse spectrum rectification and output the reference to the reduction factor computing unit. The reduction factor computing unit is configured to compute based on the reference a reduction factor for each marked peak and output the reduction factor to the inverse rectification unit. The inverse rectification unit is configured to inversely rectify the spectrum with the reduction factor.

[0117] Corresponding to methods in Embodiment 2, Figure 15 illustrates a block diagram of an apparatus for adjusting quality of quantization at a decoder according to Embodiment 4. As illustrated in Figure 15, the apparatus for adjusting quality of quantization at the decoder includes a gain balancing unit, an inverse spectrum rectification unit, an inverse time-frequency transform unit and a global gain balancing unit. The gain balancing unit is configured to receive the quantized sample values and scale factors, utilize the received scale factors to eliminate the influence of the scale factors from the quantized sample values and obtain sample values, and output the sample values to the inverse spectrum rectification unit. The inverse spectrum rectification unit receives the sample values output from the gain balancing unit, performs an inverse spectrum rectification on the sample values and outputs to the inverse time-frequency transform unit. The inverse time-frequency transform unit receives the sample values from the inverse spectrum rectification unit, performs an inverse time-frequency transform on the sample values and outputs to the global gain balancing unit. The global gain balancing unit receives a global gain and sample values, multiplies the sample values with the global gain and outputs the multiplications. The global gain balancing unit may be a multiplier. Like the encoder, the inverse spectrum rectification unit of the decoder includes a peak marking unit, a reference computing unit, a reduction factor computing unit and an inverse rectification unit. The peak marking unit is configured to receive the sample values, mark peaks among the sample values within the spectrum rectification area, and output the marked peaks to the reference computing unit. The reference computing unit is configured to compute based on the peaks the reference for inverse spectrum rectification and output the reference to the reduction factor computing unit. 
The reduction factor computing unit is configured to compute based on the reference a reduction factor for each marked peak and output the reduction factor to the inverse rectification unit. The inverse rectification unit is configured to inversely rectify the spectrum with the reduction factor.

[0118] Corresponding to the methods of Embodiments 1 and 3 and implementations thereof, apparatuses for adjusting quality of quantization with different structures can be contemplated. The functionality of each unit in the apparatus has been described above in detail and is omitted herein for brevity.

[0119] Embodiments described above may be applicable to various encoding fields such as audio encoding, video encoding, and image encoding.

[0120] With the description of the foregoing embodiments, it is readily appreciated by those skilled in the art that the present invention may be implemented with software on a necessary hardware platform. The embodiments may also be implemented with hardware, although in most cases the former approach is preferable. Based on this understanding, the technical solutions of the present invention, or the part of the present invention that contributes over the prior art, may be embodied in a software product. The software product may be stored in a readable storage medium and may include a set of instructions enabling a computer (which may be a personal computer, a server, a network device, etc.) to perform the methods according to various embodiments of the present invention. The foregoing disclosure covers only a few embodiments of the present invention, and the present invention is not limited to them. Any modification made by those skilled in the art shall be construed as falling within the scope of the present invention.

[0121] The foregoing are merely procedures and method embodiments of the present invention, which shall not be construed as a limitation to the present invention. Any modifications, equivalents, improvements, etc., made within the spirit and principle of the present invention shall be construed as falling within the scope of the present invention.

Claims

1. A method for adjusting quality of quantization for encoding, characterized in comprising:

adjusting a first group of sample values for encoding with at least two scale factors; quantizing the adjusted first group of sample values to obtain the quantized sample values;

eliminating the impact of the scale factors from the quantized sample values to obtain a second group of sample values;

obtaining a global gain based on the first group of sample values and the second group of sample values; and

outputting the quantized sample values, information of the at least two scale factors and the global gain as an encoding stream.

2. The method of claim 1, characterized in that,

the first group of sample values and the second group of sample values are sample values in time domain; and

before adjusting the first group of sample values, the method further comprises: transforming the first group of sample values in time domain into the first group of sample values in frequency domain;

the adjusting a first group of sample values for encoding with at least two scale factors comprises: utilizing the scale factors to adjust the first group of sample values in frequency domain;

the quantizing the adjusted first group of sample values to obtain quantized sample values comprises: quantizing the adjusted first group of sample values in frequency domain to obtain the quantized sample values;

the eliminating the impact of the scale factors from the quantized sample values to obtain a second group of sample values comprises: eliminating the impact of the scale factors from the quantized sample values to obtain a second group of sample values in frequency domain;

after obtaining the second group of sample values and before obtaining the global gain, the method further comprises: transforming the second group of sample values in frequency domain into the second group of sample values in time domain;

the obtaining a global gain based on the first group of sample values and the second group of sample values comprises: obtaining the global gain by utilizing the first group of sample values in time domain and the second group of sample values in time domain.

3. The method of claim 2, characterized in that the transforming the first group of sample values in time domain into the first group of sample values in frequency domain comprises: transforming the first group of sample values in time domain into the first group of sample values in frequency domain based on a Discrete Fourier Transform, or a Fast Fourier Transform, or a Discrete Cosine Transform, or a Discrete Wavelet Transform.

4. The method of claim 2, characterized in that the at least two scale factors are scale factors configured for the first group of sample values in frequency domain.

5. The method of claim 4, characterized in that the configuring at least two scale factors for the first group of sample values in frequency domain comprises: dividing the first group of sample values in frequency domain into two or more portions and configuring a scale factor for each portion.

6. The method of claim 5, characterized in that the process of utilizing the scale factors to adjust the first group of sample values in frequency domain comprises: utilizing scale factors to adjust corresponding portions of the first group of sample values in frequency domain, respectively.

7. The method of claim 6, characterized in that the process of eliminating the impact of the scale factors from the quantized sample values comprises:

dividing the quantized sample values into two or more portions in accordance with the method of dividing the first group of sample values in frequency domain; and

utilizing the scale factor of each portion to eliminate the impact of the scale factor from the corresponding portion of the quantized sample value.

8. The method of claim 7, characterized in that the process of outputting the information of the at least two scale factors as an encoding stream comprises: outputting at least two scale factors as an encoding stream.

9. The method of claim 6, characterized in that after configuring a scale factor for each portion, the method further comprises:

selecting one scale factor of one portion among the scale factors as a criteria scale factor; and computing ratios of the scale factors of the remaining portions to the criteria scale factor;

eliminating the impact of the scale factors from the quantized sample values comprises dividing the quantized sample values into two or more portions in accordance with the method of dividing the first group of sample values in frequency domain; and utilizing the obtained ratios to eliminate the impact of the scale factors from the corresponding portions of the quantized sample values.

10. The method of claim 9, characterized in that the process of outputting the information of the at least two scale factors as an encoding stream comprises: outputting the ratios of the scale factors of the remaining portions to the criteria scale factor as an encoding stream.

11. The method of claim 9, characterized in that the process of eliminating the impact of the scale factors from the quantized sample values comprises:

dividing the quantized sample values into two or more portions in accordance with the method of dividing the first group of sample values in frequency domain;

utilizing the criteria scale factor and the obtained ratios to compute a scale factor for each portion; and

utilizing the scale factor of each portion to eliminate the impact of the scale factor from the corresponding portion of the quantized sample values.

12. The method of claim 11, characterized in that the process of outputting the information of the at least two scale factors as an encoding stream comprises: outputting the criteria scale factor and the ratios of the scale factors of the remaining portions to the criteria scale factor as an encoding stream.

13. The method of claim 6, characterized in that the process of configuring a scale factor for each portion comprises: adjusting the scale factor for each portion based on the number of consumed bits and perception distortion to obtain an optimal scale factor for each portion.

14. The method of claim 13, characterized in that the process of adjusting the scale factor for each portion to obtain an optimal scale factor comprises:

setting a criteria for the scale factors such that the number of consumed bits is less than the total number of bits allowable by the encoding;

adjusting the scale factor for each portion based on the criteria;

determining whether the adjusted scale factors make the number of consumed bits less than the total number of bits allowable by the encoding; if the condition is not met, continuing performing the process of adjusting scale factors until the condition is met; if the condition is met, computing the perception distortion;

determining whether the perception distortion is within an imperceptible range; if it is determined that the perception distortion is within an imperceptible range, regarding the scale factors obtained from the current adjustment as the optimal scale factors; otherwise, returning to the process of adjusting scale factors and repeating the step of adjusting scale factors and subsequent steps.

15. The method of claim 14, characterized in that the number of consumed bits is estimated based on the first group of sample values in frequency domain, the number of first sample values in frequency domain and the scale factors.

16. The method of claim 14, characterized in that, the perception distortion is obtained based on the first group of sample values in frequency domain, and the scale factor of each portion.

17. The method of claim 14, characterized in comprising:

if the perception distortion is within a perceptible range, repeating the process of adjusting scale factors and subsequent steps a predetermined number of times;

if the perception distortion is still within a perceptible range after the predetermined number of repetitions, selecting, from the scale factors adjusted during the repeating process, the scale factor which contributes to a minimum perception distortion as the optimal scale factor.

18. The method of claim 14, characterized in that the process of adjusting the scale factor for each portion based on the criteria comprises: decreasing the scale factors at critical bands based on the criteria and increasing the scale factors at uncritical bands based on the criteria.

19. The method of claim 18, characterized in that the critical bands are low frequency bands and the uncritical bands are high frequency bands.

20. The method of claim 2, characterized in that,

before utilizing the scale factors to adjust the first group of sample values in frequency domain, the method further comprises: performing a spectrum pre-rectification on the first group of sample values in frequency domain;

after eliminating the impact of the scale factors from the quantized sample values to obtain a second group of sample values in frequency domain and before transforming the second group of sample values in frequency domain to the second group of sample values in time domain, the method further comprises: performing an inverse spectrum rectification on the second group of sample values in frequency domain.

21. The method of claim 2, characterized in that,

after utilizing the scale factors to adjust the first group of sample values in frequency domain and before quantizing the sample values, the method further comprises: performing a spectrum pre-rectification on the adjusted first group of sample values in frequency domain;

after quantization and before eliminating the impact of the scale factors from the quantized sample values, the method further comprises: performing an inverse spectrum rectification on the quantized sample values.

22. The method of claim 20 or 21, characterized in comprising:

determining a spectrum rectification area;

performing a spectrum pre-rectification on the sample values comprises performing a spectrum pre-rectification on the sample values in the determined spectrum rectification area; and

performing an inverse spectrum rectification on the sample values comprises performing an inverse spectrum rectification on the sample values in the determined spectrum rectification area.

23. The method of claim 22, characterized in that, the spectrum pre-rectification process comprises:

marking peaks of the sample values among the sample values in the determined spectrum rectification area;

utilizing a peak of the marked peaks to compute a reference for spectrum pre-rectification;

utilizing the reference to compute a gain factor for each marked peak; and

utilizing the computed gain factor to pre-rectify the spectrum.

24. The method of claim 23, characterized in that the process of marking peaks of the sample values comprises: selecting one or more local areas from the spectrum rectification area and selecting a sample value which has the largest amplitude from each local area as a peak for a corresponding local area.
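
The peak-marking step of claim 24 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the local areas are contiguous blocks of equal width inside the rectification area; the claim does not prescribe how the local areas are chosen. The names `mark_peaks`, `area_start`, `area_end` and `local_width` are hypothetical.

```python
def mark_peaks(spectrum, area_start, area_end, local_width):
    """Return the index of the largest-amplitude sample in each local area.

    Assumption: local areas are equal-width blocks covering
    [area_start, area_end); the claim leaves the partition open.
    """
    peaks = []
    for lo in range(area_start, area_end, local_width):
        hi = min(lo + local_width, area_end)
        # Pick the sample whose absolute value is largest within [lo, hi).
        peaks.append(max(range(lo, hi), key=lambda i: abs(spectrum[i])))
    return peaks


spectrum = [0.1, -0.9, 0.3, 0.2, 0.5, -0.4, 0.05, 0.7]
peak_indices = mark_peaks(spectrum, 0, 8, 4)
```

Amplitude is taken here as the absolute value of the frequency-domain sample, so a large negative coefficient counts as a peak.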

25. The method of claim 24, characterized in that the process of pre-rectifying the spectrum comprises: utilizing gain factors for corresponding peaks to perform pre-rectification on the local areas containing the remaining peaks other than the peak for computing the reference.

26. The method of claim 25, characterized in that the process of performing pre-rectification comprises: amplifying the peaks with the gain factors; or utilizing the gain factors to amplify the peaks and the sample values in the areas containing the peaks.

27. The method of claim 23, characterized in that the process of computing the reference comprises: selecting a maximum peak from the marked peaks and utilizing the maximum peak to obtain the reference.

28. The method of claim 27, characterized in that the reference is an amplitude of the maximum peak, energy of a sample point close to the maximum peak, or average energy of sample points close to the maximum peak.

29. The method of claim 23, characterized in that,
the gain factor for the peak is C_m times the k-th power of the ratio of the reference to the peak, where k is a number greater than zero and less than 1 and C_m is an arbitrary number.
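
As a sketch of the gain factor of claim 29, the snippet below takes the amplitude of the maximum peak as the reference (one of the options in claim 28) and amplifies only the remaining peaks (the first option of claim 26). The function name `pre_rectify_peaks` and the default values `k=0.5` and `c_m=1.0` are illustrative assumptions, not values taken from the claims.

```python
def pre_rectify_peaks(spectrum, peak_indices, k=0.5, c_m=1.0):
    """Amplify marked peaks with gain = c_m * (reference / |peak|) ** k.

    Assumptions: the reference is the amplitude of the maximum marked peak,
    and the peak that supplies the reference is left unchanged.
    """
    if not 0.0 < k < 1.0:
        raise ValueError("k must be greater than 0 and less than 1")
    ref_index = max(peak_indices, key=lambda i: abs(spectrum[i]))
    reference = abs(spectrum[ref_index])
    rectified = list(spectrum)
    gains = {}
    for i in peak_indices:
        amplitude = abs(spectrum[i])
        if i == ref_index or amplitude == 0.0:
            continue  # skip the reference peak and avoid dividing by zero
        gains[i] = c_m * (reference / amplitude) ** k
        rectified[i] = spectrum[i] * gains[i]
    return rectified, gains


rectified, gains = pre_rectify_peaks([0.0, 4.0, 0.0, -1.0], [1, 3])
```

Because k lies between 0 and 1, a smaller peak receives a larger gain but is never raised above the reference when C_m is 1, so the spectral envelope is compressed rather than flattened.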

30. The method of claim 22, characterized in that, the inverse spectrum rectification process comprises:

marking peaks of the sample values among the sample values in the determined spectrum rectification area;

utilizing a peak of the marked peaks to compute a reference for inverse spectrum rectification;

utilizing the reference to compute a reduction factor for each marked peak; and

utilizing the computed reduction factor to inversely rectify the spectrum.

31. The method of claim 2, characterized in that,
the global gain obtained by utilizing the first group of sample values in time domain and the second group of sample values in time domain is determined such that the variance between the first group of sample values in time domain and the product of the second group of sample values in time domain and the global gain is at a minimum.
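
One way to read the minimization in claim 31, assuming that "variance" denotes the sum of squared differences, is as a one-parameter least-squares problem. The symbols below are introduced only for this illustration: x[n] is the first group of sample values in time domain, y[n] is the second group, and g is the global gain.

```latex
E(g) = \sum_{n} \bigl(x[n] - g\,y[n]\bigr)^{2},
\qquad
\frac{dE}{dg} = -2 \sum_{n} y[n]\,\bigl(x[n] - g\,y[n]\bigr) = 0
\;\Longrightarrow\;
g = \frac{\sum_{n} x[n]\,y[n]}{\sum_{n} y[n]^{2}}
```

Under this reading the closed form applies whenever the second group is not identically zero; the claim itself does not fix a particular formula.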

32. A method for adjusting quality of quantization for decoding, wherein an encoding stream output by an encoder is decoded as a decoding stream, the method characterized in comprising:

acquiring quantized sample values, information of at least two scale factors and a global gain from the decoding stream; and

utilizing the information of the at least two scale factors to eliminate the impact of the scale factors from the quantized sample values to obtain sample values; and multiplying the sample values with the global gain.

33. The method of claim 32, characterized in that,

the quantized sample values are quantized sample values in frequency domain;

the process of eliminating the impact of the scale factors from the quantized sample values to obtain the sample values comprises: eliminating the impact of the scale factors from the quantized sample values to obtain the sample values in frequency domain; and

after eliminating the impact of the scale factors from the quantized sample values to obtain sample values and before multiplying the sample values with the global gain, the method further comprises: transforming the sample values in frequency domain to sample values in time domain.

34. The method of claim 33, characterized in that,

after eliminating the impact of the scale factors from the quantized sample values in frequency domain to obtain sample values in frequency domain and before transforming the sample values in frequency domain to the sample values in time domain, the method further comprises: performing an inverse spectrum rectification on the sample values in frequency domain; or

before eliminating the impact of the scale factors from the quantized sample values in frequency domain to obtain sample values in frequency domain, the method further comprises: performing an inverse spectrum rectification on the sample values in frequency domain.

35. The method of any one of claims 32-34, characterized in that
the information of the scale factors acquired from the decoding stream comprises all scale factors;
the process of eliminating the impact of the scale factors from the quantized sample values comprises:

dividing the quantized sample values into two or more portions in accordance with the method of dividing the sample values in frequency domain during encoding; and

utilizing the scale factor for each portion to eliminate the impact of the scale factor from the corresponding portion of the quantized sample values.

36. The method of any one of claims 32-34, characterized in that
the information of the scale factors acquired from the decoding stream is the remaining scale factors and ratios of the remaining scale factors to a criteria scale factor wherein a scale factor is treated as the criteria scale factor;
the process of eliminating the impact of the scale factors from the quantized sample values comprises:

dividing the quantized sample values into two or more portions in accordance with the method of dividing the sample values in frequency domain during encoding; and

utilizing the obtained ratios to eliminate the impact of the scale factors from the corresponding portions of the quantized sample values.

37. The method of any one of claims 32-34, characterized in that
the information of the scale factors acquired from the decoding stream is a scale factor treated as a criteria scale factor and the ratios of the remaining scale factors to the criteria scale factor;
the process of eliminating the impact of the scale factors from the quantized sample values comprises:

dividing the quantized sample values into two or more portions in accordance with the method of dividing the sample values in frequency domain during encoding;

utilizing the criteria scale factor and the ratios to compute a scale factor for each portion; and

utilizing the scale factor for each portion to eliminate the impact of the scale factor from the corresponding portion of the quantized sample values.
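
The decoding steps of claim 37 can be sketched as follows. The sketch assumes that the encoder applied each scale factor multiplicatively, so that eliminating its impact is a division, and that the criteria scale factor belongs to the first portion; neither assumption is stated in the claim. The names `remove_scale_factors`, `boundaries`, `criteria_sf` and `ratios` are hypothetical.

```python
def remove_scale_factors(quantized, boundaries, criteria_sf, ratios):
    """Rebuild per-portion scale factors and undo them.

    boundaries: start index of every portion plus the final end index.
    ratios: ratio of each remaining scale factor to the criteria scale factor.
    Assumptions: the first portion uses the criteria scale factor itself,
    and the encoder multiplied each portion by its scale factor.
    """
    scale_factors = [criteria_sf] + [criteria_sf * r for r in ratios]
    if len(scale_factors) != len(boundaries) - 1:
        raise ValueError("one scale factor is needed per portion")
    restored = []
    for p, sf in enumerate(scale_factors):
        portion = quantized[boundaries[p]:boundaries[p + 1]]
        restored.extend(q / sf for q in portion)
    return restored


samples = remove_scale_factors([4, 8, 6, 3], [0, 2, 4], 2.0, [1.5])
```

Sending one scale factor plus ratios lets the decoder recover every scale factor, whereas sending only the ratios, as in claim 36, restores the portions relative to one another and leaves the overall level to the global gain.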

38. The method of claim 34, characterized in that, the inverse spectrum rectification process comprises:

marking peaks of the sample values among the sample values in a spectrum rectification area determined during encoding;

utilizing a peak of the marked peaks to compute a reference for inverse spectrum rectification;

utilizing the reference to compute a reduction factor for each marked peak; and

utilizing the computed reduction factor to inversely rectify the spectrum.

39. An apparatus for adjusting quality of quantization for encoding, characterized in comprising a multiple scale factors control unit, a quantization unit, a gain balancing unit, and a global gain computing unit; wherein
the multiple scale factors control unit is configured to receive a first group of sample values, configure at least two scale factors for the first group of sample values, adjust the first group of sample values with the scale factors, and output the adjusted first group of sample values to the quantization unit;
the quantization unit is configured to quantize the received first group of sample values, obtain quantized sample values and output the quantized sample values to the gain balancing unit;
the gain balancing unit is configured to receive the quantized sample values, eliminate the impact of the scale factors from the quantized sample values, obtain a second group of sample values, and output the second group of sample values to the global gain computing unit; and
the global gain computing unit is configured to receive the first group of sample values and the second group of sample values, and obtain a global gain based on the first group of sample values and the second group of sample values.
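
The data flow between the four units of claim 39 can be sketched end to end. This is an illustrative model only: it assumes multiplicative scale factors, uses rounding to the nearest integer as a stand-in for the quantization unit, and computes the global gain by least squares. The function name `encode_frame` and its parameters are hypothetical.

```python
def encode_frame(samples, boundaries, scale_factors):
    """Return (quantized values, global gain) for one group of sample values."""
    # Multiple scale factors control unit: scale each portion (assumed multiplicative).
    adjusted = []
    for p, sf in enumerate(scale_factors):
        adjusted.extend(v * sf for v in samples[boundaries[p]:boundaries[p + 1]])

    # Quantization unit: rounding stands in for the real quantizer.
    quantized = [round(v) for v in adjusted]

    # Gain balancing unit: undo the scale factors to get the second group.
    second = []
    for p, sf in enumerate(scale_factors):
        second.extend(q / sf for q in quantized[boundaries[p]:boundaries[p + 1]])

    # Global gain computing unit: least-squares fit of the first group
    # by the second group (an assumed reading of the gain criterion).
    energy = sum(y * y for y in second)
    gain = sum(x * y for x, y in zip(samples, second)) / energy if energy else 1.0
    return quantized, gain


quantized, gain = encode_frame([1.0, 2.0, 0.5, 0.25], [0, 2, 4], [2.0, 4.0])
```

In this example the scaled values are already integers, so quantization is lossless and the computed global gain is 1; with a coarser quantizer the gain compensates for the average level change that quantization introduces.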

40. The apparatus of claim 39, characterized in further comprising a time-frequency transform unit and an inverse time-frequency transform unit; wherein
the time-frequency transform unit is configured to receive the first group of sample values, perform a time-frequency transform on the first group of sample values and output to the multiple scale factors control unit; and
the inverse time-frequency transform unit is configured to receive the second group of sample values from the gain balancing unit, perform an inverse time-frequency transform on the second group of sample values and output to the global gain computing unit.

41. The apparatus of claim 40, characterized in further comprising a spectrum pre-rectification unit and an inverse spectrum rectification unit; wherein
the spectrum pre-rectification unit is configured to receive the first group of sample values output from the time-frequency transform unit, perform a spectrum pre-rectification on the first group of sample values and output to the multiple scale factors control unit; the inverse spectrum rectification unit is configured to receive the second group of sample values output from the gain balancing unit, perform an inverse spectrum rectification on the second group of sample values and output to the inverse time-frequency transform unit;
or,
the spectrum pre-rectification unit is configured to receive the adjusted first group of sample values output from the multiple scale factors control unit, perform a spectrum pre-rectification on the adjusted first group of sample values and output to the quantization unit; the inverse spectrum rectification unit is configured to receive the quantized sample values output from the quantization unit, perform an inverse spectrum rectification on the quantized sample values and output to the gain balancing unit.

42. The apparatus of any one of claims 39 to 41, characterized in that the multiple scale factors control unit comprises a scale factor configuration unit and a sample value adjusting unit; wherein
the scale factor configuration unit is configured to configure at least two scale factors for the first group of sample values and output the configured scale factors to the sample value adjusting unit; and
the sample value adjusting unit is configured to receive scale factors and adjust the first group of sample values with the scale factors.

43. The apparatus of claim 42, characterized in that, the scale factor configuration unit comprises a criteria setting unit, a scale factor adjusting unit, a unit for estimating the number of consumed bits, a perception distortion computing unit; wherein
the criteria setting unit is configured to set a criteria for the scale factors and output the criteria to the scale factor adjusting unit;
the scale factor adjusting unit is configured to adjust the scale factors based on the criteria and output the adjusted scale factors to the unit for estimating the number of consumed bits and the perception distortion computing unit;
the unit for estimating the number of consumed bits is configured to estimate the number of consumed bits based on the scale factors and determine if the number of consumed bits is less than the total number of bits allowable by an encoding process and transmit a determination result to the scale factor adjusting unit; and
the perception distortion computing unit is configured to calculate perception distortion based on the scale factors, determine whether the perception distortion is within an imperceptible range and transmit the determination result to the scale factor adjusting unit.

44. The apparatus of claim 41, characterized in that, the spectrum pre-rectification unit comprises a peak marking unit, a reference computing unit, a gain factor computing unit and a pre-rectification unit; wherein
the peak marking unit is configured to receive the first group of sample values, mark peaks among the first group of sample values within the spectrum rectification area, and output the marked peaks to the reference computing unit;
the reference computing unit is configured to compute based on the peak a reference for spectrum pre-rectification and output the reference to the gain factor computing unit;
the gain factor computing unit is configured to compute based on the reference a gain factor for each marked peak and output the gain factor to the pre-rectification unit; and
the pre-rectification unit is configured to pre-rectify the spectrum with the gain factor.

45. The apparatus of claim 41, characterized in that, the inverse spectrum rectification unit comprises a peak marking unit, a reference computing unit, a reduction factor computing unit and an inverse rectification unit; wherein
the peak marking unit is configured to receive the sample values, mark peaks among the sample values within the spectrum rectification area, and output the marked peaks to the reference computing unit;
the reference computing unit is configured to compute based on the peaks a reference for inverse spectrum rectification and output the reference to the reduction factor computing unit;
the reduction factor computing unit is configured to compute based on the reference a reduction factor for each marked peak and output the reduction factor to the inverse rectification unit; and
the inverse rectification unit is configured to inversely rectify the spectrum with the reduction factor.

46. An apparatus for adjusting quality of quantization for decoding, characterized in comprising a gain balancing unit, and a global gain balancing unit; wherein
the gain balancing unit is configured to receive quantized sample values and scale factors, utilize the received scale factors to eliminate the impact of the scale factors from the quantized sample values and obtain sample values, and output the sample values to the global gain balancing unit; and
the global gain balancing unit is configured to receive a global gain and the sample values, multiply the sample values by the global gain and output the products.

47. The apparatus of claim 46, characterized in further comprising an inverse time-frequency transform unit configured to receive the sample values from the gain balancing unit, perform an inverse time-frequency transform on the sample values and output to the global gain balancing unit.

48. The apparatus of claim 47, characterized in further comprising an inverse spectrum rectification unit; wherein
the inverse spectrum rectification unit is configured to receive the sample values output from the gain balancing unit, perform an inverse spectrum rectification on the sample values and output to the inverse time-frequency transform unit;
or,
the inverse spectrum rectification unit is configured to receive the quantized sample values, perform an inverse spectrum rectification on the quantized sample values and output to the gain balancing unit.

49. The apparatus of claim 48, characterized in that, the inverse spectrum rectification unit comprises a peak marking unit, a reference computing unit, a reduction factor computing unit and an inverse rectification unit; wherein
the peak marking unit is configured to receive the sample values, mark peaks among the sample values within the spectrum rectification area, and output the marked peaks to the reference computing unit;
the reference computing unit is configured to compute based on the peaks a reference for inverse spectrum rectification and output the reference to the reduction factor computing unit;
the reduction factor computing unit is configured to compute based on the reference a reduction factor for each marked peak and output the reduction factor to the inverse rectification unit; and
the inverse rectification unit is configured to inversely rectify the spectrum with the reduction factor.
