FIELD OF THE INVENTION
[0001] The present invention relates to encoding technology, and more specifically, to a
method and apparatus for adjusting quality of quantization for encoding/decoding.
BACKGROUND
[0002] With the development of communication technology and the extension of multi-media
service, encoding such as digital audio coding or digital video encoding not only
requires a higher encoding efficiency and a real-time feature, but also a further
extended encoding bandwidth. In terms of the digital audio encoding, techniques meeting
the requirement of low bit rate and high audio encoding quality mainly include: AAC+,
EAAC+ and AMR-WB+. The AAC+ and EAAC+ are evolved from an audio encoder with high
bit rate, while the AMR-WB+ is a mixing encoding method by extending audio encoding
with low bit rate.
[0003] In normal audio encoding, to better combine characteristics of human auditory system,
time-frequency transformation is typically first performed on samples and the rounding,
weighting, and quantization are then performed on spectrum coefficients based on auditory
characteristics. The quantized spectrum coefficients are transmitted by entropy encoding.
A major distortion in the encoding comes from quantization of various parameters.
Therefore, to accommodate different requirements, the encoder needs to adjust the
quality of quantization based on a specified encoding rate. In an encoding scheme
with high bit rate above 24kbps, a good encoder may achieve a transparent sound, i.e.,
human ear may not perceive the noise introduced in the encoding and quantizing process.
In an encoding scheme with low bit rate, since the number of bits is insufficient,
the effect of a perfect sound transparency may not be achieved. Therefore, one may
only pursue a minimum subjective distortion.
[0004] A common scheme for adjusting quality of quantization is to use a scale factor or
a gain. First, a coded coefficient is divided by a scale factor or multiplied with
a gain. Then, the scaled coefficient is quantized. The optimal scale factor may both
satisfy the requirement of bit rate and minimize the quantization error. Therefore,
when the bit rate is high, a smaller scale factor is chosen such that the quantized
coefficient may have a larger dynamic range and a relatively refined quantization.
When the bit rate is low, a bigger scale factor is chosen such that the quantized
coefficient may have a smaller dynamic range and a relatively coarse quantization.
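The effect of the scale factor described above can be illustrated with a minimal sketch. The function names, the use of simple rounding as the quantizer, and the sample coefficients are assumptions made for illustration only and are not taken from any particular codec.

```python
def quantize_with_scale(coeffs, scale):
    """Divide each coefficient by the scale factor, then round to an integer level."""
    return [round(c / scale) for c in coeffs]


def reconstruct(levels, scale):
    """Undo the scaling that was applied before quantization."""
    return [q * scale for q in levels]


def squared_error(a, b):
    """Sum of squared differences between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


coeffs = [0.9, -2.3, 4.1, 7.6]

# Small scale factor: larger dynamic range of levels, finer quantization.
fine = quantize_with_scale(coeffs, 0.5)
# Large scale factor: smaller dynamic range of levels, coarser quantization.
coarse = quantize_with_scale(coeffs, 4.0)

fine_error = squared_error(coeffs, reconstruct(fine, 0.5))
coarse_error = squared_error(coeffs, reconstruct(coarse, 4.0))
```

In this sketch the smaller scale factor yields levels spanning a wider integer range and a smaller reconstruction error, at the cost of more bits needed to represent the levels.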
[0005] Figure 1 illustrates a block diagram of an MPEG1-LAYER3 audio encoding algorithm.
In the MPEG1-LAYER3 audio encoding algorithm, before the time-frequency transform,
the whole encoding band is divided into 32 sub-bands with each assigned with a scale
factor. The whole band is assigned with a global scale factor. Before quantization,
the global scale factor is adjusted using a closed-loop search algorithm such that
the number of quantization bits is controlled within the range allowed by the current
bit rate. At the same time, scale factors for sub-bands are adjusted such that the
quantization noise is controlled under the masking threshold of human auditory system.
That is, the human ear may not perceive the presence of the quantization noise. Finally,
the quantized coefficient flow is transmitted by way of Huffman encoding.
[0006] The encoding method with multiple sub-band scale factors in the MPEG1-LAYER3 encoding
algorithm has the following defects:
- (1) the division into sub-bands requires 32 sub-band analysis filters, resulting in
a highly complicated computation.
- (2) the scale factor of each sub-band needs to be transmitted by way of Huffman encoding,
which occupies a large number of bits and is not appropriate for low bit rate encoding.
[0007] Figure 2 illustrates a partial flowchart of Transform Coded Excitation (TCX) of AMR-WB+
audio encoding algorithm. In AMR-WB+ audio encoding, a global scale factor is used.
Due to the limitation of using one scale factor, a specific frequency band cannot
be finely tuned. Moreover, considering encoding requirement on low bit rate, the frequency
domain samples in the spectrum which have a low energy may be lost during vector quantization.
However, since human auditory system has different sensitivities over different frequency
bands, it is desired that the frequency domain samples with low energy at critical
frequency bands can still be quantized during encoding. Therefore, in AMR-WB+ audio
encoding, the spectrum pre-rectification and spectrum inverse rectification are employed.
For TCX of AMR-WB+ audio encoding algorithm, critical frequency bands in the whole
spectrum are first pre-rectified to raise the energy at these specific bands and then
a global scale factor is used for the whole frequency band.
[0008] Since human auditory system has a high resolution for low frequency bands, the above
mentioned critical frequency bands typically refer to low frequency bands. For the spectrum
pre-rectification scheme in AMR-WB+ audio encoding, every 8 frequency domain samples
in the first quarter of the spectrum are treated as a block. The energy $E_m$ of each block
is calculated, where $m$ denotes a block index number. Then, a maximum block energy $E_{max}$
is determined and $R_m = (E_{max}/E_m)^{1/4}$ is computed for each block. A gain factor $G_m$
for each block is obtained based on $R_m$ such that the gain factor $G_m$ possesses a
monotonically decreasing property. Finally, the frequency domain sample values in each block
are multiplied with the gain factor $G_m$ associated with the block. In AMR-WB+ audio encoding,
the gain factor obtained from the spectrum pre-rectification is not transmitted in encoding
streams. Instead, according to the spectrum inverse rectification method, the gain factor $G_m$
of each block is first calculated based on sample values in frequency domain, and the original
sample values in frequency domain are then restored by dividing the sample values of each
block by the gain factor of the corresponding block.
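The block-wise gain computation described above can be sketched as follows. This is only an illustration: the function name `block_gains`, the small constant added to the energies to avoid division by zero, and the use of a running minimum to obtain the monotonically decreasing property are assumptions, not details stated in this description.

```python
def block_gains(spectrum, block_size=8):
    """Compute one gain per block over the first quarter of the spectrum."""
    quarter = len(spectrum) // 4
    usable = quarter - quarter % block_size  # ignore an incomplete trailing block

    # Block energies E_m; the tiny offset (an assumption) avoids division by zero.
    energies = []
    for start in range(0, usable, block_size):
        block = spectrum[start:start + block_size]
        energies.append(sum(v * v for v in block) + 1e-12)
    if not energies:
        return []

    e_max = max(energies)
    gains = []
    previous = float("inf")
    for e in energies:
        r = (e_max / e) ** 0.25      # R_m = (E_max / E_m)^(1/4)
        g = min(r, previous)         # assumed rule: running minimum keeps G_m non-increasing
        gains.append(g)
        previous = g
    return gains


def pre_rectify(spectrum, gains, block_size=8):
    """Multiply the samples of each block by the gain of that block."""
    out = list(spectrum)
    for m, g in enumerate(gains):
        for k in range(m * block_size, (m + 1) * block_size):
            out[k] = spectrum[k] * g
    return out
```

For a 64-sample spectrum, the first quarter holds two blocks of 8 samples, so `block_gains` returns two gains and `pre_rectify` changes only the first 16 samples.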
[0009] It is discovered that the global scale factor algorithm for the TCX portion of the
existing AMR-WB+ audio encoding algorithm has the following defects.
- (1) Since only one scale factor is used for the whole frequency band, the quality
of quantization may be adjusted only on a whole band basis. As a result, some critical
frequency bands cannot be emphasized.
- (2) Although a spectrum pre-rectification technique and a spectrum inverse rectification
technique are used for improving the quality of quantization at low frequencies, the
quality of quantization at other frequencies is thereby sacrificed.
- (3) Spectrum pre-rectification and inverse rectification techniques can only be applied
to narrower frequency bands; otherwise, the global scale factor increases significantly
and the effect of the quantization as a whole may be reduced.
- (4) Since the gain factors for pre-rectification at the encoding stage are not recorded
in the encoding streams, the accumulation of errors introduced by the quantization
may be reflected in the reduction factors during the inverse rectification process.
SUMMARY
[0010] According to one embodiment of the present invention, a method for adjusting quality
of quantization for encoding is provided to reduce the implementation complexity.
[0011] According to one embodiment of the present invention, a method for adjusting quality
of quantization for decoding is provided to guarantee the quality of quantization.
[0012] According to one embodiment of the present invention, an apparatus for adjusting
quality of quantization for encoding is provided to reduce the implementation complexity.
[0013] According to one embodiment of the present invention, an apparatus for adjusting
quality of quantization for decoding is provided to guarantee the quality of quantization.
[0014] A method for adjusting quality of quantization for encoding is provided according
to one embodiment of the present invention. The method includes: adjusting a first
group of sample values for encoding with at least two scale factors; quantizing the
adjusted first group of sample values to obtain the quantized sample values; eliminating
the impact of the scale factors from the quantized sample values to obtain a second
group of sample values; obtaining a global gain with the first group of sample values
and the second group of sample values; and outputting the quantized sample values,
information of the two or more scale factors and the obtained global gain as an encoding
stream.
[0015] A method for adjusting quality of quantization for decoding is provided according
to one embodiment of the present invention where an encoding stream output by an encoder
is decoded as a decoding stream. The method includes: acquiring quantized sample values,
information of two or more scale factors and a global gain from the decoding stream;
utilizing the information of the two or more scale factors to eliminate the impact
of the scale factors from the quantized sample values to obtain sample values; and
multiplying the sample values with the global gain.
[0016] An apparatus for adjusting quality of quantization for encoding is provided according
to one embodiment of the present invention. The apparatus includes: a multiple scale
factors control unit, a quantization unit, a gain balancing unit, and a global gain
computing unit. The multiple scale factors control unit is configured to receive a
first group of sample values, configure two or more scale factors for the first group
of sample values, adjust the first group of sample values with the scale factors,
and output the first group of adjusted sample values to the quantization unit. The
quantization unit is configured to quantize the received first group of sample values,
obtain quantized sample values and output the quantized sample values to the gain
balancing unit. The gain balancing unit is configured to receive the quantized sample
values, eliminate the impact of the scale factors from the quantized sample values,
obtain a second group of sample values, and output the second group of sample values
to the global gain computing unit. The global gain computing unit is configured to
receive the first group of sample values and the second group of sample values, and
obtain the global gain by using the first group of sample values and the second group
of sample values.
[0017] An apparatus for adjusting quality of quantization for decoding is provided according
to one embodiment of the present invention. The apparatus includes: a gain balancing
unit, and a global gain balancing unit. The gain balancing unit is configured to receive
the quantized sample values and reduction factors, utilize the received reduction
factors to eliminate the impact of the scale factors from the quantized sample values
and obtain sample values, and output the sample values to the global gain balancing
unit. The global gain balancing unit is configured to receive a global gain and the
sample values, multiply the sample values with the global gain and output the multiplications.
[0018] Unlike the prior art scheme where filters are utilized, methods and apparatuses
for adjusting quality of quantization according to various embodiments of the present
invention directly divide the sample values into a plurality of portions and configure
a scale factor for each portion. Therefore, the present invention may greatly reduce
the implementation complexity. Moreover, compared with the prior art scheme using
one global factor, since a plurality of scale factors are introduced, the present
invention may better adjust the quality of quantization at critical bands and achieve
a better encoding performance.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] Figure 1 illustrates a conventional block diagram of an MPEG1-LAYER3 audio encoding
algorithm;
[0020] Figure 2 illustrates a conventional flowchart of TCX part in AMR-WB+ audio encoding
algorithm;
[0021] Figure 3 illustrates a block diagram of an encoder for adjusting quality of quantization
according to Embodiment 1;
[0022] Figure 4 illustrates a block diagram of a decoder for adjusting quality of quantization
according to Embodiment 1;
[0023] Figure 5 illustrates a flowchart of adjusting quality of quantization at the encoder
by using a plurality of scale factors according to Embodiment 1;
[0024] Figure 6 illustrates a flowchart of selecting a plurality of scale factors and finely
tuning the frequency domain sample values on the whole frequency band according to
Embodiment 1;
[0025] Figure 7 illustrates a flowchart of adjusting quality of quantization at the decoder
by using a plurality of scale factors according to Embodiment 1;
[0026] Figure 8 illustrates a schematic diagram of an encoder for adjusting quality of quantization
according to Embodiment 2;
[0027] Figure 9 illustrates a schematic diagram of a decoder for adjusting quality of quantization
according to Embodiment 2;
[0028] Figure 10 illustrates a schematic diagram of peak pre-rectification according to
Embodiment 2;
[0029] Figure 11 illustrates a schematic diagram of peak inverse rectification according
to Embodiment 2;
[0030] Figure 12 illustrates a schematic diagram of an encoder for adjusting quality of
quantization according to Embodiment 3;
[0031] Figure 13 illustrates a schematic diagram of a decoder for adjusting quality of quantization
according to Embodiment 3;
[0032] Figure 14 illustrates a block diagram of an apparatus for adjusting quality of quantization
at an encoder according to Embodiment 4; and
[0033] Figure 15 illustrates a block diagram of adjusting quality of quantization at a decoder
according to Embodiment 4.
DETAILED DESCRIPTION
[0034] The purposes, technical solutions and advantages concerning the embodiments of the
present invention will become more readily appreciated by reference to the following
description of the embodiments.
[0035] The main idea of adjusting quality of quantization according to embodiments of the
present invention is to utilize a plurality of scale factors or further utilize the
spectrum rectification technique to adjust quality of quantization during an encoding
process. An encoding process where a time-frequency transform has been performed is
illustrated below. Embodiments of the present invention also apply to an encoding
process where time-frequency transform has not been performed.
[0037] Embodiment 1 provides a method for adjusting quality of quantization with a plurality
of scale factors.
[0038] Figure 3 illustrates a schematic diagram of an encoder for adjusting quality of quantization
according to Embodiment 1. In the encoding process, sample values in time domain (time
domain sample values) are first transformed to frequency domain by a time-frequency
transform operation. Then, after the control of a plurality of scale factors, these
sample values are quantized and the quantized sample values are output. An optimal
global gain is calculated by performing gain balancing and inverse time-frequency
transform on the output quantized sample values. Scale factors, quantized sample values
in frequency domain (frequency domain sample values) and a global gain need to be
transmitted in encoding streams.
[0039] Figure 4 illustrates a schematic diagram of a decoder for adjusting quality of quantization
according to Embodiment 1. In a decoding process, after the quantized sample values
in frequency domain are gain-balanced and inversely transformed from frequency domain
to time domain, sample values in time domain are obtained. Finally, these sample values
are multiplied with the global gain to form restored sample values in time domain.
[0040] Figure 5 illustrates steps of adjusting quality of quantization at the encoder with
a plurality of scale factors according to Embodiment 1. The steps are as follows.
[0041] Step 501: Time domain sample values x(n) are transformed to frequency domain sample
values X(k) by virtue of a time-frequency transform.
[0042] Time-frequency transform herein may include a Discrete Fourier Transform (DFT), a
Discrete Cosine Transform (DCT, MDCT, IDCT), a Discrete Wavelet Transform (DWT), etc.
In a time-frequency transform process, Fast Fourier Transform (FFT) can also be applied
for reducing computational complexity.
[0043] Step 502: A plurality of scale factors is used to control frequency domain sample
values X(k). In particular, a plurality of proper scale factors are selected and used
to finely tune the frequency domain sample values on the whole frequency band.
[0044] In the present embodiment, assume that m scale factors are applied to frequency domain
sample values X(k) (k=0,1,...,N) on the whole frequency band and assume that the maximum
allowable number of bits during the encoding process is $b_{max}$. Referring to the flowchart
illustrated in Figure 6, steps of selecting a plurality of proper scale factors and finely
tuning the frequency domain sample values are described hereinafter.
[0045] Step 601: The whole frequency band is divided into m portions
$[0, n_1], [n_1+1, n_2], \dots, [n_{m-1}+1, N]$, and frequency domain sample values
$X(0,1,\dots,n_1)$, $X(n_1+1,n_1+2,\dots,n_2)$, $\dots$, $X(n_{m-1}+1,n_{m-1}+2,\dots,N)$
for the m portions are obtained. The scale factors for the portions are denoted as
$g_1, g_2, \dots, g_m$.
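The direct division of the band in step 601 can be sketched as follows. The function name `split_band` and the representation of each portion as an inclusive `(start, end)` index pair are assumptions made for illustration.

```python
def split_band(last_index, boundaries):
    """Split indices 0..last_index into portions.

    boundaries holds n_1, ..., n_{m-1}; each portion is returned as an
    inclusive (start, end) pair, matching [0, n_1], [n_1+1, n_2], ...
    """
    portions = []
    start = 0
    for end in list(boundaries) + [last_index]:
        if end < start:
            raise ValueError("boundaries must be increasing and within the band")
        portions.append((start, end))
        start = end + 1
    return portions


# Example: N = 15 with boundaries n_1 = 3 and n_2 = 7 gives m = 3 portions.
portions = split_band(15, [3, 7])
```

No filter bank is involved in this sketch; the portions are plain index ranges over the transformed samples.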
[0046] In one embodiment, the plurality of scale factors can be used for a direct division
of the whole frequency band after a time-frequency transform is performed, thereby
eliminating the necessity of first using a group of filters for dividing the spectrum
into several bands and then configuring a scale factor for each band. Compared with
prior art, the present invention may significantly reduce the implementation complexity.
[0047] Step 602: A criterion value $g_0$ is selected for estimating the m scale factors. The
criterion $g_0$ for the scale factors is selected in such a way that the estimated number of
consumed bits $b_0$ is less than the maximum allowable number of bits $b_{max}$.
[0048] In the present embodiment, the number of consumed bits b is associated with the frequency
domain sample values X, the number of frequency domain samples N and the scale factor
g, which can be expressed as a function $b = cons(X, N, g)$. Therefore, in step 602, when
the criterion is selected as $g_0$, the estimated number of consumed bits is
$b_0 = cons(X, N, g_0)$, where $b_0 < b_{max}$.
[0049] Step 603: The m scale factors $g_1, g_2, \dots, g_m$ are adjusted around $g_0$.
[0050] In step 603, m scale factors are adjusted in such a way as to decrease scale factors
at more critical bands and increase scale factors at less critical bands. Here, the
more critical bands refer to low frequency bands while the less critical bands refer
to high frequency bands. Because $g_1 \sim g_m$ correspond to bands from low frequency to
high frequency, the adjusted m scale factors $g'_1, g'_2, \dots, g'_m$ increase gradually.
With such adjustment, the quality of quantization at more critical
bands is relatively good and the quality of quantization at less critical bands is
relatively lower. Consequently, the quality of quantization at the whole frequency
band can be optimized.
[0051] Step 604: It is determined whether the estimated number of consumed bits is no more
than the total number of bits. If not, the process returns to step 603 and the scale
factors are adjusted again. If so, the m scale factors which satisfy the number of consumed
bits are denoted as $g'_1, g'_2, \dots, g'_m$.
[0052] Step 605: Quantization perception distortion is computed based on the m adjusted scale
factors $g'_1, g'_2, \dots, g'_m$.
[0053] In the present embodiment, the quantization perception distortion C is related to
the frequency domain sample values X and the m scale factors $g_1, g_2, \dots, g_m$, which
can be expressed as a function $C = f(X, g_1, g_2, \dots, g_m)$. The quantization perception
distortion C indicates a distortion due to the difference between the original frequency
domain sample values X and the sample values obtained by adjusting the frequency domain
sample values X with the m scale factors $g_1, g_2, \dots, g_m$. In step 605, the quantization
perception distortion computed based on the m adjusted scale factors $g'_1, g'_2, \dots, g'_m$
is denoted as $C = f(X, g'_1, g'_2, \dots, g'_m)$.
[0054] Step 606: It is determined whether the quantization perception distortion is within
an imperceptible range. If so, the m scale factors obtained from the current adjustment
are regarded as the optimal scale factors, which are denoted as
$g_{1opt}, g_{2opt}, \dots, g_{mopt}$. Then, the process proceeds to step 607; otherwise,
the process returns to step 603.
[0055] If the perception distortion is within the imperceptible range, people may not perceive
the quantization noise introduced by the encoder. For instance, for audio encoding,
human ear may not perceive the quantization noise introduced by the encoder. For video
encoding, human eye may not perceive the quantization noise introduced by the encoder.
The specific imperceptible range herein is a specific value interval where distortion
is tolerated. The method for determining whether the quantization perception distortion
is within an imperceptible range includes determining whether the quantization perception
distortion computed at step 605 is within a value interval where distortion is tolerated.
If the quantization perception distortion computed at step 605 is within a value interval
where distortion is tolerated, the quantization perception distortion is regarded
as imperceptible; otherwise, the quantization perception distortion is regarded as
perceptible.
[0056] In the present embodiment, according to the determination result at step 606, in
the case where the quantization perception distortion can be perceived, if the quantization
perception distortion can still be perceived after the adjusting step described above
is repeated M times, the closed-loop selection is terminated and the set of scale factors
which yields the minimum perception distortion is selected as the optimal scale factors
from the scale factors obtained during the repetitive adjustment procedure. Then,
the process proceeds to step 607. In practice, the number M of closed-loop iterations
may be determined based on the actual situation.
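The closed-loop selection of steps 603 to 606 can be sketched as follows. The description leaves the functions cons() and f() unspecified, so `estimate_bits` and `distortion` below are hypothetical stand-ins, and the candidate list `tilts`, the per-portion `weights` and the `threshold` are likewise assumptions introduced only to make the loop concrete.

```python
import math


def estimate_bits(spectrum, portions, gains):
    """Hypothetical stand-in for cons(): sign bit plus magnitude bits per level."""
    bits = 0
    for (start, end), g in zip(portions, gains):
        for k in range(start, end + 1):
            level = abs(round(spectrum[k] / g))
            bits += 1 + math.ceil(math.log2(level + 1))
    return bits


def distortion(spectrum, portions, gains, weights):
    """Hypothetical stand-in for f(): weighted squared quantization error."""
    total = 0.0
    for (start, end), g, w in zip(portions, gains, weights):
        for k in range(start, end + 1):
            restored = round(spectrum[k] / g) * g
            total += w * (spectrum[k] - restored) ** 2
    return total


def select_scale_factors(spectrum, portions, g0, b_max, threshold, weights, tilts):
    """Return (distortion, gains) for the chosen candidate, or None if none fits."""
    m = len(portions)
    best = None
    for tilt in tilts:  # at most M = len(tilts) adjustments
        # Step 603: adjust around g0; low bands get smaller factors (tilt > 1).
        gains = [g0 * tilt ** (i - (m - 1) / 2) for i in range(m)]
        # Step 604: candidates that exceed the bit budget are adjusted again.
        if estimate_bits(spectrum, portions, gains) > b_max:
            continue
        # Step 605: compute the distortion for this candidate.
        c = distortion(spectrum, portions, gains, weights)
        if best is None or c < best[0]:
            best = (c, gains)
        # Step 606: stop as soon as the distortion is tolerated.
        if c <= threshold:
            break
    return best
```

If no candidate reaches the threshold within the allowed number of adjustments, the sketch falls back to the candidate with the smallest distortion seen so far, mirroring the fallback described above.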
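[0057] Step 607: The m optimal scale factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ obtained
are used to finely tune the frequency domain sample values X. That is, the frequency domain
sample values of each portion are divided by the optimal scale factor corresponding to
that portion. The finely tuned spectrum X' obtained can be expressed as
$$X' = \left\{ \frac{X(0,1,\dots,n_1)}{g_{1opt}},\ \frac{X(n_1+1,n_1+2,\dots,n_2)}{g_{2opt}},\ \dots,\ \frac{X(n_{m-1}+1,n_{m-1}+2,\dots,N)}{g_{mopt}} \right\}$$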
[0058] The finely tuned frequency domain sample values X' obtained at steps 601 to 607 are
fed into the encoder.
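The fine tuning of step 607 amounts to a per-portion division, which can be sketched as follows. The function name `fine_tune` and the inclusive `(start, end)` representation of the portions are assumptions made for illustration.

```python
def fine_tune(spectrum, portions, scale_factors):
    """Divide the samples of each portion by the scale factor of that portion."""
    if len(portions) != len(scale_factors):
        raise ValueError("one scale factor is required per portion")
    tuned = list(spectrum)
    for (start, end), g in zip(portions, scale_factors):
        for k in range(start, end + 1):  # end index is inclusive
            tuned[k] = spectrum[k] / g
    return tuned


# Two portions: a smaller factor on the low band, a larger one on the high band.
tuned = fine_tune([2.0, 4.0, 6.0, 8.0], [(0, 1), (2, 3)], [2.0, 4.0])
```

The smaller factor on the first portion leaves relatively larger values there, so the subsequent quantization is finer on the low band than on the high band.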
[0059] Considering the fact that scale factors are needed for restoring data during decoding,
scale factors should be transmitted in the encoding streams. A variety of methods
of transmitting scale factors are introduced below, respectively.
[0060] A first method for transmitting scale factors is to encode the m scale factors
$g_{1opt}, g_{2opt}, \dots, g_{mopt}$ which are used to finely tune the sample values in
frequency domain. Thus, the data can be restored more correctly when being decoded.
[0061] A second method for transmitting scale factors is to select one scale factor as a
criterion scale factor from the m scale factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ which
are used to finely tune the sample values in frequency domain, compute the ratios of the
remaining m-1 scale factors to the criterion scale factor, and encode these m-1 ratios.
For instance, if $g_{1opt}$ is selected as the criterion scale factor, only
$$\frac{g_{2opt}}{g_{1opt}},\ \frac{g_{3opt}}{g_{1opt}},\ \dots,\ \frac{g_{mopt}}{g_{1opt}}$$
need to be coded, thereby reducing the number of consumed bits.
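Transmitting ratios relative to a criterion scale factor can be sketched as follows. The function names `to_ratios` and `from_ratios` and the `ref_index` parameter are assumptions made for illustration; the entropy coding of the values is not shown.

```python
def to_ratios(gains, ref_index=0):
    """Return the m-1 ratios of the remaining scale factors to the criterion one."""
    ref = gains[ref_index]
    return [g / ref for i, g in enumerate(gains) if i != ref_index]


def from_ratios(ref, ratios, ref_index=0):
    """Rebuild all m scale factors when the criterion factor is also known."""
    gains = [ref * r for r in ratios]
    gains.insert(ref_index, ref)
    return gains


gains = [0.5, 1.0, 2.0]
ratios = to_ratios(gains)              # only m-1 values need to be coded
rebuilt = from_ratios(gains[0], ratios)
```

When only the ratios are available to the decoder, the absolute scale factors cannot be rebuilt; `from_ratios` therefore applies only when the criterion scale factor itself is also transmitted.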
[0062] A third method for transmitting scale factors is to select one scale factor as a
criterion scale factor from the m scale factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ which
are used to finely tune the sample values in frequency domain, compute the ratios of the
remaining m-1 scale factors to the criterion scale factor, and encode the criterion scale
factor and these m-1 ratios. For instance, if $g_{1opt}$ is selected as the criterion scale
factor, only $g_{1opt}$ and
$$\frac{g_{2opt}}{g_{1opt}},\ \frac{g_{3opt}}{g_{1opt}},\ \dots,\ \frac{g_{mopt}}{g_{1opt}}$$
need to be encoded. Therefore, not only can the number of consumed bits be reduced,
but the data can also be restored more correctly, since the decoder can compute
$g_{1opt}, g_{2opt}, \dots, g_{mopt}$ from $g_{1opt}$ and the ratios
$$\frac{g_{2opt}}{g_{1opt}},\ \frac{g_{3opt}}{g_{1opt}},\ \dots,\ \frac{g_{mopt}}{g_{1opt}}.$$
[0063] To avoid consuming a large number of bits when a plurality of scale factors are
used, an optimal number of scale factors may be selected in accordance with the requirements
on encoding bit rate and quality of quantization. For instance, 2 to 3 scale factors
may be selected for low bit rate encoding.
[0064] Step 503: Frequency domain sample values X' obtained by controlling a plurality of
scale factors are quantized and quantized frequency domain sample values $X_q$ are output.
[0065] In step 503, other quantization approaches may be employed in accordance with
encoding requirement, such as multistage vector quantization, split vector quantization,
tree-structured vector quantization and trellis coded vector quantization.
[0066] Step 504: The impact imposed by the scale factors is eliminated from the quantized
frequency domain sample values $X_q$ obtained from step 503, and original frequency domain
sample values $X_{balance}$ can thus be restored. That is, $X_{balance}$ can be obtained by
performing a gain balance on the quantized frequency domain sample values $X_q$.
[0067] The gain balancing method varies with the method used for transmitting scale factors.
[0068] If the method for transmitting scale factors is the first method or the third method,
the scale factors $g_{1opt}, g_{2opt}, \dots, g_{mopt}$ selected according to step 502 may
be used for gain balancing. In particular, the quantized frequency domain sample values $X_q$
are also divided into m portions in accordance with the method for dividing frequency
bands as described in step 601. Then, $X_q(0,1,\dots,n_1)$, $X_q(n_1+1,n_1+2,\dots,n_2)$,
$\dots$, $X_q(n_{m-1}+1,n_{m-1}+2,\dots,N)$ are obtained. The quantized frequency domain
sample values of each portion are multiplied with the scale factor of the corresponding
portion. $X_{balance}$ can be expressed as
$$X_{balance} = \left\{ X_q(0,1,\dots,n_1)\, g_{1opt},\ X_q(n_1+1,\dots,n_2)\, g_{2opt},\ \dots,\ X_q(n_{m-1}+1,\dots,N)\, g_{mopt} \right\}$$
[0069] If the method for transmitting scale factors is the second method, the ratios of the
plurality of scale factors can be used for gain balancing. In particular, the quantized
frequency domain sample values $X_q$ are also divided into m portions in accordance with the
method for dividing frequency bands as described in step 601. Then, $X_q(0,1,\dots,n_1)$,
$X_q(n_1+1,n_1+2,\dots,n_2)$, $\dots$, $X_q(n_{m-1}+1,n_{m-1}+2,\dots,N)$ are obtained.
The frequency domain sample values of the portion to which the criterion scale factor
corresponds are multiplied with 1. The quantized frequency domain sample values of each
remaining portion are multiplied with the ratio of the scale factor of that portion to
the criterion scale factor. If scale factor $g_{1opt}$ of the first portion is adopted
as the criterion scale factor, $X_{balance}$ can be expressed as
$$X_{balance} = \left\{ X_q(0,1,\dots,n_1),\ X_q(n_1+1,\dots,n_2)\, \frac{g_{2opt}}{g_{1opt}},\ \dots,\ X_q(n_{m-1}+1,\dots,N)\, \frac{g_{mopt}}{g_{1opt}} \right\}$$
[0070] Step 505: An inverse time-frequency transform is performed on $X_{balance}$ obtained
through gain balancing. The restored frequency domain sample values $X_{balance}$ are
transformed to the restored time domain sample values $x_q(n)$.
[0071] Step 506: The original time domain sample values $x(n)$ and the restored time domain
sample values $x_q(n)$ are used to compute an optimal global gain $g_{gopt}$.
[0072] Here, a global gain $g_g$ is selected as an optimal global gain $g_{gopt}$ such that
the variance between the original time domain sample values and the restored time domain
sample values is at its minimum, i.e., the optimal global gain $g_{gopt}$ renders
$$\sum_{n} \left( x(n) - g_g\, x_q(n) \right)^2$$
at its minimum. Thus, the optimal global gain can be computed from
$$g_{gopt} = \frac{\sum_{n} x(n)\, x_q(n)}{\sum_{n} x_q(n)^2}$$
[0073] The optimal global gain $g_{gopt}$ may also require an encoding transmission so that
the optimal global gain $g_{gopt}$ can be used for data recovery.
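The computation of the global gain in step 506 can be sketched as follows, assuming that the quantity to be minimized is the sum of squared differences between $x(n)$ and $g_g\, x_q(n)$. Under that assumption the minimizer is the ordinary least-squares solution. The function names and the handling of an all-zero restored signal are illustrative choices.

```python
def optimal_global_gain(original, restored):
    """Least-squares gain g minimizing sum((x - g * xq)^2) over all samples."""
    numerator = sum(x * xq for x, xq in zip(original, restored))
    denominator = sum(xq * xq for xq in restored)
    if denominator == 0.0:
        return 0.0  # assumption: any gain is equivalent when the restored signal is all zero
    return numerator / denominator


def gain_error(original, restored, gain):
    """Squared error between the original samples and the gain-scaled restored samples."""
    return sum((x - gain * xq) ** 2 for x, xq in zip(original, restored))


original = [2.0, 4.0, 6.0]
restored = [1.0, 2.0, 3.0]
g_opt = optimal_global_gain(original, restored)
```

Setting the derivative of the squared error with respect to the gain to zero gives the ratio used in `optimal_global_gain`, so any other gain yields an error that is at least as large.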
[0074] The foregoing is a procedure for adjusting quality of quantization at the encoder
by using a plurality of scale factors. Corresponding to adjusting quality of quantization
during encoding, the process of restoring the sample values in time domain at the
decoder based on the decoded quantized sample values in frequency domain is illustrated
in Figure 7. The process includes the following steps.
[0075] Step 701: Scale factors obtained from the encoding streams are used for gain balancing
for the quantized sample values in frequency domain. The implementation is similar
to the method described in step 504, which is omitted herein for brevity. It should
be noted that the gain balancing method may vary with the method of transmitting
scale factors. In addition, the gain balancing method at the encoder and the gain
balancing method at the decoder should also be consistent with each other.
[0076] Step 702: Inverse time-frequency transform is performed on the sample values in frequency
domain which have been gain balanced and the sample values in time domain are thus
obtained.
[0077] Step 703: Restored sample values in time domain are obtained by multiplying the sample
values in time domain with the global gain obtained from the coded streams.
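Steps 701 to 703 can be sketched as follows. The function names, the inclusive `(start, end)` representation of the portions, and the passing of the inverse time-frequency transform as a callable are assumptions made for illustration; the `multipliers` are either the scale factors themselves or their ratios to the criterion scale factor, depending on the transmission method.

```python
def gain_balance(quantized, portions, multipliers):
    """Step 701: multiply the samples of each portion by the multiplier of that portion."""
    balanced = list(quantized)
    for (start, end), g in zip(portions, multipliers):
        for k in range(start, end + 1):  # end index is inclusive
            balanced[k] = quantized[k] * g
    return balanced


def decode(quantized, portions, multipliers, global_gain, inverse_transform):
    """Run gain balancing, the inverse transform, and global gain scaling in order."""
    balanced = gain_balance(quantized, portions, multipliers)   # step 701
    time_samples = inverse_transform(balanced)                  # step 702
    return [global_gain * v for v in time_samples]              # step 703


# Demonstration with an identity function in place of a real inverse transform.
restored = decode(
    [1.0, 2.0, 3.0, 4.0],
    [(0, 1), (2, 3)],
    [0.5, 2.0],
    2.0,
    lambda values: list(values),
)
```

A real decoder would pass the inverse of whichever time-frequency transform the encoder used; the identity function here only keeps the example self-contained.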
[0078] The scheme of multiple scale factors control adopted in Embodiment 1 may be applied
directly to sample values in time domain, which means that the scheme may be applied
to the case where no time-frequency transform is performed. Accordingly, no inverse
time-frequency transform is involved during the computation of the global gain. In
this case, when a plurality of scale factors are being configured, the sample values
in time domain can be divided by time intervals. When adjusting the plurality of scale
factors, scale factors associated with more critical time intervals are decreased
and the scale factors associated with less critical time intervals are increased.
[0080] Embodiment 2 provides a method for adjusting quality of quantization with a plurality
of scale factors and spectrum rectification.
[0081] Figure 8 illustrates a schematic diagram of an encoder for adjusting quality of quantization
according to Embodiment 2. In the encoding process, sample values in time domain are
first transformed to frequency domain by a time-frequency transform operation. Then,
after the spectrum pre-rectification and the control of a plurality of scale factors,
these samples are quantized and the quantized sample values are output. An optimal
global gain is calculated by performing gain balancing, inverse spectrum rectification
and inverse time-frequency transform on the output quantized sample values. Scale
factors, quantized sample values in frequency domain and a global gain need to be
transmitted in an encoding stream.
[0082] Figure 9 illustrates a schematic diagram of a decoder for adjusting quality of quantization
according to Embodiment 2. In a decoding process, after the quantized sample values
in frequency domain go through a gain-balancing, inverse spectrum rectification and
inverse time-frequency transform, sample values in time domain are obtained. Finally,
these sample values are multiplied with the global gain to form restored sample values
in time domain.
[0083] In addition to the process illustrated in Figure 5 according to Embodiment 1, the
method for adjusting quality of quantization with a plurality of scale factors and
peak rectification according to Embodiment 2 may further include a spectrum pre-rectification
step between the time-frequency transform at step 501 and the control of scale factors
at step 502 and may further include an inverse spectrum rectification step between
the gain balancing at step 504 and the inverse time-frequency transform at step 505.
The spectrum pre-rectification and inverse spectrum rectification are now detailed
below.
[0084] Figure 10 illustrates a schematic diagram of spectrum pre-rectification which is
implemented by the following steps.
[0085] Step 1001: A spectrum rectification area is determined and, in this spectrum rectification
area, a peak set for the sample values in frequency domain obtained at step 501 is
marked as $\{p_m,\ m=1,\dots,M\}$.
[0086] The spectrum rectification area herein refers to a spectrum area at the more critical
bands. For instance, for audio data, since the human auditory system has a high resolution
at low frequencies, the low frequencies are considered the more critical bands. As
another instance, for data such as videos and images, since most of the data information
is distributed at low frequencies, the low frequencies are considered the more critical
bands. Therefore, the spectrum rectification area may take the front part of the whole
band, such as the first quarter of the band.
[0087] Therefore, peak $p_k$ may be defined as a local maximum value among amplitudes in the
spectrum rectification area. If $X(i) > X(j),\ \forall j \in [i-\Delta,\ i+\Delta],\ i \neq j$,
then $X(i)$ is a local maximum value among the $2\Delta+1$ points within $[i-\Delta,\ i+\Delta]$,
where the local area can be selected at random.
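By way of a non-limiting illustration, the peak marking of step 1001 can be sketched as follows. The function name `mark_peaks` and the parameters `delta` and `area_fraction` are hypothetical and do not appear in the embodiments; the default of one quarter of the band follows the example given for the spectrum rectification area, and a fixed window half-width is used here only for simplicity, whereas the local area may be selected freely.

```python
def mark_peaks(spectrum, delta=2, area_fraction=0.25):
    """Return indices of local amplitude maxima in the rectification area.

    Index i is marked as a peak when |X(i)| is strictly greater than
    |X(j)| for every other j in [i - delta, i + delta]. The window is
    clipped to the bounds of the spectrum (an illustrative choice).
    """
    amplitudes = [abs(x) for x in spectrum]
    # Rectification area: the front part of the band (hypothetical default 1/4).
    area_end = max(1, int(len(amplitudes) * area_fraction))
    peaks = []
    for i in range(area_end):
        lo = max(0, i - delta)
        hi = min(len(amplitudes) - 1, i + delta)
        if all(amplitudes[i] > amplitudes[j]
               for j in range(lo, hi + 1) if j != i):
            peaks.append(i)
    return peaks


# Example: 32 frequency-domain samples, so the area covers indices 0..7.
example_spectrum = [1.0, -5.0, 2.0, 1.0, 3.0, 9.0, 3.0, 1.0] + [0.0] * 24
example_peaks = mark_peaks(example_spectrum)
```

In this example the samples at indices 1 and 5 are marked, since each exceeds in amplitude all of its neighbours within two positions.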
[0088] Step 1002: Reference $p_{ref}$ for spectrum pre-rectification is computed.
[0089] The principle of selecting the reference is to keep the value of the reference unchanged
before and after spectrum rectification. In step 1002, the maximum value in peak set
$\{p_m,\ m = 1, \dots, M\}$ is regarded as reference $p_{ref}$. Alternatively, a local maximum
energy value can be regarded as reference $p_{ref}$. Considering the impact of the quantization
error, a characteristic parameter of a block of data can be regarded as reference $p_{ref}$
in order to lessen the impact of the quantization error on the reference. Preferably,
the energy level at a data point close to the maximum peak in the peak set
$\{p_m,\ m = 1, \dots, M\}$ may be selected as reference $p_{ref}$. Alternatively, the average
energy of the data points close to the maximum peak in the peak set
$\{p_m,\ m = 1, \dots, M\}$ may be selected as reference $p_{ref}$.
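The alternatives for selecting the reference can be illustrated with the following minimal sketch. The function `reference_from_peaks`, its `mode` strings and the `radius` parameter are hypothetical names introduced only for this illustration. Energy is assumed here to be the squared amplitude, and "a data point close to the maximum peak" is assumed to be the immediately following point (or the preceding one at the end of the spectrum); the embodiments do not fix either choice.

```python
def reference_from_peaks(spectrum, peak_indices, mode="max_peak", radius=1):
    """Compute a reference value from the marked peaks.

    mode="max_peak":              amplitude of the maximum peak
    mode="neighbour_energy":      energy of one point next to the maximum peak
    mode="mean_neighbour_energy": average energy of the points within
                                  `radius` of the maximum peak (peak excluded)
    """
    amplitudes = [abs(x) for x in spectrum]
    top = max(peak_indices, key=lambda i: amplitudes[i])
    if mode == "max_peak":
        return amplitudes[top]
    if mode == "neighbour_energy":
        # Assumed choice: the next point, or the previous one at the band edge.
        j = top + 1 if top + 1 < len(amplitudes) else top - 1
        if j < 0:
            raise ValueError("spectrum needs at least two points")
        return amplitudes[j] ** 2
    if mode == "mean_neighbour_energy":
        neighbours = [j for j in range(top - radius, top + radius + 1)
                      if j != top and 0 <= j < len(amplitudes)]
        if not neighbours:
            raise ValueError("no neighbouring points available")
        return sum(amplitudes[j] ** 2 for j in neighbours) / len(neighbours)
    raise ValueError("unknown mode: " + mode)
```

Because the points next to the maximum peak are not themselves amplified, a reference derived from them is unaffected by the rectification, which is the property the selection principle asks for.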
[0090] Step 1003: Gain factor $R_m$ of each peak $p_m$ in the peak set
$\{p_m,\ m = 1, \dots, M\}$ is computed as
$$R_m = C_m \left(\frac{p_{ref}}{p_m}\right)^{k}, \quad k \in (0,1),$$
where parameters $C_m$ and $k$ may be selected according to the actual situation.
[0091] Step 1004: The computed gain factors of the peaks are used to amplify the peaks.
[0092] To ensure that reference $p_{ref}$ is constant, apart from the peaks which are used to
calculate reference $p_{ref}$, the remaining peaks $p_m$ should be multiplied by their
corresponding gain factors $R_m$. The amplified peaks are expressed as
$p'_m = p_m \cdot R_m$.
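Steps 1003 and 1004 can be sketched as follows, under the assumption that the gain factor takes the form recited in claim 29, namely $C_m$ times the $k$-th power of the ratio of the reference to the peak. The names `gain_factor` and `pre_rectify` and the default values of `c` and `k` are hypothetical. For simplicity the reference is taken as the amplitude of the maximum peak and a single `c` is shared by all peaks; both are illustrative choices only.

```python
def gain_factor(peak, reference, c=1.0, k=0.5):
    """Gain factor R = c * (reference / peak) ** k with 0 < k < 1."""
    if not 0.0 < k < 1.0:
        raise ValueError("k must lie strictly between 0 and 1")
    return c * (reference / peak) ** k


def pre_rectify(spectrum, peak_indices, c=1.0, k=0.5):
    """Amplify every marked peak except the one used as the reference."""
    out = list(spectrum)
    ref_index = max(peak_indices, key=lambda i: abs(spectrum[i]))
    reference = abs(spectrum[ref_index])
    for i in peak_indices:
        if i == ref_index:
            continue  # the reference peak stays unchanged
        out[i] = spectrum[i] * gain_factor(abs(spectrum[i]), reference, c, k)
    return out
```

With a maximum peak of amplitude 16 and a second peak of amplitude 4, the choice `c=1.0`, `k=0.5` gives a gain factor of 2, so the smaller peak is raised to amplitude 8 while the reference peak is left as it is.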
[0093] Considering the fact that the human auditory system may have a high resolution at
low frequencies, the peak energies can be captured by a quantizer by simply amplifying
the peak energies at low frequencies. Therefore, in Embodiment 2, only a few frequency
points, or peaks, need to be amplified. In the present embodiment, the spectrum pre-rectification
technique may also be referred to as peak pre-rectification. With such a peak pre-rectification
technique, the resulting increase in the global gain is small, and the quantization error
caused by the global gain increase may be negligible. For a better outcome of the
spectrum rectification, the frequency points neighboring the peaks can also be
amplified. For instance, in addition to amplifying a local peak among $2\Delta+1$ points,
$2\Delta$ points or fewer than $2\Delta$ points adjacent to the peak may be amplified by
corresponding gain factors.
[0094] With the above described spectrum pre-rectification, the peaks of frequency domain
sample values at the more critical bands are enhanced, thereby reducing the quantization error
at peaks of sample values in frequency domain at the more critical bands and reducing
the possibility of losing the spectrum peaks at the more critical bands during quantization.
[0095] In an encoder, to calculate the optimal global gain, the sample values in time domain
have to be restored from the quantized sample values in frequency domain. If spectrum
pre-rectification is employed, inverse spectrum rectification needs to be performed
on $X_{balance}$, which has been gain balanced at step 504. The process is illustrated in
Figure 11 and includes the following steps.
[0096] Step 1101: In $X_{balance}$ obtained from step 504, peak set
$\{q_m,\ m = 1, \dots, M\}$ for sample values in frequency domain is marked in the spectrum
rectification area. The spectrum rectification area and the peak marking principle during
inverse spectrum rectification are the same as those in the process of spectrum pre-rectification.
[0097] Step 1102: Reference $q_{ref}$ for inverse spectrum rectification is computed. The
principle of computing the reference value during inverse spectrum rectification is also the
same as that in the process of spectrum pre-rectification. For instance, if, during the spectrum
pre-rectification process, the energy at a data point adjacent to the maximum peak in the
peak set $\{p_m,\ m = 1, \dots, M\}$ is regarded as the reference value, then, during the
inverse spectrum rectification process, the energy at the data point adjacent to the maximum
peak in the peak set $\{q_m,\ m = 1, \dots, M\}$ should also be regarded as the reference value.
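[0098] Step 1103: Reduction factor $r_m$ for each peak $q_m$ in the peak set
$\{q_m,\ m = 1, \dots, M\}$ is computed as
$$r_m = \left[ C_m \left(\frac{q_{ref}}{q_m}\right)^{k} \right]^{\frac{1}{1-k}}, \quad k \in (0,1),$$
where $C_m$ and $k$ have to be consistent with the parameters in the spectrum pre-rectification.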
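[0099] The principle of computing reduction factor $r_m$ in the inverse spectrum rectification
is described as follows: in the spectrum pre-rectification process, the gain factor is
$$R_m = C_m \left(\frac{p_{ref}}{p_m}\right)^{k}, \quad k \in (0,1).$$
If the value of a certain peak is $p$, the amplified peak is
$$p' = p \cdot C_m \left(\frac{p_{ref}}{p}\right)^{k} = C_m\, p_{ref}^{\,k}\, p^{\,1-k}.$$
According to this equation, $p$ can be expressed as
$$p = \left(\frac{p'}{C_m\, p_{ref}^{\,k}}\right)^{\frac{1}{1-k}}.$$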
[0100] As can be seen from the foregoing principle of computing the reduction factor in the
inverse spectrum rectification, there is no need to transmit the reference value for
inverse spectrum rectification in the encoding streams. The same principle can also be
applied to the decoder. The reference for inverse spectrum rectification can be computed
from the characteristics of the sample values available at the decoder itself, and the
reduction factor of each corresponding peak can then be calculated. Therefore, no extra
bits are consumed.
[0101] Step 1104: The computed reduction factors of the peaks are used to decrease the peaks.
In inverse spectrum rectification, the peaks which have been amplified in spectrum
pre-rectification should be decreased. If the peaks other than those used to compute the
reference value are amplified in the spectrum pre-rectification process, those same peaks
should also be decreased in the inverse spectrum rectification. That is, apart from the
peaks used for computing the reference value $q_{ref}$, the remaining peaks $q_m$ are divided
by their corresponding reduction factors $r_m$. The decreased peaks are expressed as
$q'_m = q_m / r_m$.
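The relationship between amplification and reduction can be checked numerically with the sketch below. The reduction factor used here is not quoted from the embodiments; it is obtained algebraically by combining $p' = p \cdot R$ with $q' = q / r$ under the assumptions that the gain factor has the form recited in claim 29 and that the reference is identical on both sides. The function names and the parameter values are hypothetical, and positive amplitudes are assumed.

```python
import math


def amplify(p, p_ref, c=1.0, k=0.5):
    """Pre-rectification: p' = p * c * (p_ref / p) ** k."""
    return p * c * (p_ref / p) ** k


def reduction_factor(q, q_ref, c=1.0, k=0.5):
    """Derived (assumed) form: r = (c * (q_ref / q) ** k) ** (1 / (1 - k))."""
    return (c * (q_ref / q) ** k) ** (1.0 / (1.0 - k))


def restore(q, q_ref, c=1.0, k=0.5):
    """Inverse rectification: q' = q / r."""
    return q / reduction_factor(q, q_ref, c, k)


# Round trip without quantization: the original peak is recovered
# up to floating-point rounding.
original = 3.0
reference = 10.0
amplified = amplify(original, reference, c=1.2, k=0.3)
recovered = restore(amplified, reference, c=1.2, k=0.3)
round_trip_ok = math.isclose(recovered, original, rel_tol=1e-9)
```

In an actual encoder and decoder the amplified peak passes through quantization before it is reduced, so the recovery is approximate rather than exact; the sketch only shows that the two factors are mutually consistent when computed from the same reference, `c` and `k`.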
[0102] After inverse spectrum rectification is performed according to the above described
steps, the sample values in frequency domain obtained from the inverse spectrum rectification
are transformed from frequency domain to time domain at step 505.
[0103] In Embodiment 2, since spectrum pre-rectification is performed between the time-frequency
transform process and the process of the control of scale factors, the decoder may
need to perform, accordingly, an inverse spectrum rectification between the gain balancing
process and the inverse time-frequency transform process. The detailed implementation
is similar to that of the method of inverse spectrum rectification in the above encoding
process and is omitted herein for brevity.
[0104] In the above Embodiment 2, the spectrum pre-rectification is performed before the
scale factors are controlled. Alternatively, in the encoding process, the scale factors
may be controlled prior to the spectrum pre-rectification. Accordingly, in the
process of restoring the original sample values during encoding and in the decoding
process, inverse spectrum rectification may be performed prior to gain balancing.
This situation is not described in further detail.
[0106] Embodiment 3 provides a method for adjusting quality of quantization by spectrum
rectification.
[0107] Figure 12 illustrates a schematic diagram of an encoder for adjusting quality of
quantization according to Embodiment 3. In the encoding process, sample values in
time domain are first transformed to frequency domain by a time-frequency transform
operation. Then, after the spectrum pre-rectification, these samples are quantized
and the quantized sample values are output. An optimal global gain is calculated by
performing inverse spectrum rectification and inverse time-frequency transform on
the output quantized sample values. Quantized sample values in frequency domain and
a global gain need to be transmitted in the encoding streams.
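As a non-limiting illustration of the global gain computation, the sketch below uses the criterion recited in claim 31, interpreted here (as an assumption) as minimizing the sum of squared differences between the original time-domain samples and the restored samples multiplied by the gain. Under that interpretation the minimizing gain has the standard least-squares closed form. The function name `optimal_global_gain` is hypothetical.

```python
def optimal_global_gain(original, restored):
    """Gain g minimizing sum((x - g * y) ** 2) over paired samples.

    Setting the derivative with respect to g to zero gives
    g = sum(x * y) / sum(y * y).
    """
    if len(original) != len(restored):
        raise ValueError("sample groups must have the same length")
    numerator = sum(x * y for x, y in zip(original, restored))
    denominator = sum(y * y for y in restored)
    if denominator == 0.0:
        # All restored samples are zero, so every gain gives the same error;
        # returning 0.0 is an arbitrary illustrative choice.
        return 0.0
    return numerator / denominator
```

For example, if the restored samples are `[1.0, 2.0, 3.0]` and the original samples are `[2.0, 4.0, 6.0]`, the function returns 2.0, which reproduces the original samples exactly when applied.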
[0108] Figure 13 illustrates a schematic diagram of a decoder for adjusting quality of quantization
according to Embodiment 3. In a decoding process, after the quantized sample values
in frequency domain go through an inverse spectrum rectification and inverse time-frequency
transform, sample values in time domain are obtained. Finally, these sample values
are multiplied with the global gain to form restored sample values in time domain.
[0109] In Embodiment 3, the methods of spectrum pre-rectification and the inverse spectrum
rectification and the technical effects thereof are the same as those in Embodiment
2, which are omitted herein for brevity.
[0111] An apparatus for adjusting quality of quantization according to Embodiment 4 is provided.
[0112] Corresponding to methods in Embodiment 2, Figure 14 illustrates a block diagram of
an apparatus for adjusting quality of quantization at an encoder according to Embodiment
4. As illustrated in Figure 14, the apparatus for adjusting quality of quantization
at the encoder may include a time-frequency transform unit, a spectrum pre-rectification
unit, a multiple scale factors control unit, a quantization unit, a gain balancing
unit, an inverse spectrum rectification unit, an inverse time-frequency transform
unit, and a global gain computing unit. The time-frequency transform unit receives
a first group of sample values, performs a time-frequency transform on the first group
of sample values and outputs to the spectrum pre-rectification unit. The spectrum
pre-rectification unit receives the first group of sample values output from the time-frequency
transform unit, performs a spectrum pre-rectification on the first group of sample
values and outputs to the multiple scale factors control unit. The multiple scale
factors control unit receives the first group of sample values, configures two or
more scale factors for the first group of sample values, adjusts the first group of
sample values with the scale factors, and outputs the adjusted first group of sample values
to the quantization unit. The quantization unit quantizes the received first group of sample
values, obtains quantized sample values and outputs the quantized sample values to
the gain balancing unit. The gain balancing unit receives the quantized sample values,
eliminates the influence imposed by the scale factors on the quantized sample values,
obtains a second group of sample values, and outputs the second group of sample values
to the inverse spectrum rectification unit. The inverse spectrum rectification unit
receives the second group of sample values output from the gain balancing unit, performs
an inverse spectrum rectification on the second group of sample values and outputs
to the inverse time-frequency transform unit. The inverse time-frequency transform
unit receives the second group of sample values from the inverse spectrum rectification
unit, performs an inverse time-frequency transform on the second group of sample values
and outputs to the global gain computing unit. The global gain computing unit receives
the first group of sample values and the second group of sample values, and obtains
the global gain by using the first group of sample values and the second group of
sample values.
[0113] The multiple scale factors control unit includes a scale factor configuration unit
and a sample value adjusting unit. The scale factor configuration unit is configured
to configure two or more scale factors for the first group of sample values and outputs
the configured scale factor to the sample value adjusting unit. The sample value adjusting
unit is configured to receive scale factors and adjust the first group of sample values
with the scale factors.
[0114] The scale factor configuration unit includes a criteria setting unit, a scale factor
adjusting unit, a unit for estimating the number of consumed bits, and a perception distortion
computing unit. The criteria setting unit is configured to set criteria for the scale
factors and output the criteria to the scale factor adjusting unit. The scale factor
adjusting unit is configured to adjust the scale factors based on the criteria and
output the adjusted scale factors to the unit for estimating the number of consumed
bits and the perception distortion computing unit. The unit for estimating the number
of consumed bits is configured to estimate the number of consumed bits based on the
scale factors and determine if the number of consumed bits is less than the total
number of bits allowable by an encoding process and transmit a determination result
to the scale factor adjusting unit. The perception distortion computing unit is configured
to calculate perception distortion based on the scale factors, determine whether the
perception distortion is within an imperceptible range and transmit the determination
result to the scale factor adjusting unit.
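The interaction of the scale factor adjusting unit, the unit for estimating the number of consumed bits and the perception distortion computing unit can be sketched as the loop below. Everything named here is hypothetical: `search_scale_factors` and its parameters are introduced for illustration, and the functions `toy_bits`, `toy_distortion` and `toy_adjust` are invented stand-ins, since the embodiments do not define the bit estimate, the distortion measure or the adjustment rule at this point. The sketch assumes that a scale factor acts as a divisor, so that a larger factor costs fewer bits but adds distortion, and it falls back to the least-distorting candidate when no candidate becomes imperceptible within the allowed number of rounds.

```python
def search_scale_factors(initial, adjust, estimate_bits, distortion,
                         bit_budget, max_distortion, max_rounds=16):
    """Adjust scale factors until bits fit the budget and distortion is low."""
    factors = list(initial)
    best = None  # (distortion, factors) among candidates that fit the budget
    for _ in range(max_rounds):
        if estimate_bits(factors) >= bit_budget:
            # Too many bits: keep adjusting before looking at distortion.
            factors = adjust(factors, over_budget=True)
            continue
        d = distortion(factors)
        if d <= max_distortion:
            return factors  # within the imperceptible range
        if best is None or d < best[0]:
            best = (d, list(factors))
        factors = adjust(factors, over_budget=False)
    return best[1] if best is not None else None


# Invented stand-ins for a two-band example: [low band, high band].
def toy_bits(f):
    return 16.0 / f[0] + 16.0 / f[1]


def toy_distortion(f):
    # The low (more critical) band is weighted more heavily.
    return f[0] + 0.0625 * f[1]


def toy_adjust(f, over_budget):
    low, high = f
    if over_budget:
        return [low, high * 2.0]   # coarser quantization in the high band
    return [low / 2.0, high]       # finer quantization in the low band


result = search_scale_factors([4.0, 1.0], toy_adjust, toy_bits,
                              toy_distortion, bit_budget=10.0,
                              max_distortion=3.5)
```

With these stand-ins the search ends with the low-band factor reduced and the high-band factor increased, which mirrors the idea of spending bits on the more critical bands.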
[0115] The spectrum pre-rectification unit includes a peak marking unit, a reference computing
unit, a gain factor computing unit and a pre-rectification unit. The peak marking
unit is configured to receive the first group of sample values, mark a peak among
the first group of sample values within the spectrum rectification area, and output
the peak to the reference computing unit. The reference computing unit is configured
to compute based on the peak a reference for spectrum pre-rectification and output
the reference to the gain factor computing unit. The gain factor computing unit is
configured to compute based on the reference a gain factor for each marked peak and
output the gain factor to the pre-rectification unit. The pre-rectification unit is
configured to pre-rectify the spectrum with the gain factor.
[0116] The inverse spectrum rectification unit includes a peak marking unit, a reference
computing unit, a reduction factor computing unit and an inverse rectification unit.
The peak marking unit is configured to receive the sample values, mark peaks among
the sample values within the spectrum rectification area, and output the marked peaks
to the reference computing unit. The reference computing unit is configured to compute
based on the peaks the reference for inverse spectrum rectification and output the
reference to the reduction factor computing unit. The reduction factor computing unit
is configured to compute based on the reference a reduction factor for each marked
peak and output the reduction factor to the inverse rectification unit. The inverse
rectification unit is configured to inversely rectify the spectrum with the reduction
factor.
[0117] Corresponding to methods in Embodiment 2, Figure 15 illustrates a block diagram of
an apparatus for adjusting quality of quantization at a decoder according to Embodiment
4. As illustrated in Figure 15, the apparatus for adjusting quality of quantization
at the decoder includes a gain balancing unit, an inverse spectrum rectification unit,
an inverse time-frequency transform unit and a global gain balancing unit. The gain
balancing unit is configured to receive the quantized sample values and scale factors,
utilize the received scale factors to eliminate the influence of the scale factors
from the quantized sample values and obtain sample values, and output the sample values
to the inverse spectrum rectification unit. The inverse spectrum rectification unit
receives the sample values output from the gain balancing unit, performs an inverse
spectrum rectification on the sample values and outputs to the inverse time-frequency
transform unit. The inverse time-frequency transform unit receives the sample values
from the inverse spectrum rectification unit, performs an inverse time-frequency transform
on the sample values and outputs to the global gain balancing unit. The global gain
balancing unit receives a global gain and sample values, multiplies the sample values
by the global gain and outputs the products. The global gain balancing unit
may be a multiplier. As in the encoder, the inverse spectrum rectification unit of
the decoder includes a peak marking unit, a reference computing unit, a reduction
factor computing unit and an inverse rectification unit. The peak marking unit is
configured to receive the sample values, mark peaks among the sample values within
the spectrum rectification area, and output the marked peaks to the reference computing
unit. The reference computing unit is configured to compute based on the peaks the
reference for inverse spectrum rectification and output the reference to the reduction
factor computing unit. The reduction factor computing unit is configured to compute
based on the reference a reduction factor for each marked peak and output the reduction
factor to the inverse rectification unit. The inverse rectification unit is configured
to inversely rectify the spectrum with the reduction factor.
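The order of operations at the decoder can be illustrated with the sketch below. The names `gain_balance` and `decode` and the parameter `portion_size` are hypothetical. The sketch assumes that the encoder divided each portion of the spectrum by its scale factor before quantization, so that gain balancing multiplies by the same factor, and that the portions have equal size. The inverse spectrum rectification and the inverse time-frequency transform are passed in as functions and default to the identity so that the example stays self-contained.

```python
def gain_balance(quantized, scale_factors, portion_size):
    """Remove the effect of per-portion scale factors (assumed divisors)."""
    return [q * scale_factors[i // portion_size]
            for i, q in enumerate(quantized)]


def decode(quantized, scale_factors, global_gain, portion_size,
           inverse_rectify=lambda x: x, inverse_transform=lambda x: x):
    """Gain balancing, inverse rectification, inverse transform, global gain."""
    balanced = gain_balance(quantized, scale_factors, portion_size)
    spectrum = inverse_rectify(balanced)
    samples = inverse_transform(spectrum)
    return [global_gain * s for s in samples]


# Two portions of two samples each, with identity placeholders for the
# rectification and transform stages.
restored = decode([2, -1, 3, 0], [0.5, 4.0], global_gain=1.5, portion_size=2)
```

Here the first portion is scaled by 0.5 and the second by 4.0 before the global gain of 1.5 is applied to every sample.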
[0118] Corresponding to the methods of Embodiments 1 and 3, and implementations thereof, apparatuses
for adjusting quality of quantization with different structures can be contemplated.
The functionality of each unit in the apparatus has been described above in detail
and is omitted herein for brevity.
[0119] Embodiments described above may be applicable to various encoding fields such as
audio encoding, video encoding, and image encoding.
[0120] With the description of the foregoing embodiments, it is readily appreciated by those
skilled in the art that the present invention may be implemented with software on
a necessary hardware platform. The embodiments may also be implemented with hardware,
but in most cases the former approach is preferable. Based on this understanding,
the technical solutions of the present invention, or the part by which the present invention
contributes over the prior art, may be embodied in a software product. The computer
software product may be stored in a readable storage medium. The software product may
include a set of instructions enabling a computer (which may be a personal computer, a server,
a network device, etc.) to perform methods according to various embodiments of
the present invention. The foregoing disclosure describes only a few embodiments of the present
invention, and the present invention is not limited to these embodiments.
Any modification made by those skilled in the art shall be construed as
falling within the scope of the present invention.
[0121] The foregoing are merely procedures and method embodiments of the present invention,
which shall not be construed as a limitation to the present invention. Any modifications,
equivalents, improvements, etc., made within the spirit and principle of the present
invention shall be construed as falling within the scope of the present invention.
1. A method for adjusting quality of quantization for encoding,
characterized in comprising:
adjusting a first group of sample values for encoding with at least two scale factors;
quantizing the adjusted first group of sample values to obtain the quantized sample
values;
eliminating the impact of the scale factors from the quantized sample values to obtain
a second group of sample values;
obtaining a global gain based on the first group of sample values and the second group
of sample values; and
outputting the quantized sample values, information of the at least two scale factors
and the global gain as an encoding stream.
2. The method of claim 1,
characterized in that,
the first group of sample values and the second group of sample values are sample
values in time domain; and
before adjusting the first group of sample values, the method further comprises: transforming
the first group of sample values in time domain into the first group of sample values
in frequency domain;
the adjusting a first group of sample values for encoding with at least two scale
factors comprises: utilizing the scale factors to adjust the first group of sample
values in frequency domain;
the quantizing the adjusted first group of sample values to obtain quantized sample
values comprises: quantizing the adjusted first group of sample values in frequency
domain to obtain the quantized sample values;
the eliminating the impact of the scale factors from the quantized sample values to
obtain a second group of sample values comprises: eliminating the impact of the scale
factors from the quantized sample values to obtain a second group of sample values
in frequency domain;
after obtaining the second group of sample values and before obtaining the global
gain, the method further comprises: transforming the second group of sample values
in frequency domain into the second group of sample values in time domain;
the obtaining a global gain based on the first group of sample values and the second
group of sample values comprises: obtaining the global gain by utilizing the first
group of sample values in time domain and the second group of sample values in time
domain.
3. The method of claim 2, characterized in that the transforming the first group of sample values in time domain into the first group
of sample values in frequency domain comprises: transforming the first group of sample
values in time domain into the first group of sample values in frequency domain based
on a Discrete Fourier Transform, or a Fast Fourier Transform, or a Discrete Cosine
Transform, or a Discrete Wavelet Transform.
4. The method of claim 2, characterized in that the at least two scale factors are scale factors configured for the first group of
sample values in frequency domain.
5. The method of claim 4, characterized in that the configuring at least two scale factors for the first group of sample values in
frequency domain comprises: dividing the first group of sample values in frequency
domain into two or more portions and configuring a scale factor for each portion.
6. The method of claim 5, characterized in that the process of utilizing the scale factors to adjust the first group of sample values
in frequency domain comprises: utilizing scale factors to adjust corresponding portions
of the first group of sample values in frequency domain, respectively.
7. The method of claim 6,
characterized in that the process of eliminating the impact of the scale factors from the quantized sample
values comprises:
dividing the quantized sample values into two or more portions in accordance with
the method of dividing the first group of sample values in frequency domain; and
utilizing the scale factor of each portion to eliminate the impact of the scale factor
from the corresponding portion of the quantized sample value.
8. The method of claim 7, characterized in that the process of outputting the information of the at least two scale factors as an
encoding stream comprises: outputting at least two scale factors as an encoding stream.
9. The method of claim 6,
characterized in that after configuring a scale factor for each portion, the method further comprises:
selecting one scale factor of one portion among the scale factors as a criteria scale
factor; and computing ratios of the scale factors of the remaining portions to the
criteria scale factor;
eliminating the impact of the scale factors from the quantized sample values comprises
dividing the quantized sample values into two or more portions in accordance with
the method of dividing the first group of sample values in frequency domain; and utilizing
the obtained ratios to eliminate the impact of the scale factors from the corresponding
portions of the quantized sample values.
10. The method of claim 9, characterized in that the process of outputting the information of the at least two scale factors as an
encoding stream comprises: outputting the ratios of the scale factors of the remaining
portions to the criteria scale factor as an encoding stream.
11. The method of claim 9,
characterized in that the process of eliminating the impact of the scale factors from the quantized sample
values comprises:
dividing the quantized sample values into two or more portions in accordance with
the method of dividing the first group of sample values in frequency domain;
utilizing the criteria scale factor and the obtained ratios to compute a scale factor
for each portion; and
utilizing the scale factor of each portion to eliminate the impact of the scale factor
from the corresponding portion of the quantized sample values.
12. The method of claim 11, characterized in that the process of outputting the information of the at least two scale factors as an
encoding stream comprises: outputting the criteria scale factor and the ratios of the scale
factors of the remaining portions to the criteria scale factor as an encoding stream.
13. The method of claim 6, characterized in that the process of configuring a scale factor for each portion comprises: adjusting the
scale factor for each portion based on the number of consumed bits and perception
distortion to obtain an optimal scale factor for each portion.
14. The method of claim 13,
characterized in that the process of adjusting the scale factor for each portion to obtain an optimal scale
factor comprises:
setting a criteria for the scale factors such that the number of consumed bits is
less than the total number of bits allowable by the encoding;
adjusting the scale factor for each portion based on the criteria;
determining whether the adjusted scale factors make the number of consumed bits less
than the total number of bits allowable by the encoding; if the condition is not met,
continuing performing the process of adjusting scale factors until the condition is
met; if the condition is met, computing the perception distortion;
determining whether the perception distortion is within an imperceptible range; if
it is determined that the perception distortion is within an imperceptible range,
regarding the scale factors obtained from the current adjustment as the optimal scale
factors; otherwise, returning to the process of adjusting scale factors and repeating
the step of adjusting scale factors and subsequent steps.
15. The method of claim 14, characterized in that the number of consumed bits is estimated based on the first group of sample values
in frequency domain, the number of first sample values in frequency domain and the
scale factors.
16. The method of claim 14, characterized in that, the perception distortion is obtained based on the first group of sample values
in frequency domain, and the scale factor of each portion.
17. The method of claim 14,
characterized in comprising:
if the perception distortion is within a perceptible range, repeating the process
of adjusting scale factors and subsequent steps a predetermined number of times;
if the perception distortion is still within a perceptible range after the predetermined
number of repetitions, selecting, as the optimal scale factor, the scale factor which contributes
to a minimum perception distortion from the scale factors adjusted during the repeating
process.
18. The method of claim 14, characterized in that the process of adjusting the scale factor for each portion based on the criteria
comprises: decreasing the scale factors at critical bands based on the criteria and
increasing the scale factors at uncritical bands based on the criteria.
19. The method of claim 18, characterized in that the critical bands are low frequency bands and the uncritical bands are high frequency
bands.
20. The method of claim 2,
characterized in that,
before utilizing the scale factors to adjust the first group of sample values in frequency
domain, the method further comprises: performing a spectrum pre-rectification on the
first group of sample values in frequency domain;
after eliminating the impact of the scale factors from the quantized sample values
to obtain a second group of sample values in frequency domain and before transforming
the second group of sample values in frequency domain to the second group of sample
values in time domain, the method further comprises: performing an inverse spectrum
rectification on the second group of sample values in frequency domain.
21. The method of claim 2,
characterized in that,
after utilizing the scale factors to adjust the first group of sample values in frequency
domain and before quantizing the sample values, the method further comprises: performing
a spectrum pre-rectification on the adjusted first group of sample values in frequency
domain;
after quantization and before eliminating the impact of the scale factors from the
quantized sample values, the method further comprises: performing an inverse spectrum
rectification on the quantized sample values.
22. The method of claim 20 or 21,
characterized in comprising:
determining a spectrum rectification area;
performing a spectrum pre-rectification on the sample values comprises performing
a spectrum pre-rectification on the sample values in the determined spectrum rectification
area; and
performing an inverse spectrum rectification on the sample values comprises performing
an inverse spectrum rectification on the sample values in the determined spectrum
rectification area.
23. The method of claim 22,
characterized in that, the spectrum pre-rectification process comprises:
marking peaks of the sample values among the sample values in the determined spectrum
rectification area;
utilizing a peak of the marked peaks to compute a reference for spectrum pre-rectification;
utilizing the reference to compute a gain factor for each marked peak; and
utilizing the computed gain factor to pre-rectify the spectrum.
24. The method of claim 23, characterized in that the process of marking peaks of the sample values comprises: selecting one or more
local areas from the spectrum rectification area and selecting a sample value which
has the largest amplitude from each local area as a peak for a corresponding local
area.
25. The method of claim 24, characterized in that the process of pre-rectifying the spectrum comprises: utilizing gain factors for
corresponding peaks to perform pre-rectification on the local areas containing the
remaining peaks other than the peak for computing the reference.
26. The method of claim 25, characterized in that the process of performing pre-rectification comprises: amplifying the peaks with
the gain factors; or utilizing the gain factors to amplify the peaks and the sample
values in the areas containing the peaks.
27. The method of claim 23, characterized in that the process of computing the reference comprises: selecting a maximum peak from
the marked peaks and utilizing the maximum peak to obtain the reference.
28. The method of claim 27, characterized in that the reference is an amplitude of the maximum peak, or energy of a sample point close
to the maximum peak, or average energy of sample points close to the maximum peak.
29. The method of claim 23, characterized in that,
the gain factor for the peak is $C_m$ times the $k$-th power of the ratio of the reference to the peak, where $k$ is a number greater than zero and less than 1 and $C_m$ is an arbitrary number.
30. The method of claim 22,
characterized in that, the inverse spectrum rectification process comprises:
marking peaks of the sample values among the sample values in the determined spectrum
rectification area;
utilizing a peak of the marked peaks to compute a reference for inverse spectrum rectification;
utilizing the reference to compute a reduction factor for each marked peak; and
utilizing the computed reduction factor to inversely rectify the spectrum.
31. The method of claim 2, characterized in that,
a global gain obtained by utilizing the first group of sample values in time domain
and the second group of sample values in time domain is determined in such a way that
the variance between the first group of sample values in time domain and the product
of the second group of sample values in time domain and the global gain is at a minimum.
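If the "variance" in this claim is read as the sum of squared differences between the two groups, the minimizing global gain has a closed form: setting the derivative of the sum of (x - g*y)^2 with respect to g to zero gives g = sum(x*y) / sum(y*y). The Python sketch below shows this under that assumption; the function name and the fallback value for an all-zero second group are hypothetical.

```python
def global_gain(first, second):
    """Least-squares gain g minimizing sum((x - g * y) ** 2)."""
    num = sum(x * y for x, y in zip(first, second))
    den = sum(y * y for y in second)
    # Assumption: fall back to unity gain when the second group is all zeros.
    return num / den if den else 1.0


g = global_gain([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
```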
32. A method for adjusting quality of quantization for decoding, wherein an encoding stream
output by an encoder is decoded as a decoding stream, the method
characterized in comprising:
acquiring quantized sample values, information of at least two scale factors and a
global gain from the decoding stream;
utilizing the information of the at least two scale factors to eliminate the impact
of the scale factors from the quantized sample values to obtain sample values; and
multiplying the sample values by the global gain.
33. The method of claim 32,
characterized in that,
the quantized sample values are quantized sample values in frequency domain;
the process of eliminating the impact of the scale factors from the quantized sample
values to obtain the sample values comprises: eliminating the impact of the scale
factors from the quantized sample values to obtain the sample values in frequency
domain; and
after eliminating the impact of the scale factors from the quantized sample values
to obtain sample values and before multiplying the sample values with the global gain,
the method further comprises: transforming the sample values in frequency domain to
sample values in time domain.
34. The method of claim 33,
characterized in that,
after eliminating the impact of the scale factors from the quantized sample values
in frequency domain to obtain sample values in frequency domain and before transforming
the sample values in frequency domain to the sample values in time domain, the method
further comprises: performing an inverse spectrum rectification on the sample values
in frequency domain; or
before eliminating the impact of the scale factors from the quantized sample values
in frequency domain to obtain sample values in frequency domain, the method further
comprises: performing an inverse spectrum rectification on the sample values in frequency
domain.
35. The method of any one of claims 32-34,
characterized in that
the information of the scale factors acquired from the decoding stream comprises all
scale factors;
the process of eliminating the impact of the scale factors from the quantized sample
values comprises:
dividing the quantized sample values into two or more portions in accordance with
the method of dividing the sample values in frequency domain during encoding; and
utilizing the scale factor for each portion to eliminate the impact of the scale factor
from the corresponding portion of the quantized sample values.
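The decoding steps of claims 32 and 35 can be sketched in Python as follows. The sketch assumes that the encoder divided each coefficient by its scale factor before quantization, so that the decoder removes the impact by multiplying; it also omits the optional inverse transform and inverse rectification. The names `remove_scale_factors`, `decode` and `bounds` (a list of half-open index ranges, one per portion) are hypothetical.

```python
def remove_scale_factors(quantized, bounds, scale_factors):
    """Multiply each portion of the quantized values by its scale factor."""
    values = []
    for (start, end), sf in zip(bounds, scale_factors):
        values.extend(q * sf for q in quantized[start:end])
    return values


def decode(quantized, bounds, scale_factors, gain):
    """Remove the scale factors, then apply the global gain."""
    values = remove_scale_factors(quantized, bounds, scale_factors)
    return [gain * v for v in values]


decoded = decode([2, -1, 3, 0], [(0, 2), (2, 4)], [0.5, 2.0], gain=1.5)
```

The portion boundaries must match those used during encoding; here they are passed in explicitly rather than derived from the stream.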
36. The method of any one of claims 32-34,
characterized in that
the information of the scale factors acquired from the decoding stream is the remaining
scale factors and ratios of the remaining scale factors to a criteria scale factor
wherein a scale factor is treated as the criteria scale factor;
the process of eliminating the impact of the scale factors from the quantized sample
values comprises:
dividing the quantized sample values into two or more portions in accordance with
the method of dividing the sample values in frequency domain during encoding; and
utilizing the obtained ratios to eliminate the impact of the scale factors from the
corresponding portions of the quantized sample values.
37. The method of any one of claims 32-34,
characterized in that
the information of the scale factors acquired from the decoding stream is a scale
factor treated as a criteria scale factor and the ratios of the remaining scale factors
to the criteria scale factor;
the process of eliminating the impact of the scale factors from the quantized sample
values comprises:
dividing the quantized sample values into two or more portions in accordance with
the method of dividing the sample values in frequency domain during encoding;
utilizing the criteria scale factor and the ratios to compute a scale factor for each
portion; and
utilizing the scale factor for each portion to eliminate the impact of the scale factor
from the corresponding portion of the quantized sample values.
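Recovering the per-portion scale factors from a criteria scale factor and the transmitted ratios might look like the following Python sketch. It assumes each ratio is the remaining scale factor divided by the criteria scale factor, and that the position of the criteria scale factor among the portions is known; `rebuild_scale_factors` and `criteria_index` are hypothetical names.

```python
def rebuild_scale_factors(criteria_sf, ratios, criteria_index=0):
    """Recover all scale factors from the criteria one and the ratios."""
    # Each remaining scale factor is ratio * criteria scale factor.
    scale_factors = [criteria_sf * r for r in ratios]
    # Assumption: the criteria scale factor belongs at a known portion index.
    scale_factors.insert(criteria_index, criteria_sf)
    return scale_factors


sfs = rebuild_scale_factors(2.0, [0.5, 1.5])
```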
38. The method of claim 34,
characterized in that, the inverse spectrum rectification process comprises:
marking peaks of the sample values among the sample values in a spectrum rectification
area determined during encoding;
utilizing a peak of the marked peaks to compute a reference for inverse spectrum rectification;
utilizing the reference to compute a reduction factor for each marked peak; and
utilizing the computed reduction factor to inversely rectify the spectrum.
39. An apparatus for adjusting quality of quantization for encoding, characterized in comprising a multiple scale factors control unit, a quantization unit, a gain balancing
unit, and a global gain computing unit; wherein
the multiple scale factors control unit is configured to receive a first group of
sample values, configure at least two scale factors for the first group of sample
values, adjust the first group of sample values with the scale factors, and output
the adjusted first group of sample values to the quantization unit;
the quantization unit is configured to quantize the received first group of sample
values, obtain quantized sample values and output the quantized sample values to the
gain balancing unit;
the gain balancing unit is configured to receive the quantized sample values, eliminate
the impact of the scale factors from the quantized sample values, obtain a second
group of sample values, and output the second group of sample values to the global
gain computing unit; and
the global gain computing unit is configured to receive the first group of sample
values and the second group of sample values, and obtain a global gain based on the
first group of sample values and the second group of sample values.
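The data flow between the four units of claim 39 can be sketched as a single Python function. The sketch assumes that adjusting means dividing by the scale factor, that quantization is rounding to the nearest integer, and that the global gain is the least-squares fit of the second group to the first; none of these specifics is fixed by the claim, and the names `encode` and `bounds` are hypothetical.

```python
def encode(samples, bounds, scale_factors):
    """Return (quantized values, global gain) for one group of samples."""
    quantized = []  # output of the quantization unit
    restored = []   # second group of sample values (gain balancing unit)
    for (start, end), sf in zip(bounds, scale_factors):
        for x in samples[start:end]:
            q = round(x / sf)        # scale factor control, then quantization
            quantized.append(q)
            restored.append(q * sf)  # eliminate the impact of the scale factor
    # Global gain computing unit: least-squares fit of restored to samples.
    num = sum(x * y for x, y in zip(samples, restored))
    den = sum(y * y for y in restored)
    gain = num / den if den else 1.0
    return quantized, gain


quantized, gain = encode([1.0, 2.2, 9.0, -6.4], [(0, 2), (2, 4)], [1.0, 4.0])
```

The portions in `bounds` are assumed to be contiguous and to cover the whole group, so that `restored` lines up sample by sample with `samples`.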
40. The apparatus of claim 39, characterized in further comprising a time-frequency transform unit and an inverse time-frequency
transform unit; wherein
the time-frequency transform unit is configured to receive the first group of sample
values, perform a time-frequency transform on the first group of sample values and
output to the multiple scale factors control unit; and
the inverse time-frequency transform unit is configured to receive the second group
of sample values from the gain balancing unit, perform an inverse time-frequency transform
on the second group of sample values and output to the global gain computing unit.
41. The apparatus of claim 40, characterized in further comprising a spectrum pre-rectification unit and an inverse spectrum rectification
unit; wherein
the spectrum pre-rectification unit is configured to receive the first group of sample
values output from the time-frequency transform unit, perform a spectrum pre-rectification
on the first group of sample values and output to the multiple scale factors control
unit; the inverse spectrum rectification unit is configured to receive the second
group of sample values output from the gain balancing unit, perform an inverse spectrum
rectification on the second group of sample values and output to the inverse time-frequency
transform unit;
or,
the spectrum pre-rectification unit is configured to receive the first group of sample
values output from the multiple scale factors control unit, perform a spectrum pre-rectification
on the first group of sample values and output to the quantization unit; the inverse
spectrum rectification unit is configured to receive the quantized sample values
output from the quantization unit, perform an inverse spectrum rectification on the
quantized sample values and output to the gain balancing unit.
42. The apparatus of any one of claims 39 to 41, characterized in that the multiple scale factors control unit comprises a scale factor configuration unit
and a sample value adjusting unit; wherein
the scale factor configuration unit is configured to configure at least two scale
factors for the first group of sample values and output the configured scale factors
to the sample value adjusting unit; and
the sample value adjusting unit is configured to receive scale factors and adjust
the first group of sample values with the scale factors.
43. The apparatus of claim 42, characterized in that the scale factor configuration unit comprises a criteria setting unit, a scale factor
adjusting unit, a unit for estimating the number of consumed bits, and a perception distortion
computing unit; wherein
the criteria setting unit is configured to set a criterion for the scale factors and
output the criterion to the scale factor adjusting unit;
the scale factor adjusting unit is configured to adjust the scale factors based on
the criteria and output the adjusted scale factors to the unit for estimating the
number of consumed bits and the perception distortion computing unit;
the unit for estimating the number of consumed bits is configured to estimate the
number of consumed bits based on the scale factors and determine if the number of
consumed bits is less than the total number of bits allowable by an encoding process
and transmit a determination result to the scale factor adjusting unit; and
the perception distortion computing unit is configured to calculate perception distortion
based on the scale factors, determine whether the perception distortion is within
an imperceptible range and transmit the determination result to the scale factor adjusting
unit.
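The interaction between the scale factor adjusting unit and the two checking units can be sketched as a simple loop in Python. Everything specific here is an assumption for illustration: the bit estimate, the mean-squared-error stand-in for perception distortion, the multiplicative step, and the names `estimate_bits`, `distortion` and `tune_scale_factor` are all hypothetical and are not taken from the claims.

```python
import math


def estimate_bits(samples, sf):
    """Hypothetical bit estimate: magnitude bits plus a sign bit per sample."""
    return sum(math.ceil(math.log2(abs(round(x / sf)) + 1)) + 1 for x in samples)


def distortion(samples, sf):
    """Hypothetical stand-in for perception distortion: mean squared error."""
    return sum((x - round(x / sf) * sf) ** 2 for x in samples) / len(samples)


def tune_scale_factor(samples, criterion_sf, bit_budget, max_distortion,
                      step=1.25, max_iter=32):
    """Coarsen the scale factor until the estimated bits fit the budget."""
    sf = criterion_sf
    for _ in range(max_iter):
        if estimate_bits(samples, sf) <= bit_budget:
            # Report whether the distortion check also passes at this setting.
            return sf, distortion(samples, sf) <= max_distortion
        sf *= step
    return sf, False


sf, ok = tune_scale_factor([10.0, -7.0, 3.0, 0.5], 1.0, bit_budget=12,
                           max_distortion=0.5)
```

A real encoder would apply such a loop per portion and use a psychoacoustic model for the distortion check; this sketch tunes a single scale factor only.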
44. The apparatus of claim 41, characterized in that, the spectrum pre-rectification unit comprises a peak marking unit, a reference computing
unit, a gain factor computing unit and a pre-rectification unit; wherein
the peak marking unit is configured to receive the first group of sample values, mark
a peak among the first group of sample values within the spectrum rectification area,
and output the peak to the reference computing unit;
the reference computing unit is configured to compute based on the peak a reference
for spectrum pre-rectification and output the reference to the gain factor computing
unit;
the gain factor computing unit is configured to compute based on the reference a gain
factor for each marked peak and output the gain factor to the pre-rectification unit;
and
the pre-rectification unit is configured to pre-rectify the spectrum with the gain
factor.
45. The apparatus of claim 41, characterized in that, the inverse spectrum rectification unit comprises a peak marking unit, a reference
computing unit, a reduction factor computing unit and an inverse rectification unit;
wherein
the peak marking unit is configured to receive the sample values, mark peaks among
the sample values within the spectrum rectification area, and output the marked peaks
to the reference computing unit;
the reference computing unit is configured to compute based on the peaks a reference
for inverse spectrum rectification and output the reference to the reduction factor
computing unit;
the reduction factor computing unit is configured to compute based on the reference
a reduction factor for each marked peak and output the reduction factor to the inverse
rectification unit; and
the inverse rectification unit is configured to inversely rectify the spectrum with
the reduction factor.
46. An apparatus for adjusting quality of quantization for decoding, characterized in comprising a gain balancing unit, and a global gain balancing unit; wherein
the gain balancing unit is configured to receive quantized sample values and scale
factors, utilize the received scale factors to eliminate the impact of the scale factors
from the quantized sample values and obtain sample values, and output the sample values
to the global gain balancing unit; and
the global gain balancing unit is configured to receive a global gain and the sample
values, multiply the sample values by the global gain and output the products.
47. The apparatus of claim 46, characterized in further comprising an inverse time-frequency transform unit configured to receive
the sample values from the gain balancing unit, perform an inverse time-frequency
transform on the sample values and output to the global gain balancing unit.
48. The apparatus of claim 47, characterized in further comprising an inverse spectrum rectification unit; wherein
the inverse spectrum rectification unit is configured to receive the sample values
output from the gain balancing unit, perform an inverse spectrum rectification on
the sample values and output to the inverse time-frequency transform unit;
or,
the inverse spectrum rectification unit is configured to receive the quantized sample
values, perform an inverse spectrum rectification on the quantized sample values and
output to the gain balancing unit.
49. The apparatus of claim 48, characterized in that, the inverse spectrum rectification unit comprises a peak marking unit, a reference
computing unit, a reduction factor computing unit and an inverse rectification unit;
wherein
the peak marking unit is configured to receive the sample values, mark peaks among
the sample values within the spectrum rectification area, and output the marked peaks
to the reference computing unit;
the reference computing unit is configured to compute based on the peaks a reference
for inverse spectrum rectification and output the reference to the reduction factor
computing unit;
the reduction factor computing unit is configured to compute based on the reference
a reduction factor for each marked peak and output the reduction factor to the inverse
rectification unit; and
the inverse rectification unit is configured to inversely rectify the spectrum with
the reduction factor.