[0001] The invention relates to disturbance reduction in digital signal processing.
[0002] In telecommunication devices, sound pick-up has to deal with two problems: the presence
of noise in the environment where the devices are used, and the echo caused by the
coupling between the loudspeaker and the microphone. Both phenomena degrade the
quality of the communication. On the one hand, the speech signal of the user is corrupted
by environmental noise, which is tiring for the far-end speaker and, if the noise is
too loud, leads to misunderstandings between the correspondents. On the other hand, echo
is disturbing for the far-end speaker, who hears his own voice because of the coupling.
If the transmission delay of the communication network is significant (more than 30 ms),
he hears his own voice noticeably delayed. Current mobile networks, with delays of more
than 150 ms, make this echo annoying (previously, fixed telephony networks provided low
delay for local calls, and echo was disturbing only in international communications).
[0003] These two drawbacks are inherent to phone communication. With a handset or headset,
their effects on the far-end speaker are somewhat mitigated by the fact that the
speech of the user is generally more energetic than the two perturbations (noise and
echo). Nevertheless, in certain conditions (in stations, on the street), the surrounding
noise can seriously disturb the far-end speaker. Echo can also appear in small recent
devices, or in headsets. In both cases the coupling exists and the echo can be energetic
enough to be disturbing.
Hands-free systems present the worst case with respect to noise and echo problems.
In this case the microphone is far from the talker, so that the speech signal
is less energetic and the perturbation gains importance. Moreover, the loudspeaker
signal is louder than in non-hands-free use cases, so that the coupling between
the two transducers (microphone/loudspeaker) increases.
[0004] Digital telecommunication systems include speech coding. Speech codecs are adversely
affected by the presence of noise and echo, because they are optimized to handle
a single speech signal.
[0005] Many codecs exist, and many of them use Linear Prediction Coefficients
(LPC) analysis. For example, CELP codecs such as AMR and EFR, as well as MELP, TCX and
vocoders, use such an analysis. The generic principle of the LPC analysis is to provide
a linear estimation of the input speech signal y(n) through an AR filter as follows:
$$\hat{y}(n) = \sum_{k=1}^{P} a_y(k)\, y(n-k) \qquad (1)$$
[0006] where [a_y(k)] stands for the Linear Prediction Coefficients and
P stands for the prediction order. After filtering the input signal by the LPC filter,
a residual signal is obtained. This signal needs to be transmitted to the decoder for
reconstruction of the original signal.
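The relation between the LPC filter, the residual signal and the reconstruction can be sketched as follows. This is a minimal illustration, assuming the predictor form ŷ(n) = Σ a_y(k)·y(n−k) and zero samples before the start of the frame; the function names `lpc_residual` and `lpc_synthesis` are hypothetical and not part of any codec API.

```python
def lpc_residual(y, a):
    """Residual e(n) = y(n) - sum_k a[k-1] * y(n-k); samples before n = 0 count as zero."""
    order = len(a)
    residual = []
    for n in range(len(y)):
        prediction = sum(a[k - 1] * y[n - k] for k in range(1, order + 1) if n - k >= 0)
        residual.append(y[n] - prediction)
    return residual


def lpc_synthesis(residual, a):
    """Inverse operation: rebuild y(n) from the residual and the same coefficients."""
    order = len(a)
    y = []
    for n in range(len(residual)):
        prediction = sum(a[k - 1] * y[n - k] for k in range(1, order + 1) if n - k >= 0)
        y.append(residual[n] + prediction)
    return y


# Illustrative values only (not taken from any codec).
y = [1.0, 0.5, -0.25, 0.75, 0.1]
a_y = [0.9, -0.2]
e = lpc_residual(y, a_y)
y_rebuilt = lpc_synthesis(e, a_y)
```

With unquantized coefficients and residual, the synthesis returns the input samples; a real codec quantizes both, so the reconstruction is only approximate.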
[0008] Within the communication chain, it may happen that only the coded signal is available.
This is the case in any part of the network where only the codec bitstream is available.
It may also be the case in a terminal where an integrated chipset encodes the microphone
signal and only delivers a bitstream to the processor where noise reduction or echo
cancellation can be implemented.
[0009] In such scenarios, classical solutions can only be applied by decoding the signal,
processing it and re-encoding it. This is suboptimal compared to any
solution based on the processing of the PCM signal. Decoding
and re-encoding lead to a high computation load. Moreover, the so-called tandem effect
due to decoding and re-encoding decreases the quality of the speech signal. Accordingly,
the enhancement obtained by noise reduction and echo cancellation may be offset
by the artefacts introduced by the tandem effect.
[0010] More recently, the idea appeared to modify the codec parameter "fixed gain" in A-CELP
coding to decrease the signal energy when a perturbation is detected. This is shown
in
- "Compressed Domain Noise Reduction and Echo Suppression for Network Speech Enhancement"
by Chandran, Ravi and Marchok, Daniel J.; Proc. of the 43rd IEEE Midwest Symposium
on Circuits and Systems, pp. 10-13, 2000,
- "Noise reduction on speech codec parameters" by Hervé Taddei, Christophe Beaugeant,
Michael de Meuleneire; ICASSP 2004
- and "Gain Loss Control based on Speech Codec Parameters" by C. Beaugeant, N. Duetsch, H.
Taddei; Eusipco 2004.
[0011] Such techniques allow an efficient decrease of the perturbation energy but cannot
reduce all the artefacts present in the speech.
[0012] It is an object of the invention to provide a method for reducing noise on encoded
digital signals being encoded with Linear Prediction Coefficients (LPC).
[0013] This object is achieved by the subject-matter of the independent claims. Further enhancements
are provided by the subject-matter of the dependent claims.
[0014] A method is provided for transmission of a digital signal y(n). The digital signal
y(n) comprises a useful signal s(n) and a perturbation signal p(n). The perturbation
signal p(n) derives e.g. from noise or echo and includes everything of y(n) that is
not part of the useful signal s(n).
[0015] The bitstream y_e(n) is derived from y(n) by LPC-encoding. As a first step, the Linear
Prediction Coefficients (LPC) A_y of the bitstream y_e(n) are received. Other parameters
of the bitstream y_e(n) may also be received, like the fixed gain or the adaptive gain.
As an option, the complete bitstream y_e(n) is received. The autocorrelation matrix Γ_s
of the useful signal s(n), the autocorrelation matrix Γ_p of the perturbation signal p(n)
and the LPC A_p of the perturbation signal p(n) are then estimated.
[0016] A modified LPC A_s is calculated from A_y and the estimated Γ_s, Γ_p and A_p. In an
output step, a modified data stream y_e'(n) including the modified LPC A_s is output.
This data stream can be received by a decoder which decodes the original signal y(n).
[0017] Codecs for transmission of speech are optimized for speech signals. The addition
of noise or of echo to the useful speech signal leads to sub-optimal behaviour of
the codecs, which means additional artefacts in the decoded signal and lower quality.
The use of LPC coefficients that are influenced by the noise signal makes the quality
of the received speech worse. Accordingly, noise and echo not only add undesired
information to the useful signal, they also lead to sub-optimal behaviour of speech
codecs, decreasing the overall quality of the telecommunication.
[0018] The solution is based on the principle of directly modifying the parameters
computed by the speech encoder. In contrast to prior art solutions, it modifies
the LPC coefficients. The received speech sounds more precise if the LPC coefficients
are less influenced by noise.
[0019] A_s is preferably calculated by
A_s = Γ_s^{-1} · [(Γ_s + Γ_p) · A_y − Γ_p · A_p].
This equation only comprises multiplication and addition functions and is accordingly
adapted for digital signal processors.
A_s can also be calculated by the equivalent equation
A_s = A_y + Γ_s^{-1} · Γ_p · [A_y − A_p].
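The two equivalent forms can be checked numerically. The following is a sketch only: matrices are plain nested lists, the product with Γ_s^{-1} is obtained by solving a linear system (Gaussian elimination) rather than by forming the inverse, and all function names and numbers are illustrative assumptions.

```python
def mat_vec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def solve(m, b):
    """Solve m x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(m, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def clean_lpc_form1(gamma_s, gamma_p, a_y, a_p):
    """A_s = Gamma_s^{-1} [(Gamma_s + Gamma_p) A_y - Gamma_p A_p]."""
    gamma_sum = [[s + p for s, p in zip(rs, rp)] for rs, rp in zip(gamma_s, gamma_p)]
    rhs = [u - v for u, v in zip(mat_vec(gamma_sum, a_y), mat_vec(gamma_p, a_p))]
    return solve(gamma_s, rhs)


def clean_lpc_form2(gamma_s, gamma_p, a_y, a_p):
    """A_s = A_y + Gamma_s^{-1} Gamma_p (A_y - A_p)."""
    diff = [u - v for u, v in zip(a_y, a_p)]
    correction = solve(gamma_s, mat_vec(gamma_p, diff))
    return [u + c for u, c in zip(a_y, correction)]


# Illustrative second-order example with made-up values.
gamma_s = [[2.0, 1.0], [1.0, 2.0]]
gamma_p = [[0.5, 0.0], [0.0, 0.5]]
a_y = [0.6, -0.1]
a_p = [0.1, 0.05]
a_s_1 = clean_lpc_form1(gamma_s, gamma_p, a_y, a_p)
a_s_2 = clean_lpc_form2(gamma_s, gamma_p, a_y, a_p)
```

Both calls return the same vector up to rounding. The second form needs one matrix-vector product fewer and makes it visible that A_s equals A_y plus a correction term.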
[0020] If the residual signal of the encoded signal y_e(n) is also received, the steps of
estimating Γ_s, Γ_p and A_p can be based on the residual signal of y_e(n) and on A_y.
The residual signal is the signal that is obtained after the LPC filtering. y_e(n)
comprises the residual signal and the LPC coefficients. The estimations of Γ_s, Γ_p
and A_p can be done by classical methods, e.g. by frequency analysis of the encoded
signal y_e(n).
[0021] In an embodiment, the method also comprises a step of noise reduction on the residual
signal of the encoded signal y_e(n). This gives the additional advantage that the noise
is reduced in the output data stream y_e'(n). A noise reduction technique on residual
signals is described in the above-mentioned "Compressed Domain Noise Reduction and Echo
Suppression for Network Speech Enhancement". The invention described here provides a
solution to achieve a reduction of perturbation, like noise and echo, by modifying the
LPC coefficients computed during LPC analysis.
[0022] In another embodiment, the Linear Prediction Coefficients (LPC) A_y of the signal
y_e(n) are not received, but calculated from the digital sample signal y(n). The encoding
and the modification of the LPC coefficients are done only once. Therefore, the residual
signal does not need to be encoded and output twice. This improves the speed of encoding
and modifying the LPC.
[0023] Even though the embodiments of the invention are described with respect to speech
signals, the invention may be used with any system based on the model of Eq (1) where an
additive perturbation disturbs the coefficients [a_y(k)]. The method is applicable in a
broad range of applications in signal processing. One possible application where the LPC
modification would be useful is earthquake detection.
[0024] However, the method is especially suited to signal transmission in telecommunication.
Because of the different signal characteristics of voice and noise signals, the
autocorrelation matrix of the perturbation signal can be estimated relatively precisely.
This ensures that the cleaning of the LPC parameters is successful.
[0025] The invention also relates to a digital signal transmission apparatus that performs
the inventive method. Such an apparatus comprises means for receiving the LPC coefficients
A_y, for the estimation of Γ_s, Γ_p and A_p, and for the calculation and output of A_s.
[0026] In particular, the calculation of A_s is preferably done by a digital signal
processor (DSP), because a DSP is effective in performing multiplications and additions.
[0027] The invention deals with a method to reconstruct the LPC coefficients
A_s = [a_s(k)]^T of the useful signal s(n), knowing the LPC coefficients
A_y = [a_y(k)]^T computed on a perturbed signal y(n). It is based on the following
mathematical development of the LPC analysis.
[0028] We consider that a corrupted signal y(n) is the sum of a useful signal s(n) and a
perturbation p(n). This perturbation can be additive noise, echo or more generally any
signal that is not desired:
$$y(n) = s(n) + p(n) \qquad (2)$$
[0029] We assume that the perturbation and the useful signal are not correlated.
[0030] We also assume that an LPC analysis of order P is applied on analysis frames of
N samples (this is the case for speech codecs based on LPC), so that the signal
y(n) is estimated by ŷ(n) as:
$$\hat{y}(n) = \sum_{k=1}^{P} a_y(k)\, y(n-k) \qquad (3)$$
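[0031] In the same way, the useful signal s(n) is modeled as:
$$\hat{s}(n) = \sum_{k=1}^{P} a_s(k)\, s(n-k) \qquad (4)$$
and an estimation p̂(n) of p(n) can be written as:
$$\hat{p}(n) = \sum_{k=1}^{P} a_p(k)\, p(n-k) \qquad (5)$$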
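[0032] Let us consider the following squared error over an analysis frame:
$$E_{ST} = \sum_{n=0}^{N-1} \bigl( y(n) - \hat{y}(n) \bigr)^2 \qquad (6)$$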
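[0033] Using the fact that the signal y(n) can be written as the sum of the useful signal
and of the perturbation, we can write:
$$E_{ST} = \sum_{n=0}^{N-1} \Bigl( s(n) + p(n) - \sum_{k=1}^{P} a_y(k)\,\bigl[ s(n-k) + p(n-k) \bigr] \Bigr)^2 \qquad (7)$$
The error E_ST is minimum with respect to the LPC coefficients A_y when its derivative
with respect to each coefficient is zero:
$$\frac{\partial E_{ST}}{\partial a_y(j)} = 0, \qquad j = 1, \dots, P \qquad (8)$$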
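[0034] This leads, for j = 1, …, P, to
$$\sum_{n=0}^{N-1} \Bigl( s(n) + p(n) - \sum_{k=1}^{P} a_y(k)\,\bigl[ s(n-k) + p(n-k) \bigr] \Bigr) \bigl[ s(n-j) + p(n-j) \bigr] = 0 \qquad (9)$$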
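[0035] This can be written equivalently, for j = 1, …, P:
$$\sum_{n=0}^{N-1} \bigl[ s(n) + p(n) \bigr]\bigl[ s(n-j) + p(n-j) \bigr] = \sum_{k=1}^{P} a_y(k) \sum_{n=0}^{N-1} \bigl[ s(n-k) + p(n-k) \bigr]\bigl[ s(n-j) + p(n-j) \bigr] \qquad (10)$$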
[0036] With the hypothesis that the perturbation signal and the useful signal are not
correlated, we assume that we can write
$$\sum_{n=0}^{N-1} s(n-k)\, p(n-j) = 0, \qquad k, j = 0, \dots, P$$
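[0037] As a result, Eq (10) is reduced, for j = 1, …, P, to:
$$\sum_{n=0}^{N-1} \bigl[ s(n)\,s(n-j) + p(n)\,p(n-j) \bigr] = \sum_{k=1}^{P} a_y(k) \sum_{n=0}^{N-1} \bigl[ s(n-k)\,s(n-j) + p(n-k)\,p(n-j) \bigr] \qquad (11)$$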
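[0038] Assuming that the estimation ŝ(n) from Eq (4) is close to the signal s(n)
(s(n) ≈ ŝ(n)), we can write:
$$\sum_{n=0}^{N-1} s(n)\, s(n-j) \approx \sum_{k=1}^{P} a_s(k) \sum_{n=0}^{N-1} s(n-k)\, s(n-j) \qquad (12)$$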
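[0039] In the same way, with p(n) ≈ p̂(n), we can write:
$$\sum_{n=0}^{N-1} p(n)\, p(n-j) \approx \sum_{k=1}^{P} a_p(k) \sum_{n=0}^{N-1} p(n-k)\, p(n-j) \qquad (13)$$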
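[0040] As a result, Eq (11) leads, for j = 1, …, P, to:
$$\sum_{k=1}^{P} a_s(k) \sum_{n=0}^{N-1} s(n-k)\,s(n-j) + \sum_{k=1}^{P} a_p(k) \sum_{n=0}^{N-1} p(n-k)\,p(n-j) = \sum_{k=1}^{P} a_y(k) \sum_{n=0}^{N-1} \bigl[ s(n-k)\,s(n-j) + p(n-k)\,p(n-j) \bigr] \qquad (14)$$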
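[0041] Let us introduce the covariance functions of the useful signal s(n) and of the
perturbation p(n)
$$r_s(j,k) = \sum_{n=0}^{N-1} s(n-j)\, s(n-k), \qquad r_p(j,k) = \sum_{n=0}^{N-1} p(n-j)\, p(n-k) \qquad (15)$$
as well as the autocorrelation matrices of s(n), p(n) and y(n), with r_y(j,k) defined
in the same way on y(n):
$$\Gamma_u = \bigl[ r_u(j,k) \bigr]_{1 \le j,k \le P}, \qquad u \in \{s, p, y\} \qquad (16)$$
[0042] For e.g. P=4, Γ_u can be written as
$$\Gamma_u = \begin{bmatrix} r_u(1,1) & r_u(1,2) & r_u(1,3) & r_u(1,4) \\ r_u(2,1) & r_u(2,2) & r_u(2,3) & r_u(2,4) \\ r_u(3,1) & r_u(3,2) & r_u(3,3) & r_u(3,4) \\ r_u(4,1) & r_u(4,2) & r_u(4,3) & r_u(4,4) \end{bmatrix} \qquad (17)$$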
[0043] Let us also introduce the LPC coefficient vectors of the different signals considered
(s(n), p(n) and y(n)):
$$A_u = \bigl[ a_u(1), a_u(2), \dots, a_u(P) \bigr]^T, \qquad u \in \{s, p, y\} \qquad (18)$$
[0044] where T stands for the transposition operator.
With the new notation introduced, Eq (14) becomes:
$$\Gamma_s A_s + \Gamma_p A_p = (\Gamma_s + \Gamma_p)\, A_y \qquad (19)$$
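[0045] As a result, the LPC coefficients of the useful signal can be obtained through:
$$A_s = \Gamma_s^{-1} \bigl[ (\Gamma_s + \Gamma_p)\, A_y - \Gamma_p A_p \bigr] \qquad (20)$$
[0046] Or equivalently:
$$A_s = A_y + \Gamma_s^{-1}\, \Gamma_p \bigl[ A_y - A_p \bigr] \qquad (21)$$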
[0047] The last two expressions show that the LPC of the useful signal can be computed
when the following entities are known: the LPC of the perturbed signal (A_y), the LPC of
the perturbation (A_p), the covariance matrix of the perturbation (Γ_p), and the inverse
of the covariance matrix of the useful signal (Γ_s^{-1}). The formula can be seen as a
filter applied to the perturbed LPC A_y to obtain the useful LPC A_s, this filter
depending on A_p, Γ_p and Γ_s.
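As an illustration of Eq (21), consider the simplest case P = 1, in which the matrices reduce to scalars. The numbers are made up for the example: a white perturbation is assumed, so that a_p(1) = 0, together with a perturbation-to-signal energy ratio Γ_p/Γ_s = 0.25 (a signal-to-noise ratio of about 6 dB).

```latex
a_s(1) = a_y(1) + \frac{\Gamma_p}{\Gamma_s}\,\bigl(a_y(1) - a_p(1)\bigr)
       = 0.72 + 0.25 \cdot (0.72 - 0)
       = 0.90
```

In this example the perturbed coefficient 0.72 is moved back to 0.90: the white perturbation has flattened the predictor, and the filter undoes this in proportion to the energy ratio.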
[0048] Accordingly, the present invention proposes a method based on Eq. (20)/(21),
or on any formula derived from these equations, to obtain the LPC coefficients of the
useful signal (A_s) when the LPC coefficients of the perturbed signal (A_y) are
available. Eq. (20)/(21) require knowledge of the LPC coefficients of the perturbation
A_p, the correlation matrix of the perturbation Γ_p and the inverse of the correlation
matrix of the useful signal Γ_s^{-1}. Generally, these entities are not directly
available, as the problem is placed in a scheme where only the perturbed coefficients
A_y are available. Accordingly, A_p, Γ_p and Γ_s^{-1} need to be estimated. The invention
can thus be seen as the generic process described below:
[0049] For each frame m, the LPC A_y are available.
- Estimation of the LPC of the perturbation A_p based on the LPC A_y. As an alternative,
if more entities than A_y are available, the estimation can be based on the additional
information. For instance, when placing the method within a speech codec, other speech
codec parameters can be used to obtain the estimation.
- Estimation of the correlation matrix of the perturbation Γ_p based on the LPC A_y. As
an alternative, if more entities than A_y are available, the estimation can be based on
the additional information. For instance, when placing the method within a speech codec,
other speech codec parameters can be used to obtain the estimation.
- Estimation of the correlation matrix of the useful signal Γ_s based on the LPC A_y. As
an alternative, if more entities than A_y are available, the estimation can be based on
the additional information. For instance, when placing the method within a speech codec,
other speech codec parameters can be used to obtain the estimation.
- Applying the filter defined in Eq (20)/(21) to get A_s.
[0050] This process can be applied to a speech codec bitstream by applying the following steps:
- For each frame m, extracting the LPC coefficients A_y from the speech codec bitstream.
- Applying the process described previously to get the useful signal LPC A_s.
- Replacing the coefficients A_y with the useful ones A_s.
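The per-frame process and its application to a bitstream can be sketched as follows. This is only an outline under stated assumptions: frames are represented as dictionaries with the hypothetical keys `lpc` and `rest`, the estimator and the filter are passed in as functions, and the first-order filter and the fixed estimates in the demo are placeholders rather than a real codec interface.

```python
def replace_lpc_in_frames(frames, estimate_perturbation, clean_lpc):
    """Return new frames whose 'lpc' field holds the cleaned coefficients A_s.

    frames: iterable of dicts with keys 'lpc' (list of floats, A_y) and 'rest'
            (all other codec parameters, passed through unchanged).
    estimate_perturbation(frame, state) -> (gamma_s, gamma_p, a_p, state)
    clean_lpc(gamma_s, gamma_p, a_y, a_p) -> a_s
    """
    state = None
    output = []
    for frame in frames:
        a_y = frame["lpc"]                                   # extract A_y
        gamma_s, gamma_p, a_p, state = estimate_perturbation(frame, state)
        a_s = clean_lpc(gamma_s, gamma_p, a_y, a_p)          # apply the filter
        output.append({"lpc": a_s, "rest": frame["rest"]})   # replace A_y by A_s
    return output


def clean_lpc_order1(gamma_s, gamma_p, a_y, a_p):
    """First-order case of A_s = A_y + Gamma_s^{-1} Gamma_p (A_y - A_p)."""
    ratio = gamma_p[0][0] / gamma_s[0][0]
    return [a_y[0] + ratio * (a_y[0] - a_p[0])]


def fixed_estimate(frame, state):
    """Placeholder estimator: constant Gamma_s, Gamma_p and A_p (illustration only)."""
    return [[4.0]], [[1.0]], [0.0], state


frames = [{"lpc": [0.72], "rest": b"\x01"}, {"lpc": [0.4], "rest": b"\x02"}]
cleaned = replace_lpc_in_frames(frames, fixed_estimate, clean_lpc_order1)
```

Passing the estimator as a function keeps the loop independent of how A_p, Γ_p and Γ_s are obtained, which matches the remark that the estimation may use A_y alone or further codec parameters.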
[0051] The solution has the following advantages:
- The method makes it possible to obtain clean LPC coefficients when knowing perturbed
ones. Any system based on linear prediction that computes coefficients A_y can use this
solution to obtain the clean coefficients A_s.
- A typical application of the method is noise reduction, echo reduction
or reduction of any other perturbation on the LPC of speech.
- When PCM samples are not available, but only codec parameters, the method makes it
possible to reconstruct the useful LPC parameters without the need to decode the bitstream
to get the PCM data and apply classical noise reduction or echo cancellation. This
is an alternative to existing prior art solutions.
- In the LPC analysis step (speech encoding), the method can be applied in parallel
to the LPC analysis. It can be seen as a kind of postfilter after the LPC analysis.
- The mathematical expression obtained in Equation (21) is relatively easy to implement.
It does require estimations of certain entities, like correlation functions
or the LPC of the noisy signal, but such estimations are quite classical. The method is
accordingly quite classical from a signal processing point of view, and can be
implemented in real-time applications.
[0052] The invention is explained with the help of the drawings.
Figure 1 shows a signal transmission from a sender to a receiver in a telecommunication
system.
Figure 2 is a flow chart for the modification of LPC coefficients according to a first
embodiment.
Figure 3 shows a comparison of transfer functions with non-modified LPC versus modified
LPC coefficients.
Figure 4 shows a second embodiment for the modification of LPC coefficients.
[0053] Figure 1 shows an embodiment of a telecommunication system 1 with a signal transmission
using modified LPC coefficients. The sender 2 generates the useful signal s(t) by talking.
Perturbations generate a perturbation signal p(t), which is added to the useful signal,
resulting in the signal y(t). The signal y(t) is digitized in the Analog-Digital-Converter
(A/D-converter) 3, which generates a digital signal y(n). The digital signal y(n) is
encoded in the encoder 4 to the signal y_e(n).
[0054] The encoding is done with the help of an LPC analysis. The encoded signal y_e(n)
is transmitted via the transmission block 5 to the decoder 6. The decoder 6 receives
the signal y_e'(n) from the transmission block and decodes y_e'(n) to a digital signal
y_d(n). Depending on how the transmission block 5 is implemented, y_e'(n) is either
equal or unequal to y_e(n). The transmission block 5 is e.g. a telephone switch, a router
or a simple wire.
[0055] y_d(n) is finally D/A-converted by the D/A-converter 7 to y_a(t), which is received
as an analog signal by the receiver 8.
[0056] In the embodiment of Figure 2, the modification of the LPC parameters is done in
the transmission block 5, whereas in the embodiment of Figure 4, the encoder 4 directly
modifies the LPC parameters.
[0057] Figure 2 is a flow chart for the modification of LPC coefficients within the transmission
block 5. y_e(n) is a bitstream including LPC coefficients. If the encoder uses the AMR codec,
the LPC coefficients are transmitted as Line Spectral Pairs (LSP). The frames of y_e(n)
also comprise the parameters pitch delay, fixed codebook index, fixed gain and
adaptive gain. The bitstream is computed by the analysis of successive frames,
each comprising a defined number of samples (generally 160). If the signal y(t)
is sampled at a frequency of 8 kHz, in the so-called narrow band, the number of
LPC coefficients is chosen to be 8 or 10 in current standardized codecs (AMR, EFR, FR).
In other words, the codec uses an 8th or a 10th order linear prediction filter, so that
in Eq. (1) the prediction order P is 8 or 10, respectively.
[0058] In the case of the so-called wide band, the sampling frequency is 16 kHz and the
number of coefficients is preferably chosen to be 16 in current standardized codecs (AMR-WB).
[0059] Figure 2 shows the flow chart where the LPC coefficients are extracted from the
bitstream y_e(n). The bitstream is divided into the LPC coefficients and the rest of the
bitstream, including the information needed to decode the residual waveform. Then A_p,
Γ_p and Γ_s^{-1} are estimated, taking into account the LPC coefficients as well as,
possibly, additional information from the bitstream.
[0060] In an embodiment, A_p is generated with the help of a Voice Activity Detection (VAD).
Voice Activity Detection is known in the art. Here, the VAD outputs a zero if no voice
signal is detected in the bitstream; otherwise the VAD outputs a one.
A_p(m) is calculated by the following algorithm, whereby m is the index of the frame or
subframe.
if VAD = 0,

if VAD = 1,

α is a fixed parameter being heuristically chosen, whereby 0<α<1.
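A VAD-driven update of A_p of this kind can be sketched as below. The update rule is an assumption made for illustration, not a statement of the embodiment's equations: when no voice is detected, the frame is taken to contain only the perturbation and A_p is smoothed recursively towards A_y with the factor α; when voice is detected, the previous estimate is kept. The function name is hypothetical.

```python
def update_perturbation_lpc(a_p_prev, a_y, vad, alpha=0.9):
    """Assumed update rule for A_p(m), for illustration only.

    vad == 0 (no voice): A_p(m) = alpha * A_p(m-1) + (1 - alpha) * A_y(m)
    vad == 1 (voice):    A_p(m) = A_p(m-1)
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if vad == 0:
        return [alpha * p + (1.0 - alpha) * y for p, y in zip(a_p_prev, a_y)]
    return list(a_p_prev)


# Illustrative values: a noise-only frame followed by a speech frame.
a_p = [0.0, 0.0]
a_p = update_perturbation_lpc(a_p, [1.0, -0.5], vad=0, alpha=0.5)
a_p = update_perturbation_lpc(a_p, [0.9, -0.3], vad=1, alpha=0.5)
```

A value of α close to 1 makes the estimate change slowly, which suits stationary noise; a smaller value tracks changing perturbations faster at the cost of a noisier estimate.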
[0061] To estimate Γ_p, the perturbation is assumed to be white noise. Accordingly, the
autocorrelation matrix Γ_p has the following form, where I is the P × P identity matrix:
$$\Gamma_p = E_p(m) \cdot I$$
[0062] Depending on the output of the VAD, E_p(m) is calculated by the equations
if VAD = 0,

if VAD = 1,

wherein E_y(m) is the energy of the signal y(n), m indicates a frame or subframe and β
is a fixed, heuristically chosen parameter, 0<β<1.
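The energy tracking and the resulting white-noise matrix can be sketched in the same way. Both the recursive update with β and the function names are assumptions for illustration: the perturbation energy follows the frame energy E_y(m) only in frames without voice, and Γ_p is built as a diagonal matrix carrying that energy.

```python
def update_perturbation_energy(e_p_prev, e_y, vad, beta=0.9):
    """Assumed update rule for E_p(m), for illustration only.

    vad == 0 (no voice): E_p(m) = beta * E_p(m-1) + (1 - beta) * E_y(m)
    vad == 1 (voice):    E_p(m) = E_p(m-1)
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must satisfy 0 < beta < 1")
    if vad == 0:
        return beta * e_p_prev + (1.0 - beta) * e_y
    return e_p_prev


def white_noise_autocorrelation(e_p, order):
    """Diagonal matrix with E_p on the diagonal: the white-noise form of Gamma_p."""
    return [[e_p if i == j else 0.0 for j in range(order)] for i in range(order)]


# Illustrative values only.
e_p = update_perturbation_energy(2.0, 4.0, vad=0, beta=0.5)
gamma_p = white_noise_autocorrelation(e_p, order=2)
```

Because Γ_p is diagonal under the white-noise assumption, the product Γ_p·[A_y − A_p] in Eq (21) reduces to a scaling of the difference vector by E_p(m).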
[0063] In this embodiment, Γ_s is estimated by the equation:
$$\Gamma_s = \Gamma_y - \Gamma_p$$
Γ_y may be calculated with the help of Eq. (15) if the data stream is decoded.
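Computing Γ_y from decoded samples and deriving Γ_s from it can be sketched as follows. The sketch assumes that Γ_y = Γ_s + Γ_p holds for uncorrelated signals, that samples before the start of the frame are zero, and that the matrix entries are sums of lagged products over the frame; the function names are hypothetical.

```python
def autocorrelation_matrix(u, order):
    """Matrix with entries r(j, k) = sum_n u(n-j) * u(n-k), 1 <= j, k <= order.

    Samples before the start of the frame are treated as zero, so the sum
    starts at n = max(j, k).
    """
    n_samples = len(u)

    def r(j, k):
        return sum(u[n - j] * u[n - k] for n in range(max(j, k), n_samples))

    return [[r(j, k) for k in range(1, order + 1)] for j in range(1, order + 1)]


def estimate_gamma_s(gamma_y, gamma_p):
    """Element-wise difference Gamma_y - Gamma_p (assumes uncorrelated s and p)."""
    return [[gy - gp for gy, gp in zip(row_y, row_p)]
            for row_y, row_p in zip(gamma_y, gamma_p)]


# Illustrative values: a three-sample "frame" and a white perturbation.
gamma_y = autocorrelation_matrix([1.0, 2.0, 3.0], order=2)
gamma_s = estimate_gamma_s(gamma_y, [[0.5, 0.0], [0.0, 0.5]])
```

In practice the subtraction can yield a matrix that is badly conditioned or not positive definite when the perturbation energy is overestimated, so an implementation would probably limit Γ_p relative to Γ_y before inverting Γ_s.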
[0064] In this embodiment, the bitstream of the encoded signal has to be decoded to make
the estimation of Γ_p and Γ_s.
Alternatively, the estimation of these matrices and vectors can also be done on the
basis of the codec parameters of the signal y_e(n) by interpreting the fixed gain and
the adaptive gain.
[0065] After the estimation of A_p, Γ_p and Γ_s, the clean LPC coefficients A_s are
generated by one of the equations (20) or (21). It should be noted that the calculated
clean LPC A_s are an estimation of the LPC of the useful signal s(n). Accordingly, the
calculated LPC A_s are only as good as the estimations of A_p, Γ_p and Γ_s.
[0066] The filter is applied to the LPC coefficients A_y to get the clean LPC A_s, and
finally the LPC are replaced in the bitstream by rewriting each frame with the clean LPC
parameters A_s. The frames are modified sequentially and sent to the decoder as signals
y_e'(n).
[0067] This method of improving the speech signal quality can be applied anywhere in the path
between the encoder and the decoder. For example, in telecommunication systems, the method
can be applied in the terminal of the sender, in the terminal of the receiver or in
one of the routers, telephone switches or gateways between different networks.
[0068] The use of the modified LPC coefficients improves the quality of the received signal
which is demonstrated with the help of Figure 3.
[0069] Figure 3 shows a comparison of transfer functions with non-modified LPC versus modified
LPC coefficients. The synthesis LPC filter can be described by the filter
transfer function H(f) in the frequency domain. The graph of Fig. 3 shows the function
H(f) as a function of the frequency f for a non-noisy LPC filter and, with the dashed
line, for a noisy LPC filter. The transfer function of the noisy LPC filter, in our
case a non-modified LPC, has more energy but is smoother. Using an LPC that was generated
on the basis of a noisy signal worsens the quality of the speech. Hence, the modification
of the LPC to a clean LPC makes it easier for the receiver to understand the received
speech and enhances the clarity of the speech.
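The magnitude of such a transfer function can be evaluated directly from the LPC coefficients. The sketch below assumes the synthesis filter H(z) = 1 / (1 − Σ a(k)·z^(−k)), i.e. the predictor sign convention ŷ(n) = Σ a(k)·y(n−k); the first-order coefficients used are made-up values and do not reproduce the curves of Figure 3.

```python
import cmath
import math


def lpc_magnitude_response(a, freq_hz, sample_rate_hz):
    """|H(f)| for the synthesis filter H(z) = 1 / (1 - sum_k a[k-1] * z^-k)."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    denominator = 1.0 - sum(
        coeff * cmath.exp(-1j * omega * k) for k, coeff in enumerate(a, start=1)
    )
    return 1.0 / abs(denominator)


# Illustrative first-order filters: a "clean" and a flattened ("noisy") predictor.
clean_peak = lpc_magnitude_response([0.9], 0.0, 8000.0)
noisy_peak = lpc_magnitude_response([0.72], 0.0, 8000.0)
```

In this toy case the filter built from the flattened coefficient has the lower peak at f = 0, which is the kind of smoothing of the spectral envelope described for the noisy LPC filter.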
[0070] If only the LPC are modified, the received speech still includes noise. Therefore,
as an option, the flow chart of Figure 2 may be extended by an additional step which
reduces noise on the rest of the bitstream. This noise reduction is performed after the
estimation of A_p, Γ_p and Γ_s^{-1} and before the generation of the new frames of the
bitstream of the signal.
[0071] One exemplary noise reduction technique for the rest of the bitstream is the method
for reducing noise on the codec parameters pitch gain and codebook gain described in the
above-mentioned "Compressed Domain Noise Reduction and Echo Suppression for Network
Speech Enhancement".
[0072] Figure 4 shows a second embodiment of the modification of LPC coefficients. In this
case, the function is included in the encoder 4.
[0073] In speech encoding, typically with AMR, AMR-WB, G723 or G729, the LPC coefficients are
computed by an analysis of successive weighted frames. Typically, the Levinson-Durbin
algorithm is used to obtain the LPC coefficients from the samples y(n) of the analysis
frame. Our method may be placed as a postfilter of the computation blocks of the LPC
analysis. In this scenario, it enhances the LPC coefficients by reducing the influence of
the noise. The needed estimations (A_p, Γ_p and Γ_s^{-1}) may be done by using the LPC
coefficients but also some additional information from the samples y(n), as depicted in
Fig 4. Finally, the filter of Eq (20)/(21) is applied to the perturbed coefficients to
get the enhanced ones.
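For reference, the Levinson-Durbin recursion mentioned above can be sketched in a few lines. This is a textbook form of the autocorrelation method without windowing or weighting, written for the predictor convention ŷ(n) = Σ a(k)·y(n−k); it is not the routine of any particular codec, and the function names are illustrative.

```python
def autocorrelation(y, max_lag):
    """r(lag) = sum_n y(n) * y(n - lag) for lag = 0 .. max_lag."""
    n_samples = len(y)
    return [sum(y[n] * y[n - lag] for n in range(lag, n_samples))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve sum_k a(k) * r(|j - k|) = r(j), j = 1 .. order, for a(1) .. a(order).

    Returns the coefficient list and the final prediction error energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused so that a[k] matches a(k)
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        reflection = acc / error
        updated = a[:]
        updated[i] = reflection
        for k in range(1, i):
            updated[k] = a[k] - reflection * a[i - k]
        a = updated
        error *= 1.0 - reflection * reflection
    return a[1:], error


# Illustrative autocorrelation sequence of a first-order process with coefficient 0.5.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

The coefficients A_y obtained this way would then be passed, together with the estimates of A_p, Γ_p and Γ_s^{-1}, to the filter of Eq (20)/(21).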
[0074] In the embodiment of Figure 4, the method of improving the speech quality is performed
within the encoder. The encoder receives the samples from the A/D-converter. The samples
are organized as frames. After windowing a frame, the LPC analysis outputs the LPC
coefficients A_y. The parameters A_p, Γ_p and Γ_s^{-1} are estimated as in one of the
embodiments described above. The LPC coefficients A_s are calculated by one of the
equations (20) and (21). The encoding of the frame is done with A_s.
Reference number list
[0075]
- 1 telecommunication system
- 2 sender
- 3 A/D-converter
- 4 encoder
- 5 transmission block
- 6 decoder
- 7 D/A-converter
- 8 receiver