BACKGROUND OF THE INVENTION
1. Field of the Invention
[0001] The present invention generally relates to electronic filtering for enhancing a desired
signal component of a mixed signal, and more specifically to a method and apparatus
for real-time unmixing (separation or deconvolving) of a desired signal from a mixture
of independent signals, particularly useful, for example, in a hearing aid.
2. Description of the Prior Art
[0002] When one is listening to someone or something, "noise" or undesired signals that
interfere with the voice or desired signal are ubiquitous. People with hearing impairment
are especially vulnerable to noise. Background conversations, interference from digital
devices (such as mobile telephones), and car or other environment-specific noises can make it
very difficult for a hearing impaired person to understand a desired speech signal.
A reduction in the noise level of a signal, coupled with an automatic focus on a desired
signal component, can significantly improve the performance of an electronic voice
processor, such as one used in an advanced hearing aid.
[0003] In recent years, hearing aids using digital signal processing have been introduced.
They contain one or more microphones, analog to digital converters, digital signal
processors, and speakers. Usually the digital signal processors divide the incoming
signals into several frequency regions using filter banks. Within each of those regions,
signal gain and dynamic compression parameters can be individually adjusted in accordance
with the requirement for a particular user of the hearing aid, in an attempt to improve
intelligibility. Additionally, digital signal processing algorithms for feedback reduction
and noise reduction are available; however, they have major limitations. For example,
currently available noise reduction algorithms yield only a limited improvement when
speech and background noise are in the same frequency region, because they are unable
to distinguish between speech and background noise.
[0004] One relatively new digital signal processing approach currently finding use for noise
reduction in areas such as speech recognition, data communication and sensor signal
processing, involves a technique known generally as Independent Component Analysis
(ICA), and in more specific applications as Blind Source Separation (BSS). This technique
searches an input signal having multiple components, for a signal transformation which
will minimize the statistical dependence between its components. Accordingly, BSS
is a signal separation technique capable of delivering dramatic improvements in signal
to noise ratio for mixtures of independent signals, such as multiple voices or mixtures
of voice and noise signals.
[0005] It is an object of the present invention to provide an electronic filtering technique
incorporating BSS processing which can operate in real time to enhance reception of
a desired signal, such as the voice of a nearby person, and furthermore, if desired,
can be incorporated in a hearing aid.
SUMMARY OF THE INVENTION
[0006] An electronic filtering device is provided for performing real-time unmixing of a signal desired
to be recovered by a user of the device, where the desired signal emanates from one
of a plurality of independent signal sources. Two microphones positioned along a common
axis develop first and second electrical input signals in response to reception by
the microphones of acoustic signals from the plurality of independent signal sources.
The spatial position of the common axis of the microphones is controllable in real
time by the user to align the common axis so it points in the direction of the source
of the desired signal, thereby imparting an inherent directionality to the input signals.
An adaptive unmixing signal processor responsive to the input signals develops output
signals wherein the desired signal is separated from the mixture signal. In one preferred
embodiment of the invention a preprocessor is provided to enhance the inherent directionality
of the input signals by establishing a relative time delay therebetween. Furthermore,
the preprocessor may subject the enhanced input signals to a decorrelation processing
before their application to the unmixing signal processor. A selected output of the
unmixing signal processor can be applied as an input to a speaker for reproduction,
or can be further processed for signal enhancement by an additional processor before
reproduction.
BRIEF DESCRIPTION OF THE DRAWINGS
[0007]
Figure 1 illustrates in block diagram form an electronic filtering device constructed
in accordance with the principles of the present invention;
Figure 2 illustrates in block diagram form the preprocessing stage of the electronic
filtering device shown in Figure 1;
Figure 3 illustrates in block diagram form the technique of Blind Source Separation
as used in the electronic filtering device of the invention; and
Figure 4 illustrates in block diagram form an exemplary embodiment of a Blind Source
Separator useful in the electronic filtering device of the invention.
DETAILED DESCRIPTION OF THE INVENTION
[0008] Figure 1 illustrates in block diagram form an application of the invention for use
in hearing aids. A hearing aid 10 includes two microphones 12 and 14 for developing
two input signals 1 and 2, respectively. In accordance with one aspect of the invention,
the microphones are mounted in the hearing aid such that a common axis of their positioning
always extends substantially in the direction in which the wearer of the hearing aid
looks when being attentive to a signal source such as a voice. This microphone positioning
imparts an inherent directionality to input signals 1 and 2. Since each microphone
develops electrical signals representative of the acoustic waves received thereby
from sound sources within its operating range, each input signal may comprise a mixture
of unknown signals from an unknown number of signal sources. Input signals 1 and 2
are processed in three main stages. At a first stage 16, the input signals are preprocessed
for enhancing the inherent directionality already imparted thereto by their positioning.
At a second stage 18, the resulting signals are subjected to an unmixing processing
(sometimes referred to as separation processing), which is designed to produce estimates
of the original unknown signals picked up by microphones 12 and 14. At a third stage
20, the outputs of the unmixing processing are preferably postprocessed to produce
the desired signal 22, which can then be applied to a speaker 24 of the hearing aid
10 for reproduction and presentation to a user.
[0009] As illustrated in Fig. 2, preprocessing stage 16 begins with normalization of the
raw input signals. Automatic Gain Control is used to normalize input signals 1 and
2 to a [-1,+1] range. The inputs 1 and 2 are now given by a vector x = (x1(t), x2(t)).
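By way of illustration only, the normalization to a [-1,+1] range can be sketched as a per-block peak normalization. This is a simplified stand-in for Automatic Gain Control, which in practice adapts its gain over time; the function name normalize_block and the sample values are hypothetical and are not taken from the described embodiment.

```python
def normalize_block(samples, eps=1e-12):
    """Scale a block of samples so that its largest magnitude is 1.0.

    Simplifying assumption: one fixed gain per block, rather than a
    time-varying gain as a real AGC circuit would apply.
    """
    peak = max(abs(s) for s in samples)
    if peak < eps:
        # Silent block: leave it unchanged to avoid dividing by ~0.
        return list(samples)
    return [s / peak for s in samples]


# Hypothetical raw blocks from the two microphones.
x1 = normalize_block([0.02, -0.08, 0.04])
x2 = normalize_block([0.5, 2.0, -1.0])
```

After this step every sample of x1 and x2 lies in the [-1,+1] range, regardless of the original microphone levels.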
[0010] In accordance with one aspect of the invention, in order to adapt a blind source
separation (BSS) technique for use in a device as small as a hearing aid, and to have
it operate in real-time, preprocessing stage 16 also provides at least the first,
and preferably both of the following additional processing:
- Enhancement of signal source directionality inherent in the input signals, resulting
from a directional arrangement of microphones 12 and 14 with respect to a source of
interest. In the hearing aid exemplary embodiment, the directionality of the source
of interest is presumed to be in the direction that the user is looking. Accordingly,
the microphones are positioned on the hearing aid along an axis that is in the direction
that the user would be looking, and the direction of the source of interest is presumed
to be at zero degrees with respect to such axis. The direction of a second source
can be estimated in the preprocessing stage (delay box in 16) resulting in an adaptive
delay (δ). The delay is a positive or negative fractional delay, such that the most
powerful component of the inputs other than the one approximately aligned with the
microphone axis arrives synchronously at the two microphones. For example, the delay would
be zero if the second source were perpendicular to the microphone axis. For this enhancement,
the normalized input signals x = (x1(t),x2(t)) are modified as follows:


- Decorrelation of the input signals. In the exemplary embodiment decorrelation is carried
out by a diagonalization of the correlation matrix. More specifically, let C = Covariance(xT),
where xT is the transpose of x. If significant correlation exists between the two input
signals (x1, x2), a decorrelation over a time window D means transformation of the signals in two
steps: (1) centering around the mean over the data in the window D; and (2) affine
transformation of the resulting data points in order to diagonalize the covariance
matrix of the resulting signals. Assuming that x is centered around its mean, we use
the following transformation:

[0011] In the illustrated embodiment, the window D comprised 16,000 samples.
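The two decorrelation steps described above can be sketched for two channels as follows. The sketch assumes that a plane rotation is an acceptable choice of affine transformation: it diagonalizes a 2 by 2 covariance matrix but does not equalize the channel variances. The function names and the synthetic test signals are illustrative assumptions and are not part of the described embodiment.

```python
import math


def covariance_2ch(a, b):
    """Return (c11, c12, c22) of the 2 by 2 covariance matrix of two channels."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    c11 = sum((p - ma) ** 2 for p in a) / n
    c22 = sum((q - mb) ** 2 for q in b) / n
    c12 = sum((p - ma) * (q - mb) for p, q in zip(a, b)) / n
    return c11, c12, c22


def decorrelate(a, b):
    """Center both channels over the window, then rotate to zero their covariance."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    # Step (1): centering around the mean over the window.
    ua = [p - ma for p in a]
    ub = [q - mb for q in b]
    # Step (2): rotation angle that diagonalizes a symmetric 2 by 2 matrix.
    c11, c12, c22 = covariance_2ch(ua, ub)
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    c, s = math.cos(theta), math.sin(theta)
    ya = [c * p + s * q for p, q in zip(ua, ub)]
    yb = [-s * p + c * q for p, q in zip(ua, ub)]
    return ya, yb


# Synthetic, strongly correlated channels (assumed data, 200-sample window).
x1 = [math.sin(0.1 * k) for k in range(200)]
x2 = [0.8 * math.sin(0.1 * k) + 0.3 * math.cos(0.37 * k) for k in range(200)]
y1, y2 = decorrelate(x1, x2)
```

The off-diagonal covariance term of (y1, y2) is zero up to rounding error, whereas that of (x1, x2) is not. In the illustrated embodiment the same computation would run over a window D of 16,000 samples.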
[0012] The above-described preprocessing enables the subsequent BSS processing to arrive
at a solution in a shorter time than if the preprocessing were not provided, and furthermore
increases the probability that the BSS processing will arrive at a valid solution
instead of a local minimum.
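As a rough illustration of why the delay δ used for directionality enhancement is fractional, the difference in arrival time between two closely spaced microphones can be estimated from simple geometry. The sketch below assumes a far-field (plane wave) source, a speed of sound of 343 m/s, and a sampling rate of 16 kHz; the sampling rate and the function names are assumptions made for the example and are not stated in this specification. The 11 mm spacing is the upper bound mentioned for in-the-ear hearing aids.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed)


def arrival_delay(spacing_m, angle_deg):
    """Arrival-time difference (seconds) between two microphones.

    angle_deg is measured between the microphone axis and the direction
    of a far-field source: 0 is on-axis, 90 is perpendicular to the axis.
    """
    return spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND


def delay_in_samples(spacing_m, angle_deg, sample_rate_hz):
    """Same delay expressed in (generally fractional) sample periods."""
    return arrival_delay(spacing_m, angle_deg) * sample_rate_hz


on_axis = delay_in_samples(0.011, 0.0, 16000.0)         # about 0.51 sample
perpendicular = delay_in_samples(0.011, 90.0, 16000.0)  # essentially zero
```

Under these assumptions even an on-axis source produces a delay of only about half a sample period, so compensating for the direction of a second source calls for sub-sample (fractional) delays, and the delay vanishes for a source perpendicular to the microphone axis.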
[0013] Figure 3 illustrates the principles of the operation of a BSS algorithm upon which
the unmixing or separation of the desired component from the input signals is based.
The technique is called Blind Source Separation because it makes few assumptions about
the type of signals present in the mixture. As is well known to those of ordinary skill
in this technology, BSS processing is intended to recover the set of n unknown source
signals from a set of their mixtures, assuming that the n source signals are independent.
More specifically, as shown in Figure 3, if s is a vector of n sources, and x is a vector
of m observations of those sources (i.e., the raw input signals from the m microphones),
the goal of a BSS processor is to discover the m by n mixing matrix A such that
x = As,
where x is the preprocessed signals shown in Figure 2 (i.e., x"), or equivalently, and as is
done in the present invention, to find an unmixing or separating matrix W such that
z = Wx = ŝ ≈ s,
where z is the vector of the independent estimates of the component signals s, and ŝ is an
estimate of the source signals.
[0014] As previously noted, the sources s = (s1, s2) and the environment-dependent mixing
matrix A are unknown. The BSS processor (which, as is well known, may be implemented using a
neural network) only sees the inputs x = (x1, x2) coming from two microphones in order to
determine estimates z = (z1, z2) of the independent component signals s. In this case, the
inputs x are actually the preprocessed signals x", previously described.
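The relationship between the mixing matrix A and the unmixing matrix W can be illustrated with a small numeric sketch for n = m = 2. The matrix A and the source values below are invented for the example. In a real BSS processor A is unknown and W is learned from the inputs alone, generally recovering the sources only up to scaling and ordering; the sketch simply uses the ideal W equal to the inverse of A to show what a successful unmixing achieves.

```python
def matvec(M, v):
    """Multiply a 2 by 2 matrix by a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]


def inverse_2x2(M):
    """Closed-form inverse of a non-singular 2 by 2 matrix."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


A = [[1.0, 0.6],   # hypothetical mixing matrix (unknown in practice)
     [0.5, 1.0]]
s = [0.3, -0.7]    # hypothetical source samples at one time instant

x = matvec(A, s)       # what the two microphones observe: x = As
W = inverse_2x2(A)     # ideal unmixing matrix, for illustration only
z = matvec(W, x)       # z = Wx, which reproduces s
```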
[0015] Figure 4 illustrates a block diagram of the main components of a BSS processor 400.
BSS processor 400 comprises: an unmixing component 402 for recording and updating
the state of the unmixing process defined by parameters W and v; a nonlinear component
404 for generating statistics used in the adaptation process; and an adaptation component
406 for computing changes in the values of the unmixing parameters, ΔW and Δv.
[0016] As will now be described in greater detail, the BSS processor 400 continuously adapts
two state variables: the 2 by 2 unmixing matrix W, and the 2 by 1 bias vector v. The unmixing
component 402 buffers the most recent N samples input to BSS processor 400. It computes the
output z corresponding to the most recent input sample x by using the current values of the
parameters W, that is, z = Wx as defined above. These parameters are initialized with small
random values at the beginning of the process (while v = 0).
[0017] The nonlinear component 404 transforms the output of the system using an
invertible mapping. The objective of component 404 is to avoid processing very large numeric
values of the outputs, which may be infinities from a computational point of view.
This objective is carried out by processing statistically equivalent quantities, obtained
after running the outputs z through the invertible mapping. An example of a nonlinear
transformation used in component 404 is the sigmoidal nonlinearity
y, defined below, taking as arguments z translated with v over the input buffer.
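A sigmoidal nonlinearity of this kind can be sketched with the standard logistic function. The sketch assumes that "translated with v" means that the bias v is added to z before the nonlinearity is applied; that sign convention and the function names are assumptions made for the example.

```python
import math


def logistic(u):
    """Logistic sigmoid, written so that exp() never overflows."""
    if u >= 0.0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def nonlinear_stage(z, v):
    """Map each output z_i, shifted by its bias v_i, into the range [0, 1]."""
    return [logistic(zi + vi) for zi, vi in zip(z, v)]


y_mid = nonlinear_stage([0.0, 0.0], [0.0, 0.0])        # [0.5, 0.5]
y_big = nonlinear_stage([1000.0, -1000.0], [0.0, 0.0])  # stays finite
```

Because the mapping is bounded and invertible, very large output values are compressed into a finite range without discarding the ordering of the samples.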

[0018] The adaptation component 406 determines changes in the unmixing parameters W and
v: i.e., ΔW and Δv. The objective is to maximize the mutual information that the outputs
y contain about the inputs x, as is well known to those skilled in this technology, and as
described, for example, by A.J. Bell and T.J. Sejnowski in their article entitled "An
information-maximization approach to blind separation and blind deconvolution" published in
Neural Computation, 7:1129-1159, 1995, and as also described in Bell's US patent 5,706,402.
This objective reduces to a condition on the joint entropy H = H(y1, y2) of the outputs y:

[0019] The resulting adaptation rules are modified to perform a "natural gradient" step
known to those skilled in this technology, such as described by S. Amari in his publication
entitled "Minimum mutual information blind separation", published in Neural Computation,
1996.
[0020] We obtain the following update rules:


[0021] A typical value for the learning rate η is 0.005.
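For orientation, one commonly published form of the natural-gradient information-maximization update for a logistic nonlinearity is ΔW = η (I + (1 - 2y) zT) W and Δv = η (1 - 2y), with z = Wx and y the logistic function of z + v. The sketch below implements a single-sample step of that form. It is offered as an assumption drawn from the general literature and may differ in detail (for example in buffering over N samples or in sign conventions) from the update rules of the present embodiment; the function names are hypothetical.

```python
import math

ETA = 0.005  # learning rate, using the typical value given in the text


def logistic(u):
    """Logistic sigmoid, written so that exp() never overflows."""
    if u >= 0.0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def infomax_step(W, v, x, eta=ETA):
    """One natural-gradient update of the 2 by 2 matrix W and bias v.

    Assumed rule (not necessarily the one in this specification):
        z = W x,  y = logistic(z + v)
        dW = eta * (I + (1 - 2y) z^T) W
        dv = eta * (1 - 2y)
    Returns the updated W, the updated v, and the current output z.
    """
    z = [W[i][0] * x[0] + W[i][1] * x[1] for i in range(2)]
    y = [logistic(z[i] + v[i]) for i in range(2)]
    g = [1.0 - 2.0 * yi for yi in y]
    # G = I + g z^T
    G = [[(1.0 if i == j else 0.0) + g[i] * z[j] for j in range(2)]
         for i in range(2)]
    # dW = eta * G W
    dW = [[eta * sum(G[i][k] * W[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    W_new = [[W[i][j] + dW[i][j] for j in range(2)] for i in range(2)]
    v_new = [v[i] + eta * g[i] for i in range(2)]
    return W_new, v_new, z


# Sanity check with a zero input: y = 0.5, so only the identity term acts.
W1, v1, z1 = infomax_step([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], [0.0, 0.0])
```

With a zero input sample the step reduces to W being scaled by (1 + η) and v being left unchanged, which is a convenient check that the identity term and the bias term are wired correctly.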
[0022] Referring again to Figure 1, following unmixer 18 is the postprocessing step 20,
wherein a determination is made of which output estimate of unmixer 18 is more likely
to represent voice rather than noise, as well as a normalization of the power of the
outputs by scaling them to the level of the input powers. The output signal selection
can be based on multiple criteria using, for example, voice specific feature extraction
and analysis, and/or dominant speaker detection, which can also be accomplished using
feature extraction and analysis.
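The power normalization part of the postprocessing can be sketched as follows. The sketch assumes that "power" means the mean squared sample value over a block and that a single gain per block is sufficient; the function names and sample values are hypothetical. The selection of the output most likely to represent voice is not shown, since it depends on the feature extraction that is chosen.

```python
import math


def mean_power(signal):
    """Mean squared value of a block of samples."""
    return sum(s * s for s in signal) / len(signal)


def match_power(output, reference, eps=1e-12):
    """Scale an unmixer output so its mean power equals that of a reference input."""
    p_out = mean_power(output)
    if p_out < eps:
        # Silent output: no meaningful gain can be computed.
        return list(output)
    gain = math.sqrt(mean_power(reference) / p_out)
    return [gain * s for s in output]


# Hypothetical block: a quiet unmixer output rescaled to the input level.
scaled = match_power([0.1, -0.1, 0.1, -0.1], [0.5, -0.5, 0.5, -0.5])
```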
[0023] As previously noted, in the illustrated embodiment of the present invention, the
BSS processing is applied for use in hearing aids. The inputs to the system are given
by two microphones which, with the present invention, can be situated very close to
one another. In terms of the notation in the BSS processor shown in Figures 3 and
4, the system has two inputs and two outputs (n = m = 2).
[0024] Particularly for the case of hearing aids, the present invention addresses the following
problems:
- It works with real world mixtures of signals in anechoic environments. The challenge
is that a hearing aid using BSS would incorporate two microphones which, given the
physical limitation imposed by in-the-ear hearing aids, may be less than 11 mm apart.
- It can cope with more signals than the number of microphones. Until now, this was
thought to be impossible since the existing theory behind BSS guarantees that a solution
exists only when n ≤ m, i.e., when there are at least as many microphones as sources.
- It works under non-stationary mixing conditions in order to follow moving sources
and adapt to changing listening environments.
- It works in real time so that the user is not subjected to disconcerting delays in
the signals and so that the hearing aid can adapt as necessary.
[0025] Thus, there has been shown and described a novel method and apparatus for real-time
unmixing of a desired signal from a mixture of independent signals. Many changes,
modifications, variations and other uses and applications of the subject invention
will, however, become apparent to those skilled in the art after considering this
specification and its accompanying drawings, which disclose a preferred embodiment
thereof. For example, although pre- and post- BSS processors 16 and 20 are described,
as noted herein, they are not strictly necessary in the broadest application of the
present invention. Additionally, the various components of BSS processor 400 can be
biased with a priori knowledge about the input signals to facilitate its operation, for example, knowledge
about the distribution of the amplitude values of the source signals or even that
one input signal represents speech. Furthermore, signal processing for enhancing source
signal directionality can be incorporated into preprocessor 16. Even furthermore,
the teaching of the present invention can be extremely useful for interference cancellation,
separation of one voice from a mixture of many voices ("cocktail party" problem),
and for preprocessing sound mixtures for noise reduction in order to allow further
processing of a desired sound signal.
All such changes, modifications, variations and other uses and applications which
do not depart from the teachings herein are deemed to be covered by this patent, which
is limited only by the claims which follow as interpreted in light of the foregoing
description.
1. An electronic filtering device for performing real-time unmixing of a signal desired
to be recovered by a user of the device, where the desired signal emanates from one
of a plurality of independent signal sources, comprising:
two microphones positioned along a common axis for developing first and second electrical
input signals in response to reception by the microphones of acoustic signals from
the plurality of independent signal sources, wherein the spatial position of the common
axis of the microphones is controllable in real time by the user to align the common
axis so that it substantially continuously points in the direction of the source of
the desired signal; and
an adaptive unmixing signal processor responsive to said input signals for developing
output signals wherein the desired signal is separated from the mixture signal.
2. The apparatus of claim 1, wherein the common axis is positioned on the user in a manner
so as to point in the direction of the source.
3. The apparatus of claim 2, wherein said microphones are mounted in a common housing
that is intended to be co-located with the ear of the user.
4. The apparatus of claim 1, further including a preprocessor for modifying the input
signals before they are applied to the unmixing signal processor.
5. The apparatus of claim 4, wherein the preprocessor introduces a relative delay between
components of the input signals.
6. The apparatus of claim 4, wherein the preprocessor subjects the input signals to a
decorrelation processing.
7. The apparatus of claim 1, further including a postprocessor responsive to the output
signals of the unmixing signal processor for selecting the desired signal for application
to a signal reproduction device.
8. The apparatus of claim 1, wherein the unmixing signal processor comprises a blind
source signal separator.
9. The apparatus of claim 8, wherein the blind source signal separator comprises a neural
network for performing an unsupervised learning process that operates to maximize
the joint output entropy of the output signals.
10. A method for performing real-time unmixing of a signal desired to be recovered by
a user, where the desired signal emanates from one of a plurality of independent signal
sources, the method comprising the following steps:
positioning two microphones along a common axis, for developing first and second electrical
input signals in response to reception by the microphones of acoustic signals from
the plurality of independent signal sources, said positioning being such that the
common axis of the microphones is controllable in real time by the user to align the
common axis so that it substantially continuously points in the direction of the source
of the desired signal; and
subjecting said input signals to an adaptive unmixing signal processing for developing
output signals wherein the desired signal is separated from the mixture signal.
11. The method of claim 10, wherein said positioning locates the common axis proximate
the user in a manner so that it points in the direction that the user is looking.
12. The method of claim 11, wherein said positioning locates the common axis on a common
housing that is intended to be co-located with the ear of the user.
13. The method of claim 10, further including a preprocessing step for modifying the input
signals before they are subjected to the unmixing signal processing.
14. The method of claim 10, wherein the preprocessing step introduces a relative delay
between the input signals.
15. The method of claim 14, wherein the preprocessing step subjects the relatively delayed
input signals to decorrelation processing.
16. The method of claim 15, wherein the decorrelation processing step is carried out by
a diagonalization of a correlation matrix formed using the relatively delayed input
signals.
17. The method of claim 10, further including a postprocessing step responsive to the
output signals of the unmixing signal processing step for selecting the desired signal
for application to a signal reproduction device.
18. The method of claim 10, wherein the unmixing signal processing step comprises blind
source signal separation processing.
19. The method of claim 18, wherein the blind source signal separation processing comprises
an unsupervised learning process that operates to maximize the joint output entropy
of the output signals.
20. A method for performing real-time unmixing of a signal desired to be recovered by
a user, where the desired signal emanates from one of a plurality of independent signal
sources, the method comprising the following steps:
positioning two microphones along a common axis, for developing first and second electrical
input signals in response to reception by the microphones of acoustic signals from
the plurality of independent signal sources;
preprocessing the first and second electrical input signals so as to enhance signal
source directionality inherent therein due to the positioning of the microphones;
and
subjecting said directionality enhanced input signals to an adaptive unmixing signal
processing for developing output signals wherein the desired signal is separated from
the mixture signal.
21. The method of claim 20, wherein said positioning is such that the common axis of the
microphones is controllable in real time by the user to align the common axis so that
it substantially continuously points in the direction of the source of the desired
signal.
22. The method of claim 20, wherein said preprocessing comprises introducing a relative
delay between the input signals so as to further enhance their directionality.
23. The method of claim 22, wherein said preprocessing also includes a decorrelation processing
of the relatively delayed input signals.