BACKGROUND
1. Technical Field
[0001] The present disclosure relates to the field of processing sound fields, and in particular
to a system and method for maintaining the spatial stability of a sound field.
2. Related Art
[0002] Stereo and multichannel microphone configurations may be used for processing a sound
field that is a spatial representation of an audible environment associated with the
microphones. The audio received from the microphones may be used to reproduce the
sound field using audio transducers.
[0003] Many computing devices may have multiple integrated microphones used for recording
an audible environment associated with the computing device and communicating with
other users. Computing devices typically use multiple microphones to improve noise
performance with noise suppression processes. The noise suppression processes may
result in the reduction or loss of spatial information. In many cases the noise suppression
processing may result in a single, or mono, output signal that has no spatial information.
BRIEF DESCRIPTION OF DRAWINGS
[0004] The system may be better understood with reference to the following drawings and
description. The components in the figures are not necessarily to scale, emphasis
instead being placed upon illustrating the principles of the disclosure. Moreover,
in the figures, like referenced numerals designate corresponding parts throughout
the different views.
[0005] Other systems, methods, features and advantages will be, or will become, apparent
to one with skill in the art upon examination of the following figures and detailed
description. It is intended that all such additional systems, methods, features and
advantages be included with this description, be within the scope of the invention,
and be protected by the following claims.
[0006] Fig. 1 is a schematic representation of a system for maintaining the spatial stability
of a sound field.
[0007] Fig. 2 is a further schematic representation of a system for maintaining the spatial
stability of the sound field.
[0008] Fig. 3 is a flow diagram representing a method for maintaining the spatial stability
of the sound field.
DETAILED DESCRIPTION
[0009] In a system and method for maintaining the spatial stability of a sound field, balance
gains may be calculated for each of two or more microphone signals. The balance gains
may be associated with a spatial image in the sound field. Signal values may be calculated
for each of the received microphone signals. The signal values may be signal estimates
or signal gains calculated to improve a characteristic of the microphone signals.
The differences between the signal values associated with each microphone signal are
limited to mitigate audible distortions in the spatial image. Some difference between
signal values may be allowable in order to improve the audible characteristics of
the received microphone signals. One or more microphone signals are adjusted responsive
to the two or more balance gains and the signal gains to maintain the spatial stability
of the sound field. The adjustments of one or more microphone signals may include
mixing of two or more microphone signals. Further adjustments to the signal gains
may be made responsive to the mixing process. The signal gains are applied respectively
to each of the two or more microphone signals.
[0010] Figure 1 is a schematic representation of a system 100 for maintaining the spatial stability
of a sound field. Two or more microphones 102 receive the sound field. Stereo
and multichannel microphone configurations may be utilized for processing the sound
field that is a spatial representation of an audible environment associated with the
microphones 102. Many audible environments associated with the microphones 102 may
include undesirable content that may be mitigated by processing the received sound
field. Microphones 102 that are arranged in a far field configuration typically receive
more undesirable content, or noise, than microphones 102 in a near field configuration.
Far field configurations may include, for example, a hands free phone, a conference
phone and microphones embedded into an automobile. Far field configurations are capable
of receiving a sound field that represents the spatial environment associated with
the microphones 102. Near field configurations typically place the microphone 102
in close proximity to a user. Undesirable content may be mitigated in both near and
far field configurations by processing the received sound field.
[0011] Processing that may mitigate undesirable content received in the sound field may
include echo cancellation and noise reduction processes. Echo cancellation, noise
reduction, and other audio processes may calculate one or more suppression,
or signal, gains utilizing a suppression gain calculator 106. An echo cancellation
process and a noise reduction process may each calculate one or more signal gains.
Each respective signal gain may be applied individually or a composite signal gain
may be applied to process the sound field using a gain filter 114. Echo cancellation
processing mitigates echoes caused by signal feedback between two or more communication
devices. Signal feedback occurs when an audio transducer on a first communication
device reproduces the signal received from a second communication device and subsequently
the microphones on the first communication device recapture the reproduced signal.
The recaptured signal may be transmitted to the second communication device where
the recaptured signal may be perceived as an echo of the previously transmitted signal.
Echo cancellation processes may detect when the signal has been recaptured and attempt
to suppress the recaptured signal. Many different echo cancellation processes may
mitigate echoes by calculating one or more signal gains that, when applied to the
signals received by the microphones 102, suppress the echoes. In one example implementation,
the echo suppression gain may be calculated using a coherence calculation between the
predicted echo and the microphone signal, as disclosed in
U.S. Patent No. 8,036,879, which is incorporated herein by reference, except that in the event of any inconsistent
disclosure or definition from the present specification, the disclosure or definition
herein shall be deemed to prevail.
[0012] When the microphone 102 and an audio transducer are close in proximity, the echo
cancellation process may apply a large amount of suppression, or calculate large signal
gains, as a result of the signal produced by the audio transducer dominating, or coupling
with, the microphone 102.
[0013] When one of the microphones 102 and an audio transducer are in close proximity, the
echo cancellation process may determine that a large amount of suppression is needed
to prevent the signal produced by the audio transducer from dominating, or coupling with,
the microphone 102. The echo cancellation process may calculate large signal gains to
mitigate the coupling. The large signal gains may result in a gating effect where
the communication device effectively supports only half duplex communication. Half
duplex communication may occur when the communication channel allows for reliable
communication from either the far side or the near side alternately, but not both simultaneously.
The large signal gains may suppress the coupling but may also suppress all content,
including desired voice content, resulting in half duplex communication.
[0014] Background noise is another type of undesirable signal content that may be mitigated
by processing the received sound field. Many different types of noise reduction processing
techniques may mitigate background noise. An exemplary noise reduction method is a
recursive Wiener filter. The Wiener suppression gain
$G_{i,k}$, or signal gain, is defined as

$$G_{i,k} = \frac{\widehat{SNR}^{priori}_{i,k}}{\widehat{SNR}^{priori}_{i,k} + 1}$$

[0015] Where $\widehat{SNR}^{priori}_{i,k}$ is the a priori SNR estimate and is calculated recursively by

$$\widehat{SNR}^{priori}_{i,k} = G_{i,k-1}\,\widehat{SNR}^{post}_{i,k}$$

[0016] $\widehat{SNR}^{post}_{i,k}$ is the a posteriori SNR estimate given by

$$\widehat{SNR}^{post}_{i,k} = \frac{|Y_{i,k}|^{2}}{|\hat{N}_{i,k}|^{2}}$$

[0017] Here $Y_{i,k}$ is the microphone signal and $|\hat{N}_{i,k}|$ is a background noise
estimate for frequency bin $i$ and frame $k$. In one example implementation, the background noise
estimate, or signal values, may be calculated using the background noise estimation
techniques disclosed in
U.S. Patent No. 7,844,453, which is incorporated herein by reference, except that in the event of any inconsistent
disclosure or definition from the present specification, the disclosure or definition
herein shall be deemed to prevail. In other implementations, alternative background
noise estimation techniques may be used, such as, for example, a noise power estimation
technique based on minimum statistics.
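By way of illustration only, the following sketch shows one way the recursive Wiener gain calculation above could be implemented for a single frame of one microphone signal 118. The function name, the gain floor, and the use of power spectra are assumptions made for the example and are not mandated by the equations.

```python
import numpy as np

def recursive_wiener_gains(mic_spectrum, noise_estimate, prev_gains, gain_floor=0.1):
    """Per-bin Wiener suppression gains for one frame of one microphone signal.

    mic_spectrum   -- complex FFT bins Y[i] for the current frame
    noise_estimate -- background noise magnitude estimate |N_hat[i]| per bin
    prev_gains     -- gains G[i] computed for the previous frame
    gain_floor     -- lower bound that limits the maximum suppression applied
    """
    # A posteriori SNR: observed power relative to the background noise power estimate.
    snr_post = (np.abs(mic_spectrum) ** 2) / np.maximum(np.abs(noise_estimate) ** 2, 1e-12)

    # A priori SNR estimate, calculated recursively from the previous frame's gains.
    snr_priori = prev_gains * snr_post

    # Wiener suppression gain, floored so a bin is never fully gated.
    gains = snr_priori / (snr_priori + 1.0)
    return np.maximum(gains, gain_floor)
```

Gains produced in this manner would then be subject to the difference limiter 110 and applied by the gain filter 114 described below.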
[0018] Additional noise reduction processing may mitigate specific types of undesirable
noise characteristics including, for example, wind noise, transient noise, rain noise
and engine noise. Mitigation of some specific types of undesirable noise may be referred
to as signature noise reduction processes. Signature noise reduction processes detect
signature noise and generate signal gains that may be used to suppress a detected
signature noise. In one implementation, wind noise suppression gains (a.k.a. signal
gains) may be calculated using the system for suppressing wind noise disclosed in
U.S. Patent No. 7,885,420, which is incorporated herein by reference, except that in the event of any inconsistent
disclosure or definition from the present specification, the disclosure or definition
herein shall be deemed to prevail.
[0019] The sound field received by the two or more microphones 102 may contain a spatial
representation, or a spatial image, of an audible environment. Balance gains may be
calculated responsive to the spatial image in the sound field. The balance gains may
be calculated with a balance calculator 108. The balance calculator 108 may calculate
the balance gains by measuring an energy level in a signal from each microphone 102.
The energy level differences may represent the approximate balance of the spatial
image. One or more energy levels may be calculated for each microphone 102 generating
one or more balance gains. A single balance gain may be utilized in a two microphone
configuration where the single balance gain may be the ratio of energy levels between
the two microphone signals 118.
[0020] A subband filter may process the received microphone signal 118 to extract frequency
information. The subband filtering may be accomplished by various methods, such as a
Fast Fourier Transform (FFT), critical filter bank, octave filter bank, or one-third
octave filter bank. Alternatively, the subband analysis may include a time-based filter
bank. The time-based filter bank may be composed of a bank of overlapping bandpass
filters, where the center frequencies have non-linear spacing such as octave, third-octave,
bark, mel, or other spacing techniques. The one or more energy levels may
be calculated for each frequency bin or band of the subband filter. The resulting
balance gains may be filtered, or smoothed, over time and/or frequency. The balance
calculator 108 may update the balance gains responsive to desired signal content.
For example, the balance gains may be updated when the energy level exceeds a threshold,
the signal to noise ratio (SNR) exceeds a threshold, a voice activity detector detects
voice content, or any combination thereof.
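By way of illustration only, the following sketch shows one possible balance gain update for a two microphone configuration, using an FFT-based subband analysis, a threshold-gated update, and smoothing over time as described above. The function name, the FFT size, the SNR threshold, and the smoothing coefficient are illustrative assumptions.

```python
import numpy as np

def update_balance_gains(first_frame, second_frame, prev_balance, noise_power,
                         fft_size=256, snr_threshold_db=6.0, alpha=0.9):
    """Update per-bin balance gains as a smoothed energy ratio of two microphone signals.

    first_frame, second_frame -- time-domain frames from the two microphones 102
    prev_balance              -- previously smoothed balance gains, one per FFT bin
    noise_power               -- per-bin background noise power used for the SNR gate
    """
    window = np.hanning(len(first_frame))

    # Subband analysis with an FFT; a critical, octave, mel, or bark filter bank
    # could be substituted here.
    first = np.fft.rfft(first_frame * window, fft_size)
    second = np.fft.rfft(second_frame * window, fft_size)

    first_energy = np.abs(first) ** 2
    second_energy = np.abs(second) ** 2

    # Update only bins whose SNR exceeds a threshold, so the balance gains track
    # desired signal content rather than background noise.
    snr_db = 10.0 * np.log10(
        (first_energy + second_energy) / np.maximum(noise_power, 1e-12))
    update = snr_db > snr_threshold_db

    # Instantaneous balance: ratio of energy levels between the two microphone signals.
    balance = first_energy / np.maximum(second_energy, 1e-12)

    # Smooth (filter) the balance gains over time; frequency smoothing could follow.
    return np.where(update, alpha * prev_balance + (1.0 - alpha) * balance, prev_balance)
```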
[0021] The background noise estimator 104 may calculate a background noise estimate, or
signal value, for each microphone signal 118. When the microphones 102 are spaced
apart, the background noise estimator 104 may calculate different signal values responsive
to the received sound field. Some difference in the calculated background noise estimate
may be acceptable but relatively large differences may indicate a potential corruption
or misrepresentation of one or more of the signals. For example, a user may be blocking
one microphone 102 with a finger resulting in a relatively large difference in the
background noise estimate. The background noise estimate may be utilized for many
subsequent calculations including signal-to-noise ratios, echo cancellers and noise
reduction calculators. When the subsequent calculations utilize background noise estimates
that contain relatively large differences the subsequent calculations may yield corrupted
or misrepresentative results. For example, large differences in suppression gains
between microphones 102 may result in audible distortions in the spatial image of
the sound field.
[0022] A difference limiter 110 may limit the difference in the background noise estimates,
or signal values, and/or the adaptation rates utilized in the background noise estimator
104. The difference limiter 110 may mitigate audio distortions in the spatial image
when reproduced in the output sound field. For example, a difference between corresponding
signal values in the calculated background noise estimates may be acceptable when
the difference is 2 dB (decibels) to 4 dB but noticeable when the difference exceeds
6 dB. The difference limiter 110 may, for example, limit the difference between signal
values to 6 dB or may allow a difference proportional to the signal value when the
difference is greater than 6 dB. The difference limiter 110 may utilize a coherence
and/or correlation calculation between microphones to limit a difference between the
signal values. Two signals that are correlated may indicate that the difference between
signal values should be limited. The difference limiter 110 may smooth, or filter,
the amount of limiting over time and frequency.
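By way of illustration only, the following sketch shows one way the difference limiter 110 could constrain the per-band difference between two background noise estimates, using the 6 dB figure mentioned above as a hard limit. The function name and the choice of pulling back whichever value is further from the other are assumptions made for the example.

```python
import numpy as np

def limit_difference_db(value_a, value_b, max_diff_db=6.0):
    """Limit the per-band difference between two signal values (e.g. noise estimates).

    Values are linear magnitudes; the limit is expressed in dB. Whichever value is
    further from the other is pulled back so the difference does not exceed the limit.
    """
    a_db = 20.0 * np.log10(np.maximum(value_a, 1e-12))
    b_db = 20.0 * np.log10(np.maximum(value_b, 1e-12))

    diff = a_db - b_db
    excess = np.abs(diff) - max_diff_db

    # Only act where the difference exceeds the allowed range.
    correction = np.where(excess > 0.0, np.sign(diff) * excess, 0.0)

    # Pull the larger value down toward the other; smoothing over time and
    # frequency could be applied to the correction before it is used.
    a_db_limited = a_db - np.where(correction > 0.0, correction, 0.0)
    b_db_limited = b_db - np.where(correction < 0.0, -correction, 0.0)

    return 10.0 ** (a_db_limited / 20.0), 10.0 ** (b_db_limited / 20.0)
```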
[0023] The difference limiter 110 may be applied to other signal values including suppression
gains, or signal gains, calculated using the suppression gain calculator 106. The
suppression gain calculator 106 may calculate signal gains for the echo cancellation
and noise reduction processes described above. Signature noise reduction processes
may calculate signal gains that have large differences between microphone signals
118. For example, in the case of wind noise reduction, a first microphone 102 may
receive significant wind noise and the second microphone 102 may receive negligible
wind noise. An example portable computing device may have two microphones 102 placed
several inches apart where the first microphone 102 may be located on the bottom surface
and the second microphone 102 may be located on the top surface. The first microphone
102 and the second microphone 102 may be relatively close in position although they
may not be close enough to exploit phase differences for use in, for example, a beamforming
combining process. Even though the microphones 102 are relatively close in
position on the example portable computing device, one microphone 102 may receive
significant wind noise. The suppression gain calculator 106 may calculate signal gains
that may contain relatively large differences. The difference limiter 110 may allow
some of the wind noise to be suppressed while mitigating audio distortions in the
spatial image of the sound field. For example, a difference between corresponding
signal gains generated by the suppression gain calculators 106 may be acceptable when
the difference is 2 dB to 4 dB but noticeable when the difference exceeds 6 dB. The
difference limiter 110 may limit the difference between signal values to 6 dB or may
allow a difference proportional to the signal value when the difference is greater
than 6 dB. The difference limiter 110 may smooth, or filter, the amount of limiting
over time and frequency.
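By way of illustration only, the following sketch shows one possible reading of the proportional allowance described above, applied to two sets of suppression gains expressed in dB: differences up to the limit pass unchanged, and only a fraction of any excess difference is preserved. The function name, the allowance rule, and the fraction are assumptions, not the disclosed method.

```python
import numpy as np

def limit_gain_difference(gain_a_db, gain_b_db, max_diff_db=6.0, allow_fraction=0.25):
    """Limit the per-band difference between two sets of suppression gains (in dB).

    Differences up to max_diff_db pass unchanged; beyond that only a fraction of the
    excess is allowed, so strong wind-noise suppression on one microphone is partly
    preserved without collapsing the spatial image.
    """
    diff = gain_a_db - gain_b_db
    excess = np.maximum(np.abs(diff) - max_diff_db, 0.0)

    # Allowed difference grows with the excess, i.e. proportionally when over the limit.
    allowed = max_diff_db + allow_fraction * excess

    # Keep the signal with less suppression unchanged and pull the other toward it.
    a_out = np.where(diff < -allowed, gain_b_db - allowed, gain_a_db)
    b_out = np.where(diff > allowed, gain_a_db - allowed, gain_b_db)
    return a_out, b_out
```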
[0024] The difference limiter 110 may mitigate some distortion in the spatial image when
reproduced in the output sound field although it may be possible that the combination
of one or more of the signal values calculated utilizing the background noise estimator
104 and suppression gain calculator 106 may still distort the spatial image. Additionally,
in some cases the suppression gain calculator 106 may not utilize the difference limiter
110. For example, when the microphone 102 and audio transducer are coupled as described
above resulting in a gating effect, the difference limiter 110 may not be utilized
because the audible artifacts associated with the coupling are perceptibly more distracting
than a distortion of the spatial image. In this case, the echo cancellation process may
be allowed to gate the microphone signal 118 without applying the difference limiter
110.
[0025] A balance adjuster 112 may maintain the spatial stability when the sound field is
reproduced at the output. The balance adjuster 112 may mitigate distortions in the spatial
image that may not be mitigated with the difference limiter 110. Additionally, the
balance adjuster 112 may mitigate audio distortions in the spatial image where the
difference limiter 110 may not be applied. The balance adjuster 112 may adjust the
signal gains using the balance gains calculated with the balance calculator 108. The
balance gains may represent the approximate balance of the spatial
image. The balance adjuster 112 may adjust the signal gains responsive to the balance
gains. Additionally, the balance adjuster 112 may mix, or borrow, between two or more
microphone signals 118 to maintain the spatial stability and to more closely track
the balance gains. In one example, the echo-gating triggered half-duplex use case
described above may have a first microphone signal 118 that may be gated. The balance
adjuster 112 may mitigate audio distortions in the spatial image by borrowing audio
from a second microphone signal 118 responsive to the balance gain. The second microphone
signal 118 may have associated signal gains that may be adjusted responsive to the
balance gain. The second microphone signal 118 that is borrowed may be mixed into
the first microphone signal 118. The adjustments to the signal gains and the borrowing
of microphone signals 118 may be filtered, or smoothed, over time
and frequency. The adjustments may be performed per frequency bin and/or band using
the subband filter described above.
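By way of illustration only, the following sketch shows one possible interpretation of the gain adjustment and borrowing performed by the balance adjuster 112 for two microphone signals 118. For brevity the sketch folds in the gain multiplication that the gain filter 114 would otherwise perform; the gate threshold, borrow fraction, and balance scaling are assumptions made for the example.

```python
import numpy as np

def balance_adjust(spec_first, spec_second, gains_first, gains_second, balance,
                   gate_threshold=0.05, borrow=0.5):
    """Adjust gains and borrow between two microphone spectra to track the balance gains.

    spec_first, spec_second   -- FFT frames of the first and second microphone signals
    gains_first, gains_second -- per-bin suppression (signal) gains for each signal
    balance                   -- per-bin balance gains (energy ratio, first over second)
    """
    out_first = gains_first * spec_first
    out_second = gains_second * spec_second

    # Where the first signal is effectively gated (for example by echo suppression),
    # borrow content from the second signal, scaled by the balance gain so the
    # borrowed content sits near the level the spatial image expects.
    gated_first = gains_first < gate_threshold
    out_first = np.where(gated_first,
                         out_first + borrow * np.sqrt(balance) * out_second,
                         out_first)

    # Mirrored case for the second signal; smoothing of the borrowing amount over
    # time and frequency could be added here.
    gated_second = gains_second < gate_threshold
    out_second = np.where(gated_second,
                          out_second + borrow * out_first / np.maximum(np.sqrt(balance), 1e-6),
                          out_second)

    return out_first, out_second
```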
[0026] A gain filter 114 applies the signal gains to the two or more microphone signals
118. The signal gains may be a combination of signal gains associated with one or
more suppression gain calculators 106. The gain filter 114 may utilize the subband
filter described above.
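By way of illustration only, a minimal sketch of applying per-bin signal gains in the FFT subband domain and resynthesizing a time-domain frame follows. Overlap-add framing is omitted, and the function name and FFT size are illustrative.

```python
import numpy as np

def apply_gain_filter(frame, gains, fft_size=256):
    """Apply per-bin signal gains to one windowed time-domain frame and resynthesize it."""
    window = np.hanning(len(frame))

    # Subband analysis (FFT), per-bin gain multiplication, and resynthesis.
    spectrum = np.fft.rfft(frame * window, fft_size)
    filtered = gains * spectrum
    return np.fft.irfft(filtered, fft_size)[:len(frame)]
```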
[0027] Figure 2 is a further schematic representation of a system 200 for maintaining the spatial
stability of a sound field. The system 200 comprises a processor
202, memory 204 (the contents of which are accessible by the processor 202), two or
more microphones 102 and an I/O interface 206. The two or more microphones 102 may
be either internal or external to the system 200 or a combination of internal and
external. The memory 204 may store instructions which when executed using the processor
202 may cause the system 200 to render the functionality associated with the background
noise estimator module 104, the suppression gain calculator module 106, the balance
calculator module 108, the difference limiter module 110, the balance adjuster module
112 and the gain filter module 114 described herein. In addition, data structures,
temporary variables and other information may be stored in data storage 208.
[0028] The processor 202 may comprise a single processor or multiple processors that may
be disposed on a single chip, on multiple devices or distributed over more than one
system. The processor 202 may be hardware that executes computer executable instructions
or computer code embodied in the memory 204 or in other memory to perform one or more
features of the system. The processor 202 may include a general purpose processor,
a central processing unit (CPU), a graphics processing unit (GPU), an application
specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable
gate array (FPGA), a digital circuit, an analog circuit, a microcontroller, any other
type of processor, or any combination thereof.
[0029] The memory 204 may comprise a device for storing and retrieving data, processor executable
instructions, or any combination thereof. The memory 204 may include non-volatile
and/or volatile memory, such as a random access memory (RAM), a read-only memory (ROM),
an erasable programmable read-only memory (EPROM), or a flash memory. The memory 204
may comprise a single device or multiple devices that may be disposed on one or more
dedicated memory devices or on a processor or other similar device. Alternatively
or in addition, the memory 204 may include an optical, magnetic (hard-drive) or any
other form of data storage device.
[0030] The memory 204 may store computer code, such as the background noise estimator module
104, the suppression gain calculator module 106, the balance calculator module 108,
the difference limiter module 110, the balance adjuster module 112 and the gain filter
module 114 described herein. The computer code may include instructions executable
with the processor 202. The computer code may be written in any computer language,
such as C, C++, assembly language, channel program code, and/or any combination of
computer languages. The memory 204 may store information in data structures in the
data storage 208.
[0031] The I/O interface 206 may be used to connect devices such as, for example, microphones
102, and to other components internal or external to the system 200.
[0032] Figure 3 is a flow diagram representing a method 300 for maintaining the spatial stability
of a sound field. The method 300 may be, for example, implemented using either of
the systems 100 and 200 described herein with reference to Figures 1 and 2. The method
300 may include the following acts. Calculating a balance gain for each of two or
more microphone signals 302. The balance gain may be associated with a spatial image
in the sound field. Calculating one or more signal values for each of two or more
microphone signals 304. The signal values may be the background noise estimate or
signal gains associated with echo cancellation and noise reduction processes. Limiting
the difference between the two or more signal values 306. The difference between signal
values may be limited to mitigate distortions in the spatial image of the sound field.
Adjusting one or more microphone signals responsive to the two or more balance gains
and the signal gains 308. One or more microphone signals may be mixed, or borrowed,
with another microphone signal responsive to the balance gains and signal gains. Applying
the signal gains to the two or more microphone signals 310.
[0033] All of the disclosure, regardless of the particular implementation described, is
exemplary in nature, rather than limiting. The systems 100 and 200 may include more,
fewer, or different components than illustrated in Figures 1 and 2. Furthermore, each
one of the components of systems 100 and 200 may include more, fewer, or different
elements than are illustrated in Figures 1 and 2. Flags, data, databases, tables, entities,
and other data structures may be separately stored and managed, may be incorporated
into a single memory or database, may be distributed, or may be logically and physically
organized in many different ways. The components may operate independently or be part
of a same program or hardware. The components may be resident on separate hardware,
such as separate removable circuit boards, or share common hardware, such as a same
memory and processor for implementing instructions from the memory. Programs may be
parts of a single program, separate programs, or distributed across several memories
and processors.
[0034] The functions, acts or tasks illustrated in the figures or described may be executed
in response to one or more sets of logic or instructions stored in or on computer
readable media. The functions, acts or tasks are independent of the particular type
of instruction set, storage media, processor or processing strategy and may be performed
by software, hardware, integrated circuits, firmware, micro code and the like, operating
alone or in combination. Likewise, processing strategies may include multiprocessing,
multitasking, parallel processing, distributed processing, and/or any other type of
processing. In one embodiment, the instructions are stored on a removable media device
for reading by local or remote systems. In other embodiments, the logic or instructions
are stored in a remote location for transfer through a computer network or over telephone
lines. In yet other embodiments, the logic or instructions may be stored within a
given computer such as, for example, a CPU.
[0035] While various embodiments of the system and method for maintaining the spatial stability
of a sound field have been described, it will be apparent to those of ordinary skill
in the art that many more embodiments and implementations are possible within the
scope of the present invention. Accordingly, the invention is not to be restricted
except in light of the attached claims and their equivalents.
1. A computer implemented method for maintaining spatial stability of a sound field comprising:
calculating (302) balance gains for each of two or more microphone signals (118);
calculating (304) one or more signal values for each of the two or more microphone
signals (118);
limiting (306) differences between corresponding one or more signal values for each
of the two or more microphone signals (118); and
gain adjusting (308) the two or more microphone signals (118) responsive to the calculated
balance gains and the limited one or more signal values for each of the two or more
microphone signals (118).
2. The computer implemented method of claim 1, where the one or more signal values comprises
one or more of an estimated background noise and a calculated suppression gain.
3. The computer implemented method of claims 1 to 2 where the suppression gain comprises
one or more of a noise reduction calculation and an echo cancellation calculation.
4. The computer implemented method of claims 1 to 3, where calculating each of the one
or more signal values further comprises calculating one or more of a background noise
estimate and a background noise adaptation rate.
5. The computer implemented method of claims 1 to 4, where limiting the difference between
each of the one or more signal values that comprise suppression gains further comprises
limiting the suppression gains when a threshold is exceeded.
6. The computer implemented method of claims 1 to 5, where the noise reduction calculation
comprises any one or more of a wind noise reduction calculation, transient noise
reduction calculation, road noise reduction calculation, repetitive noise reduction
calculation and engine noise reduction calculation.
7. The computer implemented method of claims 1 to 6, where calculating the balance gains
for each of two or more microphone signals (118) further comprises an energy measurement.
8. The computer implemented method of claims 1 to 7, where calculating the balance gains
for each of two or more microphone signals (118) further comprises selectively updating
the balance gains responsive to one or more of the signal values exceeding a threshold
and a signal-to-noise ratio exceeding a threshold.
9. The computer implemented method of claims 1 to 8, where gain adjusting the two or
more microphone signals (118) further comprises mixing a first microphone signal (118)
with a second microphone signal (118).
10. The computer implemented method of claims 1 to 9, further comprising generating a
set of sub-bands for each of the two or more microphone signals (118) using a subband
filter or a Fast Fourier Transform.
11. The computer implemented method of claims 1 to 10, further comprising generating a
set of sub-bands for each of the two or more microphone signals (118) according to a critical,
octave, mel, or bark band spacing technique.
12. A system for maintaining spatial stability of a sound field, the system comprising:
a processor (202); and
a memory (204) coupled to the processor (202) containing instructions, executable
by the processor (202), for performing the steps of the method of any of claims 1 to 11.