BACKGROUND
1. Technical Field
[0001] The present disclosure relates to the field of processing sound fields, and in particular
to a system and method for maintaining the spatial stability of a sound field.
2. Related Art
[0002] Stereo and multichannel microphone configurations may be used for processing a sound
field that is a spatial representation of an audible environment associated with the
microphones. The audio received from the microphones may be used to reproduce the
sound field using audio transducers.
[0003] Many computing devices may have multiple integrated microphones used for recording
an audible environment associated with the computing device and communicating with
other users. Computing devices typically use multiple microphones to improve noise
performance with noise suppression processes. The noise suppression processes may
result in the reduction or loss of spatial information. In many cases the noise suppression
processing may result in a single, or mono, output signal that has no spatial information.
BRIEF DESCRIPTION OF DRAWINGS
[0004] The system may be better understood with reference to the following drawings and
description. The components in the figures are not necessarily to scale, emphasis
instead being placed upon illustrating the principles of the disclosure. Moreover,
in the figures, like referenced numerals designate corresponding parts throughout
the different views.
[0005] Other systems, methods, features and advantages will be, or will become, apparent
to one with skill in the art upon examination of the following figures and detailed
description. It is intended that all such additional systems, methods, features and
advantages be included with this description, be within the scope of the invention,
and be protected by the following claims.
[0006] Fig. 1 is a schematic representation of a system for maintaining the spatial stability
of a sound field.
[0007] Fig. 2 is a further schematic representation of a system for maintaining the spatial
stability of the sound field.
[0008] Fig. 3 is a flow diagram representing a method for maintaining the spatial stability
of the sound field.
DETAILED DESCRIPTION
[0009] In a system and method for maintaining the spatial stability of a sound field, balance
gains may be calculated for each of two or more microphone signals. The balance gains
may be associated with a spatial image in the sound field. Signal values may be calculated
for each of the received microphone signals. The signal values may be signal estimates
or signal gains calculated to improve a characteristic of the microphone signals.
The differences between the signal values associated with each microphone signal are
limited to mitigate audible distortions in the spatial image. Some difference between
signal values may be allowable in order to improve the audible characteristics of
the received microphone signals. One or more microphone signals are adjusted responsive
to the two or more balance gains and the signal gains to maintain the spatial stability
of the sound field. The adjustments of one or more microphone signals may include
mixing of two or more microphone signals. Further adjustments to the signal gains
may be made responsive to the mixing process. The signal gains are applied respectively
to each of the two or more microphone signals.
[0010] Figure 1 is a schematic representation of a system 100 for maintaining the spatial stability
of a sound field. Two or more microphones 102 receive the sound field. Stereo
and multichannel microphone configurations may be utilized for processing the sound
field that is a spatial representation of an audible environment associated with the
microphones 102. Many audible environments associated with the microphones 102 may
include undesirable content that may be mitigated by processing the received sound
field. Microphones 102 that are arranged in a far field configuration typically receive
more undesirable content, or noise, than microphones 102 in a near field configuration.
Far field configurations may include, for example, a hands free phone, a conference
phone and microphones embedded into an automobile. Far field configurations are capable
of receiving a sound field that represents the spatial environment associated with
the microphones 102. Near field configurations typically place the microphone 102
in close proximity to a user. Undesirable content may be mitigated in both near and
far field configurations by processing the received sound field.
[0011] Processing that may mitigate undesirable content received in the sound field may
include echo cancellation and noise reduction processes. Echo cancellation, noise
reduction, and other audio processes may calculate one or more suppression,
or signal, gains utilizing a suppression gain calculator 106. An echo cancellation
process and a noise reduction process may each calculate one or more signal gains.
Each respective signal gain may be applied individually or a composite signal gain
may be applied to process the sound field using a gain filter 114. Echo cancellation
processing mitigates echoes caused by signal feedback between two or more communication
devices. Signal feedback occurs when an audio transducer on a first communication
device reproduces the signal received from a second communication device and subsequently
the microphones on the first communication device recapture the reproduced signal.
The recaptured signal may be transmitted to the second communication device where
the recaptured signal may be perceived as an echo of the previously transmitted signal.
Echo cancellation processes may detect when the signal has been recaptured and attempt
to suppress the recaptured signal. Many different echo cancellation processes may
mitigate echoes by calculating one or more signal gains that, when applied to the
signals received by the microphones 102, suppress the echoes. In one example implementation,
the echo suppression gain may be calculated using a coherence calculation between the
predicted echo and the microphone signal, as disclosed in
U.S. Patent No. 8,036,879, which is incorporated herein by reference, except that in the event of any inconsistent
disclosure or definition from the present specification, the disclosure or definition
herein shall be deemed to prevail.
[0012] When the microphone 102 and an audio transducer are close in proximity, the echo
cancellation process may apply a large amount of suppression, or calculate large signal
gains, as a result of the signal produced by the audio transducer dominating, or coupling
with, the microphone 102.
[0013] When one of the microphones 102 and an audio transducer are in close proximity, the
echo cancellation process may determine that a large amount of suppression is needed
to prevent the signal produced by the audio transducer from dominating, or coupling with,
the microphone 102. The echo cancellation process may calculate large signal gains to
mitigate the coupling. The large signal gains may result in a gating effect where
the communication device effectively supports only half duplex communication. Half
duplex communication may occur when the communication channel allows for reliable
communication from either the far side or the near side alternately, but not both simultaneously.
The large signal gains may suppress the coupling but may also suppress all content,
including desired voice content, resulting in half duplex communication.
[0014] Background noise is another type of undesirable signal content that may be mitigated
by processing the received sound field. Many different types of noise reduction processing
techniques may mitigate background noise. An exemplary noise reduction method is a
recursive Wiener filter. The Wiener suppression gain
$G_{i,k}$, or signal gain, is defined as

$$G_{i,k} = \frac{\widehat{SNR}^{priori}_{i,k}}{\widehat{SNR}^{priori}_{i,k} + 1}$$

[0015] Where $\widehat{SNR}^{priori}_{i,k}$ is the a priori SNR estimate and is calculated recursively by

$$\widehat{SNR}^{priori}_{i,k} = G_{i,k-1}\,\widehat{SNR}^{post}_{i,k}$$

[0016] $\widehat{SNR}^{post}_{i,k}$ is the a posteriori SNR estimate given by

$$\widehat{SNR}^{post}_{i,k} = \frac{|Y_{i,k}|^{2}}{|\hat{N}_{i,k}|^{2}}$$

[0017] Here $Y_{i,k}$ is the microphone signal and $|\hat{N}_{i,k}|$ is a background noise
estimate for frequency bin $i$ and frame $k$. In one example implementation, the background noise
estimate, or signal values, may be calculated using the background noise estimation
techniques disclosed in
U.S. Patent No. 7,844,453, which is incorporated herein by reference, except that in the event of any inconsistent
disclosure or definition from the present specification, the disclosure or definition
herein shall be deemed to prevail. In other implementations, alternative background
noise estimation techniques may be used, such as, for example, a noise power estimation
technique based on minimum statistics.
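By way of illustration only, the following sketch shows one way the recursive Wiener gain calculation above could be implemented for a single frame of one microphone signal 118. The function name, the gain floor, and the use of power spectra are assumptions made for the example and are not mandated by the equations.

```python
import numpy as np

def recursive_wiener_gains(mic_spectrum, noise_estimate, prev_gains, gain_floor=0.1):
    """Per-bin Wiener suppression gains for one frame of one microphone signal.

    mic_spectrum   -- complex FFT bins Y[i] for the current frame
    noise_estimate -- background noise magnitude estimate |N_hat[i]| per bin
    prev_gains     -- gains G[i] computed for the previous frame
    gain_floor     -- lower bound that limits the maximum suppression applied
    """
    # A posteriori SNR: observed power relative to the background noise power estimate.
    snr_post = (np.abs(mic_spectrum) ** 2) / np.maximum(np.abs(noise_estimate) ** 2, 1e-12)

    # A priori SNR estimate, calculated recursively from the previous frame's gains.
    snr_priori = prev_gains * snr_post

    # Wiener suppression gain, floored so a bin is never fully gated.
    gains = snr_priori / (snr_priori + 1.0)
    return np.maximum(gains, gain_floor)
```

Gains produced in this manner would then be subject to the difference limiter 110 and applied by the gain filter 114 described below.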
[0018] Additional noise reduction processing may mitigate specific types of undesirable
noise characteristics including, for example, wind noise, transient noise, rain noise
and engine noise. Mitigation of some specific types of undesirable noise may be referred
to as signature noise reduction processes. Signature noise reduction processes detect
signature noise and generate signal gains that may be used to suppress a detected
signature noise. In one implementation, wind noise suppression gains (a.k.a. signal
gains) may be calculated using the system for suppressing wind noise disclosed in
U.S. Patent No. 7,885,420, which is incorporated herein by reference, except that in the event of any inconsistent
disclosure or definition from the present specification, the disclosure or definition
herein shall be deemed to prevail.
[0019] The sound field received by the two or more microphones 102 may contain a spatial
representation, or a spatial image, of an audible environment. Balance gains may be
calculated responsive to the spatial image in the sound field. The balance gains may
be calculated with a balance calculator 108. The balance calculator 108 may calculate
the balance gains by measuring an energy level in a signal from each microphone 102.
The energy level differences may represent the approximate balance of the spatial
image. One or more energy levels may be calculated for each microphone 102 generating
one or more balance gains. A single balance gain may be utilized in a two microphone
configuration where the single balance gain may be the ratio of energy levels between
the two microphone signals 118.
[0020] A subband filter may process the received microphone signal 118 to extract frequency
information. The subband filtering may be accomplished by various methods, such as a
Fast Fourier Transform (FFT), critical filter bank, octave filter bank, or one-third
octave filter bank. Alternatively, the subband analysis may include a time-based filter
bank. The time-based filter bank may be composed of a bank of overlapping bandpass
filters, where the center frequencies have non-linear spacing such as octave, third-octave,
bark, mel, or other spacing techniques. The one or more energy levels may
be calculated for each frequency bin or band of the subband filter. The resulting
balance gains may be filtered, or smoothed, over time and/or frequency. The balance
calculator 108 may update the balance gains responsive to desired signal content.
For example, the balance gains may be updated when the energy level exceeds a threshold,
the signal to noise ratio (SNR) exceeds a threshold, a voice activity detector detects
voice content, or any combination thereof.
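By way of illustration only, the following sketch shows one possible balance gain update for a two microphone configuration, using an FFT-based subband analysis, a threshold-gated update, and smoothing over time as described above. The function name, the FFT size, the SNR threshold, and the smoothing coefficient are illustrative assumptions.

```python
import numpy as np

def update_balance_gains(first_frame, second_frame, prev_balance, noise_power,
                         fft_size=256, snr_threshold_db=6.0, alpha=0.9):
    """Update per-bin balance gains as a smoothed energy ratio of two microphone signals.

    first_frame, second_frame -- time-domain frames from the two microphones 102
    prev_balance              -- previously smoothed balance gains, one per FFT bin
    noise_power               -- per-bin background noise power used for the SNR gate
    """
    window = np.hanning(len(first_frame))

    # Subband analysis with an FFT; a critical, octave, mel, or bark filter bank
    # could be substituted here.
    first = np.fft.rfft(first_frame * window, fft_size)
    second = np.fft.rfft(second_frame * window, fft_size)

    first_energy = np.abs(first) ** 2
    second_energy = np.abs(second) ** 2

    # Update only bins whose SNR exceeds a threshold, so the balance gains track
    # desired signal content rather than background noise.
    snr_db = 10.0 * np.log10(
        (first_energy + second_energy) / np.maximum(noise_power, 1e-12))
    update = snr_db > snr_threshold_db

    # Instantaneous balance: ratio of energy levels between the two microphone signals.
    balance = first_energy / np.maximum(second_energy, 1e-12)

    # Smooth (filter) the balance gains over time; frequency smoothing could follow.
    return np.where(update, alpha * prev_balance + (1.0 - alpha) * balance, prev_balance)
```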
[0021] The background noise estimator 104 may calculate a background noise estimate, or
signal value, for each microphone signal 118. When the microphones 102 are spaced
apart, the background noise estimator 104 may calculate different signal values responsive
to the received sound field. Some difference in the calculated background noise estimate
may be acceptable but relatively large differences may indicate a potential corruption
or misrepresentation of one or more of the signals. For example, a user may be blocking
one microphone 102 with a finger resulting in a relatively large difference in the
background noise estimate. The background noise estimate may be utilized for many
subsequent calculations including signal-to-noise ratios, echo cancellers and noise
reduction calculators. When the subsequent calculations utilize background noise estimates
that contain relatively large differences the subsequent calculations may yield corrupted
or misrepresentative results. For example, large differences in suppression gains
between microphones 102 may result in audible distortions in the spatial image of
the sound field.
[0022] A difference limiter 110 may limit the difference in the background noise estimates,
or signal values, and/or the adaptation rates utilized in the background noise estimator
104. The difference limiter 110 may mitigate audio distortions in the spatial image
when reproduced in the output sound field. For example, a difference between corresponding
signal values in the calculated background noise estimates may be acceptable when
the difference is 2 dB (decibels) to 4 dB but noticeable when the difference exceeds
6 dB. The difference limiter 110 may, for example, limit the difference between signal
values to 6 dB or may allow a difference proportional to the signal value when the
difference is greater than 6 dB. The difference limiter 110 may utilize a coherence
and/or correlation calculation between microphones to limit a difference between the
signal values. Two signals that are correlated may indicate that the difference between
signal values should be limited. The difference limiter 110 may smooth, or filter,
the amount of limiting over time and frequency.
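By way of illustration only, the following sketch shows one way the difference limiter 110 could constrain the per-band difference between two background noise estimates, using the 6 dB figure mentioned above as a hard limit. The function name and the choice of pulling back whichever value is further from the other are assumptions made for the example.

```python
import numpy as np

def limit_difference_db(value_a, value_b, max_diff_db=6.0):
    """Limit the per-band difference between two signal values (e.g. noise estimates).

    Values are linear magnitudes; the limit is expressed in dB. Whichever value is
    further from the other is pulled back so the difference does not exceed the limit.
    """
    a_db = 20.0 * np.log10(np.maximum(value_a, 1e-12))
    b_db = 20.0 * np.log10(np.maximum(value_b, 1e-12))

    diff = a_db - b_db
    excess = np.abs(diff) - max_diff_db

    # Only act where the difference exceeds the allowed range.
    correction = np.where(excess > 0.0, np.sign(diff) * excess, 0.0)

    # Pull the larger value down toward the other; smoothing over time and
    # frequency could be applied to the correction before it is used.
    a_db_limited = a_db - np.where(correction > 0.0, correction, 0.0)
    b_db_limited = b_db - np.where(correction < 0.0, -correction, 0.0)

    return 10.0 ** (a_db_limited / 20.0), 10.0 ** (b_db_limited / 20.0)
```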
[0023] The difference limiter 110 may be applied to other signal values including suppression
gains, or signal gains, calculated using the suppression gain calculator 106. The
suppression gain calculator 106 may calculate signal gains for the echo cancellation
and noise reduction processes described above. Signature noise reduction processes
may calculate signal gains that have large differences between microphone signals
118. For example, in the case of wind noise reduction, a first microphone 102 may
receive significant wind noise and the second microphone 102 may receive negligible
wind noise. An example portable computing device may have two microphones 102 placed
several inches apart where the first microphone 102 may be located on the bottom surface
and the second microphone 102 may be located on the top surface. The first microphone
102 and the second microphone 102 may be relatively close in position although they
may not be close enough to exploit phase differences for use in, for example, a beamforming
combining process. Even though the microphones 102 are relatively close in
position on the example portable computing device, one microphone 102 may receive
significant wind noise. The suppression gain calculator 106 may calculate signal gains
that may contain relatively large differences. The difference limiter 110 may allow
some of the wind noise to be suppressed while mitigating audio distortions in the
spatial image of the sound field. For example, a difference between corresponding
signal gains generated by the suppression gain calculators 106 may be acceptable when
the difference is 2 dB to 4 dB but noticeable when the difference exceeds 6 dB. The
difference limiter 110 may limit the difference between signal values to 6 dB or may
allow a difference proportional to the signal value when the difference is greater
than 6 dB. The difference limiter 110 may smooth, or filter, the amount of limiting
over time and frequency.
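By way of illustration only, the following sketch shows one possible reading of the proportional allowance described above, applied to two sets of suppression gains expressed in dB: differences up to the limit pass unchanged, and only a fraction of any excess difference is preserved. The function name, the allowance rule, and the fraction are assumptions, not the disclosed method.

```python
import numpy as np

def limit_gain_difference(gain_a_db, gain_b_db, max_diff_db=6.0, allow_fraction=0.25):
    """Limit the per-band difference between two sets of suppression gains (in dB).

    Differences up to max_diff_db pass unchanged; beyond that only a fraction of the
    excess is allowed, so strong wind-noise suppression on one microphone is partly
    preserved without collapsing the spatial image.
    """
    diff = gain_a_db - gain_b_db
    excess = np.maximum(np.abs(diff) - max_diff_db, 0.0)

    # Allowed difference grows with the excess, i.e. proportionally when over the limit.
    allowed = max_diff_db + allow_fraction * excess

    # Keep the signal with less suppression unchanged and pull the other toward it.
    a_out = np.where(diff < -allowed, gain_b_db - allowed, gain_a_db)
    b_out = np.where(diff > allowed, gain_a_db - allowed, gain_b_db)
    return a_out, b_out
```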
[0024] The difference limiter 110 may mitigate some distortion in the spatial image when
reproduced in the output sound field although it may be possible that the combination
of one or more of the signal values calculated utilizing the background noise estimator
104 and suppression gain calculator 106 may still distort the spatial image. Additionally,
in some cases the suppression gain calculator 106 may not utilize the difference limiter
110. For example, when the microphone 102 and audio transducer are coupled as described
above resulting in a gating effect, the difference limiter 110 may not be utilized
because the audible artifacts associated with the coupling are perceptibly more distracting
than a distortion of the spatial image. In this case, the echo cancellation process may
be allowed to gate the microphone signal 118 without applying the difference limiter
110.
[0025] A balance adjuster 112 may maintain the spatial stability when the sound field is
reproduced at the output. The balance adjuster 112 may mitigate distortions in the spatial
image that may not be mitigated with the difference limiter 110. Additionally, the
balance adjuster 112 may mitigate audio distortions in the spatial image where the
difference limiter 110 may not be applied. The balance adjuster 112 may adjust the
signal gains using the balance gains calculated with the balance calculator 108. The
balance gains may represent the approximate balance of the spatial
image. The balance adjuster 112 may adjust the signal gains responsive to the balance
gains. Additionally, the balance adjuster 112 may mix, or borrow, between two or more
microphone signals 118 to maintain the spatial stability and to more closely track
the balance gains. In one example, the echo-gating triggered half-duplex use case
described above may have a first microphone signal 118 that may be gated. The balance
adjuster 112 may mitigate audio distortions in the spatial image by borrowing audio
from a second microphone signal 118 responsive to the balance gain. The second microphone
signal 118 may have associated signal gains that may be adjusted responsive to the
balance gain. The second microphone signal 118 that is borrowed may be mixed into
the first microphone signal 118. The adjustments to the signal gains and the borrowing
of microphone signals 118 may be filtered, or smoothed, over time
and frequency. The adjustments may be performed per frequency bin and/or band using
the subband filter described above.
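By way of illustration only, the following sketch shows one possible interpretation of the gain adjustment and borrowing performed by the balance adjuster 112 for two microphone signals 118. For brevity the sketch folds in the gain multiplication that the gain filter 114 would otherwise perform; the gate threshold, borrow fraction, and balance scaling are assumptions made for the example.

```python
import numpy as np

def balance_adjust(spec_first, spec_second, gains_first, gains_second, balance,
                   gate_threshold=0.05, borrow=0.5):
    """Adjust gains and borrow between two microphone spectra to track the balance gains.

    spec_first, spec_second   -- FFT frames of the first and second microphone signals
    gains_first, gains_second -- per-bin suppression (signal) gains for each signal
    balance                   -- per-bin balance gains (energy ratio, first over second)
    """
    out_first = gains_first * spec_first
    out_second = gains_second * spec_second

    # Where the first signal is effectively gated (for example by echo suppression),
    # borrow content from the second signal, scaled by the balance gain so the
    # borrowed content sits near the level the spatial image expects.
    gated_first = gains_first < gate_threshold
    out_first = np.where(gated_first,
                         out_first + borrow * np.sqrt(balance) * out_second,
                         out_first)

    # Mirrored case for the second signal; smoothing of the borrowing amount over
    # time and frequency could be added here.
    gated_second = gains_second < gate_threshold
    out_second = np.where(gated_second,
                          out_second + borrow * out_first / np.maximum(np.sqrt(balance), 1e-6),
                          out_second)

    return out_first, out_second
```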
[0026] A gain filter 114 applies the signal gains to the two or more microphone signals
118. The signal gains may be a combination of signal gains associated with one or
more suppression gain calculators 106. The gain filter 114 may utilize the subband
filter described above.
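By way of illustration only, a minimal sketch of applying per-bin signal gains in the FFT subband domain and resynthesizing a time-domain frame follows. Overlap-add framing is omitted, and the function name and FFT size are illustrative.

```python
import numpy as np

def apply_gain_filter(frame, gains, fft_size=256):
    """Apply per-bin signal gains to one windowed time-domain frame and resynthesize it."""
    window = np.hanning(len(frame))

    # Subband analysis (FFT), per-bin gain multiplication, and resynthesis.
    spectrum = np.fft.rfft(frame * window, fft_size)
    filtered = gains * spectrum
    return np.fft.irfft(filtered, fft_size)[:len(frame)]
```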
[0027] Figure 2 is a further schematic representation of a system 200 for maintaining the spatial
stability of a sound field. The system 200 comprises a processor
202, memory 204 (the contents of which are accessible by the processor 202), two or
more microphones 102 and an I/O interface 206. The two or more microphones 102 may
be either internal or external to the system 200 or a combination of internal and
external. The memory 204 may store instructions which when executed using the processor
202 may cause the system 200 to render the functionality associated with the background
noise estimator module 104, the suppression gain calculator module 106, the balance
calculator module 108, the difference limiter module 110, the balance adjuster module
112 and the gain filter module 114 described herein. In addition, data structures,
temporary variables and other information may be stored in data storage 208.
[0028] The processor 202 may comprise a single processor or multiple processors that may
be disposed on a single chip, on multiple devices or distributed over more than one
system. The processor 202 may be hardware that executes computer executable instructions
or computer code embodied in the memory 204 or in other memory to perform one or more
features of the system. The processor 202 may include a general purpose processor,
a central processing unit (CPU), a graphics processing unit (GPU), an application
specific integrated circuit (ASIC), a digital signal processor (DSP), a field programmable
gate array (FPGA), a digital circuit, an analog circuit, a microcontroller, any other
type of processor, or any combination thereof.
[0029] The memory 204 may comprise a device for storing and retrieving data, processor executable
instructions, or any combination thereof. The memory 204 may include non-volatile
and/or volatile memory, such as a random access memory (RAM), a read-only memory (ROM),
an erasable programmable read-only memory (EPROM), or a flash memory. The memory 204
may comprise a single device or multiple devices that may be disposed on one or more
dedicated memory devices or on a processor or other similar device. Alternatively
or in addition, the memory 204 may include an optical, magnetic (hard-drive) or any
other form of data storage device.
[0030] The memory 204 may store computer code, such as the background noise estimator module
104, the suppression gain calculator module 106, the balance calculator module 108,
the difference limiter module 110, the balance adjuster module 112 and the gain filter
module 114 described herein. The computer code may include instructions executable
with the processor 202. The computer code may be written in any computer language,
such as C, C++, assembly language, channel program code, and/or any combination of
computer languages. The memory 204 may store information in data structures in the
data storage 208.
[0031] The I/O interface 206 may be used to connect devices such as, for example, microphones
102, and to other components internal or external to the system 200.
[0032] Figure 3 is a flow diagram representing a method 300 for maintaining the spatial stability
of a sound field. The method 300 may be, for example, implemented using either of
the systems 100 and 200 described herein with reference to Figures 1 and 2. The method
300 may include the following acts. Calculating a balance gain for each of two or
more microphone signals 302. The balance gain may be associated with a spatial image
in the sound field. Calculating one or more signal values for each of two or more
microphone signals 304. The signal values may be the background noise estimate or
signal gains associated with echo cancellation and noise reduction processes. Limiting
the difference between the two or more signal values 306. The difference between signal
values may be limited to mitigate distortions in the spatial image of the sound field.
Adjusting one or more microphone signals responsive to the two or more balance gains
and the signal gains 308. One or more microphone signals may be mixed, or borrowed,
with another microphone signal responsive to the balance gains and signal gains. Applying
the signal gains to the two or more microphone signals 310.
[0033] All of the disclosure, regardless of the particular implementation described, is
exemplary in nature, rather than limiting. The systems 100 and 200 may include more,
fewer, or different components than illustrated in Figures 1 and 2. Furthermore, each
one of the components of systems 100 and 200 may include more, fewer, or different
elements than are illustrated in Figures 1 and 2. Flags, data, databases, tables, entities,
and other data structures may be separately stored and managed, may be incorporated
into a single memory or database, may be distributed, or may be logically and physically
organized in many different ways. The components may operate independently or be part
of a same program or hardware. The components may be resident on separate hardware,
such as separate removable circuit boards, or share common hardware, such as a same
memory and processor for implementing instructions from the memory. Programs may be
parts of a single program, separate programs, or distributed across several memories
and processors.
[0034] The functions, acts or tasks illustrated in the figures or described may be executed
in response to one or more sets of logic or instructions stored in or on computer
readable media. The functions, acts or tasks are independent of the particular type
of instruction set, storage media, processor or processing strategy and may be performed
by software, hardware, integrated circuits, firmware, micro code and the like, operating
alone or in combination. Likewise, processing strategies may include multiprocessing,
multitasking, parallel processing, distributed processing, and/or any other type of
processing. In one embodiment, the instructions are stored on a removable media device
for reading by local or remote systems. In other embodiments, the logic or instructions
are stored in a remote location for transfer through a computer network or over telephone
lines. In yet other embodiments, the logic or instructions may be stored within a
given computer such as, for example, a CPU.
[0035] While various embodiments of the system and method for maintaining the spatial stability
of a sound field have been described, it will be apparent to those of ordinary skill
in the art that many more embodiments and implementations are possible within the
scope of the present invention. Accordingly, the invention is not to be restricted
except in light of the attached claims and their equivalents.
1. A computer implemented method for maintaining spatial stability of a sound field comprising:
calculating (302) balance gains for each of two or more microphone signals (118);
calculating (304) one or more signal values for each of the two or more microphone
signals (118);
limiting (306) differences between corresponding one or more signal values for each
of the two or more microphone signals (118); and
gain adjusting (308) the two or more microphone signals (118) responsive to the calculated
balance gains and the limited one or more signal values for each of the two or more
microphone signals (118).
2. The computer implemented method of claim 1, where the one or more signal values comprises
one or more of an estimated background noise and a calculated suppression gain.
3. The computer implemented method of claims 1 to 2 where the suppression gain comprises
one or more of a noise reduction calculation and an echo cancellation calculation.
4. The computer implemented method of claims 1 to 3, where calculating each of the one
or more signal values further comprises calculating one or more of a background noise
estimate and a background noise adaptation rate.
5. The computer implemented method of claims 1 to 4, where limiting the difference between
each of the one or more signal values that comprise suppression gains further comprises
limiting the suppression gains when a threshold is exceeded.
6. The computer implemented method of claims 1 to 5, where the noise reduction calculation
comprises any one or more of a wind noise reduction calculation, transient noise
reduction calculation, road noise reduction calculation, repetitive noise reduction
calculation and engine noise reduction calculation.
7. The computer implemented method of claims 1 to 6, where calculating the balance gains
for each of two or more microphone signals (118) further comprises an energy measurement.
8. The computer implemented method of claims 1 to 7, where calculating the balance gains
for each of two or more microphone signals (118) further comprises selectively updating
the balance gains responsive to one or more of the signal values exceeding a threshold
and a signal-to-noise ratio exceeding a threshold.
9. The computer implemented method of claims 1 to 8, where gain adjusting the two or
more microphone signals (118) further comprises mixing a first microphone signal (118)
with a second microphone signal (118).
10. The computer implemented method of claims 1 to 9, further comprising generating a
set of sub-bands for each of the two or more microphone signals (118) using a subband
filter or a Fast Fourier Transform.
11. The computer implemented method of claims 1 to 10, further comprising generating a
set of sub-bands for each of the two or more microphone signals (118) according to a critical,
octave, mel, or bark band spacing technique.
12. A system for maintaining spatial stability of a sound field, the system comprising:
a processor (202); and
a memory (204) coupled to the processor (202) containing instructions, executable
by the processor (202), for performing the steps of the method of any of claims 1 to 11.