(19)
(11)EP 2 765 787 B1

(12)EUROPEAN PATENT SPECIFICATION

(45)Mention of the grant of the patent:
11.12.2019 Bulletin 2019/50

(21)Application number: 13154317.5

(22)Date of filing:  07.02.2013
(51)International Patent Classification (IPC): 
H04R 3/00(2006.01)

(54)

A method of reducing un-correlated noise in an audio processing device

Verfahren zur Reduzierung von nicht korreliertem Rauschen in einer Audioverarbeitungsvorrichtung

Procédé de réduction de bruit non corrélé dans un dispositif de traitement audio


(84)Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

(43)Date of publication of application:
13.08.2014 Bulletin 2014/33

(73)Proprietors:
  • Sennheiser Communications A/S
    2680 Solrød Strand (DK)
  • Oticon A/S
    2765 Smørum (DK)

(72)Inventors:
  • Feldt, Svend
    2680 Solroed Strand (DK)
  • Christiansen, Torben
    2680 Solroed Strand (DK)
  • Kaulberg, Thomas
    DK-2765 Smørum (DK)
  • The other inventors have waived their right to be thus mentioned.

(74)Representative: Hauge, Christian 
Oticon A/S Kongebakken 9
2765 Smørum (DK)


(56)References cited:
EP-A2- 1 339 256
WO-A1-2012/074503
US-A1- 2012 253 798
WO-A1-2008/041730
GB-A- 2 453 118
  
      
    Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).


    Description

    TECHNICAL FIELD



    [0001] The present application relates to an audio processing device, e.g. a hearing aid, such as a headset or a hearing instrument. The application relates in particular to the minimization in an audio signal of un-correlated noise signal components, e.g. originating from wind noise or microphone noise. The disclosure relates specifically to an audio processing device comprising a multitude of electric input signals.

    [0002] The application furthermore relates to a method of operating an audio processing device and to the use of an audio processing device.

    [0003] The application further relates to a data processing system comprising a processor and program code means for causing the processor to perform at least some of the steps of the method.

    [0004] Embodiments of the disclosure may e.g. be useful in applications such as hearing aids, headsets, ear phones, active ear protection systems, handsfree telephone systems, mobile telephones, public address systems, recording systems, etc.

    BACKGROUND



    [0005] The following account of the prior art relates to one of the areas of application of the present application, hearing aids.

    [0006] Detection of wind noise and subsequent minimization of its influence on a target signal, e.g. in a hearing aid, such as a headset or a hearing instrument, has been dealt with in a number of publications. EP1448016A1, for example, deals with a device for detecting the presence of wind noise in an array of microphones. The microphones each generate a time-dependent signal, and these signals are fed to a signal processing device providing one or more output signals. The signal processing device has means for generating a time-dependent cross correlation function between a first and a second microphone signal, and means for generating a signal corresponding to a time-dependent auto correlation function of either the first or the second of the microphone signals. Further, the signal processing device is configured to detect a condition that the auto correlation function is substantially higher than the cross correlation function, which is indicative of the presence of wind noise.
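
    The comparison described for this prior-art detector can be sketched as follows. This is only an illustration of the stated condition, evaluated at zero lag over one block of samples; the function names and the threshold value are hypothetical and are not taken from EP1448016A1.

```python
def mean_product(a, b):
    """Zero-lag correlation estimate: average of a[n] * b[n] over the block."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def wind_noise_suspected(mic1, mic2, ratio_threshold=4.0):
    """Return True when the auto correlation of mic1 is substantially
    higher than the cross correlation between mic1 and mic2.

    ratio_threshold is a hypothetical tuning value (assumption).
    """
    auto = mean_product(mic1, mic1)
    cross = abs(mean_product(mic1, mic2))
    return auto > ratio_threshold * cross
```

    With identical microphone signals the two estimates coincide and no wind noise is flagged, whereas un-correlated signals drive the cross correlation toward zero and trigger the condition.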

    [0007] US 2012/253798 A1 discloses a system for combining signals that includes a first microphone generating a first input signal having a first voice component and a first noise component, a second microphone generating a second input signal having a second voice component and a second noise component, a mixing circuit, and an adaptive filter. The mixing circuit is configured to produce a summed signal of the first input signal and the second input signal, and the adaptive filter provides an updated value to minimize the energy of the summed signal.

    [0008] WO 2012/074503 A1 discloses a system and method of signal combining that supports different speakers in a noisy environment.

    SUMMARY



    [0009] The present disclosure relates in particular to the processing of signals that are stochastic and uncorrelated across the multitude of input signals, e.g. from an array of microphones (and possible beam formers). Such signals may e.g. include wind noise. The present disclosure relates in a particular embodiment to the selection of a best source of the input signal in a time/frequency multiplexed domain, e.g. to minimize wind noise or other uncorrelated signals (e.g. stochastic noise) in the resulting signal.

    [0010] An object of the present application is to provide an improved scheme for processing audio signals.

    [0011] Objects of the application are achieved by the invention described in the accompanying claims and as described in the following.

    An audio processing device:



    [0012] In an aspect, the present disclosure relates to an audio processing device comprising a multitude N (N > 1) of inputs, e.g. from an array of microphones or a combination of other and/or different input transducers each providing an electric input signal Xi (i=1, 2, ..., N) representing an acoustic signal of the environment of the audio processing device, and providing a resulting, processed signal Y (cf. e.g. FIG. 1a).

    [0013] In an aspect, an object of the application is achieved by an audio processing device comprising
    • a multitude of electric input signals, each electric input signal being provided in a digitized form, and
    • a control unit receiving said digitized electric input signals and providing a resulting enhanced signal, wherein the control unit is configured to determine the resulting enhanced signal from said digitized electric input signals, or signals derived therefrom, according to a predefined scheme.


    [0014] This has the advantage of allowing a minimization in a resulting audio signal of undesired signal components of a sound field.

    [0015] In an embodiment, the predefined scheme comprises a criterion based on the magnitude (or magnitude squared) of the respective electric input signals or a signal derived therefrom. In an embodiment, the predefined scheme comprises a criterion to provide a resulting enhanced signal (at a given time and/or frequency) having the smallest magnitude as selected among the multitude of electric input signals.

    [0016] In an embodiment, the predefined scheme comprises a criterion for minimizing the energy of a resulting enhanced signal.

    [0017] In an embodiment, the criterion for minimizing the energy is achieved by, at each point in time, selecting the electric input signal that - at that point in time - contains the least energy.

    [0018] The electric input signals are provided in digitized form (the digitized signal comprising sample values of an analogue input signal at regularly spaced, distinct time instances). In an embodiment, the resulting enhanced signal is determined as an assembly of samples selected among the available electric input signals (or signals derived therefrom), each sample of the enhanced signal at a specific point in time representing (being taken from) the electric input signal that at that time has the lowest energy (magnitude).
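
    A minimal sketch of the sample-wise selection described above, assuming the digitized input signals are available as equal-length lists of real sample values. The function name and the data layout are assumptions made for illustration only.

```python
def select_min_magnitude_samples(inputs):
    """Assemble an output signal sample by sample.

    inputs: list of N equal-length lists of samples (one list per input signal).
    Returns (output, chosen), where chosen[n] is the index of the input
    that supplied sample n (the input with the smallest magnitude at n).
    """
    n_samples = len(inputs[0])
    output, chosen = [], []
    for n in range(n_samples):
        # Pick the input whose sample has the smallest magnitude at time n.
        i_best = min(range(len(inputs)), key=lambda i: abs(inputs[i][n]))
        output.append(inputs[i_best][n])
        chosen.append(i_best)
    return output, chosen
```

    For example, with inputs [0.5, -0.2, 0.9] and [-0.1, 0.4, 0.3] the assembled signal is [-0.1, -0.2, 0.3], taken from the second, first and second input, respectively.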

    [0019] In an embodiment, a time frame is defined, each time frame containing a predefined number of samples. In an embodiment, the minimization of energy in the resulting enhanced signal is performed on a time frame basis in that the resulting enhanced signal is determined as an assembly of time frames (instead of samples) selected among the available electric input signals (or signals derived therefrom).
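
    The frame-based variant can be sketched in the same way, assuming non-overlapping frames and taking frame energy as the sum of squared samples. Both choices, and the function name, are assumptions made for illustration.

```python
def select_min_energy_frames(inputs, frame_len):
    """Assemble an output signal frame by frame.

    inputs: list of N equal-length lists of samples.
    frame_len: number of samples per (non-overlapping) time frame.
    For each frame, the frame with the lowest energy among the inputs is used.
    """
    n_samples = len(inputs[0])
    output = []
    for start in range(0, n_samples, frame_len):
        frames = [x[start:start + frame_len] for x in inputs]
        energies = [sum(s * s for s in frame) for frame in frames]
        best = energies.index(min(energies))
        output.extend(frames[best])
    return output
```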

    [0020] In an embodiment, the predefined scheme comprises a criterion resulting in the selection of the contents of a given time-frequency bin (k,m) of one of the respective electric input signals (or a signal derived therefrom) for being used in the corresponding time-frequency bins (k,m) of the enhanced signal.

    [0021] In a preferred embodiment, the predefined scheme comprises that the content of a given time-frequency bin (k,m) of the enhanced signal is determined from the content of the time-frequency bin (k,m) of the electric input signal comprising the least energy (e.g. has the smallest magnitude among the available input signals), cf. e.g. FIG. 6. In a signal comprising wind noise, such scheme utilizes that the source signal containing the least energy also will contain least wind noise.
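
    A minimal sketch of the selection per time-frequency bin, assuming each input is held as a nested list indexed as X[k][m] of complex values. The function name and the data layout are assumptions for illustration; the complex value is copied unchanged, so magnitude and phase both originate from the selected input.

```python
def select_min_tf_bins(tf_inputs):
    """Build the enhanced time-frequency map bin by bin.

    tf_inputs[i][k][m] is the complex value of input i in frequency band k
    at time instance m. Each bin (k, m) of the result is copied from the
    input whose bin (k, m) has the smallest magnitude.
    """
    n_bands = len(tf_inputs[0])
    n_frames = len(tf_inputs[0][0])
    enhanced = [[0j] * n_frames for _ in range(n_bands)]
    for k in range(n_bands):
        for m in range(n_frames):
            candidates = [x[k][m] for x in tf_inputs]
            enhanced[k][m] = min(candidates, key=abs)
    return enhanced
```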

    [0022] Alternatively, the predefined scheme comprises selecting the time frequency units having the largest signal to noise ratio (SNR). This has the advantage of basically removing the need to prefer one source over the other.

    [0023] The input signals to the audio processing device may be presented in the (time-) frequency domain or converted from the time domain to the (time-) frequency domain by appropriate functional units, e.g. included in the audio processing device. An audio processing device according to the present disclosure may e.g. comprise a multitude of time to time-frequency conversion units (e.g. one for each input signal that is not otherwise provided in a time-frequency representation, cf. e.g. FIG. 1b) to provide each input signal Xi(k,m) (i=1, 2, ..., N) in a number of frequency bands k and a number of time instances m (the entity (k,m) being defined by corresponding values of indices k and m being termed a TF-bin or DFT-bin or TF-unit, cf. e.g. FIG. 3b). The audio processing device comprises a control or signal processing unit that receives time-frequency representations of the input signals and provides a time-frequency representation of a resulting, enhanced signal. The control unit is configured to determine the content of a given time-frequency bin (k,m) of the enhanced signal from the contents of corresponding TF-bins of the time-frequency representations of one or more of the electric input signals according to a predefined scheme.

    [0024] In an embodiment, the phase of the resulting enhanced signal is determined by the origin of the signal at a particular time. In other words, the resulting enhanced signal comprises magnitude and phase from the input signal (or a time frequency unit of the input signal) that is selected at a given point in time (that fulfils the predetermined scheme). Alternatively, the phase of the resulting enhanced signal may be synthesized (e.g. randomized) or smoothed to avoid large changes in phase from time instance to time instance (and/or from frequency band to frequency band).
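
    The synthesized-phase alternative could, for instance, keep the magnitude of the selected TF-unit and draw a new phase at random. The sketch below assumes exactly that; the uniform distribution, the function name and the explicit random generator argument are choices made here for illustration, not details given in the present disclosure.

```python
import cmath
import random


def randomize_phase(tf_bin, rng):
    """Return a complex value with the magnitude of tf_bin and a phase
    drawn uniformly between -pi and pi (assumed distribution)."""
    phase = rng.uniform(-cmath.pi, cmath.pi)
    return cmath.rect(abs(tf_bin), phase)
```

    A call such as randomize_phase(3 + 4j, random.Random(0)) yields a value of magnitude 5 with an arbitrary phase.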

    [0025] Preferably, the electric input signals Xi provide different representations (e.g. picked up at different locations and/or by different transducer means) of the same sound field constituting the current acoustic environment of the audio processing device.

    [0026] In an embodiment, the sound field comprises one or more speech signals. In an embodiment, a specific speech signal constitutes a target signal of the user wearing the audio processing device.

    [0027] An electric input signal Xi(k,m) (i=1, 2, ..., N) may represent a signal from an input transducer, e.g. a microphone (e.g. a normal microphone or a vibration sensing bone conduction microphone), or an accelerometer, or a wireless receiver.

    [0028] In an embodiment, an electric input signal Xi is an omni-directional signal.

    [0029] In an embodiment, an electric input signal Xi is the result of a combination of two or more signals, e.g. a linear combination. In an embodiment, an electric input signal Xi is an output from a processing algorithm (in the sense that the electric input signal Xi is the resulting signal after having been subject to a processing algorithm, e.g. a noise reduction or compression or directional algorithm).

    [0030] The audio processing device may comprise a directional algorithm for combining two or more of the input signals (e.g. signals from two or more microphones), such directionality algorithm being alternatively referred to as beam forming in the present application. In an embodiment, an electric input signal Xi is a directional (beamformed) signal (e.g. resulting from a (e.g. weighted) combination of at least two omni-directional signals).

    [0031] In an embodiment, the directional algorithm is configured to identify the direction (from the present location of the audio processing device) to a sound source comprising a target speech signal in a given sound field.

    [0032] In an embodiment, the audio processing device comprises a microphone system comprising at least two microphones for converting a sound field to respective time variant electric input signals and at least two time to time-frequency conversion units, one for each microphone, to provide time-frequency representations of each electric input signal in a number of frequency bands k and a number of time instances m.

    [0033] In a situation where individual input signals (e.g. each originating from an omni-directional microphone) are correlated, a directional signal based on such correlated inputs may have a lower energy than each of the individual signals. It will hence - in general - be advantageous to use the directional signal for further processing or presentation to a user. On the other hand, in a situation where individual input signals (or elements or components of the input signals) are UN-correlated, e.g. due to contributions from wind or microphone noise, it will in general be advantageous to use one of the non-directional (omni-directional) signals for further processing or presentation to a user.

    [0034] In an embodiment, the audio processing device comprises at least three electric input signals, first and second electric input signals from two (e.g. omni-directional) microphones and a third electric input signal from a directional algorithm providing a weighted combination of said first and second electric input signals.

    [0035] In a preferred embodiment, the electric input signals are normalized. This has the advantage that the signal contents of the individual signals can be readily compared. In an embodiment, the audio processing device comprises a normalization filter operationally connected to an electrical input, the normalization filter being configured to have a transfer function HN(f), which makes the source providing the electric input signal in question comparable and interchangeable with the other sources. The normalization filter is preferably configured to allow a direct comparison of the input signals and a smooth exchange of signal components (e.g. TF-units) between the input signals (without creating significant discontinuities). A normalization can e.g. compensate for a constant level difference between two electric input signals (e.g. due to the location of the two source input transducers providing the input signals relative to the current sound source(s)). In an embodiment, the normalization filter comprises an adaptive filter.

    [0036] In an embodiment, a method of normalizing N electric input signals comprises a) Select a reference source input signal (e.g. the signal assumed to be most reliable), e.g. signal X1, b) for each of the other source input signals Xi, i=2, ..., N, calculate the difference in magnitude over frequency to the reference (e.g. for a common time period of the signals and/or for respective signals averaged over a certain time), and c) scale each source by multiplication with a (possibly complex) correction value.
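
    Steps a) to c) can be sketched as follows, under stated assumptions: the signals are held as nested lists X[k][m] of complex values, the magnitude is averaged over all available time instances per frequency band, the 'difference' to the reference is expressed as a magnitude ratio (i.e. a difference on a logarithmic scale), and the correction value is real. The function name and these choices are assumptions made for illustration.

```python
def normalize_to_reference(tf_inputs, ref=0):
    """Scale each input so that its time-averaged magnitude per band
    matches that of the reference input tf_inputs[ref].

    tf_inputs[i][k][m]: complex value of input i, band k, time instance m.
    Returns new nested lists; the inputs are not modified.
    """
    def mean_magnitude(band):
        return sum(abs(v) for v in band) / len(band)

    reference = tf_inputs[ref]                      # step a)
    normalized = []
    for source in tf_inputs:
        scaled = []
        for k, band in enumerate(source):
            src_level = mean_magnitude(band)        # step b)
            ref_level = mean_magnitude(reference[k])
            # Leave silent bands untouched to avoid dividing by zero.
            gain = ref_level / src_level if src_level > 0 else 1.0
            scaled.append([gain * v for v in band])  # step c)
        normalized.append(scaled)
    return normalized
```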

    [0037] In an embodiment, the control unit comprises a combination unit allowing a weight P to be applied to at least one of the electric input signals (or to a signal derived therefrom). In an embodiment the weight P is dependent on frequency. The (possibly frequency dependent) weighting factors Pi(f) (i=1, 2, ..., N, f being frequency) provide a possibility to prefer one input signal (or a specific frequency range of an input signal) over another. If, e.g., a given source providing one of the electric input signals is known to have a specific dependence on a specific parameter (e.g. a parameter of the current acoustic environment, e.g. the general level of wind noise), the control unit can be configured to take this into account by modifying P(f) accordingly.

    [0038] In an embodiment, the relative size of weights Pq(f) and Pj(f) for input signal q and j, respectively, is influenced by the location of the input transducers that provide the electric input signals q and j, respectively, relative to sound sources of the current sound field.

    [0039] In an embodiment, the audio processing unit comprises a wind noise detector for providing a (preferably frequency dependent) estimate of the amount of wind noise present at a specific point in time (e.g. at a specific location, or in a specific input signal). In an embodiment, the wind noise detector is configured to detect any un-correlated noise (e.g. wind noise, microphone noise, etc.).

    [0040] In an embodiment, a weighting factor Pi(f) of the ith electric input signal (or a signal derived therefrom, i=1, 2, ..., N) is modified in dependence of the estimated amount of wind noise. In an embodiment, a weighting factor P(f) of an input signal that represents a directional microphone signal is different at different frequencies (e.g. to compensate for a non linear wind noise dependence of the directional signal).

    [0041] In an embodiment, a weighting factor Pi(f) of an input signal is adaptively determined dependent on an output from a detector, e.g. an environment detector, e.g. a wind noise detector.

    [0042] In an embodiment, the weighting factors Pi(f) are used to select the signal from only one source, e.g. source j (by setting Pi(f)=0 for all other signals (for i≠j)).

    [0043] In an embodiment, the audio processing device comprises an artifact detector providing an artifact measure indicative of the presence of 'fluctuating gains', which may be a source of artifacts introduced by a processing algorithm (e.g. the present algorithm for reducing un-correlated noise in an audio signal). An example of such artifact detector is disclosed in EP2463856A1.

    [0044] The term 'artifact' is in the present context of audio processing taken to mean elements of an audio signal that are introduced by signal processing (digitalization, noise reduction, compression, etc.) that are in general not perceived as natural sound elements, when presented to a listener. The artifacts are often referred to as musical noise, which are due to random spectral peaks in the resulting signal. Such artifacts sound like short pure tones.

    [0045] In an embodiment, the activation or deactivation of the current scheme for minimizing un-correlated signal components (e.g. wind noise) in an audio signal is influenced by the artifact measure. In other words, the control unit is configured to disregard the proposal of the predefined scheme, if the artifact measure indicates that the artifact produced by the noise minimization algorithm is audible (i.e. worse than the benefit achieved by lowering the noise floor).

    [0046] In an embodiment comprising two microphone input signals and a directional input, the energy minimization algorithm according to the present disclosure is used to remove un-correlated signal components in the directional signal. In an embodiment, the control unit is configured to maintain the directional signal in cases where the artifacts created by the minimization algorithm are audible. This can e.g. be implemented by lowering the weighting factor P for the directional signal (third electric input signal) compared to the weighting factors of the two microphone input signals (first and second electric input signals). In an embodiment, the weighting factors P1(f)=P2(f)=1 and P3(f)<1.
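
    One possible reading of the weighting described above is that each weight multiplies the magnitude of its input before the smallest value is chosen, so that a weight below 1 biases the choice toward that input. The sketch below assumes exactly that reading for a single TF-unit; the function name and the numeric values are hypothetical.

```python
def select_weighted_min(candidates, weights):
    """Choose one input for a single time-frequency unit.

    candidates: complex values of the same TF-unit, one per input.
    weights: weighting factors P_i for the frequency band in question.
    Assumption: the input with the smallest weighted magnitude is selected.
    Returns (index, value) of the selected input.
    """
    scores = [p * abs(x) for p, x in zip(weights, candidates)]
    best = scores.index(min(scores))
    return best, candidates[best]
```

    With magnitudes 1.0, 1.2 and 1.1 for the first, second and third (directional) input, equal weights select the first input, whereas P1=P2=1 and P3=0.8 retain the directional signal.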

    [0047] In an embodiment, the audio processing device is adapted to provide a frequency dependent gain to compensate for a hearing loss of a user. In an embodiment, the audio processing device comprises a signal processing unit for enhancing the input signals and providing a processed output signal.

    [0048] In an embodiment, the audio processing device comprises an output transducer for converting an electric signal to a stimulus perceived by the user as an acoustic signal. In an embodiment, the output transducer comprises a number of electrodes of a cochlear implant or a vibrator of a bone conducting hearing device. In an embodiment, the output transducer comprises a receiver (speaker) for providing the stimulus as an acoustic signal to the user.

    [0049] In an embodiment, the audio processing device comprises an antenna and transceiver circuitry for wirelessly receiving a direct electric input signal from another device, e.g. a communication device or another audio processing device. In an embodiment, the audio processing device comprises a (possibly standardized) electric interface (e.g. in the form of a connector) for receiving a wired direct electric input signal from another device, e.g. a communication device or another audio processing device. In an embodiment, the direct electric input signal represents or comprises an audio signal and/or a control signal and/or an information signal.

    [0050] In an embodiment, the communication between the audio processing device and the other device is in the base band (audio frequency range, e.g. between 0 and 20 kHz). Preferably, communication between the audio processing device and the other device is based on some sort of modulation at frequencies above 100 kHz. Preferably, frequencies used to establish communication between the audio processing device and the other device is below 50 GHz, e.g. located in a range from 50 MHz to 50 GHz, e.g. above 300 MHz, e.g. in an ISM range above 300 MHz, e.g. in the 900 MHz range or in the 2.4 GHz range.

    [0051] In an embodiment, the audio processing device is a portable device, e.g. a device comprising a local energy source, e.g. a battery, e.g. a rechargeable battery.

    [0052] In an embodiment, the audio processing device comprises a forward or signal path between an input transducer (microphone system and/or direct electric input (e.g. a wireless receiver)) and an output transducer. In an embodiment, the signal processing unit is located in the forward path. In an embodiment, the signal processing unit is adapted to provide a frequency dependent gain according to a user's particular needs. In an embodiment, the audio processing device comprises an analysis path comprising functional components for analyzing the input signal (e.g. determining a level, a modulation, a type of signal, an acoustic feedback estimate, etc.). In an embodiment, some or all signal processing of the analysis path and/or the signal path is conducted in the frequency domain. In an embodiment, some or all signal processing of the analysis path and/or the signal path is conducted in the time domain.

    [0053] In an embodiment, an analogue electric signal representing an acoustic signal is converted to a digital audio signal in an analogue-to-digital (AD) conversion process, where the analogue signal is sampled with a predefined sampling frequency or rate fs, fs being e.g. in the range from 8 kHz to 80 kHz (adapted to the particular needs of the application, e.g. 48 kHz) to provide digital samples xn (or x[n]) at discrete points in time tn (or n), each audio sample representing the value of the acoustic signal at tn by a predefined number Ns of bits, Ns being e.g. in the range from 1 to 64 bits, e.g. 16 or 24 bits. A digital sample x has a length in time of ts=1/fs, e.g. 50 µs for fs = 20 kHz. In an embodiment, a number of audio samples are arranged in a time frame. In an embodiment, a time frame comprises 64 audio data samples. Other frame lengths may be used depending on the practical application. A frame length may e.g. be of the order of 2 to 20 ms, e.g. 3.2 ms or 3.75 ms (e.g. Bluetooth) or 5 ms or 7.5 ms (e.g. Bluetooth) or 10 ms (e.g. DECT).
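
    The relation between sampling rate, sample period and frame length can be checked with a short calculation. The function names are chosen here for illustration; the numeric values are those given above.

```python
def sample_period_us(fs_hz):
    """Length in time of one sample, ts = 1/fs, in microseconds."""
    return 1e6 / fs_hz


def frame_duration_ms(n_samples, fs_hz):
    """Length in time of a frame of n_samples samples, in milliseconds."""
    return 1e3 * n_samples / fs_hz


print(sample_period_us(20_000))       # 50.0 (µs at fs = 20 kHz)
print(frame_duration_ms(64, 20_000))  # 3.2 (ms for a 64-sample frame)
```

    A 64-sample frame at fs = 20 kHz thus corresponds to the 3.2 ms frame length mentioned among the examples.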

    [0054] In an embodiment, the audio processing devices comprise an analogue-to-digital (AD) converter to digitize an analogue input with a predefined sampling rate, e.g. 20 kHz. In an embodiment, the audio processing devices comprise a digital-to-analogue (DA) converter to convert a digital signal to an analogue output signal, e.g. for being presented to a user via an output transducer.

    [0055] In an embodiment, the audio processing device, e.g. the microphone unit, and/or the transceiver unit comprise(s) a TF-conversion unit for providing a time-frequency representation of an input signal. In an embodiment, the time-frequency representation comprises an array or map of corresponding complex or real values of the signal in question in a particular time and frequency range. In an embodiment, the TF conversion unit comprises a filter bank for filtering a (time varying) input signal and providing a number of (time varying) output signals each comprising a distinct frequency range of the input signal. In an embodiment, the TF conversion unit comprises a Fourier transformation unit for converting a time variant input signal to a (time variant) signal in the frequency domain. In an embodiment, the frequency range considered by the audio processing device from a minimum frequency fmin to a maximum frequency fmax comprises a part of the typical human audible frequency range from 20 Hz to 20 kHz, e.g. a part of the range from 20 Hz to 12 kHz. In an embodiment, a signal of the forward and/or analysis path of the audio processing device is split into a number NI of frequency bands, where NI is e.g. larger than 5, such as larger than 10, such as larger than 50, such as larger than 100, such as larger than 500, at least some of which are processed individually. In an embodiment, the audio processing device is adapted to process a signal of the forward and/or analysis path in a number NP of different frequency channels (NP ≤ NI). The frequency channels may be uniform or non-uniform in width (e.g. increasing in width with frequency), overlapping or non-overlapping.
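
    A Fourier-transform-based TF conversion can be sketched as follows, assuming non-overlapping frames without windowing and a direct (unoptimized) discrete Fourier transform. A practical device would typically use a windowed, overlapping analysis or a filter bank; the function names and the map layout tf[k][m] are assumptions made for illustration.

```python
import cmath


def dft(frame):
    """Direct discrete Fourier transform of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def time_frequency_map(samples, frame_len):
    """Return tf[k][m], the complex value in frequency band k at frame m.

    Trailing samples that do not fill a complete frame are ignored.
    """
    n_frames = len(samples) // frame_len
    spectra = [
        dft(samples[m * frame_len:(m + 1) * frame_len]) for m in range(n_frames)
    ]
    return [[spectra[m][k] for m in range(n_frames)] for k in range(frame_len)]
```

    For a signal consisting of a unit impulse frame followed by a constant frame, the first frame has equal content in all bands, while the second frame has content in band k=0 only.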

    [0056] In an embodiment, the audio processing device comprises a level detector (LD) for determining the level of an input signal (e.g. on a band level and/or of the full (wide band) signal). The input level of the electric microphone signal picked up from the user's acoustic environment is e.g. a classifier of the environment.

    [0057] In a particular embodiment, the audio processing device comprises a voice activity detector (VAD) for determining whether or not an input signal comprises a voice signal (at a given point in time). This has the advantage that time segments of the electric microphone signal comprising human utterances (e.g. speech) in the user's environment can be identified, and thus separated from time segments only comprising other sound sources (e.g. naturally or artificially generated noise, e.g. wind noise or microphone noise).

    [0058] In an embodiment, the audio processing device comprises a signal to noise ratio detector (estimator). SNR estimation may e.g. be performed in combination with a voice activity detector (VAD), as indicated above.

    [0059] In an embodiment, the audio processing device comprises a detector of the presence of a predefined amount of un-correlated signal components in the current sound field surrounding the audio processing device. In an embodiment, the audio processing device comprises a wind noise detector, a correlation detector, an auto-correlation detector or a combination thereof.

    [0060] In an embodiment, the activation or deactivation of the current scheme for minimizing un-correlated signal components (e.g. wind noise) in an audio signal is dependent on a control signal from the detector(s).

    [0061] In an embodiment, the audio processing device comprises an acoustic (and/or mechanical) feedback suppression system. In an embodiment, the audio processing device further comprises other relevant functionality for the application in question, e.g. compression, noise reduction, etc.

    [0062] In an embodiment, the audio processing device comprises a listening device, e.g. a hearing aid, or a communication device, e.g. a cell phone. In an embodiment, the hearing aid comprises a headset. In an embodiment, the hearing aid comprises a hearing instrument, such as a hearing instrument adapted for being located at the ear or fully or partially in the ear canal of a user. In an embodiment, the hearing aid comprises an earphone, an ear protection device or a combination thereof.

    Use:



    [0063] In an aspect, use of an audio processing device as described above, in the 'detailed description of embodiments' and in the claims, is moreover provided. In an embodiment, use is provided in an audio system comprising a mixture of one or more target signals and one or more noise signals, the one or more noise signals comprising un-correlated signal components, e.g. wind noise. In an embodiment, use is provided in a system comprising one or more hearing instruments, headsets, ear phones, active ear protection systems, etc., e.g. in handsfree telephone systems, teleconferencing systems, public address systems, karaoke systems, classroom amplification systems, etc.

    A method:



    [0064] In an aspect, a method of operating an audio processing device is furthermore provided by the present application, the method comprising
    • providing a multitude of electric input signals, each being provided in a digitized form;
    • providing a resulting enhanced signal based on said digitized electric input signals; and
    • determining the resulting enhanced signal from said digitized electric input signals, or signals derived therefrom, according to a predefined scheme.


    [0065] It is intended that some or all of the structural features of the device described above, in the 'detailed description of embodiments' or in the claims can be combined with embodiments of the method, when appropriately substituted by a corresponding process and vice versa. Embodiments of the method have the same advantages as the corresponding devices.

    A computer readable medium:



    [0066] In an aspect, a tangible computer-readable medium storing a computer program comprising program code means for causing a data processing system to perform at least some (such as a majority or all) of the steps of the method described above, in the 'detailed description of embodiments' and in the claims, when said computer program is executed on the data processing system is furthermore provided by the present application. In addition to being stored on a tangible medium such as diskettes, CD-ROM-, DVD-, or hard disk media, or any other machine readable medium, and used when read directly from such tangible media, the computer program can also be transmitted via a transmission medium such as a wired or wireless link or a network, e.g. the Internet, and loaded into a data processing system for being executed at a location different from that of the tangible medium.

    A data processing system:



    [0067] In an aspect, a data processing system comprising a processor and program code means for causing the processor to perform at least some (such as a majority or all) of the steps of the method described above, in the 'detailed description of embodiments' and in the claims is furthermore provided by the present application.

    [0068] Further objects of the application are achieved by the embodiments defined in the dependent claims and in the detailed description of the invention.

    [0069] As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well (i.e. to have the meaning "at least one"), unless expressly stated otherwise. It will be further understood that the terms "includes," "comprises," "including," and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will also be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present, unless expressly stated otherwise. Furthermore, "connected" or "coupled" as used herein may include wirelessly connected or coupled. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless expressly stated otherwise.

    BRIEF DESCRIPTION OF DRAWINGS



    [0070] The disclosure will be explained more fully below in connection with a preferred embodiment and with reference to the drawings in which:

    FIG. 1 schematically shows three embodiments of an audio processing device according to the present disclosure,

    FIG. 2 shows three embodiments of an audio processing device according to the present disclosure,

    FIG. 3 schematically shows a conversion of a signal in the time domain to the time-frequency domain, FIG. 3a illustrating a time dependent sound signal (amplitude versus time) and its sampling in an analogue to digital converter, FIG. 3b illustrating a resulting 'map' of time-frequency units after a Fourier transformation of the sampled signal,

    FIG. 4 shows three embodiments of an audio processing device according to the present disclosure where processing is performed in the frequency domain,

    FIG. 5 shows an embodiment of an audio processing device according to the present disclosure where the electric input signals comprise two microphone signals and a directional signal generated from the microphone signals,

    FIG. 6 shows an illustrative example of time-frequency representations of first and second electric input signals and a resulting enhanced output signal according to an embodiment of the present disclosure, and

    FIG. 7 shows embodiments of a hearing instrument (FIG. 7a) and a headset (FIG. 7b), respectively, according to the present disclosure.



    [0071] The figures are schematic and simplified for clarity, and they just show details which are essential to the understanding of the disclosure, while other details are left out.

    [0072] Further scope of applicability of the present disclosure will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the disclosure, are given by way of illustration only. Other embodiments may become apparent to those skilled in the art from the following detailed description.

    DETAILED DESCRIPTION OF EMBODIMENTS



    [0073] FIG. 1 shows three embodiments of an audio processing device according to the present disclosure.

    [0074] In a first basic embodiment, the audio processing device (APD) of FIG. 1a comprises a multitude of electric input signals X1, ..., XN, (where N ≥ 2), each electric input signal being provided in a digitized form, and a control unit (CTR) receiving the digitized electric input signals and providing a resulting enhanced signal Y. The control unit (CTR) is configured to determine the resulting enhanced signal Y from the digitized electric input signals X1, ..., XN, or signals derived therefrom, according to a predefined scheme, e.g. by minimizing the energy of the enhanced signal Y at predefined points in time (e.g. on a sample or a time frame or a time-frequency unit basis).

    [0075] In a second basic embodiment, the audio processing device (APD) of FIG. 1b comprises a multitude of input signals I1, I2, ..., IN, (where N ≥ 2), each input signal Ii being operationally connected to an analysis filter bank (A-FB) providing a time-frequency representation Xi of the input signal (e.g. comprising (complex) values of the input signal in a number of frequency bands k and at a number of time instances m). The control unit (CTR) is configured to determine the resulting enhanced signal Y from the time-frequency representation of the electric input signals X1, X2, ..., XN, or signals derived therefrom, according to a predefined scheme. The resulting enhanced signal Y is here assumed to be provided in a time-frequency representation and is here operationally connected to a synthesis filter bank (S-FB) providing the resulting enhanced signal as a time domain signal OUT, which may be processed further in subsequent signal processing units.

    [0076] The third basic embodiment of an audio processing device (APD) according to the present disclosure as shown in FIG. 1c comprises the same functional elements as described in connection with FIG. 1b. Additionally, the embodiment of FIG. 1c comprises input units IU1, IU2, ..., IUN, providing the input signals I1, I2, ..., IN. Input units IU1, IU2, ..., IUN in general provide (electric) input signals (I1, I2, ..., IN) representing an audio signal (i.e. comprise frequencies in the audible frequency range (e.g. ≤ 20 kHz)). Input units IU1, IU2, ..., IUN can e.g. comprise one or more input transducers, e.g. microphones for picking up an acoustic sound signal and/or transceivers for (wirelessly or wired) receiving electric or electromagnetic signals comprising audio, and/or be a signal input from a processing algorithm having been applied to an audio signal. In the embodiment of FIG. 1c, the time domain signal OUT is operationally connected to an output unit (OU). The output unit (OU) can e.g. be an output transducer, e.g. for converting the electric output signal OUT to a stimulus perceived by a user as sound. Such an output transducer can e.g. include a) a receiver/speaker for converting the electric signal to an acoustic sound signal, b) a vibrator of a bone conducting hearing instrument for converting the electric signal to mechanical vibrations to the user's inner ears through the bone structure of the user's head, or c) one or more electrodes of a cochlear implant for transferring electric signals directly or indirectly to the cochlear nerve of the user. Alternatively or additionally, the output unit can include a transceiver for transmitting the electric output signal to another device.

    [0077] FIG. 2 shows three embodiments of an audio processing device according to the present disclosure. The embodiments of an audio processing device illustrated in FIG. 2 can e.g. represent a listening device, e.g. a hearing aid, e.g. a (part of a) headset or a hearing instrument.

    [0078] The embodiment of an audio processing device of FIG. 2a comprises microphone units M1, M2, ..., MN each picking up a version of the sound field around the audio processing device and converting the sound signal to a digitized (electric) input signal Ii, i=1, 2, ..., N. The microphone units M1, M2, ..., MN are assumed to each comprise analogue to digital (AD) conversion units to convert an analogue electric signal to a digitized electric signal. The control unit (CTR) receives the digitized electric input signals I1, I2, ..., IN and provides a resulting enhanced signal Y, wherein un-correlated signal components are minimized. This signal is fed to processing unit (FSP) for additional signal processing, e.g. further noise reduction, application of frequency dependent gain, feedback cancellation or echo cancelling, etc. The output OUT of the processing unit is in the embodiment of FIG. 2a fed to a speaker unit (SP) for converting the electric output signal OUT to an output sound for being provided to a user. The speaker unit (SP) is e.g. embodied in an ear piece located at a user's ear or in a user's ear canal.

    [0079] The embodiment of an audio processing device of FIG. 2b comprises the same elements as the embodiment of FIG.2a. Additionally, the audio processing device of FIG. 2b comprises a directional algorithm embodied in directional unit (DIR), which is operationally connected to digitized (electric) input signal Ii, i=1, 2, ..., N. Directional unit (DIR) provides a directional signal ID emulating a directional microphone having a directional characteristic providing increased sensitivity in particular directions and decreased sensitivity in other directions, thereby attenuating noise (and possible target signals) from those (other) directions. The directional signal ID, which is formed as a weighted combination of input signals Ii, i=1, 2, ..., N, is operationally connected to control unit CTR. The control unit (CTR) thus receives the digitized electric input signals I1, I2, ..., IN and additionally directional signal ID as inputs and provides a resulting enhanced signal Y based thereon. In the embodiment of FIG. 2b, the directional signal ID is assumed to have a lower amount of un-correlated noise, e.g. wind noise (at least in directions of decreased sensitivity of the directional characteristic), and the task of control unit CTR can be seen as optimizing the directional signal by inserting signal components from the individual microphones (among signals I1, I2, ..., IN) having lower energy (e.g. magnitude), and thus lower noise, than the corresponding signal components of the directional input signal ID. The resulting enhanced signal Y, is operationally connected to processing unit (FSP) for additional signal processing as described in connection with FIG. 2a.

    [0080] The embodiment of an audio processing device of FIG. 2c comprises a multitude of microphone units M1, M2, ..., MN each picking up a version of the sound field around the audio processing device and converting the sound signal to an (electric) input signal. Each microphone Mi is operationally connected to a normalization filter Hi (i=1, 2, ..., N), each being configured to have a transfer function Hi(f), which makes the source (picked up by microphone Mi) providing the (electric) input signal in question comparable and interchangeable with the other sources. The normalization filters Hi, providing normalized electric input signals Ii, are operationally connected to time to time-frequency conversion units (here analysis filter banks, A-FB), whose outputs Xi (comprising a time-frequency representation of corresponding signals Ii) are fed to signal processing unit SPU. Signal processing unit (SPU) comprises an algorithm according to the present invention configured to minimize un-correlated noise in a resulting audio signal (Y, not shown in FIG. 2c). The signal processing unit (SPU) may be configured to apply further signal processing to input signals Xi (i=1, 2, ..., N) before or after the application of the noise minimization algorithm according to the present disclosure. Likewise, the resulting enhanced signal Y may be further processed by signal processing unit (SPU) before a processed output signal OF (assumed to be in a time-frequency representation) is operationally connected to time-frequency to time conversion unit (here synthesis filter bank S-FB) to provide an output signal OUT in the time domain. Output signal OUT is operationally connected to output transducer, OT. Output transducer OT may be embodied in various ways as e.g. indicated in connection with FIG. 1c.
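
    The disclosure does not state how the transfer functions Hi(f) are obtained. Purely as an illustration of what 'comparable and interchangeable' could mean in practice, the following sketch derives one real gain per frequency band that scales a second input to the average magnitude of a reference input. The calibration rule, the function name normalization_gains and the example values are assumptions, not taken from the disclosure. For simplicity the sketch operates on time-frequency values, whereas FIG. 2c places the filters before the analysis filter banks.

```python
def normalization_gains(reference, other, floor=1e-12):
    """Per-band real gains that scale `other` to the average magnitude of `reference`.

    reference[m][k], other[m][k]: complex time-frequency values of two inputs
    (frame index m, band index k, both 0-based here).
    Assumed calibration rule for illustration only, not taken from the disclosure.
    """
    n_frames, n_bands = len(reference), len(reference[0])
    gains = []
    for k in range(n_bands):
        ref_avg = sum(abs(reference[m][k]) for m in range(n_frames)) / n_frames
        oth_avg = sum(abs(other[m][k]) for m in range(n_frames)) / n_frames
        # The floor avoids a division by zero for an empty band.
        gains.append(ref_avg / max(oth_avg, floor))
    return gains


# Hypothetical example: the second input is too weak in band 0 and too strong in band 1.
reference = [[2 + 0j, 4 + 0j], [2 + 0j, 4 + 0j]]
other = [[1 + 0j, 8 + 0j], [1 + 0j, 8 + 0j]]
gains = normalization_gains(reference, other)
normalized_other = [[v * g for v, g in zip(frame, gains)] for frame in other]
```

    With these example values the gains are 2.0 and 0.5, and the normalized second input has the same magnitudes as the reference, so that time-frequency units could be exchanged between the two without a level jump.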

    [0081] FIG. 3 schematically shows a conversion of a signal in the time domain to the time-frequency domain, FIG. 3a illustrating a time dependent sound signal (amplitude versus time) and its sampling in an analogue to digital converter, FIG. 3b illustrating a resulting 'map' of time-frequency units after a Fourier transformation of the sampled signal. FIG. 3a illustrates a time dependent sound signal (amplitude versus time), its sampling in an analogue to digital converter and a grouping of time samples in frames, each comprising Ns samples. The graph showing a sound pressure level in dB versus time (solid line in FIG. 3a) may e.g. represent the time variant analogue electric signal provided by an input transducer, e.g. a microphone, before being digitized by an analogue to digital conversion unit. FIG. 3b illustrates a 'map' of time-frequency units resulting from a Fourier transformation (e.g. a discrete Fourier transform, DFT) of the input signal of FIG. 3a, where a given time-frequency unit (m,k) corresponds to one DFT-bin and comprises a complex value of the signal X(m,k) in question (X(m,k) = |X|·e^{jϕ}, |X| = magnitude and ϕ = phase) in a given time frame m and frequency band k. In the following, a given frequency band is assumed to contain one (generally complex) value of the signal in each time frame. It may alternatively comprise more than one value. The terms 'frequency range' and 'frequency band' are used in the present disclosure. A frequency range may comprise one or more frequency bands. The time-frequency map of FIG. 3b illustrates time-frequency units (m,k) for k=1, 2, ..., K frequency bands and m=1, 2, ..., M time units. Each frequency band Δfk is indicated in FIG. 3b to be of uniform width. This need not be the case though. The frequency bands may be of different width (or alternatively, frequency channels may be defined which contain a different number of uniform frequency bands, e.g. the number of frequency bands of a given frequency channel increasing with increasing frequency, the lowest frequency channel(s) comprising e.g. a single frequency band). The time intervals Δtm (time unit) of the individual time-frequency bins are indicated in FIG. 3b to be of equal size. This need not be the case though, although it is assumed in the present embodiments. A time unit Δtm is typically equal to the number Ns of samples in a time frame (cf. FIG. 3a) times the length in time ts of a sample (ts = 1/fs, where fs is a sampling frequency). A time unit is e.g. of the order of milliseconds in an audio processing system.
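
    The framing and transformation described for FIG. 3 can be sketched as follows. This is a minimal illustration only, assuming non-overlapping rectangular frames and a plain DFT. The function names, the frame length Ns = 8 and the sampling frequency fs = 16 kHz are hypothetical choices that do not appear in the disclosure, and indices are 0-based in the code whereas the description counts from 1.

```python
import cmath
import math


def dft(frame):
    """Plain discrete Fourier transform of one frame (O(Ns^2), for illustration)."""
    ns = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / ns) for n, x in enumerate(frame))
        for k in range(ns)
    ]


def time_frequency_map(samples, ns):
    """Group samples into non-overlapping frames of ns samples and transform each.

    Returns a list X where X[m][k] is the complex value of time-frequency
    unit (m, k). Trailing samples that do not fill a whole frame are dropped.
    """
    n_frames = len(samples) // ns
    return [dft(samples[m * ns:(m + 1) * ns]) for m in range(n_frames)]


# Hypothetical example values (not from the disclosure).
FS = 16000.0          # sampling frequency fs in Hz
NS = 8                # number Ns of samples per time frame
TIME_UNIT = NS / FS   # duration of one time unit: Ns * ts with ts = 1/fs

# A cosine with exactly one period per frame falls into DFT bin 1 (and its mirror, bin 7).
signal = [math.cos(2 * math.pi * n / NS) for n in range(3 * NS)]
tf_map = time_frequency_map(signal, NS)

# Magnitude |X| and phase of every time-frequency unit, as in X(m,k) = |X|*exp(j*phase).
magnitudes = [[abs(v) for v in frame] for frame in tf_map]
phases = [[cmath.phase(v) for v in frame] for frame in tf_map]
```

    With the assumed values one time unit lasts 8/16000 s = 0.5 ms, which is consistent with the order of magnitude mentioned above.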

    [0082] FIG. 4 shows three embodiments of an audio processing device according to the present disclosure where processing is performed in the frequency domain.

    [0083] The embodiment of an audio processing unit shown in FIG. 4a comprises the same functional units as shown in the embodiment of FIG. 1c. A difference is that control unit CTR in FIG. 1c is denoted WNR in FIG. 4a and that the processing in different frequency bands (or channels) is indicated by the multitude 1, 2, ..., K of control units (WNR), one for each frequency band k. Each sub-band control unit provides a resulting enhanced signal Y(n,k), n being a time index (termed m above), for a specific frequency band k. FIG. 4b shows an embodiment of control unit WNR of FIG. 4a in more detail (FIG. 4b specifically illustrates sub-band k=1). Each sub-band control unit WNRk (k=1, 2, ..., K) comprises normalization filters H1, H2, ..., HN. Each normalization filter Hi of a specific sub-band control unit WNRk filters an input signal Xi(n,k) and provides a normalized signal XNi(n,k). The normalization filters Hi (i=1, 2, ..., N) are configured to allow a direct comparison of the resulting input signals XNi(n,k) and a smooth exchange of signal components (e.g. TF-units (n,k)) between the different input signals XNi(n,k) (without creating significant discontinuities). The normalized signals XNi(n,k) of a given sub-band control unit WNRk are operationally connected to control unit (CNT) and to selection unit (SEL). The control unit (CNT) of the sub-band control unit in question is configured to determine which of the current time-frequency units (n,k) fulfils a predefined criterion (e.g. has the lowest energy (magnitude)) and provides as an output a select signal S(n,k), which is fed to selection unit (SEL) and controls the selection of the appropriate time-frequency unit (n,k) among the N time-frequency units (n,k) of normalized input signals XN1, XN2, ..., XNN. The selection unit (SEL) thereby provides the selected TF-unit as an output in the resulting enhanced signal Y(n,k). The same procedure is followed in all K sub-band control units WNRk (k=1, 2, ..., K), thereby providing the resulting enhanced signal Y(n,k).

    [0084] FIG. 4c shows an alternative example of sub-band control units WNRk (k=1, 2, ..., K). In the embodiment of FIG. 4c frequency index k of FIG. 4b is replaced by symbol ω. Normalization filters Hi(k) of FIG. 4b are implemented as multipliers with frequency dependent multiplication factors Hi(ω). The control unit (CNT) of FIG. 4b is implemented in the embodiment of FIG. 4c by units |X| for determining a magnitude of a (real or) complex input signal, a multiplication unit 'x' for applying a weighting factor Pi(ω) to the input signal in question and a minimization unit (Min{}) for - at a given point in time n - determining the minimum value among the normalized weighted magnitude signals |Xi(n,ω)·Hi(ω)|·Pi(ω) from the inputs Xi(n,ω). The output of minimization unit (Min{}) controls the selection unit (equivalent to unit SEL in FIG. 4b), which provides resulting enhanced output signal Y(n,ω).
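
    The decision made for a single time-frequency unit in FIG. 4c can be sketched as below. The sketch assumes real-valued multiplication factors Hi(ω) and weights Pi(ω) and passes the normalized (but unweighted) value of the selected input to the output, so that the weights only influence the decision. The function name select_unit and the example values are hypothetical.

```python
def select_unit(units, h, p):
    """Select one time-frequency unit among N candidate inputs.

    units: complex values X_i(n, w) of the N inputs for one unit (n, w)
    h:     normalization factors H_i(w) for that frequency
    p:     weighting factors P_i(w) for that frequency
    Returns (index of the selected input, normalized value used as Y(n, w)).
    """
    normalized = [x * hi for x, hi in zip(units, h)]
    # Quantity compared by the minimization unit: |X_i * H_i| * P_i
    weighted_mag = [abs(xn) * pi for xn, pi in zip(normalized, p)]
    i_min = min(range(len(units)), key=weighted_mag.__getitem__)
    return i_min, normalized[i_min]


# Hypothetical example with N = 3 inputs for one time-frequency unit.
units = [3 + 4j, 1 + 1j, 0 - 2j]
h = [1.0, 2.0, 1.0]

# Equal weights: the third input has the lowest normalized magnitude (2.0).
index_equal, y_equal = select_unit(units, h, [1.0, 1.0, 1.0])

# Penalizing the third input with P = 2 moves the decision to the second input.
index_weighted, y_weighted = select_unit(units, h, [1.0, 1.0, 2.0])
```

    The second call illustrates how a weight larger than one makes the selection of a particular input less likely without changing the value that is passed on when that input is selected.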

    [0085] FIG. 5 shows an embodiment of an audio processing device according to the present disclosure where the electric input signals comprise two microphone signals and a directional signal generated from the microphone signals. The embodiment of an audio processing device shown in FIG. 5 comprises four input units IU1, IU2, IU3, and IU4, each providing respective digitized electric input signals I1, I2, I3, and I4, each representing an input audio signal. Input units IU1, IU2 are embodied by respective microphone units. Input unit IU3 is embodied by antenna and transceiver circuitry for receiving and (possibly) demodulating an electromagnetic signal representing an audio signal, such audio signal being e.g. picked up by a microphone separately located from the audio processing unit (e.g. in another device, e.g. in a communication device, e.g. a cellular phone, or in a contra-lateral listening device of a binaural listening system, e.g. of a binaural hearing aid system). The fourth Input unit IU4 may be another microphone or any other input transducer (e.g. a vibration sensing bone conduction microphone, or an accelerometer) or input signal as discussed above (e.g. in connection with FIG. 1c). The audio processing device comprises analysis filter banks (A-FB) for providing input signals I1-I4 in a time-frequency representation as signals IF1-IF4. The analogue to digital conversion assumed to have taken place in input units IU1-IU4 may alternatively be included in analysis filter bank units (A-FB). The audio processing device of FIG. 5 further comprises normalization filters H1-H4 as discussed in connection with FIG. 2c and FIG. 4b, 4c. Outputs of normalization filters H1 and H2 are operationally connected to a directional unit (DIR) providing as an output a directional signal X1(n,k). 
The audio processing device further comprises control unit (WNR) for providing an enhanced signal Y(n,k) from the directional signal X1(n,k) provided by directional unit DIR and the four normalized audio signals X2(n,k) - X5(n,k) originating from input units IU1-IU4.

    [0086] In an embodiment, the audio processing device comprises (e.g. embodied in control unit WNR) a detector of the presence of a predefined amount of un-correlated signal components in a given input signal (e.g. each of X1(n,k) - X5(n,k)). In an embodiment, the audio processing device comprises a wind noise detector. In an embodiment, the audio processing device comprises a correlation detector and is configured to determine cross-correlation between two input signals selected among the five signals X1(n,k)- X5(n,k). In an embodiment, the audio processing device comprises an auto-correlation detector and is configured to determine auto-correlation of a signal selected among signals X1(n,k) - X5(n,k). In an embodiment, the activation or deactivation (or aggressiveness) of the current scheme for minimizing un-correlated signal components (e.g. wind noise) is dependent on a control signal from one or more of the detector(s).
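
    A correlation detector of the kind mentioned above could, for example, be based on a normalized cross-correlation at lag zero. The following sketch is one possible form under that assumption. The function names, the use of real-valued sequences and the threshold of 0.5 are hypothetical and are not specified in the disclosure.

```python
import math


def normalized_cross_correlation(a, b):
    """Zero-lag normalized cross-correlation of two equally long real sequences.

    Returns a value in [-1, 1]; 0.0 is returned if either sequence has zero energy.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    energy_a = sum(x * x for x in a)
    energy_b = sum(y * y for y in b)
    if energy_a == 0.0 or energy_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / math.sqrt(energy_a * energy_b)


def uncorrelated_noise_detected(a, b, threshold=0.5):
    """Hypothetical decision rule: flag un-correlated noise when |correlation| is low."""
    return abs(normalized_cross_correlation(a, b)) < threshold


# Hypothetical example sequences (e.g. samples of two microphone signals).
mic_1 = [1.0, -1.0, 1.0, -1.0]
mic_2_same_source = [0.5, -0.5, 0.5, -0.5]   # scaled copy: fully correlated
mic_2_unrelated = [1.0, 1.0, -1.0, -1.0]     # orthogonal to mic_1: un-correlated

flag_same = uncorrelated_noise_detected(mic_1, mic_2_same_source)
flag_unrelated = uncorrelated_noise_detected(mic_1, mic_2_unrelated)
```

    The resulting flag (or the correlation value itself) could serve as the control signal that activates, deactivates or scales the minimization scheme.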

    [0087] The resulting enhanced signal Y (assumed to be in a time-frequency representation) may be further processed in a subsequent signal processing unit (not shown here) before being operationally connected to synthesis filter bank S-FB to provide output signal OUT in the time domain. Output signal OUT is operationally connected to output unit, OU, which may be embodied in various ways as e.g. indicated in connection with FIG. 1c.

    [0088] The algorithm of the present disclosure is configured to optimize the output signal by selecting, at a given point in time and at a given frequency (i.e. in a given TF-unit), the signal that has the lowest magnitude among the available input signals. In case that a user (or the device itself) in a given acoustic environment has selected a directional signal as the preferred signal to listen to, e.g. to minimize signals from behind the user, the present algorithm can improve the selected signal (e.g. including improving a user's perception of a speech signal) by choosing time-frequency units from an omni-directional signal input. This could e.g. be the case if a source of un-correlated noise (e.g. wind noise) is predominately arriving from a direction defined by the directional microphone characteristic (e.g. in front of the user).

    [0089] FIG. 6 shows an illustrative example of time-frequency representations of first and second electric input signals and a resulting enhanced output signal according to an embodiment of the present disclosure. FIG. 6a shows exemplary magnitudes (Mag) of input signals X1 (top table) and X2 (middle table) to a noise minimization algorithm that for each time-frequency unit (k,m) selects the one having the lowest magnitude (Mag) and includes that in the resulting enhanced signal Y(k,m) (bottom table). Each table shows values of magnitude (Mag) in arbitrary units of a complex signal in eight frequency bands (k1-k8) for twelve time frames (m1-m12). The upper table shows magnitude values of first input signal X1 on a white background, whereas the middle table shows magnitude values of second input signal X2 on a grey background. The lower table showing the magnitude values of the resulting enhanced signal Y (Mag(Y(k,m)) = Min(Mag(X1(k,m)); Mag(X2(k,m)))) indicates by its background shading for each magnitude value its origin by having a white (origin X1) or grey background (origin X2), respectively.

    [0090] In the scheme illustrated in FIG. 6, energy (magnitude) minimization of the resulting signal Y is achieved by minimizing the magnitude of each TF-unit. Alternatively, the minimization of the resulting signal Y may be performed for each time frame, so that the resulting signal Y(m) is determined from the input signals (here X1, X2) in such a way that the energy (proportional to SUM(|Xi(k,m)|^2) over the frequency bands k) at a given time instant m is minimized. That would result in - at a given time instant m - selecting all time-frequency units (the whole time frame) from the input signal having the lowest total energy at that time instant. This would result in an assembly of time frames in the resulting signal Y(m) selected among the time frames of the available input signals (here X1(m), X2(m)).
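
    The two alternatives (selection per time-frequency unit as in FIG. 6, and selection of whole time frames) can be compared with the following sketch. It is an illustration under simplifying assumptions: the inputs are given as small tables indexed as inputs[i][m][k] (0-based), and the function names and example magnitudes are hypothetical and unrelated to the values of FIG. 6a.

```python
def frame_energy(frame):
    """Energy of one time frame: sum over frequency bands k of |X(k, m)|^2."""
    return sum(abs(v) ** 2 for v in frame)


def select_frames(inputs):
    """For each time frame m, take the whole frame from the lowest-energy input.

    inputs[i][m][k] is the value of input i in frame m and band k.
    Returns (resulting signal Y[m][k], origin[m] = index of the chosen input).
    """
    n_frames = len(inputs[0])
    y, origin = [], []
    for m in range(n_frames):
        i_min = min(range(len(inputs)), key=lambda i: frame_energy(inputs[i][m]))
        origin.append(i_min)
        y.append(list(inputs[i_min][m]))
    return y, origin


def select_units(inputs):
    """For each time-frequency unit (k, m), take the lowest-magnitude value."""
    n_frames, n_bands = len(inputs[0]), len(inputs[0][0])
    return [
        [min((x[m][k] for x in inputs), key=abs) for k in range(n_bands)]
        for m in range(n_frames)
    ]


# Hypothetical magnitudes: two inputs, two time frames, three frequency bands.
x1 = [[1, 4, 2], [2, 1, 1]]
x2 = [[3, 1, 3], [1, 2, 2]]

y_frames, origin = select_frames([x1, x2])   # whole frames: X2 for m=0, X1 for m=1
y_units = select_units([x1, x2])             # unit-wise minimum of the two tables

energy_frames = [frame_energy(f) for f in y_frames]
energy_units = [frame_energy(f) for f in y_units]
```

    In this example the frame-wise scheme yields frame energies of 19 and 6, whereas the unit-wise scheme yields 6 and 3. The unit-wise result can never have more energy than the frame-wise result, since it takes the minimum in every band separately.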

    [0091] FIG. 6b illustrates frequency spectra (magnitude, arbitrary units) of a specific time frame (here m1) for the first X1(k,m1) (top graph) and second X2(k,m1) (middle graph) input signals and the resulting enhanced signal Y(k,m1) (bottom graph). The graphs of FIG. 6b reflect the values of the magnitudes of the respective signals indicated in the first column (corresponding to time instant m1) of the tables of FIG. 6a. It is thereby illustrated that the time-frequency units (k,m1) (k=1, 2, ..., K=8) of the resulting signal Y comprise the minimum values of the magnitudes of the corresponding time-frequency units (k,m1) of the input signals X1, X2.

    [0092] FIG. 7 shows embodiments of a hearing instrument (FIG. 7a) and a headset (FIG. 7b), respectively, according to the present disclosure.

    [0093] FIG. 7a illustrates an audio processing device in the form of a listening device, e.g. a hearing instrument (HI). The embodiment of FIG. 7a is equivalent to the upper part of the embodiment of FIG. 5 comprising two microphones (IU1, IU2) and a directional unit DIR. The hearing instrument HI of FIG. 7a comprises a forward path from a pair of (omni-directional) microphone units (each picking up its own version of an Acoustic input) via a directional unit (DIR) and a signal processing unit (SPU) to a speaker unit. The signal processing unit SPU includes the noise minimization algorithm according to the present disclosure, e.g. to minimize wind noise in a resulting enhanced signal of the forward path (as e.g. discussed in connection with FIG. 4b and 4c). The signal processing unit SPU further includes other processing algorithms for enhancing an input audio signal before it is presented to a user. The signal processing unit SPU e.g. comprises algorithms for adapting the Acoustic output to a user's hearing impairment, including applying a frequency (and input level) dependent gain.

    [0094] FIG. 7b illustrates an audio processing device in the form of a listening device, e.g. a headset (HS) in wireless communication with a communication device (ComD). The headset (HS) comprises (substantially) independent microphone- and speaker-paths. The speaker path (upper part of FIG. 7b) comprises a wireless receiver (comprising Antenna and Rx-circuitry) for receiving an audio signal from a communication device (ComD), e.g. a cellular telephone, via a wireless link WL. The speaker path further comprises a signal processing unit (SPU-SP) for processing the audio signal extracted by the wireless receiver and a speaker for converting the processed audio signal to an output sound (Acoustic output) for being perceived by a user of the headset. The microphone path (lower part of FIG. 7b) is identical to the forward path of the hearing instrument of FIG. 7a, except that the output transducer of the headset (HS) is a wireless transmitter for establishing a wireless link to the communication device (ComD). The headset is thereby (using the speaker- and microphone-paths) configured to wirelessly establish a bi-directional audio link (WL) (e.g. a telephone conversation) to the communication device (ComD).

    [0095] The invention is defined by the features of the independent claim(s). Preferred embodiments are defined in the dependent claims. Any reference numerals in the claims are intended to be non-limiting for their scope.

    [0096] Some preferred embodiments have been shown in the foregoing, but it should be stressed that the invention is not limited to these, but may be embodied in other ways within the subject-matter defined in the following claims and equivalents thereof.

    REFERENCES



    [0097] 

    EP1448016A1 (OTICON) 18.08.2004

    EP2463856A1 (OTICON) 13.06.2012




    Claims

    1. An audio processing device (APD) comprising

    • a multitude of electric input signals (X1, ..., XN), where N>=2, each electric input signal being provided in a digitized (I1, ..., IN) form in a time-frequency representation comprising a number of frequency bands k and a number of time instances m, and

    • a control unit (CTR; WNR) receiving said digitized electric input signals (I1, ..., IN) and providing a resulting signal (Y(k, m)), wherein the control unit (CTR; WNR) is configured to determine the resulting signal from said digitized electric input signals (I1, ..., IN), or signals derived therefrom, according to a predefined scheme

    CHARACTERIZED IN THAT the predefined scheme comprises:

    • a criterion resulting in the selection of the contents of a given time-frequency unit (k, m) of one of the respective electric input signals (X1, ..., XN), or a signal derived therefrom, for being used in the corresponding time-frequency unit (k, m) of the resulting signal (Y(k, m)); and

    • selecting the time frequency units (k, m) having the largest signal to noise ratio (SNR).


     
    2. An audio processing device according to claim 1 wherein the predefined scheme comprises minimizing the energy of the resulting signal (Y(k, m)).
     
    3. An audio processing device according to claim 2 wherein the criterion for minimizing the energy is achieved by, at each point in time, selecting the electric input signal that - at that point in time - contains the least energy.
     
    4. An audio processing device according to any one of claims 1-3 wherein the predefined scheme comprises that the content of a given time-frequency bin (k, m) of the resulting signal (Y (k, m)) is determined from the content of the time-frequency bin (k, m) of the electric input signal comprising the least energy.
     
    5. An audio processing device according to any one of claims 1-4 wherein one or more of said electric input signals Xi(k, m) (i=1, 2, ..., N) represents a signal from an input transducer, such as a microphone (IU) or a wireless receiver.
     
    6. An audio processing device according to any one of claims 1-5 wherein one or more of said electric input signals Xi(k, m) (i=1, 2, ..., N) represents an omni-directional signal or a directional signal (ID).
     
    7. An audio processing device according to any one of claims 1-6 comprising a microphone system comprising at least two microphones (IU1, IU2) for converting a sound field to respective time variant electric input signals and at least two time to time-frequency conversion units, one for each microphone (IU1, IU2), to provide time-frequency representations of each electric input signal in a number of frequency bands k and a number of time instances m.
     
    8. An audio processing device according to any one of claims 1-7 comprising a normalization filter (Hi) operationally connected to one of said electrical input signals, the normalization filter (Hi) being configured to have a transfer function HN(f), f being frequency, which makes the electric input signal in question comparable and interchangeable with other normalized electric input signals Ii.
     
    9. An audio processing device according to any one of claims 1-8 wherein the control unit (CTR; WNR) comprises a combination unit allowing a weight P to be applied to at least one of the electric input signals or to a signal derived therefrom.
     
    10. An audio processing device according to claim 9 wherein the weight P is dependent on frequency.
     
    11. An audio processing device according to any one of claims 1-10 comprising a wind noise detector for providing an estimate of the amount of wind noise or any other un-correlated noise present at a specific point in time.
     
    12. An audio processing device according to claim 11 wherein a weighting factor Pi(f) of the ith electric input signal or a signal derived therefrom, i=1, 2, ..., N, is modified in dependence of the estimated amount of wind noise.
     
    13. An audio processing device according to any one of claims 10-12 wherein the control unit (CTR; WNR) comprises a minimization unit for - at a given point in time m - determining the minimum value among the normalized weighted magnitude signals (|Xi(k, m)·Hi(k)|·Pi(k)) from the inputs Xi(k, m), and a selection unit (SEL) for providing the resulting signal Y(k, m) based on the output of the minimization unit (Min{}) and the normalized input signals Xi(k, m)·Hi(k).
     
    14. Use of an audio processing device as claimed in any one of claims 1-13.
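
    The minimization and selection of claim 13 may be illustrated by the following minimal sketch. It is an assumption-laden illustration only and does not form part of the claims: the function name select_min_energy, the list-of-lists data layout and the example values are hypothetical, and ties are resolved in favour of the lowest input index, which the claims do not specify. For each frequency band k at one time instance m, the sketch forms the normalized signals Xi(k, m)·Hi(k), compares the weighted magnitudes |Xi(k, m)·Hi(k)|·Pi(k), and copies the normalized (unweighted) content of the input with the smallest weighted magnitude into Y(k, m).

```python
def select_min_energy(X, H, P):
    """Per-band minimum selection for one time instance m (illustrative sketch).

    X[i][k]: complex time-frequency value of input i in band k (hypothetical layout)
    H[i][k]: normalization filter value for input i in band k
    P[i][k]: real, non-negative weight for input i in band k
    Returns (Y, chosen): the resulting band values and the selected input index per band.
    """
    n_inputs = len(X)
    n_bands = len(X[0])
    Y = []
    chosen = []
    for k in range(n_bands):
        # Normalized input signals Xi(k, m) * Hi(k)
        normalized = [X[i][k] * H[i][k] for i in range(n_inputs)]
        # Normalized weighted magnitudes |Xi(k, m) * Hi(k)| * Pi(k)
        weighted = [abs(normalized[i]) * P[i][k] for i in range(n_inputs)]
        # Minimization unit: index of the smallest weighted magnitude
        # (assumption: the first index wins on a tie)
        i_min = min(range(n_inputs), key=weighted.__getitem__)
        # Selection unit: output the normalized, unweighted content of that input
        Y.append(normalized[i_min])
        chosen.append(i_min)
    return Y, chosen


# Two inputs, three bands, unit normalization and unit weights (made-up values)
X = [[1 + 0j, 4 + 0j, 0.5j],
     [2 + 0j, 1 + 0j, 3 + 0j]]
H = [[1, 1, 1], [1, 1, 1]]
P = [[1, 1, 1], [1, 1, 1]]
Y, chosen = select_min_energy(X, H, P)
print(chosen)
```

    In this sketch the weights Pi(k) only bias the comparison; lowering Pi(k) for a given input makes that input more likely to be selected in band k, while the selected content itself is left unweighted.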
     


    Ansprüche

    1. Audioverarbeitungsvorrichtung (ADP), umfassend

    • eine Vielzahl von elektrischen Eingangssignalen (X1, ... XN), wobei N>=2, wobei jedes elektrische Eingangssignal in einer digitalisierten (I1, ..., IN) Form in einer Zeit-Frequenz-Darstellung, umfassend eine Anzahl von Frequenzbändern k und eine Anzahl von Zeitinstanzen m, bereitgestellt wird, und

    • eine Steuereinheit (CTR; WNR), die die digitalisierten elektrischen Eingangssignale (I1, ..., IN) empfängt und ein resultierendes Signal (Y(k, m)) bereitstellt, wobei die Steuereinheit (CTR; WNR) dazu konfiguriert ist, das resultierende Signal aus den digitalisierten elektrischen Eingangssignalen (I1, ..., IN) oder daraus abgeleiteten Signalen gemäß einem vordefinierten Schema zu bestimmen, DADURCH GEKENNZEICHNET, DASS das vordefinierte Schema Folgendes umfasst:

    • ein Kriterium, das in der Auswahl der Inhalte einer gegebenen Zeit-Frequenz-Einheit (k, m) von einem der jeweiligen elektrischen Eingangssignale (X1, ..., XN) oder einem daraus abgeleiteten Signal resultiert, um in der entsprechenden Zeit-Frequenz-Einheit (k, m) des resultierenden Signals (Y(k, m)) verwendet zu werden; und

    • Auswählen der Zeit-Frequenz-Einheiten (k, m) mit dem größten Signal-Rausch-Verhältnis (SNR).


     
    2. Audioverarbeitungsvorrichtung nach Anspruch 1, wobei das vordefinierte Schema Minimieren der Energie des resultierenden Signals (Y(k, m)) umfasst.
     
    3. Audioverarbeitungsvorrichtung nach Anspruch 2, wobei das Kriterium zum Minimieren der Energie zu jedem Zeitpunkt durch Auswählen des elektrischen Eingangssignals, das zu diesem Zeitpunkt die wenigste Energie enthält, erreicht wird.
     
    4. Audioverarbeitungsvorrichtung nach einem der Ansprüche 1-3, wobei das vordefinierte Schema umfasst, dass der Inhalt eines gegebenen Zeit-Frequenz-Bins (k, m) des resultierenden Signals (Y (k, m)) aus dem Inhalt des Zeit-Frequenz-Bins (k, m) des elektrischen Signals, das die wenigste Energie enthält, bestimmt wird.
     
    5. Audioverarbeitungsvorrichtung nach einem der Ansprüche 1-4, wobei ein oder mehrere der elektrischen Eingangssignale Xi(k, m) (i=1, 2, ..., N) ein Signal von einem Eingangswandler, wie etwa einem Mikrofon (IU) oder einem Drahtlosempfänger, darstellen.
     
    6. Audioverarbeitungsvorrichtung nach einem der Ansprüche 1-5, wobei ein oder mehrere der elektrischen Eingangssignale Xi(k, m) (i=1, 2, ..., N) ein omnidirektionales Signal oder ein direktionales Signal (ID) darstellen.
     
    7. Audioverarbeitungsvorrichtung nach einem der Ansprüche 1-6, umfassend ein Mikrofonsystem, umfassend zumindest zwei Mikrofone (IU1, IU2) zum Umwandeln eines Klangfelds in entsprechende zeitvariante elektrische Eingangssignale, und zumindest zwei Zeit-zu-Zeit-Frequenz-Umwandlungseinheiten, eine für jedes Mikrofon (IU1, IU2), um Zeit-Frequenz-Darstellungen von jedem elektrischen Eingangssignal bei einer Anzahl von Frequenzbändern k und einer Anzahl von Zeitinstanzen m bereitzustellen.
     
    8. Audioverarbeitungsvorrichtung nach einem der Ansprüche 1-7, umfassend einen Normalisierungsfilter (Hi), der mit einem der elektrischen Eingangssignale wirkverbunden ist, wobei der Normalisierungsfilter (Hi) dazu konfiguriert ist, eine Übertragungsfunktion HN(f) aufzuweisen, wobei f die Frequenz ist, sodass das betreffende elektrische Eingangssignal mit anderen normalisierten elektrischen Eingangssignalen Ii vergleichbar und austauschbar wird.
     
    9. Audioverarbeitungsvorrichtung nach einem der Ansprüche 1-8, wobei die Steuereinheit (CTR; WNR) eine Kombinationseinheit umfasst, die es ermöglicht, eine Gewichtung P an zumindest einem der elektrischen Eingangssignale oder an einem daraus abgeleiteten Signal anzubringen.
     
    10. Audioverarbeitungsvorrichtung nach Anspruch 9, wobei die Gewichtung P von der Frequenz abhängig ist.
     
    11. Audioverarbeitungsvorrichtung nach einem der Ansprüche 1-10, umfassend einen Windrauschdetektor zum Bereitstellen einer Schätzung der Menge von Windrauschen oder beliebigem anderen nicht korreliertem Rauschen, das zu einem bestimmten Zeitpunkt vorhanden ist.
     
    12. Audioverarbeitungsvorrichtung nach Anspruch 11, wobei ein Gewichtungsfaktor Pi(f) des i. elektrischen Eingangssignals oder eines daraus abgeleiteten Signals, i=1, 2, ..., N, in Abhängigkeit von der geschätzten Menge von Windrauschen modifiziert ist.
     
    13. Audioverarbeitungsvorrichtung nach einem der Ansprüche 10-12, wobei die Steuereinheit (CTR; WNR) eine Minimierungseinheit zum Bestimmen, zu einem gegebenen Zeitpunkt m, des Mindestwertes unter den normalisierten gewichteten Größensignalen (|Xi(k, m) · Hi(k)| · Pi(k)) aus den Eingängen Xi(k, m) und eine Auswahleinheit (SEL) zum Bereitstellen des resultierenden Signals Y(k, m) auf Grundlage des Ausgangs der Minimierungseinheit (Min{}) und der normalisierten Eingangssignale Xi(k, m) · Hi(k) umfasst.
     
    14. Verwendung einer Audioverarbeitungsvorrichtung nach einem der Ansprüche 1-13.
     


    Revendications

    1. Dispositif de traitement audio (ADP) comprenant :

    • une multitude de signaux d'entrée électriques (X1,...XN), où N>=2, chaque signal d'entrée électrique étant délivré sous une forme numérisée (I1,...,IN) dans une représentation temps-fréquence comprenant un nombre de bandes de fréquence k et un nombre d'instances de temps m, et

    • une unité de commande (CTR ; WNR) recevant lesdits signaux d'entrée électriques numérisés (I1,..., IN) et délivrant un signal résultant (Y(k, m)), dans lequel l'unité de commande (CTR ; WNR) est configurée pour déterminer le signal résultant à partir desdits signaux d'entrée électriques numérisés (I1,..., IN), ou des signaux dérivés de ceux-ci, en fonction d'un schéma prédéfini

    CARACTÉRISÉ EN CE QUE le schéma prédéfini comprend :

    • un critère résultant dans la sélection du contenu d'une unité temps-fréquence donnée (k, m) de l'un des signaux d'entrée électriques respectifs (X1, ..., XN), ou un signal dérivé de ceux-ci, pour être utilisé dans l'unité temps-fréquence correspondante (k, m) du signal résultant (Y(k, m)) ; et

    • la sélection des unités temps-fréquence (k, m) ayant le ratio signal-bruit (SNR) le plus élevé.


     
    2. Dispositif de traitement audio selon la revendication 1, dans lequel le schéma prédéfini comprend la minimisation de l'énergie du signal résultant (Y(k, m)).
     
    3. Dispositif de traitement audio selon la revendication 2, dans lequel le critère pour minimiser l'énergie est obtenu par, à chaque point dans le temps, la sélection du signal d'entrée électrique qui - à ce point dans le temps - contient le moins d'énergie.
     
    4. Dispositif de traitement audio selon l'une quelconque des revendications 1 à 3, dans lequel le schéma prédéfini comprend que le contenu d'une gamme temps-fréquence donnée (k, m) du signal résultant (Y (k, m)) est déterminé à partir du contenu de la gamme temps-fréquence (k, m) du signal d'entrée électrique comprenant le moins d'énergie.
     
    5. Dispositif de traitement audio selon l'une quelconque des revendications 1 à 4, dans lequel un ou plusieurs desdits signaux d'entrée électriques Xi(k, m) (i=1,2, ..., N) représente un signal en provenance d'un transducteur d'entrée, tel qu'un microphone (IU) ou un récepteur sans fil.
     
    6. Dispositif de traitement audio selon l'une quelconque des revendications 1 à 5, dans lequel un ou plusieurs desdits signaux d'entrée électriques Xi(k, m) (i=1, 2, ..., N) représente un signal omnidirectionnel ou un signal directionnel (ID).
     
    7. Dispositif de traitement audio selon l'une quelconque des revendications 1 à 6, comprenant un système de microphone comprenant au moins deux microphones (IU1, IU2) destinés à convertir un champ audio en signaux d'entrée électriques variables dans le temps respectifs et au moins deux unités de conversion temps à temps-fréquence, une pour chaque microphone (IU1, IU2), pour délivrer des représentations temps-fréquence de chaque signal d'entrée électrique dans un nombre de bandes de fréquence k et un nombre d'instances de temps m.
     
    8. Dispositif de traitement audio selon l'une quelconque des revendications 1 à 7, comprenant un filtre de normalisation (Hi) connecté de manière opérationnelle à l'un desdits signaux d'entrée électrique, le filtre de normalisation (Hi) étant configuré pour avoir une fonction de transfert HN(f), f étant la fréquence, qui rend le signal d'entrée électrique en question comparable et interchangeable avec d'autres signaux d'entrée électriques normalisés Ii.
     
    9. Dispositif de traitement audio selon l'une quelconque des revendications 1 à 8, dans lequel l'unité de commande (CTR ; WNR) comprend une unité de combinaison permettant à un poids P d'être appliqué à au moins l'un des signaux d'entrée électrique ou à un signal dérivé de ceux-ci.
     
    10. Dispositif de traitement audio selon la revendication 9, dans lequel le poids P est dépendant de la fréquence.
     
    11. Dispositif de traitement audio selon l'une quelconque des revendications 1 à 10, comprenant un détecteur de bruit du vent destiné à délivrer une estimation de la quantité de bruit du vent ou tout autre bruit non-corrélé présent à un point spécifique dans le temps.
     
    12. Dispositif de traitement audio selon la revendication 11, dans lequel un facteur de pondération Pi(f) du ième signal d'entrée électrique ou un signal dérivé de celui-ci, i=1, 2, ..., N, est modifié en dépendance de la quantité estimée de bruit du vent.
     
    13. Dispositif de traitement audio selon l'une quelconque des revendications 10 à 12, dans lequel l'unité de commande (CTR ; WNR) comprend une unité de minimisation destinée à - à un point donné dans le temps m - déterminer la valeur minimum parmi les signaux de magnitude pondérés normalisés (|Xi(k, m)·Hi(k)| · Pi(k)) à partir des entrées Xi(k, m), et une unité de sélection (SEL) destinée à délivrer le signal résultant Y(k, m) sur la base de la sortie de l'unité de minimisation (Min{}) et des signaux d'entrée normalisés Xi(k, m)·Hi(k).
     
    14. Utilisation d'un dispositif de traitement audio selon l'une quelconque des revendications 1 à 13.
     




    Drawing

    Cited references

    REFERENCES CITED IN THE DESCRIPTION



    This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

    Patent documents cited in the description