(19)
(11)EP 3 229 496 B1

(12)EUROPEAN PATENT SPECIFICATION

(45)Mention of the grant of the patent:
24.06.2020 Bulletin 2020/26

(21)Application number: 17165255.5

(22)Date of filing:  06.04.2017
(51)International Patent Classification (IPC): 
H04R 25/00(2006.01)
H04R 5/033(2006.01)
A61N 1/36(2006.01)
H04R 1/10(2006.01)

(54)

HEARING DEVICE WITH NEURAL NETWORK-BASED MICROPHONE SIGNAL PROCESSING

HÖRGERÄT MIT MIKROFONSIGNALVERARBEITUNG AUF BASIS EINES NEURONALEN NETZWERKS

DISPOSITIF D'AIDE AUDITIVE AVEC TRAITEMENT DE SIGNAL DE MICROPHONE BASÉ SUR UN RÉSEAU DE NEURONES


(84)Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

(30)Priority: 06.04.2016 US 201615092489

(43)Date of publication of application:
11.10.2017 Bulletin 2017/41

(73)Proprietor: Starkey Laboratories, Inc.
Eden Prairie, MN 55344 (US)

(72)Inventors:
  • XU, Buye
    Minnetonka, MN Minnesota 55343 (US)
  • MERKS, Ivo
    Eden Prairie, MN Minnesota 55347 (US)
  • MUSTIERE, Frederic Philippe Denis
    Eden Prairie, Minnesota 55344 (US)

(74)Representative: Dentons UK and Middle East LLP 
One Fleet Place
London EC4M 7WS (GB)


(56)References cited:
WO-A2-01/97558
US-A1- 2008 212 810
US-A1- 2015 112 672
US-A1- 2003 063 759
US-A1- 2010 008 526
  
      
    Note: Within nine months from the publication of the mention of the grant of the European patent, any person may give notice to the European Patent Office of opposition to the European patent granted. Notice of opposition shall be filed in a written reasoned statement. It shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).


    Description

    TECHNICAL FIELD



    [0001] This document relates generally to hearing systems and more particularly to a system for processing microphone signals using a neural network.

    BACKGROUND



    [0002] Hearing devices provide sound for the wearer. Some examples of hearing devices are headsets, hearing aids, speakers, cochlear implants, bone conduction devices, and personal listening devices. Hearing aids provide amplification to compensate for hearing loss by transmitting amplified sounds to the wearer's ear canals. Damage to outer hair cells in a patient's cochlea results in loss of frequency resolution in the patient's auditory perception. As this condition develops, it becomes difficult for the patient to distinguish target sound, such as speech, from environmental noise. Simple amplification does not address such difficulty. Thus, there is a need to help such a patient in listening to target sounds, such as speech, in a noisy environment.

    [0003] WO01/97558 discloses an adaptive directionality microphone system using a non-linear filter implemented by a neural network.

    [0004] US2010/0008526 discloses a method for effective estimation of signal processing parameters in a hearing aid using two microphone inputs, where there is automatic adjustment of signal processing parameters using a neural network.

    [0005] US2003/0063759 discloses a directional processing system for beamforming information signals from an array of microphones, wherein a neural network can be used for signal processing direction of arrival parameters.

    [0006] US2015/0112672 discloses an echo canceller that can have two microphone inputs and a speech recognizer using a neural network.

    [0007] US2008/0212810 discloses a binaural hearing aid system having two microphone inputs and uses a neural network as a classifier to determine characteristic parameters of received sound input to a parameter map for selection of signal processing algorithms.

    SUMMARY



    [0008] The present invention provides a hearing system configured to be worn by a wearer and a method for operating a hearing system according to the appended claims.

    [0009] According to the present disclosure, a hearing system performs nonlinear processing of signals received from a plurality of microphones using a neural network to enhance a target signal in a noisy environment. In various embodiments, the neural network can be trained to improve a signal-to-noise ratio without causing substantial distortion of the target signal. An example of the target sound includes speech, and the neural network is used to improve speech intelligibility.

    [0010] In an exemplary embodiment, a hearing system includes a plurality of microphones, a control circuit, and a receiver (speaker). The microphones receive input sounds including a target sound and produce a plurality of microphone signals including the target sound. The control circuit produces an output signal using the plurality of microphone signals. The control circuit includes a neural network and controls a directionality of the plurality of microphones by processing the plurality of microphone signals using a nonlinear signal processing algorithm that is based on the neural network. The receiver produces an output sound using the output signal.

    [0011] In an exemplary embodiment, a hearing system includes a pair of left and right hearing aids configured to be worn by a wearer and communicatively coupled to each other. The left and right hearing aids each include a microphone, a control circuit, and a receiver. The microphone receives input sounds including a target sound and produces a microphone signal including the target sound. The control circuit produces an output signal using the microphone signals produced by microphones of the left and right hearing aids. The control circuit includes a neural network and controls a directionality of the microphones of the left and right hearing aids using a nonlinear signal processing algorithm that is based on the neural network. The receiver produces an output sound using the output signal.

    [0012] In an exemplary embodiment, a method for operating a hearing system to enhance a target sound is provided. Microphone signals including a target sound are received from a plurality of microphones of the hearing system. The microphone signals are processed, using a neural network-based non-linear signal processing algorithm, to control a directionality of the plurality of microphones and produce an output signal. An output sound is produced based on the output signal using a receiver of the hearing system.

    [0013] This summary is an overview of some of the teachings of the present application and not intended to be an exclusive or exhaustive treatment of the present subject matter. Further details about the present subject matter are found in the detailed description and appended claims. The scope of the present invention is defined by the appended claims.

    BRIEF DESCRIPTION OF THE DRAWINGS



    [0014] 

    FIG. 1 is a block diagram illustrating an exemplary embodiment of a hearing system using a neural network for processing microphone signals.

    FIG. 2 is a block diagram illustrating an exemplary embodiment of a pair of hearing aids including the neural network.

    FIG. 3 is an illustration of an exemplary embodiment of a neural network.

    FIG. 4 is a block diagram illustrating an exemplary embodiment of the neural network of FIG. 1.

    FIG. 5 is a graph illustrating performance of the exemplary neural network of FIG. 4 compared to performance of an exemplary ideal binaural beamformer.

    FIG. 6 is a graph illustrating performance of the exemplary neural network of FIG. 4 compared to performance of an exemplary ideal binaural beamformer.

    FIG. 7 is a block diagram illustrating another exemplary embodiment of the neural network of FIG. 1.

    FIG. 8 is a graph illustrating performance of the exemplary neural network of FIG. 7 compared to performance of an exemplary ideal binaural beamformer.

    FIG. 9 is a graph illustrating performance of the exemplary neural network of FIG. 7 compared to performance of an exemplary ideal binaural beamformer.

    FIG. 10 is a flow chart illustrating an exemplary embodiment of a method for processing microphone signals using a neural network in a hearing system.


    DETAILED DESCRIPTION



    [0015] The following detailed description of the present subject matter refers to subject matter in the accompanying drawings which show, by way of illustration, specific aspects and embodiments in which the present subject matter may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the present subject matter. References to "an", "one", or "various" embodiments in this disclosure are not necessarily to the same embodiment, and such references contemplate more than one embodiment. The following detailed description is demonstrative and not to be taken in a limiting sense. The scope of the present subject matter is defined by the appended claims.

    [0016] This document discusses, among other things, a hearing system that performs neural network based processing of microphone signals to enhance target sounds for better listening, such as improving speech intelligibility in a noisy environment. Though speech intelligibility is discussed as a specific example in this document, the present subject matter can be applied in various hearing devices for enhancing target sounds of any type (e.g., speech or music) in a noisy signal (e.g., babble noise or machine noise). Such devices include, among other things, hearing assistance devices, such as headsets, hearing aids, speakers, cochlear implants, bone conduction devices, and personal listening devices.

    [0017] Bilateral directional microphones and binaural beamforming have been used in hearing assistance devices for processing signals that include speech against a noisy background, with limited improvement in signal-to-noise ratio (SNR). The present subject matter can use a neural network based binaural algorithm that can achieve performance exceeding the theoretical upper limit provided by a directional microphone or a binaural beamformer in processing microphone signals for a hearing assistance system. The neural network based binaural algorithm is a nonlinear signal processing algorithm that can exceed the theoretical limit achievable by the existing linear algorithms in processing binaural microphone signals. Training of this neural network is highly flexible and may take into account various measures as cost functions. Specific neural network structure and training strategy have been designed and tested to achieve a desirable balance between sound quality and SNR improvement. In various embodiments, the neural network based nonlinear signal processing algorithm can introduce controlled nonlinearity to the signals such that the SNR can be greatly improved while the sound quality is not substantially compromised.

    [0018] FIG. 1 is a block diagram illustrating an exemplary embodiment of a hearing system 100 that uses a neural network for processing microphone signals. System 100 includes a plurality of microphones 1-N (102-1 to 102-N), a control circuit 104, and a receiver (speaker) 106.

    [0019] Microphones 102 produce a plurality of microphone signals including speech. In one embodiment, microphones 102 are two microphones (N=2). In various embodiments, microphones 102 can include two or more microphones. Microphones 102 are each communicatively coupled to control circuit 104 via a wired or wireless link. Control circuit 104 processes the plurality of microphone signals to produce an output signal. Receiver 106 produces an output sound using the output signal and transmits the output sound to a listener.

    [0020] Control circuit 104 can include a neural network 108 and control directionality of microphones 102 using the plurality of microphone signals by executing a neural network based signal processing algorithm. In various embodiments, the neural network based signal processing algorithm can include a nonlinear signal processing algorithm. In various embodiments, neural network 108 can be trained to control the directionality of microphones 102 by processing the plurality of microphone signals to achieve a desirable balance between the SNR (with the clean speech being the target signal) and the distortion of the speech, as further discussed below, with reference to FIGS. 3-9. In various embodiments, control circuit 104 can precondition the plurality of microphone signals before processing them using neural network 108, such as by amplifying and/or filtering each of the microphone signals as needed. In various embodiments, control circuit 104 can process the output of neural network 108 to produce the output signal as needed.

    [0021] In various embodiments, system 100 can be implemented entirely or partially in hearing aids. For example, microphones 102 can include one or more microphones in the hearing aids, one or more ad-hoc microphone arrays, and one or more remote microphones that are external to but communicatively coupled to the hearing aids. Control circuit 104 can be implemented in one or more processors of the hearing aids and/or one or more processors in an external device communicatively coupled to the hearing aids. One example of such an external device includes a cellphone installed with an application implementing portions of control circuit 104. In addition to or in place of receiver 106 for transmitting the output to the listener being a hearing aid wearer, the output can be delivered to another person or device as needed, such as a user other than the hearing aid wearer or a speech recognizer.

    [0022] FIG. 2 is a block diagram illustrating an exemplary embodiment of a hearing system 200, which represents an exemplary embodiment of system 100 with a pair of hearing aids each including a neural network. System 200 includes a left hearing aid 210L and a right hearing aid 210R communicatively coupled to each other via a wireless binaural link 216.

    [0023] Left hearing aid 210L can be configured to be worn in or about the left ear of a hearing aid wearer and includes a hearing aid circuit 212L and a shell 214L that houses hearing aid circuit 212L. Examples of shell 214L include, but are not limited to, a housing for a BTE, ITE, ITC, RIC, CIC, RITE or deep insertion types of hearing aids for use with the left ear. Hearing aid circuit 212L includes a microphone 202L, an antenna 220L, a communication circuit 218L, a control circuit 204L, and a receiver (speaker) 206L. Microphone 202L receives sounds from the environment of the hearing aid wearer and produces a left microphone signal representing the received sounds. Communication circuit 218L performs wireless communication including ear-to-ear communication with right hearing aid 210R using antenna 220L via binaural link 216. Control circuit 204L processes the left microphone signal and a right microphone signal received by communication circuit 218L to produce a left output signal. Receiver 206L produces a left sound using the left output signal and transmits the left sound to the left ear of the hearing aid wearer.

    [0024] Right hearing aid 210R can be configured to be worn in or about the right ear of the hearing aid wearer and includes a hearing aid circuit 212R and a shell 214R that houses hearing aid circuit 212R. Examples of shell 214R include, but are not limited to, a housing for a BTE, ITE, ITC, RIC, CIC, RITE or deep insertion types of hearing aids for use with the right ear. Hearing aid circuit 212R includes a microphone 202R, an antenna 220R, a communication circuit 218R, a control circuit 204R, and a receiver (speaker) 206R. Microphone 202R receives sounds from the environment of the hearing aid wearer and produces a right microphone signal representing the received sounds. Communication circuit 218R performs wireless communication including ear-to-ear communication with left hearing aid 210L using antenna 220R via binaural link 216. Control circuit 204R processes the right microphone signal and the left microphone signal received by communication circuit 218R to produce a right output signal. Receiver 206R produces a right sound using the right output signal and transmits the right sound to the right ear of the hearing aid wearer.

    [0025] Control circuit 204L represents an exemplary embodiment of control circuit 104 and includes a neural network 208L. Control circuit 204R also represents an exemplary embodiment of control circuit 104 and includes a neural network 208R. Examples of neural networks 208L and 208R include neural network 108 including its various embodiments as discussed in this document. In various embodiments, control circuit 204L can precondition the left microphone signal before processing it using neural network 208L and/or process the output of neural network 208L to produce the left output signal, as needed. Likewise, control circuit 204R can precondition the right microphone signal before processing it using neural network 208R and/or process the output of neural network 208R to produce the right output signal, as needed.

    [0026] FIG. 3 is an illustration of an exemplary embodiment of a neural network 308. The illustrated embodiment is an example of a time domain neural network structure that takes delayed time domain signals (x(t), x(t-1), x(t-2), x(t-3), ...) as inputs and generates y(t) as output. This type of structure can be easily modified to process multiple input signals and generate multiple output signals. Neural network 308 can be trained to process microphone signals such as the signals output from microphones 102 or microphones 202L and 202R, and provide an output signal with improved SNR for improved speech intelligibility. In an exemplary embodiment, neural network 308 is a fixed neural network that remains unchanged after training. In another exemplary embodiment, neural network 308 is an adaptive neural network that is adaptive to a changing environment in which the hearing system is used. In various embodiments, neural network 308 can be a time domain neural network (such as illustrated in FIG. 3) or a frequency domain neural network. The structure of neural network 308 can be highly flexible.
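
    The delayed-sample input described above can be sketched as follows. This is only an illustration, not the implementation of neural network 308: the function names `delayed_frames` and `binaural_frames` are hypothetical, and treating samples before the start of the signal as zero is an assumption.

```python
import numpy as np

def delayed_frames(x, num_taps):
    """Row t holds [x(t), x(t-1), ..., x(t-num_taps+1)].

    Samples before the start of the signal are taken as zero (an assumption).
    """
    x = np.asarray(x, dtype=float)
    padded = np.concatenate([np.zeros(num_taps - 1), x])
    # padded[t:t+num_taps] ends at x(t); reverse so the newest sample comes first.
    return np.stack([padded[t:t + num_taps][::-1] for t in range(len(x))])

def binaural_frames(left, right, num_taps):
    """Concatenate the delayed samples of two microphone signals per time step."""
    return np.hstack([delayed_frames(left, num_taps),
                      delayed_frames(right, num_taps)])

frames = delayed_frames([1.0, 2.0, 3.0, 4.0], num_taps=3)
pair = binaural_frames([1.0, 2.0], [5.0, 6.0], num_taps=2)
```

    Each row of `frames` is one input vector for a single output sample y(t); `binaural_frames` shows one possible way to extend the same structure to multiple input signals.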

    [0027] FIG. 4 is a block diagram illustrating an exemplary embodiment of a neural network 408, which represents an exemplary embodiment of neural network 108. Neural network 408 includes an input 424, a nonlinear hidden layer 426, a linear output layer 428, and an output 430. In the illustrated embodiment, input 424 receives time sequence samples from microphones (such as from binaural microphones 202L and 202R, with 32 samples from each microphone). In various embodiments, neural network 408 can be trained on synthesized target sound (e.g., speech or music) in various types of noise conditions (e.g., with babble noise or machine noise) using various cost functions. Examples of the cost functions include mean squared error (MSE), weighted MSE, mean absolute error (MAE), statistical forecast error (SFE), and perceptually inspired metrics such as SII (speech intelligibility index) and STI (speech transmission index). In one experiment, for example, neural network 408 was trained on synthesized speech in babble noise conditions with the desired speech coming from the front. During the training, the target signal was the clean speech and the MSE, a cost function, was minimized by properly adjusting synaptic weights in neural network 408, which included a plurality of synapses. After the training, the performance of neural network 408 in SNR improvement was tested on a separate training dataset and was compared to an ideal binaural beamformer (a linear binaural beamformer optimized for the testing condition). FIGS. 5 and 6 are each a graph illustrating performance of neural network 408 (NN OUTPUT) compared to performance of the ideal binaural beamformer (BBF). Each graph plots an SNR 434 of neural network 408 and an SNR 436 of the ideal BBF over a range of frequencies, and shows the SNR improvement achieved by neural network 408. FIG. 5 shows the SNR improvement on an input signal having an average SNR of 5 dB. FIG. 6 shows the SNR improvement on an input signal having an average SNR of 0 dB.
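
    A minimal sketch of a network with the layer arrangement described for FIG. 4 (one nonlinear hidden layer followed by a linear output layer) together with an MSE cost is given below. Only the input size of 64 (32 samples from each of two microphones) comes from the text. The hidden-layer size, the tanh activation, the single output sample, the random weights, and the stand-in data are all assumptions made for illustration.

```python
import numpy as np

N_IN = 64      # 32 samples from each of two microphones, as in the text
N_HIDDEN = 16  # hypothetical size; the source does not state it

rng = np.random.default_rng(0)
W_hidden = rng.normal(0.0, 0.1, size=(N_IN, N_HIDDEN))
b_hidden = np.zeros(N_HIDDEN)
W_out = rng.normal(0.0, 0.1, size=(N_HIDDEN, 1))
b_out = np.zeros(1)

def forward(X):
    """Map a batch of input frames (batch, 64) to output samples (batch, 1)."""
    hidden = np.tanh(X @ W_hidden + b_hidden)  # nonlinear hidden layer (tanh assumed)
    return hidden @ W_out + b_out              # linear output layer

def mse(predicted, target):
    """Mean squared error between the network output and the clean target."""
    return float(np.mean((predicted - target) ** 2))

X = rng.normal(size=(5, N_IN))    # stand-in frames, not real microphone data
clean = rng.normal(size=(5, 1))   # stand-in clean-speech samples
loss = mse(forward(X), clean)
```

    Training would then adjust `W_hidden`, `b_hidden`, `W_out`, and `b_out` to reduce `loss`; the other cost functions listed above could be substituted for `mse` in the same place.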

    [0028] The fact that neural network 408 can improve the SNR to an extent that exceeds the theoretical limit of linear binaural beamformer indicates that neural network 408 introduces nonlinearity to the signal. However, though a good SNR improvement was achieved, the distortion to the desired speech as well as the noise could be annoying. To reduce the audible distortion, sound quality measures can be incorporated into the cost function, the structure of the neural network can be adjusted, and/or the training data can be adjusted. The following is an example demonstrating a specific network structure (illustrated in FIG. 7) combined with carefully designed training data to achieve a balance between SNR improvement and distortion.

    [0029] FIG. 7 is a block diagram illustrating an exemplary embodiment of a neural network 708, which represents another exemplary embodiment of neural network 108. Neural network 708 includes an input 724, a linear first hidden layer 726, a nonlinear second hidden layer 727, a linear output layer 728, and an output 730. In the illustrated embodiment, input 724 receives time sequence samples from microphones (such as from binaural microphones 202L and 202R, with 16 samples from each microphone). A shortcut connection 732 between the output of the linear first hidden layer 726 and the input of output layer 728 provides a direct path for a portion of the input signal to pass through without nonlinear distortion. Thus, neural network 708 includes a linear signal processing path between input 724 and output 730 and a nonlinear signal processing path between input 724 and output 730. The linear path includes first hidden layer 726 and output layer 728. The nonlinear path includes first hidden layer 726, second hidden layer 727, and output layer 728. In other words, first hidden layer 726 has an input directly connected to input 724 and an output. Second hidden layer 727 has an input directly connected to the output of hidden layer 726 and an output. Output layer 728 has an input directly connected to the output of hidden layer 726, another input directly connected to the output of hidden layer 727, and an output directly connected to output 730.
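
    The two signal paths described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the layer widths `N_H1` and `N_H2`, the tanh activation, the omission of bias terms, and the parameter names are hypothetical; only the input size of 32 (16 samples from each of two microphones) and the connectivity come from the text.

```python
import numpy as np

N_IN = 32  # 16 samples from each of two microphones, as in the text
N_H1 = 12  # hypothetical width of the linear first hidden layer
N_H2 = 12  # hypothetical width of the nonlinear second hidden layer

def init_params(seed=0):
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (N_IN, N_H1)),     # linear first hidden layer
        "W2": rng.normal(0.0, 0.1, (N_H1, N_H2)),     # nonlinear second hidden layer
        "Wo_lin": rng.normal(0.0, 0.1, (N_H1, 1)),    # output weights fed by the shortcut
        "Wo_nl": rng.normal(0.0, 0.1, (N_H2, 1)),     # output weights fed by layer 2
    }

def forward(X, p):
    """Output layer sums a linear path (shortcut) and a nonlinear path."""
    h1 = X @ p["W1"]              # first hidden layer: linear
    h2 = np.tanh(h1 @ p["W2"])    # second hidden layer: nonlinear (tanh assumed)
    return h1 @ p["Wo_lin"] + h2 @ p["Wo_nl"]

params = init_params()
X = np.random.default_rng(1).normal(size=(4, N_IN))  # stand-in input frames
y = forward(X, params)

# With the nonlinear branch switched off, the network reduces to a linear map.
linear_only = dict(params, Wo_nl=np.zeros((N_H2, 1)))
```

    Zeroing `Wo_nl` leaves only the shortcut path, so scaling the input scales the output by the same factor; with the nonlinear branch active, that proportionality no longer holds.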

    [0030] Neural network 708 was trained at SNRs of 0 dB, 10 dB, and 20 dB with the target signal always being the clean speech. The training is also a crucial step for reducing distortion of the speech. FIGS. 8 and 9 are each a graph illustrating performance of neural network 708 (NN OUTPUT) compared to performance of the ideal binaural beamformer (BBF). Each graph plots an SNR 734 of neural network 708 and SNR 436 of the ideal BBF over a range of frequencies, and shows the SNR improvement achieved by neural network 708. FIG. 8 shows the SNR improvement on an input signal having an average SNR of 5 dB. FIG. 9 shows the SNR improvement on an input signal having an average SNR of 0 dB. Compared to the example of neural network 408 as discussed above with reference to FIGS. 4-6, neural network 708 provided less SNR improvement (though still higher than the ideal BBF), but the distortion associated with neural network 708 was virtually imperceptible.
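
    One possible way to prepare training pairs at the stated SNRs is sketched below. The source does not describe how its training data were synthesized, so the noise-scaling approach, the function names `snr_db` and `mix_at_snr`, and the random stand-in signals are all assumptions.

```python
import numpy as np

def snr_db(signal, noise):
    """Ratio of mean signal power to mean noise power, in dB."""
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

def mix_at_snr(speech, noise, target_snr_db):
    """Scale the noise so the speech-to-noise power ratio equals target_snr_db."""
    gain = np.sqrt(np.mean(speech ** 2)
                   / (np.mean(noise ** 2) * 10.0 ** (target_snr_db / 10.0)))
    return speech + gain * noise

rng = np.random.default_rng(0)
speech = rng.normal(size=16000)  # stand-in for clean speech, not real audio
noise = rng.normal(size=16000)   # stand-in for babble noise

# Each entry: (SNR level, noisy network input, clean target).
training_set = [(level, mix_at_snr(speech, noise, level), speech)
                for level in (0.0, 10.0, 20.0)]
```

    The clean signal is kept as the target at every SNR level, matching the statement that the target signal was always the clean speech.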

    [0031] Neural network 708 is illustrated in FIG. 7 and discussed above by way of example, but not by way of restriction. In various embodiments, neural network 108 can include a linear signal processing path and a nonlinear signal processing path such that the output includes components being the input subjected to only linear signal processing and therefore not distorted as a result of nonlinear processing. For example, the nonlinear signal processing path can include one or more linear layers and one or more nonlinear layers, and the linear signal processing path can include only the one or more linear layers while bypassing each of the one or more nonlinear layers.

    [0032] In various embodiments, the cost function in the training of neural network 108, including its various embodiments, can incorporate various speech intelligibility and sound quality measures to optimize the neural network for various working conditions and/or user preferences. In various embodiments, neural network 108, including its various embodiments, can be trained in both time domain and frequency domain. In various embodiments, neural network 108, including its various embodiments, can be fixed (i.e., kept unchanged after the training) or adaptive (i.e., dynamically adjustable based on the real environment). In various embodiments, neural network 108, including its various embodiments, can be implemented digitally, in the form of analog circuits, or as a combination of digital and analog circuits.

    [0033] FIG. 10 is a flow chart illustrating an exemplary embodiment of a method 1040 for processing microphone signals using a neural network in a hearing system. Examples of the hearing system include system 100 and its various embodiments as described by this document. Examples for the neural network used in performing method 1040 include neural network 108 and its various embodiments as discussed in this document. In various embodiments, method 1040 can be performed to enhance a target sound in a noisy background for better listening to the target sound. In an exemplary embodiment, as discussed below as an example, the target sound is a speech, and method 1040 can be performed to improve intelligibility of speech in a noisy background.

    [0034] At 1042, microphone signals are received from a plurality of microphones of the hearing system. The microphone signals include a speech received by the microphones. In an exemplary embodiment, the hearing system includes a pair of left and right hearing aids each being worn in or about an ear of a hearing aid wearer, such as the pair of left and right hearing aids 210L and 210R. The received microphone signals include a left microphone signal received from the left hearing aid and a right microphone signal received from the right hearing aid.

    [0035] At 1044, the microphone signals are processed, using a neural network-based signal processing algorithm, to control a directionality of the plurality of microphones and produce an output signal. In various embodiments, the neural network-based signal processing algorithm can include a nonlinear signal processing algorithm. This includes, for example, processing the microphone signals using a linear signal processing path and a nonlinear signal processing path. In various embodiments, the microphone signals can be processed using a neural network trained for a desirable balance between an SNR and distortion of the speech. In an exemplary embodiment, the neural network is trained with a clean speech as the target signal and a mean squared error as a cost function. In an exemplary embodiment, the mean squared error is approximately minimized by adjusting synaptic weights in the neural network. In various embodiments, the microphone signals can be processed within a hearing device, such as a hearing aid, and/or one or more devices external to but communicatively coupled to the hearing aid. An example of such an external device includes a cellphone. This allows for distributed processing that off-loads the processing work from the hearing aid.
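
    The idea of approximately minimizing a mean squared error by adjusting weights can be illustrated with plain gradient descent on a small one-hidden-layer network. The source does not name an optimizer, so gradient descent, the learning rate, the network size, the tanh activation, and the stand-in data below are all assumptions.

```python
import numpy as np

def train_mse(X, T, n_hidden=8, lr=0.05, steps=200, seed=0):
    """Full-batch gradient descent on the MSE of a tanh-hidden, linear-output net."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], n_hidden))
    W2 = rng.normal(0.0, 0.1, (n_hidden, 1))
    losses = []
    for _ in range(steps):
        H = np.tanh(X @ W1)              # nonlinear hidden layer
        Y = H @ W2                       # linear output layer
        err = Y - T
        losses.append(float(np.mean(err ** 2)))
        dY = 2.0 * err / len(X)          # gradient of the MSE w.r.t. the output
        grad_W2 = H.T @ dY
        dH = (dY @ W2.T) * (1.0 - H ** 2)  # back through tanh
        grad_W1 = X.T @ dH
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
    return W1, W2, losses

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 8))  # stand-in input frames
T = X[:, :1]                   # stand-in "clean" target: the first input sample
W1, W2, losses = train_mse(X, T)
```

    The recorded `losses` give a simple check that the weight updates are reducing the cost over the course of training.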

    [0036] At 1046, an output sound is produced based on the output signal using a receiver (speaker) of the hearing assistance system. The output sound is delivered to the user of the hearing assistance system, such as a hearing aid wearer when the hearing assistance system includes the pair of left and right hearing aids.

    [0037] In various embodiments, the present subject matter provides a neural network based binaural algorithm that can achieve performance exceeding the theoretical upper limit provided by a binaural beamformer in processing microphone signals. Neural network 408, as discussed above with reference to FIGS. 4-6, is an example that demonstrates that a substantially better SNR improvement can be achieved when compared to the upper limit of SNR improvement provided by a binaural beamformer. However, this is achieved at the cost of obvious nonlinear distortions to the target signal as well as the noise. If the distortion to the target signal is of concern, one could incorporate sound quality measures into the cost function during training of the neural network, adjust the structure of the neural network, and/or adjust training data. Neural network 708, as discussed above with reference to FIGS. 7-9, is an example that demonstrates a specific neural network structure that can achieve a balance between SNR improvement and distortion of the target signal when combined with carefully designed training data.

    [0038] Hearing devices typically include at least one enclosure or housing, a microphone, hearing device electronics including processing electronics, and a speaker or "receiver." Hearing devices may include a power source, such as a battery. In various embodiments, the battery may be rechargeable. In various embodiments multiple energy sources may be employed. It is understood that in various embodiments the receiver is optional. It is understood that variations in communications protocols, antenna configurations, and combinations of components may be employed without departing from the scope of the present subject matter. Antenna configurations may vary and may be included within an enclosure for the electronics or be external to an enclosure for the electronics. Thus, the examples set forth herein are intended to be demonstrative and not a limiting or exhaustive depiction of variations.

    [0039] It is understood that digital hearing aids include a processor. For example, control circuit 104 and its various embodiments may be implemented in a processor. In digital hearing aids with a processor, programmable gains may be employed to adjust the hearing aid output to a wearer's particular hearing impairment. The processor may be a digital signal processor (DSP), microprocessor, microcontroller, other digital logic, or combinations thereof. The processing may be done by a single processor, or may be distributed over different devices. The processing of signals referenced in this application can be performed using the processor or over different devices. Processing may be done in the digital domain, the analog domain, or combinations thereof. Processing may be done using subband processing techniques. Processing may be done using frequency domain or time domain approaches. Some processing may involve both frequency and time domain aspects. For brevity, in some examples drawings may omit certain blocks that perform frequency synthesis, frequency analysis, analog-to-digital conversion, digital-to-analog conversion, amplification, buffering, and certain types of filtering and processing. In various embodiments the processor can be adapted to perform instructions stored in one or more memories, which may or may not be explicitly shown. Various types of memory may be used, including volatile and nonvolatile forms of memory. In various embodiments, the processor or other processing devices can execute instructions to perform a number of signal processing tasks. Such embodiments may include analog components in communication with the processor to perform signal processing tasks, such as sound reception by a microphone, or playing of sound using a receiver (i.e., in applications where such transducers are used). In various embodiments, different realizations of the block diagrams, circuits, and processes set forth herein can be created by one of skill in the art without departing from the scope of the present subject matter.

    [0040] Various embodiments of the present subject matter support wireless communications with a hearing device. In various embodiments the wireless communications can include standard or nonstandard communications. Some examples of standard wireless communications include, but are not limited to, Bluetooth™, low energy Bluetooth, IEEE 802.11 (wireless LANs), 802.15 (WPANs), and 802.16 (WiMAX). Cellular communications may include, but are not limited to, CDMA, GSM, ZigBee, and ultra-wideband (UWB) technologies. In various embodiments, the communications are radio frequency communications. In various embodiments the communications are optical communications, such as infrared communications. In various embodiments, the communications are inductive communications. In various embodiments, the communications are ultrasound communications. Although embodiments of the present system may be demonstrated as radio communication systems, it is possible that other forms of wireless communications can be used. It is understood that past and present standards can be used. It is also contemplated that future versions of these standards and new future standards may be employed without departing from the scope of the present subject matter.

    [0041] The wireless communications support a connection from other devices. Such connections include, but are not limited to, one or more mono or stereo connections or digital connections having link protocols including, but not limited to, 802.3 (Ethernet), 802.4, 802.5, USB, ATM, Fibre Channel, FireWire (IEEE 1394), InfiniBand, or a native streaming interface. In various embodiments, such connections include all past and present link protocols. It is also contemplated that future versions of these protocols and new protocols may be employed without departing from the scope of the present subject matter.

    [0042] In various embodiments, the present subject matter is used in hearing devices that are configured to communicate with mobile phones. In such embodiments, the hearing device may be operable to perform one or more of the following: answer incoming calls, hang up on calls, and/or provide two-way telephone communications. In various embodiments, the present subject matter is used in hearing devices configured to communicate with packet-based devices. In various embodiments, the present subject matter includes hearing devices configured to communicate with streaming audio devices. In various embodiments, the present subject matter includes hearing devices configured to communicate with Wi-Fi devices. In various embodiments, the present subject matter includes hearing devices capable of being controlled by remote control devices.

    [0043] It is further understood that different hearing devices may embody the present subject matter without departing from the scope of the present disclosure.

    [0044] The devices depicted in the figures are intended to demonstrate the subject matter, but not necessarily in a limited, exhaustive, or exclusive sense. It is also understood that the present subject matter can be used with a device designed for use in the right ear or the left ear, or both ears, of the wearer.

    [0045] The present subject matter may be employed in hearing devices, such as headsets, hearing aids, speakers, cochlear implants, bone conduction devices, and personal listening devices.

    [0046] The present subject matter is demonstrated for use in hearing devices, such as hearing aids, including but not limited to, behind-the-ear (BTE), in-the-ear (ITE), in-the-canal (ITC), receiver-in-canal (RIC), or completely-in-the-canal (CIC) type hearing aids. It is understood that behind-the-ear type hearing aids may include devices that reside substantially behind the ear or over the ear. Such devices may include hearing aids with receivers associated with the electronics portion of the behind-the-ear device, or hearing aids of the type having receivers in the ear canal of the user, including but not limited to receiver-in-canal (RIC) or receiver-in-the-ear (RITE) designs. The present subject matter can also be used in hearing assistance devices generally, such as cochlear implant type hearing devices. The present subject matter can also be used in deep insertion devices having a transducer, such as a receiver or microphone. The present subject matter can be used in devices whether such devices are standard or custom fit and whether they provide an open or an occlusive design. It is understood that other hearing devices not expressly stated herein may be used in conjunction with the present subject matter.

    [0047] This application is intended to cover adaptations or variations of the present subject matter. It is to be understood that the above description is intended to be illustrative, and not restrictive. The scope of the present subject matter is defined by the appended claims.


    Claims

    1. A hearing system configured to be worn by a wearer, comprising:

    a plurality of microphones (102-1, 102-N) configured to receive input sounds including a target sound and to produce a plurality of microphone signals including the target sound;

    a control circuit (104) configured to produce an output signal using the plurality of microphone signals, the control circuit (104) including a neural network (108, 408, 708) and configured to control directionality of the plurality of microphones (102-1, 102-N) by processing the plurality of microphone signals using a nonlinear signal processing algorithm that is based on the neural network (108, 408, 708); and

    a receiver (106) configured to produce an output sound using the output signal;

    wherein the neural network (108, 408, 708) comprises:

    an input (424, 724) to receive the plurality of microphone signals;

    an output (430, 730);

    a linear signal processing path between the input and the output, the linear signal processing path including one or more linear hidden layers (428, 726, 728); and

    a nonlinear signal processing path between the input and the output, the nonlinear signal processing path including one or more nonlinear hidden layers (426, 727).


     
    2. The system according to claim 1, wherein the neural network (108, 408, 708) is trained to process the plurality of microphone signals for a desirable balance between a signal-to-noise ratio, SNR, and distortion of the target sound.
     
    3. The system according to any of the preceding claims, wherein the neural network (108, 408, 708) is an adaptive neural network that is adaptive to a changing environment in which the hearing system is used.
     
    4. The system according to any of the preceding claims, wherein the neural network (108, 408, 708) is trained with a synthesized target sound in a noisy condition and a specified cost function.
     
    5. The system according to claim 4, wherein the target sound is speech, and the neural network (108, 408, 708) is trained with a clean speech in a babble noise condition as the synthesized target sound in the noisy condition, and a mean squared error as the cost function.
     
    6. The system according to claim 5, wherein the neural network (108, 408, 708) comprises a plurality of synapses and adjustable synaptic weights, and the mean squared error is approximately minimized by adjusting the synaptic weights.
     
    7. The system according to any of the preceding claims, comprising a pair of left and right hearing aids configured to be worn by the wearer and communicatively coupled to each other, the left and right hearing aids including the plurality of microphones (102-1, 102-N), the control circuit (104), and the receiver (106).
     
    8. The system according to claim 7, further comprising a remote microphone communicatively coupled to the pair of left and right hearing aids and configured to receive a remote sound and produce a remote microphone signal, wherein the control circuit (104) is configured to produce the output signal using the microphone signals produced by microphones of the left and right hearing aids and the remote microphone signal.
     
    9. A method for operating a hearing system, comprising:

    receiving (1042) microphone signals including a target sound from a plurality of microphones (102-1, 102-N) of the hearing system;

    processing (1044) the microphone signals, using a neural network-based non-linear signal processing algorithm, to control a directionality of the plurality of microphones (102-1, 102-N) and produce an output signal; and

    producing (1046) an output sound based on the output signal using a receiver (106) of the hearing system;

    wherein the neural network-based processing comprises an input to receive the plurality of microphone signals and an output, the processing using:

    a linear signal processing path between the input and the output, the linear signal processing path including one or more linear hidden layers (428, 726, 728); and

    a nonlinear signal processing path between the input and the output, the nonlinear signal processing path including one or more nonlinear hidden layers (426, 727).


     
    10. The method according to claim 9, wherein processing the microphone signals comprises processing the microphone signals using a neural network (108, 408, 708) trained for a cost function and a desirable balance between a signal-to-noise ratio, SNR, and distortion of the target sound.
     
    11. The method according to claim 10, wherein the target sound is a speech, and using the neural network (108, 408, 708) trained for the desirable balance between the SNR and the distortion of the target sound comprises:

    using a neural network (108, 408, 708) trained with a mean squared error as the cost function and a clean speech as the target sound; and

    approximately minimizing the mean squared error by adjusting synaptic weights in the neural network (108, 408, 708).


     
    12. The method according to any of claims 9 to 11, wherein receiving the microphone signals comprises receiving a left microphone signal from a left hearing aid and a right microphone signal from a right hearing aid, the left and right hearing aid each being worn in or about an ear of a hearing aid wearer.
     
    13. The method according to claim 12, wherein receiving the microphone signals further comprises receiving one or more microphone signals from a microphone external to the left and right hearing aids and communicatively coupled to the left and right hearing aids, and processing the microphone signals comprises processing the microphone signals using at least a processor of the left and right hearing aids.
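
    By way of illustration only, the network structure recited in claims 1 and 9 (a linear path and a nonlinear path between a common input and output) and the training recited in claims 4 to 6 and 11 (adjusting synaptic weights to approximately minimize a mean squared error) can be sketched as follows. This is a minimal sketch under stated assumptions and does not limit or interpret the claims. The names TwoPathNet and train_demo, the layer sizes, the tanh activation, the summing of the two paths at the output, the use of plain batch gradient descent, and the sine tone with Gaussian noise (standing in for clean speech in babble noise) are all assumptions introduced for the example.

```python
import numpy as np


class TwoPathNet:
    """Toy network with parallel linear and nonlinear paths (illustrative only)."""

    def __init__(self, n_mics, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # Linear path: one hidden layer with no activation function.
        self.A = rng.normal(0.0, 0.1, (n_hidden, n_mics))
        self.u = rng.normal(0.0, 0.1, n_hidden)
        # Nonlinear path: one hidden layer with a tanh activation (assumed).
        self.B = rng.normal(0.0, 0.1, (n_hidden, n_mics))
        self.v = rng.normal(0.0, 0.1, n_hidden)

    def forward(self, x):
        """x has shape (samples, n_mics); returns one output value per sample."""
        self.h_lin = x @ self.A.T
        self.h_nl = np.tanh(x @ self.B.T)
        # Assumption: the two paths are summed at the output.
        return self.h_lin @ self.u + self.h_nl @ self.v

    def train_step(self, x, target, lr):
        """One gradient-descent step on the MSE; returns the loss before the update."""
        err = self.forward(x) - target
        loss = float(np.mean(err ** 2))
        g = 2.0 * err / len(err)                       # dLoss/dOutput per sample
        grad_u = self.h_lin.T @ g
        grad_v = self.h_nl.T @ g
        grad_A = np.outer(self.u, g @ x)
        grad_B = (g[:, None] * (1.0 - self.h_nl ** 2) * self.v).T @ x
        # Adjust the weights only after all gradients have been computed.
        self.u -= lr * grad_u
        self.v -= lr * grad_v
        self.A -= lr * grad_A
        self.B -= lr * grad_B
        return loss


def train_demo(steps=300, lr=0.02, seed=0):
    """Train on a synthetic target in noise and return the loss history."""
    rng = np.random.default_rng(seed)
    t = np.arange(400) / 16000.0
    target = np.sin(2.0 * np.pi * 500.0 * t)           # stand-in for the clean target sound
    # Two microphone signals: the target plus independent noise at each microphone.
    x = np.stack([target + 0.3 * rng.normal(size=t.size),
                  target + 0.3 * rng.normal(size=t.size)], axis=1)
    net = TwoPathNet(n_mics=2, n_hidden=4, seed=seed)
    return [net.train_step(x, target, lr) for _ in range(steps)]
```

    Calling train_demo() returns the mean squared error recorded at each step, which should fall as the weights are adjusted. The sketch processes each sample independently and models neither microphone spacing nor temporal context, so it illustrates only the two-path structure and the cost function, not directional processing in a real device.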
     


    Ansprüche

    1. Hörsystem, das konfiguriert ist, um von einem Träger getragen zu werden, umfassend:

    mehrere Mikrofone (102-1, 102-N), die konfiguriert sind, um Eingangstöne einschließlich eines Ziellauts zu empfangen und mehrere Mikrofonsignale einschließlich des Ziellauts zu erzeugen;

    eine Steuerschaltung (104), die konfiguriert ist, um ein Ausgangssignal unter Verwendung der mehreren Mikrofonsignale zu erzeugen, wobei die Steuerschaltung (104) ein neuronales Netz (108, 408, 708) umfasst und konfiguriert ist, um die Richtungsabhängigkeit der mehreren Mikrofone (102-1, 102-N) durch Verarbeiten der mehreren Mikrofonsignale unter Verwendung eines nichtlinearen Signalverarbeitungsalgorithmus zu steuern, der auf dem neuronalen Netzwerk (108, 408, 708) basiert; und

    einen Empfänger (106), der konfiguriert ist, um unter Verwendung des Ausgangssignals einen Ausgangston zu erzeugen;

    wobei das neuronale Netz (108, 408, 708) Folgendes umfasst:

    einen Eingang (424, 724) zum Empfangen der mehreren Mikrofonsignale;

    einen Ausgang (430, 730);

    einen linearen Signalverarbeitungspfad zwischen dem Eingang und dem Ausgang, wobei der lineare Signalverarbeitungspfad eine oder mehrere lineare verborgene Schichten (428, 726, 728) umfasst; und

    einen nichtlinearen Signalverarbeitungspfad zwischen dem Eingang und dem Ausgang, wobei der nichtlineare Signalverarbeitungspfad eine oder mehrere nichtlineare verborgene Schichten (426, 727) umfasst.


     
    2. System nach Anspruch 1, wobei das neuronale Netzwerk (108, 408, 708) darauf trainiert ist, die mehreren Mikrofonsignale für ein gewünschtes Gleichgewicht zwischen einem Signal-Rausch-Verhältnis, SNR (signal-to-noise), und einer Verzerrung des Ziellauts zu verarbeiten.
     
    3. System nach einem der vorhergehenden Ansprüche, wobei das neuronale Netz (108, 408, 708) ein adaptives neuronales Netz ist, das sich an eine sich ändernde Umgebung anpasst, in der das Hörsystem verwendet wird.
     
    4. System nach einem der vorhergehenden Ansprüche, wobei das neuronale Netz (108, 408, 708) mit einem synthetisierten Ziellaut in einem verrauschten Zustand und einer spezifizierten Kostenfunktion trainiert wird.
     
    5. System nach Anspruch 4, wobei der Ziellaut Sprache ist und das neuronale Netz (108, 408, 708) in einem Störgeräuschzustand mit einer klaren Sprache als dem synthetisierten Ziellaut im verrauschten Zustand und einem gemittelten quadrierten Fehler als Kostenfunktion trainiert wird.
     
    6. System nach Anspruch 5, wobei das neuronale Netz (108, 408, 708) mehrere Synapsen und anpassbare synaptische Gewichte umfasst und wobei der gemittelte quadratische Fehler durch Anpassen der synaptischen Gewichte annähernd minimiert wird.
     
    7. System nach einem der vorhergehenden Ansprüche, umfassend ein Paar aus linkem und rechtem Hörgerät, die konfiguriert sind, um vom Träger getragen zu werden und die kommunikativ miteinander gekoppelt sind, wobei das linke und rechte Hörgerät die mehreren Mikrofone (102-1, 102-N), die Steuerschaltung (104) und den Empfänger (106) umfassen.
     
    8. System nach Anspruch 7, ferner umfassend ein Fernmikrofon, das kommunikativ mit dem Paar aus linkem und rechtem Hörgerät gekoppelt und so konfiguriert ist, dass es einen Fernton empfängt und ein Fernmikrofonsignal erzeugt, wobei die Steuerschaltung (104) konfiguriert ist, um das Ausgangssignal unter Verwendung der von den Mikrofonen des linken und rechten Hörgeräts erzeugten Mikrofonsignale und des Fernmikrofonsignals zu erzeugen.
     
    9. Verfahren zum Betreiben eines Hörsystems, Folgendes umfassend:

    Empfangen (1042) von Mikrofonsignalen einschließlich eines Ziellauts von mehreren Mikrofonen (102-1, 102-N) des Hörsystems;

    Verarbeiten (1044) der Mikrofonsignale unter Verwendung eines auf einem neuronalen Netz basierenden nichtlinearen Signalverarbeitungsalgorithmus, um eine Richtwirkung der mehreren Mikrofone (102-1, 102-N) zu steuern und ein Ausgangssignal zu erzeugen; und

    Erzeugen (1046) eines Ausgangstons auf der Grundlage des Ausgangssignals unter Verwendung eines Empfängers (106) des Hörsystems;

    wobei das auf einem neuronalen Netzwerk basierende Verarbeiten einen Eingang zum Empfangen der mehreren Mikrofonsignale und einen Ausgang umfasst, wobei das Verarbeiten Folgendes verwendet:

    einen linearen Signalverarbeitungspfad zwischen dem Eingang und dem Ausgang, wobei der lineare Signalverarbeitungspfad eine oder mehrere lineare verborgene Schichten (428, 726, 728) umfasst; und

    einen nichtlinearen Signalverarbeitungspfad zwischen dem Eingang und dem Ausgang, wobei der nichtlineare Signalverarbeitungspfad eine oder mehrere nichtlineare verborgene Schichten (426, 727) umfasst.


     
    10. Verfahren nach Anspruch 9, wobei das Verarbeiten der Mikrofonsignale das Verarbeiten der Mikrofonsignale unter Verwendung eines neuronalen Netzes (108, 408, 708) umfasst, das auf eine Kostenfunktion und ein gewünschtes Gleichgewicht zwischen einem Signal-Rausch-Verhältnis, SNR, und einer Verzerrung des Ziellauts trainiert ist.
     
    11. Verfahren nach Anspruch 10, wobei der Ziellaut eine Sprache ist, welches das neuronale Netzwerk (108, 408, 708) verwendet, das für das gewünschte Gleichgewicht zwischen dem SNR und der Verzerrung des Ziellauts trainiert wurde, Folgendes umfassend:

    Verwenden eines neuronalen Netzes (108, 408, 708), das mit einem gemittelten quadratischen Fehler als Kostenfunktion und einer klaren Sprache als Ziellaut trainiert wurde; und

    annäherndes Minimieren des gemittelten quadratischen Fehlers durch Anpassen der synaptischen Gewichte im neuronalen Netz (108, 408, 708).


     
    12. Verfahren nach einem der Ansprüche 9 bis 11, wobei das Empfangen der Mikrofonsignale das Empfangen eines linken Mikrofonsignals von einem linken Hörgerät und eines rechten Mikrofonsignals von einem rechten Hörgerät umfasst, wobei das linke und das rechte Hörgerät jeweils in oder um ein Ohr eines Hörgeräteträgers getragen werden.
     
    13. Verfahren nach Anspruch 12, wobei das Empfangen der Mikrofonsignale ferner das Empfangen eines oder mehrerer Mikrofonsignale von einem Mikrofon umfasst, das sich außerhalb der linken und rechten Hörgeräte befindet und kommunikativ mit den linken und rechten Hörgeräten gekoppelt ist, und wobei das Verarbeiten der Mikrofonsignale das Verarbeiten der Mikrofonsignale unter Verwendung mindestens eines Prozessors der linken und rechten Hörgeräte umfasst.
     


    Revendications

    1. Système auditif conçu pour être porté par un utilisateur, comprenant :

    une pluralité de microphones (102-1, 102-N) conçus pour recevoir des sons d'entrée comportant un son cible, et pour produire une pluralité de signaux de microphone comportant le son cible ;

    un circuit de commande (104) conçu pour produire un signal de sortie en utilisant la pluralité de signaux de microphone, le circuit de commande (104) comportant un réseau neuronal (108, 408, 708) et étant conçu pour commander la directivité de la pluralité de microphones (102-1, 102-N) en traitant la pluralité de signaux de microphone en utilisant un algorithme de traitement de signal non linéaire basé sur le réseau neuronal (108, 408, 708) ; et

    un récepteur (106) conçu pour produire un son de sortie en utilisant le signal de sortie ;

    le réseau neuronal (108, 408, 708) comprenant :

    une entrée (424, 724) pour recevoir la pluralité de signaux de microphone ;

    une sortie (430, 730) ;

    un chemin linéaire de traitement de signal entre l'entrée et la sortie, le chemin linéaire de traitement de signal comportant une ou plusieurs couches linéaires cachées (428, 726, 728) ; et

    un chemin non linéaire de traitement de signal entre l'entrée et la sortie, le chemin non linéaire de traitement de signal comportant une ou plusieurs couches non linéaires cachées (426, 727).


     
    2. Système selon la revendication 1, dans lequel le réseau neuronal (108, 408, 708) est entraîné à traiter la pluralité de signaux de microphone pour un équilibre souhaitable entre un rapport signal sur bruit (SNR) et la distorsion du son cible.
     
    3. Système selon l'une quelconque des revendications précédentes, dans lequel le réseau neuronal (108, 408, 708) est un réseau neuronal adaptatif qui s'adapte à un environnement changeant dans lequel le système auditif est utilisé.
     
    4. Système selon l'une quelconque des revendications précédentes, dans lequel le réseau neuronal (108, 408, 708) est entraîné avec un son cible synthétisé dans des conditions de bruit et une fonction de coût spécifiée.
     
    5. Système selon la revendication 4, dans lequel le son cible est la parole, et le réseau neuronal (108, 408, 708) est entraîné avec une parole nette dans des conditions de bruit de babillage en tant que son cible synthétisé dans les conditions de bruit, et avec une erreur quadratique moyenne en tant que fonction de coût.
     
    6. Système selon la revendication 5, dans lequel le réseau neuronal (108, 408, 708) comprend une pluralité de synapses et de poids synaptiques ajustables, et dans lequel l'erreur quadratique moyenne est approximativement minimisée par ajustement des poids synaptiques.
     
    7. Système selon l'une quelconque des revendications précédentes, comprenant une paire d'aides auditives gauche et droite conçues pour être portées par l'utilisateur, et couplées de manière communicative l'une à l'autre, les aides auditives gauche et droite comportant la pluralité de microphones (102-1, 102-N), le circuit de commande (104) et le récepteur (106).
     
    8. Système selon la revendication 7, comprenant en outre un microphone distant couplé de manière communicative à la paire d'aides auditives gauche et droite, et conçu pour recevoir un son distant et produire un signal de microphone distant, le circuit de commande (104) étant conçu pour produire le signal de sortie en utilisant les signaux de microphone produits par les microphones des aides auditives gauche et droite et le signal de microphone distant.
     
    9. Procédé de fonctionnement d'un système auditif, comprenant :

    la réception (1042) de signaux de microphone comportant un son cible provenant d'une pluralité de microphones (102-1, 102-N) du système auditif ;

    le traitement (1044) des signaux de microphone, en utilisant un algorithme non linéaire de traitement de signal basé sur un réseau neuronal, pour commander une directivité de la pluralité de microphones (102-1, 102-N) et produire un signal de sortie ; et

    la production (1046) d'un son de sortie sur la base du signal de sortie en utilisant un récepteur (106) du système auditif ;

    le traitement basé sur le réseau neuronal comprenant une entrée destinée à recevoir la pluralité de signaux de microphone et une sortie, le traitement utilisant :

    un chemin linéaire de traitement de signal entre l'entrée et la sortie, le chemin linéaire de traitement de signal comportant une ou plusieurs couches linéaires cachées (428, 726, 728) ; et

    un chemin non linéaire de traitement de signal entre l'entrée et la sortie, le chemin non linéaire de traitement de signal comportant une ou plusieurs couches non linéaires cachées (426, 727).


     
    10. Procédé selon la revendication 9, dans lequel le traitement des signaux de microphone comprend le traitement des signaux de microphone en utilisant un réseau neuronal (108, 408, 708) entraîné pour une fonction de coût et un équilibre souhaitable entre un rapport signal sur bruit (SNR) et la distorsion du son cible.
     
    11. Procédé selon la revendication 10, dans lequel le son cible est une parole, et l'utilisation du réseau neuronal (108, 408, 708) entraîné pour l'équilibre souhaitable entre le SNR et la distorsion du son cible comprend :

    l'utilisation d'un réseau neuronal (108, 408, 708) entraîné avec une erreur quadratique moyenne en tant que fonction de coût et une parole nette en tant que son cible ; et

    la minimisation approximative de l'erreur quadratique moyenne par ajustement des poids synaptiques dans le réseau neuronal (108, 408, 708).


     
    12. Procédé selon l'une quelconque des revendications 9 à 11, dans lequel la réception des signaux de microphone comprend la réception d'un signal de microphone gauche provenant d'une aide auditive gauche et d'un signal de microphone droit provenant d'une aide auditive droite, les aides auditives gauche et droite étant portées chacune dans une oreille d'un utilisateur d'aides auditives ou autour de celle-ci.
     
    13. Procédé selon la revendication 12, dans lequel la réception des signaux de microphone comprend en outre la réception d'un ou de plusieurs signaux de microphone provenant d'un microphone externe aux aides auditives gauche et droite et couplé de manière communicative aux aides auditives gauche et droite, et le traitement des signaux de microphone comprend le traitement des signaux de microphone en utilisant au moins un processeur des aides auditives gauche et droite.
     




    Drawing

    Cited references

    REFERENCES CITED IN THE DESCRIPTION



    This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

    Patent documents cited in the description