MULTIMICROPHONE AUDIO SYSTEM

(11)	EP 4 482 173 A1

(12)	EUROPEAN PATENT APPLICATION

(43)	Date of publication:
	25.12.2024 Bulletin 2024/52

(21)	Application number: 23180927.8

(22)	Date of filing: 22.06.2023

(51)	International Patent Classification (IPC):
	H04R 3/00 (2006.01); H04R 25/00 (2006.01); G10L 21/0264 (2013.01); H04R 1/40 (2006.01); G10L 21/02 (2013.01)

(52)	Cooperative Patent Classification (CPC):
	H04R 1/406; H04R 3/005; H04R 25/407; H04R 2225/43; H04R 2420/07; G10L 21/02; G10L 21/0264; G10L 2021/02166; G10L 2021/02082

(84)	Designated Contracting States:
	AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR
	Designated Extension States:
	BA
	Designated Validation States:
	KH MA MD TN

(71)	Applicant: GN Audio A/S
	2750 Ballerup (DK)

(72)	Inventors:
	MOWLAEE, Pejman, 2750 Ballerup (DK)
	ZERMINI, Alfredo, 2750 Ballerup (DK)

(74)	Representative: Zacco Denmark A/S
	Arne Jacobsens Allé 15
	2300 Copenhagen S (DK)

(54)	MULTIMICROPHONE AUDIO SYSTEM

(57) An audio system comprising a plurality of distributed microphones and a processor. The processor is configured to obtain a plurality of microphone input signals, determine a plurality of different sets of microphone input signals, determine for each set of microphone input signals a first audio parameter, determine a score for each of the sets of microphone input signals, and output the determined scores.

Description

[0001] The present disclosure relates to an audio system comprising a plurality of distributed microphones, a related audio device and related methods.

BACKGROUND

[0002] Audio systems are used for a wide variety of purposes, one of which is facilitating online meetings. During an online meeting, the audio system may be configured for picking up speech at a near-end and transmitting it to a far-end. To enhance the quality of the picked-up speech, the audio system may perform processing such as noise reduction, echo control, dereverberation, beamforming, etc.

[0003] For audio systems comprising a plurality of distributed microphones, e.g., distributed throughout a meeting room, the audio system may be configured for picking out the best microphone signals or weighting the picked-up microphone signals, and using the selected/weighted microphone signals for further processing. Selecting the best microphone signal or weighting the picked-up microphone signals may facilitate a better enhancement of the quality of the picked-up speech, e.g., by reducing the impact of noise pollutants and other sources of audio distortion.

[0004] Examples of such methods are provided in WO 2016/033364 A1, US 10,728,662 B2, and US 2015/0380010 A1. However, the presented methods still leave room for improvement.

SUMMARY

[0005] Accordingly, there is a need for an improved audio system, improved audio device and related methods.

[0006] According to a first aspect there is provided an audio system comprising a plurality of distributed microphones and a processor. The plurality of distributed microphones comprises at least three distributed microphones. The processor is configured to obtain via the plurality of distributed microphones a plurality of microphone input signals, determine a plurality of different sets of microphone input signals, where each set of microphone input signals comprises at least two microphone input signals, determine for each set of microphone input signals a first audio parameter, determine a score for each of the sets of microphone input signals based on the first audio parameters, and output the determined scores.

[0007] Consequently, the assessment of the score/quality of the microphone signals is not based on the microphone input signals in isolation; instead, the microphone input signals are assessed based on different combinations with other microphone input signals. Hence, the determined score for the set of microphones may better reflect the actual result achievable with the set of microphone input signals. For multichannel processing purposes in particular, looking at the combination of microphone input signals instead of the available signals in isolation may be advantageous. Furthermore, the outputted scores may be provided to an audio engineer, IT personnel, or other relevant persons, thus allowing them to diagnose the audio system and improve the distribution of microphones.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] The above and other features and advantages of the present invention will become readily apparent to those skilled in the art by the following detailed description of example embodiments thereof with reference to the attached drawings, in which:

Fig. 1 schematically illustrates an example system according to the present disclosure, and

Fig. 2A-B is a flow chart of an example method according to the present disclosure.

DETAILED DESCRIPTION

[0009] Various example embodiments and details are described hereinafter, with reference to the figures when relevant. It should be noted that the figures may or may not be drawn to scale and that elements of similar structures or functions are represented by like reference numerals throughout the figures. It should also be noted that the figures are only intended to facilitate the description of the embodiments. They are not intended as an exhaustive description of the invention or as a limitation on the scope of the invention. In addition, an illustrated embodiment need not have all the aspects or advantages shown. An aspect or an advantage described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced in any other embodiments even if not so illustrated, or if not so explicitly described.

[0010] The audio system may be embodied by one or more audio devices. The audio system may be embodied by two, three, or four audio devices distributed throughout a meeting room. The audio system may comprise a central processing unit communicatively connected to the plurality of distributed microphones. The central processing unit may be a processor of one or more audio devices. The audio system may be embodied as a conference system. The audio system may comprise video capabilities for transmitting a video stream to a far-end. The audio system may be formed by a plurality of distributed microphones being communicatively connected to a processor.

[0011] The plurality of distributed microphones may be a plurality of microphones spatially separated from each other, e.g., by the plurality of microphones being located on two or more audio devices or arranged at different locations throughout a meeting room. The plurality of distributed microphones may be a plurality of microphones distributed at different positions within a room, such as a meeting room. The plurality of distributed microphones may be a plurality of microphones where the spatial arrangement between the microphones is not known in advance. The plurality of distributed microphones may be a plurality of microphones where the plurality of microphones has been distributed to improve the coverage of the plurality of microphones. The plurality of distributed microphones comprises at least three microphones, which may be distributed on two or more audio devices. The plurality of distributed microphones may comprise more than three microphones, such as five, ten, fifteen or twenty microphones. The plurality of distributed microphones may be a plurality of microphones where two or more of the microphones are located at a first audio device, e.g., as microphones comprised by a first speakerphone, and one or more microphones are located at a second audio device, e.g., as one or more microphones comprised by a second speakerphone.

[0012] The plurality of distributed microphones and the processor may be communicatively connected to each other via a wired or a wireless connection. Each of the plurality of distributed microphones may be associated with a transmitter unit or a transceiver unit allowing them to communicate with a receiver unit or a transceiver unit associated with the processor, thus allowing the processor to receive microphone input signals from the microphones. The plurality of microphone input signals may be obtained by the processor by receiving the plurality of microphone input signals from the plurality of distributed microphones.

[0013] The plurality of microphone input signals may be digital signals or analog signals. The plurality of microphone input signals may be obtained by the plurality of distributed microphones as analog signals, then be converted to digital signals, and subsequently be transmitted to the processor as digital signals. Alternatively, the plurality of distributed microphones may transmit the plurality of microphone input signals as analog signals and then the processor may convert the analog signals to digital microphone input signals. The processor may comprise or be communicatively connected to an analog to digital converter.

[0014] The processor is configured to obtain via the plurality of distributed microphones a plurality of microphone input signals. The processor may receive the plurality of microphone input signals over a wireless or a wired connection with the plurality of distributed microphones. The processor may obtain the plurality of microphone input signals as analog or digital signals.

[0015] The processor is configured to determine a plurality of different sets of microphone input signals. The processor may be configured to determine a set of microphone input signals for each combination of microphone input signals. In an example, where the plurality of distributed microphones comprises three microphones, each microphone obtaining a microphone input signal, the different sets of microphone input signals may be expressed in a vector comprising every combination of microphone input signals:

$$C = [c_{1,2},\; c_{1,3},\; c_{2,3},\; c_{1,2,3}]$$

$c_{1,2}$ being the set of microphone input signals comprising microphone input signals 1 and 2 from microphones 1 and 2, and so forth for the rest of the elements of the vector, and $C$ may be called the set vector which comprises the plurality of different sets of microphone input signals. In some embodiments, the processor may be configured to determine a limited number of sets of microphone input signals, i.e., not determining a set for each combination of microphone input signals. Instead, the processor may be limited to determining only sets of microphone input signals comprising 2, 3, 4 or 5 microphone input signals. Determining only a limited number of sets of microphone input signals may reduce the processing required while still allowing the best microphone set for further multichannel processing to be determined.

[0016] Different sets of microphone input signals should be understood to mean that each set comprises a unique combination of microphone input signals. Thus, different sets of microphone input signals are not just different permutations of the same microphone input signals.
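The enumeration of sets described in this paragraph can be sketched in a few lines. This is a minimal illustration only, assuming microphone input signals are identified by integer ids; the function name `enumerate_sets` and its `min_size`/`max_size` parameters are hypothetical and do not appear in the disclosure.

```python
from itertools import combinations


def enumerate_sets(mic_ids, min_size=2, max_size=None):
    """Return every unique combination of microphone ids with min_size..max_size members."""
    mic_ids = list(mic_ids)
    if max_size is None:
        max_size = len(mic_ids)
    max_size = min(max_size, len(mic_ids))
    sets = []
    for size in range(min_size, max_size + 1):
        # combinations() yields each subset exactly once, so permutations of
        # the same microphones are never counted as different sets.
        sets.extend(combinations(mic_ids, size))
    return sets


# Three microphones, every combination of at least two signals.
all_sets = enumerate_sets([1, 2, 3])

# Five microphones, limited to pairs to reduce the processing required.
pairs_only = enumerate_sets([1, 2, 3, 4, 5], max_size=2)
```

Capping `max_size` is one way to realise the limited number of sets mentioned above, since the number of combinations grows quickly with the number of microphones.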

[0017] The processor is configured to determine for each set of microphone input signals a first audio parameter. To determine the first audio parameter, the processor may combine the microphone input signals within a set. In an example, the processor combines the microphone input signals within a set into a single signal by beamforming the microphone input signals within a set, to thereby determine a beamformed signal. The processor may be configured to process the beamformed signal to determine the first audio parameter. The processor may process the beamformed signal to determine one or more first audio parameters, such as a signal to noise ratio, a direct to reverberation ratio, an echo to signal ratio, or a non-intrusive speech quality score. The processor may be configured for each set of microphone input signals to determine a corresponding beamformed signal, and based on the determined beamformed signals determine the first audio parameters.

[0018] In the above presented example, the processor combines the microphone input signals within each set into a single signal and then determines the first audio parameters. However, it is not necessary to combine the microphone input signals before determining the first audio parameter. The processor may be configured to determine a noise covariance matrix, and/or an echo covariance matrix and/or a clean speech covariance matrix for each set, thus not requiring the microphone input signals within a set to be combined before the first audio parameters are determined.

[0019] The first audio parameter may be viewed as a parameter which describes the whole set of microphone input signals, and not the individual elements within the set of microphone input signals.
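The beamforming example described earlier, in which the signals of a set are combined into one signal from which a signal to noise ratio is determined, could look roughly as follows. This is a sketch under stated assumptions, not the disclosed implementation: the signals are assumed to be time-aligned already, the beamformer is a plain delay-and-sum with zero steering delay, and the first `noise_len` samples are assumed to contain noise only. All function names are hypothetical.

```python
import math


def delay_and_sum(signals):
    """Average time-aligned signals sample by sample (zero steering delay assumed)."""
    n = min(len(s) for s in signals)
    return [sum(s[i] for s in signals) / len(signals) for i in range(n)]


def power(samples):
    """Mean squared amplitude of a block of samples."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(signal, noise_len):
    """Estimate SNR in dB, assuming the first noise_len samples are noise only."""
    noise_power = power(signal[:noise_len])
    # Subtract the noise power from the noisy-speech power (stationary noise assumed).
    speech_power = max(power(signal[noise_len:]) - noise_power, 1e-12)
    return 10.0 * math.log10(speech_power / max(noise_power, 1e-12))


def first_audio_parameter(mic_signals, mic_set, noise_len):
    """Beamform the signals of one set and return the SNR of the beamformed signal."""
    beamformed = delay_and_sum([mic_signals[m] for m in mic_set])
    return snr_db(beamformed, noise_len)


# Toy data: microphone 3 is much noisier than microphones 1 and 2.
mic_signals = {
    1: [0.1, -0.1, 0.1, -0.1, 1.1, 0.9, 1.1, 0.9],
    2: [0.1, 0.1, -0.1, -0.1, 1.1, 1.1, 0.9, 0.9],
    3: [0.5, -0.5, 0.5, -0.5, 1.5, 0.5, 1.5, 0.5],
}
snr_12 = first_audio_parameter(mic_signals, (1, 2), noise_len=4)
snr_13 = first_audio_parameter(mic_signals, (1, 3), noise_len=4)
```

In this toy data the set (1, 2) yields a higher signal to noise ratio than the set (1, 3), which illustrates how the parameter describes the set as a whole and not microphone 1 in isolation.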

[0020] In one example, the microphone input signals may be intended to be used in a multi-channel Wiener filtering solution for denoising, where $x = s + v$, $x$ is the noisy observation, $s$ is the clean speech signal and $v$ is the background noise. From adaptive filter theory, assuming $d$ as the desired signal, the optimal filter weights $H_{MWF,opt}$ may be given by:

$$H_{MWF,opt} = \Phi_{xx}^{-1}\,\Phi_{xd}$$

where $\Phi_{xd}$ is the cross-power spectral density between the noisy and desired signal, and $\Phi_{xx}$ is the auto-power spectral density of the microphone input signals. The processor may be configured to determine/estimate the two power spectral density elements for each set of microphone input signals. The two power spectral density elements may then be viewed as the first audio parameter.
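As a hedged illustration of how the two power spectral density elements might be estimated in practice, the following uses assumptions that are not stated in the disclosure: the desired signal is taken to be the speech component at a reference microphone, $d = e^{T}s$ with $e$ a selection vector, and $s$ and $v$ are taken to be zero-mean and mutually uncorrelated. The symbols $e$, $\Phi_{ss}$ and $\Phi_{vv}$ are introduced here for illustration.

```latex
\begin{aligned}
\Phi_{xx} &= E\{x x^{H}\} = \Phi_{ss} + \Phi_{vv}, \\
\Phi_{xd} &= E\{x d^{*}\} = E\{(s + v)\, s^{H}\}\, e = \Phi_{ss}\, e, \\
H_{MWF,opt} &= \Phi_{xx}^{-1} \Phi_{xd}
             = \Phi_{xx}^{-1} \left(\Phi_{xx} - \Phi_{vv}\right) e .
\end{aligned}
```

Under these assumptions, both elements follow from quantities that can be estimated from the microphone input signals of a set: $\Phi_{xx}$ from the noisy observations and $\Phi_{vv}$ from noise-only segments.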

[0021] The processor is configured to determine a score for each of the sets of microphone input signals based on the first audio parameters. The score may be a relative score between the sets of microphone input signals, e.g., the score may define a ranking between sets of the multimicrophone signals. For example, the processor may score each set based on a signal to noise ratio and rank the sets relative to each other from lowest signal to noise ratio to highest signal to noise ratio. The processor may score each set based on the associated first audio parameters and rank the sets relative to each other based on their associated first audio parameters. The score may be determined from a continuous range of values, e.g., 1-100. The score may be determined from a discrete range of values, e.g., 1, 2, 3, 4, or 5. The score may be determined based on one or more threshold values, e.g., if the first audio parameter exceeds the threshold value the processor may assign a first score, and if the first audio parameter does not exceed the threshold value the processor may assign a second score. The first score could be a score indicating the set to be good, and the second score could be a score indicating the set to be bad.
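The three scoring options described in this paragraph (relative ranking, a continuous range, and a threshold) can be sketched as follows. The function names, the example signal to noise ratio values and the threshold of 16 dB are illustrative assumptions only.

```python
def rank_scores(params):
    """Rank sets from lowest (rank 1) to highest parameter value."""
    ordered = sorted(params, key=params.get)
    return {mic_set: rank for rank, mic_set in enumerate(ordered, start=1)}


def scaled_scores(params, low=1.0, high=100.0):
    """Map parameter values linearly onto a continuous range such as 1-100."""
    smallest, largest = min(params.values()), max(params.values())
    if largest == smallest:
        return {mic_set: high for mic_set in params}
    span = largest - smallest
    return {s: low + (v - smallest) * (high - low) / span for s, v in params.items()}


def threshold_scores(params, threshold, first_score="good", second_score="bad"):
    """Assign the first score if the parameter exceeds the threshold, else the second."""
    return {s: first_score if v > threshold else second_score for s, v in params.items()}


# Hypothetical signal to noise ratios in dB, one per set.
snr_by_set = {(1, 2): 23.0, (1, 3): 10.5, (2, 3): 15.0, (1, 2, 3): 18.0}

ranking = rank_scores(snr_by_set)
continuous = scaled_scores(snr_by_set)
labels = threshold_scores(snr_by_set, threshold=16.0)
```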

[0022] To continue the above example with the multi-channel Wiener filtering solution for denoising, the processor may determine for each set of microphone input signals an SNR value based on the power spectral density elements and score the set of microphone input signals based on the determined SNR values. An alternative is to choose the set which leads to the lowest mean square error cost among the Wiener filter solutions. The lowest mean square error cost may be found by the following:
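As a hedged illustration only, standard Wiener filter theory (a textbook result, not necessarily the exact expression intended in this disclosure) gives the minimum mean square error attained by the optimal weights of a set in terms of the same power spectral density elements. The set index $k$ and the symbol $\phi_{dd}$, the auto-power spectral density of the desired signal, are assumptions introduced here.

```latex
\begin{aligned}
J^{(k)}(H) &= E\left\{ \left| d - H^{H} x^{(k)} \right|^{2} \right\}, \\
J^{(k)}_{\min} &= \phi_{dd}
  - \left(\Phi_{xd}^{(k)}\right)^{H} \left(\Phi_{xx}^{(k)}\right)^{-1} \Phi_{xd}^{(k)}, \\
k^{\star} &= \arg\min_{k} \; J^{(k)}_{\min} .
\end{aligned}
```

Such a comparison is only meaningful if the same desired signal $d$ is used for every set, since $\phi_{dd}$ would otherwise differ between sets.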

[0023] The processor may be configured to output the determined scores. The processor may use the outputted score to determine which microphone input signals to use for further processing. The processor may output scores to another processor. The processor may output the scores to be used for diagnostics of a meeting room, e.g., determining areas with combination of microphones which score well and areas with combination of microphones which score badly.

[0024] In an embodiment the processor is configured to determine for each set of microphone input signals a second audio parameter and determine the score for each of the sets of microphone input signals based on the first audio parameters and the second audio parameters.

[0025] Consequently, the scoring of the sets is based on more parameters, thus making the scoring of the sets more robust. For example, during a double talk situation, i.e., where the far-end and the near-end are transmitting simultaneously, it may not necessarily be advantageous to select the multichannel signal with the highest signal-to-noise ratio, as the same multichannel signal may exhibit a large echo-to-signal ratio. Deciding based on both parameters may therefore be preferable.

[0026] The second audio parameter may be an audio parameter differing from the first audio parameter.

[0027] Although only a first and second audio parameter is mentioned, a third, a fourth and so forth audio parameter may be determined for each set and used in determining a score for the set.

[0028] In an embodiment the processor is configured to determine for each set of microphone input signals a weighted sum of the respective first audio parameter and respective second audio parameter and determine the score for each of the sets of microphone input signals based on the weighted sum associated with each multichannel signal.

[0029] A weighted sum provides an easy and simple manner to consider several audio parameters.

[0030] The weights associated with different audio parameters may be user defined, i.e., determined by a user of the audio system. The weights associated with different audio parameters may be determined during a tuning/set-up process of the audio system. The weights may be pre-set by a provider of the audio system, e.g., the audio system may comprise pre-set factory settings comprising weights associated with different audio parameters.

[0031] Although a weighted sum provides a good approach to consider several audio parameters, other approaches may also be usable. For example, a ranking approach may be selected where each set is ranked in relation to each other based on the first audio parameter and ranked in relation to each other based on the second audio parameter, and the average rank obtained by a set is the score for the set.
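Both ways of combining two audio parameters, the weighted sum and the average rank, can be sketched as follows. The example values, the weights and the function names are assumptions for illustration: the first parameter is taken to be a signal to noise ratio (higher is better) and the second an echo to signal ratio (lower is better), which is why the second weight is negative.

```python
def weighted_sum_scores(first, second, w_first, w_second):
    """Score each set as a weighted sum of its first and second audio parameter."""
    return {s: w_first * first[s] + w_second * second[s] for s in first}


def ranks(values, higher_is_better):
    """Rank sets so that rank 1 is the best set for this parameter."""
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    return {s: rank for rank, s in enumerate(ordered, start=1)}


def average_rank_scores(first, second, first_higher_is_better, second_higher_is_better):
    """Score each set by its average rank over both parameters (lower is better)."""
    r_first = ranks(first, first_higher_is_better)
    r_second = ranks(second, second_higher_is_better)
    return {s: (r_first[s] + r_second[s]) / 2 for s in first}


# Hypothetical values in dB for three sets during double talk.
snr = {(1, 2): 20.0, (1, 3): 12.0, (2, 3): 16.0}
esr = {(1, 2): 5.0, (1, 3): -6.0, (2, 3): -8.0}

weighted = weighted_sum_scores(snr, esr, w_first=1.0, w_second=-1.0)
avg_rank = average_rank_scores(snr, esr, True, False)

best_weighted = max(weighted, key=weighted.get)
best_ranked = min(avg_rank, key=avg_rank.get)
```

In this example the set (1, 2) has the highest signal to noise ratio but also the largest echo to signal ratio, so both approaches prefer the set (2, 3).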

[0032] In an embodiment the processor is configured to determine a plurality of third audio parameters, each third audio parameter being associated with a respective microphone input signal, compare each of the third audio parameters to a first sorting criterion, and discard microphone input signals associated with a third audio parameter not passing the first sorting criterion, where the created sets of microphone input signals do not comprise discarded microphone input signals.

[0033] Discarding microphone input signals may reduce the complexity and processing power required in determining the sets of microphones, and subsequently in determining audio parameters associated with the sets, as there will be fewer sets and audio parameters to determine. Furthermore, faulty microphones giving misleading microphone input signals may be sorted away, thus avoiding faulty microphones skewing the results.

[0034] The third audio parameter may be an audio parameter differing from the first audio parameter and/or the second audio parameter.

[0035] The third audio parameter may be a sound pressure level, a signal to noise ratio or a non-intrusive speech quality predictor, such as NS-MOS, AEC-MOS, NORRESQA, NISQA, SRMR.

[0036] Although only a first sorting criterion is mentioned, a second sorting criterion, a third sorting criterion and so forth may also be present. Each sorting criterion may be associated with its own audio parameter. Consequently, the microphone input signals may go through several rounds of sorting before sets of microphone input signals are determined.

[0037] The first sorting criterion may be a threshold value, or a range of acceptable values. The first sorting criterion may be set by a user of the audio system. The first sorting criterion may be set during a tuning process of the audio system. The first sorting criterion may be pre-set by a provider of the audio system, e.g., as part of pre-set factory settings.

[0038] The first sorting criterion may be based on a sound pressure level, e.g., if a microphone input signal exhibits a sound pressure level near 0 dB or an abnormally high sound pressure level, it may indicate the microphone associated with the microphone input signal is faulty. The first sorting criterion may be based on a signal to noise ratio, e.g., if the signal to noise ratio of a microphone input signal is determined to be low it may be sorted away, as even in combination with other signals it may not positively contribute. The first sorting criterion may be based on a non-intrusive speech quality predictor. Although some specific examples for the first sorting criterion have been mentioned, the present disclosure is not limited to these and other audio parameters may be equally applicable, e.g., MOS score, speech presence probability, signal to echo ratio, RT60, direct-to-reverberation ratio, etc.
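The sorting step can be sketched as a filter that runs before any sets are created. This is an illustration under assumptions: the level is computed in dB relative to digital full scale rather than as a calibrated sound pressure level (which would require knowledge of the microphone sensitivity), and the acceptable range of -60 dB to -3 dB is an arbitrary example. The function names are hypothetical.

```python
import math


def level_db(samples, reference=1.0):
    """RMS level in dB relative to `reference` (full scale assumed, not calibrated SPL)."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-12) / reference)


def discard_failing(mic_signals, low_db, high_db):
    """Keep only the microphone input signals whose level lies in [low_db, high_db]."""
    kept = {}
    for mic_id, samples in mic_signals.items():
        if low_db <= level_db(samples) <= high_db:
            kept[mic_id] = samples
    return kept


mic_signals = {
    1: [0.1, -0.1] * 4,    # plausible level, about -20 dB
    2: [0.0] * 8,          # silent signal, possibly a dead microphone
    3: [1.0, -1.0] * 4,    # abnormally high level, possibly clipping
    4: [0.05, -0.05] * 4,  # plausible level, about -26 dB
}
kept = discard_failing(mic_signals, low_db=-60.0, high_db=-3.0)
kept_ids = sorted(kept)
```

Only the kept signals would then be passed on to the determination of sets, so that no set comprises a discarded microphone input signal.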

[0039] The third audio parameter may be selected from the following: a signal to noise ratio, a direct to reverberation ratio, an echo to signal ratio, a non-intrusive speech quality score, and a speech presence probability estimate.

[0040] In an embodiment the first audio parameter, and/or the second audio parameter are selected from the following: a signal to noise ratio, a direct to reverberation ratio, an echo to signal ratio, a non-intrusive speech quality score, a speech presence probability estimate, a noise covariance matrix, a clean speech covariance matrix, an echo covariance matrix, cross power spectral density, an auto power spectral density and a noise power spectral density estimation.

[0041] In an embodiment the processor is configured to synchronize the obtained microphone input signals.

[0042] The synchronization may be carried out by cross-correlation of the microphone input signals. The synchronization may be carried out based on a known distribution of the microphones. The synchronization may be carried out based on time stamps associated with the microphone input signals. The synchronization may be carried out temporally with respect to a clock. Synchronization may be carried out by cross-attention layers in a neural network.

[0043] Synchronization of the microphone input signals may be carried out prior to determining the different sets of microphone input signals.
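Synchronization by cross-correlation can be sketched as a search for the lag at which one microphone input signal best matches a reference signal. This is a minimal time-domain sketch with hypothetical function names; it assumes integer sample delays and a small search range, whereas a practical system might use a frequency-domain method.

```python
def best_lag(reference, other, max_lag):
    """Return the lag in samples at which `other` correlates best with `reference`."""
    best, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        value = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                value += r * other[j]
        if value > best_value:
            best, best_value = lag, value
    return best


def align(other, lag):
    """Shift `other` by `lag` samples so it lines up with the reference, zero-padding."""
    n = len(other)
    if lag >= 0:
        return other[lag:] + [0.0] * lag
    return [0.0] * (-lag) + other[:n + lag]


reference = [0.0, 0.0, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 0.0, 0.0, 0.0]  # same event, 2 samples later

lag = best_lag(reference, delayed, max_lag=4)
synchronized = align(delayed, lag)
```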

[0044] In an embodiment the processor is configured to perform multichannel processing on the plurality of microphone input signals from the set of microphones associated with the top scoring set of microphone input signals to provide an output audio signal.

[0045] Consequently, after determining the top scoring set of microphone input signals, multichannel processing may be performed on the plurality of microphone input signals from the set of microphones associated with the top scoring set of microphone input signals.

[0046] The output audio signal may be an audio signal which is to be transmitted to a far-end, e.g., another audio system or another audio device.

[0047] In an embodiment to perform multichannel processing comprises one or more of the following: bandwidth extension, denoising, dereverberation, echo control, direction of arrival estimation, and beamforming.

[0048] In an embodiment, the processor is configured to determine a change in room acoustics based on the plurality of microphone input signals, compare the change in room acoustics to a first update criterion, and, if the change in room acoustics exceeds the first update criterion, re-determine the score for each of the sets of microphone input signals and output the re-determined scores.

[0049] Consequently, the audio system may in an adaptive manner determine the score for each set of microphones.

[0050] The change in room acoustics may be caused by movement of an object or a person in the vicinity of the audio system. The change in room acoustics may be caused by the addition or removal of an object or a person in the vicinity of the audio system.

[0051] The change in room acoustics may be determined by determining a change over time in one or more audio parameters of the plurality of microphone input signals. The change in room acoustics may be determined by determining a change over time in one or more audio parameters in each of the plurality of microphone input signals. The change in room acoustics may be determined by analysing microphone input signals associated with the top scoring set of microphone input signals.

[0052] The first update criterion may comprise one or more threshold values.

[0053] Re-determining the score for each set of microphone input signals may comprise repeating some or all of the processing previously carried out to determine the score for each set of microphone input signals, such as, obtain via the plurality of distributed microphones a plurality of microphone input signals, determine a plurality of different sets of microphone input signals, determine for each set of microphone input signals a first audio parameter, determine a score for each of the sets of microphone input signals based on the first audio parameters, and output the determined scores.

[0054] In an embodiment to determine the change in room acoustics comprises to determine one or more first impulse responses at a first time based on the plurality of microphone input signals, determine one or more second impulse responses at a second time based on the plurality of microphone input signals, determine the change in room acoustics, where the determination of the change in room acoustics is based on a difference between the one or more first impulse responses and the one or more second impulse responses.

[0055] Consequently, a simple method is provided for determining a change in room acoustics. Furthermore, the determination of an impulse response is part of the normal processing pipeline for audio systems dealing with echo control, hence, minimal additional processing logic is needed for determining the change in room acoustics.
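The comparison of impulse responses at two points in time can be sketched as follows. The normalised squared difference used here and the threshold value of 0.1 are assumptions chosen for illustration; the disclosure only states that the change is based on a difference between the impulse responses. The function names are hypothetical.

```python
def impulse_response_change(h_first, h_second):
    """Squared difference between two impulse responses, normalised by the first one's energy."""
    n = min(len(h_first), len(h_second))
    diff = sum((h_first[i] - h_second[i]) ** 2 for i in range(n))
    energy = sum(h_first[i] ** 2 for i in range(n))
    return diff / max(energy, 1e-12)


def needs_rescoring(first_responses, second_responses, threshold):
    """Return True if any impulse response changed by more than the update criterion."""
    changes = [
        impulse_response_change(a, b)
        for a, b in zip(first_responses, second_responses)
    ]
    return max(changes) > threshold


h_at_first_time = [1.0, 0.5, 0.25, 0.125]
h_unchanged = [1.0, 0.5, 0.25, 0.125]
h_after_move = [1.0, 0.2, 0.6, 0.125]  # e.g., a person moved near the device

unchanged = needs_rescoring([h_at_first_time], [h_unchanged], threshold=0.1)
changed = needs_rescoring([h_at_first_time], [h_after_move], threshold=0.1)
```

When the function returns True, the scores for the sets would be re-determined and output again; otherwise the existing scores would be kept.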

[0056] The first impulse response may be a room impulse response. The second impulse response may be a room impulse response.

[0057] In an embodiment microphone input signals associated with the top scoring set of microphones are used in determining the change in room acoustics.

[0058] Thus, the determination of room acoustics may be done after determining the top scoring set of microphone input signals.

[0059] In an embodiment the audio system comprises a first audio device comprising one or more first microphones belonging to the plurality of distributed microphones, a first wireless communication interface, and a first audio processor, and a second audio device comprising one or more second microphones belonging to the plurality of distributed microphones, a second wireless communication interface, and a second audio processor, and where the first audio device is communicatively connected to the second audio device via the first wireless communication interface and the second wireless communication interface, and where the first audio processor and/or the second audio processor are configured to determine and output the score for the different sets of microphone input signals.

[0060] The first audio device and the second audio device may be two speakerphones arranged in a meeting room. Even for modern speakerphones it may be required to have two or more speakerphones to provide proper audio pick-up coverage of a large meeting room. The first audio device and the second audio device may be different types of audio devices, e.g., the first audio device may be a headset and the second audio device may be a speakerphone.

[0061] All microphones of the plurality of distributed microphones may be comprised by the first audio device and the second audio device.

[0062] The first audio device may comprise one, two, three, four, five or more first microphones belonging to the plurality of distributed microphones. The second audio device may comprise one, two, three, four, five or more second microphones belonging to the plurality of distributed microphones.

[0063] According to a second aspect of the invention there is provided an audio device comprising one or more first microphones, a first wireless communication interface, and a first processor, where the first processor is configured to obtain via the one or more first microphones one or more first microphone input signals, receive one or more second microphone input signals from one or more additional audio devices communicatively connected to the audio device, determine a plurality of different sets of microphone input signals using the one or more first microphone input signals and the one or more second microphone input signals, where each set of microphone input signals comprises at least two microphone input signals, determine for each set of microphone input signals a first audio parameter, determine a score for each of the sets of microphone input signals based on the first audio parameters, and output the determined scores.

[0064] In the following the general term audio device is used. The description given in relation to the audio device is equally applicable to the first audio device, the second audio device, or any other audio device mentioned herein.

[0065] The audio device may be configured to be worn by a user. The audio device may be arranged at the user's ear, on the user's ear, over the user's ear, in the user's ear, in the user's ear canal, behind the user's ear, and/or in the user's concha, i.e., the audio device is configured to be worn at the user's ear.

[0066] The audio device may be configured to be worn by a user at each ear, e.g., a pair of ear buds or a headset with two earcups. In the embodiment where the audio device is to be worn at both ears, the components meant to be worn at each ear may be connected, such as wirelessly connected and/or connected by wires, and/or by a strap and/or by a headband. The components meant to be worn at each ear may be substantially identical or differ from each other.

[0067] The audio device may be a hearable such as a headset, headphones, earphones, ear bud, hearing aids, an over the counter (OTC) hearing device, a hearing protection device, a one-size-fits-all audio device, a custom audio device or another head-wearable audio device.

[0068] The audio device may be a speaker phone, or another audio device forming part of a conference system. The audio device may be an audio device not configured to be worn by a user.

[0069] The interface may comprise a wireless transceiver, also denoted as a radio transceiver, and an antenna for wireless transmission and reception of an audio signal, such as for wireless transmission of an output signal and/or wireless reception of a wireless input signal. The audio device may be configured for wireless communication with one or more electronic devices, such as another audio device, a smartphone, a tablet, a computer and/or a smart watch. The audio device optionally comprises an antenna for converting one or more wireless input audio signals to antenna output signal(s). The audio device may be configured for wireless communications via a wireless communication system, such as short-range wireless communications systems, such as Wi-Fi, Bluetooth, Zigbee, IEEE 802.11, IEEE 802.15, infrared and/or the like. The audio device may be configured for wireless communications via a wireless communication system, such as a 3GPP system, such as a 3GPP system supporting one or more of: New Radio (NR), Narrow-band IoT (NB-IoT), Long Term Evolution - enhanced Machine Type Communication (LTE-M), and millimeter-wave communications, such as millimeter-wave communications in licensed bands, such as device-to-device millimeter-wave communications in licensed bands. In one or more example audio devices the interface of the audio device comprises one or more of: a Bluetooth interface, a Bluetooth low energy interface, and a magnetic induction interface. For example, the interface of the audio device may comprise a Bluetooth antenna and/or a magnetic induction antenna. In one or more example audio devices, the interface may comprise a connector for wired communication, such as by using an electrical cable. The connector may connect one or more microphones to the audio device. The connector may connect the audio device to an electronic device, e.g., for wired connection. The one or more interfaces can be or comprise wireless interfaces, such as transmitters and/or receivers, and/or wired interfaces, such as connectors for physical coupling.

[0070] The audio device comprises a plurality of input transducers. The plurality of input transducers may comprise a plurality of microphones. The plurality of input transducers may be configured for converting an acoustic signal into an electric input signal. The electric input signal may be an analog signal. The electric input signal may be a digital signal. The plurality of input transducers may be coupled to one or more analog-to-digital converters configured for converting the analog input signal into a digital input signal.

[0071] The audio device may comprise one or more antennas configured for wireless communication. The one or more antennas may comprise an electric antenna. The electric antenna is configured for wireless communication at a first frequency. The first frequency may be above 800 MHz, preferably a frequency between 900 MHz and 6 GHz. The first frequency may be 902 MHz to 928 MHz. The first frequency may be 2.4 GHz to 2.5 GHz. The first frequency may be 5.725 GHz to 5.875 GHz. The one or more antennas may comprise a magnetic antenna. The magnetic antenna may comprise a magnetic core. The magnetic antenna comprises a coil. The coil may be coiled around the magnetic core. The magnetic antenna is configured for wireless communication at a second frequency. The second frequency may be below 100 MHz. The second frequency may be between 9 MHz and 15 MHz.

[0072] The audio device may comprise one or more wireless communication units. The one or more wireless communication units may comprise one or more wireless receivers, one or more wireless transmitters, one or more transmitter-receiver pairs, and/or one or more transceivers. At least one of the one or more wireless communication units may be coupled to the one or more antennas. The wireless communication unit may be configured for converting a wireless signal received by at least one of the one or more antennas into an electric input signal. The audio device may be configured for wired/wireless audio communication, e.g., enabling the user to listen to media, such as music or radio, and/or enabling the user to perform phone calls.

[0073] The audio device may be configured for wireless communication with one or more external devices, such as one or more accessory devices, such as a smartphone and/or a smart watch.

[0074] The audio device may comprise one or more processing units. The processing unit may be configured for processing one or more input signals. The processing may comprise compensating for a hearing loss of the user, i.e., applying frequency dependent gain to input signals in accordance with the user's frequency dependent hearing impairment. The processing may comprise performing feedback cancellation, beamforming, tinnitus reduction/masking, noise reduction, noise cancellation, speech recognition, bass adjustment, treble adjustment, face balancing, echo control, and/or processing of user input. The processing unit may be a processor, an integrated circuit, an application, a functional module, etc. The processing unit may be implemented in a signal-processing chip or a printed circuit board (PCB). The processing unit is configured to provide an electric output signal based on the processing of one or more input signals. The processing unit may be configured to provide one or more further electric output signals. The one or more further electric output signals may be based on the processing of one or more input signals. The processing unit may comprise a receiver, a transmitter and/or a transceiver for receiving and transmitting wireless signals. The processing unit may control one or more playback features of the audio device.

[0075] The audio device may comprise an output transducer. The output transducer may be coupled to the processing unit. The output transducer may be a loudspeaker, or any other device configured for converting an electrical signal into an acoustical signal. The receiver may be configured for converting an electric output signal into an acoustic output signal.

[0076] The wireless communication unit may be configured for converting an electric output signal into a wireless output signal. The wireless output signal may comprise synchronization data. The wireless communication unit may be configured for transmitting the wireless output signal via at least one of the one or more antennas.

[0077] The audio device may comprise a digital-to-analog converter configured to convert an electric output signal or a wireless output signal into an analog signal.

[0078] The audio device may comprise a power source. The power source may comprise a battery providing a first voltage. The battery may be a rechargeable battery. The battery may be a replaceable battery. The power source may comprise a power management unit. The power management unit may be configured to convert the first voltage into a second voltage. The power source may comprise a charging coil. The charging coil may be provided by the magnetic antenna.

[0079] The audio device may comprise a memory, including volatile and non-volatile forms of memory.

[0080] According to a third aspect there is provided a computer implemented method for selecting a set of microphones among a plurality of distributed microphones in an audio system, the method comprising obtaining via the plurality of distributed microphones a plurality of microphone input signals, determining a plurality of different sets of microphone input signals, where each set of microphone input signals comprises at least two microphone input signals, determining for each set of microphone input signals a first audio parameter, determining a score for each of the sets of microphone input signals based on the first audio parameters, outputting the determined scores.

[0081] Fig. 1 schematically illustrates an example system, such as an audio system 2 according to the present disclosure. The audio system 2 comprises an audio device 10 comprising a memory 10A, an interface 10B, a processor 10C, one or more speakers 10D, and one or more microphones, including a first microphone 10E. The audio device 10 may be configured to obtain audio signals, output audio signals, and process audio signals. The audio device 10 may be a speakerphone, e.g., configured to be used by a party (such as one or more users 1A at a near-end) to communicate with one or more other parties (such as one or more users 1B at a far-end). The audio device 10 may be used for a conference and/or a meeting between two or more parties being remote from each other. The audio device 10 may be used by one or more users in a vicinity of where the speakerphone 10 is located, also referred to as a near-end.

[0082] In one or more example systems, the audio system 2 comprises one or more additional audio devices, including a second audio device 10'. The second audio device 10' may be substantially identical to the audio device 10, i.e., the second audio device 10' may comprise a memory, an interface, a processor, one or more speakers, and one or more microphones.

[0083] In one or more example systems, the audio system 2 comprises one or more additional microphones 10E' spatially separated from the audio device 10 and communicatively connected with the audio device 10 via a wireless connection 15 or alternatively a wired connection.

[0084] Optionally, the audio system 2 comprises an electronic device 60. The electronic device 60 may for example be or comprise a smartphone, a smart watch, a conference hub, a smart TV, smart speakers, a tablet, or a computer, such as a laptop computer or PC. In other words, the electronic device 60 may for example be a user device of a user 1A, such as a mobile phone or a computer, configured to communicate with the speakerphone 10. In one or more example systems and/or speakerphones, the electronic device 60 may be seen as a user accessory device, such as a mobile phone, a smart watch, a tablet, and/or a wearable gadget.

[0085] Optionally, the audio system 2 is communicatively connected to a far-end communication device 30. The communication device 30 may be seen as a communication device used by one or more far-end users 1, 1B to communicate with the one or more users 1, 1A at the near-end, e.g., via a network 40 such as global network, e.g., the internet, and/or a local network. The communication device 30 may be configured to obtain 38 a microphone input signal indicative of speech from one or more users 1B at the far-end. The communication device 30 may be configured to process the microphone input signal for provision of an external output signal. The communication device 30 may be configured to transmit 22 the external output signal to the audio device 10, e.g., via the network 40. The communication device 30 may be configured to receive 24 the external output signal from the audio device 10. The communication device 30 may be configured to output 36, to the user 1B at the far-end, an internal output signal based on the external output signal from the speakerphone 10.

[0086] The audio system comprises a plurality of distributed microphones 10E, 10E'. The plurality of distributed microphones comprises at least three distributed microphones 10E, 10E'. The plurality of distributed microphones may comprise one or more microphones 10E comprised by the audio device 10, and one or more microphones 10E' spatially separated from the audio device 10. The plurality of distributed microphones 10E, 10E' may be communicatively connected to a processor 10C of the audio device 10. The one or more microphones 10E' spatially separated from the audio device 10 may be comprised by a second audio device 10' communicatively connected to the audio device 10.

[0087] The audio device 10 is configured to obtain via the plurality of distributed microphones 10E, 10E' a plurality of microphone input signals. In other words, the processor 10C of the audio device 10 is configured to obtain via the plurality of distributed microphones 10E, 10E' a plurality of microphone input signals. The plurality of microphone input signals may be a combination of microphone input signals obtained by one or more microphones 10E of the audio device 10, and one or more microphones 10E' external to the audio device 10, e.g., one or more microphones 10E' associated with a second audio device 10'.

[0088] The audio device 10 is configured to determine a plurality of different sets of microphone input signals, where each set of microphone input signals comprises at least two microphone input signals. In other words, the processor 10C of the audio device 10 is configured to determine a plurality of different sets of microphone input signals, where each set of microphone input signals comprises at least two microphone input signals. The plurality of sets of microphone input signals may comprise microphone input signals from one or more microphones 10E of the audio device 10. The plurality of sets of microphone input signals may comprise microphone input signals from one or more microphones 10E' external to the audio device 10. The plurality of sets of microphone input signals may comprise microphone input signals from one or more microphones 10E of the audio device 10 and from one or more microphones 10E' spatially separated from the audio device 10.

[0089] The audio device 10 is configured to determine for each set of microphone input signals a first audio parameter. In other words, the processor 10C of the audio device 10 is configured to determine for each set of microphone input signals a first audio parameter.
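
As a non-limiting illustration, the determination of different sets could be sketched as below. The sketch assumes that every subset of at least two microphones is a candidate set; the disclosure only requires a plurality of different sets, so exhaustive enumeration is an assumption. The function name `candidate_sets` and the microphone labels are hypothetical.

```python
from itertools import combinations


def candidate_sets(mic_ids, min_size=2):
    """Return every subset of mic_ids containing at least min_size members.

    Exhaustive enumeration is an assumption made for illustration; a real
    system might only evaluate a limited selection of sets.
    """
    mic_ids = list(mic_ids)
    sets = []
    for size in range(min_size, len(mic_ids) + 1):
        # combinations() yields each subset once, so all sets are different.
        sets.extend(combinations(mic_ids, size))
    return sets


# Three distributed microphones: one on the device, two external (labels are hypothetical).
sets = candidate_sets(["10E", "10E'-a", "10E'-b"])
```

With three microphones this gives three pairs and one triple. The number of subsets grows exponentially with the number of microphones, which is one reason a practical system might restrict the candidates.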

[0090] The audio device 10 is configured to determine a score for each of the sets of microphone input signals based on the first audio parameters. In other words, the processor 10C of the audio device 10 is configured to determine a score for each of the sets of microphone input signals based on the first audio parameters. The determination of the score may be carried out by scoring the sets of microphone input signals relative to each other based on the first audio parameter, e.g., where the first audio parameter is a signal-to-noise ratio, determining the score may comprise ranking the sets of microphone input signals from lowest to highest signal-to-noise ratio.
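
As a non-limiting illustration, the relative scoring by signal-to-noise ratio could be sketched as below. The sketch assumes that per-microphone signal and noise power estimates are already available and that the signal-to-noise ratio of a set is the ratio of summed signal power to summed noise power; neither assumption is taken from the disclosure. The names `set_snr_db` and `score_sets` and all numeric values are hypothetical.

```python
import math


def set_snr_db(signal_power, noise_power, mic_set):
    """First audio parameter of a set: SNR in dB from summed powers (assumed definition)."""
    s = sum(signal_power[m] for m in mic_set)
    n = sum(noise_power[m] for m in mic_set)
    return 10.0 * math.log10(s / n)


def score_sets(mic_sets, signal_power, noise_power):
    """Rank sets from lowest to highest SNR; the rank (1 = lowest) is the score."""
    snr = {ms: set_snr_db(signal_power, noise_power, ms) for ms in mic_sets}
    ordered = sorted(mic_sets, key=lambda ms: snr[ms])
    return {ms: rank for rank, ms in enumerate(ordered, start=1)}


# Hypothetical power estimates for three microphones.
signal_power = {"A": 4.0, "B": 1.0, "C": 2.0}
noise_power = {"A": 1.0, "B": 1.0, "C": 1.0}
mic_sets = [("A", "B"), ("A", "C"), ("B", "C")]

scores = score_sets(mic_sets, signal_power, noise_power)
best = max(scores, key=scores.get)  # highest scoring set
```

In this sketch, sets with equal signal-to-noise ratio still receive distinct ranks in input order; a real system might prefer to give tied sets the same score.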

[0091] The audio device 10 is configured to output the determined scores. In other words, the processor 10C of the audio device 10 is configured to output the determined scores.

[0092] The processor 10C may use the highest scoring set to determine which microphone input signals to use for further processing, such as audio processing. The processor 10C may output the scores to be used for diagnostics of a meeting room, e.g., determining areas where microphones are usually selected, areas where microphones which are not usually selected are located, and distributions of microphones which work well.
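
As a non-limiting illustration of the diagnostic use, the sketch below counts how often each microphone appears in the top scoring set across several scoring rounds. The history data, the function names `selection_counts` and `rarely_selected`, and the threshold `min_count` are hypothetical and are not taken from the disclosure.

```python
from collections import Counter


def selection_counts(top_sets):
    """Count how many times each microphone was part of a top scoring set."""
    counts = Counter()
    for mic_set in top_sets:
        counts.update(mic_set)
    return counts


def rarely_selected(all_mics, counts, min_count=1):
    """Return microphones selected fewer than min_count times."""
    return [m for m in all_mics if counts[m] < min_count]


# Hypothetical top scoring sets from three scoring rounds in one meeting room.
history = [("A", "C"), ("A", "B"), ("A", "C")]
counts = selection_counts(history)
unused = rarely_selected(["A", "B", "C", "D"], counts)
```

Combining such counts with known microphone positions would be one way to indicate areas of a room where microphones are usually, or rarely, selected.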

[0093] Figs. 2A-B depict a flow chart of an example method according to the present disclosure. The method may be fully or at least partly performed by a computer or other processing units. The method may be performed by a processor of an audio device. The method may be performed by an external processor communicatively connected with one or more audio devices of an audio system. The method comprises obtaining via the plurality of distributed microphones a plurality of microphone input signals S101. The method may comprise synchronizing the obtained microphone input signals S101D. The method may comprise determining a plurality of third audio parameters, each third audio parameter being associated with a respective microphone input signal S101A. The method may comprise comparing each of the third audio parameters to a first sorting criterion S101B. The method may comprise discarding microphone input signals associated with a third audio parameter not passing the first sorting criterion, where the created sets of microphone input signals do not comprise discarded microphone input signals S101C. The method comprises determining a plurality of different sets of microphone input signals, where the sets differ from each other and each set of microphone input signals comprises at least two microphone input signals S102. The method comprises determining for each set of microphone input signals a first audio parameter S103. The method may comprise determining for each set of microphone input signals a second audio parameter S103A. The method may comprise determining for each set a weighted sum of the respective first audio parameter and respective second audio parameter S103B. The method comprises determining a score for each of the sets of microphone input signals based on the first audio parameters S104. The determination of the score may also be based on the second audio parameters. The method comprises outputting the determined scores S105. The method may comprise performing multichannel processing on the plurality of microphone input signals associated with the top scoring set of microphone input signals to provide an output audio signal S106. The multichannel processing may comprise one or more of the following: bandwidth extension, denoising, dereverberation, echo control, direction of arrival estimation, and beamforming. The method may comprise determining a change in room acoustics based on the plurality of microphone input signals S107. Determining the change in room acoustics may comprise determining one or more first impulse responses at a first time based on the plurality of microphone input signals S107A. Determining the change in room acoustics may comprise determining one or more second impulse responses at a second time based on the plurality of microphone input signals S107B. Determining the change in room acoustics may comprise determining the change in room acoustics, where the change in room acoustics is based on the difference between the first impulse responses and the second impulse responses S107C. The method may comprise comparing the change in room acoustics to a first update criterion S108. The method may comprise, if the change in room acoustics exceeds the first update criterion, re-determining the score for each of the sets of microphone input signals S109. The method may comprise, if the change in room acoustics does not exceed the first update criterion, keeping the determined scores S110.
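
As a non-limiting illustration, the optional steps S101A-C, S103B and S107-S110 could be sketched as below. The sketch assumes that the first sorting criterion is a minimum threshold, that the weights are fixed constants, and that the change in room acoustics is the energy of the difference between impulse responses normalised by the energy of the first impulse responses. None of these choices is specified in the disclosure, and all function names and numeric values are hypothetical.

```python
def passes_sorting_criterion(third_params, threshold):
    """S101A-C: keep microphones whose third audio parameter is at least the threshold."""
    return [mic for mic, value in third_params.items() if value >= threshold]


def weighted_sum(first_param, second_param, w1=0.5, w2=0.5):
    """S103B: combine the first and second audio parameters of one set."""
    return w1 * first_param + w2 * second_param


def room_change(first_irs, second_irs):
    """S107A-C: normalised energy of the impulse response difference (assumed metric)."""
    diff = 0.0
    ref = 0.0
    for h1, h2 in zip(first_irs, second_irs):
        diff += sum((a - b) ** 2 for a, b in zip(h1, h2))
        ref += sum(a ** 2 for a in h1)
    return diff / ref if ref > 0.0 else 0.0


def should_rescore(change, update_criterion):
    """S108-S110: re-determine the scores only if the change exceeds the criterion."""
    return change > update_criterion


# Hypothetical per-microphone third audio parameters, e.g. SNR in dB.
kept = passes_sorting_criterion({"A": 12.0, "B": 3.0, "C": 9.0}, threshold=6.0)
combined = weighted_sum(20.0, 10.0, w1=0.75, w2=0.25)

# Hypothetical impulse responses estimated at a first and a second time.
first_irs = [[1.0, 0.5, 0.25]]
unchanged = room_change(first_irs, [[1.0, 0.5, 0.25]])
moved = room_change(first_irs, [[1.0, 0.0, 0.25]])
rescore_unchanged = should_rescore(unchanged, update_criterion=0.1)
rescore_moved = should_rescore(moved, update_criterion=0.1)
```

Here the unchanged room keeps its scores, while the altered impulse response gives a change of roughly 0.19, which exceeds the hypothetical criterion of 0.1 and would trigger re-determination of the scores.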

[0094] The use of the terms "first", "second", "third" and "fourth", "primary", "secondary", "tertiary" etc. does not imply any particular order but is included to identify individual elements. Moreover, the use of the terms "first", "second", "third" and "fourth", "primary", "secondary", "tertiary" etc. does not denote any order or importance, but rather the terms "first", "second", "third" and "fourth", "primary", "secondary", "tertiary" etc. are used to distinguish one element from another. Note that the words "first", "second", "third" and "fourth", "primary", "secondary", "tertiary" etc. are used here and elsewhere for labelling purposes only and are not intended to denote any specific spatial or temporal ordering.

[0095] Furthermore, the labelling of a first element does not imply the presence of a second element and vice versa.

[0096] It may be appreciated that Figs. 1-2B comprise some modules or operations which are illustrated with a solid line and some modules or operations which are illustrated with a dashed line. The modules or operations which are comprised in a solid line are modules or operations which are comprised in the broadest example embodiment. The modules or operations which are comprised in a dashed line are example embodiments which may be comprised in, or a part of, or are further modules or operations which may be taken in addition to the modules or operations of the solid line example embodiments. It should be appreciated that these operations need not be performed in the order presented. Furthermore, it should be appreciated that not all of the operations need to be performed. The example operations may be performed in any order and in any combination.

[0097] It is to be noted that the word "comprising" does not necessarily exclude the presence of other elements or steps than those listed.

[0098] It is to be noted that the words "a" or "an" preceding an element do not exclude the presence of a plurality of such elements.

[0099] It should further be noted that any reference signs do not limit the scope of the claims, that the example embodiments may be implemented at least in part by means of both hardware and software, and that several "means", "units" or "devices" may be represented by the same item of hardware.

[0100] The various example methods, devices, and systems described herein are described in the general context of method steps or processes, which may be implemented in one aspect by a computer program product, embodied in a computer-readable medium, including computer-executable instructions, such as program code, executed by computers in networked environments. A computer-readable medium may include removable and non-removable storage devices including, but not limited to, Read Only Memory (ROM), Random Access Memory (RAM), compact discs (CDs), digital versatile discs (DVD), etc. Generally, program modules may include routines, programs, objects, components, data structures, etc. that perform specified tasks or implement specific abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of program code for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps or processes.

[0101] Although features have been shown and described, it will be understood that they are not intended to limit the claimed invention, and it will be made obvious to those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the claimed invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than restrictive sense. The claimed invention is intended to cover all alternatives, modifications, and equivalents.

Claims

1. An audio system comprising a plurality of distributed microphones and a processor, wherein the plurality of distributed microphones comprises at least three distributed microphones, wherein the processor is configured to:

• obtain via the plurality of distributed microphones a plurality of microphone input signals,

• determine a plurality of different sets of microphone input signals, wherein each set of microphone input signals comprises at least two microphone input signals,

• determine for each set of microphone input signals a first audio parameter,

• determine a score for each of the sets of microphone input signals based on the first audio parameters, and

• output the determined scores.

2. An audio system according to claim 1, wherein the processor is configured to:

• determine for each set of microphone input signals a second audio parameter, and

• determine the score for each of the sets of microphone input signals based on the first audio parameters and the second audio parameters.

3. An audio system according to claim 2, wherein the processor is configured to:

• determine for each set of microphone input signals a weighted sum of the respective first audio parameter and respective second audio parameter, and

• determine the score for each of the sets of microphone input signals based on the weighted sum associated with each multichannel signal.

4. An audio system according to any of the preceding claims, wherein the processor is configured to:

• determine a plurality of third audio parameters for each microphone input signal,

• compare each of the third audio parameters to a first sorting criterion, and

• discard microphone input signals associated with a third audio parameter not passing the first sorting criterion, wherein the created sets of microphone input signals do not comprise discarded microphone input signals.

5. An audio system according to any of the preceding claims, wherein the first audio parameter, and/or the second audio parameter are selected from the following: a signal to noise ratio, a direct to reverberation ratio, an echo to signal ratio, a non-intrusive speech quality score, a speech presence probability estimate, a noise covariance matrix, a clean speech covariance matrix, and a noise power spectral density estimation.

6. An audio system according to any of the preceding claims, wherein the processor is configured to:

• synchronize the obtained microphone input signals.

7. An audio system according to any of the preceding claims, wherein the processor is configured to:

• perform multichannel processing on the plurality of microphone input signals from the set of microphones associated with the top scoring set of microphone input signals to provide an output audio signal.

8. An audio system according to claim 7, wherein to perform multichannel processing comprises one or more of the following: bandwidth extension, denoising, dereverberation, echo control, direction of arrival estimation, and beamforming.

9. An audio system according to any of the preceding claims, wherein the processor is configured to:

• determine a change in room acoustics based on the plurality of microphone input signals,

• compare the change in room acoustics to a first update criterion,

• if the change in room acoustics exceeds the first update criterion, re-determine the score for each of the sets of microphone input signals, and

• output the re-determined scores.

10. An audio system according to claim 9, wherein to determine the change in room acoustics comprises to:

• determine one or more first impulse responses at a first time based on the plurality of microphone input signals,

• determine one or more second impulse responses at a second time based on the plurality of microphone input signals,

• determine the change in room acoustics, wherein the change in room acoustics is based on the difference between the first impulse responses and the second impulse responses.

11. An audio system according to any of the preceding claims, wherein the audio system comprises a first audio device comprising:

• one or more first microphones belonging to the plurality of distributed microphones,

• a first wireless communication interface, and

• a first audio processor, and

a second audio device comprising:

• one or more second microphones belonging to the plurality of distributed microphones,

• a second wireless communication interface, and

• a second audio processor, and

wherein the first audio device is communicatively connected to the second audio device via the first wireless communication interface and the second wireless communication interface, and wherein the first audio processor and/or the second audio processor are configured to determine and output the score for the different sets of microphone input signals.

12. An audio device comprising one or more first microphones, a first wireless communication interface, and a first processor, wherein the first processor is configured to:

• obtain via the one or more first microphones one or more first microphone input signals,

• receive one or more second microphone input signals from one or more additional audio devices communicatively connected to the audio device,

• determine a plurality of different sets of microphone input signals using the one or more first microphone input signals and the one or more second microphone input signals, wherein each set of microphone input signals comprises at least two microphone input signals,

• determine for each set of microphone input signals a first audio parameter,

• determine a score for each of the sets of microphone input signals based on the first audio parameters,

• output the determined scores.

13. A computer implemented method for selecting a set of microphones among a plurality of distributed microphones in an audio system, the method comprising the steps of:

• obtaining via the plurality of distributed microphones a plurality of microphone input signals,

• determining a plurality of different sets of microphone input signals, wherein each set of microphone input signals comprises at least two microphone input signals,

• determining for each set of microphone input signals a first audio parameter,

• determining a score for each of the sets of microphone input signals based on the first audio parameters,

• outputting the determined scores.

14. A computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the method of claim 13.

15. A computer-readable medium having stored thereon the computer program of claim 14.

Drawing

Search report

Cited references

REFERENCES CITED IN THE DESCRIPTION

This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

Patent documents cited in the description