[0001] The present disclosure relates to an audio system comprising a plurality of distributed
microphones, a related audio device and related methods.
BACKGROUND
[0002] Audio systems are used for a wide variety of purposes, one of these being for facilitating
online meetings. During an online meeting the audio system may be configured for picking-up
speech at a near-end and transmitting it to a far-end. To enhance the quality of the
picked-up speech the audio system may perform processing such as noise reduction,
echo control, dereverberation, beamforming, etc.
[0003] For audio systems comprising a plurality of distributed microphones, e.g., distributed
throughout a meeting room, the audio system may be configured for picking out the
best microphone signals or weighting the picked-up microphone signals, and using the
selected/weighted microphone signals for further processing. Selecting the best
microphone signal or weighting the picked-up microphone signals may facilitate
a better enhancement of the quality of the picked-up speech, e.g., by reducing the
impact from noise pollutants and other sources of audio distortion.
SUMMARY
[0005] Accordingly, there is a need for an improved audio system, improved audio device
and related methods.
[0006] According to a first aspect there is provided an audio system comprising a plurality
of distributed microphones and a processor. The plurality of distributed microphones
comprises at least three distributed microphones. The processor is configured to obtain
via the plurality of distributed microphones a plurality of microphone input signals,
determine a plurality of different sets of microphone input signals, where each set
of microphone input signals comprises at least two microphone input signals, determine
for each set of microphone input signals a first audio parameter, determine a score
for each of the sets of microphone input signals based on the first audio parameters,
and output the determined scores.
[0007] Consequently, the assessment of the score/quality of the microphone signals is not
based on the microphone input signals in isolation; instead, the microphone input signals
are assessed based on different combinations with other microphone input signals.
Hence, the determined score for a set of microphone input signals may better reflect the
actual result achievable with that set. For multichannel processing purposes in
particular, looking at combinations of microphone input signals instead
of the available signals in isolation may be advantageous. Furthermore, the outputted
scores may be provided to an audio engineer, IT personnel, or other relevant persons,
thus allowing them to diagnose the audio system and improve the
distribution of microphones.
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The above and other features and advantages of the present invention will become
readily apparent to those skilled in the art by the following detailed description
of example embodiments thereof with reference to the attached drawings, in which:
Fig. 1 schematically illustrates an example system according to the present disclosure,
and
Fig. 2A-B is a flow chart of an example method according to the present disclosure.
DETAILED DESCRIPTION
[0009] Various example embodiments and details are described hereinafter, with reference
to the figures when relevant. It should be noted that the figures may or may not be
drawn to scale and that elements of similar structures or functions are represented
by like reference numerals throughout the figures. It should also be noted that the
figures are only intended to facilitate the description of the embodiments. They are
not intended as an exhaustive description of the invention or as a limitation on the
scope of the invention. In addition, an illustrated embodiment need not have all
the aspects or advantages shown. An aspect or an advantage described in conjunction
with a particular embodiment is not necessarily limited to that embodiment and can
be practiced in any other embodiments even if not so illustrated, or if not so explicitly
described.
[0010] The audio system may be embodied by one or more audio devices. The audio system may
be embodied by two, three, or four audio devices distributed throughout a meeting
room. The audio system may comprise a central processing unit communicatively connected
to the plurality of distributed microphones. The central processing unit may be a
processor of one or more audio devices. The audio system may be embodied as a conference
system. The audio system may comprise video capabilities for transmitting a video
stream to a far-end. The audio system may be formed by a plurality of distributed
microphones being communicatively connected to a processor.
[0011] The plurality of distributed microphones may be a plurality of microphones spatially
separated from each other, e.g., by the plurality of microphones being located on
two or more audio devices or arranged at different locations throughout a meeting
room. The plurality of distributed microphones may be a plurality of microphones distributed
at different positions within a room, such as a meeting room. The plurality of distributed
microphones may be a plurality of microphones where the spatial arrangement between
the microphones is not known in advance. The plurality of distributed microphones
may be a plurality of microphones where the plurality of microphones has been distributed
to improve the coverage of the plurality of microphones. The plurality of distributed
microphones comprises at least three microphones, which may be distributed on two
or more audio devices. The plurality of distributed microphones may comprise more
than three microphones, such as five, ten, fifteen or twenty microphones. The plurality
of distributed microphones may be a plurality of microphones where two or more of
the microphones are located at a first audio device, e.g., as microphones comprised
by a first speakerphone, and one or more microphones located at a second audio device,
e.g., as one or more microphones comprised by a second speakerphone.
[0012] The plurality of distributed microphones and the processor may be communicatively
connected to each other via a wired or a wireless connection. Each of the plurality
of distributed microphones may be associated with a transmitter unit or a transceiver
unit allowing them to communicate with a receiver unit or a transceiver unit associated
with the processor, thus allowing the processor to receive microphone input signals
from the microphones. The plurality of microphone input signals may be obtained by
the processor by receiving the plurality of microphone input signals from the plurality
of distributed microphones.
[0013] The plurality of microphone input signals may be digital signals or analog signals.
The plurality of microphone input signals may be obtained by the plurality of distributed
microphones as analog signals, then be converted to digital signals, and subsequently
be transmitted to the processor as digital signals. Alternatively, the plurality of
distributed microphones may transmit the plurality of microphone input signals as analog
signals, and the processor may then convert the analog signals to digital microphone
input signals. The processor may comprise or be communicatively connected to an
analog-to-digital converter.
[0014] The processor is configured to obtain via the plurality of distributed microphones
a plurality of microphone input signals. The processor may receive the plurality of
microphone input signals over a wireless or a wired connection with the plurality
of distributed microphones. The processor may obtain the plurality of microphone input
signals as analog or digital signals.
[0015] The processor is configured to determine a plurality of different sets of microphone
input signals. The processor may be configured to determine a set of microphone input
signals for each combination of microphone input signals. In an example, where the
plurality of distributed microphones comprises three microphones, each microphone
obtaining a microphone input signal, the different sets of microphone input signals
may be expressed in a vector comprising every combination of at least two microphone
input signals:
$$C = [c_{1,2},\ c_{1,3},\ c_{2,3},\ c_{1,2,3}]$$
where $c_{1,2}$ is the set of microphone input signals comprising microphone input signals 1 and
2 from microphones 1 and 2, and so forth for the rest of the elements of the vector,
and $C$ may be called the set vector, which comprises the plurality of different sets
of microphone input signals. In some embodiments, the processor may be configured
to determine a limited number of sets of microphone input signals, i.e., not determining
a set for each combination of microphone input signals. Instead, the processor may
be limited to determining only sets of microphone input signals comprising 2, 3, 4 or
5 microphone input signals. Determining only a limited number of sets of microphone
input signals may reduce the processing required while still allowing the processor to
determine the best microphone set for further multichannel processing.
[0016] By different sets of microphone input signals, it should be understood that each set
comprises a unique combination of microphone input signals. Thus, different sets of
microphone input signals are not just different permutations of the same microphone
input signals.
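The enumeration of sets described above can be sketched as follows. This is a minimal illustration only; the function name `enumerate_sets` and its parameters `min_size` and `max_size` are hypothetical and are not part of the disclosure. Combinations (rather than permutations) are used so that each set is a unique combination of microphone input signals.

```python
from itertools import combinations


def enumerate_sets(mic_ids, min_size=2, max_size=None):
    """Return every unique combination of microphone ids whose size lies
    between min_size and max_size (inclusive). Order inside a set is
    irrelevant, so permutations of the same ids are never repeated."""
    mic_ids = list(mic_ids)
    if max_size is None:
        max_size = len(mic_ids)
    sets = []
    for size in range(min_size, max_size + 1):
        sets.extend(combinations(mic_ids, size))
    return sets


# Three microphones, every combination of at least two signals.
print(enumerate_sets([1, 2, 3]))  # [(1, 2), (1, 3), (2, 3), (1, 2, 3)]

# Limiting the set size (here to 2 or 3 signals) reduces the number of sets.
print(len(enumerate_sets(range(1, 6), min_size=2, max_size=3)))
```

Capping `max_size` corresponds to the embodiments in which only a limited number of sets is determined.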
[0017] The processor is configured to determine for each set of microphone input signals
a first audio parameter. To determine the first audio parameter, the processor may
combine the microphone input signals within a set. In an example, the processor combines
the microphone input signals within a set into a single signal by beamforming the
microphone input signals within a set, to thereby determine a beamformed signal. The
processor may be configured to process the beamformed signal to determine the first
audio parameter. The processor may process the beamformed signal to determine one
or more first audio parameters, such as a signal to noise ratio, a direct to reverberation
ratio, an echo to signal ratio, or a non-intrusive speech quality score. The processor
may be configured for each set of microphone input signals to determine a corresponding
beamformed signal, and based on the determined beamformed signals determine the first
audio parameters.
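As a hedged sketch of the beamforming example above, the following combines the signals of one set and estimates a signal to noise ratio from the combined signal. Several simplifying assumptions are made that the disclosure does not specify: the signals are already time-aligned, the beamformer is a plain sample-wise average (delay-and-sum with zero delays), and the first `noise_only_len` samples are assumed to contain noise only. All function and parameter names are hypothetical.

```python
import math


def beamform(signals):
    """Average time-aligned signals sample by sample (assumed beamformer:
    delay-and-sum with zero delays)."""
    n = min(len(s) for s in signals)
    return [sum(s[i] for s in signals) / len(signals) for i in range(n)]


def power(samples):
    """Mean square value of a block of samples."""
    return sum(x * x for x in samples) / len(samples)


def snr_db(beamformed, noise_only_len):
    """Estimate the SNR in dB, assuming the first noise_only_len samples
    hold noise only and the remainder holds speech plus noise."""
    noise_power = max(power(beamformed[:noise_only_len]), 1e-12)
    total_power = power(beamformed[noise_only_len:])
    signal_power = max(total_power - noise_power, 1e-12)
    return 10.0 * math.log10(signal_power / noise_power)


# One set with two microphone input signals (toy data).
mic_a = [0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0]
mic_b = [0.1, -0.1, 0.1, -0.1, 1.0, -1.0, 1.0, -1.0]
first_audio_parameter = snr_db(beamform([mic_a, mic_b]), noise_only_len=4)
print(round(first_audio_parameter, 2))
```

The same procedure would be repeated for each set, giving one first audio parameter per set.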
[0018] In the above presented example, the processor combines the microphone input signals
within each set into a single signal and then determines the first audio parameters,
however, it is not necessary to combine the microphone input signals before determining
the first audio parameter. The processor may be configured to determine a noise covariance
matrix, and/or an echo covariance matrix and/or a clean speech covariance matrix for
each set. Thus, not requiring for the microphone input signals within a set to be
combined before determining the first audio parameters.
[0019] The first audio parameter may be viewed as a parameter which describes the whole
set of microphone input signals, and not the individual elements within the set of
microphone input signals.
[0020] In one example, the microphone input signals may be intended to be used in a multi-channel
Wiener filtering solution for denoising, where
$$x = s + v$$
with $x$ being the noisy observation, $s$ the clean speech signal and $v$ the background
noise. From adaptive filter theory, assuming $d$ as the desired signal, the optimal
filter weights $H_{MWF,opt}$ may be given by:
$$H_{MWF,opt} = \Phi_{xx}^{-1}\,\Phi_{xd}$$
where $\Phi_{xd}$ is the cross-power spectral density between the noisy and desired signal, and
$\Phi_{xx}$ is the auto-power spectral density of the microphone input signals. The processor
may be configured to determine/estimate the two power spectral density elements for
each set of microphone input signals. The two power spectral density elements may
then be viewed as the first audio parameter.
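As a worked special case, offered as an illustration under assumptions that the disclosure does not state: suppose the desired signal $d$ is the clean speech at a reference microphone, picked out by a hypothetical selection vector $\mathbf{u}$ (one entry equal to 1, the rest 0), and suppose $s$ and $v$ are uncorrelated. With $\Phi_{ss}$ and $\Phi_{vv}$ denoting the clean speech and noise power spectral density matrices (symbols introduced here), the two power spectral density elements and the standard Wiener solution $\Phi_{xx}^{-1}\Phi_{xd}$ become:

```latex
\begin{aligned}
\Phi_{xx} &= \Phi_{ss} + \Phi_{vv}, \\
\Phi_{xd} &= \Phi_{ss}\,\mathbf{u}, \\
H_{MWF,opt} &= \Phi_{xx}^{-1}\,\Phi_{xd}
             = \left(\Phi_{ss} + \Phi_{vv}\right)^{-1}\Phi_{ss}\,\mathbf{u}.
\end{aligned}
```

Under these assumptions, estimating a noise covariance matrix and a clean speech covariance matrix for a set is sufficient to obtain both power spectral density elements for that set.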
[0021] The processor is configured to determine a score for each of the sets of microphone
input signals based on the first audio parameters. The score may be a relative score
between the sets of microphone input signals, e.g., the score may define a ranking
between the sets. For example, the processor may score each
set based on a signal to noise ratio and rank the sets relative to each other from
lowest signal to noise ratio to highest signal to noise ratio. The processor may score
each set based on the associated first audio parameters and rank the sets relative
to each other based on their associated first audio parameters. The score may be determined
from a continuous range of values, e.g., 1-100. The score may be determined from a
discrete range of values, e.g., 1, 2, 3, 4, or 5. The score may be determined based
on one or more threshold values, e.g., if the first audio parameter exceeds the threshold
value the processor may assign a first score, and if the first audio parameter does
not exceed the threshold value the processor may assign a second score. The first score
could be a score indicating the set to be good, and the second score could be a score
indicating the set to be bad.
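The two scoring approaches described above, a relative ranking and a threshold-based score, can be sketched as follows. The function names, the example signal to noise ratios and the 10 dB threshold are hypothetical values chosen for illustration only.

```python
def rank_scores(param_by_set):
    """Rank sets from lowest to highest parameter value.
    Rank 1 is the lowest value, so a higher rank means a better set
    when the parameter is, e.g., a signal to noise ratio."""
    ordered = sorted(param_by_set, key=param_by_set.get)
    return {s: rank for rank, s in enumerate(ordered, start=1)}


def threshold_score(value, threshold, first_score="good", second_score="bad"):
    """Assign the first score if the value exceeds the threshold,
    otherwise assign the second score."""
    return first_score if value > threshold else second_score


# Hypothetical SNR values in dB, one per set of microphone input signals.
snr_by_set = {(1, 2): 12.0, (1, 3): 7.5, (2, 3): 15.0}

print(rank_scores(snr_by_set))  # {(1, 3): 1, (1, 2): 2, (2, 3): 3}
print({s: threshold_score(v, threshold=10.0) for s, v in snr_by_set.items()})
```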
[0022] To continue the above example with the multi-channel Wiener filtering solution for
denoising, the processor may determine for each set of microphone input signals a
SNR value based on the power spectral density elements and score the set of microphone
input signals based on the determined SNR values. An alternative is to choose the
set which leads to lowest mean square error cost among the Wiener filter solutions.
The lowest mean square error cost may be found by the following:
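For reference, standard Wiener filter theory gives the minimum cost shown below. This is offered as an assumption about the intended expression rather than a statement of it; $\phi_{dd}$ (the power spectral density of the desired signal), the set index $c$ and the selected set $c^{\star}$ are symbols introduced here for illustration.

```latex
J_{\min}(c) = \phi_{dd} - \Phi_{xd}^{H}(c)\,\Phi_{xx}^{-1}(c)\,\Phi_{xd}(c),
\qquad
c^{\star} = \arg\min_{c \in C} J_{\min}(c)
```

Here $\Phi_{xd}(c)$ and $\Phi_{xx}(c)$ denote the power spectral density elements estimated for set $c$, so the cost can be evaluated per set without first applying the filter to the signals.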

[0023] The processor may be configured to output the determined scores. The processor may
use the outputted score to determine which microphone input signals to use for further
processing. The processor may output scores to another processor. The processor may
output the scores to be used for diagnostics of a meeting room, e.g., determining
areas with combinations of microphones which score well and areas with combinations
of microphones which score badly.
[0024] In an embodiment the processor is configured to determine for each set of microphone
input signals a second audio parameter and determine the score for each of the sets
of microphone input signals based on the first audio parameters and the second audio
parameters.
[0025] Consequently, the scoring of the sets is based on more parameters, thus making the
scoring of the sets more robust. For example, during a double talk situation, i.e.,
where the far-end and the near-end are transmitting simultaneously, it may not necessarily
be advantageous to select the multichannel signal with the highest signal-to-noise
ratio, as the same multichannel signal may exhibit a large echo-to-signal ratio. Thus,
deciding based on both parameters may be preferable.
[0026] The second audio parameter may be an audio parameter differing from the first audio
parameter.
[0027] Although only a first and second audio parameter is mentioned, a third, a fourth
and so forth audio parameter may be determined for each set and used in determining
a score for the set.
[0028] In an embodiment the processor is configured to determine for each set of microphone
input signals a weighted sum of the respective first audio parameter and respective
second audio parameter and determine the score for each of the sets of microphone
input signals based on the weighted sum associated with each multichannel signal.
[0029] A weighted sum provides an easy and simple manner to consider several audio parameters.
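A weighted sum of two audio parameters can be sketched as below. The parameter names `snr_db` and `esr_db`, the weights and the numeric values are hypothetical. A negative weight is assumed for the echo to signal ratio so that a large echo to signal ratio lowers the score.

```python
def weighted_sum_scores(params_by_set, weights):
    """Return, for each set, the weighted sum of its audio parameters.
    params_by_set maps a set to {parameter name: value}; weights maps
    each parameter name to its weight."""
    return {
        s: sum(weights[name] * value for name, value in params.items())
        for s, params in params_by_set.items()
    }


# Hypothetical first (SNR) and second (echo to signal ratio) parameters in dB.
params_by_set = {
    (1, 2): {"snr_db": 12.0, "esr_db": -5.0},
    (1, 3): {"snr_db": 15.0, "esr_db": 4.0},
}
weights = {"snr_db": 1.0, "esr_db": -1.0}  # assumed, e.g. set during tuning

scores = weighted_sum_scores(params_by_set, weights)
top_set = max(scores, key=scores.get)
print(scores, top_set)
```

In this toy example the set with the highest signal to noise ratio is not the top scoring set, because its echo to signal ratio is large.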
[0030] The weights associated with different audio parameters may be user defined, i.e.,
determined by a user of the audio system. The weights associated with different audio
parameters may be determined during a tuning/set-up process of the audio system. The
weights may be pre-set by a provider of the audio system, e.g., the audio system may
comprise pre-set factory settings comprising weights associated with different audio
parameters.
[0031] Although a weighted sum provides a good approach to consider several audio parameters,
other approaches may also be usable. For example, a ranking approach may be selected
where each set is ranked in relation to each other based on the first audio parameter
and ranked in relation to each other based on the second audio parameter, and the
average rank obtained by a set is the score for the set.
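The ranking approach can be sketched in the same way. In this sketch rank 1 is assumed to be the best rank, ties are not treated specially, and the mapping `higher_is_better` is a hypothetical helper stating the preferred direction of each audio parameter.

```python
def average_rank_scores(params_by_set, higher_is_better):
    """Rank the sets once per audio parameter (rank 1 = best) and return
    the average rank of each set across all parameters."""
    names = list(higher_is_better)
    totals = {s: 0.0 for s in params_by_set}
    for name in names:
        ordered = sorted(
            params_by_set,
            key=lambda s: params_by_set[s][name],
            reverse=higher_is_better[name],
        )
        for rank, s in enumerate(ordered, start=1):
            totals[s] += rank
    return {s: total / len(names) for s, total in totals.items()}


# Hypothetical parameter values in dB for three sets.
example_params = {
    (1, 2): {"snr_db": 12.0, "esr_db": -5.0},
    (1, 3): {"snr_db": 15.0, "esr_db": 6.0},
    (2, 3): {"snr_db": 9.0, "esr_db": 4.0},
}
avg_ranks = average_rank_scores(
    example_params, {"snr_db": True, "esr_db": False}
)
print(avg_ranks)  # {(1, 2): 1.5, (1, 3): 2.0, (2, 3): 2.5}
```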
[0032] In an embodiment the processor is configured to determine a plurality of third audio
parameters, each third audio parameter being associated with a respective microphone
input signal, compare each of the third audio parameters to a first sorting criterion,
and discard microphone input signals associated with a third audio parameter not passing
the first sorting criterion, where the created sets of microphone input signals do
not comprise discarded microphone input signals.
[0033] Discarding microphone input signals may reduce the complexity and processing
power required in determining the sets of microphone input signals, and subsequently in
determining audio parameters associated with the sets, as there will be fewer sets and
audio parameters to determine. Furthermore, faulty microphones giving misleading
microphone input signals may be sorted away, thus avoiding faulty microphones skewing the results.
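A minimal sketch of the sorting step follows, assuming the third audio parameter is a sound pressure level and the first sorting criterion is a range of acceptable values. The range of 30 dB to 110 dB and the per-microphone levels are hypothetical.

```python
def apply_sorting_criterion(param_by_mic, low, high):
    """Split microphone ids into kept and discarded lists depending on
    whether their parameter lies inside the acceptable range [low, high]."""
    kept = [m for m, v in param_by_mic.items() if low <= v <= high]
    discarded = [m for m in param_by_mic if m not in kept]
    return kept, discarded


# Hypothetical sound pressure levels in dB for four microphones.
spl_by_mic = {1: 62.0, 2: 0.5, 3: 58.0, 4: 131.0}

kept, discarded = apply_sorting_criterion(spl_by_mic, low=30.0, high=110.0)
print(kept, discarded)  # [1, 3] [2, 4]
```

Only the kept microphone input signals would then be used when the sets are determined, so no set contains a discarded signal.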
[0034] The third audio parameter may be an audio parameter differing from the first audio
parameter and/or the second audio parameter.
[0035] The third audio parameter may be a sound pressure level, a signal to noise ratio,
or a non-intrusive speech quality predictor, such as NS-MOS, AEC-MOS, NORESQA, NISQA,
or SRMR.
[0036] Although only a first sorting criterion is mentioned, a second sorting criterion,
a third sorting criterion and so forth may also be present. Each sorting criterion
may be associated with its own audio parameter. Consequently, the microphone
input signals may go through several rounds of sorting before sets of microphone
input signals are determined.
[0037] The first sorting criterion may be a threshold value, or a range of acceptable values.
The first sorting criterion may be set by a user of the audio system. The first sorting
criterion may be set during a tuning process of the audio system. The first sorting
criterion may be pre-set by a provider of the audio system, e.g., as part of pre-set
factory settings.
[0038] The first sorting criterion may be based on a sound pressure level, e.g., if a microphone
input signal exhibits a sound pressure level near 0 dB or an abnormally high sound
pressure level, it may indicate the microphone associated with the microphone input
signal is faulty. The first sorting criterion may be based on a signal to noise ratio,
e.g., if the signal to noise ratio of a microphone input signal is determined to be
low it may be sorted away as even in combination with other signals it may not positively
contribute. The first sorting criterion may be based on a non-intrusive speech quality
predictor. Although some specific examples for the first sorting criterion have been
mentioned, the present disclosure is not limited to these and other audio parameters
may be equally applicable, e.g., MOS score, speech presence probability, signal to
echo ratio, RT60, direct-to-reverberation ratio, etc.
[0039] The third audio parameter may be selected from the following: a signal to noise ratio,
a direct to reverberation ratio, an echo to signal ratio, a non-intrusive speech quality
score, and a speech presence probability estimate.
[0040] In an embodiment the first audio parameter, and/or the second audio parameter are
selected from the following: a signal to noise ratio, a direct to reverberation ratio,
an echo to signal ratio, a non-intrusive speech quality score, a speech presence probability
estimate, a noise covariance matrix, a clean speech covariance matrix, an echo covariance
matrix, cross power spectral density, an auto power spectral density and a noise power
spectral density estimation.
[0041] In an embodiment the processor is configured to synchronize the obtained microphone
input signals.
[0042] The synchronization may be carried out by cross-correlation of the microphone input
signals. The synchronization may be carried out based on a known distribution of the
microphones. The synchronization may be carried out based on time stamps associated
with the microphone input signals. The synchronization may be carried out temporally
with respect to a clock. Synchronization may be carried out by cross-attention layers
in a neural network.
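Synchronization by cross-correlation can be sketched as below: the lag that maximizes the cross-correlation between two microphone input signals is estimated, and one signal is shifted by that lag. This is a brute-force time-domain sketch; the function names and the `max_lag` search range are hypothetical.

```python
def estimate_lag(reference, delayed, max_lag):
    """Return the lag in samples, within [-max_lag, max_lag], that maximizes
    the cross-correlation. A positive lag means `delayed` lags `reference`."""
    def xcorr(lag):
        total = 0.0
        for n, r in enumerate(reference):
            m = n + lag
            if 0 <= m < len(delayed):
                total += r * delayed[m]
        return total

    return max(range(-max_lag, max_lag + 1), key=xcorr)


def align(signal, lag):
    """Shift `signal` earlier by `lag` samples (later if lag is negative),
    zero-padding so that the length is unchanged."""
    if lag >= 0:
        return signal[lag:] + [0.0] * lag
    return [0.0] * (-lag) + signal[:lag]


reference = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
delayed = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # same pulse, 2 samples late

lag = estimate_lag(reference, delayed, max_lag=3)
print(lag)  # 2
print(align(delayed, lag) == reference)  # True
```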
[0043] Synchronization of the microphone input signals may be carried out prior to determining
the different sets of microphone input signals.
[0044] In an embodiment the processor is configured to perform multichannel processing on
the plurality of microphone input signals from the set of microphones associated with
the top scoring set of microphone input signals to provide an output audio signal.
[0045] Consequently, after determining the top scoring set of microphone input signals,
multichannel processing may be performed on the plurality of microphone input signals
from the set of microphones associated with the top scoring set of microphone input
signals.
[0046] The output audio signal may be an audio signal which is to be transmitted to a far-end,
e.g., another audio system or another audio device.
[0047] In an embodiment, performing multichannel processing comprises one or more of the
following: bandwidth extension, denoising, dereverberation, echo control, direction
of arrival estimation, and beamforming.
[0048] In an embodiment, the processor is configured to determine a change in room acoustics
based on the plurality of microphone input signals, compare the change in room acoustics
to a first update criterion, and, if the change in room acoustics exceeds the first update
criterion, re-determine the score for each of the sets of microphone input signals
and output the re-determined scores.
[0049] Consequently, the audio system may in an adaptive manner determine the score for
each set of microphones.
[0050] The change in room acoustics may be caused by movement of an object or a person in
the vicinity of the audio system. The change in room acoustics may be caused by the
addition or removal of an object or a person in the vicinity of the audio system.
[0051] The change in room acoustics may be determined by determining a change over time
in one or more audio parameters of the plurality of microphone input signals. The
change in room acoustics may be determined by determining a change over time in one
or more audio parameters in each of the plurality of microphone input signals. The
change in room acoustics may be determined by analysing microphone input signals associated
with the top scoring set of microphone input signals.
[0052] The first update criterion may comprise one or more threshold values.
[0053] Re-determining the score for each set of microphone input signals may comprise repeating
some or all of the processing previously carried out to determine the score for each
set of microphone input signals, such as, obtain via the plurality of distributed
microphones a plurality of microphone input signals, determine a plurality of different
sets of microphone input signals, determine for each set of microphone input signals
a first audio parameter, determine a score for each of the sets of microphone input
signals based on the first audio parameters, and output the determined scores.
[0054] In an embodiment, determining the change in room acoustics comprises determining
one or more first impulse responses at a first time based on the plurality of microphone
input signals, determining one or more second impulse responses at a second time based
on the plurality of microphone input signals, and determining the change in room acoustics
based on a difference between
the one or more first impulse responses and the one or more second impulse responses.
[0055] Consequently, a simple method is provided for determining a change in room acoustics.
Furthermore, the determination of an impulse response is part of the normal processing
pipeline for audio systems dealing with echo control, hence, minimal additional processing
logic is needed for determining the change in room acoustics.
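One possible way to turn the difference between two impulse responses into a single change value and compare it to an update criterion is sketched below. The normalized difference energy used here is an assumed metric, not one named in the disclosure, and the threshold of 0.1 and the impulse response values are hypothetical.

```python
def ir_change(first_ir, second_ir):
    """Energy of the difference between two equal-length impulse responses,
    normalized by the energy of the first impulse response."""
    diff_energy = sum((a - b) ** 2 for a, b in zip(first_ir, second_ir))
    ref_energy = sum(a * a for a in first_ir)
    return diff_energy / max(ref_energy, 1e-12)


def needs_rescoring(first_ir, second_ir, update_threshold):
    """Return True if the change exceeds the update criterion, in which
    case the scores of the sets would be re-determined."""
    return ir_change(first_ir, second_ir) > update_threshold


ir_first_time = [1.0, 0.5, 0.25, 0.0]   # impulse response at a first time
ir_second_time = [1.0, 0.0, 0.25, 0.5]  # impulse response at a second time

print(ir_change(ir_first_time, ir_first_time))  # 0.0
print(needs_rescoring(ir_first_time, ir_second_time, update_threshold=0.1))  # True
```

Since echo control already estimates impulse responses, only the comparison step would be additional work in such a pipeline.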
[0056] The first impulse response may be a room impulse response. The second impulse response
may be a room impulse response.
[0057] In an embodiment microphone input signals associated with the top scoring set of
microphones are used in determining the change in room acoustics.
[0058] Thus, the determination of room acoustics may be done after determining the top scoring
set of microphone input signals.
[0059] In an embodiment the audio system comprises a first audio device comprising one
or more first microphones belonging to the plurality of distributed microphones, a
first wireless communication interface, and a first audio processor, and a second
audio device comprising one or more second microphones belonging to the plurality
of distributed microphones, a second wireless communication interface, and a second
audio processor, where the first audio device is communicatively connected to
the second audio device via the first wireless communication interface and the second
wireless communication interface, and where the first audio processor and/or the second
audio processor are configured to determine and output the score for the different
sets of microphone input signals.
[0060] The first audio device and the second audio device may be two speakerphones arranged
in a meeting room. Even for modern speakerphones it may be required to have two or
more speakerphones to provide proper audio pick-up coverage of a large meeting room.
The first audio device and the second audio device may be different types of audio
devices, e.g., the first audio device may be a headset and the second audio device
may be a speakerphone.
[0061] All microphones of the plurality of distributed microphones may be comprised by the
first audio device and the second audio device.
[0062] The first audio device may comprise one, two, three, four, five or more first microphones
belonging to the plurality of distributed microphones. The second audio device may
comprise one, two, three, four, five or more second microphones belonging to the plurality
of distributed microphones.
[0063] According to a second aspect of the invention there is provided an audio device comprising
one or more first microphones, a first wireless communication interface, and a first
processor, where the first processor is configured to obtain via the one or more first
microphones one or more first microphone input signals, receive one or more second
microphone input signals from one or more additional audio devices communicatively
connected to the audio device, determine a plurality of different sets of microphone
input signals using the one or more first microphone input signals and the one or
more second microphone input signals, where each set of microphone input signals comprises
at least two microphone input signals, determine for each set of microphone input
signals a first audio parameter, determine a score for each of the sets of microphone
input signals based on the first audio parameters, and output the determined scores.
[0064] In the following, the general term audio device is used. The description given
in relation to the audio device is equally applicable to the first audio device, the
second audio device, or any other audio device mentioned herein.
[0065] The audio device may be configured to be worn by a user. The audio device may be
arranged at the user's ear, on the user's ear, over the user's ear, in the user's
ear, in the user's ear canal, behind the user's ear, and/or in the user's concha,
i.e., the audio device is configured to be worn at the user's ear.
[0066] The audio device may be configured to be worn by a user at each ear, e.g., a pair
of ear buds or a headset with two earcups. In the embodiment where the audio device
is to be worn at both ears, the components meant to be worn at each ear may be connected,
such as wirelessly connected and/or connected by wires, and/or by a strap and/or by
a headband. The components meant to be worn at each ear may be substantially identical
or differ from each other.
[0067] The audio device may be a hearable such as a headset, headphones, earphones, ear
bud, hearing aids, an over the counter (OTC) hearing device, a hearing protection
device, a one-size-fits-all audio device, a custom audio device or another head-wearable
audio device.
[0068] The audio device may be a speakerphone, or another audio device forming part of
a conference system. The audio device may be an audio device not configured to be
worn by a user.
[0069] The interface may comprise a wireless transceiver, also denoted as a radio transceiver,
and an antenna for wireless transmission and reception of an audio signal, such as
for wireless transmission of an output signal and/or wireless reception of a wireless
input signal. The audio device may be configured for wireless communication with one
or more electronic devices, such as another audio device, a smartphone, a tablet,
a computer and/or a smart watch. The audio device optionally comprises an antenna
for converting one or more wireless input audio signals to antenna output signal(s).
The audio device may be configured for wireless communications via a wireless communication
system, such as short-range wireless communications systems, such as Wi-Fi, Bluetooth,
Zigbee, IEEE 802.11, IEEE 802.15, infrared and/or the like. The audio device may be
configured for wireless communications via a wireless communication system, such as
a 3GPP system, such as a 3GPP system supporting one or more of: New Radio (NR), Narrow-band
IoT (NB-IoT), and Long Term Evolution - enhanced Machine Type Communication (LTE-M),
millimeter-wave communications, such as millimeter-wave communications in licensed
bands, such as device-to-device millimeter-wave communications in licensed bands.
In one or more example audio devices the interface of the audio device comprises one
or more of: a Bluetooth interface, Bluetooth low energy interface, and a magnetic
induction interface. For example, the interface of the audio device may comprise a
Bluetooth antenna and/or a magnetic induction antenna. In one or more example audio
devices, the interface may comprise a connector for wired communication,
such as by using an electrical cable. The connector may connect one or more microphones
to the audio device. The connector may connect the audio device to an electronic device,
e.g., for wired connection. The one or more interfaces can be or comprise wireless
interfaces, such as transmitters and/or receivers, and/or wired interfaces, such as
connectors for physical coupling.
[0070] The audio device comprises a plurality of input transducers. The plurality of input
transducers may comprise a plurality of microphones. The plurality of input transducers
may be configured for converting an acoustic signal into an electric input signal.
The electric input signal may be an analog signal. The electric input signal may be
a digital signal. The plurality of input transducers may be coupled to one or more
analog-to-digital converters configured for converting the analog input signal into
a digital input signal.
[0071] The audio device may comprise one or more antennas configured for wireless communication.
The one or more antennas may comprise an electric antenna. The electric antenna is
configured for wireless communication at a first frequency. The first frequency may
be above 800 MHz, preferably between 900 MHz and 6 GHz. The first frequency
may be 902 MHz to 928 MHz. The first frequency may be 2.4 to 2.5 GHz. The first frequency
may be 5.725 GHz to 5.875 GHz. The one or more antennas may comprise a magnetic antenna.
The magnetic antenna may comprise a magnetic core. The magnetic antenna comprises
a coil. The coil may be coiled around the magnetic core. The magnetic antenna is configured
for wireless communication at a second frequency. The second frequency may be below
100 MHz. The second frequency may be between 9 MHz and 15 MHz.
[0072] The audio device may comprise one or more wireless communication units. The one or
more wireless communication units may comprise one or more wireless receivers, one
or more wireless transmitters, one or more transmitter-receiver pairs, and/or one
or more transceivers. At least one of the one or more wireless communication units
may be coupled to the one or more antennas. The wireless communication unit may be
configured for converting a wireless signal received by at least one of the one or
more antennas into an electric input signal. The audio device may be configured for
wired/wireless audio communication, e.g., enabling the user to listen to media, such
as music or radio, and/or enabling the user to perform phone calls.
[0073] The audio device may be configured for wireless communication with one or more external
devices, such as one or more accessory devices, such as a smartphone and/or a smart
watch.
[0074] The audio device may comprise one or more processing units. The processing unit may
be configured for processing one or more input signals. The processing may comprise
compensating for a hearing loss of the user, i.e., applying frequency-dependent gain
to input signals in accordance with the user's frequency-dependent hearing impairment.
The processing may comprise performing feedback cancellation, beamforming, tinnitus
reduction/masking, noise reduction, noise cancellation, speech recognition, bass adjustment,
treble adjustment, face balancing, echo control, and/or processing of user input.
The processing unit may be a processor, an integrated circuit, an application, functional
module, etc. The processing unit may be implemented in a signal-processing chip or
a printed circuit board (PCB). The processing unit is configured to provide an electric
output signal based on the processing of one or more input signals. The processing
unit may be configured to provide one or more further electric output signals. The
one or more further electric output signals may be based on the processing of one
or more input signals. The processing unit may comprise a receiver, a transmitter
and/or a transceiver for receiving and transmitting wireless signals. The processing
unit may control one or more playback features of the audio device.
[0075] The audio device may comprise an output transducer. The output transducer may be
coupled to the processing unit. The output transducer may be a loudspeaker, or any
other device configured for converting an electrical signal into an acoustical signal.
The receiver may be configured for converting an electric output signal into an acoustic
output signal.
[0076] The wireless communication unit may be configured for converting an electric output
signal into a wireless output signal. The wireless output signal may comprise synchronization
data. The wireless communication unit may be configured for transmitting the wireless
output signal via at least one of the one or more antennas.
[0077] The audio device may comprise a digital-to-analog converter configured to convert
an electric output signal or a wireless output signal into an analog signal.
[0078] The audio device may comprise a power source. The power source may comprise a battery
providing a first voltage. The battery may be a rechargeable battery. The battery
may be a replaceable battery. The power source may comprise a power management unit.
The power management unit may be configured to convert the first voltage into a second
voltage. The power source may comprise a charging coil. The charging coil may be provided
by the magnetic antenna.
[0079] The audio device may comprise a memory, including volatile and non-volatile forms
of memory.
[0080] According to a third aspect there is provided a computer implemented method for selecting
a set of microphones among a plurality of distributed microphones in an audio system,
the method comprising obtaining via the plurality of distributed microphones a plurality
of microphone input signals, determining a plurality of different sets of microphone
input signals, where each set of microphone input signals comprises at least two microphone
input signals, determining for each set of microphone input signals a first audio
parameter, determining a score for each of the sets of microphone input signals based
on the first audio parameters, and outputting the determined scores.
[0081] Fig. 1 schematically illustrates an example system, such as an audio system 2 according
to the present disclosure. The audio system 2 comprises an audio device 10 comprising
a memory 10A, an interface 10B, a processor 10C, one or more speakers 10D, and one
or more microphones, including a first microphone 10E. The audio device 10 may be
configured to obtain audio signals, output audio signals, and process audio signals.
The audio device 10 may be a speakerphone, e.g., configured to be used by a party
(such as one or more users 1A at a near-end) to communicate with one or more other
parties (such as one or more users 1B at a far-end). The audio device 10 may be used
for a conference and/or a meeting between two or more parties being remote from each
other. The audio device 10 may be used by one or more users in a vicinity of where
the speakerphone 10 is located, also referred to as a near-end.
[0082] In one or more example systems, the audio system 2 comprises one or more additional
audio devices, including a second audio device 10'. The second audio device 10' may
be substantially identical to the audio device 10, i.e., the second audio device 10'
may comprise a memory, an interface, a processor, one or more speakers, and one or
more microphones.
[0083] In one or more example systems, the audio system 2 comprises one or more additional
microphones 10E' spatially separated from the audio device 10 and communicatively
connected with the audio device 10 via a wireless connection 15 or alternatively a
wired connection.
[0084] Optionally, the audio system 2 comprises an electronic device 60. The electronic
device 60 may for example be or comprise a smartphone, a smart-watch, a conference
hub, a smart-tv, smart-speakers, a tablet computer, or a computer, such as a laptop computer
or PC. In other words, the electronic device 60 may for example
be a user device of a user 1A, such as a mobile phone or a computer, configured to
communicate with the speakerphone 10. In one or more example systems and/or speakerphones,
the electronic device 60 may be seen as a user accessory device, such as a mobile phone,
a smart watch, a tablet, and/or a wearable gadget.
[0085] Optionally, the audio system 2 is communicatively connected to a far-end communication
device 30. The communication device 30 may be seen as a communication device used
by one or more far-end users 1, 1B to communicate with the one or more users 1, 1A
at the near-end, e.g., via a network 40 such as a global network, e.g., the internet,
and/or a local network. The communication device 30 may be configured to obtain 38
a microphone input signal indicative of speech from one or more users 1B at the far-end.
The communication device 30 may be configured to process the microphone input signal
for provision of an external output signal. The communication device 30 may be configured
to transmit 22 the external output signal to the audio device 10, e.g., via the network
40. The communication device 30 may be configured to receive 24 the external output
signal from the audio device 10. The communication device 30 may be configured to
output 36, to the user 1B at the far-end, an internal output signal based on the external
output signal from the speakerphone 10.
[0086] The audio system comprises a plurality of distributed microphones 10E, 10E'. The
plurality of distributed microphones comprises at least three distributed microphones
10E, 10E'. The plurality of distributed microphones may comprise one or more microphones
10E comprised by the audio device 10, and one or more microphones 10E' spatially separated
from the audio device 10. The plurality of distributed microphones 10E, 10E' may
be communicatively connected to a processor 10C of the audio device 10. The one or
more microphones 10E' spatially separated from the audio device 10 may be comprised
by a second audio device 10' communicatively connected to the audio device 10.
[0087] The audio device 10 is configured to obtain via the plurality of distributed microphones
10E, 10E' a plurality of microphone input signals. In other words, the processor 10C
of the audio device 10 is configured to obtain via the plurality of distributed microphones
10E, 10E' a plurality of microphone input signals. The plurality of microphone input
signals may be a combination of microphone input signals obtained by one or more microphones
10E of the audio device 10, and one or more microphones 10E' external to the audio
device 10, e.g., one or more microphones 10E' associated with a second audio device
10'.
[0088] The audio device 10 is configured to determine a plurality of different sets of microphone
input signals, where each set of microphone input signals comprises at least two microphone
input signals. In other words, the processor 10C of the audio device 10 is configured
to determine a plurality of different sets of microphone input signals, where each
set of microphone input signals comprises at least two microphone input signals. The
plurality of sets of microphone input signals may comprise microphone input signals
from one or more microphones 10E of the audio device 10. The plurality of sets of
microphone input signals may comprise microphone input signals from one or more microphones
10E' external to the audio device 10. The plurality of sets of microphone input signals
may comprise microphone input signals from one or more microphones 10E of the audio
device 10 and from one or more microphones 10E' spatially separated from the audio
device 10.
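As a non-limiting illustration, determining the plurality of different sets may be sketched as enumerating every combination of at least two microphone input signals. The function name, the signal labels, and the choice to enumerate all combinations exhaustively are assumptions made for this example only; the disclosure does not prescribe how the sets are formed.

```python
from itertools import combinations


def enumerate_signal_sets(signal_ids, min_size=2):
    """Return every distinct combination of signal_ids with at least min_size members.

    Exhaustive enumeration is an assumption of this sketch; a practical
    system might restrict the candidate sets to limit computation.
    """
    ids = list(signal_ids)
    sets = []
    for size in range(min_size, len(ids) + 1):
        sets.extend(combinations(ids, size))
    return sets


# Hypothetical labels: two microphones 10E of the audio device 10 and
# one microphone 10E' spatially separated from the audio device 10.
signal_sets = enumerate_signal_sets(["10E-1", "10E-2", "10E'-1"])
for candidate in signal_sets:
    print(candidate)
```

With three microphone input signals this yields three sets of two signals and one set of three signals; the number of candidate sets grows quickly with the number of microphones.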
[0089] The audio device 10 is configured to determine for each set of microphone input signals
a first audio parameter. In other words, the processor 10C of the audio device 10
is configured to determine for each set of microphone input signals a first audio
parameter.
[0090] The audio device 10 is configured to determine a score for each of the sets of microphone
input signals based on the first audio parameters. In other words, the processor 10C
of the audio device 10 is configured to determine a score for each of the sets of
microphone input signals based on the first audio parameters. The determination of
the score may be carried out by scoring the sets of microphone input signals relative
to each other based on the first audio parameter, e.g., in the case of the first audio parameter
being a signal-to-noise ratio, determining the score may comprise scoring the sets of microphone
input signals from lowest to highest signal-to-noise ratio.
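As a non-limiting illustration, the relative scoring may be sketched as a rank-based score, where the set with the lowest signal-to-noise ratio receives the score 1 and the set with the highest receives the score N. The per-set signal-to-noise values, the set labels, and the use of the rank itself as the score are assumptions of this example; how the signal-to-noise ratio of a set is estimated is not shown.

```python
def rank_scores(first_params):
    """Score each set by its rank, from lowest to highest first audio parameter.

    first_params maps a set (a tuple of signal labels) to its first audio
    parameter, here a hypothetical signal-to-noise ratio in dB.
    """
    ordered = sorted(first_params, key=first_params.get)
    return {signal_set: rank for rank, signal_set in enumerate(ordered, start=1)}


# Hypothetical per-set signal-to-noise ratios in dB.
set_snr_db = {
    ("A", "B"): 12.0,
    ("A", "C"): 7.5,
    ("B", "C"): 15.2,
    ("A", "B", "C"): 11.0,
}

scores = rank_scores(set_snr_db)
best_set = max(scores, key=scores.get)
print(scores)
print(best_set)
```

A rank-based score keeps the scores comparable across different choices of first audio parameter, at the cost of discarding how far apart the underlying values are.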
[0091] The audio device 10 is configured to output the determined scores. In other words,
the processor 10C of the audio device 10 is configured to output the determined scores.
[0092] The processor 10C may use the highest scoring set to determine which microphone input
signals to use for further processing, such as audio processing. The processor 10C
may output the scores to be used for diagnostics of a meeting room, e.g., determining
areas where microphones are usually selected, areas where microphones which are not
usually selected are located, and distributions of microphones which work well.
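As a non-limiting illustration, the two uses of the output scores may be sketched as follows: selecting the highest scoring set for further processing, and counting how often each microphone belongs to the highest scoring set over several scoring rounds. The data layout, the microphone labels, and the function names are assumptions of this example.

```python
from collections import Counter


def top_scoring_set(scores):
    """Return the set with the highest score."""
    return max(scores, key=scores.get)


def selection_counts(score_history):
    """Count how often each microphone is part of the highest scoring set."""
    counts = Counter()
    for scores in score_history:
        counts.update(top_scoring_set(scores))
    return counts


# Hypothetical scores output in three successive scoring rounds.
score_history = [
    {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 2},
    {("A", "B"): 2, ("A", "C"): 1, ("B", "C"): 3},
    {("A", "B"): 3, ("A", "C"): 2, ("B", "C"): 1},
]

counts = selection_counts(score_history)
print(counts.most_common())
```

Microphones with a low count correspond to positions that are rarely selected, which may indicate areas of the meeting room where a microphone contributes little.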
[0093] Figs. 2A-B depict a flow chart of an example method according
to the present disclosure. The method may be fully or at least partly performed by
a computer or other processing units. The method may be performed by a processor of
an audio device. The method may be performed by an external processor communicatively
connected with one or more audio devices of an audio system. The method comprises
obtaining via the plurality of distributed microphones a plurality of microphone input
signals S101. The method may comprise synchronizing the obtained microphone input
signals S101D. The method may comprise determining a plurality of third audio parameters,
each third audio parameter being associated with a respective microphone input signal
S101A. The method may comprise comparing each of the third audio parameters to a first
sorting criterion S101B. The method may comprise discarding microphone input signals
associated with a third audio parameter not passing the first sorting criterion, where
the created sets of microphone input signals do not comprise discarded microphone
input signals S101C. The method comprises determining a plurality of different sets
of microphone input signals, where each set of microphone input signals comprises
at least two microphone input signals and the sets differ from each other S102. The method
comprises determining for each set of microphone input signals a first audio parameter
S103. The method may comprise determining for each set of microphone input signals
a second audio parameter S103A. The method may comprise determining for each set a
weighted sum of the respective first audio parameter and respective second audio parameter
S103B. The method comprises determining a score for each of the sets of microphone
input signals based on the first audio parameters S104. The determination of the score
may also be based on the second audio parameters. The method comprises outputting the
determined scores S105. The method may comprise performing multichannel processing
on the plurality of microphone input signals associated with the top scoring set of
microphone input signals to provide an output audio signal S106. The multichannel
processing may comprise one or more of the following: bandwidth extension, denoising,
dereverberation, echo control, direction of arrival estimation, and beamforming. The method may
comprise determining a change in room acoustics based on the plurality of microphone
input signals S107. Determining the change in room acoustics may comprise determining
one or more first impulse responses at a first time based on the plurality of microphone
input signals S107A. Determining the change in room acoustics may comprise determining
one or more second impulse responses at a second time based on the plurality of microphone
input signals S107B. Determining the change in room acoustics may comprise determining
the change in room acoustics, where the change in room acoustics is based on the difference
between the first impulse responses and the second impulse responses S107C. The method
may comprise comparing the change in room acoustics to a first update criterion S108.
The method may comprise, if the change in room acoustics exceeds the first update
criterion, re-determining the score for each of the sets of microphone input signals
S109. The method may comprise, if the change in room acoustics does not exceed the
first update criterion, keeping the determined scores S110.
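As a non-limiting illustration, the optional steps S101A-S101C, S103B and S107-S110 may be sketched as below. Several details are assumptions of this example and are not specified by the disclosure: the first sorting criterion is modelled as a minimum threshold, the weights of the weighted sum are arbitrary, and the change in room acoustics is modelled as the mean Euclidean distance between paired impulse responses.

```python
import math


def passing_signals(third_params, minimum):
    """S101A-S101C: keep the signals whose third audio parameter passes
    the first sorting criterion (assumed here to be a minimum threshold)."""
    return [sid for sid, value in third_params.items() if value >= minimum]


def weighted_sum(first_param, second_param, w_first=0.5, w_second=0.5):
    """S103B: weighted sum of the first and second audio parameter of one set.
    The default weights are arbitrary placeholders."""
    return w_first * first_param + w_second * second_param


def room_acoustics_change(first_irs, second_irs):
    """S107A-S107C: mean Euclidean distance between the impulse responses
    determined at a first time and at a second time (an assumed metric)."""
    distances = [math.dist(h1, h2) for h1, h2 in zip(first_irs, second_irs)]
    return sum(distances) / len(distances)


def update_scores(scores, change, update_criterion, rescore):
    """S108-S110: re-determine the scores only if the change exceeds the
    first update criterion; otherwise keep the determined scores."""
    return rescore() if change > update_criterion else scores


# Hypothetical per-signal third audio parameters, e.g. signal-to-noise ratios in dB.
kept = passing_signals({"m1": 5.0, "m2": -2.0, "m3": 9.0}, minimum=0.0)

# Hypothetical impulse responses for two microphones at two points in time.
irs_t1 = [[1.0, 0.5, 0.0], [0.8, 0.2, 0.1]]
irs_t2 = [[1.0, 0.5, 0.0], [0.8, 0.2, 0.1]]
change = room_acoustics_change(irs_t1, irs_t2)

old_scores = {("m1", "m3"): 1}
new_scores = update_scores(old_scores, change, 0.1, lambda: {("m1", "m3"): 2})
print(kept, change, new_scores)
```

Discarding signals before the sets are created reduces the number of candidate sets, and gating the re-scoring on a change in room acoustics avoids repeating the scoring while the acoustic conditions are stable.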
[0094] The use of the terms "first", "second", "third" and "fourth", "primary", "secondary",
"tertiary" etc. does not imply any particular order but is included to identify individual
elements. Moreover, the use of the terms "first", "second", "third" and "fourth",
"primary", "secondary", "tertiary" etc. does not denote any order or importance, but
rather the terms "first", "second", "third" and "fourth", "primary", "secondary",
"tertiary" etc. are used to distinguish one element from another. Note that the words
"first", "second", "third" and "fourth", "primary", "secondary", "tertiary" etc. are
used here and elsewhere for labelling purposes only and are not intended to denote
any specific spatial or temporal ordering.
[0095] Furthermore, the labelling of a first element does not imply the presence of a second
element and vice versa.
[0096] It may be appreciated that Figs. 1-2B comprise some modules or operations which are
illustrated with a solid line and some modules or operations which are illustrated
with a dashed line. The modules or operations which are comprised in a solid line
are modules or operations which are comprised in the broadest example embodiment.
The modules or operations which are comprised in a dashed line are example embodiments
which may be comprised in, or a part of, or are further modules or operations which
may be taken in addition to the modules or operations of the solid line example embodiments.
It should be appreciated that these operations need not be performed in the order presented.
Furthermore, it should be appreciated that not all of the operations need to be performed.
The example operations may be performed in any order and in any combination.
[0097] It is to be noted that the word "comprising" does not necessarily exclude the presence
of other elements or steps than those listed.
[0098] It is to be noted that the words "a" or "an" preceding an element do not exclude
the presence of a plurality of such elements.
[0099] It should further be noted that any reference signs do not limit the scope of the
claims, that the example embodiments may be implemented at least in part by means
of both hardware and software, and that several "means", "units" or "devices" may
be represented by the same item of hardware.
[0100] The various example methods, devices, and systems described herein are described
in the general context of method steps or processes, which may be implemented in one
aspect by a computer program product, embodied in a computer-readable medium, including
computer-executable instructions, such as program code, executed by computers in networked
environments. A computer-readable medium may include removable and non-removable storage
devices including, but not limited to, Read Only Memory (ROM), Random Access Memory
(RAM), compact discs (CDs), digital versatile discs (DVD), etc. Generally, program
modules may include routines, programs, objects, components, data structures, etc.
that perform specified tasks or implement specific abstract data types. Computer-executable
instructions, associated data structures, and program modules represent examples of
program code for executing steps of the methods disclosed herein. The particular sequence
of such executable instructions or associated data structures represents examples
of corresponding acts for implementing the functions described in such steps or processes.
[0101] Although features have been shown and described, it will be understood that they
are not intended to limit the claimed invention, and it will be obvious to those
skilled in the art that various changes and modifications may be made without departing
from the spirit and scope of the claimed invention. The specification and drawings
are, accordingly, to be regarded in an illustrative rather than restrictive sense.
The claimed invention is intended to cover all alternatives, modifications, and equivalents.
1. An audio system comprising a plurality of distributed microphones and a processor,
wherein the plurality of distributed microphones comprises at least three distributed
microphones, wherein the processor is configured to:
• obtain via the plurality of distributed microphones a plurality of microphone input
signals,
• determine a plurality of different sets of microphone input signals, wherein each
set of microphone input signals comprises at least two microphone input signals,
• determine for each set of microphone input signals a first audio parameter,
• determine a score for each of the sets of microphone input signals based on the
first audio parameters, and
• output the determined scores.
2. An audio system according to claim 1, wherein the processor is configured to:
• determine for each set of microphone input signals a second audio parameter, and
• determine the score for each of the sets of microphone input signals based on the
first audio parameters and the second audio parameters.
3. An audio system according to claim 2, wherein the processor is configured to:
• determine for each set of microphone input signals a weighted sum of the respective
first audio parameter and respective second audio parameter, and
• determine the score for each of the sets of microphone input signals based on the
weighted sum associated with each set of microphone input signals.
4. An audio system according to any of the preceding claims, wherein the processor is
configured to:
• determine a plurality of third audio parameters for each microphone input signal,
• compare each of the third audio parameters to a first sorting criterion, and
• discard microphone input signals associated with a third audio parameter not passing
the first sorting criterion, wherein the created sets of microphone input signals
do not comprise discarded microphone input signals.
5. An audio system according to any of the preceding claims, wherein the first audio
parameter, and/or the second audio parameter are selected from the following: a signal
to noise ratio, a direct to reverberation ratio, an echo to signal ratio, a non-intrusive
speech quality score, a speech presence probability estimate, a noise covariance matrix,
a clean speech covariance matrix, and a noise power spectral density estimation.
6. An audio system according to any of the preceding claims, wherein the processor is
configured to:
• synchronize the obtained microphone input signals.
7. An audio system according to any of the preceding claims, wherein the processor is
configured to:
• perform multichannel processing on the plurality of microphone input signals from
the set of microphones associated with the top scoring set of microphone input signals
to provide an output audio signal.
8. An audio system according to claim 7, wherein to perform multichannel processing comprises
one or more of the following: bandwidth extension, denoising, dereverberation, echo
control, direction of arrival estimation, and beamforming.
9. An audio system according to any of the preceding claims, wherein the processor is
configured to:
• determine a change in room acoustics based on the plurality of microphone input
signals,
• compare the change in room acoustics to a first update criterion,
• if the change in room acoustics exceeds the first update criterion, re-determine
the score for each of the sets of microphone input signals, and
• output the re-determined scores.
10. An audio system according to claim 9, wherein to determine the change in room acoustics
comprises to:
• determine one or more first impulse responses at a first time based on the plurality
of microphone input signals,
• determine one or more second impulse responses at a second time based on the plurality
of microphone input signals,
• determine the change in room acoustics, wherein the change in room acoustics is
based on the difference between the first impulse responses and the second impulse
responses.
11. An audio system according to any of the preceding claims, wherein the audio system
comprises a first audio device comprising:
• one or more first microphones belonging to the plurality of distributed microphones,
• a first wireless communication interface, and
• a first audio processor, and
a second audio device comprising:
• one or more second microphones belonging to the plurality of distributed microphones,
• a second wireless communication interface, and
• a second audio processor, and
wherein the first audio device is communicatively connected to the second audio device
via the first wireless communication interface and the second wireless communication
interface, and wherein the first audio processor and/or the second audio processor
are configured to determine and output the score for the different sets of microphone
input signals.
12. An audio device comprising one or more first microphones, a first wireless communication
interface, and a first processor, wherein the first processor is configured to:
• obtain via the one or more first microphones one or more first microphone input signals,
• receive one or more second microphone input signals from one or more additional
audio devices communicatively connected to the audio device,
• determine a plurality of different sets of microphone input signals using the one
or more first microphone input signals and the one or more second microphone input
signals, wherein each set of microphone input signals comprises at least two microphone
input signals,
• determine for each set of microphone input signals a first audio parameter,
• determine a score for each of the sets of microphone input signals based on the
first audio parameters, and
• output the determined scores.
13. A computer implemented method for selecting a set of microphones among a plurality
of distributed microphones in an audio system, the method comprising the steps of:
• obtaining via the plurality of distributed microphones a plurality of microphone
input signals,
• determining a plurality of different sets of microphone input signals, wherein each
set of microphone input signals comprises at least two microphone input signals,
• determining for each set of microphone input signals a first audio parameter,
• determining a score for each of the sets of microphone input signals based on the
first audio parameters, and
• outputting the determined scores.
14. A computer program comprising instructions which, when the program is executed by
a computer, cause the computer to carry out the method of claim 13.
15. A computer-readable medium having stored thereon the computer program of claim 14.