Field
[0001] Example embodiments relate to multi-microphone audio capture, such as identifying when
one or more microphones of an audio capture device are blocked.
Background
[0002] The number of microphones in many audio capture devices, such as smartphones, is
tending to increase. Increasing the number of microphones can have advantages, such
as aiding noise suppression techniques and beamforming, and enables new features such
as spatial audio capture. Spatial audio capture may be useful for communications and
user-generated content (UGC). For example, a user may generate videos that incorporate
spatial audio. Spatial audio capture is also extending to other domains beyond mobile
phones. Examples include standalone cameras, police wearable cameras, action cameras
and portable recorders. There remains a need for further developments in this field.
Summary
[0003] In a first aspect, this specification describes an apparatus comprising means for
performing: identifying at least one microphone of a multi-microphone audio capture
device that is blocked (e.g. blocked by a user of the device); and determining a position
of at least one first visual indication of the blocked microphone(s) for output to
a user on a display (e.g. an integrated display) of the audio capture device based,
at least in part, on a location of the respective microphone on the audio capture
device, wherein the position of the at least one first visual indication corresponds
more closely to the position of the respective blocked microphone than to the position
of any other microphone of the multi-microphone audio capture device. The audio capture
device may be used to capture spatial audio, for example for use in generating user-generated
content (UGC) such as video with spatial audio content. The display may be a screen,
such as an integrated screen of the user device. In some example embodiments, the
display may be separate to the user device (e.g. a monitor or a headset display).
[0004] Some example embodiments further comprise means for performing: determining whether
any part of the display is blocked (e.g. such that the relevant part of the display
is not visible to the user), wherein the position of the at least one first visual
indication of the blocked microphone(s) is based, at least in part, on a location
of any part of the display that is blocked. The means for performing determining the
position of the at least one first visual indication of the blocked microphone(s)
may further comprise means for performing determining the position of the at least
one first visual indication such that at least part of the or each first visual indication
is on a part of the display not blocked. The means for performing determining the
position of the at least one first visual indication of the blocked microphone(s)
may further comprise determining the position of the first visual indication such
that the or each first visual indication corresponds as closely as possible to the
position of the blocked microphone.
[0005] Some example embodiments further comprise means for performing: determining an extent
of the at least one first visual indication of the blocked microphone(s). The extent
of the at least one first visual indication of the blocked microphone(s) may be based,
at least in part, on a location of any part of the display that is blocked. The extent
of the at least one first visual indication of the blocked microphone(s) may be based,
at least in part, on one or more of: the location of the respective blocked microphone;
a functionality of the respective blocked microphone; or an impact of the respective
blocked microphone to an audio output (e.g. audio degradation). A priority order for
unblocking multiple blocked microphones may be provided.
[0006] In some example embodiments, a nature of the at least one first visual indication
of the blocked microphone(s) is based, at least in part, on one or more of: whether
the respective blocked microphone is on a front face of the audio capture device;
whether the respective blocked microphone is on a rear face of the audio capture device;
whether the respective blocked microphone is neither on the front face nor on the
rear face of the audio capture device; whether the respective blocked microphone is
on a face of the audio capture device visible to and/or being viewed by the user;
a distance between the first visual indication and the position of the respective
blocked microphone; or a microphone type of the respective blocked microphone.
[0007] Some example embodiments further comprise means for performing: determining a position
of a user gaze on the display of the audio capture device. For example, means for
performing: providing a further visual indication on the display directing the user
gaze from the position of the user gaze to the position of the at least one first
visual indication may be provided. Some example embodiments further comprise means
for performing: determining whether to provide the further visual indication on the
display based, at least in part, on one or more of: a distance between the user gaze
and the at least one first visual indication (e.g. whether the distance is above a
threshold distance); an impact of the respective blocked microphone to an audio output;
or whether the at least one first visual indication is provided on a part of the display
that is blocked.
[0008] The at least one first visual indication may be based on an impact of the respective
blocked microphone on possible audio formats.
[0009] Some example embodiments further comprise means for performing: setting an audio
capture mode based, at least in part, on the at least one blocked microphone (e.g.
based on reduced performance capabilities). A user may be required to confirm a change
in audio capture mode (or to cease blocking the relevant microphone(s)).
[0010] The at least one first visual indication may be provided on the display only in the
event that a selected audio mode is not possible due to the respective blocked microphone.
[0011] Some example embodiments further comprise means for performing: providing a virtual
indication to the user on an augmented reality display, wherein the virtual indication
directs a user gaze to the at least one first visual indication.
[0012] Some example embodiments further comprise means for performing: providing the at least
one first visual indication on the display in the determined position(s).
[0013] The means may comprise: at least one processor; and at least one memory including
computer program code, the at least one memory and the computer program code configured,
with the at least one processor, to cause the performance of the apparatus.
[0014] In a second aspect, this specification describes an audio capture device comprising:
a plurality of microphones for capturing audio; a display for outputting at least
one first visual indication of at least one blocked microphone in response to a determination
that one of the plurality of microphones is blocked, wherein the display provides the
at least one first visual indication based, at least in part, on a location of the
respective blocked microphone on the audio capture device, wherein the position of
the at least one first visual indication corresponds more closely to the position
of the respective blocked microphone than to the position of any other microphone
of the multi-microphone audio capture device.
[0015] One or more microphones of the plurality may be on a front side of the audio capture
device and one or more other microphones of the plurality may be on a rear side of
the audio capture device.
[0016] The audio capture device may comprise means (such as one or more sensors) for performing:
determining whether any part of the display is blocked (e.g. such that the relevant
part of the display is not visible to the user), wherein the position of the at least
one first visual indication of the blocked microphone(s) is based, at least in part,
on a location of any part of the display that is blocked.
[0017] The audio capture device may comprise means for performing: determining a position
of a user gaze on the display of the audio capture device.
[0018] The at least one first visual indication may be provided on the display only in the
event that a selected audio mode is not possible due to the respective blocked microphone.
[0019] The audio capture device may further comprise any aspect of the apparatus as described
with reference to the first aspect.
[0020] In a third aspect, this specification describes a method comprising: identifying
at least one microphone of a multi-microphone audio capture device that is blocked;
and determining a position of at least one first visual indication of the at least
one blocked microphone for output to a user on a display of the audio capture device
based, at least in part, on a location of the respective microphone on the audio capture
device, wherein the position of the at least one first visual indication corresponds
more closely to the position of the respective blocked microphone than to the position
of any other microphone of the multi-microphone audio capture device.
[0021] Some example embodiments further comprise: determining whether any part of the display
is blocked (e.g. such that the relevant part of the display is not visible to the
user), wherein the position of the at least one first visual indication of the blocked
microphone(s) is based, at least in part, on a location of any part of the display
that is blocked. The position of the at least one first visual indication of the blocked
microphone(s) may be such that at least part of the or each first visual indication
is on a part of the display not blocked. The position of the at least one first visual
indication of the blocked microphone(s) may be such that the or each first visual
indication corresponds as closely as possible to the position of the blocked microphone.
[0022] The method may further comprise determining an extent of the at least one first visual
indication of the blocked microphone(s). The extent of the at least one first visual
indication of the blocked microphone(s) may be based, at least in part, on a location
of any part of the display that is blocked. The extent of the at least one first visual
indication of the blocked microphone(s) may be based, at least in part, on one or
more of: the location of the respective blocked microphone; a functionality of the
respective blocked microphone; or an impact of the respective blocked microphone to
an audio output (e.g. audio degradation). A priority order for unblocking multiple
blocked microphones may be provided.
[0023] In some example embodiments, a nature of the at least one first visual indication
of the blocked microphone(s) is based, at least in part, on one or more of: whether
the respective blocked microphone is on a front face of the audio capture device;
whether the respective blocked microphone is on a rear face of the audio capture device;
whether the respective blocked microphone is neither on the front face nor on the
rear face of the audio capture device; whether the respective blocked microphone is
on a face of the audio capture device visible to and/or being viewed by the user;
a distance between the first visual indication and the position of the respective
blocked microphone; or a microphone type of the respective blocked microphone.
[0024] The method may further comprise: determining a position of a user gaze on the display
of the audio capture device. For example, a further visual indication may be provided on
the display, directing the user gaze from the position of the user gaze to the position
of the at least one first visual indication. Some example embodiments
further comprise: determining whether to provide the further visual indication on the display
based, at least in part, on one or more of: a distance between the user gaze and the
at least one first visual indication (e.g. whether the distance is above a threshold
distance); an impact of the respective blocked microphone to an audio output; or whether
the at least one first visual indication is provided on a part of the display that
is blocked.
[0025] The at least one first visual indication may be based on an impact of the respective
blocked microphone on possible audio formats.
[0026] Some example embodiments further comprise: setting an audio capture mode based, at
least in part, on the at least one blocked microphone (e.g. based on reduced performance
capabilities). A user may be required to confirm a change in audio capture mode (or
to cease blocking the relevant microphone(s)).
[0027] The at least one first visual indication may be provided on the display only in the
event that a selected audio mode is not possible due to the respective blocked microphone.
Some example embodiments further comprise: providing a virtual indication to the user
on an augmented reality display, wherein the virtual indication directs a user gaze
to the at least one first visual indication.
[0028] Some example embodiments further comprise: providing the at least one first visual
indication on the display in the determined position(s).
[0029] In a fourth aspect, this specification describes an apparatus configured to perform
(at least) any method as described with reference to the third aspect.
[0030] In a fifth aspect, this specification describes computer-readable instructions which,
when executed by a computing apparatus, cause the computing apparatus to perform (at
least) any method as described with reference to the third aspect.
[0031] In a sixth aspect, this specification describes a computer-readable medium (such
as a non-transitory computer-readable medium) comprising program instructions stored
thereon for performing (at least) any method as described with reference to the third
aspect.
[0032] In a seventh aspect, this specification describes an apparatus comprising: at least
one processor; and at least one memory including computer program code which, when
executed by the at least one processor, causes the apparatus to perform (at least)
any method as described with reference to the third aspect.
[0033] In an eighth aspect, this specification describes a computer program comprising instructions
for causing an apparatus to perform at least the following: identifying at least one
microphone of a multi-microphone audio capture device that is blocked; and determining
a position of at least one first visual indication of the at least one blocked microphone
for output to a user on a display of the audio capture device based, at least in part,
on a location of the respective microphone on the audio capture device, wherein the
position of the at least one first visual indication corresponds more closely to the
position of the respective blocked microphone than to the position of any other microphone
of the multi-microphone audio capture device.
[0034] In a ninth aspect, this specification describes an apparatus comprising: one or more
sensors (or some other means) for identifying at least one microphone of a multi-microphone
audio capture device that is blocked; and a display controller (or some other means)
for determining a position of at least one first visual indication of the at least
one blocked microphone for output to a user on a display of the audio capture device
based, at least in part, on a location of the respective microphone on the audio capture
device, wherein the position of the at least one first visual indication corresponds
more closely to the position of the respective blocked microphone than to the position
of any other microphone of the multi-microphone audio capture device.
Brief description of the drawings
[0035] Example embodiments will now be described, by way of example only, with reference
to the following schematic drawings, in which:
FIG. 1 is a front view of a device in accordance with an example embodiment;
FIG. 2 is a rear view of the device of FIG. 1;
FIG. 3 is a block diagram showing an example use of the device of FIGS. 1 and 2 in
accordance with an example embodiment;
FIG. 4 is a block diagram of a device in accordance with an example embodiment;
FIG. 5 is a flow chart showing an algorithm in accordance with an example embodiment;
FIG. 6 shows a view from above of the device of FIGS. 1 and 2 in accordance with an
example embodiment;
FIG. 7 is a block diagram of a device in accordance with an example embodiment;
FIG. 8 is a block diagram of a display in accordance with an example embodiment;
FIG. 9 is a flow chart showing an algorithm in accordance with an example embodiment;
FIG. 10 shows a view from above of the device of FIGS. 1 and 2 in accordance with
an example embodiment;
FIG. 11 is a block diagram of a device in accordance with an example embodiment;
FIG. 12 is a block diagram of a display in accordance with an example embodiment;
FIG. 13 is a flow chart showing an algorithm in accordance with an example embodiment;
FIG. 14 shows a view from above of the device of FIGS. 1 and 2 in accordance with
an example embodiment;
FIG. 15 is a block diagram of a display in accordance with an example embodiment;
FIG. 16 is a flow chart showing an algorithm in accordance with an example embodiment;
FIG. 17 is a block diagram of a device in accordance with an example embodiment;
FIG. 18 is a block diagram of a display in accordance with an example embodiment;
FIG. 19 is a block diagram showing an example use of the device of FIGS. 1 and 2 in
accordance with an example embodiment;
FIG. 20 is a block diagram of a display in accordance with an example embodiment;
FIG. 21 is a block diagram of a display in accordance with an example embodiment;
FIG. 22 is a flow chart showing an algorithm in accordance with an example embodiment;
FIG. 23 is a block diagram of a display in accordance with an example embodiment;
FIG. 24 is a block diagram of a display in accordance with an example embodiment;
FIG. 25 is a flow chart showing an algorithm in accordance with an example embodiment;
FIG. 26 is a block diagram showing an example use of the device of FIGS. 1 and 2 in
accordance with an example embodiment;
FIG. 27 is a block diagram of components of a system in accordance with an example
embodiment; and
FIGS. 28A and 28B show tangible media, respectively a removable non-volatile memory
unit and a compact disc (CD) storing computer-readable code which when run by a computer
performs operations according to an example embodiment.
Detailed description
[0036] The scope of protection sought for various embodiments of the invention is set out
by the independent claims. The embodiments and features, if any, described in the
specification that do not fall under the scope of the independent claims are to be
interpreted as examples useful for understanding various embodiments of the invention.
[0037] FIG. 1 is a front view of a device, indicated generally by the reference numeral
10, in accordance with an example embodiment. FIG. 2 is a rear view of the device
10.
[0038] As shown in FIG. 1, a display 11 is provided on the front side of the device 10.
As shown in FIG. 2, a camera hump 12 is provided on the rear side of the device 10.
The display 11 may be provided by a screen of the device 10.
[0039] The device 10 includes six microphones (although alternative embodiments could include
more or fewer microphones). As shown in FIG. 1, a first microphone 21 is provided
near an earpiece of the device 10, and second and third microphones 22 and 23 are
provided near the bottom of the device 10. As shown in FIG. 2, fourth and fifth microphones
24 and 25 are provided at opposite ends of the rear side and a sixth microphone 26
forms part of the camera hump 12.
[0040] FIG. 3 is a block diagram, indicated generally by the reference numeral 30, showing
an example use of the device 10 in accordance with an example embodiment.
[0041] The system 30 shows a user 32 holding the device 10 whilst capturing audio-visual
content. For example, the user 32 may be recording a video with spatial audio capture.
[0042] As shown in FIG. 3, the right hand 34 of the user 32 is blocking some of the microphones
of the device 10 (e.g. the second and third microphones 22 and 23 at the bottom of
the device). While it is often apparent to the user if they are blocking the camera,
it can be quite difficult for the user to observe that they are blocking any of the
microphones. For example, spatial audio monitoring is seldom carried out when a user
records user generated content (UGC). Furthermore, it can be difficult for a user
to hear a specific spatial audio problem in some circumstances (e.g. when there is no
apparent sound source in the direction that is mainly affected).
[0043] As discussed in detail below, some type of indication can be provided to alert the
user to the blockage of one or more microphones. For example, an indication of which
of a plurality of microphones is blocked and/or the impact of the blockage may be
provided. For example, the indication may seek to prevent microphones that are particularly
important for high-quality capture of a currently selected spatial audio format from
being blocked.
[0044] FIG. 4 is a block diagram of a device, indicated generally by the reference numeral
40, in accordance with an example embodiment. The device 40 is a simplified illustration
of the device 10 described above. In the device 40, the positions of the microphones
21 to 26 are indicated by the labels 1 to 6 respectively. The first to third microphones
(that are on the front side of the device) are indicated with filled circles and the
fourth to sixth microphones (that are on the rear side of the device) are indicated
with empty circles (or donut shapes).
[0045] FIG. 5 is a flow chart showing an algorithm, indicated generally by the reference
numeral 50, in accordance with an example embodiment.
[0046] The algorithm 50 starts at operation 52 where a microphone of a multi-microphone
audio capture device (such as the device 10) that is blocked is identified. A microphone
may be blocked by a user of the device (as in the system 30). The user might not know
where the microphones are and hence may accidentally block a microphone. Of course,
two or more blocked microphones may be identified in the operation 52. One or more
sensors (or some other means) may be provided for identifying blocked microphone(s).
[0047] At operation 54, a position of a first visual indication of the blocked microphone(s)
for output to a user on a display of the audio capture device (such as the display
11) is determined. The position of a visual indication may be determined (for example
by a display controller or some other means) based, at least in part, on a location
of the blocked microphone on the audio capture device. For example, the position of
the first visual indication may correspond more closely to the position of the blocked
microphone than to the position of any other microphone of the multi-microphone audio
capture device. The display may be an integrated display (as in the device 10), but
this is not essential to all example embodiments. For example, the display may be
a separate display (e.g. a monitor or a headset display).
[0048] At operation 56, the first visual indication is provided on the display in the position
determined in the operation 54.
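By way of illustration only, the operations 52 to 56 may be sketched as follows. The sketch assumes that each microphone location (including those on the rear side) is expressed as a point in the plane of the front face, and that the indication is placed at the point of the display nearest to the blocked microphone. The coordinate values, the display rectangle and the names Mic, indication_position and closest_mic are hypothetical and are not taken from this specification.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Mic:
    label: int   # position label as used in FIG. 4 (1 to 6)
    x: float     # assumed device-plane coordinates, in millimetres
    y: float


def clamp(value, low, high):
    """Limit `value` to the closed interval [low, high]."""
    return max(low, min(value, high))


def indication_position(blocked, display):
    """Operation 54 (sketch): nearest point of the display to the blocked microphone.

    `display` is (left, top, right, bottom) in the same coordinates as the microphones.
    """
    left, top, right, bottom = display
    return (clamp(blocked.x, left, right), clamp(blocked.y, top, bottom))


def closest_mic(point, mics):
    """Return the microphone whose location is nearest to `point`."""
    return min(mics, key=lambda m: hypot(m.x - point[0], m.y - point[1]))


# Hypothetical layout: a 70 mm x 150 mm device with the display inset by 5 mm.
MICS = [Mic(1, 35, 2), Mic(2, 20, 148), Mic(3, 50, 148),
        Mic(4, 10, 3), Mic(5, 60, 3), Mic(6, 15, 25)]
DISPLAY = (5, 5, 65, 145)

blocked = MICS[0]                               # operation 52: microphone 1 reported blocked
pos = indication_position(blocked, DISPLAY)     # operation 54: where to draw the indication
nearest = closest_mic(pos, MICS)                # check: is the blocked microphone the nearest?
print(pos, nearest.label)                       # operation 56 would draw at `pos`
```

In this assumed layout the chosen position is nearer to the blocked microphone than to any other microphone; other layouts may require the further checks described below.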
[0049] FIG. 6 shows a view from above of the device 10 in accordance with an example embodiment.
The camera hump 12 is visible on the rear side of the device 10. A finger 62 of a
user is shown on the front side of the device 10. As discussed further below, the
finger blocks the first microphone 21.
[0050] FIG. 7 is a block diagram of a device, indicated generally by the reference numeral
70, in accordance with an example embodiment. The device 70 is a simplified illustration
of the device 10 and is similar to the device 40 described above.
[0051] The device 70 includes a representation 72 of the user's finger (i.e. the finger
62 described above) that is blocking the first microphone 21 at position 1 of the
device 10. Note that the fifth microphone 25 at position 5 is not blocked since that
microphone is on the rear of the device 10 and the user's finger is covering a portion
of the front of the device only.
[0052] In an implementation of the algorithm 50, the operation 52 determines that the first
microphone 21 at position 1 is blocked. In operation 54, a position of a visual indication
of the blocked microphone for output to a user on a display of the audio capture device
is determined.
[0053] FIG. 8 is a block diagram of a display, indicated generally by the reference numeral
80, in accordance with an example embodiment. The display 80 may be the display 11
of the device 10 described above.
[0054] The display 80 includes a first visual indication 82 provided in an example implementation
of the operation 56 of the algorithm 50. Also shown in FIG. 8 is the representation
of the user's finger 72 such that the position and extent of the visual indication
82 relative to the user's finger can be seen.
[0055] In this way, a visual indication is provided to the user. This indication may be
provided as close as possible to the position of the blocked microphone (the first
microphone 21 in this example).
[0056] Thus, the device 10 is an audio capture device (e.g. an audio-visual capture device)
having a plurality of microphones for capturing audio and a display for outputting
a first visual indication of a blocked microphone in response to a determination that
one of the plurality of microphones is blocked. The display provides the first visual
indication based, at least in part, on a location of the microphone on the audio capture
device, wherein the position of the first visual indication corresponds more closely
to the position of the blocked microphone than to the position of any other microphone
of the multi-microphone audio capture device.
[0057] FIG. 9 is a flow chart showing an algorithm, indicated generally by the reference
numeral 90, in accordance with an example embodiment.
[0058] The algorithm 90 starts at operation 92, where a determination is made regarding
whether any part of the display is blocked (e.g. by the user), such that the relevant
part of the display is not visible to the user. A region of the display being blocked
may be determined, for example based on camera-based detection, hover-based detection,
touchscreen-based detection, or any other suitable method.
[0059] At operation 94, the position of the first visual indication of the blocked microphone
is based, at least in part, on a location of any part of the display that is blocked
(i.e. not visible to the user).
[0060] Thus, consideration may be given to parts of the display that are blocked when providing
the indication, for example by providing the indication as close as possible to the
microphone position in a visible part of the display. In the example display 80, the
user's finger does not block a very large portion of the display, and therefore the
indication location need not be changed. However, the system may, in some example
embodiments, modify the extent (e.g. size) of the visual indication in order to
make it clearly visible and intuitive enough for the user to understand which microphone
is intended. The system may also check that the indication is not closer to an unblocked
microphone than the blocked microphone. If this is not possible, then some other indication
may be provided. For example, the indication may be provided as close as possible
to the blocked microphone within the visible part of the display. In this configuration,
the indication may take a different form (e.g. a different colour or the form of an
arrow pointing to the blocked microphone) to distinguish this scenario from a default
condition of providing the indication not closer to an unblocked microphone than a
blocked microphone.
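One possible way of combining these considerations is sketched below, purely as an illustration. The sketch assumes that the blocked parts of the display are reported as rectangles and that candidate positions are sampled on a regular grid; neither assumption comes from this specification. The function name place_indication and the form labels "default" and "arrow" are hypothetical.

```python
from math import hypot


def place_indication(blocked, others, display, hidden, step=1.0):
    """Pick the visible display point nearest to the blocked microphone.

    blocked: (x, y) of the blocked microphone
    others:  list of (x, y) locations of the remaining microphones
    display: (left, top, right, bottom) of the display
    hidden:  list of (left, top, right, bottom) rectangles not visible to the user
    Returns (point, form), or (None, None) if no sampled point is visible.
    """
    def is_hidden(p):
        return any(l <= p[0] <= r and t <= p[1] <= b for l, t, r, b in hidden)

    def dist(p, q):
        return hypot(p[0] - q[0], p[1] - q[1])

    left, top, right, bottom = display
    best = None
    y = top
    while y <= bottom:
        x = left
        while x <= right:
            p = (x, y)
            # Keep the visible sample that is nearest to the blocked microphone
            if not is_hidden(p) and (best is None or dist(p, blocked) < dist(best, blocked)):
                best = p
            x += step
        y += step
    if best is None:
        return None, None
    # Use a different form if some other microphone is nearer to the indication
    ambiguous = any(dist(best, m) < dist(best, blocked) for m in others)
    return best, ("arrow" if ambiguous else "default")


display = (5, 5, 65, 145)
# A hand covering the bottom strip of the display: the indication moves up the display.
print(place_indication((20, 148), [(50, 148)], display, [(5, 125, 65, 145)]))
# A hand covering the lower-left corner: the nearest visible point is closer to
# the unblocked microphone, so the alternative form is selected.
print(place_indication((20, 148), [(50, 148)], display, [(5, 100, 40, 145)]))
```

An exhaustive grid search is used here only for clarity; an implementation might instead search outwards from the microphone location or use the geometry of the blocked region directly.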
[0061] In some example embodiments, if an indication of a blocked microphone is only partially
blocked, then it is not modified. This may be advantageous since the user may move their
hand to prevent the indication from being partially blocked and may unblock the respective
microphone in the process.
[0062] FIG. 10 shows a view from above of the device 10 in accordance with an example embodiment.
The camera hump 12 is visible on the rear side of the device 10. A hand 102 of a user
is shown. As discussed further below, the user blocks the second microphone 22 and
third microphone 23 and also obscures some of the display 11.
[0063] FIG. 11 is a block diagram of a device, indicated generally by the reference numeral
110, in accordance with an example embodiment. The device 110 is a simplified illustration
of the device 10 and is similar to the devices 40 and 70 described above. FIG. 11
shows a scenario in which a user's hand 112 is grabbing the device at the edge and
on top of the display.
[0064] The device 110 includes a representation of the user's hand 112 (i.e. the hand 102
described above) that is blocking the second and third microphones 22 and 23 at positions
2 and 3 of the device 10. Note that the fourth microphone 24 at position 4 (on the
rear of the device 10) is not blocked.
[0065] FIG. 12 is a block diagram of a display, indicated generally by the reference numeral
120, in accordance with an example embodiment. The display 120 may be the display
11 of the device 10 described above.
[0066] The display 120 includes a first visual indication 122 and a second visual indication
123 provided in an example implementation of the operation 56 of the algorithm 50.
Also shown in FIG. 12 is the representation of the user's hand 112, such that the
position and extent of the first and second visual indications 122, 123 relative to
the user's hand can be seen.
[0067] In this way, a visual indication is provided to the user. This indication may be
provided as close as possible to the position of the blocked microphones (the second
and third microphones 22 and 23 at positions 2 and 3 in this example). However, the
positions of the indications may take into account the partial blocking of the display
by the user's hand 112. In this example, even though the microphones are at the very
edge of the device, the indications are moved slightly towards the centre such that
the user can see them. If the user were now to drag their hand towards the edge of the device,
the indications could correspondingly move towards the edge to show which microphones
are blocked. Note that if the visual indications are only partially blocked, then
they may not be modified in this way in some example embodiments.
[0068] FIG. 13 is a flow chart showing an algorithm, indicated generally by the reference
numeral 130, in accordance with an example embodiment. The algorithm 130 may be used
to generate the display 120 described above.
[0069] The algorithm 130 starts at operation 52 where (as described above with respect to
the algorithm 50) one or more microphones of a multi-microphone audio capture device
(such as the device 10) that are blocked are identified. In an implementation of the
algorithm 130, the operation 52 determines that the second and third microphones 22
and 23 are blocked.
[0070] The algorithm moves to operation 54 where, as described above, a position of a first
visual indication of the blocked microphone for output to a user on a display of the
audio capture device is determined.
[0071] In operation 132 of the algorithm 130, the nature and/or the extent of the visual
indication of the blocked microphone(s) is determined.
[0072] Finally, at operation 134, the visual indication(s) (such as the first and second
visual indications 122 and 123 described above) are provided on the display.
[0073] The extent of the visual indication of a blocked microphone may be based, at least
in part, on a location of any part of the display that is blocked by the user. Thus,
as described above, the first and second visual indications 122 and 123 may be moved
to avoid the portion of the display blocked by the user's finger/hand.
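The repositioning described above can be sketched as follows. This is a minimal illustration only, assuming that the part of the display blocked by the user is available as an axis-aligned rectangle (for example, derived from touch input). The names Rect and place_indication, the coordinate convention and the stepping scheme are hypothetical and are not taken from the example embodiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in display coordinates (hypothetical helper)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def place_indication(nominal, blocked, display, steps=100):
    """Move an indication from its nominal position towards the display centre
    until it is no longer inside the blocked region.

    nominal: (x, y) display position closest to the blocked microphone.
    blocked: Rect covered by the user's finger/hand, or None if nothing is covered.
    display: Rect describing the whole display.
    """
    cx = (display.left + display.right) / 2
    cy = (display.top + display.bottom) / 2
    x0, y0 = nominal
    for i in range(steps + 1):
        t = i / steps
        x = x0 + t * (cx - x0)
        y = y0 + t * (cy - y0)
        if blocked is None or not blocked.contains(x, y):
            return (x, y)
    # Assumption: fall back to the display centre if every sampled point is covered.
    return (cx, cy)
```

Because the search starts at the nominal position and stops at the first uncovered point, the indication moves only as far from the microphone as the sampling step requires. If the user's hand moves away, the same call returns the nominal position again.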
[0074] Alternatively, or in addition, the extent of the visual indication of the blocked
microphone may be based, at least in part, on one or more of:
- the location of the blocked microphone;
- a functionality of the blocked microphone; and
- an impact of the blocked microphone to an audio output.
[0075] Alternatively, or in addition, the nature of the visual indication of the blocked
microphone may be based, at least in part, on one or more of:
- whether the blocked microphone is on a front face of the audio capture device;
- whether the blocked microphone is on a rear face of the audio capture device;
- whether the blocked microphone is neither on the front face nor on the rear face of
the audio capture device;
- whether the blocked microphone is on a face of the audio capture device visible to
and/or being viewed by the user;
- a distance between the first visual indication and the position of the blocked microphone;
and
- a microphone type of the blocked microphone.
[0076] FIG. 14 shows a view from above of the device 10 in accordance with an example embodiment.
The camera hump 12 is visible on the rear side of the device 10. A finger or hand
142 of a user is shown. As discussed further below, the user blocks the second microphone
22, the third microphone 23 and the fourth microphone 24 and also obscures some of
the display 11. Thus, microphones on both the front and the rear of the device 10
are blocked.
[0077] FIG. 15 is a block diagram of a display, indicated generally by the reference numeral
150, in accordance with an example embodiment. The display 150 may be the display
11 of the device 10 described above.
[0078] The display 150 includes a first visual indication 152, a second visual indication
153 and a third visual indication 154 provided in an example implementation of the
operation 56 of the algorithm 50. Also shown in FIG. 15, the representation of the
user's finger or hand 142 is such that the position and extent of the visual indications
152 to 154 relative to the user's finger or hand 142 can be seen.
[0079] The third visual indication 154 corresponding to the fourth microphone 24 is different
to the first and second visual indications 152 and 153 corresponding to the second
and third microphones 22 and 23, since the fourth microphone is on the rear of the
device and the second and third microphones are on the front.
[0080] By way of example, the display 150 shown in FIG. 15 may be provided on the front
of the device, with the visual indications 152 and 153 being provided in solid form
and the visual indication 154 in dotted form. Of course, many alternative display
types are possible, such as the use of different colours to distinguish between which
side of the device a particular blocked microphone is on.
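The choice of indication form by device face can be sketched as follows. This is an illustrative sketch only: the Face enumeration, the style table and the function indication_styles are hypothetical names, and the "dashed" style for microphones on neither the front nor the rear face is an assumption, since the example above only describes solid and dotted forms.

```python
from enum import Enum


class Face(Enum):
    FRONT = "front"
    REAR = "rear"
    OTHER = "other"  # e.g. an edge of the device


# Hypothetical style table: solid for the front face and dotted for the rear
# face follow the FIG. 15 example; "dashed" for other faces is an assumption.
STYLE_BY_FACE = {
    Face.FRONT: "solid",
    Face.REAR: "dotted",
    Face.OTHER: "dashed",
}


def indication_styles(blocked_mics, mic_faces):
    """Map each blocked microphone to the drawing style of its indication."""
    return {mic: STYLE_BY_FACE[mic_faces[mic]] for mic in blocked_mics}


# Usage mirroring FIG. 15: microphones 22 and 23 on the front, 24 on the rear.
faces = {22: Face.FRONT, 23: Face.FRONT, 24: Face.REAR}
styles = indication_styles([22, 23, 24], faces)
```

A colour table could be substituted for the style table without changing the function.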
[0081] Some example embodiments make use of gaze tracking. This can be particularly useful
in the case of larger displays. For example, a user may be looking at a first part of
the display while blocking one or more microphones relating to a second part of the display.
Gaze tracking can be used to obtain the position of the user's gaze on the display
and, in some circumstances, direct the user's attention to an indication of a blocked
microphone.
[0082] FIG. 16 is a flow chart showing an algorithm, indicated generally by the reference
numeral 160, in accordance with an example embodiment.
[0083] FIG. 17 is a block diagram of a device, indicated generally by the reference numeral
170, in accordance with an example embodiment. The device 170 is a simplified illustration
of the device 10 and is similar to the devices 40, 70 and 110 described above. FIG.
17 shows a scenario in which a user's hand 172 is grabbing the device at one edge
of the display but the user is looking at the other side of the display. As discussed further
below, the device 170 may be used in an implementation of the algorithm 160.
[0084] FIG. 18 is a block diagram of a display, indicated generally by the reference numeral
180, in accordance with an example embodiment. The display 180 may be the display
11 of the device 10 described above. The display 180 includes first visual indications
182 and 183 relating to the positions of blocked microphones, as discussed above. The display
also includes a further visual indication 184 discussed further below.
[0085] The algorithm 160 starts at operation 52 where, as described above, a microphone
of a multi-microphone audio capture device (such as the device 10) that is blocked
is identified. Then, at operation 54, a position of a first visual indication of the
blocked microphone for output to a user on a display of the audio capture device (such
as the display 11) is determined. The position may be determined based, at least in
part, on a location of the blocked microphone on the audio capture device.
[0086] By way of example, the second and third microphones 22 and 23 of the device 10 may
be blocked by the user's hand 172 (as shown in FIG. 17).
[0087] At operation 162 of the algorithm 160, a position of a user gaze on the display of
the audio capture device is determined. An example user gaze position 174 is shown
in FIG. 17. The user gaze position may be determined, for example, by imaging the
pupils of the user; alternative methods will be apparent to those of ordinary skill
in the art.
[0088] At operation 164 of the algorithm 160, a determination is made regarding whether
a further visual indication directing the user gaze from the position of the user
gaze to the position of the first visual indication should be provided on the display.
The further visual indication 184 shown in FIG. 18 is an example of such a further
visual indication.
[0089] The decision made in the operation 164 may be based on a variety of factors. These
may include one or more of:
- a distance between the user gaze and the first visual indication(s), such as whether
the distance is above a threshold distance;
- an impact of the blocked microphone to an audio output; and
- whether the first visual indication is provided on a part of the display that is blocked
by the user.
[0090] If, in the operation 164, a decision is taken to provide a further visual indication,
then the algorithm 160 moves to operation 166, where the first and further visual
indications are provided (such as the first visual indications 182 and 183, and the
further visual indication 184 shown in FIG. 18).
[0091] Alternatively, if, in the operation 164, a decision is taken not to provide a further
visual indication, then the algorithm 160 moves to operation 168, where only the first
visual indication is provided (such that the further visual indication 184 shown
in FIG. 18 is not provided).
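The decision of operation 164 and the two outcomes (operations 166 and 168) can be sketched as follows. This is a hedged sketch rather than a definitive implementation: the function names, the use of a Euclidean distance in display coordinates, and the combination of the listed factors with a logical OR are all assumptions, since the factors are described only as "one or more of" the possible inputs.

```python
import math


def needs_further_indication(gaze, indication, threshold,
                             indication_hidden=False, high_impact=False):
    """Operation 164 (sketch): decide whether a gaze-directing cue is needed.

    gaze, indication: (x, y) positions on the display.
    threshold: distance above which the user is assumed not to notice the
        first visual indication (hypothetical tuning parameter).
    indication_hidden: the first indication lies on a blocked part of the display.
    high_impact: the blocked microphone has a large impact on the audio output.
    """
    far_away = math.dist(gaze, indication) > threshold
    return far_away or indication_hidden or high_impact


def indications_to_show(gaze, indication, threshold, **factors):
    """Return which indications to render."""
    if needs_further_indication(gaze, indication, threshold, **factors):
        return ["first", "further"]  # operation 166
    return ["first"]                 # operation 168
```

In the scenario of FIG. 17, the gaze position 174 is on the opposite side of the display from the blocked microphones, so the distance test alone would select operation 166.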
[0092] FIG. 19 is a block diagram, indicated generally by the reference numeral 190, showing
an example use of the device 10 in accordance with an example embodiment.
[0093] The block diagram 190 includes the user 32 holding the device 10 whilst capturing
audio-visual content, as described above. As in the example described above, the right
hand 34 of the user 32 may be blocking some of the microphones of the device 10.
[0094] The spatial audio capture of the microphones of the device 10 may determine 3D audio
consisting of audio in a horizontal direction 192 and audio in a vertical direction
194. In the event that one or more of the microphones is blocked, the audio in one
direction (e.g. the vertical direction 194) may be degraded. Visual information may
be provided that is based on the impact of the blocked microphone(s) on the 3D audio
format that can be provided.
[0095] FIG. 20 is a block diagram of a display, indicated generally by the reference numeral
200, in accordance with an example embodiment. The display 200 provides a user interface
consisting of a horizontal audio indication 202 and a vertical audio indication 204.
The horizontal audio indication is shown in solid form, but the vertical audio indication
is shown in dashed form. This output may be used to indicate that one or more blocked
microphones is inhibiting the vertical direction of a 3D audio format. Thus, the user
interface may provide a visual indication based on an impact of the blocked microphone(s)
on possible audio formats. Of course, the solid and dashed forms are provided by way
of example only; many alternative display types are possible.
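One way to derive the solid and dashed forms of the indications 202 and 204 is sketched below. The mapping from microphones to the directions they support is a hypothetical input (the microphone labels "m1" to "m4" and the function axis_styles are illustrative names only); how a real device determines which microphones contribute to which direction is not specified here.

```python
def axis_styles(blocked, axis_mics):
    """Return a drawing style per audio direction.

    blocked: set of blocked microphone labels.
    axis_mics: dict mapping a direction name to the set of microphones
        assumed to be needed for that direction (hypothetical mapping).
    A direction is drawn dashed if any microphone it relies on is blocked.
    """
    return {
        axis: ("dashed" if mics & blocked else "solid")
        for axis, mics in axis_mics.items()
    }


# Illustrative mapping only; the labels are not taken from the figures.
AXIS_MICS = {
    "horizontal": {"m1", "m2"},
    "vertical": {"m3", "m4"},
}

styles_fig20 = axis_styles({"m3"}, AXIS_MICS)
```

With the microphone "m3" blocked, the horizontal indication remains solid and the vertical indication becomes dashed, matching the situation shown in FIG. 20.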
[0096] FIG. 21 is a block diagram of a display, indicated generally by the reference numeral
210, in accordance with an example embodiment. The display 210 provides the horizontal
audio indication 202 and the vertical audio indication 204 described above. In addition,
the display 210 provides a visual indication 212 of the location of a blocked microphone.
The visual indication 212 may be provided in accordance with the principles described
in detail above.
[0097] In some example embodiments, the user interface indicates to the user when audio degradation
happens due to blocked microphone(s). For example, a system may allow blocking of
at least one microphone if this blocking does not affect (or does not significantly
affect) the (spatial) audio capture quality according to at least one of: the current
spatial audio content format and the current device orientation (e.g., portrait or landscape,
or free orientation).
[0098] By way of example, the system can indicate to the user at least one of: degradation of
capture according to the current spatial audio format (as shown in FIG. 20), a recommended
switch to a lesser spatial audio format, or no ability to switch to a higher spatial audio
format (if the blocking is not removed).
[0099] In order to provide the displays 200 or 210 described above, a control module may
be provided that:
- Determines at least one microphone that is currently blocked (e.g., based on any suitable
microphone blocking detection technique).
- Obtains at least currently selected (spatial) audio format information. This step
may also obtain, e.g., at least one lesser or one higher (spatial) audio format.
- Determines whether the at least one blocked microphone affects at least the currently
selected (spatial) audio format capture performance. This step may also determine,
e.g., whether the at least one blocked microphone affects the at least one lesser
or one higher (spatial) audio format capture performance.
- If the at least one blocked microphone affects the capture performance, an indication
of the fact that audio capture according to current format is degraded and optionally
how the audio capture is degraded may be provided (e.g. using the horizontal and vertical
audio indications 202 and 204 described above).
- If the at least one blocked microphone affects the capture performance, the blocked
microphone may be indicated (e.g. as shown in FIG. 21).
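The steps of the control module listed above can be sketched as follows. This is a minimal sketch under stated assumptions: each audio format is modelled simply as the set of microphones it requires, which is a simplification introduced for illustration, and the names assess_formats, required_by_format and the labels "m1" to "m4" are hypothetical.

```python
def assess_formats(blocked, required_by_format, current):
    """Evaluate the effect of blocked microphones on audio formats.

    blocked: set of blocked microphone labels (from any suitable detector).
    required_by_format: dict mapping a format name to the set of microphones
        that format is assumed to need (hypothetical model).
    current: name of the currently selected format.
    """
    # A format counts as affected if it needs at least one blocked microphone.
    affected = {
        fmt: bool(required & blocked)
        for fmt, required in required_by_format.items()
    }
    # Only microphones relevant to the current format are indicated.
    mics_to_indicate = sorted(required_by_format[current] & blocked)
    return {
        "current_degraded": affected[current],
        "affected": affected,
        "mics_to_indicate": mics_to_indicate,
    }


# Illustrative format table; real requirements depend on the device.
REQUIRED = {
    "stereo": {"m1", "m2"},
    "planar_foa": {"m1", "m2", "m3"},
    "foa": {"m1", "m2", "m3", "m4"},
}

report = assess_formats({"m4"}, REQUIRED, "foa")
```

Here the report shows that the current format is degraded, that the lesser formats are unaffected, and that only the microphone "m4" needs to be indicated to the user.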
[0100] Many variants are possible. For example, the control module may indicate an availability
to switch to at least one lower or one higher (spatial) audio format without or with
degradation due to at least one blocked microphone. Alternatively, or in addition,
the control module may indicate changing the device orientation to overcome the effect of
at least one microphone being blocked (as discussed further below).
[0101] In some examples, the severity of a degrading effect may be indicated. Example indications
of how a specific format may be degraded include: indication that height information
will be lost, indication that differentiating between front and back will not be possible,
indication that a specific direction (e.g., sector) will have unreliable direction
or distance estimates for audio sources, indication that audio zoom is not possible
or will be limited (e.g., to 50% effect).
[0102] FIG. 22 is a flow chart showing an algorithm, indicated generally by the reference
numeral 220, in accordance with an example embodiment.
[0103] The algorithm 220 starts at operation 52 where, as discussed in detail above, one
or more blocked microphones is identified.
[0104] At operation 222 of the algorithm 220, an audio capture mode is set based, at least
in part, on the identified blocked microphone(s). For example, the audio capture mode
may be set based on reduced performance capabilities. In one example embodiment a
user may be required to confirm a change in audio capture mode, but in other example
embodiments the change in audio capture mode may be automatic.
[0105] In the operation 224, a visual indication of the blocked microphone is provided on
a display. As discussed in detail above, the position of the visual indication may
be based, at least in part, on a location of the microphone on the audio capture device,
wherein the position of the first visual indication corresponds more closely to the
position of the blocked microphone than to the position of any other microphone of
the multi-microphone audio capture device. In some example embodiments, the visual
indication of the blocked microphone is provided on the display only in the event
that a selected audio mode is not possible due to the blocked microphone.
[0106] Thus, in some examples, the algorithm 220 may automatically switch to, e.g., a lesser
spatial audio format or spatial audio capture mode and indicate this switching in
addition to indicating the at least one blocked microphone. A user may be able to confirm
this selection by simply continuing the capture. If the user decides that the mode switch
due to reduced capability should not be made, the user can act to remove the blocking
of the at least one microphone indicated on the display.
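The automatic variant of operation 222 can be sketched as follows. This is an illustrative sketch only: modelling each capture mode as a set of required microphones, ordering the modes from richest to simplest, and keeping the requested mode when no mode is fully available are all assumptions, and the names select_capture_mode and MODES are hypothetical.

```python
def select_capture_mode(blocked, modes_best_first, requested):
    """Pick a capture mode given the blocked microphones.

    blocked: set of blocked microphone labels.
    modes_best_first: list of (name, required_mics) pairs ordered from the
        richest mode to the simplest (hypothetical model).
    requested: name of the mode the user has selected.

    Returns (mode, blocking), where blocking lists the blocked microphones
    that prevent the requested mode and should be indicated on the display.
    """
    names = [name for name, _ in modes_best_first]
    start = names.index(requested)
    blocking = sorted(modes_best_first[start][1] & blocked)
    # Try the requested mode first, then progressively lesser modes.
    for name, required in modes_best_first[start:]:
        if not (required & blocked):
            return name, blocking
    # Assumption: keep the requested mode (degraded) if nothing else fits.
    return requested, blocking


MODES = [
    ("foa", {"m1", "m2", "m3", "m4"}),
    ("planar_foa", {"m1", "m2", "m3"}),
    ("stereo", {"m1", "m2"}),
]

mode, blocking = select_capture_mode({"m4"}, MODES, "foa")
```

An empty blocking list corresponds to the case in which the selected mode is still possible, so no indication of a blocked microphone needs to be shown. If the user removes the blocking, calling the function again returns the requested mode.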
[0107] FIG. 23 is a block diagram of a display, indicated generally by the reference numeral
230, in accordance with an example embodiment.
[0108] FIG. 24 is a block diagram of a display, indicated generally by the reference numeral
240, in accordance with an example embodiment.
[0109] The display 230 provides an indication to a user of a way to hold the device, based on the capture
format, so as not to block important microphone(s), such as microphones most critical to
a selected (spatial) audio format. For example, in the example of FIG. 23, a user
has selected a planar-FOA representation (a first order Ambisonics (FOA) format that
does not have height information). Such a reduced FOA representation can be useful for
certain applications, e.g., audio calls where height information may not be particularly
important. As the user has the device in portrait orientation, the system may first
suggest rotating the device to landscape orientation. This may, for example, maximize
separation between the microphones that will be used to derive the spatial audio according
to the selected format.
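The orientation suggestion can be sketched as follows. This is a rough sketch under stated assumptions: microphone positions are given in device coordinates as held in portrait, and the horizontal span of the microphones in each orientation is used as a simple stand-in for "separation". Neither the metric nor the names suggest_orientation and POSITIONS come from the example embodiments.

```python
def suggest_orientation(mic_positions, used_mics):
    """Suggest the orientation giving the widest horizontal microphone span.

    mic_positions: dict mapping a microphone label to (x, y) in device
        coordinates, with x across the short side and y along the long side
        when the device is held in portrait (hypothetical convention).
    used_mics: microphones intended for the selected audio format.
    """
    xs = [mic_positions[m][0] for m in used_mics]
    ys = [mic_positions[m][1] for m in used_mics]
    portrait_span = max(xs) - min(xs)   # horizontal extent in portrait
    landscape_span = max(ys) - min(ys)  # horizontal extent in landscape
    return "landscape" if landscape_span > portrait_span else "portrait"


# Illustrative positions in millimetres; not taken from the figures.
POSITIONS = {"m1": (0, 0), "m2": (70, 0), "m3": (35, 150)}

suggestion = suggest_orientation(POSITIONS, ["m1", "m2", "m3"])
```

For these illustrative positions the microphones span 150 mm along the long side but only 70 mm along the short side, so landscape orientation is suggested, as in the FIG. 23 example.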
[0110] When the user has the device in the correct orientation, the system may then indicate,
using the display 240, how the user should hold it in order to avoid blocking the most
important microphones. For example, the system may indicate the positions of the microphones
on the device it intends to use for the (spatial) audio capture.
[0111] There may be situations where the current capture is adjusted to the available non-blocked
microphones, but a user wants to use a feature that would require some of the currently
blocked microphones to operate optimally.
[0112] In a first example, a user has a device with microphones on the front and back side.
Assume that the user's grip is blocking some of the front side microphones but it
does not matter in his current operation as he is shooting video with spatial audio
using the back camera, thus being able to capture spatial audio with three back microphones.
Now, the user wants to switch to the front camera with spatial audio. In this case
the use of front microphones would result in much better quality but the user is blocking
(some of) them.
[0113] In a second example, the user has a device with five microphones on the back side
on a 'super-zoom device'. Assume now that the user's grip is blocking two out of the
five microphones. Spatial audio capture is available when the user is shooting video
with the back camera. Now, the user wishes to do serious audio zooming with maximum
values that would require the use of the blocked microphones as well.
[0114] In both of these examples, the user may receive an indication that the requested
functionality is not available in the current state followed by an indication as described
earlier regarding which microphone(s) need to be unblocked. The preferred functionality
or state may be automatically applied and displayed on the user interface when the
user changes his grip, and the required microphones are free.
[0115] The device may also indicate a fall-back operation for such a function if the microphones
are still being blocked. Examples include suboptimal spatial audio, reverting to stereo
or, in the zooming example, only modest zoom values being available in the user interface,
complemented with a visualization of which microphones would need to be free for
higher values.
[0116] FIG. 25 is a flow chart showing an algorithm, indicated generally by the reference
numeral 250, in accordance with an example embodiment.
[0117] The algorithm 250 starts at operation 52 where, as discussed in detail above, one
or more blocked microphones is identified.
[0118] At operation 252, a position of a visual indication(s) of the one or more blocked
microphones for output to a user on a display of the audio capture device (such as
the display 11) is determined. As discussed in detail above, the position may be determined
based, at least in part, on a location of the blocked microphone on the audio capture
device. The visual indication is then provided to the user in operation 254.
[0119] At operation 256, a further visual indication is provided to the user on an augmented
reality display, wherein the further visual indication directs a user gaze to the
first visual indication.
[0120] FIG. 26 is a block diagram, indicated generally by the reference numeral 260, showing
an example use of the device of FIGS. 1 and 2 in accordance with an example embodiment.
That use may, for example, implement the algorithm 250.
[0121] The block diagram 260 includes the user 32 holding the device 10 whilst capturing
audio-visual content, as described above. As in the example described above, the right
hand 34 of the user 32 may be blocking some of the microphones of the device 10. The
user 32 is wearing augmented reality (AR) glasses 262.
[0122] A visual indication 264 is shown on the device 10. That visual indication may be
provided in the operation 254 of the algorithm 250.
[0123] A virtual indication 265 is an augmented reality indication that is visible to the
user through the AR glasses 262. The virtual indication 265 may be aligned, for example
to point to the side of the device where the problem is. This virtual indication 265
may be particularly useful if the AR system is not able to track the device precisely
enough. Also, it can help a second user, who is not wearing AR glasses, to realize
there is a problem. In one example embodiment, the virtual indication 265 may be used
to direct the user's gaze to the device 10, since the position of the visual indication
264 on the device 10 may be more accurate than the position of the virtual indication
265 provided in augmented reality.
[0124] The example embodiments described above may be implemented using a smartphone or
some other audio capture device that implements spatial audio capture or any type of
enhanced mono or stereo capture. In other words, at least two microphones are used
to capture audio, where the audio signal resulting from the capture processing may
be a mono, stereo, or spatial audio signal. As discussed above, user attention may be
drawn to at least one specific microphone being blocked by placement of a visual indication
on the device display in a way that allows the user to intuitively localize the blocked
microphone.
[0125] For completeness, FIG. 27 is a schematic diagram of components of one or more of
the example embodiments described previously, which hereafter are referred to generically
as a processing system 300. The processing system 300 may, for example, be the apparatus
referred to in the claims below.
[0126] The processing system 300 may have a processor 302, a memory 304 closely coupled
to the processor and comprised of a RAM 314 and a ROM 312, and, optionally, a user
input 310 and a display 318. The processing system 300 may comprise one or more network/apparatus
interfaces 308 for connection to a network/apparatus, e.g. a modem which
may be wired or wireless. The network/apparatus interface 308 may also operate as a
connection to other apparatus, such as a device/apparatus which is not a network-side
apparatus. Thus, direct connection between devices/apparatus without network participation
is possible.
[0127] The processor 302 is connected to each of the other components in order to control
operation thereof.
[0128] The memory 304 may comprise a non-volatile memory, such as a hard disk drive (HDD)
or a solid state drive (SSD). The ROM 312 of the memory 304 stores, amongst other
things, an operating system 315 and may store software applications 316. The RAM 314
of the memory 304 is used by the processor 302 for the temporary storage of data.
The operating system 315 may contain code which, when executed by the processor, implements
aspects of the algorithms 50, 90, 130, 160, 220 and 250 described above. Note that,
in the case of a small device/apparatus, the memory may be of a type suited to small-size
usage, i.e. a hard disk drive (HDD) or a solid state drive (SSD) is not always used.
[0129] The processor 302 may take any suitable form. For instance, it may be a microcontroller,
a plurality of microcontrollers, a processor, or a plurality of processors.
[0130] The processing system 300 may be a standalone computer, a server, a console, or a
network thereof. The processing system 300 and its necessary structural parts may all be
inside a device/apparatus such as an IoT device/apparatus, i.e. embedded in a very small
size.
[0131] In some example embodiments, the processing system 300 may also be associated with
external software applications. These may be applications stored on a remote server
device/ apparatus and may run partly or exclusively on the remote server device/ apparatus.
These applications may be termed cloud-hosted applications. The processing system 300
may be in communication with the remote server device/ apparatus in order to utilize
the software application stored there.
[0132] FIGS. 28A and 28B show tangible media, respectively a removable memory unit 365 and
a compact disc (CD) 368, storing computer-readable code which when run by a computer
may perform methods according to example embodiments described above. The removable
memory unit 365 may be a memory stick, e.g. a USB memory stick, having internal memory
366 storing the computer-readable code. The internal memory 366 may be accessed by
a computer system via a connector 367. The CD 368 may be a CD-ROM or a DVD or similar.
Other forms of tangible storage media may be used. Tangible media can be any device/apparatus
capable of storing data/information, which data/information can be exchanged
between devices/apparatus/networks.
[0133] Embodiments of the present invention may be implemented in software, hardware, application
logic or a combination of software, hardware and application logic. The software,
application logic and/or hardware may reside on memory, or any computer media. In
an example embodiment, the application logic, software or an instruction set is maintained
on any one of various conventional computer-readable media. In the context of this
document, a "memory" or "computer-readable medium" may be any non-transitory media
or means that can contain, store, communicate, propagate or transport the instructions
for use by or in connection with an instruction execution system, apparatus, or device,
such as a computer.
[0134] Reference to, where relevant, "computer-readable medium", "computer program product",
"tangibly embodied computer program" etc., or a "processor" or "processing circuitry"
etc. should be understood to encompass not only computers having differing architectures
such as single/multi-processor architectures and sequencers/parallel architectures,
but also specialised circuits such as field programmable gate arrays (FPGA), application
specific integrated circuits (ASIC), signal processing devices/apparatus and other devices/apparatus.
References to computer program, instructions, code etc. should be understood to express
software for a programmable processor, or firmware such as the programmable content of
a hardware device/apparatus, as instructions for a processor or configured or configuration
settings for a fixed function device/apparatus, gate array, programmable logic device/apparatus,
etc.
[0135] If desired, the different functions discussed herein may be performed in a different
order and/or concurrently with each other. Furthermore, if desired, one or more of
the above-described functions may be optional or may be combined. Similarly, it will
also be appreciated that the flow diagrams of FIGS. 5, 9, 13, 16, 22 and 25 are examples
only and that various operations depicted therein may be omitted, reordered and/or
combined.
[0136] It will be appreciated that the above described example embodiments are purely illustrative
and are not limiting on the scope of the invention. Other variations and modifications
will be apparent to persons skilled in the art upon reading the present specification.
[0137] Moreover, the disclosure of the present application should be understood to include
any novel features or any novel combination of features either explicitly or implicitly
disclosed herein or any generalization thereof and during the prosecution of the present
application or of any application derived therefrom, new claims may be formulated
to cover any such features and/or combination of such features.
[0138] Although various aspects of the invention are set out in the independent claims,
other aspects of the invention comprise other combinations of features from the described
example embodiments and/or the dependent claims with the features of the independent
claims, and not solely the combinations explicitly set out in the claims.
[0139] It is also noted herein that while the above describes various examples, these descriptions
should not be viewed in a limiting sense. Rather, there are several variations and
modifications which may be made without departing from the scope of the present invention
as defined in the appended claims.