TECHNICAL FIELD
[0001] The present disclosure relates to an apparatus and a method for determining an intended
target of an object.
BACKGROUND
[0002] It is common for a user to interact with a machine, so-called human-machine interaction
(HMI), via a pointing selection action, hereinafter referred to as a pointing gesture.
For example, the user may point to a button or other control, or to an interactive display
such as a graphical user interface (GUI) which may be displayed on a touch-sensitive
display device. However, when such gestures are used in moving vehicles, vehicle motion
can cause erratic and unpredictable perturbations in the user input, resulting
in erroneous selections. This may compromise system usability and tie up an undesirable
amount of the user's attention, particularly if the user is the driver of the vehicle.
[0003] It is an object of embodiments of the invention to at least mitigate one or more
of the problems of the prior art. It is an object of embodiments of the invention
to reduce a duration of a pointing gesture. It is an object of embodiments of the
invention to improve an accuracy of a pointing gesture.
[0004] US 2014/125590 discloses a computer-implemented method, system and software which includes providing
output from a touch-based device to an external display. Gestures from a user are
detected.
[0006] US 8,649,999 discloses estimation of a bias value association with a sensor using a ZRO-tracking
filter.
[0007] US 2013/194193 discloses methods and apparatus for correcting gesture-based input commands.
[0008] US 2012/005058 A1 discloses a method for predicting a target element based on the movement of a cursor
relative to a user interface. A prediction model is applied taking into account the
noise corresponding to imprecisions and local errors of a user operating an input
device.
SUMMARY OF THE INVENTION
[0009] According to an aspect of the present invention there is provided a method and system
as set forth in the appended claims.
[0010] According to an aspect of the present invention there is provided a human-machine
interaction method of determining an intended target of an object in relation to a
user interface, comprising: determining a three-dimensional location of the object
at a plurality of time intervals; determining a metric associated with each of a plurality
of items of the user interface, the metric indicative of the respective item being
the intended target of the object, wherein the metric is determined based upon a model
and the location of the object in three dimensions at the plurality of time intervals;
and determining, using a Bayesian reasoning process, the intended target from the
plurality of items of the user interface based on the metric associated with each
of the plurality of items. The method characterised by receiving one or more items
of environmental information, wherein the environmental information comprises one
or more of: information indicative of acceleration of a vehicle, information indicative
of a state of a vehicle and image data indicative of surroundings of the vehicle;
and wherein the model models movement of the object with respect to the plurality
of items and unintentional perturbations of the object movement and the determination
of the metric is based on the one or more items of environmental information, and/or
wherein the model is selected based on the one or more items of environmental information.
[0011] A human-machine interface (HMI) system for determining an intended target of an object
in relation to a user interface, comprising location determining means for determining
a three-dimensional location of the object, a memory means for storing data indicative
of the location of the object in three dimensions at a plurality of instants in time,
a processing means arranged to determine a metric associated with each of a plurality
of items of a user interface of the respective item being the intended target of the
object, wherein the metric is determined based upon a model and the location of the
object at the plurality of time intervals, and determine, using a Bayesian reasoning
process, the intended target from the plurality of items of the user interface based
on the metric associated with each of the plurality of items. Characterised by the
memory means being further for storing data indicative of one or more items of environmental
information, wherein the environmental information comprises one or more of: information
indicative of acceleration of a vehicle, information indicative of a state of a vehicle
and image data indicative of surroundings of the vehicle; and wherein the model models
movement of the object with respect to the plurality of items and unintentional perturbations
of the object movement and the determination of the metric is based on the one or
more items of environmental information, and/or wherein the model is selected based
on the one or more items of environmental information.
[0012] Optionally, the intended target is determined based on the location of the object.
The intended target may be determined before the object reaches the target.
[0013] The method may comprise determining a trajectory of the object. The trajectory of
the object may comprise data indicative of the location of the object at a plurality
of time intervals. Using the trajectory of the object may improve determination of
the intended target.
[0014] The method may comprise filtering the trajectory of the object. The filtering may
smooth the trajectory of the object and/or the filtering may reduce unintended movements
of the object and/or noise from the trajectory. Advantageously filtering the trajectory
may reduce an influence of unintended movements such as jumps or jolts.
[0015] The model may be a Bayesian intentionality prediction model. The model may be a linear
model. The model may be based on one or more filters; optionally the one or more filters
are Kalman filters.
[0016] The model may be a non-linear model. The model may incorporate irregular movements
of the object. The non-linear model may be based on one or more statistical filters;
optionally particle filters.
[0017] The model may be a model based on a learnt distribution based upon historical data,
which may be Gaussian or otherwise. The model may be a nearest neighbour (NN) model.
The NN model may determine the metric based upon a distance between the location of
the object and each of the targets. The metric may be indicative of a distance between
the object and each of the targets.
[0018] The model may be a bearing angle (BA) model. The metric may be indicative of an angle
between the trajectory of the object and each of the targets.
[0019] The model may be a heading solid angle (HSA) model. The metric may be indicative
of a solid angle between the object and each of the targets.
[0020] The model may be a Linear Destination Reversion (LDR) or a Nonlinear Destination
Reversion (NLDR) model. The method may comprise determining a model for each of the
targets. The metric may be indicative of the model best matching the trajectory of
the object. The NLDR model may comprise non-linear perturbations of the trajectory.
The model may be a Mean Reverting Diffusion (MRD) model. The MRD may model a location
of the object as a process reverting to the intended target.
[0021] The model may be an Equilibrium Reverting Velocity (ERV) model. The metric may be
based upon a speed of travel of the object to the target.
[0022] The model may be a bridging model. The bridging model may be based on one or more
bridges. For example the bridging model may be based on a bank of Markov bridges.
Each bridge may be determined to terminate at a nominal intended destination of the
tracked object and may be based upon the spatial area of a plurality of targets and
/ or a duration of the plurality of time intervals.
[0023] The method may comprise determining a state of the object.
[0024] The determining the intended target may be based on a cost function. The cost function
may impose a cost for incorrectly determining the intended target. The intended target
may be determined so as to reduce the cost function.
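By way of illustration only, a cost-based decision of this kind might be sketched in Python as follows. The function name `min_expected_cost`, the cost matrices and the posterior values are hypothetical and are not taken from the disclosure; the sketch assumes that a posterior probability for each target is already available.

```python
def min_expected_cost(posterior, cost):
    """Pick the target whose selection has the lowest expected cost.

    posterior[j] is the probability that target j is intended.
    cost[i][j] is a hypothetical cost of deciding on target i
    when target j is in fact the intended one.
    """
    expected = [sum(c * p for c, p in zip(row, posterior)) for row in cost]
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, expected

posterior = [0.5, 0.3, 0.2]

# A 0-1 cost treats every wrong decision equally and reproduces the
# choice of the most probable target.
zero_one = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
best_equal, _ = min_expected_cost(posterior, zero_one)

# Making a wrong activation of target 0 five times as costly moves
# the decision away from it, although it is the most probable target.
asymmetric = [[0, 5, 5], [1, 0, 1], [1, 1, 0]]
best_asym, _ = min_expected_cost(posterior, asymmetric)
```

With an asymmetric cost the decision may differ from the most probable target, which is one way a high cost for incorrectly activating a particular control could be reflected.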
[0025] The determining the intended target may be based on one or more items of prior information.
The prior information may be associated with at least some of the targets. The prior
information may be indicative of previously selected targets. Advantageously the prior
information may improve determination of the intended target.
[0026] The method may comprise selecting a plurality of most recent time intervals, wherein
the determining the metric associated with each of the plurality of targets may be
based upon the location of the object at the plurality of most recent time intervals.
[0027] The object may be a pointing object. The location of the object may be determined
in three dimensions. Determining the location of the object may comprise tracking
the location of the object. Determining the location of the object may comprise receiving
radiation from the object.
[0028] The method may comprise outputting an indication of the intended target. The indication
of the intended target may comprise identifying the intended target; optionally the
intended target may be visually identified. Advantageously the user may become aware
of the determined intended target. The user may then cause selection of the intended
target.
[0029] The method may comprise outputting the indication of the intended target and one
or more possible targets. The method may comprise activating the intended target.
[0030] The plurality of targets may comprise one or more of graphically displayed items
or physical controls. The location of the object may be determined in three dimensions.
[0031] According to an aspect of the present invention there is provided a system for determining
an intended target of an object, comprising location determining means for determining
a location of the object; a memory means for storing data indicative of the location
of the object at one or more instants in time; a processing means arranged to determine
a metric associated with each of a plurality of targets of the respective target being
the intended target of the object, wherein the metric is determined based upon a model
and the location of the object at the plurality of time intervals; determine, using
a Bayesian reasoning process, the intended target from the plurality of targets based
on the metric associated with each of the plurality of targets.
[0032] The processing means may be arranged to perform a method according to the first aspect
of the invention.
[0033] The location determining means may comprise means for receiving radiation from the
object. The location determining means may comprise one or more imaging devices.
[0034] Location data indicative of the location of the object at each instant in time may
be stored in the memory means.
[0035] The system may comprise one or more accelerometers for outputting acceleration data.
Advantageously the acceleration data may be used in the determination process, for
example to improve the determination e.g. by selecting a model.
[0036] The system may comprise a display means for displaying a graphical user interface
(GUI) thereon, wherein the plurality of targets are GUI items.
[0037] The model of the system may be a bridging model. The bridging model may be based
on one or more bridges. For example the bridging model may be based on a bank of Markov
bridges. Each bridge may be determined to terminate at a nominal intended destination
of the tracked object and may be based upon the spatial area of a plurality of targets
and / or a duration of the plurality of time intervals.
[0038] The processing means may be arranged to receive environmental data from one or more
sensing means; optionally the sensing means may comprise means for determining a state
of the vehicle and/or imaging devices.
[0039] According to an aspect of the invention there is provided a vehicle comprising a
processing device arranged, in use, to perform a method according to a first aspect
of the invention or comprising a system according to the second aspect of the invention.
[0040] According to an aspect of the present invention there is provided a method of determining
an intended target of an object, comprising determining a location of the object at
a plurality of time intervals; determining a probability associated with a target
of said target being an intended target.
[0041] The probability may be determined based upon a model and the location of the object
at the plurality of time intervals.
[0042] According to an aspect of the present invention there is provided an apparatus comprising
a processing device arranged, in use, to determine an intended target of an object,
wherein the processing device is arranged to determine a location of the object at
a plurality of time intervals; and to determine a probability associated with a target
of said target being an intended target.
[0043] As used herein, the term "processing means" will be understood to include both a
single processor, control unit or controller and a plurality of processors, control
units or controllers collectively operating to provide the required control functionality.
A set of instructions could be provided which, when executed, cause said controller(s)
or control unit(s) to implement the control techniques described herein (including
the method(s) described below). The set of instructions may be embedded in one or
more electronic processors, or alternatively, the set of instructions could be provided
as software to be executed by one or more electronic processor(s). For example, a
first controller may be implemented in software run on one or more electronic processors,
and one or more other controllers may also be implemented in software run on one or more
electronic processors, optionally the same one or more processors as the first controller.
It will be appreciated, however, that other arrangements are also useful, and therefore,
the present invention is not intended to be limited to any particular arrangement.
In any event, the set of instructions described above may be embedded in a computer-readable
storage medium (e.g., a non-transitory storage medium) that may comprise any mechanism
for storing information in a form readable by a machine or electronic processors/computational
device, including, without limitation: a magnetic storage medium (e.g., floppy diskette);
optical storage medium (e.g., CD-ROM); magneto optical storage medium; read only memory
(ROM); random access memory (RAM); erasable programmable memory (e.g., EPROM and EEPROM);
flash memory; or electrical or other types of medium for storing such information/instructions.
It will also be understood that the term "location determining means" may be understood
to mean one or more location determining devices for determining a location of the
object and that the term "memory means" may be understood to mean one or more memory
devices for storing data indicative of the location of the object at one or more instants
in time.
[0044] Within the scope of this application it is expressly intended that the various aspects,
embodiments, examples and alternatives set out in the preceding paragraphs, in the
claims and/or in the following description and drawings, and in particular the individual
features thereof, may be taken independently or in any combination. That is, all embodiments
and/or features of any embodiment can be combined in any way and/or combination, unless
such features are incompatible.
BRIEF DESCRIPTION OF THE DRAWINGS
[0045] Embodiments of the invention will now be described by way of example only, with reference
to the accompanying figures, in which:
Figure 1 shows an illustration of fingertip trajectories during pointing gestures;
Figure 2 shows an illustration of a system according to an embodiment of the invention;
Figure 3 illustrates a solid angle for a target;
Figure 4 shows an illustration of a method according to an embodiment of the invention;
Figure 5 shows an illustration of performance of various embodiments of the invention;
Figure 6 shows a further illustration of performance of various embodiments of the
invention;
Figure 7 shows a still further illustration of performance of various embodiments
of the invention;
Figure 8 shows a vehicle according to an embodiment of the invention;
Figure 9 is an illustration of a perturbed trajectory and a filtered trajectory of
an object according to an embodiment of the invention;
Figure 10 is an illustration of mean percentage of destination successful prediction
for various models according to embodiments of the invention;
Figure 11 is an illustration of gesture portion (in time) with successful prediction
for various models according to embodiments of the invention; and
Figure 12 is an illustration of average log prediction uncertainty according to embodiments
of the invention.
DETAILED DESCRIPTION
[0046] Embodiments of the present invention relate to methods and apparatus for determining
an intended target of an object. The object may be a pointing object, such as a stylus
or finger, although it will be realised that this is not limiting. Embodiments of
the invention will be explained, by way of example, with reference to fingertip pointing
gestures which are performed in vehicles. It will be realised, however, that the pointing
object may be an object other than a finger, such as an elongate object, e.g. a stylus.
Furthermore embodiments of the invention are not limited to use within a vehicle and
may be used, for example, to determine the intended destination of a pointing object
upon a computing device such as a tablet computer or smartphone, for example. Furthermore,
embodiments of the invention will be explained with reference to determining the intended
destination of the pointing object upon a display device, in particular determining
one or more graphical objects displayed on the display device which are the
intended target, or which have a likelihood of being the intended target. It will
be realised that embodiments of the invention are not limited to the intended target
being displayed on a surface of the display device. The display device may be a device
for projecting an image onto a surface, such as an interior surface of a vehicle,
and detecting the intended target, which may be a graphical object displayed on the
surface. For example the surface may be a dashboard or interior portion of the vehicle,
although it will be realised that other surfaces may be envisaged. The intended target
may also be one of a plurality of physical buttons or other controls, for example.
The image may comprise a 3D heliograph and/or a stereoscopic image in some embodiments.
[0047] Referring to Figure 1 there is illustrated a fingertip trajectory in three dimensions
(3D) for three separate pointing tasks to select one of a plurality of graphical items
displayed on a display device within a vehicle. A location of the fingertip is determined
at each of a plurality of time intervals $t_n$ from $t_1$ to $t_k$. At each time interval
a location of the fingertip is determined in 3D as a location vector
$m_n = [\hat{x}_{t_n}, \hat{y}_{t_n}, \hat{z}_{t_n}]^T$. The vector $m_n$ is used to
represent a recorded pointing object location, e.g. of the finger, which
may include noise and/or perturbations.
[0048] In some embodiments $m_n$ may be determined with reference to an origin of a sensor
arranged to detect the fingertip location, although in other embodiments $m_n$ may be
determined with reference to another location, such as a location within the
vehicle, for example a location about the display device. Furthermore, in some embodiments,
the vector $m_n$ may comprise other sensor data such as that output by one or more
accelerometers, gyroscopes etc. In other words, the vector $m_n$ may represent information
additional to the position of the object.
[0049] Figure 1(a) illustrates fingertip trajectories 150 (only one of which is numbered
for clarity) for three separate pointing tasks to select different graphical items
or buttons which are represented as circles 110 (only one of which is numbered for
clarity) displayed on a display device 100 in a stationary vehicle. As can be appreciated,
even within a stationary vehicle, the trajectories are irregular. Figure 1(b) illustrates
trajectories 160 (again only one of which is numbered) for three separate pointing
tasks to select different displayed graphical items whilst the vehicle is moving at
varying speeds over an uneven road. As can be appreciated the trajectories experience
significant perturbations. Other perturbations may arise from, for example, a user
walking whilst holding a computing device and attempting a pointing gesture.
[0050] Figure 2 illustrates a system 200 according to an embodiment of the present invention.
The system 200 is a system for determining an intended target of a pointing object.
The system 200 comprises a means 210 for determining a location of the pointing object,
a processing means 220 for determining the intended target of the pointing object
and a display means 230 for displaying at least one possible target of the pointing
object although, as noted above, in other embodiments the possible targets of the
pointing object may be a physical object such as a button or other control and thus
the display is optional. The processing means 220 may determine whether a target of
the pointing object is intended, or has been accidentally targeted, for example whether
a graphical item or button was intended to be touched by the user or was touched accidentally,
such as due to movement of the vehicle. Accordingly the input may be discarded if
the processing means 220 determines the target to be unintended. Responsive to the
processing means 220 determining the intended target, in some embodiments the display
means 230 may be caused to aid a selection process, such as by highlighting the
intended target or one or more possible targets, or by enlarging a portion of
information displayed on the display means 230.
[0051] The system 200 includes or receives data from one or more additional sensors, such
as one or more accelerometers, sensors monitoring a suspension of the vehicle, one
or more cameras, such as forward facing to face the road to enable road condition
classification, etc. The one or more sensors may help establish an operating environment
of the system 200. For example, an accelerometer/camera may be used to establish that
significant vibrations are being, or are about to be, experienced. The one or more accelerometers
may enable the system to adapt to prevailing conditions, such as by selecting an appropriate
model, as will be explained.
[0052] The means 210 for determining the location of the object is a location sensing device
210. The location sensing device may determine the location of the object based on
data from one or more devices responsive to received radiation. The radiation may
be emitted from one or more devices forming part of the system 200, such as sound
waves or electromagnetic radiation. The location sensing device may, in one embodiment,
be an accelerometer associated with the object being tracked. The location sensing
device may comprise one or more imaging devices for outputting image data relating
to the object. The one or more imaging devices may be one or more cameras arranged
to output image data including image data corresponding to the object such that the
location of the object may be determined therefrom. The location sensing device may
be a commercially available device such as a Microsoft Kinect (RTM) or a Leap Motion
(RTM) Controller available from Leap Motion, Inc. It will be realised that other devices
may be used.
[0053] The location sensing device 210 may be arranged to output data from which the location
of the object may be determined by the processing means 220 or the location sensing
device 210 may output location data indicative of the location of the object. In one
embodiment the location sensing device 210 is arranged to output location data at
a time instant $t_k$ of the form
$m_k = [\hat{x}_{t_k}, \hat{y}_{t_k}, \hat{z}_{t_k}]^T$
indicative of the location of the object. A value of $m_k$, which may be in mm, may
specify the location of the object with reference to a predetermined datum. The
datum may be relative to the location sensing device 210 or may be relative to another
datum such as a point about the display device 230.
[0054] The location sensing device or the processing means 220 may be arranged to extract
or identify the object by performing data association, such as when the location sensing
device 210 temporarily loses track of the object. For example, several objects may
be detected within a field of vision of the location sensing device 210, such as a
pointing hand with several possible fingers, a steering wheel, a rear-view mirror,
etc. Extracting and/or identifying the desired object such as a pointing finger or
other object may be performed as a preliminary step.
[0055] The display means 230 is a display device for displaying one or more selectable items
which may form part of a graphical user interface (GUI). The display device may be
a touch-sensitive screen for outputting visual images comprising the one or more selectable
items which may form part of the GUI. The display device 230, in response to a user
touching a surface of the screen, may output data indicative of a touched location
or may output data indicative of the selected item. In another embodiment the display
device 230 may comprise a projection device arranged to project an image onto a surface,
such as an interior surface of the vehicle, where the image comprises a selectable
object displayed on the surface. For example the surface may be a dashboard or interior
portion of the vehicle, although it will be realised that other surfaces may be envisaged.
[0056] The processing means 220 may be a processing device comprising one or more processors
and memory accessible to the processing device. The memory may store computer software
arranged, when executed by the processing device, to perform a method according to
an embodiment of the invention. The memory may also, in use, store data indicative
of the location of the object at one or more instants in time.
[0057] The processing means 220 may comprise a trajectory module 221 for determining the
trajectory of the object. It will be realised that the term trajectory may be understood
to mean the location of the object at a plurality of instants in time. The trajectory
module 221 is arranged to determine a likelihood of one or more possible targets being
the intended target of the object.
[0058] In particular, the trajectory module 221 may determine, at an instant in time $t_k$,
the probability of a selectable item $B_i$ being the intended target as
$P(B_i \mid m_{1:k})$, where $b_i = [b_{x,i}\ b_{y,i}\ b_{z,i}]^T$ denotes coordinates of
a centre of an $i$th selectable icon $B_i$ and $m_{1:k} \triangleq \{m_1, m_2, \ldots, m_k\}$
comprises all available coordinates of the object at consecutive discrete times
$\{t_1, t_2, \ldots, t_k\}$. The trajectory module 221 may determine, in some embodiments,
$c_{1:k} \triangleq \{c_1, c_2, \ldots, c_k\}$
as a processed location of the object such as after a pre-processing operation has
been performed to, for example, smooth the trajectory of the object. The pre-processing
may remove one or more of noise, unintentional movements, vibrations, jumps etc. from
the location data $m_{1:k}$ to produce $c_{1:k}$. Unintentional movements are, for example,
those illustrated in Figure 1(b). It will be appreciated that in the following
$m_{1:k}$ may be replaced with $c_{1:k}$.
[0059] In some embodiments the trajectory module 221 may determine a probability
$P(B_i \mid m_{1:k})$ for each of a plurality $N$ of items
$\mathbb{B} = \{B_i : i = 1, 2, \ldots, N\}$, where $\mathbb{B}$
is a set of items such as selectable GUI items.
[0060] A filtering operation may be performed to reduce erratic or unintentional movements
of the object. Such movements may be due to road or driving conditions e.g. the road
being uneven or the vehicle being driven enthusiastically, such as in a sporting manner.
Such movements may also be due to a user walking or moving.
[0061] The filtering operation may be a Monte Carlo filtering operation such as Sequential
Monte Carlo (SMC). The filtering is performed before an intent inference process,
as will be described. The output of the filtering operation at the time instant $t_n$
is indicative of a true location of the pointing object, denoted by
$c_n = [x_{t_n}, y_{t_n}, z_{t_n}]^T$, i.e. after removing unintentional movements or
undesired noise.
[0062] For mild perturbations, the filtering operation may be based on a linear state space
model of the object's movements. The model may lead to a linear statistical filtering
operation, e.g. a linear Kalman filter. More erratic unintentional pointing object movements,
e.g. significant jumps or jolts, may be modelled as jumps that may lead to non-linear
implementations, e.g. Monte Carlo filtering such as Sequential Monte Carlo (SMC) or
Markov Chain Monte Carlo (MCMC) or any other numerical approach.
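As an illustration of the linear case only, the following Python sketch applies a scalar Kalman filter independently to each coordinate of a recorded track. The random-walk state model, the variance values and the names `kalman_filter_axis` and `filter_track` are assumptions made for this example and are not prescribed by the disclosure; a practical implementation might use a richer state, for example one including velocity.

```python
def kalman_filter_axis(measurements, process_var, meas_var):
    """Filter one coordinate with a scalar random-walk Kalman filter."""
    estimate = measurements[0]   # initialise at the first measurement
    variance = meas_var          # initial uncertainty (assumed)
    filtered = [estimate]
    for z in measurements[1:]:
        # Predict: the state is assumed to persist, so only the
        # uncertainty grows by the process variance.
        variance += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = variance / (variance + meas_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain
        filtered.append(estimate)
    return filtered

def filter_track(track, process_var=4.0, meas_var=25.0):
    """Apply the scalar filter to the x, y and z coordinates of a track."""
    axes = zip(*track)
    filtered_axes = [kalman_filter_axis(a, process_var, meas_var) for a in axes]
    return [tuple(p) for p in zip(*filtered_axes)]

# Hypothetical recorded locations m_1..m_4 in mm; the third sample
# contains a jolt along x.
track = [(0.0, 0.0, 100.0), (1.0, 0.0, 95.0), (30.0, 0.0, 90.0), (3.0, 0.0, 85.0)]
filtered = filter_track(track)
```

In this sketch the jolt in the third sample is attenuated rather than removed, and the random-walk assumption makes the filtered track lag the deliberate motion. Larger jumps of the kind described above would call for a non-linear approach.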
[0063] The probability $P(B_i \mid m_{1:k})$ or $P(B_i \mid c_{1:k})$ of an item being the
intended target is determined according to a model and the
trajectory of the object. The model may be a linear or a non-linear model. The model
models unintended movements such as jumps or jolts due to perturbations, i.e. movement
such as arising from vehicle movement.
[0064] The model may be one of a Nearest Neighbour (NN), Bearing Angle (BA), Heading and
Solid Angle (HSA), Linear Destination Reversion (LDR) such as the Mean Reverting Diffusion
(MRD) as well as Equilibrium Reverting Velocity (ERV), Nonlinear Destination Reversion
(NLDR) and a Bridging Distribution (BD). In addition to the information below, further
information associated with these models according to embodiments of the invention
is provided in the accompanying draft papers.
[0065] The intent inference module 222 is arranged to determine an intended target of the
object. The intended target is determined using a Bayesian approach. The intent inference
module 222 may be arranged to determine the intended target from the plurality
$N$ of targets based on the likelihood $P(B_i \mid m_{1:k})$ associated with each of the
plurality of targets. This may be equivalent to calculating the Maximum a Posteriori (MAP)
estimate via:
$$\hat{B}(t_k) = \arg\max_{B_i,\ i = 1, 2, \ldots, N} P(B_i \mid m_{1:k})$$
for the set of $N$ nominal targets, where $\hat{B}(t_k)$ is the predicted destination and
$P(B_i \mid m_{1:k}) \propto P(m_{1:k} \mid B_i)\,P(B_i)$ according to Bayes' rule.
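The MAP decision described above can be sketched in Python as follows. This is a minimal illustration only: the function name `map_target`, the example log-likelihoods and the uniform priors are assumptions, and the likelihoods would in practice be supplied by one of the models discussed below. Working with logarithms is a design choice made here to avoid numerical underflow for long trajectories.

```python
import math

def map_target(log_likelihoods, priors):
    """Return (posterior, best_index) over the nominal targets.

    log_likelihoods[i] is log P(m_1:k | B_i) and priors[i] is P(B_i);
    priors are assumed to be strictly positive.
    """
    # Bayes' rule in the log domain: log P(m|B_i) + log P(B_i)
    log_post = [ll + math.log(p) for ll, p in zip(log_likelihoods, priors)]
    # Subtract the maximum before exponentiating for numerical stability
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    posterior = [w / total for w in weights]
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return posterior, best

# Three hypothetical targets with equal priors
posterior, best = map_target([-12.0, -9.5, -15.0], [1 / 3, 1 / 3, 1 / 3])
```

Non-uniform priors, for example reflecting previously selected items, can be passed in the same way and will shift the decision towards the favoured targets.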
[0066] The following sections provide a discussion of a plurality of models which may be
used by the trajectory module 221.
Nearest Neighbour (NN) Model
[0067] In the NN model the likelihood $P$ is assigned to each item based on a distance to
the current position of the object at an instant in time $t_k$. Unlike traditional
approaches to NN, here a probabilistic interpretation of the nearest
neighbour model is formulated such that the probability of each nominal destination
is calculated.
[0068] This approach chooses the item, such as the interface selectable icon, that is closest
to the current position of the object such as the pointing finger, i.e.
$B_i \in \mathbb{B}$ with the smallest Euclidean distance
$d_{k,i} = \lVert c_k - b_i \rVert_2$, $i = 1, 2, \ldots, N$. In a probabilistic framework,
this can be expressed as
$$P(c_k \mid B_i) = p\left(c_k \mid f(b_i), \sigma_{\mathrm{NN}}^2\right)$$
where $p(\cdot)$ is either a known distribution, for example Gaussian, or a distribution learnt
from previously recorded data, and the distribution mean $f(b_i)$ is a function of the
location of the $i$th destination, for example $f(b_i) = b_i$. The most simple NN model
is given by
$$P(c_k \mid B_i) = \mathcal{N}\left(c_k \mid b_i, \sigma_{\mathrm{NN}}^2\right)$$
where the object location $c_k$ has a multivariate normal distribution with a mean equal
to that of the possible destination and a fixed covariance $\sigma_{\mathrm{NN}}^2$.
The latter is a design parameter. Assuming that the logged finger positions at various
time instants are independent, the sought $P(c_{1:k} \mid B_i)$ reduces to
$$P(c_{1:k} \mid B_i) = \prod_{n=1}^{k} P(c_n \mid B_i)$$
Otherwise, the correlation between successive measurements will dictate combining
the destination probabilities obtained from each measurement.
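The probabilistic NN model with independent measurements can be sketched in Python as below. The sketch assumes an isotropic Gaussian with a single standard deviation `sigma` shared by all three axes, which is one possible choice of fixed covariance; the function name, the track and the target coordinates are hypothetical.

```python
import math

def nn_log_likelihood(track, target, sigma):
    """log P(c_1:k | B_i) for an isotropic 3D Gaussian centred on the target.

    Measurements are assumed independent, so the per-sample
    log-likelihoods are summed (the product in the linear domain).
    """
    log_norm = -1.5 * math.log(2.0 * math.pi * sigma ** 2)
    total = 0.0
    for point in track:
        sq_dist = sum((c - b) ** 2 for c, b in zip(point, target))
        total += log_norm - sq_dist / (2.0 * sigma ** 2)
    return total

# Hypothetical filtered fingertip locations c_1..c_3 and two icon centres (mm)
track = [(0.0, 0.0, 100.0), (10.0, 5.0, 60.0), (20.0, 10.0, 20.0)]
targets = [(25.0, 12.0, 0.0), (-40.0, 30.0, 0.0)]
scores = [nn_log_likelihood(track, b, sigma=30.0) for b in targets]
closest = max(range(len(targets)), key=scores.__getitem__)
```

Because the score depends only on distance, this model tends to commit to a target late in the gesture, once the fingertip is already near it.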
Bearing Angle (BA) Model
[0069] The BA model is based on an assumption that the object moves directly toward the
intended destination. The BA model may use the current position of the object at an
instant in time $t_k$ and a previous position of the object, which may be the position at
$t_{k-1}$. The bearing angle between the positions of the object and the item may be used to
calculate the probability.
[0070] This model is based on the premise that the pointing finger is heading directly towards
the intended destination, i.e. the cumulative angle between the finger positions and
the target is minimal. For every two consecutive measurements, the bearing angle with
respect to the destination can be assumed to be a random variable with zero mean and
fixed variance as per
$$P(c_k \mid c_{k-1}, B_i) = p\big(\theta_{i,k} \mid 0,\, \sigma_{BA}^2\big) \quad (6)$$
where p(.) is either a known distribution, for example Gaussian, or a distribution learnt
from previously recorded data, $\theta_{i,k} = \angle(v_k, b_i)$ for $B_i$, $v_k = c_k - c_{k-1}$ and
$\sigma_{BA}^2$ is a design parameter. We can write
$$P(c_{1:k} \mid B_i) = P(c_1 \mid B_i) \prod_{n=2}^{k} P(c_n \mid c_{n-1}, B_i). \quad (7)$$
[0071] This algorithm can be considered to represent the best outcome of the linear-regression-extrapolation
techniques, e.g. assuming that the distance to the intended destination
$d_M$ is accurately estimated. According to (6) and (7), BA forms a wedge-shaped confidence
interval whose width is set by $\sigma_{BA}$.
Any selectable icon that falls within this region is assigned a high probability.
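As a minimal sketch of the BA model, the following assumes a zero-mean Gaussian on the bearing angle with a fixed standard deviation (here called sigma_ba, a hypothetical name), and measures the angle between the heading and the direction from the previous position to the destination. How the angle is referenced and all numeric values are assumptions made for illustration.

```python
import math

def bearing_angle(prev, curr, dest):
    """Angle (radians) between heading c_k - c_{k-1} and the direction from c_{k-1} to b_i."""
    heading = [c - p for c, p in zip(curr, prev)]
    to_dest = [b - p for b, p in zip(dest, prev)]
    n_h = math.sqrt(sum(x * x for x in heading))
    n_d = math.sqrt(sum(x * x for x in to_dest))
    if n_h == 0.0 or n_d == 0.0:
        return 0.0  # no movement or already at the destination: treat as aligned
    cos_t = sum(a * b for a, b in zip(heading, to_dest)) / (n_h * n_d)
    return math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against rounding error

def ba_log_likelihood(points, dest, sigma_ba):
    """Sum of zero-mean Gaussian log-densities of the bearing angle over consecutive pairs."""
    total = 0.0
    for prev, curr in zip(points, points[1:]):
        theta = bearing_angle(prev, curr, dest)
        total += -0.5 * math.log(2.0 * math.pi * sigma_ba ** 2)
        total += -0.5 * theta ** 2 / sigma_ba ** 2
    return total

# Hypothetical track moving along +x, with one icon ahead and one to the side.
track = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ahead = ba_log_likelihood(track, (10.0, 0.0, 0.0), sigma_ba=0.3)
side = ba_log_likelihood(track, (0.0, 10.0, 0.0), sigma_ba=0.3)
```

A smaller sigma_ba narrows the wedge of headings that receive a high likelihood.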
Heading and Solid Angle (HSA) Model
[0072] The HSA model is based upon a distance of the object from the item at an instant
in time $t_k$ and a solid angle of the item. The HSA model may use the current position of the
object at an instant in time $t_k$ and a previous position of the object, which may be the
position at $t_{k-1}$.
[0073] In the HSA model an object $B_i$ has a smaller solid angle if the observer is far from
its location compared with that if the observer is nearby, as demonstrated in Fig. 3. The solid
angle (in steradians) of a sphere located at distance $d_{i,k}$ is approximated by
$$\Omega_{i,k} \approx \frac{A \cos(\alpha_k)}{d_{i,k}^2} \quad (8)$$
where A is the area of the target object. Targets of arbitrary shapes can be
closely approximated by a number of spheres. The parameter $\alpha_k$, which is the exposure
angle, is irrelevant to the prediction problem and $\alpha_k = 0$ is assumed. The direction of
travel is specified by the measured velocity vector $v_k$ at $t_k$ and the HSA likelihood
for two consecutive pointing positions can be obtained via
$$P(c_k \mid c_{k-1}, B_i) = \mathcal{N}\big(\theta_{i,k};\, 0,\, \kappa\, \Omega_{i,k}\big). \quad (9)$$
[0074] Similar to the BA model, the divergence of the bearing from the location of $B_i$
is defined by $\theta_{i,k} = \angle(v_k, b_i)$, and $\kappa$ is a design parameter. If the pointing finger
is in close proximity to a possible target, bigger $\theta_{i,k}$ values are tolerated due to the
resultant $\Omega_{i,k}$. The HSA model can be viewed as a combined BA and NN model. The probability
$P(c_{1:k} \mid B_i)$ can be calculated similarly to (7).
[0075] It is noted that a distribution other than Gaussian with the relative moments, for
example learnt from the collected pointing trajectories, can be applied in the NN,
BA and HSA prediction models.
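The following is an illustrative sketch of the HSA idea only. It assumes that the bearing angle is zero-mean Gaussian with a variance proportional to the solid angle, i.e. kappa multiplied by A divided by the squared distance (with the exposure angle set to zero); this variance form, the reference point used for the angle, and all names and values are assumptions rather than details confirmed by the text.

```python
import math

def _angle(u, w):
    """Angle (radians) between two vectors; 0.0 if either has zero length."""
    n_u = math.sqrt(sum(a * a for a in u))
    n_w = math.sqrt(sum(a * a for a in w))
    if n_u == 0.0 or n_w == 0.0:
        return 0.0
    cos_t = sum(a * b for a, b in zip(u, w)) / (n_u * n_w)
    return math.acos(max(-1.0, min(1.0, cos_t)))

def hsa_log_likelihood(points, dest, area, kappa):
    """Gaussian log-density of the bearing angle with variance kappa * (area / distance^2)."""
    total = 0.0
    for prev, curr in zip(points, points[1:]):
        heading = [c - p for c, p in zip(curr, prev)]
        to_dest = [b - p for b, p in zip(dest, prev)]
        dist = max(math.dist(curr, dest), 1e-9)  # guard against division by zero
        omega = area / dist ** 2                 # solid angle with exposure angle 0
        var = kappa * omega
        theta = _angle(heading, to_dest)
        total += -0.5 * (math.log(2.0 * math.pi * var) + theta ** 2 / var)
    return total

step = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # hypothetical single move along +x
ahead = hsa_log_likelihood(step, (5.0, 0.0, 0.0), area=1.0, kappa=1.0)
side = hsa_log_likelihood(step, (1.0, 4.0, 0.0), area=1.0, kappa=1.0)
near = hsa_log_likelihood(step, (1.0, 1.0, 0.0), area=1.0, kappa=1.0)
far = hsa_log_likelihood(step, (4.0, 4.0, 0.0), area=1.0, kappa=1.0)
```

In this sketch the near and far targets lie at the same bearing, so the difference between their likelihoods comes only from the distance-dependent variance.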
Linear Destination Reverting (LDR) Model
[0076] In this approach, the movement of a pointing object is modelled as a function of
the intended destination. The characteristics of the pointing movements captured by
the adopted model are denoted by a state $s_t$ at time $t$. They can include the pointing object
location, multidimensional velocity, multidimensional acceleration, etc. An underlying premise
is that the pointing object reverts to the intended destination at a rate that can be specified
in the model. A Markov process is then defined in which the current pointing movement
characteristics are a linear function of the one or more previous moves and the destination.
Thus, each of the $N$ possible destinations in a set $\mathbb{B} = \{B_i : i = 1, 2, \ldots, N\}$
is associated with a model. The model that matches the characteristics of the pointing
object trajectory in the current pointing task is assigned a high probability
and vice versa. Below we describe two possible LDR models.
Mean Reverting Diffusion (MRD)
[0077] The MRD models the object movements as a process that reverts to a particular average
value, for example a possible destination. It may consider only the location characteristic
of the pointing movement, and therefore $s_k = c_k$. It assumes that the current pointing object
location should be at the destination, which exerts an attraction force to bring the pointing
object to its location. In continuous time, the pointing object movement is modelled as a
multivariate Ornstein-Uhlenbeck process with a mean-reverting term. For the $i$th of the
$N$ possible destinations, it is described by
$$ds_{i,t} = \Lambda\,(b_i - s_{i,t})\,dt + \sigma\,dw_t. \quad (10)$$
[0078] The square matrix $\Lambda$ sets the mean reversion rate that steers the evolution of the
process, $b_i$ is the location of the $i$th possible destination, $\sigma$ is a square matrix that
drives the process dispersion and $w_t$ is a Wiener process. Upon integrating (10) and
discretising the outcome, we have:
$$s_{i,k} = e^{-\Lambda \tau_k}\, s_{i,k-1} + \big[I - e^{-\Lambda \tau_k}\big]\, b_i + \varepsilon_k \quad (11)$$
where $s_{i,k}$ and $s_{i,k-1}$ are the state vectors with respect to $B_i$ at the time instants
$t_k$ and $t_{k-1}$ respectively. The time step is denoted by $\tau_k = t_k - t_{k-1}$ and
$\varepsilon_k$ is an additive Gaussian noise.
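For illustration, one discretised step of a mean-reverting (Ornstein-Uhlenbeck) process can be sketched as below. The sketch assumes the simplest case in which the reversion-rate and dispersion matrices are scalar multiples of the identity (lam and sigma, both hypothetical names), so that each axis evolves independently; the noise variance used is the standard closed form for a scalar Ornstein-Uhlenbeck process.

```python
import math
import random

def mrd_step(state, dest, lam, sigma, tau, rng):
    """Advance the state by tau: decay towards dest plus zero-mean Gaussian noise."""
    decay = math.exp(-lam * tau)
    # Standard deviation of the integrated noise for a scalar OU process.
    noise_std = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * lam))
    return [decay * s + (1.0 - decay) * b + rng.gauss(0.0, noise_std)
            for s, b in zip(state, dest)]

rng = random.Random(0)
# With sigma = 0 the step is deterministic: tau = ln 2 / lam halves the remaining distance.
halfway = mrd_step([0.0, 0.0, 0.0], [1.0, 1.0, 1.0], lam=1.0, sigma=0.0,
                   tau=math.log(2.0), rng=rng)

# A noisy simulated track reverting towards a hypothetical destination.
state = [0.0, 0.0, 0.0]
track = [state]
for _ in range(50):
    state = mrd_step(state, [10.0, 5.0, 0.0], lam=2.0, sigma=0.5, tau=0.05, rng=rng)
    track.append(state)
```

A larger lam pulls the simulated finger towards the destination more quickly, while sigma controls how far it wanders on the way.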
Equilibrium Reverting Velocity (ERV)
[0079] Each of the nominal destinations is assumed to have a gravitational field with strength
inversely proportional to the distance from its centre $b_i$. The speed of travel of the object
towards the destination location $b_i$ is expected to be the highest when the object is far from
$b_i$ and vice versa. The movements of the object are modelled with respect to the
$i$th destination as
$$ds_{i,t} = A\,(\mu_i - s_{i,t})\,dt + \sigma\,dw_t \quad (12)$$
where $s_t = [x_t, \dot{x}_t, y_t, \dot{y}_t, z_t, \dot{z}_t]^T$ such that $\dot{x}_t$, $\dot{y}_t$ and
$\dot{z}_t$ are the velocities along the $x$, $y$ and $z$ axes, respectively;
$A = \mathrm{diag}\{A_x, A_y, A_z\}$ with, for example,
$A_y = \begin{bmatrix} 0 & -1 \\ \eta_y & \rho_y \end{bmatrix}$;
$\mu_i = [b_{x,i}, 0, b_{y,i}, 0, b_{z,i}, 0]^T$ encompasses the coordinates of $B_i$; and
$w_t$ is a Wiener process. Each of $\eta_x$, $\eta_y$ and $\eta_z$ dictates the restoration force
along its corresponding axis; each of $\rho_x$, $\rho_y$ and $\rho_z$ represents a damping factor
to smooth the velocity transitions. After integrating (12), we can represent the discretised
resultant by
$$s_{i,k} = e^{-A \tau_k}\, s_{i,k-1} + \big[I - e^{-A \tau_k}\big]\, \mu_i + \varepsilon_k. \quad (13)$$
[0080] Given the Gaussian and linear nature of the LDR models, for example (11) and (13),
a linear optimal recursive filter can be used to determine the sought
$\{P(m_{1:k} \mid B_i) : i = 1, 2, \ldots, N\}$, assuming linearly collected measurements
$m_k = H_k s_k + n_k$ such that $n_k$ is multivariate additive white Gaussian noise. For a
destination $B_i$, the probability $P(m_{1:k} \mid B_i)$ can be calculated sequentially since,
according to the chain rule,
$P(m_{1:k} \mid B_i) = P(m_k \mid m_{1:k-1}, B_i) \times \cdots \times P(m_2 \mid m_1, B_i) \times P(m_1 \mid B_i)$.
This implies that at time $t_k$ only the predictive probability $P(m_k \mid m_{1:k-1}, B_i)$ is
required to determine $P(m_{1:k} \mid B_i)$ for the $i$th nominal destination. The pursued
$P(m_k \mid m_{1:k-1}, B_i)$ can be obtained from a Linear Kalman Filter (LKF) whose purpose here
is not to track the object, but to produce the predictive probability. As a result, the predictor
comprises N Kalman filters, each dedicated to a particular nominal suspected destination.
[0081] Linear destination reverting models, other than the MRD and ERV, that include more
movement characteristics such as acceleration or jerks can be applied. Their implementation
is similar to the MRD and ERV models via a bank of statistical filters.
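The bank-of-filters idea can be sketched, under simplifying assumptions, with a one-dimensional mean-reverting model per destination and direct noisy observations of position. Each scalar Kalman filter accumulates the log of the predictive probability of each new measurement (the chain-rule decomposition described above). The scalar state, the parameter names and all numeric values are hypothetical choices made for brevity.

```python
import math

def mrd_log_likelihood(obs, times, dest, lam, sigma, obs_std, prior_mean, prior_var):
    """log P(m_{1:k} | B_i) accumulated from the one-step predictive densities."""
    mean, var = prior_mean, prior_var
    log_like = 0.0
    prev_t = None
    for m, t in zip(obs, times):
        if prev_t is not None:
            # Predict: scalar mean-reverting transition over the elapsed time.
            decay = math.exp(-lam * (t - prev_t))
            mean = decay * mean + (1.0 - decay) * dest
            var = decay ** 2 * var + sigma ** 2 * (1.0 - decay ** 2) / (2.0 * lam)
        # Predictive density of the measurement (innovation) given the past.
        innov = m - mean
        innov_var = var + obs_std ** 2
        log_like += -0.5 * (math.log(2.0 * math.pi * innov_var) + innov ** 2 / innov_var)
        # Correct: standard scalar Kalman update.
        gain = var / innov_var
        mean += gain * innov
        var *= 1.0 - gain
        prev_t = t
    return log_like

# Hypothetical 1-D example: a noiseless path reverting towards position 10.
times = [0.0, 0.1, 0.2, 0.3]
obs = [10.0 * (1.0 - math.exp(-2.0 * t)) for t in times]
destinations = [-10.0, 0.0, 10.0]
log_likes = [mrd_log_likelihood(obs, times, b, lam=2.0, sigma=1.0, obs_std=0.1,
                                prior_mean=0.0, prior_var=1.0) for b in destinations]
best = log_likes.index(max(log_likes))  # index of the most likely destination
```

Because each filter is conditioned on a different destination, only the filter whose destination matches the observed drift keeps its innovations small.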
Nonlinear Destination Reverting (NLDR) Model
[0082] In this approach, the model of the movements of an object is assumed to include the
destination, the characteristics of the pointing movements and nonlinear phenomena such as jumps
or jolts representing perturbations in the pointing trajectory due to external factors.
An example is carrying out a pointing task in a vehicle moving over harsh terrain
as in Figure 1b. An example of a perturbation process is the jump process
$p_t$, which represents factors that knock the pointing object off its planned trajectory.
For example, $dp_t = \sigma_p\, dW_{2,t} + \sigma_J\, dJ_t$, where $J_t$ is the jump process
and $I$ is the number of jumps/jolts. The jump effect allows occasional large impulsive
shocks to the pointing object location, velocity or acceleration, permitting the modelling
of sharp jolts or sudden movements. Other nonlinear models that capture the characteristics
of the present perturbations may be considered. The model state for
each nominal destination, $s_{i,t}$, in the NLDR incorporates the pointing object position
$c_t = [x_t, y_t, z_t]^T$, other characteristics of $c_t$ (for example velocity $\dot{c}_t$ or
acceleration $\ddot{c}_t$, etc.), perturbations $p_t$, other characteristics of
$p_t$ and the destination $B_i$.
[0083] Similar to the LDR model, the underlying premise is that the pointing object reverts
to the intended destination at a rate that can be specified in the model. A Markov
process is then defined in which the current pointing movement characteristics are a linear
function of the one or more previous moves, the present nonlinear perturbations and
the destination. Thus, each of the $N$ possible destinations in the set
$\mathbb{B} = \{B_i : i = 1, 2, \ldots, N\}$ is associated with a model. The model that matches
the characteristics of the pointing object trajectory in the current pointing task is assigned
a high probability and vice versa. Accordingly, a bank of $N$ statistical filters is applied to
sequentially obtain the sought $\{P(m_{1:k} \mid B_i),\ i = 1, 2, \ldots, N\}$. Approaches such as
sequential Monte Carlo methods or other numerical techniques can be utilised to attain the
pursued $P(m_{1:k} \mid B_i)$ given the nonlinear nature of the state evolution equation once the
nonlinear perturbations are included. Minimising the computational complexity of the nonlinear
filtering approaches can be achieved by assuming that the perturbations such as jumps or jolts
are identical in the bank of N statistical filters. Hence, they need to be tracked or identified
only once.
Bridging Distributions (BD) Model
[0084] In this approach, the movement of an object is modelled as a bridge distribution,
such as a Markov bridge. In some embodiments the movement of the object is modelled
as one of several Markov bridges, each incorporating one of a plurality of possible
destinations, e.g. selectable icons on a GUI displayed on a touchscreen. The path
of the object, albeit random, must end at the intended destination, i.e. it follows
a bridge distribution from its start point to the destination. By determining a likelihood
of the observed partial object trajectory being drawn from a particular bridge, the
probability of each possible destination is evaluated. The bridging model may be based
upon a Linear Destination Reversion (LDR) or a Nonlinear Destination Reversion (NLDR)
model.
[0085] Let $\mathbb{B} = \{B_i : i = 1, 2, \ldots, N\}$ be a set of N nominal destinations, e.g. GUI
icons such as on an in-vehicle touchscreen, although it will be realised that other GUIs may be
envisaged. The objective is to determine the probability of each of these endpoints being the
intended destination $B_I$ of the tracked object given a series of k measurements, i.e. to
calculate $P(B_i \mid m_{1:k})$ for all nominal destinations, where i = 1, 2, ..., N. The $k$th
observation $m_k = [\hat{x}_{t_k}\ \hat{y}_{t_k}\ \hat{z}_{t_k}]'$ at time $t_k$ can be the object
or pointing finger 3D coordinates. It is derived from a true, but unknown, underlying object
position $c_k$; its velocity at the time $t_k$ is denoted $\dot{c}_k$.
[0086] The location of the tracked object, i.e. the pointing fingertip, at the end of the
pointing task is that of the intended destination $B_I$. Let T be the total duration of the
overseen task, i.e. the duration needed by the tracked object to reach its destination. The
hidden state of the tracked object at time T is given by
$$s_T = [c_T'\ \ \dot{c}_T']'$$
where $c_T$ and $\dot{c}_T$ are the true finger position and velocity at T respectively;
$\hat{b}_i = [b_i'\ \ v_i']'$ such that $b_i$ denotes the known location of the $i$th destination,
e.g. GUI icon in 3D, and $v_i$ is the tracked object velocity upon reaching the destination.
Thus, the probability of $B_i$ being the intended destination is:
$$p(B_i \mid m_{1:k}) \propto p(B_i)\, p(m_{1:k} \mid B_i) = p(B_i) \int p(m_{1:k} \mid B_i, T)\, p(T \mid B_i)\, dT \quad \text{(BD 1)}$$
since $p(m_{1:k} \mid s_T = \hat{b}_i) = p(m_{1:k} \mid B_i, T)$; T is unknown. The priors $p(B_i)$
summarise existing knowledge about the probability of various endpoints in $\mathbb{B}$ being
the intended one, before any pointing data is observed; they are independent
of the current trajectory $m_{1:k}$. Uninformative priors can be constructed by assuming that all
possible destinations are equally probable, i.e. $p(B_i) = 1/N$, i = 1, 2, ..., N. However, if
priors are available based on relevant contextual information, such as tracked object travel
history, GUI interface design or user profile, they can easily be incorporated as per (BD 1).
The objective, then, is to estimate the integral
$$\int p(m_{1:k} \mid B_i, T)\, p(T \mid B_i)\, dT$$
for each of the N possible destinations. A simple quadrature approximation of this integral
is given by:
$$\int p(m_{1:k} \mid B_i, T)\, p(T \mid B_i)\, dT \approx \sum_{n} p(m_{1:k} \mid B_i, T_n)\, p(T_n \mid B_i)\, \Delta T_n$$
where $\Delta T_n = T_n - T_{n-1}$ and the $T_n$ are quadrature points, ideally chosen to cover the
majority of the probability mass in $p(T \mid B_i)$. More sophisticated quadrature or Monte-Carlo
estimates could also be employed. Uniform arrival-time priors can be assumed, i.e.
$p(T \mid B_i)$ uniform over an interval of plausible arrival times.
Otherwise, learnt or inferred priors on the task durations can be applied.
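A simple quadrature of this kind may be sketched as follows. The sketch uses a rectangle rule evaluated at the right-hand quadrature points; the conditional likelihood and the arrival-time prior are passed in as plain functions, and those callables, together with the grid of arrival times, are hypothetical placeholders rather than parts of the disclosed system.

```python
def quadrature_likelihood(cond_like, arrival_prior, arrival_times):
    """Approximate the integral over T of cond_like(T) * arrival_prior(T).

    arrival_times holds T_0, T_1, ..., T_q in increasing order; each term uses
    the step T_n - T_{n-1} and evaluates the integrand at T_n.
    """
    total = 0.0
    for prev_t, t in zip(arrival_times, arrival_times[1:]):
        total += cond_like(t) * arrival_prior(t) * (t - prev_t)
    return total

# Hypothetical grid of arrival times (seconds) and a uniform prior on [1, 3].
grid = [1.0, 1.5, 2.0, 2.5, 3.0]
uniform_prior = lambda t: 0.5

# A constant conditional likelihood integrates to the total prior mass.
mass = quadrature_likelihood(lambda t: 1.0, uniform_prior, grid)

# A likelihood that grows with T, weighted by a unit density.
ramp = quadrature_likelihood(lambda t: t, lambda t: 1.0, grid)
```

A finer grid, or a grid concentrated where the arrival-time prior has most of its mass, would reduce the approximation error at the cost of more likelihood evaluations.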
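[0087] Adopting a linear motion model, the state $s_k$ of the user's finger
at time $t_k$ is assumed to follow the linear Gaussian motion model:
$$s_k = F_k\, s_{k-1} + \varepsilon_k \quad \text{(BD 3)}$$
with $\varepsilon_k \sim \mathcal{N}(0, Q_k)$.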
This general form permits many useful motion models, the simplest of which is the
(near) constant velocity model, which is the solution of the continuous-time stochastic
differential equation
where $dW_t$ is the instantaneous change of a standard Brownian motion at time t, $0_3$ is a
3 × 3 zero matrix, $I_3$ is a 3 × 3 identity matrix and
is a 3 × 1 zero vector. The corresponding $F_k$ and $Q_k$ matrices in equation (BD 3) are given
by $F_k = M(\Delta_k)$ and $Q_k = R(\Delta_k)$, where the time step $\Delta_k = t_k - t_{k-1}$
(which can vary, allowing asynchronous observations), and
with σ setting the motion model state transition noise level. The movements in the
x, y and z dimensions are considered to be independent from one another. Observations
are assumed to be a linear function of the current system state with additive Gaussian
noise, such that
with
It is noted that other motion models suitable for intent inference could be
utilised in this framework. These include the destination-reverting models and the
linear portion of the perturbation removal model.
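As an illustration of the (near) constant velocity model, the textbook closed-form transition and process-noise matrices for a time step delta can be built as below. The state ordering (three positions followed by three velocities) and the closed forms are the standard ones for this model with independent axes; they are stated here as an assumption and not as the exact expressions of the disclosure.

```python
def ncv_matrices(delta, sigma):
    """Return (M, R): 6x6 transition and process-noise matrices for time step delta.

    State ordering: [x, y, z, vx, vy, vz]; the three axes are independent.
    """
    m = [[0.0] * 6 for _ in range(6)]
    r = [[0.0] * 6 for _ in range(6)]
    for axis in range(3):
        p, v = axis, axis + 3
        m[p][p] = 1.0
        m[p][v] = delta            # position advances by velocity * delta
        m[v][v] = 1.0
        r[p][p] = sigma ** 2 * delta ** 3 / 3.0
        r[p][v] = r[v][p] = sigma ** 2 * delta ** 2 / 2.0
        r[v][v] = sigma ** 2 * delta
    return m, r

# Hypothetical time step of 0.5 s and noise level of 2.0.
M, R = ncv_matrices(0.5, 2.0)
```

Because the matrices depend only on the elapsed time, the same function serves irregularly spaced (asynchronous) observations.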
[0088] Without conditioning information, the distribution of a hidden state $s_k$ given
observations $m_{1:k}$ in equations (BD 3) and (BD 5) can be calculated by a standard Kalman Filter (KF)
as per
with (using the 'correct' step of the Kalman filter):
[0089] Here,
and
are derived from the inferred system distribution at t - 1, given by the prediction
step of the KF:
when k = 1, these quantities are given by the priors, so that
and
∑
prior. They represent prior knowledge of track start position,
[0090] In order to condition on the system state at the destination arrival time, $s_T$, it is
necessary to evaluate the density $p(s_T \mid s_k)$ for the current tracked object state (and
arrival time). For motion models derived from continuous-time processes, such as the near
constant velocity model, this is possible by direct integration of the motion model (which is
possible in the linear time-invariant Gaussian case). For the near constant velocity model,
this is given by
$$p(s_T \mid s_k) = \mathcal{N}\big(s_T;\, M_k s_k,\, R_k\big)$$
where $M_k = M(T - t_k)$ and $R_k = R(T - t_k)$ from equation (BD 4), and $T - t_k$ is the time
step between the $T$th and $k$th observations. Alternatively, forward or backward recursions can
be formed in terms of $F_{2:T}$ and $Q_{2:T}$, which can be used with discrete models without a
continuous-time interpretation.
[0091] Subsequently, the conditional predictive distribution of $s_k$ given the $k - 1$
observations and the intended destination (which specifies $s_T$) can be shown to reduce to
[0092] This can be seen by analogy to the 'correct' step of the standard Kalman filter.
[0093] By taking the latest observation into account, the correction stage (taking account
of $m_k$) can be shown to be:
where
and
[0094] This can also be seen by analogy with the 'correct' step of the Kalman filter noting
that
[0095] Together with the standard KF, the above predict and correct steps allow the conditional
distribution of finger position to be calculated at the time of each observation,
conditional on the destination and arrival time. It remains to calculate
where it can be shown that:
[0096] This is equivalent to the prediction error decomposition in the KF. Note that, since the
likelihood calculation is the objective of filtering, the corrective step in equation
(BD 13) is not required.
[0097] Using the likelihood in equation (BD 14), the probability of each nominal destination
can be evaluated via equations (BD 1) and (BD 2) upon arrival of a new observation.
The integral in equation (BD 1) can be calculated using a two-step Kalman filter if
a linear model is used to describe the tracked object motion or dynamics as per equations
(BD 1) to (BD 15). This includes utilising the destination-reverting models, such
as the MRD and ERV, within the bridging-distributions-based predictor framework. For
nonlinear motion models, such as nonlinear destination reverting models, modified
advanced statistical inference methods, such as sequential Monte Carlo or Markov chain
Monte Carlo techniques, can be employed. Therefore, various models that describe the
tracked object dynamics can be used within the bridging-distributions-based prediction
framework, thus it can be considered to be a more general approach compared to the
original destination-reverting methods.
[0098] Whilst the BD approach requires some prior knowledge about the total duration of
the pointing task, i.e. distribution of those durations rather than a fixed value,
it delivers superior prediction results compared to using the destination-reverting
models alone as shown below. The required prior knowledge, that is
$P(T \mid B_i)$, can be obtained during the training phase undertaken by the system user or from
previously observed trajectories.
[0099] Predictors using bridging distributions also allow the intended destination to be
defined as a spatial region. This approach takes into account the destination sizes
and caters for the scenario in which the destinations have distinct sizes/spatial areas.
This is achieved by defining each destination as a random variable with a mean and
covariance. The location of the centre of the destination can be the distribution
mean (or a function of the mean) and the variance captures the destination spatial
area (or the spatial area is a function of the covariance). This is a more practical
formulation compared to the original destination-reverting-based techniques, such
as MRD and ERV, where each destination is considered to be a single location/point.
[0100] As will be described below with reference to Figure 12, the bridging model is able
to predict, well in advance, the intended destination of an object, such as of an
in-vehicle pointing gesture. In this case, the pointing gesture time or duration may
be reduced.
[0101] If the observation model for the LDR, NLDR or BD is not linear, or the present noise
is non-Gaussian, for example $m_k = f_k(s_k) + n_k$ where $f_k(.)$ is a nonlinear function,
alternative statistical filtering approaches such as sequential Monte Carlo methods or other
numerical techniques may be utilised to attain the pursued $P(m_{1:k} \mid B_i)$.
[0102] While the processing means 220 produces the probability of each target being the
destination, it might be desirable to sequentially obtain in real time the underlying
unperturbed pointing object trajectory or its characteristics represented by
$s_k$, i.e. after removing unintentional movements or the present perturbations. This can
be achieved either by combining the results of the $N$ statistical filters used for
intentionality prediction or by performing the smoothing operation as a pre-processing stage
that precedes calculating $\{P(m_{1:k} \mid B_i),\ i = 1, 2, \ldots, N\}$. The former is equivalent
to calculating the posterior distribution of the state $s_k$ at the time instant $t_k$;
$s_k$ incorporates the pointing object location $c_k$. The distribution is given by
$$P(s_k \mid m_{1:k}) = \sum_{i=1}^{N} P(s_k \mid m_{1:k}, B_i)\, P(B_i \mid m_{1:k})$$
where $P(s_k \mid m_{1:k}, B_i)$ is produced by the sequential state update of the statistical
filter and $P(B_i \mid m_{1:k})$ for $i = 1, 2, \ldots, N$ is a determined constant. The summation in
$P(s_k \mid m_{1:k})$ results in a Gaussian mixture model, with the minimum mean squared error and
maximum a posteriori estimators of $s_k$ being the mean and mode of the resultant distribution,
respectively.
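The mean of such a mixture, i.e. the minimum mean squared error estimate of the state, may be sketched as a probability-weighted average of the per-filter state estimates. The function name and the example numbers below are hypothetical.

```python
def mixture_mean(filter_means, dest_posteriors):
    """Weighted average of per-destination state means using P(B_i | m_{1:k}) as weights."""
    dim = len(filter_means[0])
    return [sum(w * mu[j] for w, mu in zip(dest_posteriors, filter_means))
            for j in range(dim)]

# Two hypothetical filters with 2-D state means and their destination posteriors.
means = [[0.0, 0.0], [2.0, 4.0]]
weights = [0.25, 0.75]
estimate = mixture_mean(means, weights)
```

The mode of the mixture (the maximum a posteriori estimate) generally has no closed form and would need a numerical search, which is omitted here.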
[0103] Removing the perturbations prior to calculating
$\{P(m_{1:k} \mid B_i),\ i = 1, 2, \ldots, N\}$ to establish the intended destination or destinations
entails modelling the pointing process as the sum of the intentional pointing object movements
plus unintentional perturbations or noise. In this case, the observed pointing object location
using a pointing object tracker module 210 can be modelled as
where the unintentional perturbations-related movements and their characteristics
are captured in $p_k$ and the measurement noise is denoted by $\varepsilon_k$. Various perturbation
models can be used, including the jump diffusion model. The true pointing movement and/or its
characteristics can be modelled using a linear model, thus $s_k = F_k s_{k-1} + v_k$, where
$s_k$ incorporates the location of the pointing object, velocity, acceleration, etc.,
$F_k$ is the state transition matrix and $v_k$ is the present noise. Nearly constant velocity or
acceleration models can be used to model the pointing movement in this case, which is
independent of the destination. Statistical filtering approaches can be applied to extract
$s_k$ from $m_k$ by removing or suppressing the unintentional perturbations-related movements.
Such techniques include Kalman filtering in the case of linear state and perturbation models.
Various adapted versions of Kalman filtering, sequential Monte Carlo methods or other
numerical techniques can be utilised for nonlinear state or observation models.
[0104] Figure 9 illustrates a trajectory 910 of an object which exhibits perturbations due,
for example, to movement of a vehicle in which the object is moving. A filtered trajectory
920 of the object is also shown which exhibits a more direct course toward the intended
target.
[0105] It has been observed by the present inventors that only a weak correlation exists
between acceleration determined from data output by the location sensing device 210
and that measured by an inertial measurement unit (IMU) or accelerometer. Thus, whilst
use of the IMU data to compensate for noise in the location measurements may not be
effective, the IMU data may be used for modifying the applied pre-processing and/or the
model.
[0106] The processing means 220 may comprise an intent inference module 222 for determining
the intended target $\hat{B}(t_k)$ of the object at time instant $t_k$.
[0107] Determining the intended destination, or a number of possible destinations, or the
area of the possible destinations at the time instant $t_k$ relies on the calculated
probabilities $P(B_i \mid m_{1:k})$ for $i = 1, 2, \ldots, N$. The decision may be based on a cost
function $C(B_i, B^*)$ that ranges from 0 to 1. It penalises an incorrect decision, where
$B_i$ is the predicted destination and $B^*$ is the true intended target in the considered
pointing task. For example, predicting the wrong destination may impose a maximum cost of 1.
Therefore, the objective is to minimise the average of the cost function in a given pointing
task given the partially observed pointing trajectory $m_{1:k}$ according to
$$\hat{B}(t_k) = \underset{B_i}{\arg\min}\; \mathbb{E}\big[C(B_i, B^*) \mid m_{1:k}\big]$$
where $\mathbb{E}[.]$ is the mean. Assuming the hard-decision criterion, where
$C(B_i, B^*) = 0$ if $B_i = B^*$ and $C(B_i, B^*) = 1$ otherwise, leads to selecting one target
out of the set $\{B_i : i = 1, 2, \ldots, N\}$. In this case, it is equivalent to determining the
MAP destination estimate. Other cost function formulations that reflect the desired level of
prediction certainty may be used, and subsequently a group of selectable targets may be
selected in lieu of one as with the MAP case.
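The cost-based decision can be illustrated with a small sketch in which the expected cost of declaring each destination is computed from the destination probabilities and a cost table. The table layout (rows for the declared destination, columns for the true one) and the function names are assumptions made for illustration; with the 0/1 cost the minimiser coincides with the most probable destination.

```python
def min_expected_cost(posteriors, cost):
    """Return (best index, expected costs) where cost[i][j] is the cost of
    declaring destination i when destination j is the true target."""
    expected = [sum(c * p for c, p in zip(row, posteriors)) for row in cost]
    best = min(range(len(expected)), key=expected.__getitem__)
    return best, expected

def zero_one_cost(n):
    """Hard-decision cost: 0 for the correct destination, 1 otherwise."""
    return [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]

# Hypothetical destination probabilities for three icons.
posteriors = [0.2, 0.5, 0.3]
best, expected = min_expected_cost(posteriors, zero_one_cost(3))
```

Other cost tables, for example ones that charge less for choosing a neighbouring icon, can be passed in without changing the decision function.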
[0108] The Bayesian approach relies on a belief-based inference followed by a classifier.
Since the aim is to utilise the available pointing trajectory to determine the destination,
a uniform prior may in some embodiments be assumed on all items, for example
$P(B_i) = 1/N$ for $i = 1, 2, \ldots, N$. In this case, the classification problem corresponds to
maximum likelihood estimation and the solution relies solely on establishing
$P(m_{1:k} \mid B_i)$ for $i = 1, 2, \ldots, N$. However, in other embodiments a non-uniform prior
may be used for the items. For example, information concerning previous selections from a GUI
may be used as the prior such that the likelihood of the intended destination is influenced by
a history of user selections. It will be realised that the prior may alternatively or
additionally be based on other information.
[0109] In some embodiments only the last $L$ logged true object positions, i.e.
$\{c_{k-L}, c_{k-L+1}, \ldots, c_k\}$ with $k - L > 0$, may be used to determine $\hat{B}(t_k)$.
In these embodiments a sliding time window is applied to the trajectory data and
a width of the window may be chosen appropriately.
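A sliding window of this kind may be kept, for example, in a fixed-length buffer that discards the oldest position automatically. The window parameter L and the synthetic positions below are hypothetical.

```python
from collections import deque

def make_window(L):
    """Buffer holding the positions c_{k-L}, ..., c_k (L + 1 entries at most)."""
    return deque(maxlen=L + 1)

window = make_window(3)
for k in range(10):
    # Append a synthetic position; the oldest entry is dropped once the buffer is full.
    window.append((float(k), 0.0, 0.0))

recent_positions = list(window)  # positions passed on to the prediction model
```

A shorter window reacts faster to changes of direction but uses less of the trajectory as evidence.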
[0110] Figure 4 illustrates a method 400 according to an embodiment of the invention. The
method 400 may be performed by the system 200 described with reference to Figure 2.
[0111] In step 410 a location of the object at an instant in time is determined. The location
of the object may be determined by the location sensing device 210 receiving radiation,
such as light or sound, reflected from the object and, from the received radiation,
determining at the time instant $t_k$ location data $m_k$ indicative of the location of the
object. The location data may be stored in a memory
to form data indicative of a trajectory of the object over a period of time.
[0112] In step 420 a likelihood of one or more items being the intended target of the object
is determined. The likelihood P may be determined as $P(B_i \mid m_{1:k})$ as explained above.
Step 420 may be performed by the trajectory module 221, as previously explained. In some
embodiments the likelihood $P(B_i \mid m_{1:k})$ of being the intended destination is determined
for each of a plurality of items in step 420. The likelihood for the one or the plurality of
items being the intended destination is determined based upon a model and the location of the
object determined in step 410.
[0113] In step 430 the intended target is determined. The intended target may be determined
from the likelihood $P(B_i \mid m_{1:k})$ for each of a plurality of items. Step 430 may be
performed by the intent inference module 222 as discussed above.
Step 430 may comprise determining the Maximum a Posteriori (MAP) estimate.
[0114] In some embodiments the method 400 comprises a step 440 in which an output is determined
based on the result of step 430. The output may comprise a selection or operation
of the intended target. That is, where the intended target is a user-selectable item
on the GUI, the item may be selected as though the user had touched the display device
to select the item. Alternatively where the intended target is a button or control
the button or control may be activated.
[0115] The output may be provided via the display device 230. The output may be a modification
of the GUI displayed on the display device responsive to the determination of the
intended target in step 430. The output may only occur once the likelihood associated
with the intended target reaches a predetermined probability P, thereby avoiding the
item being selected when the likelihood is relatively low. In some embodiments the
output of step 440 may comprise a modification to the appearance of the GUI. For example
the intended target may be highlighted on the GUI. The intended target may be highlighted
when the likelihood associated with the intended target reaches a predetermined probability
P. The predetermined probability may be lower than that for selection of the intended
target, such that, at a first lower probability the intended target is visually indicated
and at a second higher probability the intended target is automatically selected.
In another embodiment a group of intended targets may be visually indicated in the
GUI when their associated likelihoods of being the intended target are at least the
predetermined probability P.
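The two-level behaviour described above (visual indication at a first, lower probability and automatic selection at a second, higher probability) can be sketched as follows. The threshold values and the action labels are hypothetical and chosen only for illustration.

```python
def gui_action(posteriors, highlight_at=0.6, select_at=0.9):
    """Map destination probabilities to a GUI action for the most probable item."""
    best = max(range(len(posteriors)), key=posteriors.__getitem__)
    p = posteriors[best]
    if p >= select_at:
        return ("select", best)      # second, higher probability: auto-select
    if p >= highlight_at:
        return ("highlight", best)   # first, lower probability: visual indication
    return ("none", None)            # likelihood too low: leave the GUI unchanged

early = gui_action([0.4, 0.35, 0.25])
middle = gui_action([0.2, 0.7, 0.1])
late = gui_action([0.03, 0.95, 0.02])
```

Keeping the highlight threshold below the selection threshold means the user sees the predicted item before it is acted upon.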
[0116] In step 450 it is determined whether the method is complete. If the method is not
complete, then the method returns to step 410. If, however, the method is complete
then the method ends. The method 400 may be complete when the likelihood associated
with one or more items reaches a predetermined threshold probability. For example
the method 400 may end when the likelihood reaches the second probability discussed
in relation to step 440 at which the intended target is automatically selected.
[0117] Figure 5 illustrates results of an experiment in predicting an intended item on a
GUI against the percentage of completed pointing movement, i.e. $100 \times t_k / t_M$, averaged
over all considered pointing tasks ($t_M$ is the total pointing task completion time). Results
using the NN, BA, MRD and ERV models are illustrated. Figure 5 starts after completing 15% of
the pointing trajectory duration, prior to which none of the techniques produce meaningful
results. To represent the level of average prediction uncertainty, Fig. 6 displays the mean of
the uncertainty metric given by
$$\varepsilon(t_k) = -\log_{10} P\big(B^*(t_k) \mid m_{1:k}\big)$$
where $P(B^*(t_k) \mid m_{1:k})$ is the calculated probability of the true intended item according
to the prediction model at time instant $t_k$. If the true target is predicted with high
certainty, i.e. $P(B^*(t_k) \mid m_{1:k}) \to 1$, the confidence in the prediction will be very
high as $\varepsilon(t_k) \to 0$. It is noted that the level of the predictor's success in
inferring the destination does not necessarily imply high prediction certainty and vice versa.
In all the simulations, we do not assume that the predictor knows the proportion of the
completed trajectory when making decisions. It can be noticed from Fig. 5 that the proposed
Bayesian approach provides the earliest successful predictions of the intended target,
especially in the crucial first 15% to 75% of the pointing movement duration. This success can
be twice or three times that of the nearest examined competitor. Both MRD and ERV models
exhibit similar behaviour, with MRD prediction quality marginally and temporarily
degrading in the 70%-80% region. This can be due to a failed prediction in a single
experiment. Both of these models provide significant performance improvements compared
with other techniques. The NN method tends to make successful predictions only in
the final portion of the pointing task since the user's finger is inherently close
to the intended item at this stage, i.e. briefly before the selection action. In practice,
an early prediction, e.g. in the first 75% of the pointing task duration, is more
effective at minimising the user movement/cognitive effort, enabling early pointing
facilitation techniques and enhancing the overall user experience. The benefits of
successful intent inference in the last 25% of the pointing gesture duration are questionable
since the user has already dedicated the necessary effort to execute the selection
task. The proposed predictors notably outperform the NN for the majority of the duration
of the pointing task (or all in the ERV case). With regards to the prediction uncertainty,
Fig. 6 shows that the introduced Bayesian predictions can make correct classification
decisions with substantially higher confidence levels compared with other techniques.
This advantage over the NN model inevitably diminishes as the pointing finger gets
closer to the interface in the last portion of the pointing gesture period, e.g. after
completing over 75% of the pointing movement.
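As a small illustration, an uncertainty metric of the negative base-10 logarithm form, and its mean over several pointing tasks, may be computed as below. The probabilities in the example are made up for illustration and are not experimental results.

```python
import math

def uncertainty(p_true):
    """Negative base-10 log of the probability assigned to the true intended item."""
    if p_true <= 0.0:
        return math.inf  # the true item was assigned zero probability
    return -math.log10(p_true)

def mean_uncertainty(p_true_per_task):
    """Average the metric over pointing tasks at the same stage of completion."""
    values = [uncertainty(p) for p in p_true_per_task]
    return sum(values) / len(values)

certain = uncertainty(1.0)                      # confident, correct prediction
weak = uncertainty(0.1)                         # true item given only 10% probability
average = mean_uncertainty([1.0, 0.1, 0.01])    # hypothetical probabilities
```

The metric approaches zero as the predictor becomes confident in the true item and grows without bound as that probability shrinks.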
[0118] Figure 7 provides a similar plot to Figure 5 illustrating prediction based on the
NN, BA, HSA and MRD models. Again it can be noticed from Figure 7 that the MRD model
provides the earliest successful predictions of the intended destination, especially
in the crucial first 85% of the pointing gesture.
[0119] The performance of the proposed Bridging Distributions (BD) predictor for 57 pointing
tracks collected in an instrumented car driven over various road types was assessed.
The data pertains to four passengers undertaking pointing tasks to select highlighted
GUI icons displayed on the in-vehicle touchscreen. The layout of the GUI is similar
to that in Figures 1 and 2 with 21 selectable circular icons that are less than 2
cm apart.
[0120] The predictor performance is evaluated in terms of its ability to successfully establish
the intended icon I via the MAP estimator in (BD 2), i.e. how early in the pointing
gesture the predictor assigns the highest probability to the intended GUI icon I.
This is depicted in Fig. 10 against the percentage of completed pointing gesture (in
time) and averaged over all pointing tasks considered. Fig. 11 shows the proportion
of the total pointing gesture (in time) for which the predictors correctly established
the intended destination. To represent the level of average prediction uncertainty,
Fig. 12 displays the mean of the uncertainty metric given by
$\vartheta(t_k) = -\log_{10} p(B_i \mid m_{1:k})$, where $i$ is the true intended destination;
it is expected that $\vartheta(t_k) \to 0$ as $t_k \to T$ for a reliable predictor.
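The three evaluation quantities described above (MAP success at each time step, the proportion of the gesture with a correct prediction, and the uncertainty metric) can be sketched as follows. This is a minimal illustration that assumes the predictor outputs one posterior probability per candidate destination at each time step. The function names and the posterior values are hypothetical and do not come from the experiments.

```python
import math

def map_estimate(posterior):
    """Index of the destination with the highest posterior probability."""
    return max(range(len(posterior)), key=lambda i: posterior[i])

def uncertainty(posterior, true_index):
    """-log10 of the probability assigned to the true destination.

    Returns infinity if the true destination is given zero probability.
    """
    p = posterior[true_index]
    return math.inf if p == 0 else -math.log10(p)

def evaluate_track(posteriors, true_index):
    """Per-step success flags, proportion of correct steps, per-step uncertainty."""
    success = [map_estimate(p) == true_index for p in posteriors]
    proportion = sum(success) / len(success)
    theta = [uncertainty(p, true_index) for p in posteriors]
    return success, proportion, theta

# Hypothetical posteriors over three icons at four time steps (rows sum to 1).
posteriors = [
    [0.40, 0.35, 0.25],
    [0.30, 0.50, 0.20],
    [0.10, 0.80, 0.10],
    [0.01, 0.98, 0.01],
]
success, proportion, theta = evaluate_track(posteriors, true_index=1)
print(success)     # correct from the second step onwards
print(proportion)  # fraction of steps with a correct MAP decision
print([round(t, 3) for t in theta])  # decreases towards 0 as confidence grows
```

Averaging the success flags over many tracks at each percentage of gesture completion would give a curve of the kind described for Fig. 10, while averaging the uncertainty values would give one of the kind described for Fig. 12; the averaging scheme itself is an assumption here.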
[0121] Fig. 10 shows that the introduced bridging-distributions based inference achieves
the earliest successful intent predictions. This is particularly visible in the first
75% of the pointing gesture where notable reductions in the pointing time can be achieved
and pointing facilitation regimes can be most effective. The performance gap between
the various predictors diminishes towards the end of the pointing task. An exception
is the BA model where the reliability of the heading angle as a measure of intent
declines as the pointing finger gets closer to the target. Fig. 11 shows that the
BD approach delivers the highest overall proportion of correct predictions across the pointing trajectories
(NN and BA performances are similar over the relatively large data set considered).
[0122] Fig. 12 illustrates that the proposed BD model makes correct predictions with significantly
higher confidence throughout the pointing task, compared to other methods. Overall,
Figs. 10, 11 and 12 demonstrate that the BD inference approach introduced predicts,
well in advance, the intent of an in-vehicle pointing gesture, e.g. only 20% into
the gesture in 60% of cases, which can reduce pointing time/effort by 80%.
[0123] It can be appreciated that embodiments of the present invention provide methods and
apparatus for determining an intended target of an object, where the object may be
a pointing object such as a stylus or finger, although the invention is not limited
in this respect. The intended target may be one or more intended targets from a plurality
of possible targets. The possible targets may be items in a GUI or physical controls.
Advantageously, embodiments of the present invention may reduce errors associated with
HMI, such as by detecting when a selected target was not the intended target, i.e.
the user accidentally selected a GUI item due to, for example, vehicle movement. Advantageously,
embodiments of the invention may also reduce a gesture time by selecting a target
before a user is able to physically touch the target. Embodiments of the invention
may be useful in vehicles such as land vehicles, as illustrated in Figure 8, which shows
a vehicle comprising a system according to an embodiment of the invention or a processing device
arranged to perform a method according to an embodiment of the invention, but also in
aircraft and watercraft. Embodiments of the invention may also be useful with computing
devices such as portable computing devices, e.g. handheld electronic devices such as
smartphones or tablet computing devices.
[0124] It will be appreciated that embodiments of the present invention can be realised
in the form of hardware, software or a combination of hardware and software. Any such
software may be stored in the form of volatile or non-volatile storage such as, for
example, a storage device like a ROM, whether erasable or rewritable or not, or in
the form of memory such as, for example, RAM, memory chips, device or integrated circuits
or on an optically or magnetically readable medium such as, for example, a CD, DVD,
magnetic disk or magnetic tape. It will be appreciated that the storage devices and
storage media are embodiments of machine-readable storage that are suitable for storing
a program or programs that, when executed, implement embodiments of the present invention.
Accordingly, embodiments provide a program comprising code for implementing a system
or method as claimed in any preceding claim and a machine readable storage storing
such a program. Still further, embodiments of the present invention may be conveyed
electronically via any medium such as a communication signal carried over a wired
or wireless connection and embodiments suitably encompass the same.
[0125] All of the features disclosed in this specification (including any accompanying claims,
abstract and drawings), and/or all of the steps of any method or process so disclosed,
may be combined in any combination, except combinations where at least some of such
features and/or steps are mutually exclusive.
[0126] The claims should not be construed to cover merely the foregoing embodiments, but
also any embodiments which fall within the scope of the claims.