[0001] The present invention relates to a system for embedding data in an audio signal and
to its subsequent recovery, which can be used for watermarking, data communications,
audience surveying etc. The invention has particular relevance to a system for hiding
data in an audio signal by adding echoes to the audio signal and to a system for recovering
the hidden data by detecting the added echoes.
[0002] US 5893067 discloses a technique for hiding data within an audio signal for transmission to
a remote receiver. The data is hidden in the audio signal by adding an artificial
echo to the audio signal and varying the amplitude and/or delay of the echo in accordance
with the data to be hidden.
[0003] A problem with the data hiding technique described in
US 5893067 is that in most situations, natural echoes can mask the artificial echoes making
it difficult for the receiver to be able to identify the artificial echoes and hence
recover the hidden data.
[0004] The present invention aims to provide an alternative data hiding technique. One embodiment
of the invention at least alleviates the above problem by applying echoes of opposite
polarity to represent each data value.
[0005] According to one aspect, the present invention provides a method of embedding a data
value in an audio signal, the method comprising: generating an echo of at least a
portion of the received audio signal; and embedding the data value in the audio signal
by combining the received audio signal with the generated echo; wherein the data value
is embedded in the audio by varying the polarity of the echo that is combined with
the audio signal in dependence upon the data value. The inventors have found that
using polarity modulation to embed the data in the audio signal can make the recovery
of the embedded data easier in the receiver, especially in the presence of natural
echoes caused by, for example, the acoustics of the room. The polarity modulation can
be achieved by varying the echo that is generated and/or by varying the way in which
the echo is combined with the audio signal.
[0006] In one embodiment, the generating step generates a first echo of at least a portion
of the received audio signal and a second echo of at least a portion of the received
audio signal, the first and second echoes having first and second polarities respectively,
which polarities vary in dependence upon the data value; and wherein the embedding
step embeds the data value in the audio signal by combining the received audio signal
with the generated first and second echoes.
[0007] Each of the echoes may be generated by repeating at least a part of said audio signal.
The first echo may be generated by repeating a first portion of the audio signal and
the second echo may be generated by repeating a second portion of said audio signal.
Alternatively, the first and second echoes may be generated by repeating substantially
the same first portion of the audio signal. Where the audio signal is received as
a stream of samples, the or each echo may be generated by passing the stream of audio
samples through a delay line.
[0008] In one embodiment, third and fourth echoes may be generated, the third echo having
the same polarity as said second echo and the fourth echo having the same polarity
as said first echo. In this case, the third and fourth echoes may be generated by
repeating substantially the same second portion of the audio signal which is different
to the first portion repeated by the first and second echoes. The second portion of
the audio signal may be adjacent to the first portion. The generating step may generate
the third and fourth echoes by inverting the polarity of a gain factor applied to
the echoes before being combined with the audio signal.
[0009] The first echo may be combined with the audio signal at a first delay relative to
the first portion of the audio signal; the second echo may be combined with the audio
signal at a second delay relative to said first echo; the third echo may be combined
with said audio signal at a third delay relative to said second portion of the audio
signal; and the fourth echo may be combined with the audio signal at a fourth delay
relative to the third echo. The first delay may be equal to said third delay and/or
the second delay may be equal to said fourth delay. In one embodiment, the delays
and the amplitudes of the echoes are independent of the data value.
[0010] Preferably the first and third delays are between 0.5 ms and 100 ms and more preferably
between 5 ms and 40 ms; and the second and fourth delays (relative to the
first and third echoes respectively) are between 0.125 ms and 3 ms and more preferably
between 0.25 ms and 1 ms, as these delays are similar to those of natural echoes and
so are less noticeable to users. In one embodiment, the or each echo has an amplitude
that is less than the amplitude of said audio signal. Preferably the or each echo
is faded in and out to reduce obtrusiveness of the echoes to a listener.
[0011] The first and second portions of the audio signal should be long enough for the receiver
to be able to detect the presence of the echoes but not so long as to overly reduce
the data rate that can be communicated. The inventors have found that echoes having
durations of between 20 ms and 500 ms provide a reasonable data rate whilst keeping
data transmission errors to a minimum when transmission occurs over an acoustic link.
If transmission is over an electrical link, then shorter echoes may be used.
[0012] The echoes may be combined with the audio signal by adding and/or subtracting the
echoes to/from the audio signal. The polarity of each echo can therefore be controlled
by controlling the way in which each echo is combined with the audio signal.
[0013] This aspect of the invention also provides a computer implementable instructions
product comprising computer implementable instructions for causing a programmable
computer device to carry out the method described above.
[0014] This aspect also provides an apparatus for embedding a data value in an audio signal,
the apparatus comprising: an echo generator operable to generate an echo of at least
a portion of the received audio signal; and a combiner operable to combine the received
audio signal with the generated echo to embed the data value in
the audio signal; wherein the echo generator and/or the combiner are arranged so that
the data value is embedded in the audio by varying the polarity of the echo that is
combined with the audio signal in dependence upon the data value.
[0015] According to another aspect, the present invention provides a method of recovering
a data value embedded in an audio signal, the method comprising: receiving an input
signal having the audio signal and an echo of at least part of the audio signal whose
polarity depends upon said data value; processing the received input signal to determine
the polarity of the echo; and recovering the data value from the determined polarity.
[0016] The input signal may comprise a first echo of at least a portion of the audio signal
and a second echo of at least a portion of the audio signal, the first and second
echoes having first and second polarities respectively, which polarities vary in dependence
upon the data value; and wherein the processing step processes the input signal to
combine the first and second echoes and to determine the polarity of the combined
echoes and wherein the recovering step recovers the data value from the determined
polarity of the combined echoes.
[0017] In one embodiment the processing step processes the input signal to determine a first
autocorrelation measure that depends upon the first echo and a second autocorrelation
measure that depends upon the second echo and combines the echoes by differencing
the first and second autocorrelation measures and determines the polarity of the combined
echoes by determining the polarity of the result of the differencing step.
[0018] The first echo may be of a first portion of the audio signal and the second echo
may be of a second portion of the audio signal. Alternatively the first and second
echoes may be repeats of substantially the same portion of the audio signal.
[0019] In one embodiment, the input signal comprises first, second, third and fourth echoes,
the first and fourth echoes having the same polarity and the second and third echoes
having the same polarity which is opposite to the polarity of the first and fourth
echoes, wherein the processing step processes the input signal to combine the first
to fourth echoes and to determine the polarity of the combined echoes and wherein
the recovering step recovers the data value from the determined polarity of the combined
echoes.
[0020] In this embodiment, the processing step may process the input signal to determine
a first autocorrelation measure that depends upon the first echo, a second autocorrelation
measure that depends upon the second echo, a third autocorrelation measure that depends
upon the third echo and a fourth autocorrelation measure that depends upon the fourth
echo and combines the echoes by differencing the autocorrelation measures and determines
the polarity of the combined echoes by determining the polarity of a result of the
differencing step.
[0021] The differencing step may perform a first difference of the first and third autocorrelation
measures, a second difference of the second and fourth autocorrelation measures, a
third difference of the result of said first difference and the result of the second
difference and wherein the polarity of the combined echoes may be determined from
the polarity of a result of the third difference.
[0022] The first and second echoes may be repeats of substantially the same first portion
of the audio signal and the third and fourth echoes may be repeats of substantially
the same second portion of the audio signal. Alternatively, the first and third echoes
may be repeats of substantially the same first portion of the audio signal and the
second and fourth echoes may be repeats of substantially the same second portion of
the audio signal. In one embodiment, the or each echo is faded in and out to reduce
obtrusiveness of the echoes to a listener. In this case, the polarity of the echo
may be determined when the amplitude of the echo is at or near a maximum.
[0023] The first echo may be delayed relative to said first portion of the audio signal
by a first delay; the second echo may be delayed relative to the first echo by a second
delay; the third echo may be delayed relative to the second portion of the audio signal
by a third delay; and the fourth echo may be delayed relative to the third echo by
a fourth delay. The first delay may be equal to the third delay and/or the second
delay may be equal to said fourth delay.
[0024] According to this aspect, a computer implementable instructions product is also provided
comprising computer implementable instructions for causing a programmable computer
device to carry out the above method.
[0025] This aspect also provides an apparatus for recovering a data value embedded in an
audio signal, the apparatus comprising: an input for receiving an input signal having
the audio signal and an echo of at least part of the audio signal whose polarity depends
upon said data value; a processor operable to process the input signal to determine
the polarity of the echo; and a data regenerator operable to recover the data value
from the determined polarity.
[0026] These and other aspects of the invention will become apparent to those skilled in
the art from the following detailed description of exemplary embodiments, which are
described with reference to the following drawings in which:
Figure 1 is a block diagram illustrating the main components of a transmitter and
receiver used in an exemplary embodiment;
Figure 2a is an impulse plot illustrating the echoes that are added to an audio signal
to encode a binary "one";
Figure 2b is an impulse plot illustrating the echoes that are added to an audio signal
to encode a binary "zero";
Figure 3a is an impulse plot illustrating the presence of artificial echoes for a
binary "one" after Manchester encoding and illustrating natural echoes;
Figure 3b is an impulse plot illustrating the presence of artificial echoes for a
binary "zero" after Manchester encoding and illustrating natural echoes;
Figure 4 is a block diagram illustrating in more detail the encoding performed in
the transmitter shown in Figure 1;
Figure 5 is a block diagram illustrating the main components of an echo generation
and shaping module forming part of the transmitter shown in Figure 1;
Figure 6a illustrates a shaping and modulation function that is applied to the echoes
prior to being combined with the audio signal when a binary "one" is to be transmitted;
Figure 6b illustrates a shaping and modulation function that is applied to the echoes
prior to being combined with the audio signal when a binary "zero" is to be transmitted;
Figure 6c illustrates the way in which the shaping and modulation function varies
when two successive binary "ones" are to be transmitted;
Figure 6d illustrates the shaping and modulation function that is applied when a binary
"zero" is transmitted after a binary "one";
Figure 7 illustrates the processing performed in the receiver shown in Figure 1 for
recovering the hidden data from the received audio signal;
Figure 8a is an autocorrelation plot for a typical audio signal without artificial
echoes;
Figure 8b is an autocorrelation plot for the audio signal with artificial echoes during
a first half of a bit symbol;
Figure 8c is an autocorrelation plot for the audio signal with artificial echoes during
the second half of the bit symbol;
Figure 8d is a plot obtained by subtracting the autocorrelation plot shown in Figure
8c from the autocorrelation plot shown in Figure 8b;
Figure 9 is a block diagram illustrating an alternative form of receiver used to receive
and recover the hidden data embedded in the audio signal;
Figure 10 is a plot illustrating the way in which an FEC error count varies during
a synchronisation process used to find the hidden data message within the input signal;
and
Figures 11a and 11b illustrate the processing performed respectively by an FEC encoder
and an FEC decoder in one embodiment.
Overview
[0027] Figure 1 is a block diagram illustrating a transmitter and receiver system according
to one embodiment in which a transmitter 1 transmits data hidden within an acoustic
signal 3 to a remote receiver 5. The transmitter 1 may form part of a television or
radio distribution network and the receiver may be a portable device such as a cellular
telephone handset that is capable of detecting the acoustic signal 3 output by the
transmitter 1.
The transmitter
[0028] As shown in Figure 1, the transmitter 1 includes a forward error correction (FEC)
encoder module 7, which receives and encodes the input data to be transmitted to the
remote receiver 5. The encoded message data output from the FEC encoder module 7
is then passed to an echo generation and shaping module 9, which also receives an
audio signal in which the encoded message data is to be hidden. The echo generation
and shaping module 9 then hides the message data into the audio by generating echoes
of the audio which depend upon the message data to be transmitted. The generated echoes
are then combined with the original audio signal in a combiner module 11 and the resulting
modified audio signal is then passed to a gain control module 13 for appropriate gain
control. The audio signal is then converted from a digital signal to an analogue signal
by the digital to analogue converter 15 and it is then amplified by a driver module
17 for driving a loudspeaker 19 which generates the acoustic signal 3 having the data
hidden therein.
[0029] As will be described in more detail below, in this embodiment, the polarity of the
echoes (as opposed to their lag and/or amplitude) is varied in order to encode the
data to be transmitted. The inventors have found that this polarity modulation can
be more robust in the presence of natural echoes and periodicities in the audio signal.
This is particularly the case when each data value is represented by two echoes of
the same magnitude but having different lags and opposite polarities. The polarities
of the echoes representing each message bit are reversed to distinguish between a
binary zero and a binary one. This is illustrated by the impulse plots shown
in Figure 2. In particular, Figure 2a is an impulse plot illustrating the component
signals that are present when a binary one is to be transmitted and Figure 2b is an
impulse plot illustrating the component signals present when a binary zero is to be
transmitted. As shown in Figure 2a, the component signals include an initial impulse
21 representing the original audio signal followed by two lower amplitude impulses
23-1 and 23-2 representing the two echoes of the original signal component 21 which
are added to the audio signal. As can be seen by comparing Figures 2a and 2b, when
a binary one is to be transmitted, a positive echo 23-1 is transmitted first followed
by a negative echo 23-2; and when transmitting a binary zero a negative echo 23-1
is transmitted first followed by a positive echo 23-2, although this could be reversed
if desired.
[0030] As shown in Figure 2, in this embodiment, the first echo is added with a lag of approximately
ten milliseconds and the second echo is added 0.25 milliseconds after the first echo.
This is the same regardless of whether a binary one or a binary zero is to be transmitted.
Additionally, as represented in Figure 2, in this embodiment, the echoes that are
added have lower amplitudes compared with the amplitude of the original audio signal.
In particular, in this embodiment, the amplitude of the echoes is approximately one
third that of the original audio signal.
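By way of illustration only, the echo patterns of Figures 2a and 2b can be written as a set of tap positions and gains. The following Python sketch is not part of the described embodiment: the function name `echo_taps`, its parameters and the 8 kHz default sample rate are assumptions chosen for the example, and each lag is simply rounded to the nearest whole sample.

```python
def echo_taps(bit, sample_rate=8000, first_lag_ms=10.0, gap_ms=0.25, gain=1 / 3):
    """Return {delay_in_samples: gain} for the direct signal and the two echoes.

    Hypothetical helper: a binary one gives a positive echo followed by a
    negative echo, and a binary zero gives the reverse (Figures 2a and 2b).
    """
    d1 = round(first_lag_ms * sample_rate / 1000)             # first echo lag
    d2 = round((first_lag_ms + gap_ms) * sample_rate / 1000)  # second echo lag
    sign = 1.0 if bit == 1 else -1.0
    return {0: 1.0, d1: sign * gain, d2: -sign * gain}


taps_one = echo_taps(1)   # keys 0, 80 and 82 at 8 kHz
taps_zero = echo_taps(0)  # same delays, opposite echo polarities
```

Only the signs of the two echo taps differ between the two data values; the delays and magnitudes are identical, which is the property stated above.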
The receiver
[0031] Figure 1 also illustrates the main components of the receiver 5. As shown, the receiver
includes a microphone 31 for detecting the acoustic signal 3 and for converting it
into a corresponding electrical signal which is then filtered and amplified by filter
and amplification circuitry 33. The output from the filter and amplification circuitry
33 is then digitised by an analogue to digital converter 35 and the digital samples
are then passed to an echo detector 37. The echo detector 37 then processes the digital
samples to identify the polarities of the echoes in the received signal. This information
is then passed through a data recovery module 39 which processes the echo information
to recover the encoded message data. This message data is then decoded by a decoder
41 to recover the original data that was input to the FEC encoder module 7 of the transmitter
1.
Manchester Encoding
[0032] As will be explained in more detail below, the echo detector 37 detects the echoes
from the received signal by calculating the auto-correlation of the received signal
at specified delays. However, natural echoes (e.g. room echoes) will also contribute
to the autocorrelation values thus calculated as will periodicities of the original
audio track. In order to distinguish the artificial echoes representing the encoded
data from these natural echoes, the message data is also Manchester encoded so that
a message data value of "1" is transmitted as a "1", followed by a "0" (or vice versa),
whilst a message data value of "0" is transmitted as a "0" followed by a "1". In this
embodiment, this Manchester encoding is performed by the echo generation and shaping
module 9. Therefore, when a message bit value of "1" is to be transmitted, for the
first half of the symbol, the first echo 23-1 is of positive polarity and the second
echo 23-2 is of negative polarity, whilst for the second half of the symbol, the first
echo 23-1 is of negative polarity and the second echo 23-2 is of positive polarity.
To transmit a message bit value of "0", all polarities are reversed, as summarised
in the table given below.
| data value | first half of symbol: polarity of first echo | first half of symbol: polarity of second echo | second half of symbol: polarity of first echo | second half of symbol: polarity of second echo |
|---|---|---|---|---|
| 0 | negative | positive | positive | negative |
| 1 | positive | negative | negative | positive |
[0033] The reason that the Manchester encoding can help to distinguish the artificial echoes
from the natural echoes is that the natural echoes will be stable over the two half
symbol periods. Therefore, by subtracting the autocorrelations in the second half
of the symbol from autocorrelations in the first half of the symbol (or vice versa),
the effect of the natural echoes and periodicities will cancel, whilst the autocorrelation
peaks caused by the artificial echoes will add constructively. Similarly, the reason
for using two echoes in each half symbol period is to distinguish the artificial echoes
from periodicities in the original track. Typically, the autocorrelation of the original
track will not change significantly between these two lags (i.e. between 10 ms and
10.25 ms). Therefore, by differencing the autocorrelations at the two lags, the effect
of the periodicities is reduced and the autocorrelation peaks caused by the two echoes
add constructively.
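The cancellation argument can be made explicit with a simplified model, which is offered here as an illustrative assumption rather than as a formula taken from the embodiment. Let $A_1(\tau)$ and $A_2(\tau)$ denote the autocorrelation at a lag of $\tau$ milliseconds measured over the first and second half symbols respectively, let $B(\tau)$ denote the background contribution of natural echoes and periodicities (assumed to be the same in both half symbols), let $c > 0$ denote the size of the autocorrelation peak produced by one artificial echo, and let $s = \pm 1$ denote the polarity of the first echo in the first half symbol. Then:

$$
\begin{aligned}
A_1(10) &\approx B(10) + s\,c, & A_1(10.25) &\approx B(10.25) - s\,c,\\
A_2(10) &\approx B(10) - s\,c, & A_2(10.25) &\approx B(10.25) + s\,c,
\end{aligned}
$$

$$
\bigl[A_1(10) - A_2(10)\bigr] - \bigl[A_1(10.25) - A_2(10.25)\bigr] \approx 4\,s\,c .
$$

Under these assumptions the background terms drop out of each half-symbol difference, the four echo contributions add, and the sign of the result gives $s$. Any residual background that does change between the two half symbols, but is nearly the same at the two closely spaced lags, is reduced by the second difference.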
[0034] Figures 3a and 3b are impulse plots showing the two half symbols and the artificial
echoes 23 that are added within each half symbol period to represent a binary "1"
and a binary "0" respectively. Figures 3a and 3b also illustrate natural echoes 25-1
and 25-2 which do not change from one half period to the next. Therefore, by subtracting
the echoes in one half of the symbol period from the corresponding echoes (i.e. those
with the same lag or delay) in the other half of the symbol period, the effect of
the natural echoes and periodicities will cancel, whilst the artificial echoes will
add constructively, thereby making it easier to detect the hidden data.
[0035] The above description provides an overview of the encoding and decoding techniques
used in the present embodiment. A more detailed description will now be given of the
main components of the transmitter 1 and receiver 5 to carry out the encoding and
decoding processes described above.
FEC encoder
[0036] Figure 4 is a block diagram illustrating the main components of the FEC encoder module
7 used in this embodiment. As shown, the first encoding module is a Reed-Solomon encoder
module 51 which uses a shortened (13, 6) block code to represent the input data. The
data output from the Reed-Solomon encoder 51 is then passed to a convolutional encoder
53 which performs convolutional encoding on the data. The data bits output from the
convolutional encoder 53 are then interleaved with each other by a data interleaving
module 55 to protect against errors occurring in bursts. Finally, a synchronisation
data adder module 57 adds a sequence of synchronisation bits that will help the receiver
5 lock on to the encoded data within the received acoustic signal 3. The output from
the synchronisation data adder module 57 represents the message data which is then
passed to the echo generation and shaping module 9 shown in Figure 1.
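The interleaving scheme used by the data interleaving module 55 is not specified above. Purely to illustrate why interleaving protects against errors occurring in bursts, the following Python sketch uses a simple row and column block interleaver; the function names and the 3 by 4 dimensions are assumptions made for the example and may differ from the scheme actually used.

```python
def interleave(bits, rows, cols):
    """Write `bits` row by row into a rows x cols grid and read it column by column."""
    if len(bits) != rows * cols:
        raise ValueError("block length must equal rows * cols")
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(bits, rows, cols):
    """Invert `interleave` for the same grid dimensions."""
    if len(bits) != rows * cols:
        raise ValueError("block length must equal rows * cols")
    out = [None] * (rows * cols)
    i = 0
    for c in range(cols):
        for r in range(rows):
            out[r * cols + c] = bits[i]
            i += 1
    return out


block = list(range(12))           # positions 0..11 stand in for encoded bits
sent = interleave(block, 3, 4)    # [0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11]
restored = deinterleave(sent, 3, 4)
```

In this example a burst that corrupts three consecutive transmitted positions affects original positions that are four apart, so after de-interleaving the errors are spread out rather than clustered.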
Echo generation and shaping
[0037] Figure 5 is a block diagram illustrating the main components of the echo generation
and shaping module 9 and the combiner module 11 shown in Figure 1. The input audio
signal is represented by the sequence of audio samples a(n) which are applied to a
10 millisecond delay unit 61 and to the adder 63 (corresponding to the combiner 11
shown in Figure 1). The 10 millisecond delay unit 61 delays the input sample a(n)
by 10 milliseconds which it then outputs to a 0.25 millisecond delay unit 65 and to
a subtractor 67. The 0.25 millisecond delay unit 65 delays the audio sample output
from the 10 millisecond delay unit 61 by a further 0.25 milliseconds which it then
outputs to the subtractor 67. The subtractor 67 subtracts the 10.25 millisecond delayed
sample from the 10 millisecond delayed sample outputting the result to a multiplier
69. The delay units and the subtractor operate each time a new audio sample a(n) arrives.
In this embodiment, the audio sample frequency is one of 8 kHz, 32 kHz, 44.1 kHz or
48 kHz.
[0038] Therefore, as those skilled in the art will appreciate, the 10 millisecond delay
unit 61, the 0.25 millisecond delay unit 65 and the subtractor 67 will generate the
two echoes 23-1 and 23-2 illustrated in Figure 2. At this stage, however, the echoes
that have been generated do not depend on the data to be transmitted. As will be explained
below, this dependency is achieved by multiplying the echoes in the multiplier 69
with a modulation function g(n) that is output by a lookup table 71 which is addressed
by lookup table address logic 73 in response to the current message data value. In
particular, the lookup table output g(n) changes the polarity of the echoes in dependence
upon the message data so that the echoes with the modulated polarities can then be
added back to the original audio signal by the adder 63 to generate the echo-modulated
audio output signal.
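The signal path of Figure 5 can be sketched in software as two cascaded delay lines, a subtractor, a multiplier and an adder. The Python below is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `embed_echoes`, the use of `collections.deque` as a delay line, and the 8 kHz default sample rate are choices made for the example, and the modulation values g(n) are simply passed in as a list.

```python
from collections import deque


def embed_echoes(audio, gains, sample_rate=8000, lag_ms=10.0, gap_ms=0.25):
    """Add a polarity-modulated echo pair to `audio`, one sample at a time.

    Assumes both delays round to at least one sample at the given rate.
    """
    d1 = round(lag_ms * sample_rate / 1000)   # first delay unit, in samples
    d2 = round(gap_ms * sample_rate / 1000)   # second delay unit, in samples
    line1 = deque([0.0] * d1, maxlen=d1)      # holds a(n - d1) .. a(n - 1)
    line2 = deque([0.0] * d2, maxlen=d2)      # holds the last d2 outputs of line1
    out = []
    for a, g in zip(audio, gains):
        delayed1 = line1[0]                   # a(n - d1)
        line1.append(a)                       # oldest sample drops off the left
        delayed2 = line2[0]                   # a(n - d1 - d2)
        line2.append(delayed1)
        echo_pair = delayed1 - delayed2       # subtractor: +first echo, -second echo
        out.append(a + g * echo_pair)         # multiplier, then adder
    return out


# A unit impulse with a constant gain of 1/3 reproduces the pattern of Figure 2a.
impulse = [1.0] + [0.0] * 99
response = embed_echoes(impulse, [1 / 3] * 100)
```

At 8 kHz the two delays are 80 and 2 samples. At 44.1 kHz the 0.25 millisecond delay corresponds to about 11.025 samples, so a sketch of this kind would round it to 11 samples.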
Lookup table output g(n)
[0039] The inventors have found that abrupt changes in the echoes that are added can make
the echoes more obtrusive to users in the vicinity of the loudspeaker 19. Therefore,
the lookup table output g(n) is gradually increased and decreased so that the echoes
are effectively faded in and out.
[0040] Additionally, in this embodiment, the lookup table output g(n) also performs the
above described Manchester encoding of the message data. The way in which this is
achieved will now be explained with reference to Figure 6. In particular, Figure 6a
is a plot illustrating the way in which the lookup table output g(n) varies over one
symbol period, when the bit value of the message data is a binary "1". In this embodiment,
the symbol period is 100 ms. As shown, during the first half of the symbol period,
the function g(n) increases from zero to a maximum value and then decreases back to
zero at the end of the first half of the symbol period. During the second half of
the symbol period, the function g(n) is negative and increases in magnitude to a maximum
negative value and then decreases back to zero. As can be seen from Figure 6a, in
this embodiment, the gradual increasing and decreasing of the lookup table output
g(n) is achieved by using a sinusoidal function. Therefore, during the first half
of the symbol, the combined echoes output from the subtractor 67 will be multiplied
by a positive value and so their polarity will not be changed when they are multiplied
by g(n) in the multiplier 69. On the other hand, during the second half of the symbol
period the lookup table output g(n) is negative and therefore, the polarities of the
echoes output from the subtractor 67 will be reversed when the echoes are multiplied
by g(n) in the multiplier 69.
[0041] As mentioned above, the artificial echoes 23 that are generated and added to the
audio signal have an amplitude which is approximately a third that of the audio signal.
In this embodiment, the amplitude of the echoes is controlled by the output of the
lookup table g(n). As shown in Figure 6a, the peak amplitude of the lookup table output
g(n) is a third, which means that the maximum amplitude of the echoes which are added
to the audio signal will be a third of the amplitude of the original audio signal.
[0042] As shown in Figure 6b, when the message data is a binary value "0" the lookup table
output g(n) is inverted compared with when the message data has a binary value of
"1". Therefore, during the first half symbol period, the polarity of the echoes output
from the subtractor 67 will be reversed when they are multiplied by g(n) in the multiplier
69 and during the second half of the symbol period the polarities of the echoes output
by the subtractor 67 will not be inverted when they are multiplied by g(n) in the
multiplier 69.
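The behaviour of the lookup table output g(n) over a single symbol, as described for Figures 6a and 6b, can be sketched as one full cycle of a sine wave scaled to a peak of one third. The following Python is illustrative only: the function name `symbol_gain`, the 8 kHz default sample rate and the choice to compute the values on the fly (rather than read them from a stored table) are assumptions made for the example.

```python
import math


def symbol_gain(bit, sample_rate=8000, symbol_ms=100.0, peak=1 / 3):
    """Return g(n) for one symbol: a positive lobe then a negative lobe for "1".

    For a "0" the whole function is inverted.
    """
    n_sym = round(symbol_ms * sample_rate / 1000)   # samples per symbol
    sign = 1.0 if bit == 1 else -1.0
    return [sign * peak * math.sin(2 * math.pi * n / n_sym) for n in range(n_sym)]


g_one = symbol_gain(1)    # 800 values at 8 kHz, peaking near +1/3 at n = 200
g_zero = symbol_gain(0)   # the same shape with every value negated
```

The sign change at the middle of the symbol performs the Manchester encoding, and the smooth rise and fall of each lobe provides the fading in and out of the echoes.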
[0043] Figure 6c illustrates the lookup table output g(n) over two symbol periods when the
message data to be transmitted is a binary "1" followed by another binary "1". As
shown in Figure 6c, in this case, the lookup table output g(n) is a simple repeat
of the output illustrated in Figure 6a. Similarly, if successive values of the message
data are binary "0's" then the lookup table output g(n) over the two symbol periods
will be the inverse of that shown in Figure 6c.
[0044] However, if the message data transitions from a binary "1" to a binary "0", then
instead of using a lookup table output function obtained by concatenating the functions
shown in Figure 6a and Figure 6b, the function shown in Figure 6d is used.
As can be seen in Figure 6d, when the lookup table output g(n) reaches its peak negative
value in the first symbol period, it remains at that value until the peak would have
occurred in the second symbol period before decreasing in magnitude back to zero.
Similarly, when successive bits of the message data transition from a binary "0" to
a binary "1", the lookup table output g(n) over the two symbol periods will be the
inverse of that shown in Figure 6d. The inventors have found that not returning to
the zero level in this way reduces the obtrusiveness of the echo modulation scheme
that is used. This is because the human ear is more sensitive to changing echoes than
to constant echoes.
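The rule for bit transitions can be sketched as follows: whenever two adjacent half symbols share the same polarity, the function is held at its peak from the first peak to the second instead of returning to zero. The Python below is a hedged illustration; the function name `modulation_gain`, the half-symbol length of 400 samples (50 ms at an assumed 8 kHz) and the on-the-fly calculation in place of a stored lookup table are all assumptions made for the example.

```python
import math


def modulation_gain(bits, half_len=400, peak=1 / 3):
    """Return g(n) for a bit sequence, holding the peak across equal-polarity halves."""
    signs = []
    for b in bits:
        signs += [1, -1] if b == 1 else [-1, 1]    # Manchester half-symbol polarities
    g = []
    for h, s in enumerate(signs):
        same_prev = h > 0 and signs[h - 1] == s
        same_next = h + 1 < len(signs) and signs[h + 1] == s
        for n in range(half_len):
            rising = n < half_len // 2             # before the peak of this lobe
            hold = (rising and same_prev) or (not rising and same_next)
            shape = 1.0 if hold else math.sin(math.pi * n / half_len)
            g.append(s * peak * shape)
    return g


g_10 = modulation_gain([1, 0])   # stays at -1/3 from sample 600 to sample 1000
g_11 = modulation_gain([1, 1])   # no hold: returns to zero at every half-symbol boundary
```

For the sequence "1" then "0" the second half of the first symbol and the first half of the second symbol are both negative, so the sketch keeps g(n) at its negative peak between the two peak positions, in the manner described for Figure 6d.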
[0045] As those skilled in the art will appreciate, the lookup table address logic 73 is
responsible for analysing the successive bits of the message data and then looking
up the appropriate part of the lookup table 71 so that the appropriate output function
g(n) is applied to the multiplier 69.
Echo Detector
[0046] Figure 7 is a part schematic and part block diagram illustrating the processing performed
by the echo detector 37. In particular, Figure 7 illustrates 100 milliseconds of an
input signal 61 at the input of the echo detector 37. As those skilled in the art
will appreciate, the input signal 61 is illustrated schematically as a continuous
signal for ease of understanding but it will be a sampled and digitised waveform.
[0047] As illustrated by window i and window j, the echo detector 37 includes two sliding
windows 63-1 and 63-2 which extract adjacent segments of the input audio signal 61-1
and 61-2, each of length 50 milliseconds. Therefore, the two windows 63 extract portions
of the input acoustic signal 61 which correspond to the above-described half symbol
periods. As shown in Figure 7, the extracted portion 61-1 of the input acoustic signal
is input to a first autocorrelation unit 65-1 and the extracted portion 61-2 of the
input audio signal is input to a second autocorrelation unit 65-2. Both autocorrelation
units 65 operate to determine the autocorrelation of the corresponding portion 61-1
or 61-2 of the input acoustic signal at 10 millisecond and 10.25 millisecond lags.
The determined autocorrelation values at lag 10.25 milliseconds from autocorrelation units 65-1
and 65-2 are then input to a subtractor 67, that subtracts the autocorrelation value
obtained from window j from the autocorrelation value obtained from window i (or vice
versa). The result of this subtraction is then supplied to another subtractor 69.
Similarly, the autocorrelation value at lag 10 milliseconds from window i and the
autocorrelation value at lag 10 milliseconds from window j are output from the autocorrelation
units 65 to the subtractor 71, that subtracts the autocorrelation value obtained from
window j from the autocorrelation value obtained from window i (or vice versa) and
feeds the result to the subtractor 69. The subtractor 69 then subtracts the output
from subtractor 67 from the output from subtractor 71 (or vice versa). Therefore,
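the output from the subtractor 69 is represented by the following equation:
$$\left[A_i(10) - A_j(10)\right] - \left[A_i(10.25) - A_j(10.25)\right]$$
where A_i(τ) and A_j(τ) denote the autocorrelation values at a lag of τ milliseconds obtained from window i and window j respectively.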
[0048] As mentioned above, subtracting the autocorrelation values of one half symbol period
from the corresponding autocorrelation values of the other half symbol period can
reduce the effect of natural echoes in the input acoustic signal 61. This is because
natural echoes are unlikely to change from one half symbol period to the next and
so their effect will be constant in the autocorrelations that are calculated. Consequently,
performing this subtraction will remove this common effect. Likewise, subtracting
the autocorrelation values obtained at the two lags within each half symbol period will
reduce the effect of periodicities in the original audio signal. This is because in the 0.25 ms delay
between the first echo and the second echo in the half symbol period, the effect of
the periodicities on the autocorrelations will be approximately constant and so this
subtraction will remove this common effect. This will now be described in more detail
with reference to Figure 8.
[0049] Figure 8a shows an autocorrelation plot 81 obtained from a typical audio signal without
any artificial echoes. As shown, the autocorrelation plot 81 has a peak at zero lag.
However, because of periodicities in the audio signal and because of natural echoes,
the autocorrelation plot 81 does not tail off towards zero until about 15 milliseconds
after the initial peak and exhibits local peaks and troughs in between. Peak 82 illustrates
such a local peak that may occur as a result of a natural echo being added to the
audio signal.
[0050] Figure 8b illustrates an autocorrelation plot 83 for the same audio signal after
a positive echo has been added at a lag of 10 milliseconds and a negative echo has
been added at a lag of 12 milliseconds (rather than at 10.25 ms so that the two echoes
can be seen more clearly). As shown in Figure 8b, as a result of the artificial echoes,
the autocorrelation plot 83 includes a peak 85 at 10 milliseconds and a peak 87 at
12 milliseconds. However, the peak 85 is masked somewhat by the earlier peak 82 caused
by a natural echo.
[0051] Figure 8c illustrates the autocorrelation plot 89 for the audio signal after the
echoes have been added in the second half of the symbol period. As shown, the autocorrelation
plot 89 includes a negative peak 91 at 10 milliseconds and a positive peak 93 at 12
milliseconds.
[0052] Finally, Figure 8d illustrates the autocorrelation plot that is obtained by subtracting
the autocorrelation plot shown in Figure 8c from the autocorrelation plot shown in
Figure 8b. As can be seen, the common peaks in the autocorrelation plots shown in
Figures 8b and 8c have been removed, whilst the complementary peaks 85 and 91; and
87 and 93 have added together to create the combined peaks 95 and 97 respectively.
As those skilled in the art will appreciate, it is therefore much easier to detect
the peaks 95 and 97 because their values are much greater than the autocorrelation
values at other lags. This effect is further enhanced by subtracting the autocorrelation
value at 12 milliseconds from the autocorrelation value at 10 milliseconds. This will
effectively add the two peaks 95 and 97 together to provide an even larger peak, which
can then be detected by suitable thresholding. The corresponding data value can then
be determined from the polarity of the combined peak.
[0053] As those skilled in the art will appreciate, in this embodiment, the echo detector
37 does not calculate the autocorrelation of the input signal over all lags. It only
calculates the autocorrelation values at the lags where the artificial echoes have
been added. The plots shown in Figure 8 show the autocorrelation values over lags
from 0 to 15 milliseconds. These plots therefore help to illustrate the effect of
natural echoes and periodicities in the audio signal which can mask the artificial
echoes that are added to encode the data.
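As a rough illustration of the detector arithmetic described above, the following Python
sketch evaluates the autocorrelation only at the two echo lags for two adjacent windows and
forms the difference of differences performed by the subtractors 67, 71 and 69. The function
names, the use of NumPy and the unnormalised dot-product form of the autocorrelation are
assumptions made for illustration only; the embodiment does not prescribe them.

```python
import numpy as np


def autocorr_at_lag(segment, lag):
    """Unnormalised autocorrelation of one window at a single positive lag (in samples)."""
    segment = np.asarray(segment, dtype=float)
    return float(np.dot(segment[lag:], segment[:-lag]))


def detector_output(window_i, window_j, lag_a, lag_b):
    """Difference of differences formed by the subtractors 71, 67 and 69.

    lag_a and lag_b are the two echo lags in samples (10 ms and 10.25 ms in
    the embodiment). The sign convention is one of the two permitted by the
    "or vice versa" wording and is an assumption of this sketch.
    """
    diff_a = autocorr_at_lag(window_i, lag_a) - autocorr_at_lag(window_j, lag_a)  # subtractor 71
    diff_b = autocorr_at_lag(window_i, lag_b) - autocorr_at_lag(window_j, lag_b)  # subtractor 67
    return diff_a - diff_b  # subtractor 69


def lag_in_samples(lag_us, sample_rate_hz):
    """Convert a lag given in microseconds to a whole number of samples."""
    return lag_us * sample_rate_hz // 1_000_000
```

With an assumed sample rate of 48 kHz, the 10 ms and 10.25 ms lags correspond to 480 and 492
samples and each 50 ms window holds 2400 samples. The polarity of the value returned by
detector_output then indicates the bit value, as described for the combined peak above.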
Synchronisation
[0054] In this embodiment, the receiver 5 knows the duration of each half symbol period.
This defines the length of the windows 63-1 and 63-2 used in the echo detector 37.
However, the echo detector 37 initially will not be synchronised with the transmitted
data. In other words, the echo detector 37 does not know where each symbol period
begins and ends or where the start of the message is located. Therefore, in this embodiment,
the echo detector 37 performs the above analysis as each new sample is received from
the analogue to digital converter 35. The output from the subtractor 69 is then
analysed by the data recovery module 39 to determine the most likely symbol boundaries.
The data recovery module 39 then determines the location of the start of the message
by finding the synchronisation bits that were added by the synchronisation data adder
57. At this point, the data recovery module 39 can start to recover the whole message
from the polarity of the autocorrelation values output from the subtractor 69.
[0055] Once synchronisation has been achieved, the echo detector 37 will typically determine
the autocorrelation measurements in the middle of each half symbol period, when the
echo is expected to be at its peak amplitude and the data recovery module 39 will
determine the bit value from the polarity of the output from the subtractor 69. The
echo detector 37 may also take measurements just before and just after the middle
of each half symbol period, to allow the data recovery module 39 to track the synchronisation.
[0056] The message data recovered by the data recovery module 39 is then input to the FEC
decoding module 41 where the message data is decoded (using the inverse processing
of the FEC encoder 7) to obtain the original input data that was input to the encoder
7 of the transmitter 1.
Modifications and alternatives
[0057] In the above embodiments, the data was hidden within an audio signal by employing
a number of echoes whose polarity varied with the data value to be transmitted. These
echoes were added to the original audio signal after appropriate delays. As those
skilled in the art will appreciate, the echoes may be added before the original audio
signal (pre-echoes), before and after the original audio signal or only after the
original audio signal.
[0058] In the above embodiment, synchronisation bits were added to the data that was transmitted
so that the decoder can identify the boundaries of each symbol period and the start
and end of each message. The use of such synchronisation bits significantly increases
the overall message length that has to be transmitted (in some cases by as much as
25%). Additionally, as the decoding of each bit is subject to noise, the matching
is not perfect which can reduce the chances of a successful synchronisation. The inventors
have realised, however, that the synchronisation bits are not required. In particular,
the inventors have realised that the FEC decoding module 41 will have higher error
rates when the echo detector 37 is not properly synchronised with the incoming data
compared with its error rate when the echo detector is synchronised with the incoming
data. Therefore, in the embodiment illustrated in Figure 9, the error output generated
by the FEC decoding module 41 is used to control the synchronisation of the receiver
to the incoming data.
[0059] More specifically, in this embodiment, the echo detector 37 receives a block of samples
corresponding to one or more symbol(s) and determines the optimum time within that
block of samples to detect the echoes within the symbols. Multiple symbols may be
required when Manchester encoding is used as a Manchester encoded "one" looks the
same as a Manchester encoded "zero" with a time shift. Therefore, it may be necessary
to consider a number of symbols to allow the symbol boundaries to be identified. The
actual determination of the optimum time within the block of samples to detect the
echoes may be determined by passing the block of samples through a matched filter
(loaded with the expected signal pattern for one symbol period) and the time within
the symbol when the absolute output (averaged over a number of successive symbols)
is at a maximum is deemed to be the best time to sample the symbols. For example,
if there are N samples per symbol, and the block of samples has M symbols, then the
following values are calculated:

$$\bar{x}(k) = \frac{1}{M}\sum_{m=0}^{M-1} x(k + mN), \qquad k = 0, 1, \ldots, N-1$$

where x(i) is the absolute output of the matched filter for sample i. The largest average
value thus determined identifies the best time to detect the echoes within the incoming
signal during each symbol.
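The averaging described above can be sketched in Python as follows. This is a minimal
illustration under stated assumptions: the function name, the zero-based sample offsets and
the use of NumPy are not taken from the embodiment, and the input is assumed to already hold
the absolute matched filter outputs x(i) for a whole number of symbols.

```python
import numpy as np


def best_sample_offset(x, samples_per_symbol):
    """Return the within-symbol offset whose matched filter output, averaged
    over all complete symbols in the block, is largest.

    x holds the absolute matched filter outputs, one value per sample.
    """
    x = np.asarray(x, dtype=float)
    num_symbols = len(x) // samples_per_symbol
    if num_symbols == 0:
        raise ValueError("block is shorter than one symbol")
    # One row per symbol, one column per sample offset within the symbol.
    folded = x[: num_symbols * samples_per_symbol].reshape(num_symbols, samples_per_symbol)
    averages = folded.mean(axis=0)  # average over the M symbols for each offset
    return int(np.argmax(averages)), averages
```

Averaging across symbols before taking the maximum means that a single noisy symbol is less
likely to pull the chosen sampling time away from the true echo peak.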
[0060] The echo detector 37 then uses the determined optimum time to detect echoes in that
symbol and in the previous N-1 symbols of the input signal (where N is the number
of symbols in the transmitted message). The data recovery module 39 then determines,
from the detected echoes, bit value(s) for each symbol and outputs the string of bits
corresponding to the possible message to the FEC decoding module 41. The FEC decoding
module 41 then performs the inverse processing of the FEC encoder 7 to regenerate
a candidate input data codeword, which is stored in the buffer 93. The FEC decoding
module 41 also outputs an error count indicating how many errors are identified in
the candidate codeword, which it passes to a controller 91. In response, the controller
91 compares the error count with a threshold value and if it is greater than the threshold,
then the controller 91 flushes the candidate codeword from the buffer 93. The above
process is then repeated for the next received symbol in the input signal, until the
controller 91 determines that the error count is below the threshold. When it is,
the controller 91 instructs the FEC decoding module 41 to accept the candidate codeword,
which it then outputs for further use in the receiver 5. In effect, therefore, the
echo detector 37, the data recovery module 39 and the FEC decoding module 41 all operate
on a window of the input signal corresponding to the length of the transmitted message,
which window is slid over the input signal until a point is found where the FEC error
count is below a defined threshold - indicating the identification of the full message
within the input signal.
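The sliding window search described above may be sketched as follows. This is an illustrative
outline only: find_message and repetition_decode are hypothetical names, one bit per symbol is
assumed, and the toy repetition code merely stands in for the FEC decoding module 41 so that
the example is self-contained. The embodiment does not specify whether an error count equal
to the threshold is accepted; this sketch rejects it.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Decoder = Callable[[Sequence[int]], Tuple[List[int], int]]


def find_message(bits: Sequence[int], message_len: int,
                 decode: Decoder, threshold: int) -> Optional[Tuple[int, List[int]]]:
    """Slide a message-length window over the bits until the error count
    reported by decode falls below threshold.

    Returns (start_index, candidate_codeword), or None if no window qualifies.
    """
    for start in range(len(bits) - message_len + 1):
        candidate, error_count = decode(bits[start:start + message_len])
        if error_count < threshold:
            return start, candidate  # candidate accepted
        # Otherwise the candidate is discarded and the window moves on by one symbol.
    return None


def repetition_decode(window: Sequence[int]) -> Tuple[List[int], int]:
    """Toy stand-in for an FEC decoder: 3x repetition code decoded by majority
    vote, reporting how many received bits disagree with the vote."""
    decoded, errors = [], 0
    for k in range(0, len(window) - len(window) % 3, 3):
        triple = window[k:k + 3]
        bit = 1 if sum(triple) >= 2 else 0
        decoded.append(bit)
        errors += sum(1 for b in triple if b != bit)
    return decoded, errors
```

For example, find_message(stream, 12, repetition_decode, 1) returns the first window position
at which the toy decoder reports no errors. A practical decoder would report its own error
metric in place of the majority vote disagreement count used here.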
[0061] Figure 10 is a plot illustrating the way in which the FEC decoding module's error
count 99 is expected to change as the window 101 is slid over an input signal 103
containing a data message 105, with the minimum appearing at symbol S_N, when the
window 101 is aligned with the data message 105 in the input signal 103.
The threshold (Th) level is then set to reduce the possibility that false minimums
in the FEC error output count are considered as possible codewords, so that (in the
ideal situation) only when the receiver 5 is properly synchronised (aligned) to the
message data, will the FEC decoding module's error count reduce below the threshold
in the manner illustrated in Figure 10. Ideally, in this embodiment, the FEC encoding/decoding
that is used is designed to keep the error rate of the FEC decoding module 41 high
except when the window 101 is aligned with the message data 105 in the input signal
103. The inventors have found that this simple thresholding technique is sufficient
to identify the location of the message data in the input signal 103. However, if
more accurate detection is required, then further consideration can be made, varying
the possible positions of the start and end of the message and looking for the positions
that give the minimum FEC error count.
[0062] The above technique is useful for finding a single message in the input signal. Clearly,
if a sequence of such data messages is transmitted, then the synchronisation timing
determined for the first data message may be used to identify the synchronisation
timing for the next data message.
[0063] One problem identified by the inventors with the synchronisation approach discussed
above is that the FEC encoder 7 often uses cyclic codewords (for example when using
Reed Solomon block encoding) which means that a one bit shift in the codeword can
also be a valid codeword. This is problematic because it can result in false detections
of a codeword (a so-called false positive) in the input signal 103. This problem can
be overcome by reordering the bits of the codeword in the FEC encoder 7 in some deterministic
manner (for example in a pseudo random manner), and using the inverse reordering in
the FEC decoder 41. The processing that may be performed by the FEC encoder 7 and
by the FEC decoder 41 in such an embodiment is illustrated in Figures 11a and 11b
respectively. As shown, the FEC encoder 7 performs a cyclic encoding of the data
(in this case Reed Solomon encoding 111), followed by a pseudo random reordering 113
of the data. The reordered data is then convolutionally encoded 115 and then interleaved
117 as before. Similarly, the FEC decoding module 41 initially de-interleaves 121
the data and performs convolutional decoding 123. The FEC decoding module 41 then
reverses 123 the pseudo random data reordering performed by the FEC encoder 7 and
then performs the Reed Solomon decoding 125. As those skilled in the art will appreciate,
by performing this reordering of the data in this way, if there is a bit shift in
the message data output by the data recovery module 39, then it is far less likely
to result in a valid codeword and so the FEC error rate output is unlikely to trigger
the false identification of a data message.
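The deterministic reordering and its inverse may be sketched as below. This is an assumed
illustration rather than the reordering actually used: the function names are hypothetical,
and a permutation drawn from a seeded pseudo random generator shared by the encoder and the
decoder is only one way of obtaining a deterministic pseudo random order.

```python
import random


def make_permutation(length, seed):
    """Build a fixed pseudo random permutation that both encoder and decoder
    can regenerate from the same seed."""
    perm = list(range(length))
    random.Random(seed).shuffle(perm)
    return perm


def reorder(bits, perm):
    """Encoder side: output position k takes the bit at input position perm[k]."""
    return [bits[src] for src in perm]


def restore(bits, perm):
    """Decoder side: undo reorder() so that the cyclic decoder sees the
    codeword in its original bit order."""
    out = [0] * len(perm)
    for dst, src in enumerate(perm):
        out[src] = bits[dst]
    return out
```

Because the permutation is applied after the cyclic encoding, a shift of the received bit
stream is scattered across the codeword by restore() instead of appearing as a cyclic shift,
which is the property relied on above to keep the error count high when misaligned.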
[0064] In the above embodiments, each data value was represented by four echoes - two echoes
in each of two half symbol periods. As those skilled in the art will appreciate, each
data value may be represented by any number of echoes in any number of sub-symbol
periods. For example, instead of having two echoes within each half symbol period,
each data value may be represented by a single echo in each half symbol period. In
this case, the echoes in each half symbol period would preferably be of opposite polarity
so that the same differencing technique can be used to reduce the effects of natural
echoes. Indeed, the inventors have found that in some cases using two echoes of opposite
polarity in each half symbol period can result in some frequency components within
the original audio signal adding constructively with the echoes and some frequency
components within the original audio signal adding destructively with the echoes.
If a single artificial echo is added, then such distortions are less evident making
the hidden data less noticeable to users in the acoustic sound that is heard.
[0065] As those skilled in the art will appreciate, representing each data value by one
or more echoes in different sub-symbol periods, means that the echoes in each sub-symbol
period will be a repetition of a different portion of the audio signal. If there is
only one symbol period, then each data value will be represented by echoes of the
same (or substantially the same) portion of the audio signal.
[0066] In the above embodiments, each data value was represented by a positive and a negative
echo in a first half symbol period and by a positive and a negative echo in the second
half symbol period. The positive and negative echoes in the first half symbol period
allowed the receiver to reduce the effects of periodicities in the original audio
signal which affect the autocorrelation measurements. The use of complementary echoes
in adjacent half symbol periods allows the receiver to reduce the effect of natural
echoes within the received audio signal, which might otherwise mask the artificial
echoes added to represent the data. As those skilled in the art will appreciate, in
other embodiments, neither or only one of these techniques may be used.
[0067] In the above embodiment, each data value was represented by echoes within two adjacent
half symbol periods. As those skilled in the art will appreciate, these two half symbol
periods do not have to be immediately adjacent to each other and a gap may be provided
between the two periods if required.
[0068] In the above embodiment, the echoes in each half symbol period were of exactly the
same portion of the audio signal. As those skilled in the art will appreciate, this
is not essential. The echoes in each half symbol period may be of slightly different
portions of the audio signal. For example, one echo may miss out some of the audio
samples of the audio signal. Alternatively, the audio signal may include different
channels (for example left and right channels for a stereo signal) and one echo may
be formed from a repetition of the left channel and the other may be formed from a
repetition of the right channel. With modern multi channel surround sound audio the
repetitions can be of any of these channels.
[0069] In the above embodiment, the echoes generated within the transmitter were added to
the original audio signal. As those skilled in the art will appreciate, the generated
echoes may be combined with the original audio signal in other ways. For example,
the echoes may be subtracted from the audio signal. Similarly, instead of inverting
the echoes to be added to the audio (by controlling the polarity of the function g(n)),
the same result can be achieved by changing the way in which the echoes are combined
with the audio signal. For example, one echo may be added to the original audio signal
whilst the next echo may be subtracted from the audio signal.
[0070] In the above embodiment, the lookup table stored values for g(n) corresponding to
one or two bits of the message data (as illustrated in Figure 6). As those skilled
in the art will appreciate, this is not essential. For example, the lookup table could
simply store a function which increased in value and then decreased in value. Additional
circuitry could then be provided to convert the polarity of this output as appropriate
for the two half symbol periods. In this way, the function stored in the lookup table
would only control the fading in and out of the echo and the additional circuitry
would control the polarity of the echo as required.
[0071] In the above embodiment, the Manchester encoding was performed by the echo generation
and shaping module. As those skilled in the art will appreciate, this Manchester encoding,
if performed, may be performed within the FEC encoding module.
[0072] As those skilled in the art will appreciate, the techniques described above for hiding
data within the audio may be performed in advance of the transmission of the acoustic signal
or they may be performed in real time. Even in the case where the data is to be embedded
within an audio signal in real time, some of the processing can be done in advance.
For example, the FEC encoding may be performed on the data in advance so that only
the echo generation and echo shaping is performed in real time.
[0073] In the above embodiments, specific examples have been given of sample rates for the
audio signal and symbol rates for the data that is hidden within the audio signal.
As those skilled in the art will appreciate, these rates are not intended to be limiting
and they may be varied as required. However, in order to keep the obtrusiveness of
the added echoes to a minimum, the data rate of the encoded data is preferably kept
between one and twenty symbols per second. This corresponds to a symbol period of
between 50ms and 1 second. In some embodiments, a long symbol period is beneficial
because the added echoes will span across spoken words within the audio, making it
easier to hide the data echoes within the audio. A larger symbol period also reduces
audibility of the echoes. This is because humans are more sensitive to changing echoes
than they are to static or fixed echoes. Therefore, by having a longer symbol period,
the rate of change of the echoes is lower making the presence of the echoes less noticeable
to a user.
[0074] In the above embodiment, the data rate of the data added to the audio signal in the
transmitter was constant and was known by the receiver. This knowledge reduces the
complexity of the receiver circuitry for locking on to the data within the received
signal. However, it is not essential to the invention and more complex circuitry may
be provided in the receiver to allow the receiver to try different data rates until
the actual data rate is determined. Similarly, the receiver may use other techniques
to synchronise itself with the transmitted data so that it knows where the symbol
boundaries are in advance of receiving the data.
[0075] In the above embodiment, FEC encoding techniques were used to allow the receiver
to be able to correct errors in the received data. As those skilled in the art will
appreciate, such encoding techniques are not essential to the invention. However,
they are preferred, as they help to correct errors that occur in the transmission
process over the acoustic link.
[0076] In the above embodiments, the peak amplitudes of the echoes were all the same and
were independent of the data value being transmitted. As those skilled in the art
will appreciate, the peak amplitudes of the echoes may also be varied with data to
be transmitted if desired.
[0077] In the above embodiment, the echoes in each half symbol period were at the same delays
relative to the original audio signal. As those skilled in the art will appreciate,
this is not essential. There may be some variation in the actual delay values used
within each half symbol period.
[0078] In the above embodiment, the second echo within each half symbol period was generated
by delaying the first echo by a further delay value. In an alternative embodiment,
each echo within each sub-symbol period may be independently generated from the original
audio signal using an appropriate delay line.
[0079] As those skilled in the art will appreciate, various uses can be made of the above
communication system. For example, the encoded data may be used as a watermark to
protect the original audio signal. Alternatively, the embedded data may be used to
control the receiver so that it can respond in synchronism with the audio signal.
In particular, the decoder can be programmed to perform some action a defined time
after receiving the codeword. The time delay may be programmed into the decoder by
any means and may even be defined by data in the received codewords. When used to
perform such synchronisation, shorter symbol periods are preferred as shorter symbol
periods allow for better temporal resolution and hence more accurate synchronisation.
The data may be used for interactive gaming applications, audience surveying, e-commerce
systems, toys and the like. The reader is referred to the Applicant's earlier International
application
WO02/45273 which describes a number of uses for this type of data hiding system.
[0080] In the above embodiment, the receiver performed autocorrelation measurements on the
input audio signal in order to identify the locations of the echoes. As those skilled
in the art will appreciate, other techniques can be used to identify the echoes. Some
of these other techniques are described in the Applicant's earlier PCT application
PCT/GB2008/001820 and in
US 5893067, the contents of which are incorporated herein by reference. Typically, although
not necessarily, the techniques involve some form of autocorrelation of the original
audio signal or of parameters obtained from the audio signal (eg LPC parameters, cepstrum
parameters etc). As an alternative, a best fit approach could be used in which an
expected audio signal (with different echo polarities) is fitted to the actual signal
until a match is found and the polarity of the echoes thus determined.
[0081] In the embodiment described above, a single transmitter was provided together with
a receiver. As those skilled in the art will appreciate, multiple transmitters and/or
multiple receivers may be provided. Further, the components of the transmitter may
be distributed among a number of different entities. For example, the encoding and
data hiding part of the transmitter may be provided within a head end of a television
distribution system or a user's set top box and the loudspeaker 19 may be a speaker
of the user's television set.
[0082] In the above embodiments, the echoes were directly derived from the original audio
signal. In alternative embodiments, the echo may not include all frequency components
of the audio signal. For example, one or more of the echoes may be generated from
a portion of the audio signal after it has been filtered to remove certain frequencies.
This may be beneficial where it is found, for example, that there is additional noise
in the low frequency part of the echoes but not in the higher frequency part. In this
case, the received signals would also be filtered to remove the lower frequency components
(for example frequencies below about 500Hz) so that only the higher frequency components
(those above the lower frequency components) of the audio signal and the echoes would
be present in the signals being analysed. Alternatively, in this case, the received
signal may be passed through a filter that simply reduces the level of the lower frequency
components in the received signal compared with the higher frequency components. This
will have the effect of reducing the relevance of the noisy low frequency part of
the received signal in the subsequent decoding process. Similarly, if it turns out
that the added echoes introduce a noticeable distortion in the higher frequencies
of the composite audio signal, then the echoes (or the signals from which they are
derived) may be low pass filtered to remove the higher frequencies.
[0083] The division of the audio signal into separate frequency bands can also be used to
carry data on multiple channels. For example, if the frequency band is divided into
a high frequency part and a low frequency part, then one channel may be provided by
adding echoes to the high frequency part and another channel may be provided by adding
different echoes to the low frequency part. The use of multiple channels in this way
allows frequency or temporal diversity if the data carried in the two channels is
the same; or allows for an increased data transfer rate if each channel carries different
data. Multiple channels can also be provided where the audio signal also contains
multiple channels (used to drive multiple speakers). In this case, one or more data
channels may be provided in the audio signal for each audio channel.
[0084] In the above embodiment, data was hidden within an audio signal by adding echoes
to the audio signal. In some situations, the incoming audio may already contain hidden
data in the form of such echoes. In this case, the encoder could decode the existing
hidden data from the received audio signal and then use the decoded data to clean
the audio signal to remove the artificial echoes defining this hidden data. The encoder
could then add new echoes to the thus cleaned audio signal to hide the new data in
the audio signal. In this way, the original hidden data will not interfere with the
new hidden data.
[0085] In the above embodiment, the echoes were obtained by delaying digital samples of
the audio signal. As those skilled in the art will appreciate, the echoes may be generated
in the analogue domain, using suitable analogue delay lines and analogue circuits
to perform the echo shaping and polarity modulation.
[0086] In the above embodiments, the audio signal with the embedded data was transmitted
to a receiver over an acoustic link. In an alternative embodiment, the audio signal
may be transmitted to the receiver over an electrical wire or wireless link. In such
an embodiment, the data rates that are used may be higher, due to lower noise levels.
[0087] In the above embodiment, one data bit was transmitted within each symbol period.
In an alternative embodiment, multiple bits may be transmitted within each symbol
period. For example a second pair of echoes may be added at lags of 20 ms and 20.25
ms within each half symbol period to encode a second bit; a third pair of echoes may
be added at lags of 30 ms and 30.25 ms within each half symbol period to encode a
third bit etc. Each echo could then be faded in and out during each half symbol period
and polarity modulated in accordance with the bit value as before. The fading in and
out of the echoes for the different bits may be the same or it may be different for
the different bits. The polarity modulation of the different echoes will of course
depend on the different bit values to be transmitted in the symbol period. In a preferred
embodiment, the echoes for the different bits within the same half symbol period are
faded in and out at different times of the half symbol period, so that the different
echoes reach their peak amplitudes at different times within the half symbol period.
In this way, when the echo for one bit is at its peak amplitude (or when all the echoes
for one bit are at their peak amplitudes - if there are multiple echoes representing
each bit in each half symbol period), the echoes for the other bits will not be at
their peaks. Doing this and sampling the different echoes when they are expected to
be at their peak amplitudes, will reduce the interference between the echoes for the
different bits within the same half symbol period. It also reduces constructive interference
of the echoes that may render the added echoes more noticeable to a listener. Looking
at this another way, this is the same as having multiple parallel data messages, each
encoded as per the embodiments described above, but with their respective symbol periods
offset in time from each other so that the echoes for the different messages peak
at different times - thereby reducing interference between the messages if the echoes
are all sampled around the time when they are each at their maximum amplitudes. This
technique will increase the bit rate of data transmission between the transmitter
and receiver. The additional bits may be of the same message or they may be bits of
different messages.
[0088] The inventors have found that the above described data hiding techniques do not work
as well during portions of the audio that include single tones or multiple harmonic
tones, such as would be found in some sections of music. This is because the hidden
data becomes more obtrusive to the listener in these circumstances and if the tones
are being used as part of an automatic setup procedure they can cause the procedure
to fail. Therefore, in one embodiment, the inventors propose to include (within the
encoder) a detector that detects the level of tonality or other characteristic of
the audio signal and, if it is highly tonal, that switches off the echo addition circuitry.
Alternatively, as this switching off of the echoes may itself be noticeable to the
user, the encoder may fade the echoes out during periods of high tonality and then
fade them back in during periods of low tonality. In this way, the data is only added
to the audio signal when the audio signal is not highly tonal in nature. Various techniques
may be used for making this detection. One technique for determining the level of
tonality of an audio signal (although for a different purpose) is described in the
applicant's earlier
PCT application WO02/45286, the contents of which are incorporated herein by reference. Another technique can
be found in
Davis P (1995) "A Tutorial on MPEG/Audio Compression", IEEE Multimedia Magazine, 2(2),
pp. 60-74. Instead of switching off the echo addition circuitry, the system may be arranged
to adapt the amplitude of the added echoes depending on the detected characteristic
of the audio signal. Alternatively, instead of varying the amplitudes of the echoes
in this way, the encoder may instead or in addition vary the data rate or the symbol
period in order to reduce the obtrusiveness of the hidden data during periods when
the audio signal is highly tonal.
[0089] An embodiment was described above in which a single message was encoded and transmitted
to a remote receiver as a number of echoes within an audio signal. In some applications,
a sequence of messages may be transmitted. These messages may be the same or they
may be different. In either case, each message may be transmitted after a preceding
message has been transmitted. Alternatively, the end of one message may be overlapped
with the start of the next message in a predefined way (so that the receiver can regenerate
each message). This arrangement can increase the time diversity of the transmitted
messages making them less susceptible to certain types of noise or data loss. In a
further alternative, the data from the different messages may be interleaved in a
known manner and transmitted as a single data stream to the receiver. The receiver
would then regenerate each message by de-interleaving the bits in the data stream
using knowledge of how the messages were originally interleaved.
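As a minimal sketch of the interleaving alternative, assuming equal-length messages and a simple round-robin pattern (the specification does not fix a particular pattern, and the function names are hypothetical):

```python
def interleave(messages):
    """Round-robin interleave equal-length bit lists into one data stream."""
    length = len(messages[0])
    if any(len(m) != length for m in messages):
        raise ValueError("this sketch assumes equal-length messages")
    # Take bit i of every message in turn, for i = 0 .. length-1.
    return [m[i] for i in range(length) for m in messages]

def deinterleave(stream, count):
    """Invert interleave(): every count-th bit belongs to the same message."""
    return [stream[j::count] for j in range(count)]
```

The receiver only needs to know the number of messages and the pattern used, which corresponds to the "knowledge of how the messages were originally interleaved" referred to above.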
[0090] As discussed above, convolutional coding is used as part of the forward error correction
(FEC) encoder. As is well known to those skilled in the art, data encoded in this
way is generally decoded using a Viterbi decoder, which operates by constructing a
trellis of state probabilities and branch metrics. The transmitted data is often terminated
with a number of zeros to force the encoder back to the zero state. This allows the
decoder to start decoding from a known state; however, it requires extra symbols to
be transmitted over the channel. An alternative technique is to ensure that the trellis
start and end states are identical. This technique is referred to as
tail biting and has the advantage of not requiring any extra symbols to be transmitted. Tail
biting is used in many communications standards and, if desired, may be used in the
embodiments described above.
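The difference between zero termination and tail biting can be illustrated with a small encoder. The sketch below assumes a rate-1/2 code with constraint length 3 and generators 7 and 5 (octal); these parameters and the function names are chosen for illustration and are not taken from the embodiments.

```python
def conv_encode(bits, state=(0, 0)):
    """Rate-1/2, K=3 convolutional encoder, generators 7 and 5 (octal).

    state holds the two previous input bits, most recent first.
    Returns the coded bits and the final encoder state.
    """
    s1, s2 = state
    out = []
    for b in bits:
        out.append(b ^ s1 ^ s2)  # generator 111
        out.append(b ^ s2)       # generator 101
        s1, s2 = b, s1
    return out, (s1, s2)

def encode_zero_terminated(bits):
    """Append K-1 zeros so the encoder finishes in the zero state."""
    return conv_encode(list(bits) + [0, 0])

def encode_tail_biting(bits):
    """Preload the register with the last K-1 message bits.

    The encoder then starts and ends in the same state, and no extra
    symbols are transmitted. Requires at least two message bits.
    """
    start = (bits[-1], bits[-2])
    coded, end = conv_encode(bits, start)
    return coded, start, end
```

For a six-bit message this encoder produces 16 coded bits with zero termination and 12 with tail biting; the cost is that the decoder no longer knows the start state in advance and must test or infer it.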
[0091] The description above has described the operation of a system for hiding data as
echoes within an audio signal. The systems described used time domain techniques to
generate and add the echoes and to detect the echoes in the received signal. As those
skilled in the art will appreciate, equivalent processing can be performed in the
frequency domain to achieve the same or similar results.
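As a hedged illustration of this equivalence, the sketch below adds a single echo to a block of samples in two ways: by a delayed copy in the time domain, and by multiplying each frequency bin by 1 + g·exp(-j2πkd/N). It assumes a circular delay within one block, which keeps the example exact; a practical frequency domain implementation would need overlapping blocks. All names are hypothetical, and a negative gain stands for an inverted-polarity echo.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a list of samples."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def add_echo_time(x, delay, gain):
    """y[t] = x[t] + gain * x[(t - delay) mod N], a circular delay."""
    n = len(x)
    return [x[t] + gain * x[(t - delay) % n] for t in range(n)]

def add_echo_freq(x, delay, gain):
    """The same echo applied as a per-bin multiplication of the spectrum."""
    n = len(x)
    spectrum = dft(x)
    shaped = [spectrum[k] * (1 + gain * cmath.exp(-2j * math.pi * k * delay / n))
              for k in range(n)]
    return [value.real for value in idft(shaped)]
```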
[0092] The inventors have found that in some instances, the decoder does not work as well
when the message consists of predominantly 'zero' bits (or conversely predominantly
'one' bits), since under the encoding scheme an 'all zeros' codeword segment looks
the same as a time-shifted 'all ones' codeword segment. A particular example is the
'all zeros' message, which results in an 'all zeros' codeword after Reed Solomon encoding.
The encoding works best when there are approximately equal numbers of ones and zeros
in the codeword, evenly distributed throughout the codeword. This can be achieved
for the disclosed system by inverting the Reed Solomon parity bits. This has the effect
of changing the 'all zeros' codeword to a mixture of zeros and ones. This can also
be achieved by altering the initial state of the feedback shift register used within
the Reed Solomon encoder which is used to generate the parity bits. This gives more
flexibility in setting the ratio of ones to zeros in the codeword. Subsequent interleaving
distributes these inverted parity bits throughout the codeword. As those skilled in
the art of error detection and correction will appreciate, these approaches to balancing
the distribution of ones and zeros apply to any of the many FEC schemes implemented
using feedback shift registers (or Galois field arithmetic) of which Reed Solomon
is an example.
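The two balancing approaches can be illustrated on a much smaller code. The sketch below uses a binary (7,4) cyclic code with generator x^3 + x + 1 as a stand-in for Reed Solomon, which operates on multi-bit symbols rather than single bits; the function names and the `init` and `invert_parity` parameters are hypothetical.

```python
GEN_LOW = 0b011  # low-order terms of g(x) = x^3 + x + 1
R = 3            # number of parity bits

def parity_bits(msg, init=0):
    """Parity from a feedback shift register dividing msg(x) * x^R by g(x).

    init is the starting register content; 0 gives the standard code.
    """
    reg = init
    for b in msg:
        feedback = ((reg >> (R - 1)) & 1) ^ b
        reg = (reg << 1) & ((1 << R) - 1)
        if feedback:
            reg ^= GEN_LOW
    return [(reg >> i) & 1 for i in range(R - 1, -1, -1)]

def encode(msg, invert_parity=False, init=0):
    """Systematic codeword: message bits followed by (optionally inverted) parity."""
    parity = parity_bits(msg, init)
    if invert_parity:
        parity = [p ^ 1 for p in parity]
    return list(msg) + parity

def is_codeword(word, invert_parity=False, init=0):
    """Receiver-side check; it must apply the same inversion or initial state."""
    return encode(word[:-R], invert_parity, init) == list(word)
```

With the standard settings the 'all zeros' message encodes to an 'all zeros' codeword, whereas `encode([0, 0, 0, 0], invert_parity=True)` gives `[0, 0, 0, 0, 1, 1, 1]`. The decoder has to undo whichever modification the encoder applied before checking or correcting the codeword.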
[0093] In the above embodiments, a number of processing modules and circuits have been described.
As those skilled in the art will appreciate, these processing modules and circuits
may be provided as hardware circuits or as software modules running within memory
of a general purpose processor. In the latter case, the software may be provided on a storage
medium such as a CD-ROM or it may be downloaded into an appropriate programmable device
on a carrier signal over a computer network, such as the internet. The software may
be provided in compiled form, partially compiled form or in un-compiled form.
[0094] The present application also includes the following numbered clauses:
- 1. A method of embedding a data value in an audio signal, the method comprising:
receiving the audio signal;
generating an echo of at least a portion of the received audio signal; and
embedding the data value in the audio signal by combining the received audio signal
with the generated echo;
wherein said data value is embedded in the audio by varying the polarity of the echo
that is combined with the audio signal in dependence upon the data value.
- 2. A method according to clause 1, wherein said generating generates a first echo
of at least a portion of the received audio signal and a second echo of at least a
portion of the received audio signal; wherein said embedding embeds the data value
in the audio signal by combining the received audio signal with the generated first
and second echoes; and wherein the first and second echoes that are combined with
the audio signal have first and second polarities respectively, which polarities vary
in dependence upon the data value.
- 3. A method according to clause 2, wherein each of said echoes is generated by repeating
at least a part of said audio signal.
- 4. A method according to clause 2, wherein said first echo is generated by repeating
a first portion of said audio signal and said second echo is generated by repeating
a second portion of said audio signal.
- 5. A method according to clause 2, wherein said first and second echoes are generated
by repeating substantially the same first portion of the audio signal.
- 6. A method according to clause 5, further comprising generating third and fourth
echoes, the third echo having the same polarity, when combined with said audio signal,
as said second echo and the fourth echo having the same polarity, when combined with
said audio signal, as said first echo and wherein the third and fourth echoes are
generated by repeating substantially the same second portion of the audio signal which
is different to the first portion repeated by the first and second echoes.
- 7. A method according to clause 6, wherein said second portion is adjacent the first
portion.
- 8. A method according to clause 6 or 7, wherein said first echo is combined with said
audio signal at a first delay relative to said first portion of the audio signal;
said second echo is combined with the audio signal at a second delay relative to said
first echo; said third echo is combined with said audio signal at a third delay relative
to said second portion of the audio signal; and said fourth echo is combined with
the audio signal at a fourth delay relative to said third echo.
- 9. A method according to clause 8, wherein said first delay is equal to said third
delay and/or said second delay is equal to said fourth delay.
- 10. A method according to clause 8 or 9, wherein said delays are independent of said
data value.
- 11. A method according to any of clauses 8 to 10, wherein said first and third delays
are between 0.5 ms and 100 ms and wherein said second and fourth delays are spaced
from said first and third echoes respectively by between 0.125 ms and 3 ms.
- 12. A method according to any of clauses 6 to 11, wherein said first and second portions
of the audio signal have a duration of between 20 ms and 500 ms.
- 13. A method according to any of clauses 6 to 12, wherein said generating generates
said third and fourth echoes by inverting the polarity of a gain factor applied to
the echoes before being combined with the audio signal.
- 14. A method according to any of clauses 2 to 13, wherein said first echo has an opposite
polarity to said second echo.
- 15. A method according to any preceding clause wherein the or each echo has an amplitude
that is less than the amplitude of said audio signal.
- 16. A method according to any preceding clause wherein the amplitude of the or each
echo is independent of said data value.
- 17. A method according to any preceding clause, wherein the or each echo is faded
in and out to reduce obtrusiveness of the echoes to a listener.
- 18. A method according to any preceding clause wherein said audio signal is received
as a stream of samples and wherein the or each echo is generated by passing the stream
of audio samples through a delay line.
- 19. A method according to any preceding clause wherein said combining step combines
the or each echo with the audio signal by adding and/or subtracting each echo to/from
the audio signal in dependence upon the data value.
- 20. A method according to any of clauses 1 to 18, wherein said generating generates
said echo such that the polarity of the generated echo depends on the data value.
- 21. A computer executable instructions product comprising computer implementable instructions
for causing a programmable computer device to carry out the method of any of clauses
1 to 20.
- 22. An apparatus for embedding a data value in an audio signal, the apparatus comprising:
an input for receiving the audio signal;
an echo generator operable to generate an echo of at least a portion of the received
audio signal; and
a combiner operable to combine the received audio signal with the generated echo to
embed the data value in the audio signal;
wherein said echo generator and/or said combiner is arranged so that said data value
is embedded in the audio by varying the polarity of the echo that is combined with
the audio signal in dependence upon the data value.
- 23. An apparatus according to clause 22, operable to perform the method of any of
clauses 2 to 20.
- 24. A method of recovering a data value embedded in an audio signal, the method comprising:
receiving an input signal having the audio signal and an echo of the audio signal
whose polarity depends upon said data value;
processing the received input signal to determine the polarity of the echo; and
recovering the data value from the determined polarity.
- 25. A method according to clause 24, wherein said input signal comprises a first echo
of at least a portion of the audio signal and a second echo of at least a portion
of the audio signal, the first and second echoes having first and second polarities
respectively, which polarities vary in dependence upon the data value; and wherein
said processing processes said input signal to combine said first and second echoes
and to determine the polarity of the combined echoes and wherein said recovering recovers
the data value from the determined polarity of the combined echoes.
- 26. A method according to clause 25, wherein said processing processes said input
signal to determine a first measure that depends upon the first echo and a second
measure that depends upon the second echo and combines the echoes by differencing
the first and second measures and determines the polarity of the combined echoes by
determining the polarity of the result of the differencing.
- 27. A method according to clause 25 or 26, wherein said first echo is of a first portion
of the audio signal and said second echo is of a second portion of the audio signal.
- 28. A method according to clause 25 or 26, wherein said first and second echoes are
repeats of substantially the same portion of the audio signal.
- 29. A method according to clause 24, wherein said input signal comprises first, second,
third and fourth echoes, the first and fourth echoes having the same polarity and
the second and third echoes having the same polarity which is opposite to the polarity
of the first and fourth echoes, wherein said processing step processes the input signal
to combine said first to fourth echoes and to determine the polarity of the combined
echoes and wherein said recovering step recovers the data value from the determined
polarity of the combined echoes.
- 30. A method according to clause 29, wherein said processing processes said input
signal to determine a first measure that depends upon the first echo, a second measure
that depends upon the second echo, a third measure that depends upon the third echo
and a fourth measure that depends upon the fourth echo and combines the echoes by
differencing the measures and determines the polarity of the combined echoes by determining
the polarity of a result of the differencing.
- 31. A method according to clause 30, wherein said differencing performs a first difference
of the first and third measures and performs a second difference of the second and
fourth measures.
- 32. A method according to clause 31, wherein said differencing performs a third difference
of the result of said first difference and the result of the second difference and
wherein the polarity of the combined echoes is determined from the polarity of a result
of the third difference.
- 33. A method according to any of clauses 29 to 32, wherein the first and second echoes
are repeats of substantially the same first portion of the audio signal and said third
and fourth echoes are repeats of substantially the same second portion of the audio
signal.
- 34. A method according to any of clauses 29 to 32, wherein the first and third echoes
are repeats of substantially the same first portion of the audio signal and said second
and fourth echoes are repeats of substantially the same second portion of the audio
signal.
- 35. A method according to clause 33 or 34, wherein said second portion is adjacent
the first portion.
- 36. A method according to clause 33, wherein said first echo is delayed relative to
said first portion of the audio signal by a first delay; said second echo is delayed
relative to said first echo by a second delay; said third echo is delayed relative
to said second portion of the audio signal by a third delay; and said fourth echo
is delayed relative to said third echo by a fourth delay.
- 37. A method according to clause 36, wherein said first delay is equal to said third
delay and/or said second delay is equal to said fourth delay.
- 38. A method according to clause 36 or 37, wherein said first and third delays are
between 0.5 ms and 100 ms and wherein said second and fourth delays are delayed relative
to said first and third echoes respectively by between 0.125 ms and 3 ms.
- 39. A method according to clause 33, wherein said first echo is delayed relative to
said first portion of the audio signal by a first delay; said second echo is delayed
relative to said second portion of the audio signal by a second delay; said third
echo is delayed relative to said first echo by a third delay; and said fourth echo
is delayed relative to said second echo by a fourth delay.
- 40. A method according to any of clauses 36 to 39, wherein said delays are independent
of said data value.
- 41. A method according to any of clauses 33 to 40, wherein said first and second portions
of the audio signal have a duration between 20 ms and 500 ms.
- 42. A method according to any of clauses 25 to 41, wherein said first echo has an
opposite polarity to said second echo.
- 43. A method according to any of clauses 24 to 42, wherein the or each echo is faded
in and out to reduce obtrusiveness of the echoes to a listener.
- 44. A method according to clause 43, wherein the polarity of the echo is determined
when the amplitude of the echo is at or near a maximum.
- 45. A computer implementable instructions product comprising computer implementable
instructions for causing a programmable computer device to carry out the method of
any of clauses 24 to 44.
- 46. An apparatus for recovering a data value embedded in an audio signal, the apparatus
comprising:
an input that receives an input signal having the audio signal and an echo of the
audio signal whose polarity depends upon said data value;
a processor operable to process the input signal to determine the polarity of the
echo; and
a data regenerator operable to recover the data value from the determined polarity.
- 47. An apparatus according to clause 46, operable to perform the method of any of
clauses 25 to 44.
- 48. A method according to any of clauses 1 to 20, performed on each of a plurality
of data values that form part of an encoded data message, wherein the data message is
encoded using an FEC encoding technique that performs a cyclic encoding and a convolutional
encoding of the message data and in which the order of the cyclic encoded message
data is changed before the convolutional encoding.
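For illustration only, the polarity modulation of clauses 1 and 19 and the recovery of clause 24 can be sketched with a single echo per symbol. The sketch assumes a noise-like test signal, so that the autocorrelation of the received signal at the echo lag is dominated by the added echo; it omits the multiple opposite-polarity echoes and the differencing of the later clauses, which address real audio and natural echoes. The function names, the gain of 0.3, the delay in samples and the symbol length are hypothetical.

```python
import random

def noise_like_audio(n, seed=1):
    """Stand-in test signal: seeded pseudo-random samples in [-1, 1]."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def embed_bits(audio, bits, symbol_len, delay, gain=0.3):
    """Combine the audio with a delayed copy whose polarity follows each bit."""
    out = []
    for n, sample in enumerate(audio):
        symbol = n // symbol_len
        echo = audio[n - delay] if n >= delay else 0.0  # simple delay line
        if symbol < len(bits):
            polarity = 1.0 if bits[symbol] else -1.0    # add or subtract
            sample = sample + polarity * gain * echo
        out.append(sample)
    return out

def recover_bits(received, n_bits, symbol_len, delay):
    """Estimate each bit from the sign of the correlation at the echo lag."""
    bits = []
    for symbol in range(n_bits):
        start = max(symbol * symbol_len, delay)
        stop = (symbol + 1) * symbol_len
        corr = sum(received[n] * received[n - delay]
                   for n in range(start, stop))
        bits.append(1 if corr > 0 else 0)
    return bits
```

The amplitude and delay of the echo are the same for both data values here; only the sign of the gain carries the data, which is the distinction the clauses draw.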
1. A method of recovering a data message embedded in an audio signal, the data message
being FEC encoded and embedded in the audio signal as a plurality of echoes, the method
comprising:
receiving an input signal having the audio signal and the echoes; and
processing the input signal to detect the echoes and to recover the embedded data
message;
wherein the processing includes synchronizing the processing of the input signal with
the embedded data message; wherein the processing performs an FEC decoding on recovered
data; and wherein the synchronizing uses an error signal from the FEC decoding to
control the synchronization of the processing to the embedded data message.
2. A method according to claim 1, wherein the receiving receives an input signal corresponding
to a sequence of symbols ending with a current symbol, wherein the processing processes
the input signal corresponding to the current symbol and the N-1 preceding symbols,
where N is a number of symbols in the data message, to detect echoes and to recover
a possible message, wherein the possible message is processed by said FEC decoding
to generate a candidate data message, wherein the FEC decoding generates error data
relating to the generation of the candidate data message and wherein the candidate
data message is discarded in dependence upon the error data.
3. A method according to claim 2, wherein the processing is repeated after receipt of
input signal corresponding to a next symbol until the error data meets a predetermined
condition that indicates synchronisation of the processing to the embedded data message.
4. A method according to claim 3, wherein the predetermined condition is that the error
data indicates that a number of errors is less than a threshold or is at a minimum
value.
5. A method according to claim 1, wherein the input signal comprises a sequence of data
messages and wherein a synchronisation timing obtained for one data message is used
to identify a synchronisation timing for a subsequent data message in the sequence.
6. A method according to claim 1, wherein the FEC decoding includes a cyclic decoding
and further comprises a re-ordering of the recovered data before the cyclic decoding
to avoid false detection of a codeword.
7. A method according to claim 6, wherein the FEC decoding includes a convolutional decoding
prior to the cyclic decoding and wherein said re-ordering of the recovered data is
performed between the convolutional decoding and the cyclic decoding.
8. A method according to claim 6, wherein the re-ordering performs a pseudo random re-ordering
of the recovered data prior to cyclic decoding.
9. An apparatus for recovering a data message embedded in an audio signal, the data message
being FEC encoded and embedded in the audio signal as a plurality of echoes, the apparatus
comprising:
an echo detector that receives an input signal having the audio signal and the echoes
and that processes the input signal to identify echoes within the input signal;
a data recovery module that processes the identified echoes to recover data corresponding
to the identified echoes;
an FEC decoder for performing FEC decoding of the recovered data to regenerate the
data message; and
a controller, responsive to an error signal from the FEC decoder, to control the operation
of the FEC decoder to synchronize the processing of the input signal with the embedded
data message.
10. An apparatus according to claim 9, wherein the echo detector is configured to receive
an input signal corresponding to a sequence of symbols, wherein the data recovery
module is configured to process echoes detected by the echo detector in a current
symbol and N-1 preceding symbols, where N is the number of symbols within the data
message, to recover a possible message, wherein the FEC decoder is configured to process
the possible message to generate a candidate data message, wherein the FEC decoder
is configured to generate error data indicating errors in the candidate data message
and wherein the controller is configured to cause the candidate data message to be
discarded in dependence upon the generated error data.
11. An apparatus according to claim 10, wherein after receipt of input signal corresponding
to a next symbol, the data recovery module is configured to recover a next possible
message and the FEC decoder is configured to generate a next candidate data message,
until the error data for the candidate data message meets a predetermined condition
that indicates synchronisation of the processing to the embedded data message.
12. An apparatus according to claim 11, wherein the predetermined condition is that the
error data is less than a threshold or is at a minimum value.
13. An apparatus according to claim 10, wherein the controller is configured to cause
the candidate data message to be discarded if the error data is greater than a threshold.
14. An apparatus according to claim 9, wherein the input signal comprises a sequence of
data messages and wherein a synchronisation timing obtained for one data message is
used to identify a synchronisation timing for a subsequent data message in the sequence.
15. An apparatus according to claim 9, wherein the FEC decoder includes a cyclic decoder
and is configured to re-order the recovered data before the cyclic decoding to avoid
false detection of a codeword.
16. An apparatus according to claim 15, wherein the FEC decoder includes a convolutional
decoder and a cyclic decoder and is configured to re-order the recovered data between
the convolutional decoding performed by the convolutional decoder and a cyclic decoding
performed by the cyclic decoder.
17. An apparatus according to claim 15, wherein the re-ordering is a pseudo random re-ordering
of the recovered data.
18. An apparatus according to claim 9, wherein each data symbol is represented by one
or more echoes.
19. A computer implementable instructions product comprising computer implementable instructions
for causing a programmable computer device to perform a method of recovering a data
message embedded in an audio signal, the data message being FEC encoded and embedded
in the audio signal as a plurality of echoes, the computer implementable instructions
comprising:
instructions to process a received input signal that has the audio signal and the
echoes, to detect the echoes and to recover the embedded data message;
wherein the instructions to process the received input signal include instructions
to synchronize the processing of the input signal with the embedded data message;
wherein the instructions include instructions to perform an FEC decoding on recovered
data; and
wherein the instructions to synchronize use an error signal from the FEC decoding
to control the synchronization of the processing to the embedded data message.
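By way of a non-limiting sketch, the synchronisation recited in claims 1 to 4 can be pictured as a sliding window over the recovered symbols: after each new symbol, the current symbol and the N-1 preceding symbols are FEC decoded, and the candidate message is discarded unless the decoder's error data meets the condition. The sketch assumes a three-fold repetition code as a stand-in for the FEC scheme of the embodiments, and the names `fec_decode`, `synchronise` and `max_errors` are hypothetical.

```python
from collections import deque

REPEAT = 3  # stand-in FEC: each data bit is transmitted three times

def fec_decode(symbols):
    """Majority-vote decode; returns (data_bits, error_count)."""
    data, errors = [], 0
    for i in range(0, len(symbols), REPEAT):
        group = symbols[i:i + REPEAT]
        ones = sum(group)
        bit = 1 if ones * 2 > REPEAT else 0
        data.append(bit)
        # Count the symbols that disagree with the majority decision.
        errors += (len(group) - ones) if bit else ones
    return data, errors

def synchronise(symbol_stream, n_symbols, max_errors=0):
    """Return (index of last symbol, message) for the first accepted window."""
    window = deque(maxlen=n_symbols)
    for index, symbol in enumerate(symbol_stream):
        window.append(symbol)
        if len(window) < n_symbols:
            continue                      # not enough symbols received yet
        candidate, errors = fec_decode(list(window))
        if errors <= max_errors:
            return index, candidate       # error data indicates synchronisation
        # Otherwise the candidate is discarded and the next symbol awaited.
    return None
```

A weak code such as this one can accept a misaligned window when `max_errors` is raised, which is one reason a stronger FEC scheme and a suitable threshold matter in practice.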