[0001] The invention relates to a method and to an apparatus for generating reference phase
patterns to be used for controlling watermarking by frequency domain phase modulation
of spectra of an audio signal.
Background
[0002] For real-time applications of audio watermarking in the context of home networking
and for efficient implementation, e.g. in set-top boxes, low-complexity watermark
embedding and/or detection is desired, provided that robustness and perceptual transparency
are not sacrificed significantly.
[0003] In WO 2007/031423 A1 and in M. Arnold, P.G. Baum, W. Voessing, "A phase modulation
audio watermarking technique", 11th Information Hiding Workshop, pages 102-116, 2009,
watermark embedding occurs in the phase domain by modifying phases of the host signal
spectrum according to phase reference patterns. Specifically, the host audio signal is
split into blocks of length $L_B$ and each block is used for embedding a watermark symbol.
The watermark message is represented by a sequence of symbols. For embedding, each block
can be further partitioned into $N_{SB}$ overlapping subblocks of length $L_{SB}$ samples with an
overlap length of $L_{SB}/2$. Following weighting of each subblock by a windowing function,
phases of the weighted subblock spectrum are modified according to generated reference
phases and according to results from a psycho-acoustic analysis of the host audio signal.
Depending on a secret key SK (e.g. as seed of a random generator), reference phases for
each watermark message symbol are generated by the following steps:
B1) Independently generate random phase vectors $\theta_i$ of length $L_{SB}$ whose elements are
uniformly distributed over $[-\pi,\pi)$, one vector for each of the $L_B/L_{SB}$ non-overlapping
subblocks of a block.
B2) Generate time-domain subblocks via inverse discrete Fourier transform (IDFT) of
a normalised spectrum: $a'_i = \mathrm{IDFT}\{e^{j\theta_i}\}$. Concatenate the time-domain subblocks
to get an $L_B$-length signal $a$.
B3) Divide block $a$ into $N_{SB}$ overlapped subblocks of length $L_{SB}$ with an overlap length
of $L_{SB}/2$, multiply each subblock with a windowing function, and perform a DFT of the
weighted subblocks to get reference spectra.
B4) Extract the angles of the reference spectra, which serve as reference angles for the
data embedding.
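Steps B1) to B4) can be sketched as follows. This is a minimal NumPy illustration and not the implementation of the cited works: the function name, the integer seed standing in for the secret key SK, and the sine window are assumptions made for the example.

```python
import numpy as np

def baseline_reference_phases(seed, n_sb=31, l_sb=1024, window=None):
    """Steps B1-B4: return an (n_sb, l_sb) array of reference angles."""
    rng = np.random.default_rng(seed)        # assumption: SK is used as the seed
    n_even = (n_sb + 1) // 2                 # non-overlapping subblocks per block
    if window is None:
        # assumption: a sine window; the windows of the cited works may differ
        window = np.sin(np.pi * (np.arange(l_sb) + 0.5) / l_sb)
    # B1: independent phase vectors, uniform over [-pi, pi)
    theta = rng.uniform(-np.pi, np.pi, size=(n_even, l_sb))
    # B2: IDFT of unit-magnitude spectra, concatenated into one block
    a = np.fft.ifft(np.exp(1j * theta), axis=1).reshape(-1)
    # B3: 50%-overlapping subblocks, windowing, DFT
    hop = l_sb // 2
    sub = np.stack([a[m * hop: m * hop + l_sb] for m in range(n_sb)])
    spectra = np.fft.fft(sub * window, axis=1)
    # B4: angles of the reference spectra
    return np.angle(spectra)

phases = baseline_reference_phases(seed=1234, n_sb=7, l_sb=64)
```

Each call performs one IDFT per non-overlapping subblock and one DFT per overlapping subblock.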
[0004] Generating phase vectors at the beginning enables the flexibility of controlling
the phase vectors in case structured phase vectors rather than randomly generated
ones will be employed for the embedding. Step B3) mimics the windowing-overlap process
that is commonly used in the analysis stage of digital audio signal processing. A
windowing function is selected such that an ideal reconstruction is guaranteed following
ideal analysis (windowing-overlap) and synthesis (windowing-overlap-add) processes.
Finally, generated reference angles in step B4) are employed to modify host audio
spectrum phases after the host signal has undergone the windowing-overlap processing.
Invention
[0005] One possible problem of such phase generation processing is that many IDFTs/DFTs
are required, which imposes constraints on the achievable speed of hardware/software
implementations, and the deployment of related audio watermarking schemes is therefore
limited for real-time applications. For example, in a home network where AV material
is sent to the customer, a set-top box watermarks the audio content with a label that
identifies the customer. In a general setting a copyright owner will license
multiple different works to the customer. In order to avoid the serious security drawback
of using only one secret key for all AV material, different keys are used for individual
copyrighted works. Therefore efficient reference phase generation is a necessary prerequisite
for deploying an audio watermarking scheme in such an application scenario.
[0006] A problem to be solved by the invention is to provide more efficient generation of
the reference phase patterns. This problem is solved by the method disclosed in claim
1. An apparatus that utilises this method is disclosed in claim 6.
[0007] The efficient generation of random reference phase patterns is carried out by approximating
the inverse transform of the phase reference patterns, and the corresponding forward
transform and the windowing and windowing-overlapping effects using two different
short-length frequency-domain filters for even and odd blocks or subblocks. Thereby
the computational complexity for generating the reference phase pattern blocks or
subblocks for frequency domain phase modulation of the audio signal blocks or subblocks
is significantly reduced, which in turn reduces the complexity of both watermark embedder
at sender side and watermark detector at receiver side.
[0008] The inventive reference phase generation fulfils critical real-time requirements
and enables a wide deployment of corresponding audio watermarking schemes. Advantageously,
the flexibility of controlling the generated phases is maintained, e.g. in case structured
phases are used instead of random ones.
[0009] In principle, the inventive method is suited for generating reference phase patterns
to be used for controlling watermarking by frequency domain phase modulation of spectra
of an audio signal, said method including the steps:
- based on at least one secret key, generating blocks with random or pseudo-random phases
for said watermarking for frequency domain blocks of said audio signal,
- approximating - for said blocks with random or pseudo-random phases - an inverse transform
and windowing and a corresponding transform and overlap+add by a first frequency domain
filtering for even ones of said blocks and a different second frequency domain filtering
for odd ones of said blocks, thereby generating approximated versions of said reference
phase patterns for said blocks;
- using said approximated versions for said phase modulation.
[0010] In principle the inventive apparatus is suited for generating reference phase patterns
to be used for controlling watermarking by frequency domain phase modulation of spectra
of an audio signal, said apparatus including:
- means being adapted for generating - based on at least one secret key - blocks with
random or pseudo-random phases for said watermarking for frequency domain blocks of
said audio signal;
- means being adapted for approximating - for said blocks with random or pseudo-random
phases - an inverse transform and windowing and a corresponding transform and overlap+add
by a first frequency domain filtering for even ones of said blocks and a different
second frequency domain filtering for odd ones of said blocks, thereby generating
approximated versions of said reference phase patterns for said blocks, wherein said
approximated versions are to be used for said phase modulation.
[0011] Advantageous additional embodiments of the invention are disclosed in the respective
dependent claims.
Drawings
[0012] Exemplary embodiments of the invention are described with reference to the accompanying
drawings, which show in:
Fig. 1 known reference spectra generation;
Fig. 2 magnitude distribution of F[k'-k];
Fig. 3 DFT spectrum of windowing function w1[n] for LSB = 1024;
Fig. 4 principle of inventive reference spectra generation;
Fig. 5 block diagram for the inventive watermark embedder;
Fig. 6 inventive phase generation procedure.
Exemplary embodiments
[0013] According to the invention, for the watermark embedding and for watermark detection
reference spectra are directly evaluated, which approximates above steps B2) and B3).
To elaborate this, the original reference phase generation is investigated in detail.
In the sequel, a subblock with an even/odd subscript is referred to as an even/odd
subblock, respectively.
[0014] Regarding above step B3), a signal block $a$ is partitioned into $N_{SB}$ overlapped
(50% overlap) subblocks of length $L_{SB}$, denoted as $\{a_m, 0 \le m < N_{SB}\}$. Accordingly,
the even subblocks are non-overlapping and each odd subblock can be obtained from its two
adjacent even subblocks. Specifically, the $(2i+1)$-th (i.e. odd) subblock is constructed
from the upper half of subblock $a_{2i}$ and the lower half of subblock $a_{2i+2}$ as follows:
$$a_{2i+1}[n] = \begin{cases} a_{2i}[n + L_{SB}/2], & 0 \le n < L_{SB}/2 \\ a_{2i+2}[n - L_{SB}/2], & L_{SB}/2 \le n < L_{SB} \end{cases} \qquad (1)$$
[0015] The even subblocks are obtained according to step B2) as $a_{2i} = \mathrm{IDFT}\{e^{j\theta_i}\}$.
Next, the even and odd subblocks are weighted by a windowing function, denoted as
$\{w[n], 0 \le n < L_{SB}\}$, and a windowed subblock is obtained as
$a_{w,m}[n] = a_m[n]\,w[n]$, $0 \le n < L_{SB}$, $0 \le m < N_{SB}$. Finally, a DFT of the windowed
subblocks $a_{w,m}$ yields the reference spectra, whose phases serve as reference phases for
the watermark embedding.
[0016] This known reference spectra generation processing is depicted in Fig. 1.
Reference spectra evaluation
[0017] How can the reference spectra, i.e. the DFTs of $\{a_{w,m}, 0 \le m < N_{SB}\}$, be evaluated
directly? For the DFT, multiplication in the time domain corresponds to circular convolution
in the frequency domain. Therefore reference spectra can be obtained via circular convolution
of the spectra for subblocks before windowing and of the spectrum of the windowing function.
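[0018] $\{A_m[k], 0 \le k \le L_{SB}-1\}$ and $\{W[k], 0 \le k \le L_{SB}-1\}$ shall denote the DFTs of the
m-th subblock $a_m$ and of the windowing function, respectively. The reference spectrum for
the m-th subblock is obtained by
$$A_{w,m}[k] = \frac{1}{L_{SB}} \sum_{k'=0}^{L_{SB}-1} A_m[k']\, W[\langle k-k' \rangle] \qquad (2)$$
where the term '<...>' denotes a modulo operation that is defined by
$\langle n \rangle = n - \lfloor n/L_{SB} \rfloor L_{SB}$, which maps any integer number to the interval
$[0, L_{SB})$, and where $\lfloor x \rfloor$ denotes the greatest integer number not larger than $x$.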
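The correspondence between time-domain windowing and circular convolution of the spectra can be checked numerically. The sketch below assumes NumPy's unnormalised forward DFT convention (hence the explicit factor 1/L) and a sine window chosen only for illustration; the helper name is hypothetical.

```python
import numpy as np

def circ_conv_spectrum(A, W):
    """(1/L) * sum_k' A[k'] * W[<k - k'>] for every k, with <n> = n mod L."""
    L = len(A)
    k = np.arange(L)
    idx = (k[:, None] - k[None, :]) % L   # Python's % already maps into [0, L)
    return (W[idx] @ A) / L

rng = np.random.default_rng(0)
L = 64
a = rng.standard_normal(L) + 1j * rng.standard_normal(L)   # arbitrary subblock
w = np.sin(np.pi * (np.arange(L) + 0.5) / L)               # assumed example window

direct = np.fft.fft(a * w)                                  # window, then DFT
via_conv = circ_conv_spectrum(np.fft.fft(a), np.fft.fft(w)) # convolve the spectra
max_err = np.max(np.abs(direct - via_conv))                 # rounding-level difference
```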
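[0019] Therefore the reference spectrum for an even subblock is
$$A_{w,2i}[k] = \frac{1}{L_{SB}} \sum_{k'=0}^{L_{SB}-1} A_{2i}[k']\, W[\langle k-k' \rangle], \quad A_{2i}[k'] = e^{j\theta_i[k']} \qquad (3)$$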
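[0020] For an odd subblock, the subblock spectrum is calculated according to Eq.(1) by
$$A_{2i+1}[k] = \frac{(-1)^k}{2}\left(A_{2i}[k] + A_{2i+2}[k]\right) + \frac{(-1)^k}{L_{SB}} \sum_{k'=0}^{L_{SB}-1} \left(A_{2i}[k'] - A_{2i+2}[k']\right) F[k'-k] \qquad (4)$$
where
$$F[k'-k] = \frac{(-1)^{k'-k} - 1}{1 - e^{j2\pi(k'-k)/L_{SB}}}$$
if $k' \ne k$, and $F[0] = 0$.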
[0021] The first term in Eq.(4) is the contribution from two spectral lines of the same
index within two adjacent subblocks and the second term is the contribution from other
spectral lines. The reference spectrum for an odd subblock is obtained by inserting
Eq.(4) into Eq.(2). Therefore, for the evaluation of reference spectra, two frequency-domain
filters $\{F[k]\}$ and $\{W[k]\}$ are involved. If they can be well-approximated by a few filter
taps, an efficient evaluation of reference spectra is feasible.
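The split of an odd subblock spectrum into a same-index term and a cross-index term can be reproduced numerically. In the sketch below the cross-index kernel is derived directly from the half-block geometric sum, so its sign and normalisation are assumptions of this example and may differ from F by a constant factor; the function name is hypothetical.

```python
import numpy as np

def odd_spectrum_closed_form(A_prev, A_next):
    """Spectrum of the odd subblock from the spectra of its two even neighbours."""
    L = len(A_prev)
    k = np.arange(L)
    d = k[None, :] - k[:, None]                    # d[k, k'] = k' - k
    with np.errstate(divide="ignore", invalid="ignore"):
        # sum_{n < L/2} exp(j*2*pi*d*n/L): zero for even d != 0
        S = (1 - (-1.0) ** d) / (1 - np.exp(2j * np.pi * d / L))
    S[d == 0] = 0.0                                # same-index term handled separately
    sign = (-1.0) ** k
    same = 0.5 * sign * (A_prev + A_next)          # same-index contribution
    cross = sign * (S @ (A_next - A_prev)) / L     # contribution of all other lines
    return same + cross

rng = np.random.default_rng(1)
L = 64
A0 = np.exp(1j * rng.uniform(-np.pi, np.pi, L))    # unit-magnitude even spectra
A2 = np.exp(1j * rng.uniform(-np.pi, np.pi, L))
a0, a2 = np.fft.ifft(A0), np.fft.ifft(A2)
a1 = np.concatenate([a0[L // 2:], a2[:L // 2]])    # upper half of a_2i, lower half of a_2i+2
err = np.max(np.abs(np.fft.fft(a1) - odd_spectrum_closed_form(A0, A2)))
```

The dense L-by-L matrix is used only to verify the identity; it is not meant as an efficient evaluation.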
Efficient reference phase generation by filter approximation
Approximation for odd subblock spectrum
[0022] The coefficient $F[k'-k]$ in Eq.(4) has the following properties:
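[0023] For a typical subblock length, only a few central terms of $F[k'-k]$ are dominant
with respect to the magnitude, as shown in Fig. 2 for $L_{SB} = 1024$. Assuming that there are
$2L_1$ central dominant terms, i.e. those with $0 < |k'-k| \le L_1$, the Eq.(4) spectrum of an odd
subblock can be approximated by setting $q = k'-k$:
$$A_{2i+1}[k] \approx \frac{(-1)^k}{2}\left(A_{2i}[k] + A_{2i+2}[k]\right) + \frac{(-1)^k}{L_{SB}} \sum_{\substack{q=-L_1 \\ q \ne 0}}^{L_1} \left(A_{2i}[\langle k+q \rangle] - A_{2i+2}[\langle k+q \rangle]\right) F[q] \qquad (6)$$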
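How quickly the cross-index coupling decays can be illustrated with a short computation. The kernel below is the half-block geometric sum scaled by 1/L, which is an assumed normalisation for this example; the function names are hypothetical.

```python
import numpy as np

def cross_kernel(L):
    """Coupling between bins k and k' of adjacent subblocks, indexed by d = k' - k."""
    d = np.arange(-(L // 2), L // 2)
    F = np.zeros(L, dtype=complex)
    nz = d != 0
    F[nz] = (1 - (-1.0) ** d[nz]) / (1 - np.exp(2j * np.pi * d[nz] / L)) / L
    return d, F

def captured_energy(L, L1):
    """Fraction of the kernel energy carried by the taps with |d| <= L1."""
    d, F = cross_kernel(L)
    e = np.abs(F) ** 2
    return e[np.abs(d) <= L1].sum() / e.sum()

d, F = cross_kernel(1024)
for L1 in (1, 3, 5, 7):
    print(L1, round(captured_energy(1024, L1), 3))
```

Even offsets contribute nothing, and the magnitude at odd offsets falls off roughly as the inverse of the offset, so a small half-width already carries most of the kernel energy.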
Approximation for windowing function spectrum
[0024] Typically, the spectrum of the windowing function can also be approximated by a few
dominant components. Examples of such a windowing function are:
[0025] All three windowing functions have similar spectrum magnitudes. For example, the
spectrum of $w_1[n]$ is illustrated in Fig. 3 for $L_{SB} = 1024$, and there are only a few
dominant central terms.
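[0026] Assuming that the windowing function spectrum can be well-approximated by $2L_w+1$
components, the reference spectrum in Eq.(2) is then approximated by:
$$A_{w,m}[k] \approx \frac{1}{L_{SB}} \sum_{q=-L_w}^{L_w} A_m[\langle k-q \rangle]\, W[\langle q \rangle] \qquad (7)$$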
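The concentration of a window spectrum in a few central components can be checked as follows. The sine window used here is an assumption made for illustration only and is not claimed to be one of the windowing functions of this document; the function name is hypothetical.

```python
import numpy as np

def window_energy_captured(w, Lw):
    """Fraction of the window's spectral energy in the taps q = -Lw..Lw."""
    W = np.fft.fft(w)
    q = np.arange(-Lw, Lw + 1)
    return np.sum(np.abs(W[q % len(w)]) ** 2) / np.sum(np.abs(W) ** 2)

L = 1024
w = np.sin(np.pi * (np.arange(L) + 0.5) / L)    # assumed example window
for Lw in (0, 1, 2, 3):
    print(Lw, round(window_energy_captured(w, Lw), 4))
```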
Approximation for even subblock spectrum
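[0027] For even subblocks, the reference spectra are approximated by:
$$A_{w,2i}[k] \approx \frac{1}{L_{SB}} \sum_{q=-L_w}^{L_w} A_{2i}[\langle k-q \rangle]\, W[\langle q \rangle], \quad A_{2i}[k] = e^{j\theta_i[k]} \qquad (8)$$
[0028] For odd subblocks, by taking the approximation of Eq.(6) into account, the reference
spectra can be approximated by:
$$A_{w,2i+1}[k] \approx \frac{(-1)^k}{2L_{SB}} \sum_{q=-L_w}^{L_w} (-1)^q\, W[\langle q \rangle] \left(A_{2i}[\langle k-q \rangle] + A_{2i+2}[\langle k-q \rangle]\right) + \frac{(-1)^k}{L_{SB}^2} \sum_{q=-L_w}^{L_w} \sum_{\substack{p=-L_1 \\ p \ne 0}}^{L_1} (-1)^q\, W[\langle q \rangle]\, F[p] \left(A_{2i}[\langle k-q+p \rangle] - A_{2i+2}[\langle k-q+p \rangle]\right) \qquad (9)$$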
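[0029] The second term in Eq.(9) can be interpreted as the spectrum difference that passes
through two filters, which can be regarded as representing in total a single filter:
$$A_{w,2i+1}[k] \approx \frac{(-1)^k}{2L_{SB}} \sum_{q=-L_w}^{L_w} (-1)^q\, W[\langle q \rangle] \left(A_{2i}[\langle k-q \rangle] + A_{2i+2}[\langle k-q \rangle]\right) + \frac{(-1)^k}{L_{SB}} \sum_{q=-L_g}^{L_g} G[\langle q \rangle] \left(A_{2i}[\langle k-q \rangle] - A_{2i+2}[\langle k-q \rangle]\right) \qquad (10)$$
where $L_g = L_w + L_1$.
[0030] This filter $G[\langle q \rangle]$ is defined for $-L_g \le q \le L_g$ as
$$G[\langle q \rangle] = \frac{1}{L_{SB}} \sum_{p} (-1)^p\, W[\langle p \rangle]\, F[p-q], \quad |p| \le L_w,\ 0 < |p-q| \le L_1 \qquad (11)$$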
Efficient reference phase generation
[0031] Advantageously, Eq.(8) and Eq.(10) provide an efficient evaluation of the reference
spectra, whereby only two frequency-domain filters $\{W[\langle q \rangle]\}$ and $\{G[\langle q \rangle]\}$ with a few
dominant taps are required. The corresponding filter coefficients can be pre-calculated,
and the approximation accuracy can be controlled by the filter length parameters $L_w$ and
$L_g$. That is, the trade-off between robustness and computational complexity can be adjusted
by choosing different $L_w$ and $L_g$ values.
[0032] This efficient generation of reference spectra is illustrated in Fig. 4, where WIN{...}
and WOL{...} denote operations defined by Eq.(8) and Eq.(10), respectively.
[0033] That is, instead of using IDFTs/DFTs for evaluating the reference spectra, simple frequency-domain
filtering is employed for direct evaluation of the reference spectra, resulting in
significant complexity reduction. Since reference phase generation is an inherent
step or stage to be carried out in both audio watermark embedder and detector, the
complexity of embedder/detector is also reduced correspondingly.
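The replacement of IDFT, windowing and DFT by short frequency-domain filters can be sketched as below. This is an illustration under stated assumptions, not the claimed implementation: the sine window, the kernel normalisation, and all function names are assumptions, and the odd-subblock path applies the cross-index filter and the window filter in cascade, which is equivalent to folding both into a single filter of half-width Lw + L1.

```python
import numpy as np

def taps(full, half_width):
    """Offsets q = -half_width..half_width and the values full[<q>]."""
    q = np.arange(-half_width, half_width + 1)
    return q, full[q % len(full)]

def fd_filter(X, q, h):
    """y[k] = sum_q h[q] * X[<k - q>], a short circular convolution."""
    return sum(hq * np.roll(X, int(s)) for s, hq in zip(q, h))

def cross_kernel(L):
    """Coupling between different bins of adjacent even subblocks (zero for even offsets)."""
    d = np.arange(L)
    K = np.zeros(L, dtype=complex)
    K[1:] = (1 - (-1.0) ** d[1:]) / (1 - np.exp(2j * np.pi * d[1:] / L)) / L
    return K

def win(A, W, Lw):
    """Even subblock: truncated window filter applied to the subblock spectrum."""
    q, Wq = taps(W, Lw)
    return fd_filter(A, q, Wq) / len(A)

def wol(A_prev, A_next, W, Lw, L1):
    """Odd subblock: truncated overlap filter followed by the window filter."""
    L = len(A_prev)
    sign = (-1.0) ** np.arange(L)
    p, Kp = taps(cross_kernel(L), L1)
    odd = sign * (0.5 * (A_prev + A_next) + fd_filter(A_next - A_prev, -p, Kp))
    return win(odd, W, Lw)

def relative_errors(L=256, Lw=2, L1=1, seed=0):
    """Relative spectrum error of win/wol against the exact IDFT-window-DFT path."""
    rng = np.random.default_rng(seed)
    w = np.sin(np.pi * (np.arange(L) + 0.5) / L)          # assumed window
    W = np.fft.fft(w)
    A0, A2 = np.exp(1j * rng.uniform(-np.pi, np.pi, (2, L)))
    a0, a2 = np.fft.ifft(A0), np.fft.ifft(A2)
    a1 = np.concatenate([a0[L // 2:], a2[:L // 2]])
    exact_even, exact_odd = np.fft.fft(a0 * w), np.fft.fft(a1 * w)
    rel = lambda x, y: np.linalg.norm(x - y) / np.linalg.norm(y)
    return rel(win(A0, W, Lw), exact_even), rel(wol(A0, A2, W, Lw, L1), exact_odd)

e_even, e_odd = relative_errors(L=256, Lw=2, L1=1)
```

The two returned values indicate how far the filtered spectra deviate from the exact ones for given half-widths; increasing Lw or L1 trades additional operations for a closer match.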
[0034] For the case where $L_{SB} = 1024$, $N_{SB} = 31$ and $w_3[n]$ is used as windowing function,
experiments have shown that $L_w = 2$ and $L_g = 3$ are already sufficient to achieve a robustness
against a wide range of signal processing attacks that is very similar to that of the
original processing.
[0035] With respect to computational complexity, the original processing employing FFTs/IFFTs
requires on average 62 real-value multiplications and 62 real-value additions for
the evaluation of each DFT coefficient of the reference spectra.
[0036] In contrast, the inventive processing (with $L_w = 2$ and $L_g = 3$) requires on average
only 9 real multiplications and 25 real additions for each DFT coefficient.
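As a worked comparison that uses only the per-coefficient figures quoted above:

$$\frac{62}{9} \approx 6.9 \ \text{(multiplications)}, \qquad \frac{62}{25} \approx 2.5 \ \text{(additions)}, \qquad \frac{62+62}{9+25} = \frac{124}{34} \approx 3.6 \ \text{(total real operations)}$$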
[0037] In the block diagram of the inventive watermark embedder in Fig. 5, a received audio
signal AS is sampled in an acquisition or receiving section step or stage 51, and
thereafter passes through a segmentation, windowing and DFT step or stage 52 to a
phase modulation step or stage 53, in which the phases are modulated within the frequency
domain using reference patterns or phases from a step or stage 56. Following the phase modulation,
a corresponding IDFT, windowing and overlap-add step or stage 54 is passed, which
outputs the watermarked audio signal WAS.
[0038] In a current block or subblock, the phase value modification can be controlled according
to results from a psycho-acoustic analysis of the audio signal AS.
[0039] A secret key SK is used to generate pseudo-random phases in a step or stage 55, which
controls a reference pattern or phase generation step or stage 56 that operates using
the above-described inventive processing. In contrast to the prior art, stage 56 does
not include IDFT, windowing, DFT and overlap+add, but instead approximates these functions
by a first frequency domain filtering by $W$ for even subblocks and a different second
frequency domain filtering by $W$ and $G$ for odd subblocks, thereby generating approximated
versions of reference phase patterns for the audio signal subblocks.
[0040] The inventive phase generation processing is depicted in Fig. 6. In step 60, nSymbols,
nSubBlocks and the secret key SK are input. In step 611 running variable 'j' is set
to '0' and in step 612 running variable 'i' is set to '0'. A first outer loop L1 runs from
j = 0 to j = (nSymbols-1). A second inner loop L2 runs from i = 0 until 2i+2 = (nSubBlocks-1).
In loop L2, in step 62 (which is included in step/stage 55 in Fig. 5) random phases
are generated for subblocks 2i and 2i+2. In step 63 the spectra are constructed from
the phases, whereby the magnitude equals '1'. In step 64 the WOL subblock 2i+1 and
the WIN subblocks 2i and 2i+2 are calculated. Steps 63 and 64 are included in step/stage
56 in Fig. 5. In step 65, 'i' is incremented by '1'. In the outer loop L1, following
checking step 66, the running variable 'j' is incremented by '1' in step 67. If 'j'
has reached value (nSymbols-1) in step 68, nSubBlocks reference spectra for each of the
nSymbols symbols are output in step 69.
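The loop structure described for Fig. 6 can be sketched as follows. The function and parameter names are hypothetical, the WIN and WOL operations are passed in as callables rather than implemented here, and the sketch assumes that each even subblock's phase vector is drawn once and reused for the two odd subblocks adjacent to it.

```python
import numpy as np

def generate_reference_spectra(n_symbols, n_subblocks, seed, l_sb, win, wol):
    """Nested loops of Fig. 6; `win` and `wol` are caller-supplied filter functions."""
    assert n_subblocks % 2 == 1, "expects subblocks 0 .. 2i+2, i.e. an odd count"
    rng = np.random.default_rng(seed)               # assumption: seed stands in for SK
    out = np.empty((n_symbols, n_subblocks, l_sb), dtype=complex)
    for j in range(n_symbols):                      # outer loop L1 over symbols
        # unit-magnitude spectrum for subblock 0
        A_prev = np.exp(1j * rng.uniform(-np.pi, np.pi, l_sb))
        out[j, 0] = win(A_prev)
        for i in range((n_subblocks - 1) // 2):     # inner loop L2 over subblock pairs
            A_next = np.exp(1j * rng.uniform(-np.pi, np.pi, l_sb))
            out[j, 2 * i + 1] = wol(A_prev, A_next) # odd subblock 2i+1
            out[j, 2 * i + 2] = win(A_next)         # even subblock 2i+2
            A_prev = A_next
    return out

# Usage with placeholder filters (identity and plain average), for shape checking only.
ref = generate_reference_spectra(2, 7, seed=1, l_sb=8,
                                 win=lambda A: A,
                                 wol=lambda p, n: 0.5 * (p + n))
```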
[0041] In an alternative embodiment, the order of even and odd subblocks is exchanged.
[0042] As a further alternative embodiment, a combination of frequency-domain convolution
and IDFT-windowing-DFT can be employed for the phase generation. Specifically, for
generating the reference spectra for an even subblock Eq.(8) is used. For an odd subblock
the first term of Eq.(10) is realised as a frequency-domain convolution, while the
second term is realised by IDFT-windowing-DFT. That is, a time-domain subblock corresponding
to $\{A_{2i}[k] - A_{2i+2}[k]\}$ is obtained via IDFT, and is thereafter weighted by a time-domain
filter corresponding to $\{G[k]\}$. The reference spectrum of the odd subblock is obtained by
a DFT of the weighted subblock.
[0043] Such alternative processing typically has a higher computational complexity than
the inventive processing described before but a lower complexity than the processing
described in the background section, because the IDFT-windowing-DFT is applied to
odd blocks only.
1. Method for generating (56; 63, 64) reference phase patterns to be used for controlling
watermarking by frequency domain phase modulation (53) of spectra of an audio signal
(AS), said method including the steps:
- based on at least one secret key (SK), generating (55; 62) blocks with random or
pseudo-random phases for said watermarking for frequency domain blocks of said audio
signal (AS),
characterised by:
- approximating (56; 63, 64) - for said blocks with random or pseudo-random phases
- an inverse transform and windowing and a corresponding transform and overlap+add
by a first frequency domain filtering (WIN) for even ones of said blocks and a different
second frequency domain filtering (WOL) for odd ones of said blocks, thereby generating
approximated versions of said reference phase patterns for said blocks;
- using said approximated versions for said phase modulation (53).
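2. Method according to claim 1, wherein said first frequency domain filtering (WIN) is
$$A_{w,2i}[k] = \frac{1}{L_{SB}} \sum_{q=-L_w}^{L_w} A_{2i}[\langle k-q \rangle]\, W[\langle q \rangle]$$
and said second frequency domain filtering (WOL) is
$$A_{w,2i+1}[k] = \frac{(-1)^k}{2L_{SB}} \sum_{q=-L_w}^{L_w} (-1)^q\, W[\langle q \rangle] \left(A_{2i}[\langle k-q \rangle] + A_{2i+2}[\langle k-q \rangle]\right) + \frac{(-1)^k}{L_{SB}} \sum_{q=-L_g}^{L_g} G[\langle q \rangle] \left(A_{2i}[\langle k-q \rangle] - A_{2i+2}[\langle k-q \rangle]\right)$$
wherein $i$ is the running number of a block and $k$ is the position within a block and
$L_{SB}$ is the length of said blocks, which blocks can be subblocks of blocks,
and wherein $L_w$ defines the length $2L_w+1$ of central components used from a frequency
domain filter $W$ representing the windowing function spectrum,
and wherein the term '<...>' denotes a modulo operation that is defined by
$\langle n \rangle = n - \lfloor n/L_{SB} \rfloor L_{SB}$, where $\lfloor x \rfloor$ denotes the greatest integer number not larger than $x$,
and wherein $L_g = L_w + L_1$ and $L_1$ defines the length $2L_1$ of central components used from
the function
$$F[k'-k] = \frac{(-1)^{k'-k} - 1}{1 - e^{j2\pi(k'-k)/L_{SB}}}$$
if $k' \ne k$ and $F[0] = 0$, $0 \le k, k' \le L_{SB}-1$,
and wherein
$$G[\langle q \rangle] = \frac{1}{L_{SB}} \sum_{p} (-1)^p\, W[\langle p \rangle]\, F[p-q], \quad |p| \le L_w,\ 0 < |p-q| \le L_1 .$$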
3. Method according to claim 1 or 2, wherein within a current audio signal block in the
frequency domain the phase of the audio signal is modified adaptively using psycho-acoustic
calculations.
4. Method according to claim 2 or 3, wherein in case there are 31 subblocks per audio
signal block and $L_{SB} = 1024$, and $w_3[n]$ is used as windowing function, $L_w = 2$ and $L_g = 3$.
5. Method according to claim 1, wherein the second term of said second frequency domain
filtering is replaced by the known IDFT-windowing-DFT processing.
6. Apparatus for generating (56; 63, 64) reference phase patterns to be used for controlling
watermarking by frequency domain phase modulation (53) of spectra of an audio signal
(AS), said apparatus including:
- means (55) being adapted for generating - based on at least one secret key (SK)
- blocks with random or pseudo-random phases for said watermarking for frequency domain
blocks of said audio signal (AS);
- means (56) being adapted for approximating - for said blocks with random or pseudo-random
phases - an inverse transform and windowing and a corresponding transform and overlap+add
by a first frequency domain filtering (WIN) for even ones of said blocks and a different
second frequency domain filtering (WOL) for odd ones of said blocks, thereby generating
approximated versions of said reference phase patterns for said blocks, wherein said
approximated versions are to be used for said phase modulation (53).
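7. Apparatus according to claim 6, wherein said first frequency domain filtering (WIN) is
$$A_{w,2i}[k] = \frac{1}{L_{SB}} \sum_{q=-L_w}^{L_w} A_{2i}[\langle k-q \rangle]\, W[\langle q \rangle]$$
and said second frequency domain filtering (WOL) is
$$A_{w,2i+1}[k] = \frac{(-1)^k}{2L_{SB}} \sum_{q=-L_w}^{L_w} (-1)^q\, W[\langle q \rangle] \left(A_{2i}[\langle k-q \rangle] + A_{2i+2}[\langle k-q \rangle]\right) + \frac{(-1)^k}{L_{SB}} \sum_{q=-L_g}^{L_g} G[\langle q \rangle] \left(A_{2i}[\langle k-q \rangle] - A_{2i+2}[\langle k-q \rangle]\right)$$
wherein $i$ is the running number of a block and $k$ is the position within a block and
$L_{SB}$ is the length of said blocks, which blocks can be subblocks of blocks,
and wherein $L_w$ defines the length $2L_w+1$ of central components used from a frequency
domain filter $W$ representing the windowing function spectrum,
and wherein the term '<...>' denotes a modulo operation that is defined by
$\langle n \rangle = n - \lfloor n/L_{SB} \rfloor L_{SB}$, where $\lfloor x \rfloor$ denotes the greatest integer number not larger than $x$,
and wherein $L_g = L_w + L_1$ and $L_1$ defines the length $2L_1$ of central components used from
the function
$$F[k'-k] = \frac{(-1)^{k'-k} - 1}{1 - e^{j2\pi(k'-k)/L_{SB}}}$$
if $k' \ne k$ and $F[0] = 0$, $0 \le k, k' \le L_{SB}-1$,
and wherein
$$G[\langle q \rangle] = \frac{1}{L_{SB}} \sum_{p} (-1)^p\, W[\langle p \rangle]\, F[p-q], \quad |p| \le L_w,\ 0 < |p-q| \le L_1 .$$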
8. Apparatus according to claim 6 or 7, wherein within a current audio signal block in
the frequency domain the phase of the audio signal is modified adaptively using psycho-acoustic
calculations.
9. Apparatus according to claim 7 or 8, wherein in case there are 31 subblocks per audio
signal block and $L_{SB} = 1024$, and $w_3[n]$ with $w_3[n] = w_3[L_{SB}-1-n]$ for $L_{SB}/2 \le n < L_{SB}$
is used as windowing function, $L_w = 2$ and $L_g = 3$.
10. Digital audio signal that was watermarked according to the method of one of claims
1 to 5.
11. Storage medium that contains or stores, or has recorded on it, a digital audio signal
according to claim 10.