(19)
(11) EP 2 439 735 A1

(12) EUROPEAN PATENT APPLICATION

(43) Date of publication:
11.04.2012 Bulletin 2012/15

(21) Application number: 10306091.9

(22) Date of filing: 06.10.2010
(51) International Patent Classification (IPC): 
G10L 19/00(2006.01)
(84) Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
Designated Extension States:
BA ME

(71) Applicant: Thomson Licensing
92130 Issy-les-Moulineaux (FR)

(72) Inventors:
  • Chen, Xiao-Ming
    30165, Hannover (DE)
  • Arnold, Michael
    30916, Isernhagen (DE)
  • Baum, Peter Georg
    30539, Hannover (DE)
  • Gries, Ulrich
    30419, Hannover (DE)

(74) Representative: Hartnack, Wolfgang 
Deutsche Thomson OHG European Patent Operations Karl-Wiechert-Allee 74
30625 Hannover (DE)

   


(54) Method and Apparatus for generating reference phase patterns


(57) For real-time audio watermarking, wherein the watermark embedding occurs in the frequency domain by modifying phases of spectra of the audio signal according to phase reference patterns, low-complexity watermark embedding is desired. Depending on a secret key SK, random reference phases for each watermark message symbol are generated, wherein the corresponding reference phase patterns are generated by approximating the inverse transform of the phase reference patterns and the corresponding transform and the windowing and windowing-overlapping effects, using two different short-length frequency-domain filters for even and odd blocks or subblocks. Thereby the computational complexity for generating the reference phase pattern blocks or subblocks is significantly reduced.




Description


[0001] The invention relates to a method and to an apparatus for generating reference phase patterns to be used for controlling watermarking by frequency domain phase modulation of spectra of an audio signal.

Background



[0002] For real-time applications of audio watermarking in the context of home networking, and for efficient implementation, e.g. in set-top boxes, low-complexity watermark embedding and/or detection is desired, provided that robustness and perceptual transparency are not sacrificed significantly.

[0003] In WO 2007/031423 A1 and in M.Arnold, P.G.Baum, W.Voessing, "A phase modulation audio watermarking technique", 11th Information Hiding Workshop, pages 102-116, 2009, watermark embedding occurs in the phase domain by modifying phases of the host signal spectrum according to phase reference patterns. Specifically, the host audio signal is split into blocks of length LB and each block is used for embedding a watermark symbol. The watermark message is represented by a sequence of symbols. For embedding, each block can be further partitioned into NSB overlapping subblocks of LSB samples length with an overlap length of LSB/2. Following weighting of each subblock by a windowing function, phases of the weighted subblock spectrum are modified according to generated reference phases and according to results from a psycho-acoustic analysis of the host audio signal. Depending on a secret key SK (e.g. as seed of a random generator), reference phases for each watermark message symbol are generated by the following steps:

B1) Independently generate random phase vectors θi of length LSB, whose elements are uniformly distributed over [-π,π), where 0 ≤ i < LB/LSB.

B2) Generate time-domain subblocks via inverse discrete Fourier transform (IDFT) of a normalised spectrum:

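$$a'_i = \mathrm{IDFT}\{e^{j\theta_i}\},$$

where $e^{j\theta_i}$ is the spectrum with unit magnitude and phases θi. Concatenate the time-domain subblocks a'i to get an LB-length signal a.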

B3) Divide block a into NSB overlapped subblocks of length LSB with an overlap length of LSB/2, multiply each subblock with a windowing function, and perform a DFT for weighted subblocks to get reference spectra.

B4) Extract angles of reference spectra that serve as reference angles for the data embedding.
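
Steps B1) to B4) can be sketched in a few lines of Python. This is only an illustration, not the implementation of the cited documents: seeding `random.Random` with the secret key, the sine window and the function name `reference_phases` are assumptions, and a naive O(N²) DFT stands in for an FFT so that the example needs no external library.

```python
import cmath
import math
import random


def dft(x):
    """Naive forward DFT without normalisation."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT carrying the 1/N factor."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def reference_phases(secret_key, l_b, l_sb):
    """Return one list of L_SB reference angles per overlapped subblock."""
    rng = random.Random(secret_key)  # assumption: SK is used as PRNG seed
    # Assumption: sine window; the text only requires perfect reconstruction.
    window = [math.sin(math.pi * (t + 0.5) / l_sb) for t in range(l_sb)]

    block = []
    for _ in range(l_b // l_sb):
        theta = [rng.uniform(-math.pi, math.pi) for _ in range(l_sb)]  # B1
        block.extend(idft([cmath.exp(1j * th) for th in theta]))       # B2

    hop = l_sb // 2
    phases = []
    for start in range(0, l_b - l_sb + 1, hop):                        # B3
        sub = [block[start + t] * window[t] for t in range(l_sb)]
        phases.append([cmath.phase(c) for c in dft(sub)])              # B4
    return phases
```

With LB = 64 and LSB = 16 the sketch yields 2·LB/LSB - 1 = 7 overlapped subblocks; the same relation gives 31 subblocks for LB = 16·1024 and LSB = 1024. Each subblock costs one DFT and each non-overlapping subblock one IDFT, which is the cost the invention aims to avoid.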



[0004] Generating phase vectors at the beginning enables the flexibility of controlling the phase vectors in case structured phase vectors rather than randomly generated ones will be employed for the embedding. Step B3) mimics the windowing-overlap process that is commonly used in the analysis stage of digital audio signal processing. A windowing function is selected such that an ideal reconstruction is guaranteed following ideal analysis (windowing-overlap) and synthesis (windowing-overlap-add) processes. Finally, generated reference angles in step B4) are employed to modify host audio spectrum phases after the host signal has undergone the windowing-overlap processing.
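
The reconstruction property of the analysis/synthesis windowing can be checked with a short Python sketch. The sine window used here is an assumption (the text does not name a specific window at this point); it satisfies w[n]² + w[n+LSB/2]² = 1, so windowing twice and overlap-adding with 50% overlap returns the input wherever two subblocks overlap. The function names are illustrative only.

```python
import math


def sine_window(l_sb):
    """Assumed window: w[n] = sin(pi * (n + 0.5) / L_SB)."""
    return [math.sin(math.pi * (t + 0.5) / l_sb) for t in range(l_sb)]


def analysis_synthesis(signal, l_sb):
    """Window-overlap analysis followed by window-overlap-add synthesis."""
    w = sine_window(l_sb)
    hop = l_sb // 2
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - l_sb + 1, hop):
        # Analysis: weight the subblock with the window.
        frame = [signal[start + t] * w[t] for t in range(l_sb)]
        # A DFT, the phase modification and an IDFT would be applied here.
        # Synthesis: weight again and overlap-add.
        for t in range(l_sb):
            out[start + t] += frame[t] * w[t]
    return out
```

Only the first and the last half subblock, which are covered by a single window, are not reconstructed.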

Invention



[0005] One possible problem of such phase generation processing is that many IDFTs/DFTs are required, which constrains the achievable speed of hardware/software implementations and therefore limits the deployment of related audio watermarking schemes in real-time applications. For example, in a home network where AV material is sent to the customer, the audio content is watermarked in a set-top box with a label that identifies the customer. In a general setting, a copyright owner will license multiple different works to the customer. In order to avoid the serious security drawback of using only one secret key for all AV material, different keys are used for individual copyrighted works. Therefore efficient reference phase generation is a necessary prerequisite for deploying an audio watermarking scheme in such an application scenario.

[0006] A problem to be solved by the invention is to provide more efficient generation of the reference phase patterns. This problem is solved by the method disclosed in claim 1. An apparatus that utilises this method is disclosed in claim 6.

[0007] The efficient generation of random reference phase patterns is carried out by approximating the inverse transform of the phase reference patterns, and the corresponding forward transform and the windowing and windowing-overlapping effects using two different short-length frequency-domain filters for even and odd blocks or subblocks. Thereby the computational complexity for generating the reference phase pattern blocks or subblocks for frequency domain phase modulation of the audio signal blocks or subblocks is significantly reduced, which in turn reduces the complexity of both watermark embedder at sender side and watermark detector at receiver side.

[0008] The inventive reference phase generation fulfils critical real-time requirements and enables a wide deployment of corresponding audio watermarking processings. Advantageously, the flexibility of controlling the generated phases is maintained, e.g. in case structured phases are used instead of random ones.

[0009] In principle, the inventive method is suited for generating reference phase patterns to be used for controlling watermarking by frequency domain phase modulation of spectra of an audio signal, said method including the steps:
  • based on at least one secret key, generating blocks with random or pseudo-random phases for said watermarking for frequency domain blocks of said audio signal,
  • approximating - for said blocks with random or pseudo-random phases - an inverse transform and windowing and a corresponding transform and overlap+add by a first frequency domain filtering for even ones of said blocks and a different second frequency domain filtering for odd ones of said blocks, thereby generating approximated versions of said reference phase patterns for said blocks;
  • using said approximated versions for said phase modulation.


[0010] In principle the inventive apparatus is suited for generating reference phase patterns to be used for controlling watermarking by frequency domain phase modulation of spectra of an audio signal, said apparatus including:
  • means being adapted for generating - based on at least one secret key - blocks with random or pseudo-random phases for said watermarking for frequency domain blocks of said audio signal;
  • means being adapted for approximating - for said blocks with random or pseudo-random phases - an inverse transform and windowing and a corresponding transform and overlap+add by a first frequency domain filtering for even ones of said blocks and a different second frequency domain filtering for odd ones of said blocks, thereby generating approximated versions of said reference phase patterns for said blocks, wherein said approximated versions are to be used for said phase modulation.


[0011] Advantageous additional embodiments of the invention are disclosed in the respective dependent claims.

Drawings



[0012] Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in:
Fig. 1
known reference spectra generation;
Fig. 2
magnitude distribution of F[k'-k];
Fig. 3
DFT spectrum of windowing function w1[n] for LSB = 1024;
Fig. 4
principle of inventive reference spectra generation;
Fig. 5
block diagram for the inventive watermark embedder;
Fig. 6
inventive phase generation procedure.

Exemplary embodiments



[0013] According to the invention, for the watermark embedding and for watermark detection reference spectra are directly evaluated, which approximates above steps B2) and B3). To elaborate this, the original reference phase generation is investigated in detail. In the sequel, a subblock with an even/odd subscript is referred to as an even/odd subblock, respectively.

[0014] Regarding above step B3), an audio signal block a is partitioned into NSB overlapped (50% overlapping) subblocks of length LSB denoted as {am, 0 ≤ m < NSB}. Accordingly, the even subblocks are non-overlapping and each odd subblock can be obtained from its two adjacent even subblocks. Specifically, the (2i+1)-th (i.e. odd) subblock is constructed from the upper half of subblock a2i and the lower half of subblock a2i+2 as follows:

$$a_{2i+1}[n] = \begin{cases} a_{2i}[n + L_{SB}/2], & 0 \le n < L_{SB}/2 \\ a_{2i+2}[n - L_{SB}/2], & L_{SB}/2 \le n < L_{SB} \end{cases} \qquad (1)$$


[0015] The even subblocks are obtained according to step B2) as $a_{2i} = \mathrm{IDFT}\{e^{j\theta_i}\}$. Next, the even and odd subblocks are weighted by a windowing function, denoted as {w[n], 0 ≤ n < LSB}, and a windowed subblock is obtained as aw,m[n] = am[n]w[n], 0 ≤ n < LSB, 0 ≤ m < NSB. Finally, a DFT of each windowed subblock aw,m yields the reference spectra, and the phases of these spectra serve as reference phases for the watermark embedding.

[0016] This known reference spectra generation processing is depicted in Fig. 1.

Reference spectra evaluation



[0017] How to directly evaluate reference spectra, i.e. DFTs of {aw,m, 0 ≤m< NSB}? For DFT, multiplication in time-domain corresponds to circular convolution in frequency-domain. Therefore reference spectra can be obtained via circular convolution of the spectra for subblocks before windowing and of the spectrum of the windowing function.

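[0018] Let {Am[k], 0 ≤ k < LSB} and {W[k], 0 ≤ k < LSB} denote the DFTs of the m-th subblock am and of the windowing function, respectively. With the forward DFT taken without normalisation, the reference spectrum for the m-th subblock is obtained by

$$A_{w,m}[k] = \frac{1}{L_{SB}} \sum_{q=0}^{L_{SB}-1} A_m[q]\, W[\langle k-q \rangle], \quad 0 \le k < L_{SB}, \qquad (2)$$

where the term '<...>' denotes a modulo operation that is defined by

$$\langle n \rangle = n - \lfloor n / L_{SB} \rfloor \, L_{SB},$$

which maps any integer number to the interval [0,LSB), and where $\lfloor x \rfloor$ denotes the greatest integer number not larger than x.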
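
The correspondence between time-domain windowing and circular convolution of the spectra can be verified numerically. The sketch below assumes the usual DFT convention (forward transform without normalisation), for which the circular convolution carries a factor 1/LSB; the helper names `modulo` and `windowed_spectrum` are illustrative and not taken from the text.

```python
import cmath
import math


def dft(x):
    """Naive forward DFT without normalisation."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def modulo(n, l_sb):
    """<n> = n - floor(n / L_SB) * L_SB, mapping any integer to [0, L_SB)."""
    return n - (n // l_sb) * l_sb


def windowed_spectrum(a_spec, w_spec):
    """Circular convolution of a subblock spectrum with the window spectrum."""
    l_sb = len(a_spec)
    return [sum(a_spec[q] * w_spec[modulo(k - q, l_sb)] for q in range(l_sb)) / l_sb
            for k in range(l_sb)]
```

Applied to an arbitrary subblock and window, `windowed_spectrum(dft(a), dft(w))` agrees with the DFT of the element-wise product of a and w up to rounding error.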

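[0019] Therefore the reference spectrum for an even subblock is

$$A_{w,2i}[k] = \frac{1}{L_{SB}} \sum_{q=0}^{L_{SB}-1} e^{j\theta_i[q]}\, W[\langle k-q \rangle]. \qquad (3)$$
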

[0020] For an odd subblock, the subblock spectrum is calculated according to Eq.(1) by


where

if k'≠k and F[0]=0.

[0021] The first term in Eq.(4) is the contribution from two spectral lines of the same index within two adjacent subblocks and the second term is the contribution from other spectral lines. The reference spectrum for an odd subblock is obtained by inserting Eq.(4) into Eq.(2). Therefore, for the evaluation of reference spectra, two frequency-domain filters {F[k]} and {W[k]} are involved. In case they can be well-approximated by a few filter taps, an efficient evaluation of reference spectra is feasible.
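
The split of the odd subblock spectrum into a same-index term and a term from other spectral lines can be checked numerically. The closed form used below is one self-consistent reconstruction derived from the half-subblock construction of Eq.(1) under the usual DFT convention; it is an assumption and may differ in sign and normalisation conventions from the original Eq.(4):

$$A_{2i+1}[k] = (-1)^k \Big\{ \tfrac{1}{2}\big(A_{2i}[k] + A_{2i+2}[k]\big) + \sum_{k' \ne k} F[k'-k]\,\big(A_{2i}[k'] - A_{2i+2}[k']\big) \Big\}, \quad F[d] = \frac{(-1)^d - 1}{L_{SB}\,\big(1 - e^{j 2\pi d / L_{SB}}\big)}.$$

```python
import cmath
import math


def f_coeff(d, n):
    """Assumed closed form of F[d]; it vanishes for even d, including d = 0."""
    if d % 2 == 0:
        return 0j
    return -2.0 / (n * (1 - cmath.exp(2j * math.pi * d / n)))


def odd_spectrum(spec_a, spec_b):
    """Spectrum of subblock 2i+1 from the spectra of subblocks 2i and 2i+2."""
    n = len(spec_a)
    out = []
    for k in range(n):
        # Contribution of the spectral lines with the same index k.
        same = 0.5 * (spec_a[k] + spec_b[k])
        # Contribution of all other spectral lines, weighted by F[k' - k].
        cross = sum(f_coeff(kp - k, n) * (spec_a[kp] - spec_b[kp])
                    for kp in range(n) if kp != k)
        out.append((-1) ** k * (same + cross))
    return out
```

Under this form, |F[d]| = 1/(LSB·|sin(πd/LSB)|) for odd d, i.e. about 1/(π|d|) near the centre, so the terms with d = ±1 carry the largest weight.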

Efficient reference phase generation by filter approximation

Approximation for odd subblock spectrum



[0022] The coefficient F[k'-k] in Eq.(4) has the following properties:



[0023] For a typical subblock length, there are only a few dominant central terms for F[k'-k] with respect to the magnitude, as shown in Fig. 2 for LSB = 1024. Assuming that there are 2L1 central dominant terms, i.e. for 0 <|k'-k|≤ L1, the Eq.(4) spectrum of an odd subblock can be approximated by setting q = k'-k:


Approximation for windowing function spectrum



[0024] Typically, the spectrum of the windowing function can also be approximated by a few dominant components. Examples of such a windowing function are:





[0025] All three windowing functions have similar spectrum magnitudes. For example, the spectrum of w1[n] is illustrated in Fig. 3 for LSB = 1024, and there are only few dominant central terms.

[0026] Assuming that the windowing function spectrum can be well-approximated by 2Lw+1 components, the reference spectrum in Eq.(2) is then approximated by:


Approximation for even subblock spectrum



[0027] For even subblocks, the reference spectra are approximated by:



[0028] For odd subblocks, by taking the approximation of Eq.(6) into account, the reference spectra can be approximated by:



[0029] The second term in Eq.(9) can be interpreted as the spectrum difference that passes through two filters, which can be regarded as representing in total a single filter:


where Lg = Lw+L1.

[0030] This filter G[<q>] is defined for -Lg ≤ q ≤ Lg as


Efficient reference phase generation



[0031] Advantageously, Eq.(8) and Eq.(10) provide an efficient evaluation of the reference spectra, whereby only two frequency-domain filters {W[<q>]} and {G[<q>]} with a few dominant taps are required. The corresponding filter coefficients can be pre-calculated, and the approximation accuracy can be controlled by filter length parameters Lw and Lg. That is, the trade-off between robustness and computational complexity can be adjusted by choosing different Lw and Lg values.

[0032] This efficient generation of reference spectra is illustrated in Fig. 4, where WIN{...} and WOL{...} denote operations defined by Eq.(8) and Eq.(10), respectively.
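
A possible realisation of the two filter operations is sketched below in Python. The formulas are assumptions reconstructed from the description, not the literal Eq.(8), Eq.(10) and filter definition: WIN keeps the 2Lw+1 central taps of W, and WOL uses the same taps with alternating sign plus a combined filter G[r] = Σ (-1)^p W[<p>] F[q] over all p - q = r with |p| ≤ Lw and 0 < |q| ≤ L1, where F is the assumed closed form given in `f_coeff`. All function names are illustrative.

```python
import cmath
import math


def f_coeff(d, n):
    """Assumed closed form of F[d]; zero for even d."""
    if d % 2 == 0:
        return 0j
    return -2.0 / (n * (1 - cmath.exp(2j * math.pi * d / n)))


def sign(p):
    """(-1)^p for any integer p."""
    return -1 if p % 2 else 1


def win(spec, w, l_w):
    """WIN: even-subblock reference spectrum from 2*l_w+1 taps of W."""
    n = len(spec)
    return [sum(w[p % n] * spec[(k - p) % n] for p in range(-l_w, l_w + 1)) / n
            for k in range(n)]


def g_taps(w, l_w, l_1):
    """Combined filter G, non-zero only for |r| <= l_w + l_1."""
    n = len(w)
    taps = {}
    for p in range(-l_w, l_w + 1):
        for q in range(-l_1, l_1 + 1):
            if q % 2:  # F vanishes for even offsets
                taps[p - q] = taps.get(p - q, 0j) + sign(p) * w[p % n] * f_coeff(q, n)
    return taps


def wol(spec_a, spec_b, w, l_w, l_1):
    """WOL: odd-subblock reference spectrum from the two adjacent even spectra."""
    n = len(spec_a)
    taps = g_taps(w, l_w, l_1)
    out = []
    for k in range(n):
        first = 0.5 * sum(sign(p) * w[p % n]
                          * (spec_a[(k - p) % n] + spec_b[(k - p) % n])
                          for p in range(-l_w, l_w + 1))
        second = sum(g * (spec_a[(k - r) % n] - spec_b[(k - r) % n])
                     for r, g in taps.items())
        out.append(sign(k) * (first + second) / n)
    return out
```

For Lw = 2 and L1 = 1, `g_taps` returns seven taps, matching Lg = Lw + L1 = 3. Since only the phases of the result are used, the common factor 1/LSB could be dropped in a practical implementation.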

[0033] I.e., instead of using IDFTs/DFTs for evaluating the reference spectra, simple frequency-domain filtering is employed for direct evaluation of the reference spectra, resulting in a significant complexity reduction. Since reference phase generation is an inherent step or stage to be carried out in both audio watermark embedder and detector, the complexity of embedder/detector is also reduced correspondingly.

[0034] For the case where LSB = 1024, NSB = 31 and w3[n] are used as windowing function, experiments have shown that Lw = 2 and Lg = 3 are already sufficient to get a robustness against a wide range of signal processing attacks that is very similar to that of the original processing.

[0035] With respect to computational complexity, the original processing employing FFTs/IFFTs requires on average 62 real-value multiplications and 62 real-value additions for the evaluation of each DFT coefficient of the reference spectra.

[0036] In contrast, the inventive processing (with Lw = 2 and Lg = 3) requires on average only 9 real multiplications and 25 real additions for each DFT coefficient.

[0037] In the block diagram of the inventive watermark embedder in Fig. 5, a received audio signal AS is sampled in an acquisition or receiving section step or stage 51, and thereafter passes through a segmentation, windowing and DFT step or stage 52 to a phase modulation step or stage 53, in which the phases are modulated within the frequency domain using reference patterns or phases from a block 56. Following the phase modulation, a corresponding IDFT, windowing and overlap-add step or stage 54 is passed, which outputs the watermarked audio signal WAS.

[0038] In a current block or subblock, the phase value modification can be controlled according to results from a psycho-acoustic analysis of the audio signal AS.

[0039] A secret key SK is used to generate pseudo-random phases in a step or stage 55, which controls a reference pattern or phase generation step or stage 56 that operates using the above-described inventive processing. In contrast to the prior art, stage 56 does not include IDFT, windowing, DFT and overlap+add, but instead approximates these functions by a first frequency domain filtering by W for even subblocks and a different second frequency domain filtering by W and G for odd subblocks, thereby generating approximated versions of reference phase patterns for the audio signal subblocks.

[0040] The inventive phase generation processing is depicted in Fig. 6. In step 60, nSymbols, nSubBlocks and the secret key SK are input. In step 611 running variable 'j' is set to '0' and in step 612 running variable 'i' is set to '0'. A first outer loop L1 runs from j = 0 to j = (nSymbols-1). A second inner loop L2 runs from i = 0 to 2i+2 = (nSubBlocks-1). In loop L2, in step 62 (which is included in step/stage 55 in Fig. 5) a random phase is generated for subblocks 2i and 2i+2. In step 63 the spectra are constructed from the phases, whereby the magnitude equals '1'. In step 64 the WOL subblock 2i+1 and the WIN subblocks 2i and 2i+2 are calculated. Steps 63 and 64 are included in step/stage 56 in Fig. 5. In step 65, 'i' is incremented by '1'. In the outer loop L1, following checking step 66, the running variable 'j' is incremented by '1' in step 67. If 'j' has reached value (nSymbols-1) in step 68, nSubBlocks reference spectra for nSymbols are output in step 69.
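
The control flow of Fig. 6 can be sketched as follows. This is a hedged illustration: the function name, the use of `random.Random` seeded with SK, and passing the WIN and WOL operations as the callables `win_op` and `wol_op` are assumptions. The sketch also assumes that nSubBlocks is odd and that the spectrum generated for subblock 2i+2 is reused as subblock 2i of the next iteration.

```python
import cmath
import math
import random


def generate_reference_spectra(n_symbols, n_sub_blocks, secret_key, l_sb,
                               win_op, wol_op):
    """Return n_symbols lists, each holding n_sub_blocks reference spectra."""
    rng = random.Random(secret_key)  # assumption: SK seeds the generator

    def unit_spectrum():
        # Steps 62 and 63: random phases, magnitude equal to 1.
        return [cmath.exp(1j * rng.uniform(-math.pi, math.pi))
                for _ in range(l_sb)]

    result = []
    for _ in range(n_symbols):                     # outer loop L1 over j
        spectra = [None] * n_sub_blocks
        current = unit_spectrum()                  # even subblock 0
        spectra[0] = win_op(current)
        i = 0
        while 2 * i + 2 <= n_sub_blocks - 1:       # inner loop L2 over i
            following = unit_spectrum()            # even subblock 2i+2
            spectra[2 * i + 1] = wol_op(current, following)  # step 64, WOL
            spectra[2 * i + 2] = win_op(following)           # step 64, WIN
            current = following
            i += 1                                 # step 65
        result.append(spectra)
    return result
```

Because the generator is seeded once from SK, embedder and detector obtain identical reference spectra when they use the same key and parameters.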

[0041] In an alternative embodiment, the order of even and odd subblocks is exchanged.

[0042] As a further alternative embodiment, a combination of frequency-domain convolution and IDFT-windowing-DFT can be employed for the phase generation. Specifically, for generating the reference spectra for an even subblock Eq.(8) is used. For an odd subblock the first term of Eq.(10) is realised as a frequency-domain convolution, while the second term is realised by IDFT-windowing-DFT. That is, a time-domain subblock corresponding to {A2i[k] - A2i+2[k]} is obtained via IDFT, and is thereafter weighted by a time-domain filter corresponding to {G[k]}. The reference spectrum of the odd subblock is obtained by a DFT of the weighted subblock.

[0043] Such alternative processing typically has a higher computational complexity than the inventive processing described before but a lower complexity than the processing described in the background section, because the IDFT-windowing-DFT is applied to odd blocks only.


Claims

1. Method for generating (56; 63, 64) reference phase patterns to be used for controlling watermarking by frequency domain phase modulation (53) of spectra of an audio signal (AS), said method including the steps:

- based on at least one secret key (SK), generating (55; 62) blocks with random or pseudo-random phases for said watermarking for frequency domain blocks of said audio signal (AS),

characterised by:

- approximating (56; 63, 64) - for said blocks with random or pseudo-random phases - an inverse transform and windowing and a corresponding transform and overlap+add by a first frequency domain filtering (WIN) for even ones of said blocks and a different second frequency domain filtering (WOL) for odd ones of said blocks, thereby generating approximated versions of said reference phase patterns for said blocks;

- using said approximated versions for said phase modulation (53).


 
2. Method according to claim 1, wherein said first frequency domain filtering (WIN) is


and said second frequency domain filtering (WOL) is


wherein i is the running number of a block and k is the position within a block and LSB is the length of said blocks, which blocks can be subblocks of blocks,
and wherein Lw defines the length 2Lw+1 of central components used from a frequency domain filter W representing the windowing function spectrum,
and wherein the term '<...>' denotes a modulo operation that is defined by

$$\langle n \rangle = n - \lfloor n / L_{SB} \rfloor \, L_{SB},$$

where $\lfloor x \rfloor$ denotes the greatest integer number not larger than x,
and wherein Lg = Lw+L1 and L1 defines the length 2L1 of central components used from the function

if k'≠k and F[0]=0, 0 ≤k,k'< LSB-1,
and wherein


 
3. Method according to claim 1 or 2, wherein within a current audio signal block in the frequency domain the phase of the audio signal is modified adaptively using psycho-acoustic calculations.
 
4. Method according to claim 2 or 3, wherein in case there are 31 subblocks per audio signal block and LSB = 1024, and


is used as windowing function, Lw = 2 and Lg = 3.
 
5. Method according to claim 1, wherein the second term of said second frequency domain filtering is replaced by the known IDFT-windowing-DFT processing.
 
6. Apparatus for generating (56; 63, 64) reference phase patterns to be used for controlling watermarking by frequency domain phase modulation (53) of spectra of an audio signal (AS), said apparatus including:

- means (55) being adapted for generating - based on at least one secret key (SK) - blocks with random or pseudo-random phases for said watermarking for frequency domain blocks of said audio signal (AS);

- means (56) being adapted for approximating - for said blocks with random or pseudo-random phases - an inverse transform and windowing and a corresponding transform and overlap+add by a first frequency domain filtering (WIN) for even ones of said blocks and a different second frequency domain filtering (WOL) for odd ones of said blocks, thereby generating approximated versions of said reference phase patterns for said blocks, wherein said approximated versions are to be used for said phase modulation (53).


 
7. Apparatus according to claim 6, wherein said first frequency domain filtering (WIN) is


and said second frequency domain filtering (WOL) is




wherein i is the running number of a block and k is the position within a block and LSB is the length of said blocks, which blocks can be subblocks of blocks,
and wherein Lw defines the length 2Lw+1 of central components used from a frequency domain filter W representing the windowing function spectrum,
and wherein the term '<...>' denotes a modulo operation that is defined by

$$\langle n \rangle = n - \lfloor n / L_{SB} \rfloor \, L_{SB},$$

where $\lfloor x \rfloor$ denotes the greatest integer number not larger than x,
and wherein Lg = Lw+L1 and L1 defines the length 2L1 of central components used from the function


if k'≠k and F[0]=0, 0 ≤k,k'< LSB-1,
and wherein


 
8. Apparatus according to claim 6 or 7, wherein within a current audio signal block in the frequency domain the phase of the audio signal is modified adaptively using psycho-acoustic calculations.
 
9. Apparatus according to claims 7 or 8, wherein in case there are 31 subblocks per audio signal block and LSB = 1024, and

w3[n]=w3[LSB-1-n] for LSB/2 ≤n< LSB is used as windowing function, Lw = 2 and Lg = 3.
 
10. Digital audio signal that was watermarked according to the method of one of claims 1 to 5.
 
11. Storage medium that contains or stores, or has recorded on it, a digital audio signal according to claim 10.
 




Cited references

REFERENCES CITED IN THE DESCRIPTION



This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

Patent documents cited in the description