Method and apparatus for determining whether a specific watermark symbol out of one or more candidate watermark symbols is embedded in a current section of a received audio signal

(19)

(11)	EP 3 001 415 A1

(12)	EUROPEAN PATENT APPLICATION

(43)	Date of publication:
	30.03.2016 Bulletin 2016/13

(21)	Application number: 14306464.0

(22)	Date of filing: 23.09.2014

(51)	International Patent Classification (IPC):
	G10L 19/018 (2013.01)

(84)	Designated Contracting States:
	AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
	Designated Extension States:
	BA ME

(71)	Applicant: Thomson Licensing
	92130 Issy-les-Moulineaux (FR)

(72)	Inventors:
	Arnold, Michael, 30916 Isernhagen (DE)
	Baum, Peter Georg, 30539 Hannover (DE)
	Chen, Xiaoming, 30165 Hannover (DE)
	Gries, Ulrich, 30419 Hannover (DE)

(74)	Representative: Hartnack, Wolfgang
	Deutsche Thomson OHG European Patent Operations Karl-Wiechert-Allee 74 30625 Hannover (DE)

(54)	Method and apparatus for determining whether a specific watermark symbol out of one or more candidate watermark symbols is embedded in a current section of a received audio signal

(57) From sets of correlation result values it is determined whether a specific watermark symbol out of one or more candidate watermark symbols is embedded in a received audio signal. For each candidate watermark symbol, a group (n_p) of maximal values is taken from the corresponding set of correlation result values, and these values together form a peak vector (v_i). From the normalised peak values (w_i) a value of a probability distribution function (pdf, g(w_i)) and a value of a false positive probability function (P_fp(w_i)) are calculated. If the value of the false positive probability function is smaller than a first threshold value (T_min), the current candidate watermark symbol is taken as a true watermark symbol. If not yet all candidate watermark symbols have been processed, the next candidate watermark symbol is selected. Otherwise, a minimal value of the false positive probability functions for all candidate watermark symbols is determined (76) and is compared (77) with a second threshold value (T_max). If it is smaller than the second threshold value, the corresponding candidate watermark symbol is selected. Otherwise, it is determined (78) that no watermark symbol is present.

Description

Technical field

[0001] The invention relates to a method and to an apparatus for determining from sets of correlation result values whether a specific watermark symbol out of one or more candidate watermark symbols is embedded in a current section of a received audio signal, or whether no watermark symbol is embedded in the current section of the received audio signal.

Background

[0002] In a watermark detector cross correlations between a received signal and reference patterns are evaluated. Basically, the maximal correlation result value is compared to a threshold in order to determine whether watermark information has been embedded in the received signal. For acoustic path transmission, multiple correlation result peaks are employed for detection, in order to take a multi-path environment into account. Again, an appropriately defined metric aggregating multiple correlation result peaks is compared to a threshold for watermark detection.

[0003] A false positive probability defines the probability that a watermark is detected for unmarked content and is denoted as P_fp, which is naturally dependent on the applied watermark detection processing.

Summary of invention

[0004] A problem to be solved by the invention is to provide an improved watermark information detection. This problem is solved by the method disclosed in claim 1. An apparatus that utilises this method is disclosed in claim 2.
Advantageous additional embodiments of the invention are disclosed in the respective dependent claims.

[0005] According to the invention, order statistics are used for watermark symbol detection from the correlation result values, where the joint probability distribution function (pdf) for one or more peaks of cross correlation values between a current section of the received audio signal and reference patterns is employed directly for watermark detection. In addition, Monte Carlo or quasi-Monte Carlo simulations are used for evaluating the false positive probability corresponding to a pdf value threshold. A pdf threshold look-up table (LUT) and an associated false positive probability look-up table can be constructed, which both are used for the watermark symbol detection.
Using such LUTs significantly simplifies the complexity of watermark detection when taking more correlation result peaks for detection into account.
Because the derived false positive probability has intuitive interpretation, it can be used for the design of watermarking systems employing correlation for watermark detection.

[0006] In principle, the inventive method is adapted for determining from sets of correlation result values whether a specific watermark symbol out of one or more candidate watermark symbols is embedded in a current section of a received audio signal, or whether no one of said candidate watermark symbols is embedded in said current section of said received audio signal, wherein said current section of said received audio signal was correlated with at least one candidate reference pattern, each one of which representing one of said one or more candidate watermark symbols, said method including:

a) taking from the current set of correlation result values a group of maximal values which together form a peak vector;
- obtaining from the values of said peak vector a value of a probability distribution function;
- obtaining from said probability distribution function value a false positive probability value, which false positive probability value represents a probability that peaks resulting from correlation between a candidate reference pattern and non-watermarked audio signal content have a smaller pdf value than said probability distribution function value;
- determining whether said false positive probability value is smaller than a first threshold value and, if true, determining that the current candidate watermark symbol is the watermark symbol present in said current section of said received audio signal;
- if not yet all candidate watermark symbols have been processed, select the next candidate watermark symbol as said current candidate watermark symbol and go to a);
- otherwise, determining a minimal value of said false positive probability function values for all candidate watermark symbols;
- comparing said minimal value with a second threshold value that is greater than said first threshold value;
- if said minimal value is smaller than said second threshold value, determining that the current candidate watermark symbol is the watermark symbol present in said current section of said received audio signal;
- otherwise, determining that none of said candidate watermark symbols is present in said current section of said received audio signal.

[0007] In principle the inventive apparatus is adapted for determining from sets of correlation result values whether a specific watermark symbol out of one or more candidate watermark symbols is embedded in a current section of a received audio signal, or whether no one of said candidate watermark symbols is embedded in said current section of said received audio signal, wherein said current section of said received audio signal was correlated with at least one candidate reference pattern, each one of which representing one of said one or more candidate watermark symbols, said apparatus including means configured to:

a) take from the current set of correlation result values a group of maximal values which together form a peak vector;
- obtain from the values of said peak vector a value of a probability distribution function;
- obtain from said probability distribution function value a false positive probability value, which false positive probability value represents a probability that peaks resulting from correlation between a candidate reference pattern and non-watermarked audio signal content have a smaller pdf value than said probability distribution function value;
- determine whether said false positive probability value is smaller than a first threshold value and, if true, determine that the current candidate watermark symbol is the watermark symbol present in said current section of said received audio signal;
- if not yet all candidate watermark symbols have been processed, select the next candidate watermark symbol as said current candidate watermark symbol and go to a);
- otherwise, determine a minimal value of said false positive probability function values for all candidate watermark symbols;
- compare said minimal value with a second threshold value that is greater than said first threshold value;
- if said minimal value is smaller than said second threshold value, determine that the current candidate watermark symbol is the watermark symbol present in said current section of said received audio signal;
- otherwise, determine that none of said candidate watermark symbols is present in said current section of said received audio signal.

Brief description of drawings

[0008] Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in:

Fig. 1: probability of the calculated P_fp values for unmarked content being smaller than the given P_fp threshold;
Fig. 2: probability distribution for single peak case;
Fig. 3: determination of w_min,w_max for the single-peak case;
Fig. 4: flow diagram for generation of pdf threshold LUT and P_fp LUT, which are used for watermark detection;
Fig. 5: first flow diagram for watermark detection based on order statistics;
Fig. 6: block diagram for watermark information detection in a received audio signal;
Fig. 7: second flow diagram for watermark detection based on order statistics.

Description of embodiments

[0009] Even if not explicitly described, the following embodiments may be employed in any combination or sub-combination.

[0010] In a watermark detector, cross correlations between a received signal and reference patterns are evaluated. Usually, the maximal correlation result value is compared to a threshold in order to determine whether a watermark is embedded in the received signal.
However, when receiving a watermarked audio signal that has passed an acoustic path, multiple correlation result value peaks are employed for watermark detection, in order to take a resulting multi-path environment due to echoes and reverberation into account. An appropriately defined metric aggregating multiple correlation result value peaks is compared to a threshold for watermark detection. A false positive probability defines the probability that a watermark is detected for unmarked content and is denoted as P_fp. It is naturally dependent on the applied detection method.

[0011] For the audio watermarking system proposed in M. Arnold, X.M. Chen, P. Baum, U. Gries, G. Doerr, "A phase-based audio watermarking system robust to acoustic path propagation", IEEE Transactions on Information Forensics and Security, vol.9, no.3, pp.411-425, March 2014, it has been verified that cross correlation values between unmarked content and a reference pattern are Gaussian distributed with zero mean for a sufficiently large correlation length.
The inventors have found that statistical methods can be employed for watermark detection. That is, if P_fp can be evaluated for individual correlation arrays, P_fp itself may be used for watermark detection. Specifically, P_fp values are evaluated for different symbols in the watermark symbol alphabet. The smallest P_fp among all watermark symbols is compared to a threshold in order to decide whether watermark information is present in the received signal. If the smallest P_fp is smaller than the threshold, a watermark is assumed to be present. The symbol associated with the smallest P_fp is taken as the embedded watermark symbol. Otherwise, if the smallest P_fp is higher than the threshold, it is declared that no watermark data is present.

[0012] In WO 2011/141292 A1 and PCT/EP2014/066063, P_fp is defined as the probability that n_p or more correlation result values for a random correlation array subject to Gaussian distribution are larger than or equal to the actual n_p peaks under consideration. In this context, detection is based on comparison of multiple peaks. One can find all disjoint complementary cases for the case where there are n_p or more correlation values larger than or equal to the actual n_p peaks under consideration. However, the number of disjoint complementary cases exponentially increases with increased n_p, which limits its application, especially for environments with severe reflections and/or reverberations.
Another issue is associated with the interpretation of the defined P_fp. In WO 2011/141292 A1 and PCT/EP2014/066063 watermark detection is carried out by comparing evaluated P_fp values. Given a P_fp threshold, it is not straightforward to determine the probability of evaluated P_fp values for unmarked content being smaller than the given P_fp threshold. For single peak cases, the probability of evaluated P_fp values for unmarked content being smaller than the given P_fp threshold is equal to the given P_fp threshold.
However, for n_p > 1 that is not the case, as illustrated in Fig. 1, in which P_fp values delivered from the detector described in WO 2011/141292 A1 and PCT/EP2014/066063 are compared to P_fp thresholds shown on the x-axis. The probability for P_fp values being lower than a threshold is estimated by dividing the number of P_fp values lower than the threshold by the total number of delivered P_fp values. As shown in Fig. 1, the probability of evaluated P_fp values for unmarked content being smaller than the given P_fp threshold is higher than the given P_fp threshold for n_p > 1. With increased n_p the deviation between both becomes larger.

[0013] Therefore, in the improved processing, order statistics are used for watermark detection. To resolve the related complexity issue, two look-up tables are employed for P_fp function evaluation. Moreover, using order statistics as decision metric provides a clear interpretation of evaluated P_fp function values, namely, the probability of evaluated P_fp function values for unmarked content being smaller than the given P_fp threshold is exactly equal to the given P_fp threshold for any n_p value.

Derivation of joint distribution function for multiple peaks based on order statistics

[0014] From the viewpoint of maximum likelihood detection, the probability distribution function denoted pdf of peaks resulting from unmarked content can be employed for watermark detection. The decision criterion is to minimise the likelihood pdf. That is, the higher the pdf value for multiple peaks, the more likely it is that these peaks are generated from unmarked content. Conversely, the lower the pdf value, the more likely it is that these peaks are generated from marked content.

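[0015] Under the assumption that correlation result values are independent and identically distributed (i.i.d.) random variables, the pdf of multiple peaks occurring in the correlation result can be evaluated based on order statistics, see H.A. David and H.N. Nagaraja, "Order statistics", John Wiley & Sons, 3rd edition, 2003.
Specifically for the audio watermarking system under consideration, let $\{r_0, r_1, \dots, r_{L-1}\}$ denote L absolute correlation result values subject to the following zero mean folded Gaussian distribution ($r_i \ge 0$):

$$f(r_i) = \sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma}\exp\left(-\frac{r_i^2}{2\sigma^2}\right),$$

where σ denotes the standard deviation. Order statistics deal with sorted random variables $r_{(0)} \ge r_{(1)} \ge \dots \ge r_{(L-1)}$, which are obtained by sorting the original correlation result values $\{r_0, r_1, \dots, r_{L-1}\}$.
Let $\mathbf{v} = [v_0 = r_{(0)}, v_1 = r_{(1)}, \dots, v_{n_p-1} = r_{(n_p-1)}]$ denote the peak vector containing the n_p largest peaks in a correlation array. Tests have shown that n_p can be in a range from 3 to 12. The constraint $v_0 \ge v_1 \ge \dots \ge v_{n_p-1}$ is referred to as peak constraint. The joint pdf of these peaks can be derived as (see the above mentioned David/Nagaraja book):

$$f(\mathbf{v}) = \frac{L!}{(L-n_p)!}\left(\sqrt{\frac{2}{\pi}}\,\frac{1}{\sigma}\right)^{n_p}\left[\operatorname{erf}\left(\frac{v_{n_p-1}}{\sqrt{2}\,\sigma}\right)\right]^{L-n_p}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=0}^{n_p-1} v_i^2\right),$$

where

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt$$

is the error function.
For convenience, a normalised peak vector is defined as

$$\mathbf{w} = \mathbf{v}/\sigma = [w_0, w_1, \dots, w_{n_p-1}], \quad w_i = v_i/\sigma,$$

and the distribution for w becomes ($w_0 \ge w_1 \ge \dots \ge w_{n_p-1}$)

$$g(\mathbf{w}) = \frac{L!}{(L-n_p)!}\left(\frac{2}{\pi}\right)^{n_p/2}\left[\operatorname{erf}\left(\frac{w_{n_p-1}}{\sqrt{2}}\right)\right]^{L-n_p}\exp\left(-\frac{1}{2}\sum_{i=0}^{n_p-1} w_i^2\right). \tag{1}$$
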
In principle, watermark detection can be carried out by comparing the pdf values of normalised peak vectors in correlation arrays corresponding to different watermark symbols, and the symbol resulting in the smallest pdf value is selected as embedded watermark symbol. On the other hand, a threshold should be used to avoid a high false positive probability, or in other words, the resulting P_fp using that threshold should be below the target P_fp. That is, only when the smallest pdf value g(w) is sufficiently low, it is decided that a watermark is present in the received signal. Otherwise, if the threshold is not low enough, for unmarked content, a watermark will be detected with a high probability. Consequently, the corresponding P_fp becomes high. Therefore it is necessary to evaluate P_fp for a specific threshold for pdf values g(w).
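The evaluation of the pdf of a normalised peak vector can be sketched as follows. This is a minimal illustration only, assuming that g(w) has the standard order-statistics form for the n_p largest of L i.i.d. folded-Gaussian values; the function name `log_g` and the example peak values are hypothetical. The logarithm of the pdf is returned because L!/(L-n_p)! is very large for typical correlation lengths.

```python
import math

def log_g(w, L):
    """Natural log of the joint pdf g(w) of the n_p largest normalised peaks.

    w : normalised peaks in descending order, w[0] >= w[1] >= ... >= w[-1] >= 0
    L : correlation length (number of correlation result values)
    """
    n_p = len(w)
    if any(a < b for a, b in zip(w, w[1:])):
        raise ValueError("peak constraint violated: w must be sorted descending")
    x = w[-1] / math.sqrt(2.0)
    if x <= 0.0:
        return float("-inf")              # erf(0) = 0, hence g(w) = 0
    # log(erf(x)); the erfc form keeps precision when erf(x) is close to 1
    log_erf = math.log1p(-math.erfc(x)) if x > 0.5 else math.log(math.erf(x))
    return (math.lgamma(L + 1) - math.lgamma(L - n_p + 1)   # log(L! / (L - n_p)!)
            + 0.5 * n_p * math.log(2.0 / math.pi)
            + (L - n_p) * log_erf
            - 0.5 * sum(v * v for v in w))

# Hypothetical peak vectors: typical noise peaks versus one dominant peak.
L = 16384
noise_like = [4.0, 3.8, 3.7]
marked_like = [9.0, 3.8, 3.7]
print(log_g(noise_like, L), log_g(marked_like, L))
```

The vector with the dominant first peak yields a much smaller pdf value, which matches the decision criterion that low pdf values indicate marked content.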

Evaluation of false positive probabilities based on Monte Carlo simulation

[0016] For watermark detection, the pdf values g(w) for different watermark symbols are compared to a threshold in order to decide whether or not a watermark is present. If g(w) is smaller than the threshold, it is decided that a watermark is present. And the watermark symbol resulting in the smallest pdf value is taken as the embedded one. If none of evaluated pdf values is smaller than the threshold, it is assumed that no watermark information data is present.
For a specific normalised peak vector z, the false positive probability is defined as the probability that a random normalised peak vector w has a pdf value lower than the pdf value of z:

$$P_{fp}(\mathbf{z}) = \Pr\{g(\mathbf{w}) < g(\mathbf{z})\}. \tag{2}$$

In this context, g(z) is interpreted as a threshold for determining the presence of a watermark.
Experimentally, a large number of length-L correlation arrays, say M, can be generated according to the Gaussian distribution. Normalised peak vectors of these correlation arrays are denoted as {w⁽ⁱ⁾, 1 ≤ i ≤ M} and are used for evaluating g(w⁽ⁱ⁾). Let m denote the number of generated correlation arrays fulfilling g(w⁽ⁱ⁾) < g(z), then

$$P_{fp}(\mathbf{z}) \approx \frac{m}{M}.$$

[0017] Fig. 2 illustrates the distribution function g(w) for the single-peak case, where the correlation length is L =16384. The pdf values for extremely small or extremely large peak values are extremely small. For a given threshold th, P_fp is represented by the area below the distribution function where pdf values are smaller than the threshold. Therefore, the evaluation of P_fp can be interpreted as one-dimensional integration for the single-peak case. For multi-peaks, it is a multi-dimensional integration.
Moreover, Fig. 2 indicates that an increase of the threshold th also increases P_fp. Therefore in the above experiment g(w⁽ⁱ⁾) < g(z) also indicates that P_fp(w⁽ⁱ⁾) < P_fp(z), where g(z) and g(w⁽ⁱ⁾) are interpreted as two thresholds. Consequently, if calculating P_fp for each correlation array, there are m values lower than P_fp(z) out of M calculated P_fp values, i.e.

$$\Pr\{P_{fp}(\mathbf{w}) < P_{fp}(\mathbf{z})\} \approx \frac{m}{M} \approx P_{fp}(\mathbf{z}).$$

In other words, on average, for every ten calculated P_fp values there is one value smaller than 0.1, for every 100 calculated P_fp values there is one value smaller than 0.01. This kind of interpretation is important for the design of a watermarking system. When setting a threshold for P_fp values, or equivalently for pdf values, the probability that watermark data is detected in unmarked content is equal to the P_fp threshold.
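The experiment described above can be illustrated for the single-peak case with a short simulation. The sketch below is not the detector of the invention; it assumes unit-variance Gaussian correlation values, a small correlation length L = 256 to keep the run time low, and hypothetical helper names. It estimates P_fp as m/M and then checks that roughly one in ten P_fp values obtained from unmarked content falls below 0.1.

```python
import bisect
import math
import random

def log_g1(w, L):
    """log pdf of the largest normalised peak (single-peak case, n_p = 1)."""
    return (math.log(L) + 0.5 * math.log(2.0 / math.pi)
            + (L - 1) * math.log(math.erf(w / math.sqrt(2.0)))
            - 0.5 * w * w)

def simulate_null_log_pdfs(L, M, rng):
    """log g(w) for M simulated correlation arrays of unmarked content (sigma = 1)."""
    out = []
    for _ in range(M):
        peak = max(abs(rng.gauss(0.0, 1.0)) for _ in range(L))
        out.append(log_g1(peak, L))
    return out

def empirical_pfp(log_gz, sorted_null):
    """P_fp(z) ~ m / M, with m the number of null pdf values strictly below g(z)."""
    m = bisect.bisect_left(sorted_null, log_gz)
    return m / len(sorted_null)

rng = random.Random(1)
L, M = 256, 2000                       # illustrative sizes, chosen for speed
null = sorted(simulate_null_log_pdfs(L, M, rng))

# P_fp values of further unmarked arrays, evaluated against the reference set
fresh = simulate_null_log_pdfs(L, 1000, rng)
pfps = [empirical_pfp(v, null) for v in fresh]
frac_below = sum(1 for p in pfps if p < 0.1) / len(pfps)
print("fraction of P_fp values below 0.1:", frac_below)
```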

[0018] Since theoretical evaluation of false positive probabilities defined in equation (2) is difficult for n_p > 1, they are determined numerically. Specifically, a look-up table (LUT) with K entries for pdf thresholds is defined, for example, linearly on the log-scale in the range [αg_max, βg_max], α << 1, β ≤ 1, where g_max denotes the maximal pdf value for all possible normalised peak vectors. According to equation (1), for a multi-dimensional point w* resulting in g_max, it is obvious that

$$w_0^* = w_1^* = \dots = w_{n_p-1}^* = w^*$$

due to the peak constraint w₀ ≥ w₁ ≥ ... ≥ w_{n_p-1}. Therefore, the multi-dimensional point w* can be derived as follows:

$$w^* = \arg\max_{w}\left\{\left[\operatorname{erf}\left(\frac{w}{\sqrt{2}}\right)\right]^{L-n_p}\exp\left(-\frac{n_p\,w^2}{2}\right)\right\}.$$
Given L and n_p, it is straightforward to determine w* and consequently the maximal pdf value g_max. For L =16384, the following table lists g_max values for 1 ≤ n_p ≤ 8 :

n_p = 1	n_p = 2	n_p = 3	n_p = 4	n_p = 5	n_p = 6	n_p = 7	n_p = 8
4.0216	3.8563	3.7566	3.6277	3.5808	3.5406	3.5055	3.4744
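The determination of the maximising point can be sketched numerically. The sketch assumes that the joint pdf has the order-statistics form for folded-Gaussian values, so that along the diagonal w₀ = ... = w_{n_p-1} = w the derivative of log g vanishes where (L - n_p)·f(w)/F(w) = n_p·w, with f and F the unit-variance folded-Gaussian pdf and cdf. The function names and the bisection bounds are hypothetical choices.

```python
import math

def log_g_diag(w, L, n_p):
    """log g(w, w, ..., w): log pdf with all n_p normalised peaks equal to w."""
    return (math.lgamma(L + 1) - math.lgamma(L - n_p + 1)
            + 0.5 * n_p * math.log(2.0 / math.pi)
            + (L - n_p) * math.log1p(-math.erfc(w / math.sqrt(2.0)))
            - 0.5 * n_p * w * w)

def find_w_star(L, n_p, lo=0.5, hi=10.0, iters=100):
    """Common coordinate w* of the maximiser, by bisection on d/dw log g = 0."""
    def dlog(w):
        f = math.sqrt(2.0 / math.pi) * math.exp(-0.5 * w * w)  # folded-Gaussian pdf
        F = math.erf(w / math.sqrt(2.0))                       # folded-Gaussian cdf
        return (L - n_p) * f / F - n_p * w                     # strictly decreasing in w
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dlog(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

L = 16384
for n_p in (1, 2, 3):
    w_star = find_w_star(L, n_p)
    g_max = math.exp(log_g_diag(w_star, L, n_p))
    print(n_p, w_star, g_max)
```

Bisection is sufficient here because f(w)/F(w) decreases monotonically, so the derivative has a single sign change on the search interval.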

[0019] For each entry in the pdf threshold LUT the false positive probability is determined numerically. The evaluation of the false positive probability can be interpreted as multi-dimensional integration. For multi-dimensional integration, the convergence of Monte Carlo or quasi-Monte Carlo simulations (see R.E. Caflisch, "Monte Carlo and quasi-Monte Carlo methods", Acta Numerica, vol.7, pp.1-49, January 1998) is independent of dimension, while linear-grid based methods do depend on dimension and therefore do not converge well with increased dimension. Therefore Monte Carlo simulation is used for the numerical evaluation of the false positive probability; a quasi-Monte Carlo simulation is carried out in the same way as the Monte Carlo simulation.

[0020] At the watermark detector, two look-up tables (LUTs) are used for watermark detection. One LUT stores values of the probability distribution function (pdf) for normalised peaks of the correlation between non-watermarked content and reference patterns, and the other one stores the values of the false positive probability corresponding to the entries in the pdf LUT. That is, each entry in the pdf LUT corresponds to a unique entry in the LUT for the false positive probability. Different correlation lengths and different numbers of peaks result in different LUTs. The determined LUTs are stored in the memory unit of the watermark detector, which is accessed during watermark detection.
As mentioned above, for watermark detection, the detector performs a correlation between the received audio section and the reference patterns corresponding to the watermark symbols. The correlation values are sorted to find peaks, which are normalised by the standard deviation. The standard deviation is estimated either individually for each set of correlation result values corresponding to an individual candidate watermark symbol, or by averaging over sets of correlation result values. Afterwards, the probability distribution function is evaluated for the normalised peaks. The LUT for the probability distribution is accessed to find the entry index which is nearest to the pdf value evaluated from the normalised peaks. This entry index is then used to access the second LUT for the false positive probability. The false positive probability corresponding to the peaks found is then evaluated by means of interpolation or extrapolation.
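The mapping from an evaluated pdf value to a false positive probability via the two LUTs can be sketched as below. The text does not specify the interpolation rule; linear interpolation in the log-log domain is an assumption of this sketch, as are the function name `pfp_from_luts` and the synthetic table contents.

```python
import bisect
import math

def pfp_from_luts(g_value, pdf_lut, pfp_lut):
    """Map a pdf value to a false positive probability.

    Linear interpolation between neighbouring LUT entries in the log-log domain;
    outside the table range the nearest segment is extended (extrapolation).

    pdf_lut : ascending pdf thresholds, all > 0
    pfp_lut : P_fp value for each threshold, all > 0
    """
    x = math.log10(g_value)
    xs = [math.log10(t) for t in pdf_lut]
    ys = [math.log10(p) for p in pfp_lut]
    # right-hand neighbour index, clamped so that [j - 1, j] is a valid segment
    j = min(max(bisect.bisect_left(xs, x), 1), len(xs) - 1)
    slope = (ys[j] - ys[j - 1]) / (xs[j] - xs[j - 1])
    y = ys[j - 1] + slope * (x - xs[j - 1])
    return min(10.0 ** y, 1.0)            # a probability cannot exceed 1

# Synthetic tables for illustration only (not values from a real detector)
pdf_lut = [1e-6, 1e-4, 1e-2, 1.0]
pfp_lut = [1e-9, 1e-6, 1e-3, 1.0]
print(pfp_from_luts(1e-3, pdf_lut, pfp_lut))   # interpolation between entries
print(pfp_from_luts(1e-8, pdf_lut, pfp_lut))   # extrapolation below the table
```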

[0021] For convenience, an n_p-dimensional hypercube

$$\mathcal{H} = [w_{\min}, w_{\max}]^{n_p}$$

is used for Monte Carlo simulation. Correspondingly, all volume outside the hypercube is ignored for the P_fp evaluation. The inventors have found that, by a careful choice of w_min,w_max, the influence on the evaluated P_fp values is negligible for relevant P_fp values in practical applications. One possible choice for w_max > w_min is

[0022] Fig. 3 depicts the determination of w_min,w_max for the single-peak case. The volume of the hypercube is $A = (w_{\max} - w_{\min})^{n_p}$.

[0023] The false positive probability can be reformulated as (see the definition in equation (2)):

$$P_{fp}(th) = \int_{\mathcal{R}_{th}} g(\mathbf{w})\,d\mathbf{w} = A\int_{\mathcal{H}} g'(\mathbf{w})\,\frac{1}{A}\,d\mathbf{w}, \tag{3}$$

where $\mathcal{R}_{th}$ denotes the hyper-region subject to the peak constraint w₀ ≥ w₁ ≥ ... ≥ w_{n_p-1} and with pdf values g(w) less than the threshold th. The function g'(w) is defined as g'(w) = g(w) for $\mathbf{w} \in \mathcal{R}_{th}$, and g'(w) = 0 for $\mathbf{w} \in \mathcal{H}$ and $\mathbf{w} \notin \mathcal{R}_{th}$.
Moreover, 1/A can be interpreted as the distribution function for an n_p-dimensional random vector w uniformly distributed in the hypercube $\mathcal{H} = [w_{\min}, w_{\max}]^{n_p}$. Therefore, equation (3) is, up to the factor A, the expectation of g'(w) with respect to a uniformly distributed random vector w:

$$P_{fp}(th) = A\,E[g'(\mathbf{w})] \approx \frac{A}{M}\sum_{i=1}^{M} g'(\mathbf{w}^{(i)}). \tag{4}$$

That is, n_p random variables [w₀,w₁,..,w_{n_p-1}] uniformly distributed in [w_min, w_max] are generated M times. For those cases fulfilling the peak condition w₀ ≥ w₁ ≥ ... ≥ w_{n_p-1}, the pdf value g(w) is evaluated and compared to the pdf threshold th. If g(w) < th, the g(w) values are accumulated. The final result of the accumulation is scaled by A/M, which delivers an estimated false positive probability. Consequently, for each entry in the pdf threshold LUT, the corresponding false positive probability is determined numerically according to equation (4).
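The Monte Carlo estimate described above can be sketched as follows. The sketch assumes the order-statistics form of g(w) for folded-Gaussian values and a scaling of the accumulated sum by A/M, where A is the hypercube volume; the interval [3, 6], the sample count and the function names are hypothetical choices for the single-peak case with L = 16384.

```python
import math
import random

def log_g(w, L):
    """log of the joint pdf of the n_p largest normalised peaks (w sorted descending, w[-1] > 0)."""
    n_p = len(w)
    return (math.lgamma(L + 1) - math.lgamma(L - n_p + 1)
            + 0.5 * n_p * math.log(2.0 / math.pi)
            + (L - n_p) * math.log1p(-math.erfc(w[-1] / math.sqrt(2.0)))
            - 0.5 * sum(v * v for v in w))

def pfp_monte_carlo(th, L, n_p, w_min, w_max, M, rng):
    """Estimate P_fp(th) as A times the sample mean of g'(w), w uniform in the hypercube.

    Requires w_min > 0 so that erf(w / sqrt(2)) > 0 for every sample.
    """
    A = (w_max - w_min) ** n_p            # hypercube volume
    acc = 0.0
    for _ in range(M):
        w = [rng.uniform(w_min, w_max) for _ in range(n_p)]
        if any(a < b for a, b in zip(w, w[1:])):
            continue                      # peak constraint violated: g'(w) = 0
        g = math.exp(log_g(w, L))
        if g < th:
            acc += g                      # inside the hyper-region: g'(w) = g(w)
    return A * acc / M

rng = random.Random(7)
L = 16384
p_all = pfp_monte_carlo(float("inf"), L, 1, 3.0, 6.0, 20000, rng)   # no threshold
p_half = pfp_monte_carlo(0.5, L, 1, 3.0, 6.0, 20000, rng)           # threshold th = 0.5
print(p_all, p_half)
```

With an infinite threshold the estimate approximates the total probability mass inside the hypercube, which gives a quick plausibility check on the chosen w_min and w_max.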

[0024] Although Monte Carlo or quasi-Monte Carlo simulations solve the problem associated with linear grid based numerical integration for increased dimension, the number of generated multi-dimensional points fulfilling the peak constraint decreases rapidly with increased dimension.
More precisely, only a fraction $1/n_p!$ of the generated points fulfil the peak constraint and can be used for P_fp(th) estimation. To tackle this problem, the roles of the generated n_p random variables may be exchanged, so that equation (4) is evaluated (n_p!) times.
If for example n_p=2, two values [a₀,a₁] are generated randomly according to a uniform distribution:

$$a_0, a_1 \sim \mathcal{U}[w_{\min}, w_{\max}].$$

If a₀ ≥ a₁, w₀ = a₀ and w₁ = a₁ are assigned, where a₀ is interpreted as the largest peak, and g(w) is evaluated and compared to threshold th. Potentially, g(w) is added to the accumulation;
If a₁ > a₀, w₀ = a₁ and w₁ = a₀ are assigned, where a₁ instead of a₀ is now interpreted as the largest peak. Again, g(w) is evaluated and compared to threshold th. Potentially, g(w) is also added to the accumulation.

By doing so, the estimation in equation (4) is performed two times, and scaling the final accumulation with

$$\frac{A}{2M}$$

yields an estimate of P_fp(th). This idea can generally be applied for n_p peaks, which estimates n_p! P_fp(th) instead of P_fp(th). That is, n_p random variables [w₀,w₁,...,w_{n_p-1}] uniformly distributed in [w_min, w_max] are generated M times. Following sorting, the values are used to evaluate the pdf function, which is compared to the threshold th. The final accumulation is scaled by

$$\frac{A}{n_p!\,M}$$

in order to estimate P_fp(th). Estimating n_p! P_fp(th) via Monte Carlo simulation provides significantly better results than direct estimation of P_fp(th).
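The effect of sorting each generated point can be sketched as follows. The sketch assumes the order-statistics form of g(w) and a final scaling by A/(n_p!·M); the helper names, the interval [3, 6] and the sample counts are hypothetical. It first measures how few unsorted points satisfy the peak constraint and then estimates P_fp(th) from sorted points.

```python
import math
import random

def log_g(w, L):
    """log of the joint pdf of the n_p largest normalised peaks (w sorted descending, w[-1] > 0)."""
    n_p = len(w)
    return (math.lgamma(L + 1) - math.lgamma(L - n_p + 1)
            + 0.5 * n_p * math.log(2.0 / math.pi)
            + (L - n_p) * math.log1p(-math.erfc(w[-1] / math.sqrt(2.0)))
            - 0.5 * sum(v * v for v in w))

def acceptance_fraction(n_p, M, rng):
    """Fraction of unsorted uniform points that already fulfil the peak constraint."""
    ok = 0
    for _ in range(M):
        w = [rng.random() for _ in range(n_p)]
        if all(a >= b for a, b in zip(w, w[1:])):
            ok += 1
    return ok / M

def pfp_sorted_monte_carlo(th, L, n_p, w_min, w_max, M, rng):
    """Estimate P_fp(th) from sorted points; every generated point is used."""
    A = (w_max - w_min) ** n_p
    acc = 0.0
    for _ in range(M):
        w = sorted((rng.uniform(w_min, w_max) for _ in range(n_p)), reverse=True)
        g = math.exp(log_g(w, L))
        if g < th:
            acc += g
    return A * acc / (math.factorial(n_p) * M)

rng = random.Random(3)
frac3 = acceptance_fraction(3, 20000, rng)          # expected near 1/3! = 1/6
p_all = pfp_sorted_monte_carlo(float("inf"), 16384, 2, 3.0, 6.0, 50000, rng)
print(frac3, p_all)
```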
After estimating P_fp for each entry in the pdf threshold LUT, another LUT for P_fp is constructed. Based on interpolation/extrapolation, mapping from pdf values to false positive probabilities is established. More specifically, given a normalised peak vector w, the corresponding pdf g(w) is evaluated according to equation (1). The entry in the pdf threshold LUT is found which is nearest to g(w). If the entry is not at the boundary of the pdf threshold LUT, the corresponding entry in the P_fp LUT and its neighbors are used to evaluate the false positive probability corresponding to g(w) by means of interpolation. If the entry in the pdf threshold LUT nearest to g(w) is at the threshold LUT boundary, extrapolation may be necessary to calculate the corresponding false positive probability for g(w).

[0025] Fig. 4 shows a flow diagram for the generation of the pdf threshold LUT and P_fp LUT, which are used for the watermark detection.
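In step 41, the aim is stated to construct a pdf threshold LUT with K entries, given a pdf range [p_min,p_max]. For example, K entries linear on a log₁₀-scale are defined as

$$th_i = 10^{\,\log_{10} p_{\min} + \frac{i-1}{K-1}\left(\log_{10} p_{\max} - \log_{10} p_{\min}\right)}, \quad 1 \le i \le K.$$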
Step 42 is the initialisation of the P_fp LUT: K entries in the P_fp LUT are initialised to zero: P_fp,i = 0, 1 ≤ i ≤ K. A loop over m starts with m = 1. Normalised peak vectors are generated in step 43: Generate M times normalised peak vectors using Monte Carlo or quasi-Monte Carlo processing. In the Monte Carlo case, a random generator is used to generate normalised peak vectors uniformly distributed in the hyper-cube [w_min,w_max]^np, where [w_min,w_max] defines the range of generated random normalised peak values. For quasi-Monte Carlo simulations, a low-discrepancy sequence like Sobol sequence is generated as normalised peak values, which also approximate the uniform distribution.
In step 44 each generated normalised peak vector is sorted such that the peak constraint is fulfilled: w₀ ≥ w₁ ≥ ··· ≥ w_{n_p-1}, which is used to calculate pdf value g(w^(m)) in step 45. A loop over i starts with i=1.
In step 46, the calculated values g(w^(m)) are compared with threshold entries th_i in the pdf threshold LUT. If g(w^(m)) < th_i, the P_fp LUT entries are updated in step 47: all P_fp,i entries in the P_fp LUT having a corresponding pdf threshold greater than g(w^(m)) are increased by g(w^(m)). Thereafter i is incremented and, as long as i ≤ K in step 48, the i loop continues with step 46.
Thereafter m is incremented and, as long as m ≤ M in step 49, the m loop continues with step 43. After all M normalised peak vectors have been generated and the P_fp LUT entries have been updated correspondingly, the final P_fp values are estimated in step 40 by scaling the P_fp LUT entries by

$$\frac{A}{n_p!\,M}, \quad A = (w_{\max} - w_{\min})^{n_p}.$$
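The LUT generation of Fig. 4 can be sketched in a few lines. This is an illustrative sketch under stated assumptions: plain Monte Carlo sampling with the standard-library random generator, the order-statistics form of g(w), a final scaling by A/(n_p!·M), and hypothetical parameter values (the upper pdf bound 10.0 is assumed to lie above the maximal pdf value for n_p = 2 and L = 16384).

```python
import bisect
import math
import random

def log_g(w, L):
    """log of the joint pdf of the n_p largest normalised peaks (w sorted descending, w[-1] > 0)."""
    n_p = len(w)
    return (math.lgamma(L + 1) - math.lgamma(L - n_p + 1)
            + 0.5 * n_p * math.log(2.0 / math.pi)
            + (L - n_p) * math.log1p(-math.erfc(w[-1] / math.sqrt(2.0)))
            - 0.5 * sum(v * v for v in w))

def build_luts(L, n_p, p_min, p_max, K, w_min, w_max, M, rng):
    """Return (pdf threshold LUT, P_fp LUT), following steps 41 to 49 and 40."""
    # step 41: K thresholds, linear on a log10 scale between p_min and p_max
    lo, hi = math.log10(p_min), math.log10(p_max)
    pdf_lut = [10.0 ** (lo + (hi - lo) * i / (K - 1)) for i in range(K)]
    pfp_lut = [0.0] * K                                          # step 42
    for _ in range(M):                                           # m loop, steps 43 to 49
        w = sorted((rng.uniform(w_min, w_max) for _ in range(n_p)),
                   reverse=True)                                 # steps 43 and 44
        g = math.exp(log_g(w, L))                                # step 45
        # steps 46 and 47: every threshold th_i > g receives the contribution g
        for i in range(bisect.bisect_right(pdf_lut, g), K):
            pfp_lut[i] += g
    scale = (w_max - w_min) ** n_p / (math.factorial(n_p) * M)   # step 40
    return pdf_lut, [p * scale for p in pfp_lut]

rng = random.Random(11)
pdf_lut, pfp_lut = build_luts(L=16384, n_p=2, p_min=1e-6, p_max=10.0, K=16,
                              w_min=3.0, w_max=6.0, M=50000, rng=rng)
for th, p in zip(pdf_lut, pfp_lut):
    print(f"{th:10.3e}  {p:10.3e}")
```

For quasi-Monte Carlo processing, the uniform draws would be replaced by points of a low-discrepancy sequence, which is not shown here.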

[0026] A first flow diagram for watermark detection is shown in Fig. 5. There are nSymbols watermark symbols in the watermark symbol alphabet. Watermark detection is carried out as follows:

In step 50, a result values array or block r of a cross correlation between a current section of the received signal and reference patterns is provided, for example by means of fast Fourier transform and inverse fast Fourier transform.

Accordingly, there are nSymbols correlation result value arrays used for a watermark detection. A first loop over i starts with i=1.
In step 511, the correlation array values are sorted according to their magnitude, and the n_p maximal values are used as a peak vector. The peak values of the peak vector can be normalised in step 512, for all nSymbols peak vectors:

$$\mathbf{w}_i = \mathbf{v}_i/\sigma, \quad 1 \le i \le nSymbols,$$

where v_i denotes the peak vector obtained after sorting the correlation results array i and w_i denotes the normalised peak vector i. The standard deviation σ can be estimated either individually for each correlation result array, or by averaging over sets of correlation result arrays. Thereafter i is incremented and, as long as i ≤ nSymbols in step 513, the first i loop continues with step 511.
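Steps 511 and 512 can be sketched for a single correlation array as follows. The estimator for σ is not specified in the text; the sketch assumes the root mean square of the array values, which is a common choice for zero-mean data, and the function name is hypothetical.

```python
import math

def normalised_peak_vector(corr, n_p):
    """Return the n_p largest absolute correlation values divided by sigma.

    sigma is estimated from this array alone as the root mean square of its
    values (an assumption of this sketch, suitable for zero-mean data).
    """
    mags = sorted((abs(c) for c in corr), reverse=True)      # step 511: sort by magnitude
    sigma = math.sqrt(sum(c * c for c in corr) / len(corr))
    return [m / sigma for m in mags[:n_p]]                   # step 512: normalise

# Tiny illustrative array: sigma = sqrt((9 + 16) / 4) = 2.5
print(normalised_peak_vector([3.0, -4.0, 0.0, 0.0], n_p=2))
```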
A second loop over i starts with i=1. In step 52, pdf and P_fp values are evaluated. Pdf values g(w_i) are evaluated for nSymbols candidate watermark symbols based on equation (1). Based on pdf threshold and P_fp value LUTs, P_fp(w_i) is obtained in step 53 by means of interpolation or extrapolation.
In step 54 it is checked whether the value P_fp(w_i) is smaller than a first threshold T_min. If true, the corresponding candidate watermark is detected as the embedded one and is output in step 59. If not true, i is incremented. In step 55, if i is smaller than or equal to nSymbols, the processing continues with step 52.
Otherwise, in step 56, i* is defined as i* = arg min_i{P_fp(w_i)} and P_fp,min is defined as P_fp,min = P_fp(w_i*). In step 57 the minimal P_fp value P_fp,min for all candidate watermark symbols is then compared to a second threshold T_max > T_min. If the minimal P_fp value is smaller than T_max, the symbol resulting in the minimal P_fp value is determined to be the embedded one and is output in step 59. If the minimal P_fp value is not smaller than T_max, it is decided in step 58 that no watermark is present in the current section of the received signal.
In this description, "no watermark is present/detected" means that none of the candidate watermarks is present or detected.
The thresholds T_min and T_max can be adapted to the current application. Tests have shown that T_min = 10^-8 and T_max = 10^-2 are suitable values. For security applications the thresholds should be selected smaller and for metadata transmission the thresholds should be selected higher.
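The two-threshold decision of steps 54 to 59 can be sketched as follows, given already evaluated P_fp values for all candidate symbols. The function name is hypothetical, symbol indices are 0-based in this sketch, and the default thresholds are the values reported as suitable in the tests.

```python
def detect_symbol(pfp_values, t_min=1e-8, t_max=1e-2):
    """Return the index of the detected symbol, or None if no watermark is detected."""
    # steps 52 to 55: accept the first symbol whose P_fp is below T_min
    for i, p in enumerate(pfp_values):
        if p < t_min:
            return i
    # steps 56 to 58: otherwise take the smallest P_fp and compare it with T_max
    i_star = min(range(len(pfp_values)), key=pfp_values.__getitem__)
    return i_star if pfp_values[i_star] < t_max else None

print(detect_symbol([0.3, 1e-9, 1e-10]))   # first value below T_min wins
print(detect_symbol([0.3, 1e-4, 0.5]))     # minimum is below T_max
print(detect_symbol([0.3, 0.2]))           # no watermark detected
```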

[0027] A second flow diagram for watermark detection is shown in Fig. 7. There are nSymbols watermark symbols in the watermark symbol alphabet. Watermark detection is carried out as follows:

In step 70, a result values array or block r of a cross correlation between a current section of the received signal and reference patterns is provided, for example by means of fast Fourier transform and inverse fast Fourier transform.
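As an illustration of step 70, a correlation via fast Fourier transform and inverse fast Fourier transform could look like the sketch below. It assumes a circular cross-correlation between a real-valued signal section and a real-valued reference pattern of equal length; the function name and these assumptions are not taken from the description.

```python
import numpy as np


def circular_cross_correlation(section: np.ndarray, pattern: np.ndarray) -> np.ndarray:
    """Circular cross-correlation r[k] = sum_n section[(n + k) mod N] * pattern[n]."""
    section = np.asarray(section, dtype=float)
    pattern = np.asarray(pattern, dtype=float)
    if section.shape != pattern.shape or section.ndim != 1:
        raise ValueError("section and pattern must be 1-D arrays of equal length")
    # Correlation theorem: multiply one spectrum by the conjugate of the other.
    spectrum = np.fft.rfft(section) * np.conj(np.fft.rfft(pattern))
    # Back to the lag domain; n is passed so odd lengths are handled correctly.
    return np.fft.irfft(spectrum, n=section.size)
```

Calling this function once per reference pattern yields the nSymbols correlation result arrays used in the following steps.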

Accordingly, there are nSymbols correlation result value arrays used for a watermark detection. A loop over i starts with i=1.
In step 711, the correlation array values are sorted according to their magnitude, and the n_p maximal values are used as a peak vector. The peak values can be normalised in step 712, for all nSymbols peak vectors:

w_i = v_i / σ, 1 ≤ i ≤ nSymbols,

where v_i denotes the peak vector obtained after sorting the correlation results array i and w_i denotes the normalised peak vector i. The standard deviation σ is estimated either individually for each set of correlation result values corresponding to an individual candidate watermark symbol, or by averaging over sets of correlation result values.
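Steps 711 and 712 can be sketched as follows. The sketch makes several assumptions that the description leaves open: values are ranked by absolute value but kept with their sign, normalisation is a division by σ, the individual estimate of σ is the standard deviation of the whole array, and the averaged estimate is the mean of the per-array standard deviations. The function names are hypothetical.

```python
import numpy as np


def normalised_peak_vector(r, n_p, sigma=None):
    """Return the n_p largest-magnitude correlation values divided by sigma."""
    r = np.asarray(r, dtype=float)
    if not 1 <= n_p <= r.size:
        raise ValueError("n_p must be between 1 and the array length")
    # Step 711: rank by magnitude (descending) and keep the n_p top entries.
    order = np.argsort(np.abs(r))[::-1][:n_p]
    v = r[order]
    # Assumption: sigma defaults to the standard deviation of this one array.
    if sigma is None:
        sigma = float(np.std(r))
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    # Step 712: normalise the peak vector.
    return v / sigma


def averaged_sigma(correlation_arrays):
    """Alternative estimate (assumption): mean of the per-array standard deviations."""
    return float(np.mean([np.std(np.asarray(r, dtype=float))
                          for r in correlation_arrays]))
```

Passing `sigma=averaged_sigma(arrays)` to `normalised_peak_vector` corresponds to the averaged variant, while omitting `sigma` corresponds to the individual estimate.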
In step 72, pdf and P_fp values are evaluated. Pdf values g(w_i) are evaluated for nSymbols candidate watermark symbols based on equation (1). Based on pdf threshold and P_fp value LUTs, P_fp(w_i) is obtained in step 73 by means of interpolation or extrapolation.
In step 74 it is checked whether the value P_fp(w_i) is smaller than a first threshold T_min. If true, the corresponding candidate watermark symbol is detected as the embedded one and is output in step 79. If not true, i is incremented. In step 75, if i is smaller than or equal to nSymbols, the processing continues with step 711.

[0028] Otherwise, in step 76, i* is defined as i* = arg min_i{P_fp(w_i)}, and the minimal false positive probability value is defined as P_fp(w_i*) = min_i{P_fp(w_i)}.
In step 77 this minimal P_fp value over all candidate watermark symbols is then compared to a second threshold T_max > T_min. If the minimal value is smaller than T_max, the symbol resulting in the minimal value is determined to be the embedded one and is output in step 79. If the minimal value is not smaller than T_max, it is decided in step 78 that no watermark is present in the current section of the received signal.
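The single loop of Fig. 7 differs from the two-loop flow in that sorting, normalisation and P_fp evaluation are carried out per candidate, so later candidates are never processed once one falls below T_min. A minimal sketch of that control flow is given below. The callable `p_fp_of` stands in for steps 72 and 73 and is a placeholder supplied by the caller; the per-array σ estimate and the sign handling are assumptions.

```python
from typing import Callable, Optional, Sequence

import numpy as np


def detect_single_pass(correlations: Sequence[np.ndarray],
                       p_fp_of: Callable[[np.ndarray], float],
                       n_p: int,
                       t_min: float = 1e-8,
                       t_max: float = 1e-2) -> Optional[int]:
    """Return the 0-based index of the detected symbol, or None if no watermark."""
    best_index, best_p = None, float("inf")
    for i, r in enumerate(correlations):
        r = np.asarray(r, dtype=float)
        sigma = float(np.std(r))
        if sigma <= 0.0:
            raise ValueError("correlation array has zero standard deviation")
        # Steps 711/712: n_p largest-magnitude values, normalised by sigma.
        order = np.argsort(np.abs(r))[::-1][:n_p]
        w = r[order] / sigma
        # Steps 72/73: false positive probability for this candidate.
        p = p_fp_of(w)
        # Step 74: early exit, remaining candidates are not evaluated.
        if p < t_min:
            return i
        # Track the running minimum needed in step 76.
        if p < best_p:
            best_index, best_p = i, p
    # Steps 77/78: accept the minimum only if it is below T_max.
    return best_index if best_p < t_max else None
```

The early return is what saves computation compared with evaluating all nSymbols candidates before deciding.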

[0029] In the watermark decoder processing in Fig. 6, a received watermarked signal is re-sampled in an acquisition or receiving section step or stage 61, and thereafter may pass through a spectral shaping and/or whitening step or stage 62. In the following correlation step or stage 63 it is correlated section by section with the nSymbols reference patterns. A symbol detection or decision step or stage 64 determines whether or not a corresponding watermark symbol is present in the current signal section.
At watermark encoder side, a secret key was used to generate pseudo-random phases, from which related reference pattern bit sequences or symbols were generated and used for watermarking the audio signal. At watermark decoder side, these pseudo-random phases are generated in the same way in a corresponding step or stage 65, based on the same secret key. From the pseudo-random phases, related candidate reference patterns or symbols are generated in a reference pattern generation step or stage 66 and are used in step/stage 63 for checking whether or not a related watermark symbol is present in the current signal section of the received audio signal.
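The key-dependent generation of reference patterns in steps/stages 65 and 66 can be illustrated with the sketch below. Everything specific in it is an assumption made for illustration: the key is hashed to seed a NumPy generator (which is not a cryptographically secure source), one phase is drawn per frequency bin, and each pattern is obtained as the inverse real FFT of a unit-magnitude spectrum. The actual construction used by the encoder and decoder is not specified in this section.

```python
import hashlib

import numpy as np


def reference_patterns(secret_key: bytes, n_symbols: int, length: int) -> np.ndarray:
    """Return an (n_symbols, length) array of key-dependent reference patterns."""
    # Derive a reproducible seed from the secret key (illustrative choice).
    seed = int.from_bytes(hashlib.sha256(secret_key).digest()[:8], "big")
    rng = np.random.default_rng(seed)
    n_bins = length // 2 + 1
    patterns = []
    for _ in range(n_symbols):
        # Stage 65: pseudo-random phases, one per frequency bin.
        phases = rng.uniform(-np.pi, np.pi, size=n_bins)
        # Stage 66: unit-magnitude spectrum with these phases, no DC component.
        spectrum = np.exp(1j * phases)
        spectrum[0] = 0.0
        patterns.append(np.fft.irfft(spectrum, n=length))
    return np.stack(patterns)
```

Because the generator is seeded from the key alone, encoder and decoder obtain identical patterns when they use the same key, and different patterns otherwise.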
In the symbol detection or decision step or stage 64, a look-up table 67 for probability distribution function values and a look-up table 68 for false positive probabilities are used for the embedded watermark symbol determination as described above.
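How a P_fp value might be read from the two look-up tables by interpolation or extrapolation is sketched below. The sketch assumes that table 67 holds strictly increasing, positive pdf threshold values, that table 68 holds the matching positive P_fp values, and that interpolation is linear in the log10 domain with the end segments extended for extrapolation. These choices, and the function name, are illustrative rather than taken from the description.

```python
import numpy as np


def p_fp_from_lut(g_value: float, pdf_thresholds, p_fp_values) -> float:
    """Interpolate or extrapolate P_fp for a pdf value using log-domain LUTs."""
    th = np.log10(np.asarray(pdf_thresholds, dtype=float))
    pf = np.log10(np.asarray(p_fp_values, dtype=float))
    if th.size < 2 or th.size != pf.size or np.any(np.diff(th) <= 0):
        raise ValueError("need >= 2 matching entries with increasing thresholds")
    x = np.log10(g_value)
    if x < th[0]:
        # Extrapolate below the table by extending the first segment.
        slope = (pf[1] - pf[0]) / (th[1] - th[0])
        y = pf[0] + slope * (x - th[0])
    elif x > th[-1]:
        # Extrapolate above the table by extending the last segment.
        slope = (pf[-1] - pf[-2]) / (th[-1] - th[-2])
        y = pf[-1] + slope * (x - th[-1])
    else:
        # Inside the table: piecewise-linear interpolation.
        y = np.interp(x, th, pf)
    # A probability cannot exceed 1, so clip the extrapolated result.
    return float(min(1.0, 10.0 ** y))
```

Table entries equal to zero would need separate handling before taking logarithms; the sketch does not cover that case.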

[0030] The described processing can be carried out by a single processor or electronic circuit, or by several processors or electronic circuits operating in parallel and/or operating on different parts of the complete processing.
The instructions for operating the processor or the processors according to the described processing can be stored in one or more memories. The at least one processor is configured to carry out these instructions.

Claims

1. Method for determining (64) from sets of correlation result values whether a specific watermark symbol out of one or more candidate watermark symbols is embedded in a current section of a received audio signal, or whether no one of said candidate watermark symbols is embedded in said current section of said received audio signal, wherein said current section of said received audio signal was correlated (63) with at least one candidate reference pattern (66), each one of which representing one of said one or more candidate watermark symbols, said method including:

a) taking (711, 712) from the current set of correlation result values a group (n_p) of maximal values which together form a peak vector (v_i,w_i);

- obtaining (72) from the values (w_i) of said peak vector a value of a probability distribution function (pdf, g(w_i));

- obtaining (73) from said probability distribution function value a false positive probability value (P_fp(w_i)), which false positive probability value represents a probability that peaks resulting from correlation between a candidate reference pattern and non-watermarked audio signal content have a smaller pdf value than said probability distribution function value;

- determining (74) whether said false positive probability value (P_fp(w_i)) is smaller than a first threshold value (T_min) and, if true, determining (79, 64) that the current candidate watermark symbol is the watermark symbol present in said current section of said received audio signal;

- if not yet all candidate watermark symbols have been processed, select (75) the next candidate watermark symbol as said current candidate watermark symbol and go to a);

- otherwise, determining (76) a minimal value of said false positive probability function values for all candidate watermark symbols;

- comparing (77) said minimal value with a second threshold value (T_max) that is greater than said first threshold value (T_min);

- if said minimal value is smaller than said second threshold value (T_max), determining (79, 64) that the candidate watermark symbol corresponding to said minimal value is the watermark symbol present in said current section of said received audio signal;

- otherwise, determining (78) that no one of said candidate watermark symbols is present in said current section of said received audio signal.

2. Apparatus for determining (64) from sets of correlation result values whether a specific watermark symbol out of one or more candidate watermark symbols is embedded in a current section of a received audio signal, or whether no one of said candidate watermark symbols is embedded in said current section of said received audio signal, wherein said current section of said received audio signal was correlated (63) with at least one candidate reference pattern (66), each one of which representing one of said one or more candidate watermark symbols, said apparatus including means (64, 67, 68) configured to:

a) take (711, 712) from the current set of correlation result values a group (n_p) of maximal values which together form a peak vector (v_i,w_i);

- obtain (72) from the values (w_i) of said peak vector a value of a probability distribution function (pdf, g(w_i));

- obtain (73) from said probability distribution function value a false positive probability value (P_fp(w_i)), which false positive probability value represents a probability that peaks resulting from correlation between a candidate reference pattern and non-watermarked audio signal content have a smaller pdf value than said probability distribution function value;

- determine (74) whether said false positive probability value (P_fp(w_i)) is smaller than a first threshold value (T_min) and, if true, determine (79, 64) that the current candidate watermark symbol is the watermark symbol present in said current section of said received audio signal;

- if not yet all candidate watermark symbols have been processed, select (75) the next candidate watermark symbol as said current candidate watermark symbol and go to a);

- otherwise, determine (76) a minimal value of said false positive probability function values for all candidate watermark symbols;

- compare (77) said minimal value with a second threshold value (T_max) that is greater than said first threshold value (T_min);

- if said minimal value is smaller than said second threshold value (T_max), determine (79, 64) that the candidate watermark symbol corresponding to said minimal value is the watermark symbol present in said current section of said received audio signal;

- otherwise, determine (78) that no one of said candidate watermark symbols is present in said current section of said received audio signal.

3. Method according to claim 1, or apparatus according to claim 2, wherein said obtaining (72) of a probability distribution function value (pdf, g(w_i)) and said obtaining (73) of a false positive probability function value (P_fp(w_i)) is a corresponding calculation, or is a fetching of corresponding values from related look-up tables (67, 68).

4. Method according to claim 1 or 3, or apparatus according to claim 2 or 3, wherein said false positive probability function (P_fp(w_i)) is determined using a Monte Carlo simulation or a quasi-Monte Carlo simulation.

5. Method according to the method of one of claims 1, 3 and 4, or apparatus according to the apparatus of one of claims 2 to 4, wherein the correlation result values of said peak vector (v_i) are normalised (712, w_i) by a standard deviation value which is estimated either individually for each set of candidate watermark symbol correlation result values, or by averaging over sets of correlation result values,
and wherein said probability distribution function (pdf, g(w_i)) is defined as

where L is the correlation length and w = [w_0, w_1, ..., w_(n_p-1)].

6. Method according to the method of one of claims 3 to 5, or apparatus according to the apparatus of one of claims 3 to 5, wherein said probability distribution function values pdf and said false positive probability function values (P_fp) for said related look-up tables (67, 68) are calculated as follows:

- initialising (41) a pdf threshold value look-up table with K entries and a given pdf range [p_min,p_max] by

and initialising (42) a false positive probability function P_fp look-up table with K entries by P_fp,i = 0, 1 ≤ i ≤ K;

- carrying out the following steps for m = 1,...,M, M being a number of randomly generated vectors:

-- generating (43) normalised peak vectors uniformly distributed in a hyper-cube [w_min,w_max]^np using Monte Carlo or quasi-Monte Carlo processing, where [w_min,w_max] defines the range of generated random normalised peak values;

-- sorting (44) said normalised peak vectors so as to get a sorted peak vector w^(m);

-- calculating (45) pdf values g(w^(m));

-- carrying out the following steps for i = 1,...,K:

--- comparing (46) said calculated values g(w^(m)) with said threshold values th_i in said pdf threshold value look-up table;

--- if g(w^(m))<th_i, increasing (47) corresponding P_fp,i entries in said false positive probability function P_fp look-up table by g(w^(m));

- estimating (40) K final P_fp values by scaling the P_fp look-up table entries by

7. Method according to the method of one of claims 1 and 3 to 6, or apparatus according to the apparatus of one of claims 2 to 6, wherein said received audio signal is an audio signal received after passing an acoustic path.

8. Computer program product comprising instructions which, when carried out on a computer, perform the method according to one of claims 1 and 3 to 7.

REFERENCES CITED IN THE DESCRIPTION

This list of references cited by the applicant is for the reader's convenience only. It does not form part of the European patent document. Even though great care has been taken in compiling the references, errors or omissions cannot be excluded and the EPO disclaims all liability in this regard.

Patent documents cited in the description

WO2011141292A1 [0012] [0012] [0012]
EP2014066063W [0012] [0012] [0012]

Non-patent literature cited in the description

M. ARNOLD; X.M. CHEN; P. BAUM; U. GRIES; G. DOERR. A phase-based audio watermarking system robust to acoustic path propagation. IEEE Transactions on Information Forensics and Security, 2014, vol. 9 (3), 411-425 [0011]
DAVID; H.N. NAGARAJA. Order statistics. John Wiley & Sons, 2003 [0015]
R.E. CAFLISCH. Monte Carlo and quasi-Monte Carlo methods. Acta Numerica, 1998, vol. 7, 1-49 [0019]