Technical Field
[0001] The present invention relates to a similarity calculation method, a similarity calculation
apparatus, a program and a recording medium which perform pattern matching between
two vectors at a high speed.
Background Art
[0003] Hitherto, in order to detect, from an unknown input signal, a pattern which is
substantially the same as an already known pattern, or to evaluate the similarity between
two signals, judgment of similarity or coincidence of data has been conducted in all
technical fields to which signal processing is related, such as acoustic processing
technology, image processing technology, communication technology and radar technology.
In general, for detection of similar data, there is used a technique of expressing data
as feature vectors and judging similarity by the magnitude of the distance or angle
(correlation) between them.
[0004] In particular, the so-called full search, in which the similarities between the
input value and all candidates are determined and the data having the shortest distance
is then selected, is the simplest technique and has no detection leakage, and is frequently
used in the case where the data quantity is small. However, in the case where, e.g., the
portion similar to an input image or input voice (sound) is retrieved from a large quantity
of accumulated images or voices (sounds), the dimension of the feature vector per second
is large and retrieval is conducted with respect to feature vectors accumulated over
several tens to several hundreds of hours, so there is the problem that the retrieval
time becomes vast when such a simple full search is performed.
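For reference, the simple full search described above can be sketched as follows. This
is only a minimal illustration in Python; the function names squared_distance and
full_search and the sample data are assumptions made for this example and do not appear
in the cited art.

```python
def squared_distance(f, g):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(f, g))


def full_search(f, candidates):
    """Compare f against every candidate and return (index, squared distance)
    of the closest one. Every component of every candidate is visited."""
    best_index, best_dist = -1, float("inf")
    for index, g in enumerate(candidates):
        d = squared_distance(f, g)
        if d < best_dist:
            best_index, best_dist = index, d
    return best_index, best_dist


# Hypothetical sample data for illustration only.
candidates = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [5.0, 5.0, 5.0]]
result = full_search([1.0, 2.0, 2.5], candidates)
```

The cost grows with the number of candidates multiplied by the number of dimensions,
which is the retrieval-time problem pointed out above.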
[0005] On the other hand, in order to retrieve a large quantity of data in such cases that
exact-match retrieval of coded data, e.g., document retrieval, is conducted, a high speed
technique such as binary tree search or the hash method is used. In accordance with such
a technique, data are stored in advance in a sorted state, and comparison with branches
or tables which differ from the input data is omitted at the time of retrieval to thereby
realize high speed operation. However, in the case where a physical signal, e.g., image
or sound, is taken as the subject, distortion and/or noise essentially exist in the data,
so it is rare that coded data completely coincide with each other. As a result, in the
case where such a high speed technique is used, a large number of detection leakages
would take place. In addition, since the data is essentially multi-dimensional, there is
the problem that it is difficult to impose a unique ordering on the data in advance.
[0006] In view of the above, there is proposed, in Japanese Patent Publication Laid Open
No. H08-123460, a technology in which plural vectors close in distance are grouped and
represented by one representative vector at the time of data registration. At the time
of retrieval, the distance between the input vector and each representative vector is
first calculated, and comparison with all vectors within a group is conducted only for
the groups close in distance, to thereby permit similar (analogous) vector retrieval to
be performed at high speed while reflecting distortion of the vector in multiple
dimensions.
[0007] Further, there is proposed, in Japanese Patent Publication Laid Open No. 2001-134573,
a technology in which vectors are encoded and indexed by a short code to thereby suppress
an increase in the number of distance calculations and permit high speed similar
(analogous) data retrieval.
[0008] However, in the technology described in the above-described Japanese Patent Publication
Laid Open No. H08-123460, there was the problem that suitable grouping and selection of
representative vectors are required at the time of registration, so that the registration
operation becomes troublesome. Moreover, there was also the problem that, at the time of
retrieval, the registered vector closest to the input vector does not necessarily belong
to the group whose representative vector is closest to the input vector, so that the
operation for determining the groups to be retrieved becomes troublesome.
[0009] Further, in the technology described in the above-described Japanese Patent Publication
Laid Open No. 2001-134573, there was the problem that the distance relationship between
vectors is lost when encoding is performed, or that the distance relationship becomes
complicated in a non-additive or non-monotonic manner, so that the mechanism of registration
and/or retrieval becomes troublesome.
[0010] Here, since image and/or sound are essentially time series, it is desirable that
registration is conducted on a real time basis, and that time order can be reflected at
the time of retrieval. In other words, techniques in which the registration operation
changes the time-series order, and/or which require redistribution (reshuffling) of the
data or index of already registered data at the time of registration, as in the case of
the technologies described in the above-described Japanese Patent Publication Laid Open
No. H08-123460 and Japanese Patent Publication Laid Open No. 2001-134573, are not suitable
for retrieval of time-series data.
[0013] McNames, J., in the article "Rotated Partial Distance Search for Faster Vector
Quantization Encoding", discloses a method of reducing the amount of computation required
for vector quantization encoding. The partial distance search (PDS) is improved by
a principal components rotation (PCR) of the codebook.
[0014] That is, there is desired a mechanism by which retrieval is performed in a time far
shorter than that of full search while satisfying the conditions that
- (a) the structural simplicity and the robustness with respect to distortion of full search
are not lost,
- (b) registration and/or deletion are conducted in real time, and
- (c) registration or deletion requires no operation on other already registered data.
Disclosure of the Invention
[0015] The present invention has been proposed in view of such conventional actual circumstances,
and its object is to provide a similarity calculation method and a similarity calculating
apparatus which perform pattern matching between two vectors at a high speed while
satisfying the above-described conditions, a program for allowing computer to execute
the similarity calculation processing, and a computer readable recording medium where
such program is recorded.
[0016] To attain the above-described object, a similarity calculation method according to
the present invention is directed to a similarity calculation method as defined in
claim 1.
[0017] In such a similarity calculation method, the distance calculation between two vectors
is conducted in a hierarchical manner, and in the case where the integrated value of the
distances calculated up to a certain hierarchy is above a predetermined threshold value,
it is only detected that the integrated value of distances is above the threshold value,
without calculating the actual distance, to thereby allow the operation to be performed
at a high speed.
[0018] The predetermined transform operation is, e.g., a transform which rearranges the
order of the respective components constituting the input vector in accordance with the
magnitude of the dispersion of the respective components, Discrete Cosine Transform,
Discrete Fourier Transform, Walsh-Hadamard Transform or Karhunen-Loeve Transform.
[0019] Further, in order to attain the above-described object, a similarity calculating
apparatus according to the present invention is directed to a similarity calculating
apparatus as defined in claim 11.
[0020] Such a similarity calculating apparatus performs the distance calculation between
two vectors in a hierarchical manner, and in the case where the integrated value of the
distances calculated up to a certain hierarchy is above a predetermined threshold value,
it is only detected that the integrated value of distances is the threshold value or
larger, without calculating the actual distance, to thereby allow the operation to be
conducted at a high speed.
[0021] The predetermined transform operation is, e.g., a transform which rearranges the
order of the respective components constituting the input vector in accordance with the
magnitude of the dispersion of the respective components, Discrete Cosine Transform,
Discrete Fourier Transform, Walsh-Hadamard Transform, or Karhunen-Loeve Transform.
[0022] In addition, program according to the present invention serves to allow computer
to execute the above-described similarity calculation processing, and recording medium
according to the present invention is a computer readable recording medium where such
program is recorded.
[0023] Still further objects of the present invention and practical merits obtained by the
present invention will become more apparent from the description of the embodiments
which will be given below.
Brief Description of the Drawings
[0024]
FIG. 1 is a view for explaining outline of the configuration of a similarity vector
detecting apparatus in the first embodiment describing background art and being useful
for understanding the invention.
FIG. 2 is a flowchart for explaining processing at the time of vector registration
in the similarity vector detecting apparatus.
FIG. 3 is a flowchart for explaining processing at the time of vector retrieval in
the similarity vector detecting apparatus.
FIG. 4 is a view for intuitively explaining processing in the first embodiment.
FIG. 5 is a view showing an example in which there exists deviation in distribution
of vector within feature space.
FIG. 6 is a view for explaining outline of the configuration of a similarity vector
detecting apparatus in the second embodiment describing background art and being useful
for understanding the invention.
FIG. 7 is a flowchart for explaining processing at the time of vector registration
in the similarity vector detecting apparatus.
FIG. 8 is a flowchart for explaining processing at the time of vector retrieval in
the similarity vector detecting apparatus.
FIG. 9 is a view for explaining outline of the configuration of a similarity vector
detecting apparatus in the third embodiment.
FIG. 10 is a flowchart for explaining processing at the time of vector registration
in the similarity vector detecting apparatus.
FIG. 11 is a flowchart for explaining processing at the time of vector retrieval in
the similarity vector detecting apparatus.
FIG. 12 is a flowchart for explaining an example of processing for extracting acoustic
feature vector from acoustic signal.
FIG. 13 is a view for explaining an example of processing for extracting acoustic
feature vector from acoustic signal.
FIG. 14 is a view for explaining transform encoding in acoustic signal.
FIG. 15 is a flowchart for explaining an example of processing for extracting acoustic
feature vector from encoded acoustic signal.
FIG. 16 is a view for explaining an example of processing for extracting acoustic
feature vector from encoded acoustic signal.
FIG. 17 is a flowchart for explaining an example of processing for extracting image
feature vector from video signal.
FIG. 18 is a view for explaining an example of processing for extracting image feature
vector from video signal.
FIG. 19 is a flowchart for explaining another example of processing for extracting
image feature vector from video signal.
FIG. 20 is a view for explaining a further example of processing for extracting image
feature vector from video signal.
FIG. 21 is a flowchart for explaining a further example of processing for extracting
image feature vector from encoded video signal.
FIG. 22 is a view for explaining a further example of processing for extracting image
feature vector from encoded video signal.
Best Mode for Carrying Out the Invention
[0025] Explanation will be given below in detail with reference to the attached drawings
in connection with practical embodiments to which the present invention is applied.
In this embodiment, the present invention is applied to a similarity vector detection
method and an apparatus therefor which detect, at a high speed, vectors similar to
input vector from plural registered vectors.
[0026] Specifically, in the similarity vector detection method and the apparatus therefor
of this embodiment, in calculating the distance between two vectors, the distance is
calculated when it is below a predetermined threshold value, whereas when it is above
the predetermined threshold value, it is only detected that the distance is larger than
the threshold value, without calculating the actual distance, to thereby allow the
operation of similarity vector detection to be conducted at a high speed. It is to be
noted that, in the similarity vector detecting apparatus in this embodiment, in the case
where the distance is above the threshold value, -1 is assumed to be outputted for
convenience.
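[0027] Hereinafter, the two vectors f and g between which the distance is calculated are
represented by the following formulas (1) and (2).
$$f = (f[1], f[2], \cdots, f[N])^{t} \qquad (1)$$
$$g = (g[1], g[2], \cdots, g[N])^{t} \qquad (2)$$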

[0028] Here, in the formula (1), f[1], f[2], ··· represent respective components of vector
f. In the formula (2), g[1], g[2], ··· represent respective components of vector g.
In addition, t represents transposition and N represents dimension of vector.
(1) First embodiment
[0029] Outline of the configuration of the similarity vector detecting apparatus in the
first embodiment is shown in FIG. 1. As shown in FIG. 1, the similarity vector detecting
apparatus 1 serves to input vector f and vector g to output square distance between
the vectors (or -1), and is composed of a recording unit 10, a hierarchical distance
calculating unit 11, and a threshold value judgment unit 12.
[0030] The processing at the time of registration in this similarity vector detecting apparatus
1 will be explained by using the flowchart of FIG. 2. First, at step S1, the recording
unit 10 (FIG. 1) inputs in advance a vector g to be registered. In general, there are
plural vectors g, and in many cases their number is vast. Further, at the subsequent step
S2, the recording unit 10 records the inputted vector g.
[0031] As stated above, in the first embodiment, since it is unnecessary to conduct special
operation at the time of registration, the apparatus is simple and is suitable for
processing on the real time basis. In this example, the recording unit 10 is, e.g.,
magnetic disc, optical disc or semiconductor memory, etc.
[0032] Subsequently, the processing at the time of retrieval in the similarity vector detecting
apparatus 1 will be explained by using the flowchart of FIG. 3. First, at step S10,
the threshold value judgment unit 12 (FIG. 1) sets threshold value S of distance.
At the subsequent step S11, the hierarchical distance calculating unit 11 inputs vector
f, and acquires one vector g recorded at the recording unit 10.
[0033] Subsequently, at step S12, the hierarchical distance calculating unit 11 sets the
component number i serving as an internal variable to 1, and sets the integrated value
sum of distance to 0. At step S13, the integrating operation indicated by the following
formula (3) is performed between the i-th component f[i] of vector f and the i-th
component g[i] of vector g.
$$\mathrm{sum} = \mathrm{sum} + (f[i] - g[i])^{2} \qquad (3)$$

[0034] At step S14, the threshold value judgment unit 12 discriminates whether or not the
integrated value sum is smaller than the threshold value S. In the case where the integrated
value sum is smaller than the threshold value S (Yes), processing proceeds to step S16.
In the case where the integrated value sum is the threshold value S or larger (No), the
threshold value judgment unit 12 outputs -1 at step S15 to complete processing. Here, as
described above, the outputted -1 is a convenient numerical value indicating that the
distance between the inputted vector f and the acquired vector g is above the threshold
value S, and this vector g is nullified. As stated above, the threshold value judgment
unit 12 sets the threshold value S and, in the case where the integrated value sum reaches
the threshold value S at an intermediate hierarchy of the integrating operation, truncates
the integrating operation at the hierarchical distance calculating unit 11 to thereby
realize high speed processing.
[0035] At step S16, it is discriminated whether or not the component number i is the number
of dimensions N of vector f and vector g or smaller. In the case where the component
number i is N or smaller (Yes), i is incremented at step S17 to return to step S13.
On the other hand, in the case where the component number i is larger than N (No),
the integrating operation has been completed up to the last component of vector f and
vector g, so the threshold value judgment unit 12 outputs the integrated value sum at
step S18 to complete processing. It is to be noted that the integrated value sum at this
time is the square of the distance between the vectors.
[0036] While the processing with respect to one registered vector g has been indicated above
in the flowchart of FIG. 3, similar processing is performed with respect to registered
all vectors g in practice to output, as vector similar to vector f, all vectors g
in which integrated value sum of distances with respect to vector f is below the threshold
value S.
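The retrieval flow of FIG. 3 can be sketched as follows. This is a minimal illustration
in Python under stated assumptions, not the actual implementation of the apparatus: the
function names hierarchical_sq_distance and retrieve_similar and the sample vectors are
hypothetical, and the loop simply stops after the N-th component.

```python
def hierarchical_sq_distance(f, g, S):
    """Integrate (f[i] - g[i])**2 component by component (formula (3)).

    Returns the squared distance if it stays below threshold S,
    or -1 as soon as the running sum reaches S (steps S14 and S15).
    """
    total = 0.0
    for fi, gi in zip(f, g):
        total += (fi - gi) ** 2
        if total >= S:          # sum is threshold S or larger: truncate
            return -1
    return total                # all N components were integrated


def retrieve_similar(f, registered, S):
    """Apply the thresholded distance to every registered vector g and
    keep (index, squared distance) for those that were not nullified."""
    hits = []
    for index, g in enumerate(registered):
        d = hierarchical_sq_distance(f, g, S)
        if d != -1:
            hits.append((index, d))
    return hits


# Hypothetical registered vectors, kept in registration (time) order.
registered = [[0.0, 0.0, 0.0, 0.0],
              [1.0, 1.0, 1.0, 1.0],
              [9.0, 0.0, 0.0, 0.0]]
hits = retrieve_similar([1.0, 1.0, 1.0, 0.0], registered, S=4.0)
```

In this example the third registered vector is nullified after its first component alone,
while the first two are integrated to the end and reported with their squared distances.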
[0037] Explained intuitively, the processing in the first embodiment corresponds to the
following. Among the large number of registered vectors indicated by black circles in
FIG. 4, the precise distance is calculated only for registered vectors whose distance
from the input vector, indicated by × in the figure, is within a hypersphere of radius
√S, and a registered vector outside this range is nullified at the point in time when
the integrated value of the squared distances along the respective axes reaches the
squared radius S.
[0038] It is to be noted that while the square distance between vectors has been used in
the above explanation, a similar technique may be used with an arbitrary distance measure
without being limited to the square distance. It should be noted that, in the case where
the square distance is used, there is no possibility of erroneous nullification, because
the integrated value sum increases monotonically as the distances between the respective
components are integrated. Moreover, since the sum total of the distances between the
respective components coincides with the distance between the vectors, exactly the same
distances as in the simple full search method are outputted for vectors f and g whose
distance is within the threshold value √S, so that no error takes place.
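The absence of erroneous nullification can be written out as follows. The symbol s_k is
introduced here only for this explanation (it does not appear in the original text) and
denotes the integrated value sum after the k-th component has been integrated.

```latex
s_k = \sum_{i=1}^{k} \bigl(f[i] - g[i]\bigr)^{2},
\qquad
s_k - s_{k-1} = \bigl(f[k] - g[k]\bigr)^{2} \ge 0
\;\Longrightarrow\;
s_1 \le s_2 \le \cdots \le s_N = \lVert f - g \rVert^{2}.
```

Hence, if s_k is S or larger for some k, the full squared distance s_N is also S or larger,
so a vector truncated at an intermediate hierarchy would also have been rejected by full
search using the same threshold value.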
[0039] Further, in the case of this technique, since it is unnecessary to prepare reference
table, etc. which may break the time series relationship, updating and/or deletion
of data can be conducted in accordance with time series order, so processing and/or
management are easy. In addition, it is also easily possible to conduct retrieval
in accordance with time series order, or to designate time series range to be retrieved.
(2) Second embodiment
[0040] In the above-described first embodiment, the threshold value S of distance is set,
thereby making it possible to conduct retrieval equivalent to full search at a high speed.
However, in the case of this technique, the component from which integration begins
depends upon the arrangement order of the vector components, so the retrieval speed
differs according to this arrangement order. For example, in such cases that there is a
bias in the distribution of vectors within the feature space as shown in FIG. 5, the
retrieval speed changes greatly depending upon which of the f[1] axis and the f[2] axis
is integrated first. In this example, evaluating the f[2] axis first results in fewer
unnecessary integrations and thereby realizes high speed operation.
[0041] In view of the above, in the second embodiment which will be explained below, as
indicated by the following formulas (4) and (5), the input vector f and the registered
vector g are multiplied by a normal orthogonal transform matrix U to perform an orthogonal
transform operation, and retrieval is conducted in order of significance by using the
orthogonally transformed vectors f' and g' to thereby allow retrieval to be conducted
at a higher speed.
$$f' = U f \qquad (4)$$
$$g' = U g \qquad (5)$$

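[0042] It is to be noted that the square distance d^2 between the two vectors g and f is
not changed by the normal orthogonal transform matrix U, as indicated by the following
formula (6).
$$d^{2} = \lVert f' - g' \rVert^{2} = (f - g)^{t} U^{t} U (f - g) = (f - g)^{t} (f - g) = \lVert f - g \rVert^{2} \qquad (6)$$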
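The invariance of the square distance can be checked numerically. The following minimal
sketch in Python uses a two-dimensional rotation as one simple example of a normal
orthogonal matrix U; the helper names apply_matrix and squared_distance and the sample
vectors are assumptions made only for this illustration.

```python
import math


def apply_matrix(U, v):
    """Multiply matrix U (list of rows) by column vector v."""
    return [sum(u * x for u, x in zip(row, v)) for row in U]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# A 2-D rotation is one simple orthonormal matrix U (U^t U = I).
theta = math.radians(30.0)
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

f = [3.0, 1.0]
g = [0.5, -2.0]
f_prime = apply_matrix(U, f)   # formula (4): f' = U f
g_prime = apply_matrix(U, g)   # formula (5): g' = U g

d2_before = squared_distance(f, g)
d2_after = squared_distance(f_prime, g_prime)
```

Up to floating-point rounding, d2_before and d2_after agree, so the threshold value S can
be applied to the transformed vectors without change.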

[0043] Outline of the configuration of the similarity vector detecting apparatus in the
second embodiment is shown in FIG. 6. As shown in FIG. 6, the similarity vector detecting
apparatus 2 serves to input vectors f and g to output distance between the vectors
(or -1), and is composed of vector transform units 20, 21, a recording unit 22, a
hierarchical distance calculating unit 23, and a threshold value judgment unit 24.
Here, the vector transform units 20, 21 serve to respectively implement similar transform
operations to vectors g and f. In addition, the recording unit 22 is, e.g., magnetic
disc, optical disc or semiconductor memory, etc.
[0044] The processing at the time of registration in this similarity vector detecting apparatus
2 will be explained by using the flowchart of FIG. 7. First, at step S20, the vector
transform unit 20 (FIG. 6) inputs registered vector g in advance. At the subsequent
step S21, vector g is transformed as indicated by the above-described formula (5)
to generate vector g'. Further, at step S22, the recording unit 22 records the transformed
vector g'.
[0045] Next, the processing at the time of retrieval in the similarity vector detecting
apparatus 2 will be explained by using the flowchart of FIG. 8. First, at step S30,
the threshold value judgment unit 24 (FIG. 6) sets threshold value S of distance.
At the subsequent step S31, the vector transform unit 21 inputs vector f and the hierarchical
distance calculating unit 23 acquires one vector g' recorded at the recording unit
22.
[0046] Subsequently, at step S32, the vector transform unit 21 transforms vector f as indicated
by the above-described formula (4) to generate vector f'.
[0047] At step S33, the hierarchical distance calculating unit 23 sets the component number
i serving as an internal variable to 1, and sets the integrated value sum of distance to
0. At step S34, the integrating operation indicated by the following formula (7) is
performed between the i-th component f'[i] of vector f' and the i-th component g'[i]
of vector g'.
$$\mathrm{sum} = \mathrm{sum} + (f'[i] - g'[i])^{2} \qquad (7)$$

[0048] At step S35, the threshold value judgment unit 24 discriminates whether or not integrated
value sum is smaller than threshold value S. In the case where integrated value sum
is smaller than threshold value S (Yes), processing proceeds to step S37. In the case
where integrated value sum is threshold value S or larger (No), the threshold value
judgment unit 24 outputs -1 at step S36 to complete processing.
[0049] At step S37, it is discriminated whether or not the component number i is the number
of dimensions N of vector f' and vector g' or smaller. In the case where the component
number i is N or smaller (Yes), i is incremented at step S38 to return to step S34.
On the other hand, in the case where the component number i is larger than N (No),
the integrating operation has been completed up to the last component of vectors f' and
g', so the threshold value judgment unit 24 outputs the integrated value sum at step S39
to complete processing. It is to be noted that the integrated value sum at this time
is the square of the distance between the vectors.
[0050] While the processing with respect to one registered vector g' has been indicated
above in the flowchart of FIG. 8, there is employed in practice an approach to perform
similar processing with respect to registered all vectors g' to output, as vector
similar to vector f', all vectors g' in which integrated value sum of distance with
respect to vector f' is below the threshold value S.
[0051] Here, while various matrixes may be used as the above-described normal orthogonal
transform matrix U, explanation will be given below by taking four examples in practical
sense.
(2-1) Practical example of orthogonal transform
(2-1-1)
[0052] A sequential matrix (permutation matrix) is the simplest orthogonal transform. This
sequential matrix simply rearranges the order of the vector components. For example,
a sequential matrix P of the eighth order is expressed in a form as indicated by the
following formula (8).

[0053] In the case where the spread of the respective components of the vectors differs as
in the case of the above-described FIG. 5, it is obvious that the larger the dispersion
of a component is, the larger its contribution to the distance becomes. Accordingly,
in determining the order of sequencing, it is optimum to prepare in advance a sufficient
number (I) of sample vectors g_i and to set a sequential matrix that arranges the components
in descending order of the dispersion vector V calculated by the following formula (9).

[0054] It is to be noted that the orthogonal transform using this sequential matrix is effective
in such cases that the spreads of the respective vector components differ, and is high
in speed because rearranging the components is sufficient, so that multiplication/division
and/or conditional branches are not necessary.
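Determining the order of sequencing from sample vectors can be sketched as follows. This
is a minimal illustration in Python under stated assumptions: the dispersion is computed
here as the population variance of each component, which may differ in normalization from
formula (9), and the function names and sample values are hypothetical.

```python
def component_variances(samples):
    """Per-component dispersion (population variance) of the sample vectors."""
    count = len(samples)
    dims = len(samples[0])
    means = [sum(v[k] for v in samples) / count for k in range(dims)]
    return [sum((v[k] - means[k]) ** 2 for v in samples) / count
            for k in range(dims)]


def variance_order(samples):
    """Component indices sorted by descending dispersion."""
    variances = component_variances(samples)
    return sorted(range(len(variances)), key=lambda k: variances[k], reverse=True)


def reorder(vector, order):
    """Apply the sequencing; equivalent to multiplying by the sequential matrix."""
    return [vector[k] for k in order]


# Hypothetical samples: component 1 varies widely, component 0 hardly at all.
samples = [[1.0, -8.0, 2.0],
           [1.1,  7.0, 3.0],
           [0.9,  0.0, 1.0],
           [1.0,  9.0, 2.0]]
order = variance_order(samples)
g_prime = reorder(samples[0], order)
```

The order is determined once from the samples; after that, transforming a vector is only an
index lookup, which is consistent with the remark that no multiplication is needed.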
(2-1-2)
[0055] In a feature quantity where the correlation between adjacent components is large,
such as an image feature quantity or an acoustic feature quantity, the energy in the case
where the feature vector is regarded as a discrete signal is concentrated in the lower
frequency components.
[0057] Here, since a fast transform algorithm can be used for the Discrete Cosine Transform
or Discrete Fourier Transform, and since it is unnecessary to hold the entire transform
matrix, the memory use quantity and/or the operation speed in the case where the operation
is realized by a computer are far more advantageous as compared to the case where the
full matrix calculation is performed.
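The concentration of energy in the low frequency components can be illustrated with the
orthonormal DCT-II. The following Python sketch computes the transform directly from its
definition for clarity; it is not a fast algorithm, and the function name dct_orthonormal
and the sample feature vector are assumptions made for this example.

```python
import math


def dct_orthonormal(x):
    """Orthonormal DCT-II computed directly from its definition (O(N^2))."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out


# Hypothetical feature vector whose adjacent components are strongly correlated.
feature = [10.0, 11.0, 12.0, 13.0, 12.0, 11.0, 10.0, 9.0]
coeffs = dct_orthonormal(feature)

total_energy = sum(c * c for c in coeffs)
low_energy_share = coeffs[0] ** 2 / total_energy
```

Because the transform is orthonormal, total_energy equals the energy of the original
vector, and for this smooth sample most of it falls on the lowest frequency coefficient.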
(2-1-3)
[0058] The Walsh-Hadamard Transform is orthogonal transform where respective elements of
transform matrix are constituted only by ±1, and is suitable for high speed transform
because multiplication is not required at the time of transform. Here, sequency is
used as concept close to frequency and components are arranged in order from low sequency
so that high speed of distance calculation can be realized with respect to vectors
where correlation relationship between adjacent components is large similarly to the
above-described Discrete Cosine Transform or Discrete Fourier Transform.
[0059] The Walsh-Hadamard Transform matrix is constituted in accordance with the signs of
the Fourier Transform matrix, or is constituted by a recursive expansion operation of the
matrix. As an example, the Walsh-Hadamard Transform matrix W of the eighth order arranged
in order of sequency is indicated by the following formula (14).
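One way to obtain a sequency-ordered matrix of the eighth order by recursive expansion is
sketched below in Python. The construction shown (Sylvester recursion followed by sorting
the rows by their number of sign changes) is an assumption chosen for illustration, and the
matrix is left unnormalized with elements ±1; the normalization used in formula (14) is not
assumed here.

```python
def hadamard(order):
    """Sylvester construction of a +/-1 Hadamard matrix; order must be a power of 2."""
    H = [[1]]
    while len(H) < order:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def sign_changes(row):
    """Sequency of a row: number of sign changes between adjacent elements."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def walsh_sequency(order):
    """Rows of the Hadamard matrix rearranged in ascending sequency."""
    return sorted(hadamard(order), key=sign_changes)


W = walsh_sequency(8)
sequencies = [sign_changes(row) for row in W]

# The transform needs only additions and subtractions (elements are +/-1).
x = [4, 4, 4, 4, 3, 3, 3, 3]
y = [sum(w * v for w, v in zip(row, x)) for row in W]
```

For the slowly varying sample x, only the two lowest-sequency outputs are non-zero, which
is the property that lets distance integration start from the most significant components.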

(2-1-4)
[0060] In the case where a sufficient number of sample vectors are collected in advance, and
where a certain amount of cost can be tolerated for the transform operation, it is effective
to use the optimum Karhunen-Loeve Transform (hereinafter referred to as KL transform) as
the orthogonal transform.
[0061] The KL transform matrix T is the eigenmatrix obtained by decomposing the dispersion
matrix V of the sample vectors into eigenvalues, and is defined as indicated by the
following formula (15) in the case where the eigenvalues are assumed to be λ_1, ···, λ_N.

[0062] Here, the KL transform is an orthogonal transform which completely removes the
correlation between the respective components, and the dispersion of each transformed
vector component is the eigenvalue λ_i. Accordingly, the KL transform matrix T is
constituted so that the eigenvalues λ_i are arranged in descending order of magnitude,
whereby all components are integrated to remove overlapping information, and integration
of distances can then be performed starting from the axis where the dispersion is the
largest.
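The properties described above can be summarized as follows under a common convention,
assumed here only for illustration, in which the rows of T are the unit-length eigenvectors
of the symmetric dispersion matrix V; formula (15) itself may be written with a different
transposition or normalization.

```latex
T V T^{t} = \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_N),
\qquad
\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N \ge 0,
\qquad
T^{t} T = I .
```

Under this convention the dispersion matrix of the transformed vectors g' = T g is the
diagonal matrix Λ, so the components are uncorrelated and the i-th component has
dispersion λ_i.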
[0063] It is to be noted that, in the technique using this KL transform, the operation cost
is high, since it is in principle necessary to hold the KL transform matrix T over all
dimensions at the time of operation, and to perform a full-order matrix operation with
respect to all vectors. However, since this operation is performed at the time of
registration, it cannot be said that the time required for retrieval processing, for
which high speed is required, is particularly increased.
[0064] In addition, although slight degradation of accuracy is involved, there is employed
an approach to extract only vector components having large eigen value to hold them
without holding vector components having small eigen value to thereby compress vector
itself, thus also making it possible to reduce memory area and/or data read-in time
of the recording unit 22 (FIG. 6).
(3) Third embodiment
[0065] While the retrieval operation is speeded up by speeding up the distance calculation
in the above-described first and second embodiments, the data read-in time from the
recording unit, e.g., a hard disc, is also a cause of large overhead in performing
retrieval.
[0066] Here, the KL transform in the above-described second embodiment corresponds to the
analysis method called principal component analysis in the field of multivariate analysis,
and is an operation for extracting the main components constituting a vector. In view of
the above, in the third embodiment which will be explained below, the main components of
the transformed vector g' obtained in the second embodiment are recorded as index vector
g_1, and the remaining components are recorded as detail vector g_2. At the time of
retrieval, distance calculation is first performed with reference to index vector g_1,
and detail vector g_2 is acquired to further perform distance calculation only in the case
where that result is smaller than the threshold value S, thereby making it possible to
shorten the data read-in time.
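The two-stage retrieval outlined above can be sketched as follows. This is a minimal
illustration in Python under stated assumptions: the in-memory lists index_store and
detail_store stand in for the index recording unit and the detail recording unit, the
counter detail_reads is introduced only to show how many detail vectors are fetched, and
all names and sample values are hypothetical.

```python
def split(vector, M):
    """Split a transformed vector into index part (first M) and detail part."""
    return vector[:M], vector[M:]


def integrate(a, b, total, S):
    """Add squared component differences to total; return -1 once total >= S."""
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total >= S:
            return -1
    return total


def two_stage_retrieve(f_prime, index_store, detail_store, M, S):
    """Return (hits, number of detail vectors actually read)."""
    f1, f2 = split(f_prime, M)
    hits, detail_reads = [], 0
    for n, g1 in enumerate(index_store):
        partial = integrate(f1, g1, 0.0, S)
        if partial == -1:
            continue                      # nullified without touching detail data
        detail_reads += 1                 # detail vector is read only now
        d = integrate(f2, detail_store[n], partial, S)
        if d != -1:
            hits.append((n, d))
    return hits, detail_reads


# Hypothetical already-transformed vectors g' (large-dispersion components first).
transformed = [[5.0, 1.0, 0.2, 0.1],
               [0.0, 0.0, 0.1, 0.0],
               [0.5, 0.5, 2.0, 0.0]]
M = 2
index_store = [split(g, M)[0] for g in transformed]
detail_store = [split(g, M)[1] for g in transformed]
hits, detail_reads = two_stage_retrieve([0.0, 0.0, 0.0, 0.0],
                                        index_store, detail_store, M, S=1.0)
```

In this example the first vector is rejected from its index part alone, so only two of the
three detail vectors are read, and one vector is finally reported as similar.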
[0067] Outline of the configuration of the similarity vector detecting apparatus in the
third embodiment is shown in FIG. 9. As shown in FIG. 9, the similarity vector detecting
apparatus 3 serves to input vector f and vector g to output square distance between
vectors (or -1), and is composed of vector transform units 30, 31, an index recording
unit 32, a detail recording unit 33, a hierarchical distance calculating unit 34,
and a threshold value judgment unit 35. Here, the vector transform units 30, 31 serve
to respectively apply, to the vectors g and f, a transform operation similar to that of
the above-described second embodiment. In addition, the index recording unit 32 and the
detail recording unit 33 are, e.g., magnetic disc, optical disc or semiconductor memory,
etc.
[0068] The processing at the time of registration in this similarity vector detecting apparatus
3 will be explained by using the flowchart of FIG. 10. First, at step S40, the vector
transform unit 30 (FIG. 9) inputs registered vector g in advance. At the subsequent
step S41, vector g is transformed as indicated by the above-described formula (5)
to generate vector g'. Further, the vector transform unit 30 divides vector g' into index
vector g_1 having a predetermined number M (1 ≤ M < N) of components, taken in order from
the component having the smallest component number, i.e., the components having large
dispersion or eigenvalue in the above-described transform operations or the low frequency
components, and detail vector g_2 having the remaining components. Further, at step S42,
the index recording unit 32 records index vector g_1. At step S43, the detail recording
unit 33 records detail vector g_2.
[0069] Next, the processing at the time of retrieval in the similarity vector detecting
apparatus 3 will be explained by using the flowchart of FIG. 11. First, at step S50,
the threshold value judgment unit 35 (FIG. 9) sets threshold value S of distance.
At the subsequent step S51, the vector transform unit 31 inputs vector f, and the
hierarchical distance calculating unit 34 acquires one index vector g_1 recorded at the
index recording unit 32.
[0070] Subsequently, at step S52, the vector transform unit 31 transforms vector f as indicated
by the above-described formula (4) to generate vector f'. Further, the vector transform
unit 31 divides it into index vector f
1 having a predetermined number M (1≤ M<N) of components and detail vector f
2 having the remaining component in order from component having small component number.
[0071] At step S53, the hierarchical distance calculating unit 34 sets component number
i, serving as an internal variable, to 1 and sets the integrated value sum of distance to 0.
At step S54, the integrating operation indicated by the following formula (16) is performed
between the i-th component f'[i] of vector f' and the i-th component g'[i] of vector
g'.
$$\mathrm{sum} = \mathrm{sum} + (f'[i] - g'[i])^2 \qquad (16)$$
[0072] At step S55, the threshold value judgment unit 35 discriminates whether or not integrated
value sum is smaller than threshold value S. In the case where integrated value sum
is smaller than threshold value S (Yes), processing proceeds to step S57. In the case
where integrated value sum is threshold value S or larger (No), the threshold value
judgment unit 35 outputs -1 at step S56 to complete processing. Here, as described
above, the outputted value -1 is a convenient numerical value indicating that the distance
is at or above the threshold value, so that the comparison is nullified.
[0073] At step S57, it is discriminated whether or not component number i is equal to or
smaller than the number of dimensions M of index vector f1 and index vector g1. In the case
where the component number i is M or smaller (Yes), i is incremented at step S58 and
processing returns to the step S54. On the other hand, in the case where component number i
is larger than M (No), the hierarchical distance calculating unit 34 acquires one detail
vector g2 recorded at the detail recording unit 33.
[0074] At step S60, the hierarchical distance calculating unit 34 performs integrating operation
as indicated by the above-described formula (16) between the i-th component f'[i]
of vector f' and the i-th component g'[i] of vector g'.
[0075] At step S61, the threshold value judgment unit 35 discriminates whether or not integrated
value sum is smaller than threshold value S. In the case where the integrated value
sum is smaller than threshold value S (Yes), processing proceeds to step S63. In the
case where integrated value sum is threshold value S or larger (No), the threshold
value judgment unit 35 outputs -1 at step S62 to complete processing.
[0076] At step S63, it is discriminated whether or not the component number i is equal to or
smaller than the number of dimensions N of vector f' and vector g'. In the case where the
component number i is N or smaller (Yes), i is incremented at step S64 and processing
returns to the step S60. On the other hand, in the case where the component number i is
larger than N (No), integration has been completed up to the last component of vector g',
so the threshold value judgment unit 35 outputs integrated value sum at step S65 to complete
processing. At this time, the integrated value sum is equal to the square of the distance
between the vectors.
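The flow of steps S53 to S65 for one registered vector may be sketched in Python as follows. This is only an illustration under stated assumptions: formula (16) is assumed to accumulate the squared difference of the i-th components, which is consistent with the final integrated value being the squared distance; the function name hierarchical_sq_distance and the callable load_g2, which stands in for reading the detail recording unit 33, are hypothetical.

```python
from typing import Callable, Sequence


def hierarchical_sq_distance(
    f1: Sequence[float],
    f2: Sequence[float],
    g1: Sequence[float],
    load_g2: Callable[[], Sequence[float]],
    threshold: float,
) -> float:
    """Return the squared distance between f' and g', or -1 as soon as the
    running sum reaches the threshold S."""
    total = 0.0
    # Index stage (steps S54 to S58): components 1..M.
    for a, b in zip(f1, g1):
        total += (a - b) ** 2          # assumed form of formula (16)
        if total >= threshold:         # step S55, "No" branch
            return -1                  # step S56
    # The detail vector is fetched only when the index stage did not reject.
    g2 = load_g2()
    # Detail stage (steps S60 to S64): components M+1..N.
    for a, b in zip(f2, g2):
        total += (a - b) ** 2
        if total >= threshold:         # step S61, "No" branch
            return -1                  # step S62
    return total                       # step S65
```

Passing the detail vector as a callable rather than as data makes explicit that a vector rejected in the index stage never triggers a read of its detail part.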
[0077] While the processing with respect to one registered vector g' is indicated above
in the flowchart of FIG. 11, similar processing is performed with respect to all registered
vectors g' in practice to output, as vector similar to vector f', all vectors g' in
which integrated value sum of distances with respect to vector f' is below the threshold
value S.
[0078] In the above-described third embodiment, as compared to the first and second embodiments,
memory capacity and accuracy are unchanged, and operating speed changes little.
However, in the case where most comparisons are nullified at the stage of index vector
g1, it is unnecessary to acquire detail vector g2 for those comparisons, so that the data
access overhead for the detail vectors is avoided.
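The saving in data access can be made concrete with a small Python sketch that searches all registered vectors and counts how many detail vectors are actually read. The function name search, the list-based stores and the returned counter are assumptions introduced for illustration, and the accumulation of squared component differences is likewise assumed.

```python
from typing import List, Sequence, Tuple


def search(
    f_prime: Sequence[float],
    index_store: List[Sequence[float]],
    detail_store: List[Sequence[float]],
    m: int,
    threshold: float,
) -> Tuple[List[Tuple[int, float]], int]:
    """Return (hits, detail_reads): hits are (record number, squared distance)
    pairs below the threshold; detail_reads counts detail vectors fetched."""
    f1, f2 = f_prime[:m], f_prime[m:]
    hits: List[Tuple[int, float]] = []
    detail_reads = 0
    for rid, g1 in enumerate(index_store):
        total = 0.0
        rejected = False
        for a, b in zip(f1, g1):
            total += (a - b) ** 2
            if total >= threshold:
                rejected = True
                break
        if rejected:
            continue                  # detail vector is never read
        g2 = detail_store[rid]        # simulated access to the detail store
        detail_reads += 1
        for a, b in zip(f2, g2):
            total += (a - b) ** 2
            if total >= threshold:
                rejected = True
                break
        if not rejected:
            hits.append((rid, total))
    return hits, detail_reads
```

When most registered vectors are rejected in the index stage, detail_reads stays far below the number of registered vectors, which is the effect described above.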
[0079] While it is assumed in the above-described explanation that vector is divided into
two stages of index vector and detail vector, it is a matter of course that there
can be made expansion to multi-stage, such as, for example, index vector is further
similarly divided into index vector of high order and detailed index vector so that
three-stage configuration is provided.
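A multi-stage expansion of this kind may be sketched in Python as follows, with one loader per stage so that a later stage is read only when every earlier stage stayed below the threshold. The name multistage_sq_distance and the list of loader callables are hypothetical, and squared component differences are assumed as the accumulated distance.

```python
from typing import Callable, List, Sequence, Tuple


def multistage_sq_distance(
    f_parts: List[Sequence[float]],
    g_loaders: List[Callable[[], Sequence[float]]],
    threshold: float,
) -> Tuple[float, int]:
    """Return (squared distance or -1, number of stages read)."""
    total = 0.0
    stages_read = 0
    for f_part, load in zip(f_parts, g_loaders):
        g_part = load()               # fetch this stage only when reached
        stages_read += 1
        for a, b in zip(f_part, g_part):
            total += (a - b) ** 2
            if total >= threshold:
                return -1, stages_read
    return total, stages_read
```

With three loaders this corresponds to the three-stage configuration of high-order index vector, detailed index vector and detail vector.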
(4) Extraction of feature vector
[0080] Explanation will be given below in connection with a technique of extracting feature
vector from acoustic signal or video signal. In a manner described later, acoustic
feature vector and/or image feature vector are extracted to use them as the above-described
vectors f and g, thereby making it possible to retrieve, at a high speed, similar
acoustic or video signal from registered acoustic signal or video signal by using
the techniques of the above-described first to third embodiments in the case where
acoustic signal or video signal is inputted.
(4-1) Extraction of acoustic feature vector
(4-1-1)
[0081] Explanation will be given by using the flowchart of FIG. 12 and FIG. 13 in connection
with the example of the case where power spectrum coefficients are used as feature
quantity relating to acoustic signal. First, at step S70, as shown in FIG. 13, acoustic
signals with respect to each time period T are acquired from acoustic signal within
object time period.
[0082] Subsequently, at step S71, a spectrum operation, e.g., fast Fourier transform (FFT),
is applied to the acquired acoustic signal to determine power spectrum coefficients
Sq (q = 0, 1, ···, Q-1) with respect to each short time period. Here, q is index representing
discrete frequency and Q is the maximum discrete frequency.
[0083] Subsequently, at step S72, it is discriminated whether or not calculation within
object time period is completed. In the case where such calculation is completed (Yes),
processing proceeds to step S73. In the case where such calculation is not completed
(No), processing returns to the step S70.
[0084] At step S73, average spectrum S'q of the determined power spectrum coefficients Sq
is calculated. At step S74, this average spectrum S'q is changed into a vector to generate
acoustic feature vector a. This acoustic feature vector a is represented by, e.g., the
following formula (17).
$$a = (S'_0, S'_1, \ldots, S'_{Q-1}) \qquad (17)$$
[0085] It is to be noted that while explanation has been given in the above-described example
on the premise that acoustic signal within object time period is divided into each
time period T, spectrum operation may be implemented without dividing into each time
period T in the case where the object time period is short.
[0086] In addition, while the example using power spectrum coefficient has been explained
in the above-described example, the present invention is not limited to such implementation
but cepstrum coefficient having equivalent information, etc., may also be used. Further,
in place of Fourier transform, similar effect can also be obtained by linear predictive
coefficient using AR (Auto-Regressive) model.
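The extraction of steps S70 to S74 may be sketched in Python as follows. This is a minimal illustration and not the implementation of the embodiment: a direct discrete Fourier transform is used in place of a fast algorithm to keep the example self-contained, the frames are assumed to be non-overlapping with frame_len samples each, Q is assumed to be frame_len // 2 + 1, and no normalization of the power spectrum is applied. The function names are hypothetical.

```python
import cmath
from typing import List, Sequence


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Power spectrum coefficients S_q of one frame via a direct DFT."""
    n = len(frame)
    q_count = n // 2 + 1              # assumed number of discrete frequencies Q
    spectrum = []
    for q in range(q_count):
        acc = sum(x * cmath.exp(-2j * cmath.pi * q * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def acoustic_feature(signal: Sequence[float], frame_len: int) -> List[float]:
    """Average the per-frame power spectra over the object time period and
    return the averages as acoustic feature vector a."""
    frames = [signal[s:s + frame_len]
              for s in range(0, len(signal) - frame_len + 1, frame_len)]
    if not frames:
        raise ValueError("signal is shorter than one frame")
    spectra = [power_spectrum(fr) for fr in frames]       # steps S70 to S72
    q_count = len(spectra[0])
    return [sum(s[q] for s in spectra) / len(spectra)     # step S73
            for q in range(q_count)]                      # step S74
```

For a signal that is a pure cosine at discrete frequency 2 of an 8-sample frame, the resulting vector has its largest component at q = 2.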
(4-1-2)
[0087] Since the data quantity of an acoustic signal is vast, there are many instances where
such signal is recorded or transmitted after being compression-encoded. While
it is possible to extract acoustic feature vector a by using the above-described technique
after the encoded acoustic signal is decoded into a signal in the base band, extracting
processing can be conducted efficiently and at a high speed if acoustic feature vector
a can be extracted only by partial decoding.
[0088] Here, in the transform encoding which is encoding method generally used, acoustic
signal serving as original sound is divided into frames with respect to each time
period T, as shown in FIG. 14. Further, orthogonal transform such as Modified Discrete
Cosine Transform (MDCT), etc. is implemented to acoustic signal with respect to each
frame, and the coefficients thereof are quantized and encoded. In this instance, scale
factors serving as normalization coefficient of magnitude are extracted with respect
to each frequency band, and are separately encoded. In view of the above, by decoding
only the scale factors, they can be used as acoustic feature vector a.
[0089] Explanation will be given by using the flowchart of FIG. 15 and FIG. 16 in connection
with the example of the case where scale factors are used as feature quantity relating
to acoustic signal. First, at step S80, encoded acoustic signal within the time period
T in the object time period is acquired. At step S81, scale factors with respect to
each frame are partially decoded.
[0090] Subsequently, at step S82, it is discriminated whether or not decoding within the
object time period is completed. In the case where such decoding is completed (Yes),
processing proceeds to step S83. In the case where such decoding is not completed
(No), processing returns to the step S80.
[0091] At step S83, maximum scale factors are detected with respect to each band from scale
factors within the object time period. At step S84, those scale factors are changed
into vectors to generate acoustic feature vector a.
[0092] In this way, it is possible to extract, at a high speed, acoustic feature vector
a equivalent to the above without completely decoding encoded acoustic signal.
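Steps S83 and S84 may be sketched in Python as follows, assuming that the scale factors of each frame have already been partially decoded into one value per frequency band by a codec-specific parser that is not shown. The function name scale_factor_feature and the list-of-lists input layout are assumptions for illustration.

```python
from typing import List, Sequence


def scale_factor_feature(frames: Sequence[Sequence[float]]) -> List[float]:
    """frames[k][b] is the scale factor of band b in frame k.
    Return, per band, the maximum scale factor within the object time period."""
    if not frames:
        raise ValueError("at least one frame is required")
    bands = len(frames[0])
    if any(len(fr) != bands for fr in frames):
        raise ValueError("all frames must have the same number of bands")
    return [max(fr[b] for fr in frames) for b in range(bands)]
```

For example, three frames with scale factors (3, 1, 4), (1, 5, 9) and (2, 6, 5) yield the feature vector (3, 6, 9).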
(4-2) Extraction of image feature vector
(4-2-1)
[0093] Explanation will be given by using the flowchart of FIG. 17 and FIG. 18 in connection
with the example of the case where luminance information and color information are
used as feature quantity relating to video signal. First, at step S90, as shown in
FIG. 18, image frame is acquired from video signal within the object time period T.
[0094] Subsequently, at step S91, time average image 100 is prepared on the basis of acquired
all image frames.
[0095] Subsequently, at step S92, the prepared time average image 100 is divided into X
× Y small blocks in the horizontal and vertical directions to prepare block average image 110
in which pixel values within respective blocks are averaged.
[0096] Further, at step S93, these small blocks are arranged in order of R, G, B, e.g.,
from the left upper direction toward the right lower direction to generate one-dimensional
image feature vector v. This image feature vector v is represented by, e.g., the following
formula (18).

[0097] It is to be noted that, in the above-described example, the pixel values of the block
average image 110 obtained by dividing the time average image 100 are rearranged to generate
one-dimensional image feature vector v. However, the present invention is not limited to
such implementation. The pixel values of the time average image 100 may instead be
rearranged directly, without preparing the block average image 110, to generate
one-dimensional image feature vector v.
[0098] In addition, since time change of video signal is not so rapid in the ordinary state,
it is also possible to obtain the same effects/advantages by employing an approach
to select, as representative image, one frame within the object time period without
preparing the time average image 100 to substitute it.
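Steps S90 to S93 may be sketched in Python as follows. The sketch assumes that each frame is given as rows of (R, G, B) tuples, that the image size is divisible by the block counts, and that the feature vector lists all R block averages first, then all G, then all B, each scanned from the upper left to the lower right. The exact component order is an assumption, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]
Image = Sequence[Sequence[Pixel]]


def time_average(frames: Sequence[Image]) -> List[List[Pixel]]:
    """Average all frames pixel by pixel (time average image 100)."""
    n, h, w = len(frames), len(frames[0]), len(frames[0][0])
    return [[tuple(sum(f[y][x][c] for f in frames) / n for c in range(3))
             for x in range(w)] for y in range(h)]


def block_average_feature(frames: Sequence[Image],
                          x_blocks: int, y_blocks: int) -> List[float]:
    """Divide the time average image into X x Y blocks, average each block
    (block average image 110) and flatten the result into feature vector v."""
    avg = time_average(frames)
    h, w = len(avg), len(avg[0])
    if h % y_blocks or w % x_blocks:
        raise ValueError("image size must be divisible by the block counts")
    bh, bw = h // y_blocks, w // x_blocks
    feature: List[float] = []
    for c in range(3):                       # R, then G, then B (assumed order)
        for by in range(y_blocks):           # top to bottom
            for bx in range(x_blocks):       # left to right
                vals = [avg[y][x][c]
                        for y in range(by * bh, (by + 1) * bh)
                        for x in range(bx * bw, (bx + 1) * bw)]
                feature.append(sum(vals) / len(vals))
    return feature
```

The resulting vector has 3 × X × Y components regardless of the original image resolution.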
(4-2-2)
[0099] Even in the case where corresponding video signals are not entirely the same, there
are many instances where the images are related in that their overall distributions of
color are similar, e.g., studio images of a news program photographed from the same angle.
Thus, there is a demand for performing retrieval in the state where these images are
considered to be the same. In such a case, it is effective to prepare a histogram of color
distribution, which discards the spatial dependency of the image, and to make the comparison
on that histogram.
[0100] In view of the above, explanation will be given by using the flowchart of FIG. 19
and FIG. 20 in connection with the example of the case where histogram of color distribution
is used as feature quantity in this way. First, at step S100, as shown in FIG. 20,
image frame is acquired from video signal within object time period T.
[0101] Subsequently, at step S101, histogram with respect to signal values of respective
colors, e.g., R, G, B is prepared from signal values of respective image frames.
[0102] Further, at step S102, these colors are arranged in order of, e.g., R, G, B to generate
one-dimensional image feature vector v. This image feature vector v is represented
by the following formula (19).

[0103] It is to be noted that while explanation has been given in the above-described example
on the premise that histogram with respect to signal values of R, G, B is prepared,
it is possible to obtain similar effects/advantages even if histogram with respect
to signal values of luminance (Y) and color difference (Cb, Cr) is prepared.
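Steps S100 to S102 may be sketched in Python as follows. The number of histogram bins, the 8-bit value range and the function name color_histogram_feature are assumptions introduced for illustration; each frame is assumed to be given as rows of integer (R, G, B) tuples.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def color_histogram_feature(frames: Sequence[Sequence[Sequence[Pixel]]],
                            bins: int = 4, max_value: int = 256) -> List[int]:
    """Count signal values per color over all frames and concatenate the
    R, G and B histograms into one-dimensional feature vector v."""
    hist = [[0] * bins for _ in range(3)]
    for frame in frames:
        for row in frame:
            for pixel in row:
                for c in range(3):
                    # Map a value in [0, max_value) to a bin in [0, bins).
                    hist[c][pixel[c] * bins // max_value] += 1
    return hist[0] + hist[1] + hist[2]
```

Because pixel positions are not used, two frames that contain the same colors in different places produce the same feature vector.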
(4-2-3)
[0104] Since the data quantity of a video signal is vast, there are many cases where such
signal is recorded or transmitted after being compression-encoded. While it is possible
to extract image feature vector v by using the above-described technique after decoding
the encoded video signal into a signal of the base band, extraction processing
can be performed efficiently and at a high speed if image feature vector v can be
extracted only by partial decoding.
[0105] Explanation will be given by using the flowchart of FIG. 21 and FIG. 22 in connection
with the example of the case where image feature vector v is extracted from video
signal compression-encoded by MPEG1 (Moving Picture Experts Group 1) or MPEG2. First,
at step S110, encoded video signal of encoded group (Group of pictures: GOP) proximate
to object time period T to be changed into vector is acquired to acquire intra-frame
encoded picture (I picture) 120 within that GOP.
[0106] Here, the frame image is encoded with macro block MB (16 × 16 pixels, or 8 × 8 pixels)
as the unit, and Discrete Cosine Transform (DCT) is used. The DC coefficients obtained by
the DCT correspond to the average value of pixel values of the image within the macro block.
[0107] In view of the above, at step S111, these DC coefficients are acquired. At the subsequent
step S112, these coefficients are arranged in order of, e.g., Y, Cb, Cr to generate
one-dimensional image feature vector v. This image feature vector v is represented
by, e.g., the following formula (20).

[0108] In this way, it is possible to extract image feature vector v at a high speed without
completely decoding encoded video signal.
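The relation between a DC coefficient and the block average can be checked with a short Python sketch. The sketch assumes the orthonormal two-dimensional DCT-II, for which the DC coefficient of an n × n block equals n times the mean pixel value; quantization of the coefficient in an actual encoded stream is ignored. The function names and the grouping of the feature vector as all Y, then all Cb, then all Cr coefficients are assumptions.

```python
import math
from typing import List, Sequence


def dct2_coefficient(block: Sequence[Sequence[float]], u: int, v: int) -> float:
    """Coefficient (u, v) of the orthonormal 2-D DCT-II of a square block."""
    n = len(block)
    cu = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    cv = math.sqrt(1.0 / n) if v == 0 else math.sqrt(2.0 / n)
    total = 0.0
    for x in range(n):
        for y in range(n):
            total += (block[x][y]
                      * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                      * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
    return cu * cv * total


def dc_feature(y_dc: Sequence[float], cb_dc: Sequence[float],
               cr_dc: Sequence[float]) -> List[float]:
    """Arrange already-extracted DC coefficients in order of Y, Cb, Cr."""
    return list(y_dc) + list(cb_dc) + list(cr_dc)
```

Since the DC coefficient is proportional to the block mean, a vector of DC coefficients carries the same information as a block average image, up to a constant factor.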
[0109] It is to be noted that while the above-described example assumes a video signal
which has been compression-encoded by the MPEG1 or the MPEG2, the present invention may
also be applied to other compression-encoding systems.
(5) Others
[0110] As explained above, in accordance with this embodiment, hierarchical distance integrating
operation is performed in detecting analogous (similar) vector on the basis of distance
between vectors to truncate distance integrating operation at the time when integrated
value of distances is above threshold value with respect to distance set in advance,
thereby making it possible to detect similar vector at a high speed. Particularly,
in such cases that vector similar to input vector is detected from a large quantity
of registered vectors, since most registered vectors are non-similar so that integrated
value of distances is above threshold value, distance calculation can be truncated
at the early stage. Thus, detection time can be shortened to a large extent.
[0111] In addition, by implementing sequential transform, Discrete Cosine Transform, Discrete
Fourier Transform, Walsh-Hadamard Transform or KL Transform in advance to vector to
perform integrating operation in order from vector component having high significance,
i.e., component having large dispersion or eigen value in the above-described transform
operations or in order from low frequency component, it is possible to detect similar
vector efficiently and at a high speed, taking the distribution of vector components
into consideration.
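The ordering of components by dispersion may be sketched in Python as follows. The sketch assumes that the ordering transform is a permutation derived from the per-component variance of the registered vectors and that the same permutation is applied to registered and input vectors; the function names are hypothetical. A permutation does not change the distance between two vectors, so only the order of integration is affected.

```python
from typing import List, Sequence


def variance_order(vectors: Sequence[Sequence[float]]) -> List[int]:
    """Return component indices sorted by descending variance."""
    count, n = len(vectors), len(vectors[0])
    means = [sum(v[i] for v in vectors) / count for i in range(n)]
    variances = [sum((v[i] - means[i]) ** 2 for v in vectors) / count
                 for i in range(n)]
    return sorted(range(n), key=lambda i: variances[i], reverse=True)


def reorder(vector: Sequence[float], order: Sequence[int]) -> List[float]:
    """Apply the permutation so that high-variance components come first."""
    return [vector[i] for i in order]
```

Placing high-variance components first tends to make the integrated value grow quickly for non-similar vectors, so that the threshold is reached after fewer components.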
[0112] Accordingly, also in performing retrieval of acoustic signal or video signal, acoustic
feature vector and/or image feature vector is extracted in advance to register the
vector thus extracted, whereby in the case where arbitrary acoustic signal or video
signal is inputted, similar acoustic or video signals can be retrieved at a high speed
while maintaining structural simplicity and/or retrieval accuracy similar to full
search.
[0113] While the invention has been described in accordance with certain embodiments thereof
illustrated in the accompanying drawings and described in the above description in
detail, it should be understood by those ordinarily skilled in the art that the invention
is not limited to the embodiments, but various modifications, alternative embodiments
or equivalents can be implemented without departing from the scope and spirit of the
present invention as set forth and defined by the appended claims.
[0114] For example, while the present invention has been explained in the above-described
embodiments as the configuration of hardware, the present invention is not limited
to such implementation, but arbitrary processing may be also realized by allowing
CPU (Central Processing Unit) to execute computer program. In this case, computer
program may be provided in the state where it is recorded on recording medium, or
may be provided by allowing it to undergo transmission through other transmission
medium such as Internet.
Industrial Applicability
[0115] In accordance with the above-described present invention, there is employed such
approach to perform distance calculation between two vectors in a hierarchical manner,
whereby in the case where the integrated value of distances calculated up to a certain
hierarchy is above a predetermined threshold value, it is only detected, without calculating
actual distance, that the integrated value of distances is threshold value or larger,
thereby permitting operation to be conducted at a high speed. Particularly, in such
cases that vector similar to input vector is detected from a large quantity of registered
vectors, since most registered vectors are non-similar and thus integrated value of
distances is above threshold value, distance calculation can be truncated at the early
stage. Therefore, detection time can be shortened to a large extent.
1. A similarity calculation method of determining similarity between two feature vectors,
a registered vector (g) and an input vector (f), being representative of an acoustic
signal or a video signal, each of the two feature vectors having N corresponding components,
N being an integer greater than zero, the method including the following steps:
a transform step (S41, S52) in which a predetermined transform operation (S41, S52)
is implemented to the two feature vectors (f, g),
a division step (S41, S52) in which the two transformed feature vectors (f', g') are divided component-wise into a plurality of partial vectors (f1, f2, g1, g2),
a recording step (S42, S43) in which the plurality of partial vectors (g1, g2) constituting the transformed registered feature vector (g') are recorded,
a hierarchical distance calculation step (S53, S54, S57, S58, S60, S63, S64) in which
the distance between the two feature vectors (f', g') transformed at the transform step (S41, S52) is calculated in a predetermined order
based on the predetermined transform operation (S41, S52), wherein the distance calculation
is performed between respective components constituting partial vectors (f1, f2, g1, g2) in a component-wise hierarchical manner in order from the partial vector (f1, g1) of the uppermost component order,
a threshold value comparison step (S55, S61) in which an integrated value of distances
calculated incrementally for hierarchically higher-order components (i) of the two
transformed feature vectors (f', g') is compared with a threshold value (S) set in
advance,
a control step (S55, S56, S57, S58, S61, S62, S63, S64) in which distance calculation
is controlled in accordance with a result of the threshold value comparison at the
threshold value comparison step (S55, S61), and
an output step (S65) in which, as the similarity, the integrated value of the calculated
distances up to the last components (i) of the two transformed feature vectors (f',
g') is outputted,
wherein, at the control step (S55, S56, S57, S58, S61, S62, S63, S64), control is
conducted such that the distance calculation is truncated in the case where the integrated
value of distances calculated up to a certain component order is greater than or equal
to the threshold value and such that the distance calculation between next higher-order
components is performed in the case where the integrated value of distances calculated
up to a certain component order is below the threshold value,
and wherein distance calculation is performed such that, in a first step, only the
partial vector (g1) of the uppermost component order of the plurality of partial vectors (g1, g2) recorded in the recording step is retrieved and the distance calculation is performed
between respective components constituting the partial vectors (f1, g1) of the uppermost component order in a component-wise hierarchical manner, and wherein
only in the case where the integrated value of calculated distances between all components
constituting the partial vectors (f1, g1) of the uppermost component order is below the threshold value, in a second step
the partial vector (g2) of the next lower component order of the plurality of partial vectors (g1, g2) of the transformed registered feature vector (g') recorded in the recording step
(S42, S43) is retrieved and distance calculation between respective components constituting
partial vectors (f2, g2) of the next lower component order is performed.
2. The similarity calculation method as set forth in claim 1, wherein the predetermined
transform operation (S41, S52) is a transform operation which performs sequencing
of order of respective components constituting the two feature vectors (f, g) in accordance
with magnitude of dispersion of the respective components, and the distance calculation
between the two feature vectors (f', g') transformed at the transform step (S41, S52) is performed in order from components
of large dispersion at the hierarchical distance calculation step (S53, S54, S57,
S58, S60, S63, S64).
3. The similarity calculation method as set forth in claim 1, wherein the predetermined
transform operation (S41, S52) is a Discrete Cosine Transform operation or Discrete
Fourier Transform operation, and the distance calculation between the two feature
vectors (f', g') transformed at the transform step (S41, S52) is performed in order from low frequency
component at the hierarchical distance calculation step (S53, S54, S57, S58, S60,
S63, S64).
4. The similarity calculation method as set forth in claim 1, wherein the predetermined
transform operation (S41, S52) is a Walsh-Hadamard Transform operation, and the distance
calculation between the two transformed feature vectors (f', g') is performed in order from low frequency component at the hierarchical distance calculation
step (S53, S54, S57, S58, S60, S63, S64).
5. The similarity calculation method as set forth in claim 1, wherein the predetermined
transform operation (S41, S52) is a Karhunen-Loeve transform operation, and the distance
calculation between the two feature vectors (f', g') transformed at the transform step is performed in order from component of large eigenvalue
at the hierarchical distance calculation step (S53, S54, S57, S58, S60, S63, S64).
6. The similarity calculation method as set forth in claim 1, wherein the feature vector
(a) is obtained by extracting power spectrum coefficients (Sq) within a predetermined time period of an acoustic signal, the power spectrum coefficients
(Sq) being the components of the feature vector (a).
7. The similarity calculation method as set forth in claim 1, wherein the feature vector
(a) is obtained by extracting linear predictive coefficients within a predetermined
time period of an acoustic signal.
8. The similarity calculation method as set forth in claim 1, wherein the feature vector
(a) is obtained by extracting parameters indicating intensities of frequency components
within respective frames of an encoded acoustic signal, the parameters being components
of the feature vector (a).
9. The similarity calculation method as set forth in claim 1, wherein the feature vector
(v) is obtained by acquiring image frames from signal value of representative image
within respective predetermined time periods of a video signal, preparing an average
image (100) of the acquired image frames within the respective predetermined time
periods, and preparing a block average image (110) by dividing the average image (100)
into X × Y small blocks in breadth and width directions and averaging the values within
respective small blocks and by arranging the small blocks in order of R, G, B, the
values of the block average image (110) arranged in order of R, G, B being the components
of the feature vector (v).
10. The similarity calculation method as set forth in claim 1, wherein the feature vector
(v) is obtained by preparing histogram with respect to signal values of luminance
and/or color of image frame within a predetermined time period of a video signal,
the signal values of luminance and/or color being the components of the feature vector
(v).
11. A similarity calculating apparatus adapted for determining similarity between two
feature vectors, a registered vector (g) and an input vector (f), being representative
of an acoustic signal or a video signal, comprising:
transform means (30, 31) which is adapted to implement a predetermined transform operation
to the two feature vectors (f, g),
dividing means (30, 31) which is adapted to take out, in a predetermined order based
on the predetermined transform operation, respective components constituting the two
feature vectors (f', g') transformed by the transform means (30, 31) to divide them into a plurality of partial
vectors (f1, g1, f2, g2),
recording means (32, 33) which are adapted to record the plurality of partial vectors
(g1, g2) constituting the transformed registered feature vector (g'),
hierarchical distance calculating means (34) which is adapted to perform a distance
calculation between the two feature vectors (f', g') transformed by the transform means (30, 31) in a predetermined order based on the
predetermined transform operation, wherein the distance calculating means (34) is
adapted to perform, in a component-wise hierarchical manner, the distance calculation
between respective components constituting partial vectors (f1, g1, f2, g2) in order from the partial vector (f1, g1) of the uppermost component order, and
threshold value comparing means (35) which is adapted to compare an integrated value
of distances calculated incrementally for hierarchically higher-order components of
the two transformed vectors (f', g') by the distance calculating means (34) with
a threshold value (S) set in advance,
a control means which is adapted to control the distance calculation in accordance
with a result by the threshold value comparing means (35), and
output means which is adapted to output, as the similarity, the integrated value of
distances calculated up to the last components of the two transformed feature vectors
(f', g'),
wherein the control means is operative so that in the case where integrated value
of distances calculated up to a certain component order is above the threshold value
as the result of comparison by the threshold comparing means (35), a control is performed
so as to truncate the distance calculation, and in the case where the integrated value
of distances calculated up to a certain component order is below the threshold value
the distance calculation is performed between the next higher-order components,
and wherein the hierarchical distance calculating means (34) is operative so that,
in a first step, only the partial vector (g1) of the uppermost component order of the plurality of partial vectors (g1, g2) recorded in the recording means is retrieved and the distance calculation is performed
between respective components constituting the partial vectors (f1, g1) of the uppermost component order in a component-wise hierarchical manner, and wherein
only in the case where the integrated value of distances calculated between
all components constituting the partial vectors (f1, g1) of the uppermost component order is below the threshold value (S), in a second step
the partial vector (g2) of the next lower component order of the plurality of partial vectors (g1, g2) of the transformed registered feature vector (g') recorded in the recording means
(33) is retrieved and the distance calculation between respective components constituting
partial vectors (f2, g2) of one lower component order is performed.
12. A program for allowing a computer to execute similarity calculation processing for
determining similarity between two feature vectors (f, g), a registered vector (g)
and an input vector (f), being representative of an acoustic signal or a video signal,
the program comprising:
a transform step (S41, S52) in which a predetermined transform operation is implemented
to the two feature vectors (f, g),
a division step (S41, S52) in which the two transformed feature vectors (f', g') are divided component-wise into a plurality of partial vectors (f1, g1, f2, g2),
a recording step (S42, S43) in which the plurality of partial vectors (g1, g2) constituting the transformed registered feature vector (g') are recorded,
a hierarchical distance calculation step (S53, S54, S57, S58, S60, S63, S64) in which
the distance between the two feature vectors (f', g') transformed at the transform step is calculated in a predetermined order based on
the predetermined transform operation (S41, S52), wherein the distance calculation
is performed between respective components constituting partial vectors (f1, g1, f2, g2) in a component-wise hierarchical manner in order from the partial vector (f1, g1) of the uppermost component order,
a threshold value comparison step (S55, S61) in which an integrated value of distances
calculated incrementally for hierarchically higher-order components (i) of the two
transformed feature vectors is compared with a threshold value (S) set in advance,
a control step (S55, S56, S57, S58, S61, S62, S63, S64) in which distance calculation
is controlled in accordance with a result of the threshold value comparison at the
threshold value comparison step (S55, S61), and
an output step (S65) in which, as the similarity, the integrated value of the calculated
distances up to the last components (i) of the two transformed feature vectors (f',
g') is outputted,
wherein, at the control step (S55, S56, S57, S58, S61, S62, S63, S64), control is
conducted such that the distance calculation is truncated in the case where the integrated
value of distances calculated up to a certain component order is greater than or equal
to the threshold value (S) and the distance calculation between next higher-order
components is performed in the case that the integrated value of distances calculated
up to a certain component order is below the threshold value,
and wherein distance calculation is performed such that, in a first step, only the
partial vector (g1) of the uppermost component order of the plurality of partial vectors (g1, g2) recorded in the recording step is retrieved and the distance calculation is performed
between respective components constituting the partial vectors (f1, g1) of the uppermost component order in a component-wise hierarchical manner, and wherein
only in the case where the integrated value of calculated distances between all components
constituting the partial vectors (f1, g1) of the uppermost component order is below the threshold value, in a second step
the partial vector (g2) of the next lower component order of the plurality of partial vectors (g1, g2) of the transformed registered feature vector (g') recorded in the recording step
(S42, S43) is retrieved and the distance calculation between respective components
constituting partial vectors (f2, g2) of the next lower component order is performed.
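The control flow recited in the claim above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the squared difference is assumed as the per-component distance, and the names `hierarchical_distance`, `load_g_part`, `stored` and `loads` are hypothetical. The callback stands in for retrieval from the recording medium, so a lower-order partial vector is fetched only when every higher-order one has stayed below the threshold.

```python
from typing import Callable, List, Optional, Sequence


def hierarchical_distance(
    f_parts: Sequence[Sequence[float]],
    load_g_part: Callable[[int], Sequence[float]],
    threshold: float,
) -> Optional[float]:
    """Return the integrated distance, or None if the calculation was truncated.

    f_parts     -- partial vectors (f1, f2, ...) of the transformed input vector
    load_g_part -- hypothetical callback that retrieves partial vector g1, g2, ...
                   of the registered vector by level (0 = uppermost order)
    threshold   -- threshold value S set in advance
    """
    total = 0.0
    for level, f_part in enumerate(f_parts):
        # The registered partial vector is retrieved only when this level is reached.
        g_part = load_g_part(level)
        for fi, gi in zip(f_part, g_part):
            total += (fi - gi) ** 2  # assumed distance: squared difference
            if total >= threshold:
                return None  # truncate: this candidate is rejected early
    return total  # integrated value up to the last component


# Usage: a registered vector stored as two partial vectors.
stored: List[List[float]] = [[1.0, 2.0], [3.0, 4.0]]
loads: List[int] = []


def load(level: int) -> List[float]:
    loads.append(level)  # record which partial vectors were actually retrieved
    return stored[level]


near = hierarchical_distance([[1.0, 2.5], [3.0, 4.0]], load, threshold=1.0)
far = hierarchical_distance([[5.0, 2.0], [3.0, 4.0]], load, threshold=1.0)
```

In this sketch the second call is truncated at the first component, so `g2` is never retrieved for it; skipping that retrieval is the saving the two-step structure aims at.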
13. A computer readable medium adapted so that a program for allowing a computer to execute
similarity calculation processing which determines similarity between two feature
vectors (f, g), a registered vector (g) and an input vector (f), being representative
of an acoustic signal or a video signal is recorded, the program including:
a transform step (S41, S52) in which a predetermined transform operation is implemented
to the two feature vectors (f, g),
a division step (S41) in which the two transformed feature vectors (f', g') are divided component-wise into a plurality of partial vectors (f1, g1, f2, g2),
a recording step (S42, S43) in which the plurality of partial vectors (g1, g2) constituting the transformed registered feature vector (g') are recorded,
a hierarchical distance calculation step (S53, S54, S57, S58, S60, S63, S64) in which
the distance calculation between the two feature vectors (f', g') transformed at the transform step is calculated in a predetermined order based on
the predetermined transform operation (S41, S52), wherein the distance calculation
is performed between respective components constituting partial vectors (f1, g1, f2, g2) in a component-wise hierarchical manner in order from the partial vector (f1, g1) of the uppermost component order,
a threshold value comparison step (S55, S61) in which an integrated value of distances
calculated incrementally for hierarchically higher-order components (i) of the two
transformed feature vectors (f', g') is compared with a threshold value (S) set in
advance,
a control step (S55, S56, S57, S58, S61, S62, S63, S64) in which distance calculation
is controlled in accordance with a result of the threshold value comparison at the
threshold value comparison step (S55, S61), and
an output step (S65), in which, as the similarity, the integrated value of the calculated
distances up to the last components (i) of the two transformed feature vectors (f',
g') is outputted,
wherein, at the control step (S55, S56, S57, S58, S61, S62, S63, S64), control is
conducted such that the distance calculation is truncated in the case where the integrated
value of distances calculated up to a certain component order is greater than or equal
to the threshold value (S) and the distance calculation between next higher-order
components is performed in the case that the integrated value of distances calculated
up to a certain component order is below the threshold value,
and wherein distance calculation is performed such that, in a first step, only the
partial vector (g1) of the uppermost component order of the plurality of partial vectors (g1, g2) recorded in the recording step is retrieved and the distance calculation is performed
between respective components constituting the partial vectors (f1, g1) of the uppermost component order in a component-wise hierarchical manner, and wherein
only in the case where the integrated value of calculated distances between all components
constituting the partial vectors (f1, g1) of the uppermost component order is below the threshold value (S), in a second step
the partial vector (g2) of the next lower component order of the plurality of partial vectors (g1, g2) of the transformed registered feature vector (g') recorded in the recording step
(S42, S43) is retrieved and the distance calculation between respective components
constituting partial vectors (f2, g2) of the next lower component order is performed.
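The transform, division and recording steps that precede the distance calculation can be sketched as follows. This is a hedged illustration only: it assumes the predetermined transform is a reordering of components by decreasing variance (dispersion), which is one of the options named in the dependent claims, and it assumes a two-way split. The function names `variance_order` and `transform_and_divide` and the sample data are hypothetical.

```python
from statistics import pvariance
from typing import List, Sequence, Tuple


def variance_order(registered: Sequence[Sequence[float]]) -> List[int]:
    """Component indices sorted from largest to smallest variance."""
    n = len(registered[0])
    variances = [pvariance([vec[i] for vec in registered]) for i in range(n)]
    return sorted(range(n), key=lambda i: variances[i], reverse=True)


def transform_and_divide(
    vec: Sequence[float], order: Sequence[int], split: int
) -> Tuple[List[float], List[float]]:
    """Reorder one vector and divide it into an upper and a lower partial vector."""
    reordered = [vec[i] for i in order]
    return reordered[:split], reordered[split:]


# Usage: three registered vectors with three components each (made-up values).
registered = [
    [1.0, 10.0, 5.0],
    [1.1, 20.0, 7.0],
    [0.9, 30.0, 3.0],
]
order = variance_order(registered)

# The same order is applied to every registered vector and, later, to the input
# vector, so that corresponding components are still compared with each other.
g1, g2 = transform_and_divide(registered[0], order, split=2)
```

Placing the high-variance components in the upper partial vector tends to make the integrated distance of a dissimilar candidate grow quickly, which is why an early threshold comparison can reject it before the lower partial vector is read.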
1. Ähnlichkeitsberechnungsverfahren zum Bestimmen einer Ähnlichkeit zwischen zwei Merkmalsvektoren,
einem registrierten Vektor (g) und einem Eingabevektor (f), die repräsentativ sind
für ein akustisches Signal oder ein Videosignal, wobei jeder der zwei Merkmalsvektoren
N entsprechende Komponenten aufweist, wobei N eine ganze Zahl größer als Null ist,
wobei das Verfahren folgende Schritte aufweist:
einen Transformationsschritt (S41, S52), bei welchem ein vorbestimmter Transformationsvorgang
(S41, S52) auf die zwei Merkmalsvektoren (f, g) angewandt wird,
einen Unterteilungsschritt (S41, S52), bei welchem die zwei transformierten Merkmalsvektoren
(f', g') komponentenweise in eine Mehrzahl von Teilvektoren (f1, f2, g1, g2) unterteilt werden,
einen Aufzeichnungsschritt (S42, S43), bei welchem die Mehrzahl von Teilvektoren (g1, g2), die den transformierten registrierten Merkmalsvektor (g') bilden, aufgezeichnet
werden,
einen hierarchischen Distanzberechnungsschritt (S53, S54, S57, S58, S60, S63, S64),
bei welchem die Distanz zwischen den zwei Merkmalsvektoren (f', g'), die beim Transformationsschritt
(S41, S52) transformiert wurden, in einer vorbestimmten Ordnung berechnet wird, und
zwar auf der Grundlage des vorbestimmten Transformationsvorgangs (S41, S52), wobei
die Distanzberechnung zwischen den jeweiligen Komponenten, welche die Teilvektoren
(f1, f2, g1, g2) bilden, in einer komponentenweisen hierarchischen Art in der Ordnung vom Teilvektor
(f1, g1) der obersten Komponentenordnung durchgeführt wird,
einen Schwellwertvergleichsschritt (S55, S61), bei welchem ein integrierter Wert von
Distanzen, die inkrementweise für Komponenten (i) hierarchisch höherer Ordnung der
zwei transformierten Merkmalsvektoren (f', g') berechnet wurden, mit einem vorab eingestellten
Schwellwert (S) verglichen wird,
einen Steuerschritt (S55, S56, S57, S58, S61, S62, S63, S64), bei welchem die Distanzberechnung
gemäß einem Ergebnis des Schwellwertvergleichs beim Schwellwertvergleichsschritt (S55,
S61) gesteuert wird, und
einen Ausgabeschritt (S65), bei welchem der integrierte Wert der berechneten Distanzen
bis zu den letzten Komponenten (i) der zwei transformierten Merkmalsvektoren (f',
g') als Ähnlichkeit ausgegeben wird,
wobei beim Steuerschritt (S55, S56, S57, S58, S61, S62, S63, S64) die Steuerung derart
durchgeführt wird, dass die Distanzberechnung in dem Fall beendet wird, bei welchem
der integrierte Wert von Distanzen, die bis zu einer bestimmten Komponentenordnung
berechnet wurden, größer ist als der oder gleich ist zu dem Schwellwert, sowie derart,
dass die Distanzberechnung zwischen Komponenten nächst höherer Ordnung in dem Fall
durchgeführt wird, bei welchem der integrierte Wert von Distanzen, die bis zu einer
bestimmten Komponentenordnung berechnet wurden, niedriger ist als der Schwellwert,
und
wobei die Distanzberechnung derart ausgeführt wird, dass in einem ersten Schritt nur
der Teilvektor (g1) der obersten Komponentenordnung der Mehrzahl von Teilvektoren (g1, g2), die beim Aufzeichnungsschritt aufgezeichnet wurden, wieder gewonnen wird, und die
Distanzberechnung zwischen jeweiligen Komponenten, die die Teilvektoren (f1, g1) der obersten Komponentenordnung bilden, in einer komponentenweisen hierarchischen
Art und Weise ausgeführt wird, und
wobei nur in dem Fall, bei welchem der integrierte Wert der berechneten Distanzen
zwischen allen Komponenten, die die Teilvektoren (f1, g1) der obersten Komponentenordnung bilden, unterhalb des Schwellwerts liegt, in einem
zweiten Schritt der Teilvektor (g2) der nächst niedrigeren Komponentenordnung der Mehrzahl von Teilvektoren (g1, g2) des transformierten registrierten Merkmalsvektors (g'), die im Aufzeichnungsschritt
(S42, S43) aufgezeichnet wurden, wieder gewonnen und die Distanzberechnung zwischen
jeweiligen Komponenten, die Teilvektoren (f2, g2) der nächst niedrigeren Komponentenordnung bilden, ausgeführt wird.
2. Ähnlichkeitsberechnungsverfahren nach Anspruch 1,
wobei der vorbestimmte Transformationsvorgang (S41, S52) ein Transformationsvorgang
ist, welcher ein Ordnen einer Ordnung jeweiliger Komponenten, welche die zwei Merkmalsvektoren
(f, g) bilden, gemäß einer Stärke der Dispersion der jeweiligen Komponenten durchführt,
und
wobei die Distanzberechnung zwischen den zwei Merkmalsvektoren (f', g'), die im Transformationsschritt
(S41, S52) transformiert wurden, im hierarchischen Distanzberechnungsschritt (S53,
S54, S57, S58, S60, S63, S64) in der Ordnung von Komponenten einer großen Dispersion
ausgeführt wird.
3. Ähnlichkeitsberechnungsverfahren nach Anspruch 1,
wobei der vorbestimmte Transformationsvorgang (S41, S52) ein diskreter Cosinus-Transformationsvorgang
oder ein diskreter Fourier-Transformationsvorgang ist und
wobei die Distanzberechnung zwischen den zwei Merkmalsvektoren (f', g'), die im Transformationsschritt
(S41, S52) transformiert wurden, im hierarchischen Distanzberechnungsschritt (S53,
S54, S57, S58, S60, S63, S64) ausgeführt wird in der Ordnung von einer Niedrigfrequenzkomponente.
4. Ähnlichkeitsberechnungsverfahren nach Anspruch 1,
wobei der vorbestimmte Transformationsvorgang (S41, S52) ein Walsh-Hadamard-Transformationsvorgang
ist und
wobei die Distanzberechnung zwischen den zwei transformierten Merkmalsvektoren (f',
g') im hierarchischen Distanzberechnungsschritt (S53, S54, S57, S58, S60, S63, S64)
in einer Ordnung von einer Niedrigfrequenzkomponente ausgeführt wird.
5. Ähnlichkeitsberechnungsverfahren nach Anspruch 1,
wobei der vorbestimmte Transformationsvorgang (S41, S52) ein Karhunen-Loeve-Transformationsvorgang
ist und
wobei die Distanzberechnung zwischen den zwei Merkmalsvektoren (f', g'), die im Transformationsschritt
transformiert wurden, im hierarchischen Distanzberechnungsschritt (S53, S54, S57,
S58, S60, S63, S64) in einer Ordnung von einer Komponente mit großem Eigenwert ausgeführt
wird.
6. Ähnlichkeitsberechnungsverfahren nach Anspruch 1,
wobei der Merkmalsvektor (a) erhalten wird durch Extrahieren von Leistungsspektrumskoeffizienten
(Sq) innerhalb einer vorbestimmten Zeitspanne eines akustischen Signals,
wobei die Leistungsspektrumskoeffizienten (Sq) die Komponenten des Merkmalsvektors (a) sind.
7. Ähnlichkeitsberechnungsverfahren nach Anspruch 1,
wobei der Merkmalsvektor (a) erhalten wird durch Extrahieren linearer Vorhersagekoeffizienten
innerhalb einer vorbestimmten Zeitspanne eines akustischen Signals.
8. Ähnlichkeitsberechnungsverfahren nach Anspruch 1,
wobei der Merkmalsvektor (a) erhalten wird durch Extrahieren von Parametern, die Intensitäten
von Frequenzkomponenten innerhalb jeweiliger Frames eines codierten akustischen Signals
anzeigen,
wobei die Parameter Komponenten des Merkmalsvektors (a) sind.
9. Ähnlichkeitsberechnungsverfahren nach Anspruch 1,
wobei der Merkmalsvektor (v) erhalten wird durch Erfassen von Bildframes von einem
Signalwert eines repräsentativen Bildes innerhalb jeweiliger vorbestimmter Zeitspannen
eines Videosignals, durch Erzeugen eines gemittelten Bildes (100) der erfassten Bildframes
innerhalb der jeweiligen vorbestimmten Zeitspannen und durch Erzeugen eines blockgemittelten
Bildes (110) durch Unterteilen des gemittelten Bildes (100) in X x Y kleine Blöcke
in Höhen- und Breitenrichtungen und Mitteln der Werte innerhalb jeweiliger kleiner
Blöcke und Anordnen der kleinen Blöcke in einer Ordnung von R, G, B, wobei die Werte
des blockgemittelten Bildes (110), die in der Ordnung von R, G, B angeordnet sind,
die Komponenten des Merkmalsvektors (v) sind.
10. Ähnlichkeitsberechnungsverfahren nach Anspruch 1,
wobei der Merkmalsvektor (v) erhalten wird durch Erzeugen eines Histogramms in Bezug
auf Signalwerte von Luminanz und/oder von Farbe eines Bildframes innerhalb einer vorbestimmten
Zeitspanne eines Videosignals, wobei die Signalwerte von Luminanz und/oder Farbe die
Komponenten des Merkmalsvektors (v) sind.
11. Ähnlichkeitsberechnungsvorrichtung,
welche ausgebildet ist zum Bestimmen einer Ähnlichkeit zwischen zwei Merkmalsvektoren,
einem registrierten Vektor (g) und einem Eingabevektor (f), welche repräsentativ sind
für ein akustisches Signal oder für ein Videosignal, mit:
einer Transformationseinrichtung (30, 31), welche ausgebildet ist, einen vorbestimmten
Transformationsvorgang in Bezug auf die zwei Merkmalsvektoren (f, g) anzuwenden,
einer Unterteilungseinrichtung (30, 31), welche ausgebildet ist, in einer vorbestimmten
Ordnung auf der Grundlage des vorbestimmten Transformationsvorgangs jeweilige Komponenten,
die die zwei Merkmalsvektoren (f', g') bilden, die durch die Transformationseinrichtung
(30, 31) transformiert wurden, zu entnehmen, um diese in eine Mehrzahl von Teilvektoren
(f1, g1, f2, g2) zu unterteilen,
Aufzeichnungseinrichtungen (32, 33), welche ausgebildet sind, die Mehrzahl von Teilvektoren
(g1, g2), welche den transformierten registrierten Merkmalsvektor (g') bilden, aufzuzeichnen,
einer hierarchischen Distanzberechnungseinrichtung (34), welche ausgebildet ist, eine
Distanzberechnung zwischen den zwei Merkmalsvektoren (f', g'), die durch die Transformationseinrichtung
(30, 31) transformiert wurden, in einer vorbestimmten Ordnung auszuführen, auf der
Grundlage des vorbestimmten Transformationsvorgangs, wobei die Distanzberechnungseinrichtung
(34) dazu ausgebildet ist, die Distanzberechnung zwischen jeweiligen Komponenten,
welche Teilvektoren (f1, g1, f2, g2) bilden, in einer komponentenweisen hierarchischen Art und Weise in einer Ordnung
vom Teilvektor (f1, g1) der obersten Komponentenordnung auszuführen, und
einer Schwellwertvergleichseinrichtung (35), welche ausgebildet ist, einen integrierten
Wert von Distanzen, die inkrementweise für Komponenten hierarchisch höherer Ordnung
der zwei transformierten Vektoren (f', g') durch die Distanzberechnungseinrichtung
(34) berechnet wurden, mit einem vorab eingestellten Schwellwert (S) zu vergleichen,
einer Steuereinrichtung, welche ausgebildet ist, die Distanzberechnung gemäß einem
Ergebnis der Schwellwertvergleichseinrichtung (35) zu steuern, und
einer Ausgabeeinrichtung, welche ausgebildet ist, den integrierten Wert von Distanzen,
die bis zu den letzten Komponenten der zwei transformierten Merkmalsvektoren (f',
g') berechnet wurden, als Ähnlichkeit auszugeben,
wobei die Steuereinrichtung so arbeitet, dass in dem Fall, bei welchem der integrierte
Wert von Distanzen, die berechnet wurden bis zu einer bestimmten Komponentenordnung,
oberhalb des Schwellwerts liegt, und zwar als Ergebnis des Vergleichs durch die Schwellwertvergleichseinrichtung
(35), eine Steuerung ausgeführt wird, um die Distanzberechnung zu beenden, und dass
in dem Fall, bei welchem der integrierte Wert von Distanzen, die bis zu einer bestimmten
Komponentenordnung berechnet wurden, unterhalb des Schwellwerts liegt, die Distanzberechnung
ausgeführt wird zwischen den Komponenten nächst höherer Ordnung, und
wobei die hierarchische Distanzberechnungseinrichtung (34) so arbeitet, dass in einem
ersten Schritt nur der Teilvektor (g1) der obersten Komponentenordnung der Mehrzahl von Teilvektoren (g1, g2), die in der Aufzeichnungseinrichtung aufgezeichnet wurden, wieder gewonnen und die
Distanzberechnung zwischen jeweiligen Komponenten, die die Teilvektoren (f1, g1) der obersten Komponentenordnung bilden, in einer komponentenweisen hierarchischen
Art und Weise ausgeführt wird, und
wobei nur in dem Fall, bei welchem der integrierte Wert von berechneten Distanzen,
die zwischen sämtlichen Komponenten berechnet wurden, die die Teilvektoren (f1, g1) der obersten Komponentenordnung bilden, unterhalb des Schwellwerts (S) liegt, in
einem zweiten Schritt der Teilvektor (g2) der nächst niedrigeren Komponentenordnung der Mehrzahl von Teilvektoren (g1, g2) des transformierten registrierten Merkmalsvektors (g'), die in der Aufzeichnungseinrichtung
(33) aufgezeichnet wurden, wieder gewonnen und die Distanzberechnung zwischen den
jeweiligen Komponenten, die die Teilvektoren (f2, g2) einer niedrigeren Komponentenordnung bilden, ausgeführt wird.
12. Programm, um einem Computer zu ermöglichen, eine Ähnlichkeitsberechnungsverarbeitung
auszuführen zum Bestimmen einer Ähnlichkeit zwischen zwei Merkmalsvektoren (f, g),
einem registrierten Vektor (g) und einem Eingabevektor (f), welche repräsentativ sind
für ein akustisches Signal oder ein Videosignal,
wobei das Programm aufweist:
einen Transformationsschritt (S41, S52), bei welchem ein vorbestimmter Transformationsvorgang
auf die zwei Merkmalsvektoren (f, g) angewandt wird,
einen Unterteilungsschritt (S41, S52), bei welchem die zwei transformierten Merkmalsvektoren
(f', g') komponentenweise in eine Mehrzahl von Teilvektoren (f1, f2, g1, g2) unterteilt werden,
einen Aufzeichnungsschritt (S42, S43), bei welchem die Mehrzahl von Teilvektoren (g1, g2), die den transformierten registrierten Merkmalsvektor (g') bilden, aufgezeichnet
werden,
einen hierarchischen Distanzberechnungsschritt (S53, S54, S57, S58, S60, S63, S64),
bei welchem die Distanz zwischen den zwei Merkmalsvektoren (f', g'), die beim Transformationsschritt
(S41, S52) transformiert wurden, in einer vorbestimmten Ordnung berechnet wird, und
zwar auf der Grundlage des vorbestimmten Transformationsvorgangs (S41, S52), wobei
die Distanzberechnung zwischen den jeweiligen Komponenten, welche die Teilvektoren
(f1, f2, g1, g2) bilden, in einer komponentenweisen hierarchischen Art in der Ordnung vom Teilvektor
(f1, g1) der obersten Komponentenordnung durchgeführt wird,
einen Schwellwertvergleichsschritt (S55, S61), bei welchem ein integrierter Wert von
Distanzen, die inkrementweise für Komponenten (i) hierarchisch höherer Ordnung der
zwei transformierten Merkmalsvektoren berechnet wurden, mit einem vorab eingestellten
Schwellwert (S) verglichen wird,
einen Steuerschritt (S55, S56, S57, S58, S61, S62, S63, S64), bei welchem die Distanzberechnung
gemäß einem Ergebnis des Schwellwertvergleichs beim Schwellwertvergleichsschritt (S55,
S61) gesteuert wird, und
einen Ausgabeschritt (S65), bei welchem der integrierte Wert der berechneten Distanzen
bis zu den letzten Komponenten (i) der zwei transformierten Merkmalsvektoren (f',
g') als Ähnlichkeit ausgegeben wird,
wobei beim Steuerschritt (S55, S56, S57, S58, S61, S62, S63, S64) die Steuerung derart
durchgeführt wird, dass die Distanzberechnung in dem Fall beendet wird, bei welchem
der integrierte Wert von Distanzen, die bis zu einer bestimmten Komponentenordnung
berechnet wurden, größer ist als der oder gleich ist zu dem Schwellwert, sowie derart,
dass die Distanzberechnung zwischen Komponenten nächst höherer Ordnung in dem Fall
durchgeführt wird, bei welchem der integrierte Wert von Distanzen, die bis zu einer
bestimmten Komponentenordnung berechnet wurden, niedriger ist als der Schwellwert,
und
wobei die Distanzberechnung derart ausgeführt wird, dass in einem ersten Schritt nur
der Teilvektor (g1) der obersten Komponentenordnung der Mehrzahl von Teilvektoren (g1, g2), die beim Aufzeichnungsschritt aufgezeichnet wurden, wieder gewonnen wird, und wobei
die Distanzberechnung zwischen jeweiligen Komponenten, die die Teilvektoren (f1, g1) der obersten Komponentenordnung bilden, in einer komponentenweisen hierarchischen
Art und Weise ausgeführt wird, und
wobei nur in dem Fall, bei welchem der integrierte Wert der berechneten Distanzen
zwischen allen Komponenten, die die Teilvektoren (f1, g1) der obersten Komponentenordnung bilden, unterhalb des Schwellwerts liegt, in einem
zweiten Schritt der Teilvektor (g2) der nächst niedrigeren Komponentenordnung der Mehrzahl von Teilvektoren (g1, g2) des transformierten registrierten Merkmalsvektors (g'), die im Aufzeichnungsschritt
(S42, S43) aufgezeichnet wurden, wieder gewonnen und die Distanzberechnung zwischen
jeweiligen Komponenten, die Teilvektoren (f2, g2) der nächst niedrigeren Komponentenordnung bilden, ausgeführt wird.
13. Computerlesbares Medium, welches derart ausgebildet ist, dass ein Programm aufgezeichnet
wird, welches einem Computer ermöglicht, eine Ähnlichkeitsberechnungsverarbeitung
auszuführen, welche eine Ähnlichkeit bestimmt zwischen zwei Merkmalsvektoren (f, g),
einem registrierten Vektor (g) und einem Eingabevektor (f), welche repräsentativ sind
für ein akustisches Signal oder für ein Videosignal,
wobei das Programm aufweist:
einen Transformationsschritt (S41, S52), bei welchem ein vorbestimmter Transformationsvorgang
auf die zwei Merkmalsvektoren (f, g) angewandt wird,
einen Unterteilungsschritt (S41), bei welchem die zwei transformierten Merkmalsvektoren
(f', g') komponentenweise in eine Mehrzahl von Teilvektoren (f1, f2, g1, g2) unterteilt werden,
einen Aufzeichnungsschritt (S42, S43), bei welchem die Mehrzahl von Teilvektoren (g1, g2), die den transformierten registrierten Merkmalsvektor (g') bilden, aufgezeichnet
werden,
einen hierarchischen Distanzberechnungsschritt (S53, S54, S57, S58, S60, S63, S64),
bei welchem die Distanz zwischen den zwei Merkmalsvektoren (f', g'), die beim Transformationsschritt
transformiert wurden, in einer vorbestimmten Ordnung berechnet wird, und zwar auf
der Grundlage des vorbestimmten Transformationsvorgangs (S41, S52), wobei die Distanzberechnung
zwischen den jeweiligen Komponenten, welche die Teilvektoren (f1, f2, g1, g2) bilden, in einer komponentenweisen hierarchischen Art in der Ordnung vom Teilvektor
(f1, g1) der obersten Komponentenordnung durchgeführt wird,
einen Schwellwertvergleichsschritt (S55, S61), bei welchem ein integrierter Wert von
Distanzen, die inkrementweise für Komponenten (i) hierarchisch höherer Ordnung der
zwei transformierten Merkmalsvektoren (f', g') berechnet wurden, mit einem vorab eingestellten
Schwellwert (S) verglichen wird,
einen Steuerschritt (S55, S56, S57, S58, S61, S62, S63, S64), bei welchem die Distanzberechnung
gemäß einem Ergebnis des Schwellwertvergleichs beim Schwellwertvergleichsschritt (S55,
S61) gesteuert wird, und
einen Ausgabeschritt (S65), bei welchem der integrierte Wert der berechneten Distanzen
bis zu den letzten Komponenten (i) der zwei transformierten Merkmalsvektoren (f',
g') als Ähnlichkeit ausgegeben wird,
wobei beim Steuerschritt (S55, S56, S57, S58, S61, S62, S63, S64) die Steuerung derart
durchgeführt wird, dass die Distanzberechnung in dem Fall beendet wird, bei welchem
der integrierte Wert von Distanzen, die bis zu einer bestimmten Komponentenordnung
berechnet wurden, größer ist als der oder gleich ist zu dem Schwellwert (S), sowie
derart, dass die Distanzberechnung zwischen Komponenten nächst höherer Ordnung in
dem Fall durchgeführt wird, bei welchem der integrierte Wert von Distanzen, die bis
zu einer bestimmten Komponentenordnung berechnet wurden, niedriger ist als der Schwellwert,
und
wobei die Distanzberechnung derart ausgeführt wird, dass in einem ersten Schritt nur
der Teilvektor (g1) der obersten Komponentenordnung der Mehrzahl von Teilvektoren (g1, g2), die beim Aufzeichnungsschritt aufgezeichnet wurden, wieder gewonnen wird, und wobei
die Distanzberechnung zwischen jeweiligen Komponenten, die die Teilvektoren (f1, g1) der obersten Komponentenordnung bilden, in einer komponentenweisen hierarchischen
Art und Weise ausgeführt wird, und
wobei nur in dem Fall, bei welchem der integrierte Wert der berechneten Distanzen
zwischen allen Komponenten, die die Teilvektoren (f1, g1) der obersten Komponentenordnung bilden, unterhalb des Schwellwerts (S) liegt, in
einem zweiten Schritt der Teilvektor (g2) der nächst niedrigeren Komponentenordnung der Mehrzahl von Teilvektoren (g1, g2) des transformierten registrierten Merkmalsvektors (g'), die im Aufzeichnungsschritt
(S42, S43) aufgezeichnet wurden, wieder gewonnen und die Distanzberechnung zwischen
jeweiligen Komponenten, die Teilvektoren (f2, g2) der nächst niedrigeren Komponentenordnung bilden, ausgeführt wird.
1. Procédé de calcul de similitude pour déterminer une similitude entre deux vecteurs
caractéristiques, un vecteur préenregistré (g) et un vecteur d'entrée (f), qui sont
représentatifs d'un signal acoustique ou d'un signal vidéo, chacun des deux vecteurs
caractéristiques ayant N composantes correspondantes, N étant un entier supérieur
à zéro, le procédé comprenant les étapes suivantes :
une étape de transformation (S41, S52) dans laquelle une opération de transformation
prédéterminée (S41, S52) est mise en oeuvre sur les deux vecteurs caractéristiques
(f, g),
une étape de division (S41, S52) dans laquelle les deux vecteurs caractéristiques
transformés (f', g') sont divisés, composante par composante, en une pluralité de
vecteurs partiels (f1, f2, g1, g2),
une étape d'enregistrement (S42, S43) dans laquelle la pluralité de vecteurs partiels
(g1, g2) constituant le vecteur caractéristique préenregistré transformé (g') est enregistrée,
une étape de calcul de distance hiérarchique (S53, S54, S57, S58, S60, S63, S64) dans
laquelle la distance entre les deux vecteurs caractéristiques (f', g') transformés
à l'étape de transformation (S41, S52) est calculée, dans un ordre prédéterminé sur
la base de l'opération de transformation prédéterminée (S41, S52), le calcul de distance
étant effectué entre les composantes respectives constituant les vecteurs partiels
(f1, f2, g1, g2), dans un ordre hiérarchique, composante par composante, à partir du vecteur partiel
(f1, g1) de l'ordre de composante le plus élevé,
une étape de comparaison à une valeur seuil (S55, S61) dans laquelle une valeur intégrée
des distances calculées de façon incrémentielle pour les composantes (i), d'ordre
hiérarchique croissant, des deux vecteurs caractéristiques transformés (f', g') est
comparée à une valeur seuil (S) définie à l'avance,
une étape de contrôle (S55, S56, S57, S58, S61, S62, S63, S64) dans laquelle le calcul
de distance est contrôlé d'après un résultat de la comparaison à une valeur seuil
obtenu à l'étape de comparaison à une valeur seuil (S55, S61), et
une étape de sortie (S65) dans laquelle la valeur intégrée des distances calculées
jusqu'aux dernières composantes (i) des deux vecteurs caractéristiques transformés
(f', g') est produite en sortie en tant que similarité,
le contrôle à l'étape de contrôle (S55, S56, S57, S58, S61, S62, S63, S64) étant effectué
de façon que le calcul de distance soit tronqué lorsque la valeur intégrée des distances
calculées jusqu'à un certain ordre de composante est supérieure ou égale à la valeur
seuil, et de façon que le calcul de distance entre les composantes d'ordres plus élevés
suivants soit effectué lorsque la valeur intégrée des distances calculées jusqu'à
un certain ordre de composante est inférieure à la valeur seuil,
et le calcul de distance étant effectué de façon que, dans une première étape, seul
soit récupéré le vecteur partiel (g1) de l'ordre de composante le plus élevé de la pluralité de vecteurs partiels (g1, g2) enregistrée à l'étape d'enregistrement et que le calcul de distance soit effectué
entre les composantes respectives constituant les vecteurs partiels (f1, g1) de l'ordre de composante le plus élevé, dans un ordre hiérarchique, composante par
composante, et que, seulement lorsque la valeur intégrée des distances calculées entre
toutes les composantes constituant les vecteurs partiels (f1, g1) de l'ordre de composante le plus élevé est inférieure à la valeur seuil, dans une
deuxième étape, le vecteur partiel (g2) de l'ordre de composante inférieur suivant de la pluralité de vecteurs partiels
(g1, g2) du vecteur caractéristique préenregistré transformé (g') enregistrée à l'étape d'enregistrement
(S42, S43) soit récupéré et que le calcul de distance entre les composantes respectives
constituant les vecteurs partiels (f2, g2) de l'ordre de composante inférieur suivant soit effectué.
2. Procédé de calcul de similitude selon la revendication 1, dans lequel l'opération
de transformation prédéterminée (S41, S52) est une opération de transformation qui
exécute une mise en ordre des composantes respectives constituant les deux vecteurs
caractéristiques (f, g) selon la valeur de dispersion des composantes respectives,
et le calcul de distance entre les deux vecteurs caractéristiques (f', g') transformés
à l'étape de transformation (S41, S52) est effectué dans l'ordre à partir des composantes
ayant une grande dispersion à l'étape de calcul de distance hiérarchique (S53, S54,
S57, S58, S60, S63, S64).
3. Procédé de calcul de similitude selon la revendication 1, dans lequel l'opération
de transformation prédéterminée (S41, S52) est une Transformée en Cosinus Discrète
ou une Transformée de Fourier Discrète, et le calcul de distance entre les deux vecteurs
caractéristiques (f', g') transformés à l'étape de transformation (S41, S52) est effectué
dans l'ordre à partir d'une composante ayant une basse fréquence à l'étape de calcul
de distance hiérarchique (S53, S54, S57, S58, S60, S63, S64).
4. Procédé de calcul de similitude selon la revendication 1, dans lequel l'opération
de transformation prédéterminée (S41, S52) est une Transformée de Walsh-Hadamard,
et le calcul de distance entre les deux vecteurs caractéristiques transformés (f',
g') est effectué dans l'ordre à partir d'une composante ayant une basse fréquence
à l'étape de calcul de distance hiérarchique (S53, S54, S57, S58, S60, S63, S64).
5. Procédé de calcul de similitude selon la revendication 1, dans lequel l'opération
de transformation prédéterminée (S41, S52) est une Transformée de Karhunen-Loeve,
et le calcul de distance entre les deux vecteurs caractéristiques (f', g') transformés
à l'étape de transformation est effectué dans l'ordre à partir d'une composante ayant
une grande valeur propre à l'étape de calcul de distance hiérarchique (S53, S54, S57,
S58, S60, S63, S64).
6. Procédé de calcul de similitude selon la revendication 1, dans lequel le vecteur caractéristique
(a) est obtenu par extraction de coefficients de spectre de puissance (Sq) dans un
intervalle de temps prédéterminé d'un signal acoustique, les coefficients de spectre
de puissance (Sq) étant les composantes du vecteur caractéristique (a).
7. Procédé de calcul de similitude selon la revendication 1, dans lequel le vecteur caractéristique
(a) est obtenu par extraction de coefficients prédictifs linéaires dans un intervalle
de temps prédéterminé d'un signal acoustique.
8. Procédé de calcul de similitude selon la revendication 1, dans lequel le vecteur caractéristique
(a) est obtenu par extraction de paramètres indicatifs des intensités de composantes
de fréquence dans des trames respectives d'un signal acoustique codé, les paramètres
étant les composantes du vecteur caractéristique (a).
9. Procédé de calcul de similitude selon la revendication 1, dans lequel le vecteur caractéristique
(v) est obtenu par acquisition de trames d'image à partir d'une valeur de signal d'une
image représentative dans des intervalles de temps prédéterminés respectifs d'un signal
vidéo, préparation d'une image moyenne (100) des trames d'image acquises dans les
intervalles de temps prédéterminés respectifs, et préparation d'une image moyenne
à blocs (110) par division de l'image moyenne (100) en X x Y petits blocs dans les
sens de la hauteur et de la largeur et calcul de la moyenne des valeurs contenues
dans les petits blocs respectifs, puis classement des petits blocs dans un ordre R,
G, B, les valeurs de l'image moyenne à blocs (110) classées dans l'ordre R, G, B étant
les composantes du vecteur caractéristique (v).
10. Procédé de calcul de similitude selon la revendication 1, dans lequel le vecteur caractéristique
(v) est obtenu par préparation d'un histogramme en relation avec des valeurs de signaux
de luminance et/ou de couleur d'une trame d'image dans un intervalle de temps prédéterminé
d'un signal vidéo, les valeurs de signaux de luminance et/ou de couleur étant les
composantes du vecteur caractéristique (v).
11. A similarity calculation apparatus adapted to determine a similarity between two
feature vectors, a pre-registered vector (g) and an input vector (f), which are
representative of an acoustic signal or a video signal, the apparatus comprising:
transformation means (30, 31) adapted to perform a predetermined transformation operation
on the two feature vectors (f, g),
division means (30, 31) adapted to take out, in a predetermined order based on the
predetermined transformation operation, respective components constituting the two
feature vectors (f', g') transformed by the transformation means (30, 31), so as to
divide them into a plurality of partial vectors (f1, g1, f2, g2),
recording means (32, 33) adapted to record the plurality of partial vectors (g1, g2)
constituting the transformed pre-registered feature vector (g'),
hierarchical distance calculation means (34) adapted to perform a distance calculation
between the two feature vectors (f', g') transformed by the transformation means (30,
31), in a predetermined order based on the predetermined transformation operation,
the distance calculation means (34) being adapted to perform the distance calculation
between the respective components constituting the partial vectors (f1, g1, f2, g2)
in a hierarchical order, component by component, starting from the partial vector
(f1, g1) of the highest component order,
threshold comparison means (35) adapted to compare an integrated value of the distances
calculated incrementally by the distance calculation means (34) for the components,
in ascending hierarchical order, of the two transformed vectors (f', g') with a threshold
value (S) set in advance,
control means adapted to control the distance calculation in accordance with a result
obtained by the threshold comparison means (35), and
output means adapted to output, as the similarity, the integrated value of the distances
calculated up to the last components of the two transformed feature vectors (f', g'),
the control means operating such that the distance calculation is truncated when the
integrated value of the distances calculated up to a certain component order is greater
than the threshold value as a result of the comparison performed by the threshold
comparison means (35), and such that the distance calculation between the components
of the next higher orders is performed when the integrated value of the distances
calculated up to a certain component order is less than the threshold value,
and the hierarchical distance calculation means (34) operating such that, in a first
step, only the partial vector (g1) of the highest component order is retrieved from
the plurality of partial vectors (g1, g2) recorded in the recording means, and the
distance calculation is performed between the respective components constituting the
partial vectors (f1, g1) of the highest component order, in a hierarchical order,
component by component, and such that, only when the integrated value of the distances
calculated between all the components constituting the partial vectors (f1, g1) of
the highest component order is less than the threshold value (S), in a second step,
the partial vector (g2) of the next lower component order of the plurality of partial
vectors (g1, g2) of the transformed pre-registered feature vector (g') recorded in
the recording means (33) is retrieved and the distance calculation between the respective
components constituting the partial vectors (f2, g2) of a lower component order is
performed.
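The interplay of the hierarchical distance calculation means (34), the threshold comparison means (35) and the control means may be easier to follow from a short sketch. The sketch below is an illustration under stated assumptions, not the claimed implementation: the squared Euclidean distance, the function name hierarchical_distance and the callback load_g_part (standing in for retrieval from the recording means) are all hypothetical. Truncation occurs when the running sum exceeds the threshold, as worded in claim 11; claims 12 and 13 also truncate when the sum equals the threshold.

```python
from typing import Callable, Optional, Sequence


def hierarchical_distance(
    f_parts: Sequence[Sequence[float]],
    load_g_part: Callable[[int], Sequence[float]],
    threshold: float,
) -> Optional[float]:
    """Accumulate the distance partial vector by partial vector.

    f_parts[0] is the partial vector of the highest component order.
    load_g_part(level) is a hypothetical accessor that fetches the stored
    partial vector g1, g2, ... only when that level is actually reached.
    Returns the integrated distance, or None if the calculation was
    truncated because the running sum exceeded the threshold.
    """
    total = 0.0
    for level, f_part in enumerate(f_parts):
        g_part = load_g_part(level)          # retrieved lazily, level by level
        for f_c, g_c in zip(f_part, g_part):
            total += (f_c - g_c) ** 2        # squared Euclidean term (assumed)
            if total > threshold:            # threshold comparison
                return None                  # truncate: candidate rejected
    return total                             # output as the similarity
```

Because g2 is fetched only after every component of (f1, g1) has been processed without exceeding the threshold, a rejected candidate costs one retrieval instead of two.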
12. A program for causing a computer to execute similarity calculation processing for
determining a similarity between two feature vectors (f, g), a pre-registered vector
(g) and an input vector (f), which are representative of an acoustic signal or a video
signal, the program comprising:
a transformation step (S41, S52) in which a predetermined transformation operation
is performed on the two feature vectors (f, g),
a division step (S41, S52) in which the two transformed feature vectors (f', g') are
divided, component by component, into a plurality of partial vectors (f1, g1, f2, g2),
a recording step (S42, S43) in which the plurality of partial vectors (g1, g2) constituting
the transformed pre-registered feature vector (g') is recorded,
a hierarchical distance calculation step (S53, S54, S57, S58, S60, S63, S64) in which
the distance between the two feature vectors (f', g') transformed in the transformation
step is calculated, in a predetermined order based on the predetermined transformation
operation (S41, S52), the distance calculation being performed between the respective
components constituting the partial vectors (f1, g1, f2, g2) in a hierarchical order,
component by component, starting from the partial vector (f1, g1) of the highest component
order,
a threshold comparison step (S55, S61) in which an integrated value of the distances
calculated incrementally for the components (i), in ascending hierarchical order, of
the two transformed feature vectors is compared with a threshold value (S) set in
advance,
a control step (S55, S56, S57, S58, S61, S62, S63, S64) in which the distance calculation
is controlled in accordance with a result of the comparison with the threshold value
obtained in the threshold comparison step (S55, S61), and
an output step (S65) in which the integrated value of the distances calculated up
to the last components (i) of the two transformed feature vectors (f', g') is output
as the similarity,
the control in the control step (S55, S56, S57, S58, S61, S62, S63, S64) being performed
such that the distance calculation is truncated when the integrated value of the distances
calculated up to a certain component order is greater than or equal to the threshold
value (S), and such that the distance calculation between the components of the next
higher orders is performed when the integrated value of the distances calculated up
to a certain component order is less than the threshold value,
and the distance calculation being performed such that, in a first step, only the
partial vector (g1) of the highest component order is retrieved from the plurality
of partial vectors (g1, g2) recorded in the recording step, and the distance calculation
is performed between the respective components constituting the partial vectors (f1,
g1) of the highest component order, in a hierarchical order, component by component,
and such that, only when the integrated value of the distances calculated between
all the components constituting the partial vectors (f1, g1) of the highest component
order is less than the threshold value, in a second step, the partial vector (g2)
of the next lower component order of the plurality of partial vectors (g1, g2) of
the transformed pre-registered feature vector (g') recorded in the recording step
(S42, S43) is retrieved and the distance calculation between the respective components
constituting the partial vectors (f2, g2) of the next lower component order is performed.
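The transformation step and the division step may be illustrated as follows. The claims leave the "predetermined transformation operation" open; the orthonormal discrete cosine transform used here is only an assumed example, chosen because an orthonormal transform preserves the squared Euclidean distance, so that the sum over the first partial vector can never exceed the full distance and truncation cannot reject a candidate whose full distance is below the threshold. The function names and the partial-vector sizes are likewise hypothetical.

```python
import math
from typing import List, Sequence


def dct_orthonormal(v: Sequence[float]) -> List[float]:
    """Orthonormal DCT-II, used here as an assumed example transform."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        ))
    return out


def split_into_partial_vectors(
    v: Sequence[float], sizes: Sequence[int]
) -> List[List[float]]:
    """Cut a transformed vector into consecutive partial vectors.

    The first partial vector holds the components treated first in the
    hierarchical distance calculation (here the low-frequency coefficients).
    """
    parts, start = [], 0
    for size in sizes:
        parts.append(list(v[start:start + size]))
        start += size
    return parts


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))
```

For example, f1 and f2 would be obtained as split_into_partial_vectors(dct_orthonormal(f), [2, 2]) for a four-component input vector f, and g1 and g2 likewise from g before recording.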
13. A computer-readable medium adapted to have recorded thereon a program for causing
a computer to execute similarity calculation processing which determines a similarity
between two feature vectors (f, g), a pre-registered vector (g) and an input vector
(f), which are representative of an acoustic signal or a video signal, the program
comprising:
a transformation step (S41, S52) in which a predetermined transformation operation
is performed on the two feature vectors (f, g),
a division step (S41) in which the two transformed feature vectors (f', g') are divided,
component by component, into a plurality of partial vectors (f1, g1, f2, g2),
a recording step (S42, S43) in which the plurality of partial vectors (g1, g2) constituting
the transformed pre-registered feature vector (g') is recorded,
a hierarchical distance calculation step (S53, S54, S57, S58, S60, S63, S64) in which
the distance calculation between the two feature vectors (f', g') transformed in the
transformation step is performed, in a predetermined order based on the predetermined
transformation operation (S41, S52), the distance calculation being performed between
the respective components constituting the partial vectors (f1, g1, f2, g2) in a hierarchical
order, component by component, starting from the partial vector (f1, g1) of the highest
component order,
a threshold comparison step (S55, S61) in which an integrated value of the distances
calculated incrementally for the components (i), in ascending hierarchical order, of
the two transformed feature vectors (f', g') is compared with a threshold value (S)
set in advance,
a control step (S55, S56, S57, S58, S61, S62, S63, S64) in which the distance calculation
is controlled in accordance with a result of the comparison with the threshold value
obtained in the threshold comparison step (S55, S61), and
an output step (S65) in which the integrated value of the distances calculated up
to the last components (i) of the two transformed feature vectors (f', g') is output
as the similarity,
the control in the control step (S55, S56, S57, S58, S61, S62, S63, S64) being performed
such that the distance calculation is truncated when the integrated value of the distances
calculated up to a certain component order is greater than or equal to the threshold
value (S), and such that the distance calculation between the components of the next
higher orders is performed when the integrated value of the distances calculated up
to a certain component order is less than the threshold value,
and the distance calculation being performed such that, in a first step, only the
partial vector (g1) of the highest component order is retrieved from the plurality
of partial vectors (g1, g2) recorded in the recording step, and the distance calculation
is performed between the respective components constituting the partial vectors (f1,
g1) of the highest component order, in a hierarchical order, component by component,
and such that, only when the integrated value of the distances calculated between
all the components constituting the partial vectors (f1, g1) of the highest component
order is less than the threshold value (S), in a second step, the partial vector (g2)
of the next lower component order of the plurality of partial vectors (g1, g2) of
the transformed pre-registered feature vector (g') recorded in the recording step
(S42, S43) is retrieved and the distance calculation between the respective components
constituting the partial vectors (f2, g2) of the next lower component order is performed.