[0001] The present invention relates to a speech speed conversion method, a speech speed
conversion apparatus, and an electronic apparatus for modulating the speed of a voice,
and, particularly, to a technique which is effective on application to a control technique
for using such an apparatus in conversation and so on.
[0002] Devices for aiding the hearing of persons who are hard of hearing have conventionally
been mainly analog type hearing aids, which use analog circuits to process
the amplitude and frequency characteristics of a voice. In contrast, research and
development on applying digital signal processing to compensation for
hearing impairment has been actively pursued in recent years. The trend of the research
and development has been described in detail, for example, in "Application of Digital
Technique to Compensation for Hearing-Impairment", Journal of Acoustical Society of
Japan (Vol. 47, No. 10, pp.760-765, 1991), "Speech-perception aids for hearing-impaired
people: Current status and needed research", J. Acoust. Soc. Am. (90(2), Pt. 1, Aug.
1991), and so on.
[0003] To compensate for hearing loss, amplification of the amplitude of a speech signal
and compression of its dynamic range are generally performed for each frequency
in accordance with the hearing characteristic of the user. In the conventional analog
hearing aid, such a process is realized by an analog circuit. On the other hand, in
the digital hearing aid developed in recent years, this process is realized by software
such as a digital filter, or the like, so that adaptation to the hearing characteristic
of the user can be made in finer detail.
[0004] Following this trend, attempts have been made in recent years to change only the
speed of a voice by digital signal processing, without changing the pitch of the voice,
and thereby to provide a higher degree of hearing assistance that covers the whole
hearing system, including the decline of language processing speed. Such a
speech speed conversion technique has been described in detail, for example, in "Development
of Portable DSP System for performing Speech Processing for the Aged", Technical Research
Report of Institute of Electronics, Information and Communication Engineers of Japan
(Vol. 92, No. 207 SP92-54), "High-Quality Real-Time Speech Speed Conversion System",
ditto (SP92-55), and so on.
[0005] In the aforementioned conventional techniques, a broadcasted voice over the television/radio
or the like or a voice recorded in a tape recorder or the like was used as the voice
to be subjected to speech speed conversion. That is, the subject of speech speed conversion
was only a voice one-sidedly given to a listener.
[0006] Considering that conventional hearing aids can be used regardless of the kind of
input voice, it is however preferable that the speech speed conversion apparatus can
also accept voices other than the aforementioned voices as the input voice. In
particular, if the voice of a talker in conversation can be heard slowly, the apparatus
can be used not only as a hearing perception aid for aged or hearing-impaired people
but also as a listening aid for people without hearing impairment in conversation
in an unfamiliar foreign language, and so on.
[0007] Speech speed conversion methods and apparatus with the features included in the first
part of claim 1 and claim 2, respectively, are known from DE-A-4 227 826.
SUMMARY OF THE INVENTION
[0008] It is an object of the present invention to provide a technique capable of widening
the range of application of speech speed conversion.
[0009] This object is met by the invention characterised in claims 1 and 2 whereby, if a
time lag is caused by speed conversion or repeat operation, catching-up of the speech
is performed while stored speech is reproduced. This permits reducing the operating
time and improves the handling property of a speech speed conversion apparatus.
[0010] Preferred embodiments of the invention are set forth in the dependent claims.
[0011] According to claim 3, data are stored on a frame by frame basis, so that writing/reading
efficiency can be improved.
[0012] According to claim 4, the decision about waveform expansion and reduction processes,
silent-part elimination process, etc. in the speech speed conversion process is performed
based on comparison between power of a frame and a threshold, and the threshold is
changed in accordance with the loudness of the input speech. Accordingly, the speech
speed conversion process can be carried out in accordance with the environmental condition
in use.
[0013] According to claim 5, in the speech speed conversion apparatus, there are provided
a speech speed selection switch for selecting the speed of the speech, and means for
changing the speed of the speech to the speech speed selected by the speech speed
selection switch. Accordingly, the speed of the speech to be heard can be selected
by the listener's own will.
[0014] According to claim 6, means (AV control) for controlling an audio/video apparatus
is provided in the speech speed conversion apparatus. Accordingly, irrespective of
the expansion/reduction rate in the speech speed conversion, the following series of
operations is repeated: when the memory capacity is insufficient, a signal for pausing
the reproducing operation of the external apparatus is issued to temporarily stop the
inputting of the speech to the speech speed conversion apparatus; when there is some
free area in the memory, the outputting of the pause signal is stopped to restart the
inputting of the speech from the external apparatus. As a result, use of speech
speed conversion can be continued for a long time.
[0015] According to claims 7 to 9, in the speech speed conversion apparatus, there are provided
a repeat switch and means for repeating a reproduced speech in a period in which the
repeat switch is turned on. Accordingly, the speech speed conversion of the repeat
speech can be carried out.
[0016] According to claims 10 and 11, the catching-up means is provided such that widening
of the range of application of the speech speed conversion apparatus, reduction in
operating time, improvement in handling property, and so on, can be attained.
[0017] According to claims 12 and 13, at least one of the speech speed conversion switch, speech
speed selection switch, repeat switch and reset switch is provided in a peripheral
portion on a side surface of the speech speed conversion apparatus so as to perform
handling easily. Accordingly, widening of the range of application of the speech speed
conversion apparatus, reduction in operating time, improvement in handling property,
and so on, can be attained.
[0018] According to claims 14 to 19, the speech speed conversion means is provided as
software executed by a digital signal processor having an input terminal for receiving
an interruption request signal from the outside, so that controlling of the speech
speed conversion process or switching of the speech speed conversion rate on the basis
of the speech speed conversion switch is given to the digital signal processor via
the interruption request signal input terminal.
[0019] According to claim 20, the microphone does not pick up click noise of each switch,
so that loud noise at the time of the manipulation of the switch can be prevented.
[0020] According to claim 21, the switches have respective surfaces formed so as to differ
in tactile feel so that they can be identified without being seen, so that handling
property can be improved.
[0021] According to claim 22, there is provided means for preventing the rustle of clothes
in contact with the microphone, so that entrance of noise can be reduced.
[0022] According to claim 23, a display means is provided at a predetermined position of
the speech speed conversion apparatus so that the quantity of a time lag from the
real time can be indicated visually. Accordingly, reduction in operating time, improvement
in handling property, and so on, can be attained.
[0023] According to claim 24, a ring buffer is used as the memory means, and there is provided
means for managing a lag time by a counter indicating a time lag on the ring buffer.
Accordingly, the repeat process, the catching-up process, and so on, can be carried
out easily.
[0024] According to claim 25, a standby mode is provided besides the through mode, so that
reduction in consumed electric power can be attained.
[0025] According to claim 26, there is provided an electric source switch operated in three
stages consisting of an ON stage, an OFF stage and an ON-OFF intermediate stage so
that an analog through mode is provided. Accordingly, reduction in electric power
can be attained.
[0026] According to claim 27, the speech speed conversion means is provided between a handset
of a telephone and a body of the telephone. Accordingly, a speech to be subjected
to speech speed conversion can be selected by the listener without any disturbance
of the listener's own speech.
[0027] Further, in the telephone, the voice can be heard at a slow speech speed without
any change of the characteristic of the talker's voice.
[0028] According to claim 28, the speech speed conversion means is provided in a telephone
line switching system. Accordingly, the voice to be subjected to speech speed conversion
can be selected by the listener without any disturbance of the listener's own speech.
[0029] Still further advantages of the present invention will become apparent to those of
ordinary skill in the art upon reading and understanding the following detailed description
of the preferred and alternate embodiments.
BRIEF DESCRIPTION OF THE DRAWINGS
[0030]
Fig. 1 is a block diagram showing the schematic structure of internal circuits according
to the present invention;
Fig. 2 is a graph for explaining a speech speed conversion process executed within
a DSP according to the present invention;
Fig. 3 is a graph for explaining the concept of a threshold process according to the
present invention;
Fig. 4 is a view showing the form of use of the speech speed conversion apparatus
according to the present invention;
Fig. 5 is a flow chart showing the control procedure of the speech speed conversion
apparatus according to the present invention;
Fig. 6 is a front plan view of the speech speed conversion apparatus
according to the present invention;
Fig. 7 is a back plan view of the speech speed conversion apparatus
according to the present invention;
Fig. 8 is a top plan view of the speech speed conversion apparatus
according to the present invention;
Fig. 9 is a left plan view of the speech speed conversion apparatus
according to the present invention;
Fig. 10 is a right plan view of the speech speed conversion apparatus
according to the present invention;
Fig. 11 is a block diagram showing the functional structure of the speech speed conversion
apparatus according to the present invention;
Figs. 12A and 12B are typical graphs for explaining a compression process in a speech
compression portion according to the present invention;
Fig. 13 is a flow chart showing the procedure of a main process according to the present
invention;
Fig. 14 is a flow chart to be continued from the flow chart of Fig. 13;
Fig. 15 is a state transition view typically showing transition between respective
modes according to the present invention;
Fig. 16 is a flow chart showing the procedure of a reading pointer return routine
according to the present invention;
Fig. 17 is a flow chart showing the procedure of a one-frame waveform expansion/reduction
process according to the present invention;
Fig. 18 is a flow chart to be continued from the flow chart of Fig. 17;
Fig. 19 is a flow chart showing the procedure of a parameter setting process according
to the present invention;
Fig. 20 is a view for explaining the data compression process according to the present
invention;
Fig. 21 is a view for explaining the data compression process according to the present
invention;
Fig. 22 is a view for explaining the data compression process according to the present
invention;
Fig. 23 is a flow chart showing the procedure of the total operation of the speech
speed conversion apparatus provided with a continuous speech speed conversion means
according to the present invention;
Fig. 24 is a flow chart to be continued from the flow chart of Fig. 23;
Fig. 25 is a flow chart showing the procedure of the total operation of the speech
speed conversion apparatus provided with a continuous speech speed conversion means
according to the present invention;
Fig. 26 is a flow chart to be continued from the flow chart of Fig. 25;
Fig. 27 is a typical view for explaining an accelerator type switch used in the continuous
speech speed conversion means according to the present invention;
Fig. 28 is a block diagram showing the functional structure of the speech speed conversion
apparatus provided with an AV control means according to the present invention;
Fig. 29 is a view for explaining the operation of the AV control means according to
the present invention;
Fig. 30 is a flow chart showing the procedure of a main process in the speech speed
conversion apparatus provided with the AV control means according to the present invention;
Fig. 31 is a flow chart to be continued from the flow chart of Fig. 30;
Figs. 32A to 32C are views for explaining the arrangement of the microphone in the
speech speed conversion apparatus according to the present invention;
Fig. 33 is a view showing the structure of a modified example according to the present
invention;
Fig. 34 is a view for explaining a lag time display means in the speech speed conversion
apparatus according to the present invention;
Fig. 35 is a diagram for explaining an electric source device in the speech speed
conversion apparatus according to the present invention;
Fig. 36 is a diagram for explaining an embodiment in which the speech speed conversion
means according to the present invention is applied to a telephone; and
Fig. 37 is a diagram for explaining an embodiment in which the speech speed conversion
means according to the present invention is applied to a premises broadcasting system.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0031] Embodiments of the present invention will be described below in detail with reference
to the drawings.
[0032] In all the drawings for explaining the embodiments, parts having the same function
are denoted by the same reference characters, and repeated description thereof is omitted.
[0033] Fig. 1 is a block diagram showing the schematic structure of internal circuits according
to the present invention. The reference numeral 1 designates a DSP (Digital Signal
Processor); 11, software for performing a speech speed conversion process; 12, a
serial port; 13, a terminal for external interruption flag; 14, a flag register; 2,
a memory (output buffer); 3, a selector switch; 4, a PTL (Push-To-Listen) switch;
5, an A/D converter; 6, a D/A converter; 7, a low-pass filter; 8, a low-pass filter;
9, an analog amplifier; 10, an analog amplifier; 321, a microphone; and 325, a binaural
headphone (earphone).
[0034] In a speech speed conversion apparatus according to this embodiment, as shown in
Fig. 1, a voice is inputted to the microphone 321 and outputted as a voice signal
(an electric signal). This voice signal is inputted via the amplifier 10 and the low-pass
filter 7 to the A/D converter 5, in which the voice signal is converted from an analog
value into a digital value at intervals of a time set in advance.
[0035] The voice signal converted into a digital value as described above is inputted to
the DSP 1. Then, the speech speed conversion process of the voice signal is realized
by the software 11 on the DSP 1. The PTL switch 4 is connected to the external interruption
flag terminal 13 contained in the DSP 1, so that the state of the PTL switch 4 is
expressed as a numerical value of the flag register 14 which is provided in the inside
of the DSP 1 so as to correspond to this terminal 13. In the software 11 on the DSP
1, a judgment in accordance with the numerical value of the flag register 14 is made
as to whether the speech speed conversion process is to be performed or not to be
performed.
[0036] The digital voice data subjected to the speech speed conversion process is stored
in the output buffer memory 2. The D/A converter 6 converts the data of the output
buffer memory 2 from a digital value into an analog value at intervals of a time set
in advance. The analog signal obtained by this conversion is inputted via the low-pass
filter 8 to the analog amplifier 9 and outputted as a voice from the binaural headphone
325 at the volume preferred by the listener.
[0037] In this embodiment, two kinds of switches are prepared for the PTL switch 4. One
is a momentary switch, which conducts current only while the pushbutton is held down.
The other is a latching switch, which remains in the conducting state even after the
pushbutton is released. The former is used in conversation, whereas the latter is used
for continuous speech speed conversion of a one-sidedly given voice such as a radio
voice, which is the conventional form of use. Further, in this embodiment, the selector
switch 3 as well as the PTL switch 4 is connected to the external interruption flag
terminal 13 contained in the DSP 1. The numerical value of the flag register 14 is
changed by the changeover of the selector switch 3, so that the software 11 changes
the expansion rate of the speech speed conversion process in accordance with this
numerical value.
[0038] Fig. 2 is a view for explaining the speech speed conversion process which is performed
in the DSP 1 in this embodiment. The speech speed conversion process in this embodiment
is a method of detecting the pitch (basic period) of a voice signal and expanding
the length of a waveform with the detected pitch as a unit, in which a voice data
set of the order of tens of milliseconds (hereinafter referred to as a frame) is made
a unit for one process. Accordingly, at least two input buffers, each one frame long,
are prepared in the inside of the DSP 1 so that while data from the A/D converter 5 is
inputted to one buffer, data stored in the other buffer is processed (pipe-line process).
After being processed, the data is stored in the output buffer 2, which has a sufficiently
large capacity.
The procedure of processing data in each frame is as follows.
[0039] First of all,
(a) a pitch extracting process (not shown) is applied to the head portion of the frame
to thereby detect the pitch of this portion.
(b) Then, data of the length of the one pitch thus detected is transferred to the
output buffer 2.
(c) Then, data of the length of two pitches is multiplied by a window function which
changes from 0 to 1 and by a window function which changes from 1 to 0.
The positions in the data at which the two window multiplications start are, however,
shifted from each other by one pitch. Then, the two windowed results are added to each
other to generate a reproduced waveform having the time length of two pitches, which
is placed after the one-pitch data transferred previously.
(d) Then, the pitch detecting process (not shown) is carried out again, starting at
a position two pitches after the position in the data at which the previous pitch
extracting process was performed, so that the pitch at this position is detected.
Because the voice pitch varies continually, the second detection generally yields
a pitch different from the previously detected pitch.
(e) Data of the length of n pitches is transferred to the output buffer with the pitch
length obtained by this latest pitch detection as a unit.
[0040] The aforementioned steps (a) to (e) are repeated over the whole frame.
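By way of illustration only, steps (a) to (e) may be sketched in Python as follows. The sketch rests on several assumptions that are not stated in the text: the pitch detector is passed in as a hypothetical callback detect_pitch, the window functions are taken to be linear ramps, and the falling window is applied to the segment starting one pitch later while the rising window is applied to the segment at the head (chosen so that the output stays continuous at both ends of the cross-faded portion).

```python
import math


def expand_frame(x, n, detect_pitch):
    """Lengthen one frame following steps (a)-(e); returns the expanded samples.

    detect_pitch(x, pos) is a hypothetical callback returning the pitch period
    in samples at position pos; the text does not show the detector itself.
    """
    out = []
    pos = 0
    while pos < len(x):
        t = detect_pitch(x, pos)                  # (a) pitch at the current head
        if t < 1:
            raise ValueError("pitch period must be at least one sample")
        if pos + 3 * t > len(x):                  # too little data left to cross-fade
            break
        out.extend(x[pos:pos + t])                # (b) copy one pitch unchanged
        for k in range(2 * t):                    # (c) two-pitch cross-fade
            rise = k / (2 * t - 1)                # window going 0 -> 1 (assumed linear)
            fall = 1.0 - rise                     # window going 1 -> 0
            # Assumed pairing: falling window on the segment one pitch later,
            # rising window on the segment at the head.
            out.append(fall * x[pos + t + k] + rise * x[pos + k])
        pos += 2 * t                              # two input pitches are consumed
        t = detect_pitch(x, pos)                  # (d) detect the pitch again
        out.extend(x[pos:pos + n * t])            # (e) copy n pitches unchanged
        pos = min(pos + n * t, len(x))
    out.extend(x[pos:])                           # leftover samples pass through
    return out


# Usage: a tone with a 40-sample period and a stand-in detector that always
# reports that period. With n = 1, every 3 input pitches become 4 output pitches.
PERIOD = 40
tone = [math.sin(2 * math.pi * i / PERIOD) for i in range(600)]
slow = expand_frame(tone, n=1, detect_pitch=lambda x, pos: PERIOD)
```

For a perfectly periodic input such as this tone, the lengthened output is again a tone of the same period, which reflects the aim of changing the speed without changing the pitch.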
[0041] Because the pitch length depends on the input voice, the number of repeats in one
frame is not constant. Further, different expansion rates are realized by changing
the value of n in the aforementioned step (e). For example, data of the length of four
pitches is generated from data of the length of three pitches in the input buffers in
the condition of n=1, so that the expansion rate becomes 4/3=1.33 times. Similarly,
the expansion rate becomes 1.50 times in the condition of n=0 and 1.25 times in the
condition of n=2.
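The three rates quoted above are consistent with one pass of steps (a) to (e) consuming 2+n input pitches and producing 3+n output pitches. The general formula (3+n)/(2+n) is inferred here from those three values and is not stated in the text; a minimal check, with expansion_rate as an illustrative name:

```python
from fractions import Fraction


def expansion_rate(n: int) -> Fraction:
    """Output/input length ratio for one pass of steps (a)-(e).

    Inferred relation: (1 copied + 2 cross-faded + n copied) output pitches
    are produced from (2 + n) input pitches.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return Fraction(3 + n, 2 + n)


for n in (0, 1, 2):
    print(n, expansion_rate(n), round(float(expansion_rate(n)), 2))
```

Under this relation the rate approaches 1 as n grows, so larger n corresponds to milder slowing.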
[0042] Further, in this embodiment, the aforementioned speech speed conversion process in
Fig. 2 is not applied to every frame; it is applied only to a frame whose calculated
average power exceeds a threshold Th which was set in advance. Data in a frame having
power not exceeding the threshold Th is therefore transferred to the output buffer in
its original condition.
Fig. 3 shows a concept of this threshold process.
[0043] In Fig. 3, the portion in which the power of each frame exceeds the threshold Th
is expressed as a duration of expansion. Because the leading and trailing portions
of the voice signal are not processed but outputted in their original condition by
this threshold process, there is an advantage in that voice characteristics contained
in the leading and trailing portions of the voice, for example, consonantal characteristics,
are not destroyed.
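The gating by the threshold Th may be sketched as follows. The text does not give the formula for the average power; the mean of the squared samples is assumed here, and expand stands for whatever per-frame expansion routine is in use (a hypothetical parameter).

```python
def average_power(frame):
    """Assumed definition: mean of the squared sample values of one frame."""
    return sum(s * s for s in frame) / len(frame)


def process_frame(frame, th, expand):
    """Expand only frames whose average power exceeds Th; pass the rest through."""
    if average_power(frame) > th:
        return expand(frame)      # duration of expansion
    return list(frame)            # low-power frame is output in its original condition


# Usage with a stand-in "expansion" that merely repeats the first half of the frame.
stand_in = lambda f: list(f) + list(f[: len(f) // 2])
loud = process_frame([0.5, -0.5, 0.5, -0.5], th=0.01, expand=stand_in)
quiet = process_frame([0.01, -0.01, 0.01, -0.01], th=0.01, expand=stand_in)
```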
[0044] Further, in this embodiment, a second threshold To is provided in the threshold process
for the average power of each frame as shown in Fig. 3. In the case where frames
having power lower than this second threshold To continue for one second or longer,
those frames are processed so as not to be outputted.
Accordingly, reduction in the quantity of data stored in the output buffer is attained.
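One possible reading of this silent-part elimination is sketched below. The text does not say whether the first second of a long quiet run is itself kept; the sketch assumes a streaming interpretation in which the first second is still output and only the frames beyond it are dropped. The function name, the frame length of 20 ms and the mean-square power formula are illustrative assumptions.

```python
def drop_long_silence(frames, to, frame_ms, limit_ms=1000):
    """Return the frames that would be output under the second threshold To.

    Assumed behaviour: once frames with power below To have continued for
    limit_ms, further quiet frames are dropped until a louder frame arrives.
    """
    limit_frames = -(-limit_ms // frame_ms)       # ceiling division, in frames
    kept = []
    quiet_run = 0
    for frame in frames:
        power = sum(s * s for s in frame) / len(frame)
        if power < to:
            quiet_run += 1
            if quiet_run > limit_frames:          # duration of elimination
                continue
        else:
            quiet_run = 0
        kept.append(frame)
    return kept


# Usage: 10 loud frames, 70 quiet frames (1.4 s at 20 ms each), 5 loud frames.
frames = [[0.5] * 4] * 10 + [[0.0] * 4] * 70 + [[0.5] * 4] * 5
kept = drop_long_silence(frames, to=0.01, frame_ms=20)
```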
[0045] In Fig. 3, this not-outputted portion is expressed as a duration of elimination.
At the output buffer 2, data are outputted one sample at a time to the D/A converter 6
at regular time intervals, in parallel with the writing of the
speech-speed-conversion-processed data one frame at a time. Addresses in the output
buffer 2 are set in the form of a ring so that the last address is continued to the
first address.
[0046] Accordingly, in this ring-like address space, an address pointer Po, which points
to the data to be fed to the D/A converter, runs after an address pointer Pi, which
points to the destination for writing the speech-speed-conversion-processed
data. In this embodiment, Pi will sooner or later overtake Po by a full lap because
Pi advances faster than Po. From this point of time, information which has been
stored in the output buffer 2 is overwritten before it is outputted.
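The pointer behaviour described above may be sketched with a small ring buffer. The class name, the lag counter and the policy of advancing Po past an overwritten sample are illustrative assumptions; the text states only that unread data is overwritten once Pi overtakes Po.

```python
class RingBuffer:
    """Output buffer whose last address continues to its first address."""

    def __init__(self, size):
        self.data = [0] * size
        self.size = size
        self.pi = 0            # write pointer Pi (converted data, frame by frame)
        self.po = 0            # read pointer Po (samples fed to the D/A converter)
        self.lag = 0           # number of written samples not yet read
        self.overwritten = 0   # samples lost because Pi overtook Po

    def write_frame(self, samples):
        for s in samples:
            if self.lag == self.size:             # Pi has caught up with Po
                self.po = (self.po + 1) % self.size
                self.lag -= 1
                self.overwritten += 1
            self.data[self.pi] = s
            self.pi = (self.pi + 1) % self.size
            self.lag += 1

    def read_sample(self):
        if self.lag == 0:
            return None                           # nothing waiting to be output
        s = self.data[self.po]
        self.po = (self.po + 1) % self.size
        self.lag -= 1
        return s


# Usage: an 8-sample buffer written faster than it is read.
rb = RingBuffer(8)
rb.write_frame([1, 2, 3, 4, 5, 6])
first, second = rb.read_sample(), rb.read_sample()
rb.write_frame([7, 8, 9, 10, 11])                 # the last write overwrites sample 3
```

Keeping an explicit lag count makes it straightforward to tell how far the output is behind real time, in addition to detecting the overrun.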
[0047] Accordingly, the time from the start of the speech speed conversion operation until
this state is reached is the length of input voice that can be handled by the speech
speed conversion process of this embodiment. The reduction in the quantity of data
based on the aforementioned threshold To has the effect of lengthening this time.
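A rough estimate of this time can be derived under simplifying assumptions that are not in the text: every frame is expanded by the same rate r, nothing is eliminated, and reading proceeds at the sampling rate. Data is then written at r times the sampling rate and read at the sampling rate, so Pi gains on Po by (r - 1) samples per input sample and the buffer is lapped after B / (fs (r - 1)) seconds. The numbers in the usage lines are hypothetical.

```python
import math


def seconds_until_overrun(buffer_samples, sample_rate_hz, rate):
    """Input time until Pi laps Po, assuming a constant expansion rate."""
    if rate <= 1:
        return math.inf                      # no expansion: Pi never gains on Po
    return buffer_samples / (sample_rate_hz * (rate - 1))


# Hypothetical figures: 240000-sample buffer, 8 kHz sampling, rate 1.5.
limit = seconds_until_overrun(240_000, 8_000, 1.5)   # 60.0 seconds
```

Because frames below Th are not expanded and long quiet runs below To are not stored, the time available in practice would be longer than this estimate.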
[0048] Further, the signal processing method in the speech speed conversion process explained
above with reference to Figs. 2 and 3 has been reported in "Evaluation of Speech Speed
Conversion Method by Hearing-Impaired People" Technical Research Report of Institute
of Electronics, Information and Communication Engineers of Japan, SP92-150 (1993-03)
or "Discussion of Speech Speed Conversion Method using Portable DSP System" Proceedings
of Acoustical Society of Japan (March 1993), 1-7-6.
[0049] Fig. 4 is a view showing the form of use of the speech speed conversion apparatus
according to this embodiment. Although Fig. 4 shows the case where the PTL switch
4 is disposed on the upper surface of the apparatus, the switch may of course be
arranged at another position. On the other hand, the selector switch 3 for changing
the expansion rate of speech speed conversion is provided beside the PTL switch 4.
Because the selector switch 3 as well as the PTL switch 4 is provided so that the
state of the selector switch 3 can be observed through the external interruption flag
terminal of the DSP 1 from the software on the DSP 1, the value of n in the
aforementioned speech speed conversion process is changed in accordance with
the state of the selector switch 3 when the PTL switch 4 is pushed. The expansion
rate can be changed for each utterance by operating the PTL switch 4 and the selector
switch 3 alternately.
[0050] Fig. 5 shows the aforementioned control procedure expressed in a flow chart. In the
speech speed conversion process, a frame having a time length of the order of tens of
milliseconds, for example, is used as a unit of processing. The A/D conversion and D/A
conversion are carried out at regular intervals much shorter than that, for example,
of the order of tens of microseconds. As shown in Fig. 5, the A/D conversion, the D/A
conversion and their attendant processes are realized as an interruption process. While
the speech speed conversion process and a process of waiting for interruption are
carried out, the interruption process is executed in response to an interruption
signal from the serial port to which the A/D converter and the D/A converter are connected.
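The division of work between the per-sample interruption process and the per-frame main process can be imitated in ordinary Python as below. This is only a simulation under stated assumptions: a real DSP would run the handler from a hardware interrupt, and the class and function names (PingPongInput, run) are hypothetical. Two one-frame buffers are used alternately, one being filled while the other is processed.

```python
class PingPongInput:
    """Two one-frame input buffers used alternately (pipe-line process)."""

    def __init__(self, frame_len):
        self.frame_len = frame_len
        self.buffers = [[0] * frame_len, [0] * frame_len]
        self.fill = 0          # index of the buffer currently being filled
        self.count = 0         # samples stored so far in that buffer
        self.ready = None      # index of a completed buffer awaiting processing

    def on_sample_interrupt(self, sample):
        """Stand-in for the interruption process run at each sampling instant."""
        self.buffers[self.fill][self.count] = sample
        self.count += 1
        if self.count == self.frame_len:
            self.ready = self.fill
            self.fill ^= 1     # switch to the other buffer
            self.count = 0

    def take_ready_frame(self):
        """Called by the main process; returns a full frame or None."""
        if self.ready is None:
            return None
        frame = list(self.buffers[self.ready])
        self.ready = None
        return frame


def run(stream, frame_len):
    """Feed samples one by one and collect each frame the main process receives."""
    inp = PingPongInput(frame_len)
    frames = []
    for sample in stream:
        inp.on_sample_interrupt(sample)      # interruption process
        frame = inp.take_ready_frame()       # main process polls between interrupts
        if frame is not None:
            frames.append(frame)             # speech speed conversion would go here
    return frames


frames_seen = run(range(10), frame_len=4)
```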
[0051] As is obvious from the above description, in accordance with this embodiment, the
speech speed conversion apparatus can be used not only for the voice one-sidedly given
to a listener like a radio broadcasting voice but also in the situation of conversation,
so that the listener can select the voice subjected to the speech speed conversion
without any disturbance of the listener's own speech.
[0052] Further, the speech speed conversion apparatus of this embodiment can be used to
compensate for the deterioration of voice hearing ability as observed in the aged
or the like. It is further needless to say that the apparatus can be used even in
the situation in which a listener who has no difficulty in hearing hears an unfamiliar
foreign language.
[0053] Figs. 6 through 10 show the external appearance of the speech speed conversion
apparatus according to the present invention.
Fig. 6 is a front plan view; Fig. 7 is a back plan view;
Fig. 8 is a top plan view; Fig. 9 is a left plan view;
and Fig. 10 is a right plan view.
[0054] In Figs. 6 to 10, the reference numeral 101 designates a body of the speech speed
conversion apparatus; 102, a back cover; 103, a finger stop hollow; 104, a slow switch
(slow pushbutton); 105, a repeat switch (repeat pushbutton); 106, a reset switch (reset
pushbutton); 321, a microphone; 108, a voice volume; 109, an electric source switch;
110, an earphone terminal; 111, an external input terminal; 112, an AV control terminal;
and 113, a speech speed changeover switch (speech speed setting switch).
[0055] As shown in Figs. 6 to 10, in the speech speed conversion apparatus of this embodiment,
the slow switch 104, the repeat switch 105 and the reset switch 106 are provided in
positions on the body 101 of the speech speed conversion apparatus where they are easy
to operate with one hand, for example, in the front upper side portion, and the speech
speed changeover switch 113 is provided on the side shown in the right plan view (Fig. 10).
[0056] The pushbutton of the aforementioned slow switch 104 is formed so as to be larger
than the other pushbuttons because it is pushed more frequently. Further,
because holding the slow pushbutton down continuously is tiring, the pushbutton is
provided so that it can be locked in the pushed state. For example, there are used
(1) a slide lock type in which the pushbutton is locked when it is pushed and slid
laterally, (2) a double click type in which the pushbutton is locked when it is clicked
twice, (3) a type in which the hold of the pushbutton is released when the reset
pushbutton is pushed, and so on.
[0057] The aforementioned speech speed changeover switch (speech speed setting switch) 113
is disposed within reach of the finger that operates the slow switch 104 so
that this switch and the slow switch 104 can be operated alternately.
[0058] Besides the position in the aforementioned embodiment, a ring switch, a slide switch,
and so on, may be used to make the operation easier.
[0059] The aforementioned voice volume 108 is also disposed within reach of the same
finger so that it is easy to adjust, making it possible to always listen at an
appropriate voice volume.
[0060] Further, a switch which has a feeling of soft touch so that the microphone 321 does
not pick up click noise of the switch is preferably used as the aforementioned switches
high in the frequency of use, such as the slow switch 104, the repeat switch 105,
the reset switch 106, the speech speed changeover switch 113, and so on. For example,
a switch using electrically conductive rubber or the like is used.
[0061] Further, the surfaces of the aforementioned respective switches are preferably
formed so as to differ in tactile feel, so that the kinds of the switches can be
identified without looking.
[0062] The structure is such that, when the aforementioned finger stop hollow 103 is
opened, some switches, such as a speech speed selection switch for repeat time, and so
on, become visible.
[0063] The internal circuit structure of the speech speed conversion apparatus in this embodiment
is formed so as to be identical to the aforementioned circuit structure shown in Fig.
1.
[0064] As the PTL switch 4 in the previous embodiment, there are used the slow switch 104,
the repeat switch 105, the reset switch 106, and so on, as described above. Further,
as the selector switch 3 in the previous embodiment, there is used the speech speed
changeover switch (speech speed setting switch) 113. Further, the speech speed changeover
switch (speech speed setting switch) 113 is connected to the external interruption
flag terminal 13 contained in the DSP 1. The numerical value of the flag register 14
is changed by the changeover of the speech speed changeover switch 113, so that the
software 11 changes the expansion rate of the speech speed conversion process in accordance
with this numerical value.
[0065] Fig. 11 is a block diagram showing the functional structure of the speech speed conversion
apparatus in this embodiment, in which the reference numeral 21 designates speech
input devices; 22, input buffers; 23, a central processing unit (CPU); 24, a ring
buffer memory (which corresponds to the memory 2 in Fig. 1); 25, a function chooser;
26, output buffers; and 27, speech output devices.
[0066] Making the constituent parts of this embodiment correspond to those of the previous
embodiment, the speech input devices 21 are constituted by the microphone 321, analog
amplifier 10, low-pass filter 7 and A/D converter 5 of Fig. 1.
[0067] The aforementioned input buffers 22 serve to hold a speech converted into a digital
signal by the aforementioned speech input devices 21 and have a size enough to hold
data of the length of one frame which is a unit for signal processing after that.
These input buffers 22 can be realized by the allocation of a part of addresses of
the ring buffer memory 24 (which corresponds to the memory 2 in Fig. 1).
[0068] The aforementioned central processing unit (CPU) 23 which corresponds to the portion
of software executed on the DSP 1 shown in Fig. 1, has an encoder 23A, a silent-part
elimination process 23B, a decoder 23C, a wave-form manipulation process (speech speed
conversion process) 23D, and a controller 23E.
[0069] The aforementioned function chooser 25 which corresponds to the portion constituted
by the switches 3 and 4 and the external interruption flag terminal 13 shown in Fig.
1, is constituted by the slow switch 104, the repeat switch 105, the reset switch
106, the speech speed changeover switch 113, and so on, as described above.
[0070] The aforementioned output buffers 26 which serve to hold resulting data processed
by the aforementioned wave-form manipulation process 23D are two in practice and each
of them has a size enough to store data of the length of one frame expanded by wave-form
manipulation. In the previous embodiment, two input buffers are provided so that a
pipe-line process is realized by using them alternately, whereas in this embodiment
a pipe-line process is realized by using two output buffers alternately in the same
manner as in the previous embodiment.
[0071] That is, while the wave-form manipulation process of one frame is carried out so
that a result of the process is held in one output buffer, a result of the wave-form
manipulation process obtained in the previous cycle is outputted from the other output
buffer via the speech output devices 27. These output buffers 26 can be realized by
the allocation of a part of addresses of the ring buffer memory 24 (which corresponds
to the memory 2 in Fig. 1).
[0072] The inputting of data to the input buffers 22 and the outputting of data from the
output buffers 26 are carried out at intervals of the sampling rate of the A/D converter
5 and of the D/A converter 6 in the same manner as in the previous embodiment. The
process executed by the DSP 1 is therefore constituted by a wave-form manipulation
process for each frame and an interruption process executed at sampling intervals.
[0073] That is, the interruption process is executed any number of times while the wave-form
manipulation process is applied to data of the length of one frame, so that the two
processes appear to be executed simultaneously.
[0074] As the aforementioned ring buffer memory 24, there is used a well-known type memory
in which writing/reading is performed for each frame. The details thereof will be
described below.
(Writing Operation)
[0075] In Fig. 11, speech data inputted through the speech input devices 21 are held in
the input buffers 22. The input buffers 22 have a capacity enough to hold a number
of data corresponding to one frame so that the code length of 16 bits per one data
is allocated thereto, and the input buffers 22 are realized by the allocation of a
part of addresses on the memory 2 shown in Fig. 1.
[0076] The controller 23E shown in Fig. 11 monitors the state of these input buffers 22
and transfers speech data of the length of one frame to the encoder 23A whenever the
input buffers 22 are filled with the data of the length of one frame.
[0077] In the encoder 23A, the input speech data of the length of one frame is subjected
to an information compression process, so that the data as a result of the compression
is held in the ring buffer memory 24. Several methods are considered as this compression
process. One example thereof is a difference data holding method shown in Figs. 12A
and 12B. Figs. 12A and 12B are typical graphs for explaining the compression process
in the encoder 23A in this embodiment.
[0078] In this compression process, "difference from the previous data" is calculated successively
from the leading data of each frame. In Fig. 12A, these difference data are expressed
as Δ1, Δ2,... The output data of the compression process are data obtained by arranging
the aforementioned difference data Δ1, Δ2,... into the code length of 8 bits per one
data after dividing the leading data of the frame into upper 8 bits and lower 8 bits.
One data of the input data has a digital code length of 16 bits. In the case of an
input signal such as a voice signal which changes sufficiently slowly compared with
the sampling interval, the difference from the previous sampling value is however
not so large that the difference can be expressed sufficiently in the code length
of 8 bits which is a half as shown in Fig. 12B. The capacity of data after the compression
process is therefore about a half the capacity of data before the compression process
but there is no missing from the contents thereof as long as the difference in the
middle of the process does not become too large to be expressed in the code length
of 8 bits.
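The difference data holding method described above may be sketched as follows. This is only an illustrative sketch and not the implementation of the encoder 23A: the function name encode_frame is hypothetical, samples are assumed to be signed 16-bit integers, differences are assumed to be stored as two's-complement bytes, and raising an error when a difference does not fit in 8 bits is an assumption, since the handling of that case is not described.

```python
def encode_frame(samples):
    """Compress one frame of signed 16-bit samples into a list of byte values.

    Output layout: the leading sample split into upper and lower 8 bits,
    followed by one 8-bit difference for each remaining sample.
    """
    if not samples:
        raise ValueError("frame must contain at least one sample")
    first = samples[0] & 0xFFFF           # two's-complement view of the leading sample
    out = [first >> 8, first & 0xFF]      # upper 8 bits, then lower 8 bits
    prev = samples[0]
    for s in samples[1:]:
        delta = s - prev                  # "difference from the previous data"
        if not -128 <= delta <= 127:
            # Assumption: the text does not say what happens in this case.
            raise ValueError("difference does not fit in 8 bits")
        out.append(delta & 0xFF)          # store as a two's-complement byte
        prev = s
    return out


print(encode_frame([1000, 1010, 1005]))
```

For a frame of N samples the input occupies 2N bytes and the output N + 1 bytes, which corresponds to the "about a half" capacity mentioned above.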
[0079] In the storage into the ring buffer memory 24, data thus compressed into a half capacity
for each frame are arranged on the ring buffer memory 24 so that the time sequence
thereof is maintained.
[0080] In addition to this, a frame header is added to the head of the compressed data
of each frame in order to indicate a break between frames. In the compression process
portion, the calculation of the sum of the absolute values of all data in the frame
as well as the aforementioned compression process in Fig. 12 is carried out and, at
the same time, the work of recording a result thereof as the power value of this frame
in the aforementioned frame header portion is carried out.
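The recording of the frame power in the frame header may be pictured by the following sketch. The names StoredFrame and make_stored_frame are hypothetical, and the actual layout of the frame header on the ring buffer memory 24 is not specified in the description; only the definition of power as the sum of the absolute values of all data in the frame is taken from the text.

```python
from dataclasses import dataclass


@dataclass
class StoredFrame:
    power: int      # header field: sum of absolute sample values of the frame
    payload: list   # compressed data of the frame (byte values)


def make_stored_frame(samples, payload):
    """Attach a header carrying the frame power to the compressed payload."""
    power = sum(abs(s) for s in samples)
    return StoredFrame(power=power, payload=payload)


frame = make_stored_frame([3, -4, 5], [0, 3, 249, 9])
print(frame.power)
```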
[0081] The determination of a frame to be subjected to the waveform expansion/reduction
process is performed on the basis of comparison between the power of the frame and
the threshold Th. Further, the silent-part elimination process is carried out on the
basis of comparison between the power of the frame and the threshold To.
[0082] It is preferable that these thresholds are not used as fixed values but changed in
accordance with the loudness of the input voice. For example, between the case of
use in a quiet room and the case of use in a situation of large background noise,
speech speed conversion, of course, cannot be performed well unless these thresholds
are adjusted well.
[0083] In a specific realization method, the maximum/ minimum values of frame power in the
past period of several seconds are stored so that the aforementioned thresholds are
determined on the basis of these values. For example, in the case where these thresholds
are to be changed at intervals of five seconds in the condition in which the time
length of one frame is 50 milliseconds (msec), the process of changing the threshold
Th can be carried out once whenever 100 frames are processed.
[0084] As described above, the power of each frame is always calculated with respect to
all inputs whenever information compression is performed for each frame by the encoder
in Fig. 11, so that information thereof is recorded in the frame header and held in
the ring buffer 24.
[0085] In this calculation of frame power, the maximum frame power Pmax and the minimum
frame power Pmin are compared with each other so that they are updated if necessary.
If the maximum frame power Pmax and the minimum frame power Pmin are provided so as
to be reset at intervals of five seconds (100 frames), the maximum frame power and
the minimum frame power in the past period of five seconds can always remain.
[0086] In the calculation of the thresholds, for example, Th and To are set to 10 % and
5 % of the difference between the maximum frame power Pmax and the minimum frame power
Pmin, respectively. These are given by the following expressions (1) and (2).
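The threshold calculation may be sketched as follows, assuming a literal reading of the sentence above, namely that each threshold is a fixed fraction of the difference Pmax - Pmin with no offset added. The function name update_thresholds and its parameters are hypothetical. The figure of 100 frames follows from the example of a five-second interval and a 50 msec frame.

```python
FRAME_MSEC = 50
UPDATE_INTERVAL_MSEC = 5000
FRAMES_PER_UPDATE = UPDATE_INTERVAL_MSEC // FRAME_MSEC   # 100 frames


def update_thresholds(frame_powers, th_ratio=0.10, to_ratio=0.05):
    """Return (Th, To) from the frame powers observed in the last interval."""
    p_max = max(frame_powers)
    p_min = min(frame_powers)
    span = p_max - p_min
    # Assumption: thresholds are plain fractions of the span (no Pmin offset).
    return th_ratio * span, to_ratio * span


th, to = update_thresholds([100, 1100, 600])
print(FRAMES_PER_UPDATE, th, to)
```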
[0087] Although the method of holding raw data in the ring buffer memory 24 in this embodiment
has been described above, the details of silent-part elimination will be described
below.
[0088] As explained in the previous embodiment with reference to Fig. 3, the function of
silent-part elimination serves to eliminate a silent part (a duration in which power
is lower than the voice-part/silent-part threshold To) continued for a time not smaller
than one second.
[0089] The silent-part elimination process is carried out by the silent-part elimination
process 23B shown in Fig. 11. This silent-part elimination process is a process independent
of a later-described process executed for each frame (hereinafter referred to as a
main process) so that the process is carried out after the main process for one frame
is terminated. In Fig. 14, the process is carried out between the judgment "time lag=0?"
(S143) and the judgment "Power ON?" (S144) (though not shown).
[0090] In the silent-part elimination process 23B, data accumulated in the input buffers
22 are added up at intervals of a predetermined unit (for example, 1/4 frame) to calculate
power, so that the silent-part elimination operation is started when the power "crosses
the voice-part/silent-part threshold upwards". This is because the point of time of
termination of the silent part is the point of time of the change of power from a
small value to a large value and, at any point of time except this point of time,
a judgment cannot be made as to whether the silent part continued up to that is longer
than one second or not.
[0091] When the silent-part elimination process is started, first the frame header of the
ring buffer memory 24 is retrieved retroactively to the past. Compressed data on the
ring buffer memory 24 are compressed for each frame and, as described above, the power
value of the frame is recorded in the frame header. If a frame having power lower
than To is continued for a time not smaller than one second, silent-part elimination
is enabled and the input pointer to the ring buffer memory 24 is returned to the point
of time in which the silent part has been continued for one second. The input of the
next compressed data is recorded so as to be overwritten from the returned point of
time. Accordingly, the silent part continued for a time not smaller than one second
just before the current point of time is always eliminated.
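The return of the input pointer may be illustrated by the following simplified sketch. It models the frame headers as a plain list of power values, oldest first, rather than as a ring buffer, and it expresses positions in frames. The function name rewind_for_silence is hypothetical, and 20 frames per second is assumed from the 50 msec frame length used as an example above.

```python
def rewind_for_silence(frame_powers, to_threshold, frames_per_second=20):
    """Return the input position (in frames) after silent-part elimination.

    frame_powers holds the power values read from the frame headers, oldest
    first, up to the current input position.
    """
    # Search the frame headers retroactively and count consecutive silent frames.
    run = 0
    for p in reversed(frame_powers):
        if p >= to_threshold:
            break
        run += 1
    if run < frames_per_second:
        return len(frame_powers)          # silent part shorter than one second
    silence_start = len(frame_powers) - run
    # Keep the first second of the silent part and overwrite the remainder.
    return silence_start + frames_per_second


print(rewind_for_silence([500] * 3 + [1] * 30, to_threshold=10))
```

In this sketch a 30-frame silent part preceded by three voiced frames results in the input position being returned from frame 33 to frame 23, so that one second of the silent part is retained.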
(Reading Operation)
[0092] The later-described main process in the apparatus of this embodiment is carried out
for each frame. The wave-form manipulation process 23D shown in Fig. 11 therefore
holds currently processed frame data, so that reading from the ring buffer memory
24 is performed collectively for each frame. That is, because addressing to the ring
buffer memory 24 can be made easily by a process of increasing the address one by
one simply in the case where data are collectively picked out, this case is better
in efficiency than the case where data are one by one picked out.
[0093] Because the data stored in the ring buffer memory 24 are compressed data as described
above, it is necessary that this compression is decoded into the original data before
the wave-form manipulation process. The decoder 23C shown in Fig. 11 is provided for
this purpose. First, leading two 8-bit data are arranged in the upper/lower of 16
bits with one-frame compressed data as an input to generate a leading data. Then,
the value of the third data of the compressed data is added to the leading data to
restore the second data. Then, the value of the next data of the compressed data is
added to the second data to restore the third data. Thereafter, the work of adding
the compressed data to the previously restored data successively is repeated thus
to restore all data of the frame.
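The decoding steps described above may be sketched as follows. This is an illustrative sketch only: the function name decode_frame is hypothetical, and the interpretation of each byte as a two's-complement value is an assumption about how the difference data are stored.

```python
def decode_frame(data):
    """Restore signed 16-bit samples from one frame of compressed byte values."""
    if len(data) < 2:
        raise ValueError("compressed frame needs at least two bytes")
    # Arrange the leading two 8-bit data into the upper/lower halves of 16 bits.
    first = (data[0] << 8) | data[1]
    if first >= 0x8000:
        first -= 0x10000                  # reinterpret as a signed 16-bit value
    samples = [first]
    for b in data[2:]:
        delta = b - 256 if b >= 128 else b    # signed 8-bit difference
        samples.append(samples[-1] + delta)   # add to the previously restored data
    return samples


print(decode_frame([3, 232, 10, 251]))
```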
[0094] The basic operation of the speech speed conversion apparatus in this embodiment will
be described below in brief.
[0095] As shown in Fig. 11, the speech converted into a digital signal by the speech input
devices 21 is first inputted to the input buffers 22. The speech signal read from
the input buffers 22 is fed to the encoder 23A contained in the CPU 23 of DSP1 (Fig.
1), subjected to the data compression process and stored in the ring buffer memory
24. The aforementioned speech signal is also fed to the silent-part elimination process
23B so that the silent-part elimination process is applied to the data stored in the
ring buffer memory 24 if necessary.
[0096] The data of the speech signal stored in the ring buffer memory 24 are frame-by-frame
fed to the decoder 23C, so that the compressed speech data are decoded by the decoder
23C and inputted to the wave-form manipulation process (speech speed conversion process)
23D. In the wave-form manipulation process (speech speed conversion process) 23D,
there is carried out speech speed conversion or the like on the basis of the condition
set by the function chooser 25. The digital speech data subjected to the speech speed
conversion process or the like are held in the output buffers 26. The data of the
output buffers 26 are read out so that the speech subjected to the speech speed conversion
process or the like is outputted from the speech output devices 27.
[0097] That is, the data of the output buffers 26 are read out so that the data are converted
from a digital value into an analog value at intervals of a set time by the D/A converter
6 as shown in Fig. 1. The analog signal thus obtained by this conversion is inputted
to the analog amplifier 9 via the low-pass filter 8 and outputted as a voice from
the binaural headphone 325 at the listener's preferred amplitude.
[0098] Referring to Figs. 11, 13 and 14, the process executed for each frame (hereinafter
referred to as a main process) in this embodiment will be described below.
[0099] Figs. 13 and 14 are flow charts showing the procedure of the main process in this
embodiment.
[0100] As shown in Fig. 13, in the main process in this embodiment, the "fade-in" step is
carried out (S131) with Powering ON. That is, just after the powering-on of the electric
source, data stored in the output buffers 26 are indefinite. Just after the powering-on
of the electric source, data having no relation to the speech may be therefore outputted.
In the case where the data are outputted intact from the speech output devices 27,
the data may form noise of a very large level. To prevent this, in this embodiment,
the values of data in the output buffers are adjusted by the execution of the fade-in
step so that the output of the speech output devices is increased gradually for a
predetermined time after the powering-on of the electric source irrespective of the
data in the output buffers. Specifically, whenever one data is transferred from the
output buffers to the D/A converter, the value of this data is multiplied by a coefficient,
so that this function is realized by changing the value of the coefficient with the
passage of time. This operation is executed by the controller 23E shown in Fig. 11.
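The fade-in step may be sketched as follows. The description states only that each data is multiplied by a coefficient which changes with the passage of time; the linear ramp used here, and the names fade_in_gain, apply_fade_in and fade_samples, are assumptions made for illustration.

```python
def fade_in_gain(sample_index, fade_samples):
    """Coefficient applied to the sample transferred at sample_index."""
    if sample_index >= fade_samples:
        return 1.0                        # fade-in finished: output intact
    return sample_index / fade_samples    # assumed linear ramp from 0 to 1


def apply_fade_in(samples, fade_samples):
    """Multiply each output sample by the time-varying coefficient."""
    return [int(s * fade_in_gain(i, fade_samples)) for i, s in enumerate(samples)]


print(apply_fade_in([1000] * 5, fade_samples=4))
```

Because the coefficient starts at zero, indefinite data present in the output buffers just after powering-on cannot appear at full level. The mute step at power-off could use the same structure with a coefficient that decreases instead.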
[0101] Thereafter, the "through mode" process is started. In the through mode process, first,
the "reading pointer coincidence" step is carried out (S132). This reading pointer
coincidence process is a process in which when data from the speech input devices
21 is inputted, the same data is inputted to the output buffers 26 just after the
inputting of the data to the input buffers 22. This operation is realized by making
the value of the input pointer pointing an input address on memory coincident with
the value of the output pointer pointing an output data address on memory just after
the inputting of data to the input buffers 22. In Fig. 11, this operation is carried
out by the controller 23E.
[0102] After the reading pointer coincidence, in the through mode, the pushed states (ON
states) of the slow switch 104 and the repeat switch 105 are checked (S133 and S134).
In the case where both switches are in non-pushed states (OFF states), the situation
of the routine goes back to the previous reading pointer coincidence step (S132) so
that the through mode is continued. Accordingly, in the interruption process which
occurs while the through mode is continued, input data is always outputted intact,
so that the same speech as the input speech is outputted from the speech output devices
27.
[0103] In Fig. 11, the aforementioned respective switches such as the slow switch 104, the
repeat switch 105 and the reset switch 106 are contained in the function chooser 25
and the states thereof are checked by the controller 23E.
[0104] When the repeat switch 105 is pushed (turned ON) in the aforementioned through mode,
a repeat flag (not shown) prepared separately is set from 0 to 1 and the "reading
pointer return" routine is carried out (S135). Fig. 16 shows a flow chart of the internal
procedure of this reading pointer return routine. The explanation of Fig. 16 will
be described later.
[0105] When the slow switch 104 is pushed in the aforementioned through mode, the situation
of the routine skips to the routine for "setting parameter in expansion process" (S136)
as shown in Fig. 13. Fig. 17 shows a flow chart of the internal procedure of this
routine. The explanation of Fig. 17 will be described later.
[0106] After the setting of parameter in the expansion process is performed, the one-frame
waveform expansion/reduction process is carried out (S137). Figs. 18 and 19 show flow
charts of the internal procedure of this one-frame waveform expansion process. The
explanation of Figs. 18 and 19 will be described later.
[0107] After the aforementioned one-frame process is completed, the situation of the routine
goes to the step for checking the states of the respective switches as to whether
each switch is pushed or not. Hereupon, because the one-frame process is terminated
within the time length of one frame, the process is completed in the order of tens
of milliseconds (msec). On the other hand, the switches (pushbuttons) used in this
apparatus are of a type in which the pushed state is maintained for at least the
duration of pushing by the user, no matter how short that duration is.
Accordingly, the situation of the routine can
be shifted to a desired operation with such a time lag that a feeling of slow response
is not given to the user, as long as the pushed states of the switches are checked
whenever the one-frame process is carried out.
[0108] First, whether the reset switch 106 is pushed down or not is checked (S138). If the
reset switch 106 is pushed down (in the case of Yes in S138), the current mode forcedly
goes to the through mode at this point of time.
[0109] If the reset switch 106 is not pushed down (in the case of No in S138), whether the
slow switch 104 is pushed down or not is checked (S139) as shown in Fig. 14. If the
slow switch 104 is pushed down (in the case of ON: in the case of Yes in S139), the
situation of the routine goes back to the routine in which parameter is set in the
expansion process so that the wave-form expansion process is applied to the next frame
continuously. In the case where the slow switch 104 is pushed down continuously, the
situation of the routine circulates in this loop continuously. Further, even in the
case where the slow switch 104 is pushed down continuously during the repeat reproduction
and catching-up reproduction, the situation of the routine circulates in this loop
continuously.
[0110] In the case where the slow switch 104 is opened (in the case of OFF: in the case
of No in S139), the situation of the routine goes to the next judgment as to the repeat
pushed-down state (S140). The case where the pushing-down of the repeat switch 105
is detected at this point of time is either the case where "the repeat switch is pushed
at the time of repeat reproduction" or the case where "the repeat switch is pushed
at the time of catching-up reproduction". In either case, the situation of the routine
branches into the reading pointer return routine so that the repeat reproduction is
started from the silent part near a position returned back to the past by about five
seconds from the current position of the output pointer of the ring buffer memory
24.
[0111] In the case where the slow switch 104 is opened and the repeat switch 105 is not
pushed down, the situation of the routine goes to the following repeat end judgment
(S141). The repeat operation is continued until the output pointer goes back to the
output pointer position where the through mode was changed to the repeat operation
by the pushing-down of the repeat switch 105. That is, in the case where this judgment
shows that the repeat mode is used currently and that the position of the output pointer
does not yet go back to the output pointer position where the repeat was started,
a processing loop is formed so that the situation of the routine goes back to the
aforementioned one-frame waveform expansion/reduction process. The subsequent process
is a process for catching-up reproduction.
[0112] After the repeat reproduction is terminated or after the slow reproduction is terminated,
the situation of the routine goes to the catching-up reproduction. The catching-up
reproduction means an operation in which a time lag from the real time as caused by
the repeat or slow reproduction is made up for by fast reproduction realized by the
repetition of the one-frame waveform reduction process. In the process in this portion,
the setting of parameter is performed for the waveform reduction process for the catching-up
reproduction (S142).
[0113] The quantity of the lag from the real time increases when the repeat button is pushed
down or when the waveform expansion process is carried out. On the contrary, it decreases
when the waveform reduction process is carried out.
[0114] The process for increasing/decreasing this quantity of the lag is however not shown
in Figs. 18 and 19 which show the procedure (flow chart) of the one-frame waveform
expansion/reduction process which will be described later.
[0115] A judgment is made as to whether the quantity of the lag from the real time is present
or absent (S143). In the case where the quantity of the time lag is present yet, a
processing loop is formed so that the catching-up reproduction is continued. That
is, the operation in which the catching-up reproduction is continued until the quantity
of the time lag becomes zero, is realized by this judgment.
[0116] On the other hand, in the main process described above, the time lag from the real
time as caused by the speech speed conversion or repeat operation is managed as "lag
quantity" by using a counter.
[0117] Although the time lag from the real time can also be managed as the difference between
the position on the ring buffer 24 where the current sampled data is inputted and
the position on the ring buffer 24 from which the current output data is read,
that is, as the difference between the addresses pointed to by the two pointers, the management
method using the lag quantity counter as described above is employed in the present invention.
This is because the quantity of the lag may not be expressed correctly by
the address difference between the input and output pointers on the ring buffer 24.
[0118] For example, assuming that the memory address space allocated to the ring buffer
24 is from address 0 to address 1000, then the ring buffer 24 is realized by the handling
of the memory address space in a manner of "next to address 1000, jump to address
0" in the program. Therefore, in the case where the input and output pointers lie
across this break between addresses, the quantity of data therebetween cannot be expressed
easily by taking the difference between address values simply. In order to know the
quantity of data between these pointers by address calculation, address value calculation
including complex classification that takes into account the histories of the two
pointers up to their current positions is required.
[0119] In the speech speed conversion apparatus according to the present invention, whenever
reading/ writing of data is performed with respect to the ring buffer 24, the value
of the lag quantity counter is changed to thereby manage the quantity of the time
lag to prevent the increase of the quantity of processing based on the complex address
calculation.
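The management of the lag quantity by a counter may be illustrated by the following sketch, which uses the address space of 0 to 1000 given as an example above. The class name LagTrackingRing and its members are hypothetical, the lag is counted in samples for simplicity, and the overwriting of unread data is not guarded against.

```python
class LagTrackingRing:
    """Ring buffer whose lag is kept in a counter, not derived from addresses."""

    def __init__(self, size):
        self.buf = [0] * size
        self.size = size
        self.in_ptr = 0     # input pointer
        self.out_ptr = 0    # output pointer
        self.lag = 0        # lag quantity counter: data written but not yet read

    def write(self, value):
        self.buf[self.in_ptr] = value
        self.in_ptr = (self.in_ptr + 1) % self.size   # next to the last address, jump to 0
        self.lag += 1

    def read(self):
        if self.lag == 0:
            raise IndexError("no unread data")
        value = self.buf[self.out_ptr]
        self.out_ptr = (self.out_ptr + 1) % self.size
        self.lag -= 1
        return value


ring = LagTrackingRing(1001)          # addresses 0 to 1000
for i in range(900):
    ring.write(i)
for _ in range(900):
    ring.read()
for i in range(200):
    ring.write(i)                     # the input pointer wraps past address 1000
print(ring.in_ptr - ring.out_ptr, ring.lag)
```

After the input pointer has passed the break between addresses, the simple address difference is negative, whereas the counter still gives the quantity of data between the two pointers.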
[0120] The aforementioned main process is provided in the form of an infinite loop in which
the aforementioned process is repeated until the electric source switch is turned
off (S144).
[0121] In the case where the electric source switch is turned off, the process is not suddenly
stopped but continued for a predetermined time before it is stopped (mute) (S145).
During this time, there is carried out such a process that the loudness of the output
voice is reduced gradually.
[0122] Specifically, in an interruption process in the same manner as in the fade-in operation
which is the first step, whenever one data from the output buffers 26 is transferred
to the D/A converter 6 shown in Fig. 1, the value of this data is multiplied by a
coefficient so that this function is realized by changing the value of this coefficient
with the passage of time. This operation is carried out by the controller 23E shown
in Fig. 11.
[0123] Fig. 15 is a state transition view typically showing transition between respective
modes in this embodiment as described above. The way of mode switching on the basis
of the switching operation will be understood well from Fig. 15. Further, the standby
mode in Fig. 15 will be described later in detail.
[0124] The details of processing operations in the respective routines in this embodiment
described above will be described below in detail.
[0125] Fig. 16 is a flow chart showing the procedure of the reading pointer return routine.
[0126] The reading pointer return routine in this embodiment is a specific method for changing
the value of the output pointer pointing the position of data to be read from the
ring buffer 24, which method is necessary for realizing a repeat function.
[0127] As shown in Fig. 16, first, the position of the output pointer at the current point
of time is set to Pout (S161). Then, the quantity of the lag from the real time at
the current point of time is set to D (S162).
[0128] A judgment is made as to whether the quantity of the lag is already large at the
current point of time so that the quantity of the lag will exceed the size of the
ring buffer memory 24 if the quantity of the lag is further increased by 5 seconds
(B5) (S163). In the case where a decision is made as a result of the judgment so that
the quantity of the lag exceeds the size of the ring buffer memory 24 (the case of
Yes in S163), this routine is terminated without any change of Pout and D (S169 and
S170).
[0129] In the case where the quantity of the lag can be increased by five seconds (B5) (the
case of No in S163), the pointer is returned back by five seconds (-B5) and the quantity
of the lag is increased by five seconds (+B5) (S164).
[0130] Then, a process of searching back for the silent part is started so that the start
of the repeat is made a pause of the speech. First, data is accessed backward from
the position pointed by Pout on the ring buffer memory 24 to thereby calculate one-frame
power (S165).
[0131] At this time, if the output pointer is returned back (-F) by one frame (F), the quantity
of the lag is also further increased by one frame (+F). Here, a judgment is made as
to whether or not the total quantity of the lag exceeds the size of the ring buffer
memory if the quantity of the lag is further increased by one frame (S166), and in
the case where a decision is made as a result of the judgment so that the total quantity
of the lag exceeds the size of the ring buffer memory (the case of Yes in S166), this
search for the silent part is stopped and Pout and D at this time are set as the output
pointer value and the lag quantity respectively (S169 and S170) whereafter this routine
is terminated.
[0132] In the case where the total quantity of the lag does not exceed the size of the ring
buffer memory (the case of No in S166) though the output pointer is returned back
by one frame, Pout is returned back by the length of one frame and the quantity D
of the lag is increased by one frame (S167) whereafter the calculated one-frame power
W and the voice-part/silent-part threshold are compared with each other (S168). In
the case where the one-frame power W is smaller than this threshold, a decision is
made that a pause of the speech is present near this frame (the case of No in S168)
and Pout and D at this time are set as the output pointer value and the lag quantity
respectively (S169 and S170) whereafter this routine is terminated.
[0133] In the case where the one-frame power W is larger than this threshold (the case of
Yes in S168), the pointer is further returned back by one frame and the search for
the silent part is continued to detect the silent part in the same manner as described
above but the search is continued until the quantity of the lag exceeds the size of
the ring buffer memory. Thus, the output pointer return process at the time of the
pushing of the repeat switch 105 is completed.
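The reading pointer return routine described with reference to Fig. 16 may be sketched as follows. This is one reading of the described steps and not a definitive implementation: the function name, the parameter names, and the callable frame_power (which is assumed to return the power of the one frame starting at a given position) are hypothetical, and all quantities are expressed in samples.

```python
def reading_pointer_return(pout, lag, ring_size, b5, frame, frame_power, threshold):
    """Return the new (Pout, D) after the repeat switch is pushed."""
    # S163: would a further five seconds of lag exceed the ring buffer size?
    if lag + b5 > ring_size:
        return pout, lag                      # terminate without any change
    # S164: return the pointer by five seconds and increase the lag accordingly.
    pout = (pout - b5) % ring_size
    lag += b5
    while True:
        # S165: power W of the one frame just before the current Pout.
        w = frame_power((pout - frame) % ring_size)
        # S166: stop if one more frame of lag would exceed the ring buffer size.
        if lag + frame > ring_size:
            break
        # S167: return Pout by one frame and increase D by one frame.
        pout = (pout - frame) % ring_size
        lag += frame
        # S168: a frame below the threshold is taken as a pause of the speech.
        if w < threshold:
            break
    return pout, lag                          # S169 and S170


def toy_power(start):
    # Hypothetical data: voice at and after position 4300, silence before it.
    return 1000 if start >= 4300 else 0


print(reading_pointer_return(5000, 0, 10000, 500, 100, toy_power, 10))
```

With the toy values used here (a five-second length of 500 and a frame length of 100), the pointer is first returned from 5000 to 4500 and then frame by frame to 4200, where the silent frame is found.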
[0134] Figs. 17 and 18 are flow charts showing the procedure of the one-frame waveform expansion/reduction
process in this embodiment.
[0135] In the one-frame waveform expansion/reduction process in this embodiment, as shown
in Figs. 17 and 18, first of all, power of current one-frame data is calculated (S171).
Then, this power value P is compared with the threshold Th (S172). A frame having
higher power than the threshold Th is subjected to the following process. Data in
a frame having lower power than the threshold Th may be outputted intact to be transferred
to the ring buffer 24 (S173) or may be subjected to a consonant emphasis process and
then transferred to the output buffers 26. Whether consonant emphasis is to be performed
or not to be performed is determined by the state of the mode switch which is one
of hidden switches.
[0136] As a specific method for realizing the consonant emphasis process, there is, for
example, considered a method in which a frame having lower power than the threshold
Th just prior to a frame having higher power than the threshold Th is regarded as
a consonant and the values of data in the frame are increased.
[0137] In the case of a frame having higher power than the threshold Th in the aforementioned
power judgment (the case of Yes in S172), first the number of data in one frame is
stored in a variable Z indicating the quantity of not-yet-processed data (S174) and
then the pitch extraction process is carried out from the leading of the frame (S175).
Several methods are considered as the pitch extraction process. For example, the pitch
length at the leading of the frame is extracted on the basis of a well-known algorithm
using autocorrelation.
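A pitch extraction using autocorrelation may be sketched as follows. The description refers only to "a well-known algorithm using autocorrelation", so the particular form below (selecting the lag that maximizes the mean product of the signal with its shifted copy) is an assumption, as are the function name estimate_pitch_length and the search range parameters min_lag and max_lag.

```python
import math


def estimate_pitch_length(samples, min_lag, max_lag):
    """Return the lag (in samples) with the highest mean autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(samples) - lag
        if n <= 0:
            break                          # not enough data for this lag
        # Mean product of the signal and its copy shifted by lag samples.
        score = sum(samples[i] * samples[i + lag] for i in range(n)) / n
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Test signal: a sine wave with a period of 40 samples.
wave = [math.sin(2 * math.pi * i / 40) for i in range(400)]
print(estimate_pitch_length(wave, min_lag=20, max_lag=60))
```

The search range would in practice be chosen to cover the range of human voice pitch at the sampling rate in use.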
[0138] Then, the quantity of data corresponding to twice the thus extracted pitch length
is compared with the quantity of not-yet-processed data (S176), and in the case where
the quantity Z of not-yet-processed data is smaller than the quantity of data twice as
much as the extracted pitch, this process is stopped.
[0139] In the case where the quantity Z of not-yet-processed data is equal to or more than
the quantity of data twice as much as the extracted pitch (the case of Yes in S176),
a pre-transfer process is carried out (S178). The pre-transfer process means a process
in which a part of input data is transferred intact to the output buffer 26 before
a reproduced wave pattern insertion process which will be described later. The pre-transfer
process corresponds to the portion of (b) in Fig. 2. The number of data to be transferred
by the pre-transfer process is set with the pitch as a unit but the number thereof
varies in accordance with the wave-form expansion/reduction rate. The number Npf is
set (S177) by a parameter setting routine which will be described later with reference
to Fig. 19. After the pre-transfer process is carried out (S178), the quantity Z of
not-yet-processed data is reduced by the number of transferred data (S179).
[0140] Then, the position of application of a Δ window function for generating a reproduced
wave pattern is determined (S180) in accordance with another parameter Ptri set in
the parameter setting routine shown in Fig. 19. What differs between expansion and
reduction is only the position on current wave to which the window function is applied
in the case where a reproduced wave pattern is generated by using the Δ window function.
[0141] That is, in the case of waveform expansion, as shown in Fig. 2, the Δ window function
is applied so that waveform of the length of two pitches is generated from waveform
of the length of one pitch (S181). Contrariwise in the case of waveform reduction,
as shown in Figs. 20 to 22, the Δ window function is applied so that waveform of the
length of two pitches is generated from waveform of the length of three or four pitches.
The quantity of the lag from the real time is changed by the insertion of the reproduced
wave pattern (though not shown).
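The weighted summation with the Δ window can be pictured as a cross-fade with complementary linear weights. The Python sketch below is one plausible reading and is not taken from the figures: `triangular_crossfade` and `reduce_three_to_two` are hypothetical names, and the exact window position, which the embodiment controls through the parameter Ptri, is not modeled. The reduction example overlaps the first two pitch periods with the last two of a three-period segment, so that a waveform of two pitch lengths results from three.

```python
import math


def triangular_crossfade(fade_out, fade_in):
    """Blend two equal-length segments with complementary linear weights."""
    if len(fade_out) != len(fade_in):
        raise ValueError("segments must have the same length")
    n = len(fade_out)
    out = []
    for i in range(n):
        w = i / n  # weight of fade_in rises linearly from 0 toward 1
        out.append((1.0 - w) * fade_out[i] + w * fade_in[i])
    return out


def reduce_three_to_two(samples, pitch):
    """Produce two pitch lengths of output from three pitch lengths of input.

    Assumption of this sketch: periods 1-2 fade out while periods 2-3 fade in.
    """
    if len(samples) < 3 * pitch:
        raise ValueError("need at least three pitch periods")
    first = samples[0:2 * pitch]
    second = samples[pitch:3 * pitch]
    return triangular_crossfade(first, second)


# Usage: for a perfectly periodic input the output keeps the same shape.
p = 40
x = [math.sin(2 * math.pi * i / p) for i in range(3 * p)]
y = reduce_three_to_two(x, p)
```

Because the two weights always sum to one, a strictly periodic input passes through unchanged in shape, which is why an accurate pitch length keeps the distortion low.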
[0142] After the reproduced wave pattern insertion process, the quantity Z of not-yet-processed
data is reduced by the number of the thus processed data (S182).
[0143] Then, the pitch extraction process is carried out again (S183). This is a process
which is adapted to the fact that the human voice pitch always varies and in which
the error between the actual pitch length and the pitch length for processing is reduced
by extracting the pitch again to thereby consequently prevent the increase of distortion
in waveform after expansion/reduction.
[0144] Then, as shown in Fig. 18, the quantity of data twice as much as the newly extracted
pitch is compared with the number of not-yet-processed data (S184). If the quantity
of data of the length of two pitches does not remain (the case of No in S184), this
process is stopped immediately.
[0145] If the quantity of data larger than the length of two pitches remains (the case of
Yes in S184), a post-transfer process is carried out. The post-transfer process means
a process similar to the pre-transfer process and corresponds to the portion of (e)
in Fig. 2 in the previous embodiment. The number of data to be transferred by the
post-transfer process is set with the pitch as a unit but the number thereof varies
in accordance with the waveform expansion/reduction rate. The number Npr is set (S185)
by the parameter setting routine which will be described later with reference to Fig.
19. After the post-transfer process (S186), the quantity Z of not-yet-processed data
is reduced by the number of transferred data (S187).
[0146] The aforementioned procedure is continuously repeated until this procedure is stopped
on the basis of comparison between the quantity of data of the length of two pitches
and the quantity of not-yet-processed data which comparison is performed twice in
the middle of this procedure.
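The bookkeeping of the quantity Z in the one-frame routine can be sketched as follows. This is a simplified Python model under stated assumptions: `one_frame_routine` is a hypothetical name, `extract_pitch` stands in for the pitch extraction process, `npf` and `npr` are the pre-transfer and post-transfer counts in pitch units, and `inserted` is the number of input pitch periods consumed by the insertion step. It is also assumed that the loop returns to pitch extraction after the post-transfer process. No waveform data is actually moved.

```python
def one_frame_routine(z, extract_pitch, npf=1, npr=1, inserted=1):
    """Trace the control flow of the one-frame routine on the counter z.

    Returns the remaining quantity z and the list of executed steps.
    Assumes npf + inserted <= 2 so that z never becomes negative.
    """
    steps = []
    while True:
        pitch = extract_pitch()
        if z < 2 * pitch:          # first comparison (S176): stop
            break
        z -= npf * pitch           # pre-transfer (S178, S179)
        steps.append("pre-transfer")
        z -= inserted * pitch      # reproduced wave pattern insertion (S181, S182)
        steps.append("insertion")
        pitch = extract_pitch()    # pitch extracted again (S183)
        if z < 2 * pitch:          # second comparison (S184): stop
            break
        z -= npr * pitch           # post-transfer (S186, S187)
        steps.append("post-transfer")
    return z, steps


# Usage: 1000 unprocessed samples and a constant pitch of 100 samples.
remaining, trace = one_frame_routine(1000, lambda: 100)
```

The two early exits correspond to the two comparisons against the length of two pitches; any data left over would be carried to the next frame.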
[0147] Fig. 19 is a flow chart showing the procedure of the parameter setting routine for
setting parameter for the expansion process in this embodiment.
[0148] In practice, the parameter setting routine shown in Fig. 19 is used twice in the
main process shown in Figs. 13 and 14: once just before the aforementioned
one-frame waveform expansion/reduction routine, and once in the "process
for setting parameter for the reduction process" after the repeat end judgment.
[0149] Hereupon, the waveform reduction process is a process for realizing the "catching-up
process (fast hearing process)" which is continued after slow hearing or after repeating.
When the generation of a reproduced wave pattern with use of the Δ window function
as carried out in the waveform expansion process is carried out while the position
subjected to the window function is shifted in a direction reverse to the case of
expansion, waveform reduction is obtained.
[0150] In Fig. 19, a discrimination is first made between expansion and reduction (S191).
This distinguishes which of the two aforementioned uses of the routine is in effect.
[0151] In the case of parameter setting for the expansion process, after this discrimination,
the position of the speech speed selection switch is checked (S192), the expansion
rate e is set in accordance with the position of the switch (S193), the values of parameters
Npf and Npr used in the waveform expansion process are set in accordance with the
expansion rate e, and parameter Ptri indicating the position of the start of weighted
summation with respect to the Δ window as carried out in the waveform expansion process
is set, whereafter this routine is terminated.
[0152] On the other hand, in the case of parameter setting for the reduction process, the
right flow in Fig. 19 is carried out. First, the position of the catching-up mode
switch (which is one of the hidden switches) is checked (S196), and it is then checked which
of "jump", "fast hearing" and "one-fold" the catching-up mode (Mcat) is set to (S197 and S198).
[0153] When set to "jump", the catching-up mode (Mcat) serves in practice not to "catch up"
but to jump at the very moment that the slow switch (slow pushbutton)
is released (S199). Specifically, a branching process for forcibly returning
to the through mode is carried out in this portion.
[0154] When the catching-up mode (Mcat) switch is set to "one-fold", the reduction rate
s is set to one-fold (S200) and the routine proceeds to step S202.
[0155] When the catching-up mode (Mcat) switch is not set to "one-fold", the reduction rate
s is set through the center flow in Fig. 19 at the time of the catching-up mode (S201),
the values of parameters Npf and Npr used in the waveform reduction process are set
in accordance with the reduction rate s (S202) and, further, parameter Ptri indicating
the position from which weighted summation
with respect to the Δ window as carried out in the waveform reduction process is started
is set (S203), whereafter this routine is terminated.
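The branching of the parameter setting routine can be summarized in a short Python sketch. All numerical values are placeholders: the rate table `EXPANSION_RATES`, the constant `DEFAULT_REDUCTION_RATE` and the helper `derive_transfer_parameters` are hypothetical, since the embodiment does not state how Npf, Npr and Ptri are computed from the rate.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical tables; the actual values are not given in the description.
EXPANSION_RATES = {0: 1.25, 1: 1.5, 2: 2.0}
DEFAULT_REDUCTION_RATE = 0.5


@dataclass
class Params:
    rate: float
    npf: int
    npr: int
    ptri: int


def derive_transfer_parameters(rate):
    """Placeholder for the mapping from rate to (Npf, Npr, Ptri)."""
    return 1, 1, 0


def set_parameters(mode, speed_switch=0, mcat="fast hearing") -> Optional[Params]:
    """Follow the branches of Fig. 19 as described in the text.

    Returns None when the "jump" setting forces a return to the through mode.
    """
    if mode == "expansion":                 # S191
        rate = EXPANSION_RATES[speed_switch]  # S192, S193
    elif mode == "reduction":
        if mcat == "jump":                  # S197 -> S199
            return None
        if mcat == "one-fold":              # S198 -> S200
            rate = 1.0
        else:                               # "fast hearing": S201
            rate = DEFAULT_REDUCTION_RATE
    else:
        raise ValueError("mode must be 'expansion' or 'reduction'")
    npf, npr, ptri = derive_transfer_parameters(rate)  # S202, S203
    return Params(rate, npf, npr, ptri)
```

Keeping the rate selection separate from the derivation of Npf, Npr and Ptri mirrors the flow chart, where both the expansion and the reduction branches end in the same kind of parameter assignment.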
[0156] Figs. 23 and 24 are flow charts showing the procedure of the total operation of a
speech speed conversion apparatus provided with a continuous speech speed conversion
means according to the present invention.
[0157] As shown in this embodiment of Figs. 23 and 24, continuous speech speed conversion
in the speech speed conversion apparatus provided with the continuous speech speed
conversion means is substantially an operation in which the pushing of the slow switch
(slow pushbutton) 104 is continued so that slow reproduction is continued. The time
lag is however accumulated rapidly when waveform expansion at a constant waveform
expansion rate is continued, so that the quantity of the lag from the real time finally
exceeds the capacity of the ring buffer 24 to make it impossible to continue slow
hearing any more.
[0158] The continuous speech speed conversion means is therefore provided to mix a waveform
expansion period and a waveform reduction period reverse thereto at the time of slow
reproduction so that the lag from the real time is not increased rapidly.
[0159] Although several methods are conceivable as means for changeover to the continuous
speech speed conversion mode, operation is easier to understand if a clear distinction
is made between the case where the slow switch is simply held down
for a considerably long time and the case where entry into the continuous speech speed
conversion mode is intended. Accordingly, this changeover is realized, for example,
by using switching parts by which the slow switch is locked when double-clicked (pushed
twice at a short time interval) or when slid laterally while pushed.
[0160] The respective steps in the flow charts shown in Figs. 23 and 24 in this embodiment
are quite the same as in the procedure of the main process described above with reference
to Figs. 13 and 14.
[0161] In the continuous speech speed conversion means in this embodiment, whether the continuous
speech speed conversion process is intended or not is checked in step S231 in Figs.
23 and 24. If the continuous speech speed conversion process is intended (the
case of Yes in S231), the one-frame waveform expansion/reduction process is carried
out (S232). Then, a judgment is made as to whether the reset switch 106 is pushed
(turned on) or not (S233). In the case where the reset switch 106 is not pushed (turned
off), counting up by one frame is performed (S234) and a judgment is made as to whether
the expansion period is intended or not (S235). If the expansion period is intended
(the case of Yes in S235), the routine goes back to step S232.
If the expansion period is not intended (the case of No in S235), parameters are set
for the reduction process (S236). Then, whether the lag quantity is zero or not is
checked (S237). In the case where the lag quantity is not zero (the case of No in S237),
the routine goes back to step S232 so that the reduction process is continued. In the case
where the lag quantity is zero (the case of Yes in S237), parameters are set for the expansion
process (S238) and the frame counter is reset (S239), whereafter the
routine goes back to step S232 so that the continuous speech speed conversion
operation is repeated. In the case where the continuous speech speed conversion process
is not intended in the aforementioned step S231 (the case of No in S231), the mode
is shifted to the aforementioned main process routine (through mode).
[0162] That is, the continuous speech speed conversion means in this embodiment is a method
in which slow reproduction and catching-up reproduction are repeated alternately at
intervals of a preliminarily set time. According to this method, catching up to the
real time at intervals of a predetermined time is always made possible. The management
of the changeover between waveform expansion and waveform reduction is performed on
the basis of the count of the number of frames. For example, when the expansion process
for a number of frames corresponding to about five seconds is completed, the reduction
process is then carried out repeatedly, and when the lag quantity reaches zero, the
frame count is returned to zero and the expansion process is repeated again.
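The alternation described above can be simulated with a simple lag model. The Python sketch below assumes that processing one input frame at rate r changes the lag by (r - 1) times the frame length; the function name, the frame length of 20 ms and the rates 1.5 and 0.5 are illustrative assumptions, not values from the embodiment.

```python
def simulate_continuous_mode(total_frames, frame_ms=20.0, expand_frames=250,
                             expansion_rate=1.5, reduction_rate=0.5):
    """Return the lag (in ms) after each processed frame.

    Expansion runs for expand_frames frames, then reduction runs until
    the lag reaches zero, after which the frame count is reset.
    """
    lag = 0.0
    count = 0
    expanding = True
    history = []
    for _ in range(total_frames):
        if expanding:
            lag += frame_ms * (expansion_rate - 1.0)
            count += 1
            if count >= expand_frames:   # expansion period is over
                expanding = False
        else:
            lag -= frame_ms * (1.0 - reduction_rate)
            if lag <= 0.0:               # caught up to the real time
                lag = 0.0
                count = 0
                expanding = True
        history.append(lag)
    return history


# Usage: 250 frames of 20 ms correspond to about five seconds of expansion.
lags = simulate_continuous_mode(2000)
```

In this model the lag is bounded by the amount accumulated in one expansion period, which is the property that keeps the ring buffer from overflowing.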
[0163] Further, escape out of the continuous speech speed conversion mode is achieved by
the pushing-down of the reset switch 106 to return the mode to the through mode.
[0164] Referring next to Figs. 25 and 26, there are flow charts showing the procedure of
the total operation of a speech speed conversion apparatus provided with a continuous
speech speed conversion means different from that in the embodiment shown in Figs.
23 and 24.
[0165] The continuous speech speed conversion in the speech speed conversion apparatus provided
with the continuous speech speed conversion means in this embodiment is an operation
for applying waveform expansion to a frame of high power and applying waveform reduction
to a frame of low power.
[0166] In the continuous speech speed conversion means in this embodiment, whether the continuous
speech speed conversion process is intended or not is checked in step S251 in Figs.
25 and 26. If the continuous speech speed conversion process is intended (the case
of Yes in S251), a judgment is made as to whether the reset switch 106 is pushed (turned
on) or not (S252). In the case where the reset switch 106 is not pushed (turned off),
one-frame power is calculated (S253). Then, whether the calculated one-frame power
is higher than the threshold Th or not is checked (S254). In the case where the calculated
one-frame power is lower than the threshold Th (the case of No in S254), parameter
is set for the reduction process (S256) and the routine proceeds to step
S257. In the case where the calculated one-frame power is higher than the threshold
Th (the case of Yes in S254), parameter is set for the expansion process (S255) and
the one-frame waveform expansion/reduction process is carried out (S257) whereafter
the situation of the routine goes back to the step S252 so that the continuous speech
speed conversion operation is repeated. In the case where the continuous speech speed
conversion process is not intended in the aforementioned step S251 (the case of No
in S251), the mode goes to the aforementioned main process routine (through mode).
[0167] That is, when entry into the continuous speech speed conversion mode is made, one-frame
power is calculated so that either expansion or reduction is applied to each frame
on the basis of comparison between the one-frame power and the threshold Th. Escape
out of the continuous speech speed conversion mode is achieved by the pushing-down
of the reset switch 106.
[0168] In this embodiment, the speech is made slow or fast in accordance with the power
thereof.
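A minimal Python sketch of the per-frame decision follows. The definition of one-frame power as the mean of squared samples and the treatment of a power exactly equal to the threshold Th as reduction are assumptions of this example; the function names are hypothetical.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame (an assumed definition of power)."""
    if not frame:
        raise ValueError("frame must not be empty")
    return sum(x * x for x in frame) / len(frame)


def choose_process(frame, threshold):
    """Return 'expansion' for a high-power frame, otherwise 'reduction' (S254)."""
    return "expansion" if frame_power(frame) > threshold else "reduction"


# Usage with two synthetic frames and an illustrative threshold.
loud = [0.5] * 160
quiet = [0.01] * 160
decisions = [choose_process(f, threshold=0.01) for f in (loud, quiet)]
```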
[0169] In ordinary conversation, there is generally a tendency that an important portion
which must be told to a listener is louder-voiced but a portion not so important is
smaller-voiced. Accordingly, the speech speed control in this embodiment is characterized
in that an output voice nearer to the natural voice is obtained.
[0170] The probability of appearance of the high-power portion and the probability of appearance
of the low-power portion are however not always equal to each other, so that catching
up to the real time at intervals of a predetermined time as in the case of the previous
embodiment of Figs. 23 and 24 is not always ensured.
[0171] Further, as a method of instruction from the user to attain entry into the continuous
speech speed conversion mode, there are considered a method in which the slow switch
(slow pushbutton) 104 is pushed and then slid laterally to thereby be locked, a method
in which the slow switch (slow pushbutton) 104 is double-clicked (pushed down twice
in succession at a short time interval), and so on. If these methods are used, the
respective intentions of "executing slow reproduction" and of "continuing" the operation
by the pushing of the slow switch (slow pushbutton) 104 can be expressed in difference
in the way of pushing of the same pushbutton so that there can be provided an operating
system which is more intuitive and easier to understand compared with the case where
a continuous speech speed conversion pushbutton is provided separately.
[0172] The embodiments up to now have been described above upon the assumption that the
waveform "expansion rate" in the case of "slow" reproduction based on the waveform
expansion process is determined by the setting of the "speech speed setting switch"
provided on the apparatus and that a "default value" determined (in the program) in
advance is used as the waveform "reduction rate" in the case of "fast" reproduction
based on the waveform reduction process.
[0173] The function of "making it possible to move freely back and forth on the time axis
of the speech" provided by this apparatus, however, can be used by the user more intuitively when
an "accelerator type switch" shown in Fig. 27 is provided so that the waveform expansion/reduction
rate is changed by this switch.
[0174] When the accelerator type switch is set to the center, the through mode in the aforementioned
embodiments is executed. When the slide switch is pulled to the front, the waveform
expansion process is employed so that "slow reproduction" is executed with a lag from
the real time. When the slide switch is then pushed to the back, the waveform reduction
process is employed reversely so that fast reproduction is executed (until the lag
from the real time reaches zero).
[0175] During this operation, the controller changes the waveform expansion/reduction rate in
accordance with the distance of the slide switch from the center. As is obvious from the explanation
of the aforementioned embodiment of Figs. 20 through 22, however, the expansion/reduction rate
can take only a limited number of values that are expressible as integer
ratios. In practice, therefore, the expansion/reduction rate is preferably set
so that one of several stages of values is selected in accordance with the distance of
the slide switch from the center.
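The staged selection can be sketched in Python as follows. The normalized position range of -1 to +1, the dead zone around the center and the expansion stages are assumptions of this example; the reduction stages 2/3 and 1/2 correspond to generating two pitch lengths from three or four, as described earlier.

```python
from fractions import Fraction

# Illustrative stage tables; positive positions mean "pulled to the front".
EXPANSION_STAGES = (Fraction(3, 2), Fraction(2, 1))
REDUCTION_STAGES = (Fraction(2, 3), Fraction(1, 2))
DEAD_ZONE = 0.1  # hypothetical width of the center (through mode) region


def rate_for_position(position):
    """Map a slide switch position in [-1, 1] to a discrete rate."""
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must lie in [-1, 1]")
    distance = abs(position)
    if distance <= DEAD_ZONE:
        return Fraction(1)  # center: through mode
    stages = EXPANSION_STAGES if position > 0 else REDUCTION_STAGES
    # Normalize the travel outside the dead zone to the range 0..1.
    span = (distance - DEAD_ZONE) / (1.0 - DEAD_ZONE)
    index = min(int(span * len(stages)), len(stages) - 1)
    return stages[index]
```

Using exact fractions makes it explicit that only rates expressible as integer ratios are selectable.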
[0176] Further, when the lever of this accelerator type switch is arranged so that a force
automatically returns it to the center when the user releases it,
it becomes easy for the user to hold the slide switch at an intermediate position
other than the center, so that an operating method easier to handle can be realized. Hereupon,
the force that returns this lever to the center can be produced by
a mechanical means such as two springs which are provided in the inside of the switching
device and which pull the lever with equal force from the opposite
sides, and so on.
[0177] Referring next to Fig. 28, there is a block diagram showing the functional structure
of a speech speed conversion apparatus provided with an AV control means. Referring
to Fig. 29, there is a view for explaining the operation of the AV control means in
this embodiment of Fig. 28. Referring to Figs. 30 and 31, there are flow charts showing
the operating procedure of the main process in the speech speed conversion apparatus
provided with the AV control means in this embodiment.
[0178] As shown in Fig. 28, the speech speed conversion apparatus provided with the AV control
means in this embodiment is provided as a functional structure in which an AV controller
28 is added to the functional structure of the speech speed conversion apparatus in
the aforementioned embodiment shown in Fig. 11 and connected to the controller 23E.
[0179] The aforementioned controller 23E judges whether a condition for outputting an AV
control signal is satisfied or not and operates the AV controller 28 to start/stop
the outputting of the AV control signal.
[0180] As shown in Fig. 29, the AV control means is software in which the AV control signal
is outputted when the quantity of the lag from the real time as caused by slow or
repeat reproduction exceeds a predetermined value (30 seconds in Fig. 29) and in which
the outputting of the same signal is stopped when the lag quantity then reaches zero
via catching-up reproduction.
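The start and stop conditions form a simple hysteresis, which the following Python sketch illustrates. The class name `AVControl` and its interface are hypothetical; only the 30-second start condition and the zero-lag stop condition come from the description.

```python
class AVControl:
    """Track whether the AV control signal is being output."""

    def __init__(self, start_threshold_sec=30.0):
        self.start_threshold_sec = start_threshold_sec
        self.active = False

    def update(self, lag_sec):
        """Update the signal state from the current lag and return it."""
        if not self.active and lag_sec > self.start_threshold_sec:
            self.active = True    # lag exceeded the predetermined value
        elif self.active and lag_sec <= 0.0:
            self.active = False   # catching-up reproduction reached zero lag
        return self.active


# Usage: the signal stays on while the lag falls from above 30 s to zero.
ctrl = AVControl()
states = [ctrl.update(lag) for lag in (0, 10, 31, 20, 5, 0, 10)]
```

Because the signal is not released at the same level at which it is asserted, the external recording/reproducing apparatus is not toggled repeatedly while the lag hovers near 30 seconds.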
[0181] The AV control signal is picked out of this apparatus and used for temporarily stopping
the reproducing operation of a recording/reproducing apparatus such as a tape recorder,
a video tape recorder, or the like. By this means, it is made possible to continue
the slow hearing of an input voice which is continued for such a long time that exceeds
the capacity of the ring buffer 24 in this apparatus.
[0182] In Figs. 30 and 31, the portion surrounded by the broken line is a step showing the
operating procedure of the AV control means added to the flow charts in Figs. 12 and
13. In this step, a judgment is made as to whether the condition for outputting the
AV control signal is satisfied or not (S301). The judgment with respect to the outputting
of the AV control signal is realized by a judgment as to whether the quantity of the
lag from the real time in a loop in which the one-frame waveform expansion/reduction
process is repeated for slow or repeat reproduction is over 30 seconds or not (S301)
and by starting the outputting of the AV control signal when the lag from the real
time is over 30 seconds (S302).
[0183] On the other hand, the process of stopping the AV control signal is carried out just
after the judgment "lag quantity=0?" which is a judgment for escape out of the loop
of the catching-up reproduction process (S303).
[0184] Referring next to Figs. 32A, 32B and 32C, there are views for explaining the arrangement
of a microphone in a speech speed conversion apparatus according to the present invention.
The reference numeral 101 designates a body of the speech speed conversion apparatus;
321, a microphone; 322, a prop capable of expansion and contraction for supporting
the microphone 321; 323, a flexible prop for supporting the microphone 321; and 324,
an electric cord for electrically connecting the microphone 321 to the speech speed
conversion apparatus body 101 by wire.
[0185] Fig. 33 is a view showing a modified example of this embodiment of Fig. 32, in which
the reference numeral 101 designates a body of the speech speed conversion apparatus;
104, a slow switch; 105, a repeat switch; 106, a reset switch; 321, a microphone; 324,
an electric cord for electrically connecting the microphone 321 to the speech speed
conversion apparatus body 101; 325, an earphone; and 300, a connection member.
[0186] In the arrangement of the microphone in the speech speed conversion apparatus of
this embodiment, as shown in Fig. 32A, the microphone 321 is supported by the prop
322 capable of expansion and contraction. Because the aforementioned supporting of
the microphone 321 makes the microphone 321 far away from the speech speed conversion
apparatus body 101, the rustle of clothes can be prevented from being produced when
the apparatus body is put into a breast pocket in use.
[0187] Alternatively, as shown in Fig. 32B, the microphone 321 is supported by the flexible
prop 323. Being supported by such a manner, the microphone 321 is separated from the
speech speed conversion apparatus body 101, and can be bent in a desired direction.
Accordingly, the rustle of clothes can be prevented from being produced when the apparatus
body is put into a breast pocket in use.
[0188] Further, as shown in Fig. 32C, the microphone 321 and the speech speed conversion
apparatus body 101 are electrically connected to each other by wire (or wirelessly).
Because of this connection, the microphone 321 can be disposed near the listener independently
of the speech speed conversion apparatus body 101, so that the S/N ratio can be improved.
[0189] Further, as shown in Fig. 33, the speech speed conversion apparatus body 101 and
the microphone 321 are electrically connected to each other by the electric cord through
the earphone 325 and the connection member 300. Further, operation switches such as
the slow switch 104, the repeat switch 105, the reset switch 106, and so on, are provided
on the aforementioned connection member 300. In this manner, not only the rustle of
clothes can be prevented from being produced when the apparatus body is put into a
breast pocket in use but also both the S/N ratio and the handling property can be
improved.
[0190] Referring next to Fig. 34, there is a view for explaining a lag time display means
in a speech speed conversion apparatus according to a further embodiment of the present
invention. The reference numeral 341 designates a display portion; and 342, a display
screen.
[0191] As shown in Fig. 34, the lag time display means in this embodiment displays how much
the speech of a speaker is delayed from the real speech speed at the time of the aforementioned
slow/repeat reproduction. For example, assuming that one human image represents the
time lag of 10 seconds in Fig. 34, then the time lag from the current time is expressed
in the number of displayed human images. In this manner, the quantity of time lag
from the current time is recognized visually. Accordingly, both speaker and listener
can adjust the speech speed conversion easily, so that this apparatus becomes
easier to handle.
[0192] The visual display of the time lag is realized, for example, by the provision of
a liquid crystal display in the front center of the speech speed conversion apparatus
body shown in Fig. 6 and by the display of the display screen as shown in Fig. 34
on the liquid crystal display. Further, this display portion is controlled by a "liquid
crystal display driver" (not shown) connected to the controller 23E in Fig. 11.
[0193] Because the quantity of the time lag to be displayed is continuously managed by the
lag quantity counter in the main process shown in Figs. 13 and 14, the numerical value
of this lag quantity counter can be converted at the conversion rate of one to 10
seconds so that a corresponding number of human images can be indicated on the aforementioned
display. This displaying operation is carried out by the controller 23E in Fig. 11
through the aforementioned display driver and the timing of rewriting the display
is sufficient as long as the rewriting is performed whenever the processing of one
frame is completed. For example, this displaying process is carried out between the
steps S137 and S138 in Fig. 14.
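The conversion from the lag quantity counter to the number of displayed human images can be sketched in Python as follows. It is assumed here that the counter is kept in frames, that the frame length is known in milliseconds, and that a partial 10-second unit is rounded down; none of these details is fixed by the description, and the function name is hypothetical.

```python
def human_images_for_lag(lag_frames, frame_ms, ms_per_image=10_000):
    """Return how many human images represent the current lag."""
    if lag_frames < 0:
        raise ValueError("lag must not be negative")
    lag_ms = lag_frames * frame_ms   # integer arithmetic avoids rounding drift
    return lag_ms // ms_per_image


# Usage: 1750 frames of 20 ms are 35 seconds of lag.
images = human_images_for_lag(1750, 20)
```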
[0194] Referring to Fig. 35, there is a view for explaining an electric source device in
a speech speed conversion apparatus according to the present invention. The reference
numeral 1000 designates an apparatus portion concerning the speech speed conversion
apparatus; 1, a DSP; 5, an A/D converter; 6, a D/A converter; 9, an analog amplifier;
10, an analog amplifier; 1001, an electric source; 1002, an electric power supply
line; and 1003, a changeover switch.
[0195] In the speech speed conversion apparatus in this embodiment, as shown in the state
transition view of Fig. 15, a standby mode is provided besides the through mode so
that entry into the standby mode is made automatically when the through mode is continued
for a predetermined time. That is, when either slow switch or repeat switch is pushed
(turned on), clock frequency is heightened so that each process is carried out.
[0196] Further, in the through mode, the DSP 1 operates with a fast clock, which wastes power
because the speech speed conversion process or the like is not executed. In the standby
mode, therefore, the operating clock for the DSP 1 is lowered so that only I/O of
data and storage into the memory are performed, thereby reducing the consumed electric
power. In this manner, a voice memory function is realized.
[0197] Further, as shown in Fig. 35, at the time of an analog through mode, the changeover
switch 1003 is connected to a contact side to cut off the electric power supply line
1002 and also connected to a contact side to connect the analog amplifiers 10 and
9 directly so that electric power is not supplied to the DSP 1, the A/D converter
5, the D/A converter 6 and peripheral digital circuits. At this time, the storage
into the memory is not performed. That is, I/O analog systems are connected directly
so as to be operated simply as an analog amplifier. The aforementioned changeover
switch is provided as a switch of three stages, namely, an ON stage, an OFF stage
and an ON-OFF intermediate stage as shown in Fig. 35, so that the analog through mode
is provided.
[0198] As is obvious from the above explanation, in accordance with this embodiment, a switch
of three stages consisting of an ON stage, an OFF stage and an ON-OFF intermediate
stage is formed so that the analog through mode is provided. Accordingly, not only
reduction in electric power can be attained but also the range of use of the electric
source can be widened.
[0199] Referring to Fig. 36, an embodiment in which the speech speed conversion means according
to the present invention is applied to a telephone will be described. The reference
numeral 2000 designates the speech speed conversion means according to the present
invention; 3000, a body of the telephone; 3001, a transceiver; and 3002, a telephone
line.
[0200] As shown in Fig. 36, the telephone in this embodiment is formed by inserting the
speech speed conversion means 2000 according to the present invention between the
handset 3001 and the telephone body 3000. The speech speed conversion means 2000 is,
for example, shaped like a mount on which the telephone body 3000 is put.
[0201] Further, in the case of a cordless handset or cordless child transceiver 3001, the
speech speed conversion means 2000 is inserted between the transceiver 3001 and the
telephone body 3000 by wireless connection.
[0202] Further, the speech speed conversion means according to the present invention may
be used as a speech speed conversion means in a switching system so that it can be
operated at the user's request.
[0203] In the aforementioned structure, a voice can be heard slowly over the telephone.
Further, because the voice is fed back as a through voice to the speaker side while
the listener hears the voice slowly, the speaker can speak in an ordinary manner
at the time of telephone conversation with the aged or the like, and there is no fear
that speaking becomes difficult.
[0204] Further, it is unnecessary that any A/D means is provided in the inside of the speech
speed conversion means as long as the speech speed conversion means is provided as
a digital circuit.
[0205] Referring to Fig. 37, description will be made as to an embodiment in which the speech
speed conversion means according to the present invention is applied to a premises
broadcasting system. The reference numeral 2000 designates the speech speed conversion
means; 321, a microphone; 325, an earphone; 4003, an amplifier; and 4004, a speaker.
[0206] In the premises broadcasting system in this embodiment, as shown in Fig. 37, the speech
speed conversion means 2000 according to the present invention is inserted between the
microphone 321, the earphone 325 and the amplifier 4003 for the speaker 4004.
[0207] In the aforementioned structure, the listener can hear a voice at a suitable speech
speed even in the case where the speaking person does not control the speech speed
conversion operation. For example, even in the case where the speaking person talks
volubly at a high speech speed (impetuous speed) selfishly, the listener can hear
at a suitable speech speed.
[0208] Further, the listener can hear at a suitable speech speed from a speaker even in
the case where a speaking person speaks slowly.
[0209] As is obvious from the above explanation, the present invention can be applied to
technical fields requiring speech speed conversion, such as for example hearing aids,
learning of languages, abroad traveling, music, and so on, besides telephones, telephone
line switching systems and premises broadcasting.
[0210] For example, in goods for learning of languages and for abroad traveling, the present
invention can be applied to the following cases.
(1) A voice recorded is heard continuously and slowly.
(2) The expansion rate is changed in accordance with the improvement in level.
(3) A portion which was hard to understand when heard at an ordinary speed is heard
repeatedly and slowly.
(4) After heard slowly, a voice is heard again at its original speed.
(5) After slow repeat, pronunciation is imitated.
(6) An imitated voice is heard in comparison with its original voice.
(7) A plurality of persons hear one source simultaneously at their favorite speech
speeds.
[0211] Further, in combination with a digital audio apparatus such as a tape recorder, a
CD recorder, an MD recorder, and so on, it is unnecessary that any A/D converter is
provided in the speech speed conversion apparatus as long as the audio apparatus has
a digital output.
[0212] Further, in the music purpose, the present invention can be applied as long as changes
are made as follows.
[0213] The judgment based on the power of an expanded frame is not carried out (because
tempo is disordered).
[0214] The pitch extraction range is widened compared with the case of a voice.
[0215] The waveform expansion process is carried out on the basis of the pitch of a fixed
length. In the case of a voice thereof, the pitch is detected so that processing is
made on the basis of the detected pitch.
[0216] A foot switch is provided so that a converting operation can be carried out by the
foot switch. This makes it possible to control a music instrument while playing.
[0217] Although the present invention has been described specifically on the basis of the
embodiments thereof, it is a matter of course that the present invention is not limited
to the aforementioned embodiments and that various changes may be made within the
scope of the claims.
[0218] In brief, effects obtained by typical embodiments of the present invention disclosed
in this application are as follows.
(1) Because the speech speed conversion apparatus can be used not only for a voice
one-sidedly given to the listener such as a radio voice but also in the situation
of conversation, a voice to be subjected to speech speed conversion can be selected
by the listener without any disturbance of listener's own speech.
Further, in hearing aids, foreign language learning machines, telephones, and so on,
talker's voice can be heard at a slow speech speed without any change of the characteristic
of the voice.
(2) Effective use of the memory, a raw voice repeat function, a voice memory function,
a repeat voice speech speed conversion function, a fast-hearing reproduction function,
and so on, can be provided.
(3) Because means for changing the speech speed to a value selected by the speech
speed selection switch is provided, the speech speed of a voice to be heard can be
selected by the listener's own will.
(4) Because means for repeating a reproduced voice in a period in which the repeat
switch is turned on is provided, the speech speed of the repeat voice can be converted.
(5) Because a catching-up means for catching up to a position of stored information
to be heard is provided in the speech speed conversion apparatus, widening of the
range of application of the speech speed conversion apparatus, reduction in operating
time, improvement in handling property, and so on, can be attained.
(6) Because at least one of the speech speed conversion switch, the speech speed selection
switch, the repeat switch and the reset switch is provided in a peripheral portion
which is on a side surface of the speech speed conversion apparatus and easy to handle,
widening of the range of application of the speech speed conversion apparatus, reduction
in operating time, improvement in handling property, and so on, can be attained.
(7) Efficiency in the speech speed conversion process can be improved.
(8) Because not only the determination of the waveform expansion/reduction process
and of the silent-part elimination process in the speech speed conversion process
is made on the basis of comparison between the power of a frame and the threshold
but also the threshold is changed in accordance with the loudness of the input voice,
the speech speed conversion process can be carried out in accordance with the environmental
condition in use.
(9) Because the microphone does not catch click noise of switches, the reproduced
voice can be heard accurately.
(10) Because the switches have respective surface forms which are different in tactility
so as to be identified without seeing, handling property can be improved.
(11) Because means for preventing the rustle of clothes against the microphone is
provided, the entrance of noise can be reduced.
(12) Because display means capable of visually indicating the quantity of a time lag
from the current time is provided in a predetermined position of the speech speed
conversion apparatus, reduction in operating time, improvement in handling property,
and so on, can be attained.
(13) Because a ring buffer is used as the memory means and means for managing the
lag time by a counter indicating the time lag on the ring buffer is provided, the
otherwise complex calculation of pointer addresses in the repeat process, the catching-up
process, and so on, can be performed easily.
(14) Because a standby mode and an analog through mode are provided besides the through
mode, reduction in consumed electric power can be attained.
(15) Because the electric source switch is provided in the form of three stages consisting
of an ON stage, an OFF stage and an ON-OFF intermediate stage so that the analog through
mode is provided, not only can consumed electric power be reduced but also the range
of use of the electric source can be widened.
(16) Because the aforementioned speech speed conversion means is provided between
the handset of a telephone and the apparatus body, a voice to be subjected to speech
speed conversion can be selected by the listener without any disturbance of the listener's
own speech.
(17) A talker's voice over the telephone can be heard at a slow speech speed without
any change of the characteristic of the voice.
(18) Because the speech speed conversion means is provided in a telephone line switching
system, a voice to be subjected to speech speed conversion can be selected by the
listener without any disturbance of the listener's own speech.
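By way of illustration only, the lag management mentioned in effects (5) and (13) can be sketched as follows. This is a minimal sketch under stated assumptions and not the implementation of the embodiments: the class name `LagRingBuffer`, its method names and the frame-count unit of the lag counter are hypothetical. The read position is derived from the write position and the lag counter, so a repeat only increases the counter, catching-up by fast hearing reads more than one frame per input frame, and a reset sets the counter to zero.

```python
class LagRingBuffer:
    """Ring buffer of frames whose read position is derived from a lag counter.

    All names are hypothetical; the lag is counted in frames that have been
    stored but not yet reproduced.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = [None] * capacity
        self.write_pos = 0  # slot that receives the next input frame
        self.stored = 0     # number of valid frames held (at most capacity)
        self.lag = 0        # frames between real time and the reproduction point

    def write(self, frame):
        """Store one input frame; the reproduction point falls one frame behind."""
        self.frames[self.write_pos] = frame
        self.write_pos = (self.write_pos + 1) % self.capacity
        self.stored = min(self.stored + 1, self.capacity)
        self.lag = min(self.lag + 1, self.capacity)

    def read(self):
        """Reproduce the oldest pending frame, or None when already at real time."""
        if self.lag == 0:
            return None
        pos = (self.write_pos - self.lag) % self.capacity
        self.lag -= 1
        return self.frames[pos]

    def repeat(self, frames_back):
        """Turn reproduction back, stopping at the oldest frame still stored."""
        self.lag = min(self.lag + frames_back, self.stored)

    def reset(self):
        """Skip to real time (the role described for the reset switch)."""
        self.lag = 0
```

In this sketch, one `write` followed by one `read` per frame keeps the lag constant (through mode, or parallel movement with a time lag), whereas two `read` calls per `write` reduce the lag by one frame per input frame until reproduction has caught up with real time.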
1. A speech speed conversion method for receiving and storing an input speech and changing
a speed of said input speech without any change of the pitch of said input speech,
wherein speed conversion for said input speech is carried out in a period which is
designated by a listener when speech speed conversion is needed, and no speech speed
conversion is carried out in the period other than said designated period,
characterised in that the quantity of lag from real time is adjusted in a period in which the stored
speech is reproduced in case the lag is caused by a speech speed conversion or repeat
operation.
2. A speech speed conversion apparatus comprising:
means (321) for receiving input speech;
memory means (2) for storing information representative of said input speech;
speed conversion means (1) for changing the speed of said input speech;
means (325) for supplying an output of said speed conversion means (1) as a speech
output to listener's ears;
a speech speed conversion switch (113); and
means adapted to output speech while changing the speed of said input speech when
said speed conversion switch (113) is turned on, and to output speech without changing
the speed when said switch is turned off;
characterised by catching-up means for adjusting the quantity of lag from real time in a period
in which said stored information is reproduced in case the lag is caused by a speech
speed conversion or repeat operation.
3. The apparatus of claim 2, wherein said memory means (2) includes means for storing
data on a frame by frame basis.
4. The apparatus of claim 3, further comprising means for deciding about waveform expansion
and reduction processes in said speed conversion process on the basis of a comparison
between the power of a frame and a threshold provided as a variable.
5. The apparatus of claim 2, further comprising a speed selection switch (113) for selecting
the speed of said speech, and means for changing the speed of said speech to that
selected by said speed selection switch.
6. The apparatus of claim 5, further comprising means for controlling an audio/video
apparatus.
7. The apparatus of claim 2, further comprising a repeat switch (105) and means for repeating
a reproduced speech when said repeat switch is turned on.
8. The apparatus of claim 7, wherein said repeat means includes means for turning back
the speech by several seconds whenever said repeat switch (105) is pushed once, means
for sometimes generating intermittent sounds while the speech is turned back, means
for stopping the turning-back of the speech when the speech reaches the end of a ring
buffer (24), and/or means for selecting the speed at the repeat time.
9. The apparatus of claim 8, wherein said means for selecting the speed at the repeat
time has at least two of the following modes: default speed value repeat, slow repeat,
fast repeat, and gradually accelerated repeat.
10. The apparatus of claim 2, wherein said catching-up means includes means for starting
catching-up when a slow reproduction mode is terminated, means for starting catching-up
when reproduction is turned back to the point of time of the start of a repeat after
the repeat, means for selecting the speech speed at the catching-up time, means for
automatically shifting the current mode to a through mode for directly outputting
the input speech when catching-up is completed, and/or means for generating a report
signal sound when catching-up is completed.
11. The apparatus of claim 10, wherein said means for selecting the speed at the catching-up
time has means for making a non-stop skip to real time, means for catching up the
real time with fast hearing, and/or means for making a parallel movement with a time
lag.
12. The apparatus of claim 2, further comprising a speed selection switch, a repeat switch
(105), and/or a reset switch (106), the or each said switch being provided in a peripheral
portion on a side surface of said speed conversion apparatus so as to facilitate handling.
13. The apparatus of claim 12, wherein said reset switch (106) includes means for stopping
the repeating or catching-up operation and making a skip to real time when said switch
is turned on at the repeat or catching-up time, and then shifting the current mode
to a through mode.
14. The apparatus of claim 2, wherein said speed conversion means is provided as software
(11) executed by a digital signal processor (1) having an input terminal for receiving
an interruption request signal from the outside, so that an instruction for controlling
the speed conversion process or for switching the rate of speed conversion on the basis
of said speed conversion switch (113) is given to said digital signal processor (1) via
said interruption request signal input terminal.
15. The apparatus of claim 2, further comprising means for hearing said output speech
through a binaural headphone (325).
16. The apparatus of claim 2, further comprising:
a microphone (321) for converting an acoustic signal into an electric signal;
an analog amplifier (10) for amplifying an output of said microphone (321);
a low-pass filter (7) for removing high-frequency components from the output of said
analog amplifier (10);
an A/D converter (5) for converting the analog output signal of said low-pass filter
(7) into a digital signal;
a digital signal processor (1) for executing the speed changing process;
means (113) for changing a processing parameter;
a D/A converter (6) for converting the digital speech data into an analog value;
a second low-pass filter (8) for removing high-frequency components from the output
of said D/A converter (6);
a second analog amplifier for amplifying the output of said second low-pass filter
(8); and
a headphone (325) for converting the output of said second analog amplifier into an
acoustic signal and supplying the acoustic signal to both ears.
17. The apparatus of claim 16, wherein said speed conversion means carries out a series
of procedures over a whole frame repeatedly through a pipe-line process by frame using
a plurality of input frame buffers, said series of procedures including:
applying a pitch extraction process to a leading portion of the frame to detect the
pitch of the leading portion;
transferring data of the length of one pitch thus detected to output buffers;
multiplying data of the length of two pitches by a window function which changes from
0 to 1, and by a window function which changes from 1 to 0;
adding up respective data obtained by the multiplications by the window functions
to generate a reproduced wave pattern having a time length of two pitches;
inserting the reproduced wave pattern in the rear of the preliminarily transferred
data of the length of one pitch;
carrying out the pitch detection process again, starting from a position at a distance
of two pitches from the position previously subjected to the pitch extraction process,
to thereby perform pitch detection at said position; and
transferring data of the length of n pitches (n being an integer) based on the pitch
length obtained by the final pitch detection to the output buffers.
18. The apparatus of claim 17, wherein said speed conversion process is executed only
if the average power of data in an input frame is higher than a preliminarily set
threshold, the data contained in said frame being directly transferred to the output
buffers if said average power is lower than said threshold.
19. The apparatus of claim 18, wherein a second threshold is provided in the threshold
process for the average power of data in the input frame so that, when frames having
an average power lower than said second threshold continue for a time longer than
a preliminarily set time threshold, the data in the frames having an average power
lower than the second threshold and continuing for a time longer than said time threshold
are forbidden to be transferred to the output buffers.
20. The apparatus of any one of claims 2 to 15, wherein the or each said switch has a
soft touch feeling so that the microphone (321) does not pick up click noise of the
switch.
21. The apparatus of claim 20, wherein the or each said switch has respective surface
forms different in tactility so as to be identified without seeing.
22. The apparatus of any one of claims 2 to 21, further comprising rustling prevention
means for changing the distance between a microphone (321) and the apparatus body
(101) so that said microphone (321) does not touch clothes directly when said apparatus
body (101) is put into a breast pocket in use.
23. The apparatus of claim 2, further comprising display means provided at a predetermined
position of said speed conversion apparatus for visually indicating the quantity of
time lag from real time.
24. The apparatus of any one of claims 2 to 23, wherein a ring buffer (24) is used as
said memory means (2), and said apparatus further comprises means for managing lag
time by a counter indicating the time lag on said ring buffer (24).
25. The apparatus of claim 16, wherein a standby mode for lowering the clock cycle of
the processor (1) and carrying out the same process as in the through mode is provided
besides the through mode.
26. The apparatus of claim 2, further comprising an electric source switch (109) operated
at three stages consisting of an ON stage, an OFF stage and an ON-OFF intermediate
stage, and an electric source supply means operated in an analog through mode in which
analog input-output systems are short-circuited so as to be directly connected to
each other to stop electric source supply to a digital processing system between said
analog input-output systems when said switch is adjusted to said intermediate stage.
27. A telephone including the speech speed conversion apparatus of claim 2 between the
handset and the body of the telephone.
28. A telephone line switching system including the speech speed conversion apparatus
of claim 2.
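By way of illustration only, the pitch-based waveform expansion of claim 17 can be sketched as follows. This is a simplified sketch and not the claimed procedure itself: it cross-fades a pattern of one pitch period rather than the two-pitch pattern recited in the claim, it uses a plain normalised autocorrelation as a stand-in for the pitch extraction process, and the function names `detect_pitch`, `expand_once` and `expand` as well as the parameters `min_lag`, `max_lag` and `n_pass` are hypothetical. For two neighbouring pitch periods A and B, the sketch outputs A, then a pattern that fades from B (window 1 to 0) into A (window 0 to 1), then B, so that the inserted period joins its neighbours without a discontinuity.

```python
import math


def detect_pitch(x, start, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] whose two adjacent segments,
    starting at `start`, have the highest normalised correlation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        if start + 2 * lag > len(x):
            break  # not enough samples left for two segments of this length
        a = x[start:start + lag]
        b = x[start + lag:start + 2 * lag]
        num = sum(p * q for p, q in zip(a, b))
        den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b)) or 1.0
        score = num / den
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def expand_once(x, start, period):
    """Return periods A and B with one cross-faded period inserted between them."""
    a = x[start:start + period]
    b = x[start + period:start + 2 * period]
    # The inserted pattern starts like B (falling window) and ends like A (rising window).
    cross = [b[n] * (1.0 - n / period) + a[n] * (n / period) for n in range(period)]
    return a + cross + b


def expand(x, min_lag, max_lag, n_pass=1):
    """Expand x by inserting one period per (2 + n_pass) periods of input."""
    out, pos = [], 0
    while pos + 2 * max_lag <= len(x):
        period = detect_pitch(x, pos, min_lag, max_lag)
        out.extend(expand_once(x, pos, period))
        pos += 2 * period
        if pos + 2 * max_lag > len(x):
            break
        # Detect the pitch again two periods later and pass n_pass periods unchanged.
        period = detect_pitch(x, pos, min_lag, max_lag)
        end = min(pos + n_pass * period, len(x))
        out.extend(x[pos:end])
        pos = end
    out.extend(x[pos:])  # copy the remaining tail unchanged
    return out
```

In this sketch a larger `n_pass` passes more periods unchanged between insertions and therefore gives a smaller expansion ratio; the pitch of the signal is unchanged because only whole periods are inserted.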
1. Verfahren zur Änderung der Geschwindigkeit von Sprache, wobei Spracheingabe empfangen
und gespeichert sowie in ihrer Geschwindigkeit ohne Änderung ihrer Signalperiode geändert
wird, und wobei eine Geschwindigkeitsänderung der Spracheingabe in einer von einem
Hörer bestimmten Periode, in der eine Änderung der Sprachgeschwindigkeit benötigt
wird, durchgeführt wird, in der übrigen Zeit dagegen nicht,
dadurch gekennzeichnet, daß das Maß der Verzögerung gegenüber Echtzeit in einer Periode eingestellt wird,
in der die gespeicherte Sprache wiedergegeben wird, falls die Verzögerung durch eine
Änderung der Sprachgeschwindigkeit oder einen Wiederholvorgang bewirkt ist.
2. Vorrichtung zur Änderung der Geschwindigkeit von Sprache, mit
einer Einrichtung (321) zum Empfang von Spracheingabe;
einer Speichereinrichtung (2) zum Speichern von die Spracheingabe wiedergebenden Informationen;
einer Geschwindigkeitsänderungs-Einrichtung (1) zum Ändern der Geschwindigkeit der
Spracheingabe;
einer Einrichtung (325) zum Zuführen eines Ausgangs der Geschwindigkeitsänderungs-Einrichtung
(1) als Sprachausgabe an die Ohren eines Hörers;
einem Sprachgeschwindigkeits-Änderungsschalter (113); und
einer Einrichtung, die so ausgelegt ist, daß sie Sprache im eingeschalteten Zustand
des Geschwindigkeits-Änderungsschalters (113) unter Änderung der Geschwindigkeit der
Spracheingabe und im ausgeschalteten Zustand des Schalters ohne Änderung der Geschwindigkeit
ausgibt;
gekennzeichnet durch eine Aufholeinrichtung zum Justieren des Maßes an Verzögerung gegenüber Echtzeit
in einer Periode, in der die gespeicherte Information wiedergegeben wird, falls die
Verzögerung durch eine Änderung der Sprachgeschwindigkeit oder einen Wiederholvorgang
bewirkt ist.
3. Vorrichtung nach Anspruch 2, wobei die Speichereinrichtung (2) eine Einrichtung zum
rahmenweisen Speichern von Daten aufweist.
4. Vorrichtung nach Anspruch 3 mit ferner einer Einrichtung, die in dem Geschwindigkeitsänderungsvorgang
hinsichtlich Impulsdiagramm-Erweiterungs- und -Verringerungs-Vorgängen aufgrund eines
Vergleiches zwischen der Energie eines Rahmens und einem als Variable vorgesehenen
Schwellenwert entscheidet.
5. Vorrichtung nach Anspruch 2 mit ferner einem Geschwindigkeits-Wählschalter (113) zum
Wählen der Geschwindigkeit der Sprache und einer Einrichtung zum Ändern der Geschwindigkeit
der Sprache auf die von dem Geschwindigkeits-Wählschalter ausgewählte Geschwindigkeit.
6. Vorrichtung nach Anspruch 5 mit ferner einer Einrichtung zum Steuern eines Audio/
Video-Gerätes.
7. Vorrichtung nach Anspruch 2 mit ferner einem Wiederholschalter (105) und einer Einrichtung,
die im eingeschalteten Zustand des Wiederholschalters reproduzierte Sprache wiederholt.
8. Vorrichtung nach Anspruch 7, wobei die Wiederholeinrichtung eine Einrichtung aufweist,
die die Sprache um mehrere Sekunden zurückschaltet, wenn der Wiederholschalter (105)
einmal gedrückt wird, eine Einrichtung zum mehrfachen Erzeugen von intermittierenden
Tönen, während die Sprache zurückgeschaltet wird, eine Einrichtung zum Anhalten der
Sprachrückschaltung, wenn die Sprache das Ende eines Ringpuffers (24) erreicht, und/
oder eine Einrichtung zum Auswählen der Geschwindigkeit während der Wiederholzeit.
9. Vorrichtung nach Anspruch 8, wobei die Einrichtung zum Wählen der Geschwindigkeit
während der Wiederholzeit mindestens zwei der folgenden Moden aufweist: Wiederholung
mit Vorgabegeschwindigkeit, langsame Wiederholung, schnelle Wiederholung, allmählich
schneller werdende Wiederholung.
10. Vorrichtung nach Anspruch 2, wobei die Aufholeinrichtung eine Einrichtung enthält,
die einen Aufholvorgang startet, wenn ein langsamer Wiedergabemodus beendet ist, eine
Einrichtung, die einen Wiederholvorgang startet, wenn die Wiedergabe nach der Wiederholung
auf den Zeitpunkt des Beginns einer Wiederholung zurückgeschaltet wird, eine Einrichtung
zum Wählen der Sprachgeschwindigkeit während der Aufholzeit, eine Einrichtung zum
automatischen Umschalten des laufenden Modus auf einen Durchgangsmodus zum direkten
Ausgeben der Spracheingabe, wenn der Aufholvorgang beendet ist, und/oder eine Einrichtung
zum Erzeugen eines Bestätigungs-Signalstons, wenn der Aufholvorgang beendet ist.
11. Vorrichtung nach Anspruch 10, wobei die Einrichtung zum Wählen der Geschwindigkeit
während der Aufholzeit eine Einrichtung aufweist, die einen Sprung auf Echtzeit ohne
Stop durchführt, eine Einrichtung, die Echtzeit mit schnellem Hören aufholt, und/oder
eine Einrichtung, die eine Parallelbewegung mit zeitlicher Verzögerung durchführt.
12. Vorrichtung nach Anspruch 2 mit ferner einem Geschwindigkeits-Wählschalter, einem
Wiederholschalter (105) und/oder einem Rückstellschalter (106), wobei der oder die
Schalter zur erleichterten Handhabung in einem Randbereich an einer Seitenfläche der
Geschwindigkeitsänderungs-Einrichtung vorgesehen sind.
13. Vorrichtung nach Anspruch 12, wobei der Rückstellschalter (106) eine Einrichtung aufweist,
die den Wiederhol- bzw. Aufholvorgang anhält und einen Sprung zu Echtzeit durchführt,
wenn der Schalter während der Wiederhol- bzw. Aufholzeit eingeschaltet wird, und anschließend
den laufenden Modus in einen Durchgangsmodus umschaltet.
14. Vorrichtung nach Anspruch 2, wobei die Geschwindigkeitsänderungs-Einrichtung als Software
(11) vorgesehen ist, die von einem Digitalsignal-Prozessor (1) ausgeführt wird, wobei
dieser eine Eingangsklemme zum Empfang eines Unterbrechungs-Anforderungssignals von
Außen aufweist, so daß die Steuerung des Geschwindigkeitsänderungsvorgangs oder die
Umschaltung des Verhältnisses der Geschwindigkeitsänderung aufgrund des Geschwindigkeits-Änderungsschalters
(113) dem Digitalsignal-Prozessor (1) über den Eingangsanschluß für das Unterbrechungs-Anforderungssignal
zugeführt wird.
15. Vorrichtung nach Anspruch 2 mit ferner einer Einrichtung zum Hören der Sprachausgabe
über Zweiohr-Kopfhörer (325).
16. Vorrichtung nach Anspruch 2 mit ferner
einem Mikrophon (321) zum Umwandeln eines akustischen in ein elektrisches Signal;
einem Analogverstärker (10) zum Verstärken eines Ausgangssignals des Mikrophons (321);
einem Tiefpaßfilter (7) zum Entfernen von Hochfrequenzkomponenten aus dem Ausgangssignal
des Analogverstärkers (10);
einem Analog/Digital-Umsetzer (5) zum Umsetzen des analogen Ausgangssignals des Tiefpaßfilters
(7) in ein Digitalsignal;
einem Digitalsignal-Prozessor (1) zur Durchführung des Geschwindigkeitsänderungsvorgangs;
einer Einrichtung (113) zum Ändern eines Verarbeitungsparameters;
einem Digital/Analog-Umsetzer (6) zum Umsetzen der digitalen Sprachdaten in Analogwerte;
einem zweiten Tiefpaßfilter (8) zum Entfernen von Hochfrequenzkomponenten aus dem
Ausgangssignal des Digital/Analog-Umsetzers (6);
einem zweiten Analogverstärker zum Verstärken des Ausgangssignals des zweiten Tiefpaßfilters
(8); und
einem Kopfhörer (325) zum Umwandeln des Ausgangssignals des zweiten Analogverstärkers
in ein akustisches Signal und zum Zuführen des akustischen Signals an beide Ohren.
17. Vorrichtung nach Anspruch 16, wobei die Geschwindigkeitsänderungs-Einrichtung über
einen vollen Rahmen wiederholt eine Folge von Prozeduren durch rahmenweise Pipeline-Verarbeitung
unter Verwendung mehrerer Eingangsrahmenpuffer durchführt, und wobei zu der Folge
von Prozeduren gehört:
Ausführen eines Signalperioden-Extraktionsvorgangs auf den Anfangsteil des Rahmens,
um die Signalperiode des Anfangsteils zu erfassen;
Übertragen von Daten der Länge einer so erfaßten Signalperiode an Ausgangspuffer;
Multiplizieren von Daten der Länge zweier Signalperioden mittels einer von 0 nach
1 und einer von 1 nach 0 sich ändernden Fensterfunktion;
Aufaddieren von durch die Multiplikationen mittels der Fensterfunktionen erhaltenen
jeweiligen Daten zur Erzeugung eines wiedergegebenen Wellenmusters mit der zeitlichen
Länge zweier Signalperioden;
Einfügen des wiedergegebenen Wellenmusters am Ende der vorläufig übertragenen Daten
der Länge einer Signalperiode;
erneutes Ausführen eines Signalperioden-Erfassungsvorgangs an einer Stelle, die sich
im Abstand von zwei Signalperioden von derjenigen Stelle befindet, an der vorher der
Signalperioden-Extraktionsvorgang durchgeführt wurde, so daß die Erfassung der Signalperiode
an dieser Stelle erfolgt, und
Übertragen von Daten der Länge von n Signalperioden (wobei n eine ganze Zahl ist)
aufgrund der in der letzten Signalperiodenerfassung erzielten Signalperiodenlänge
an die Ausgangspuffer.
18. Vorrichtung nach Anspruch 17, wobei der Geschwindigkeitsänderungsvorgang nur dann
ausgeführt wird, wenn die mittlere Energie von Daten in einem Eingangsrahmen höher
ist als ein vorher eingestellter Schwellenwert, und wobei die in dem Rahmen enthaltenen
Daten an die Ausgangspuffer direkt übertragen werden, wenn die mittlere Energie unter
dem Schwellenwert liegt.
19. Vorrichtung nach Anspruch 18, wobei in dem Schwellenwertvorgang für die mittlere Energie
der Daten in dem Eingangsrahmen ein zweiter Schwellenwert vorgesehen wird, wenn sich
ein Rahmen mit einer unter dem zweiten Schwellenwert liegenden mittleren Energie über
eine Zeitspanne fortsetzt, die länger ist als ein vorgegebener zeitlicher Schwellenwert,
und wobei die Daten in einem Rahmen, dessen mittlere Energie unter dem zweiten Schwellenwert
liegt und der sich über eine über dem zeitlichen Schwellenwert liegende Zeitspanne
fortsetzt, nicht an die Ausgangspuffer übertragen werden dürfen.
20. Vorrichtung nach einem der Ansprüche 2 bis 15, wobei der bzw. jeder Schalter eine
Soft-touch-Charakteristik aufweist, so daß das Mikrophon (321) kein Klickgeräusch
des Schalters aufnimmt.
21. Vorrichtung nach Anspruch 20, wobei der bzw. jeder Schalter in der Griffigkeit unterschiedliche
Oberflächenformen aufweist, um eine Identifikation ohne Hinschauen zu ermöglichen.
22. Vorrichtung nach einem der Ansprüche 2 bis 21 mit ferner einer Raschel-Unterdrückungseinrichtung,
die den Abstand zwischen einem Mikrophon (321) und dem Gerätegehäuse (101) derart
ändert, daß das Mikrophon (321) die Kleidung nicht direkt berührt, wenn das Gerätegehäuse
(101) beim Gebrauch in eine Brusttasche gesteckt wird.
23. Vorrichtung nach Anspruch 2 mit ferner einer an einer vorgegebenen Stelle der Geschwindigkeitsänderungs-Einrichtung
vorgesehenen Anzeigeeinrichtung zur sichtbaren Darstellung des Maßes der Zeitverzögerung
gegenüber Echtzeit.
24. Vorrichtung nach einem der Ansprüche 2 bis 23, wobei als Speichereinrichtung (2) ein
Ringpuffer (24) dient und die Vorrichtung ferner eine Einrichtung zur Verwaltung der
Zeitverzögerung mittels eines Zählers aufweist, der die Zeitverzögerung an dem Ringpuffer
(24) angibt.
25. Vorrichtung nach Anspruch 16, wobei zusätzlich zu dem Durchgangsmodus ein Standby-Modus
vorgesehen ist, in dem der Taktzyklus des Prozessors (1) abgesenkt und der gleiche
Prozeß wie im Durchgangsmodus ausgeführt wird.
26. Vorrichtung nach Anspruch 2 mit ferner einem der elektrischen Versorgung dienenden
Schalter (109), der in drei Stufen arbeitet, nämlich einer EIN-Stufe, einer AUS-Stufe
und einer EIN-AUS-Zwischenstufe, sowie einer der Zuführung elektrischer Energie dienenden
Einrichtung, die in einem analogen Durchgangsmodus arbeitet, in dem analoge Eingangs/Ausgangs-Systeme
kurzgeschlossen und dadurch direkt miteinander verbunden sind, um die Zufuhr von elektrischer
Energie an ein digitales Verarbeitungssystem zwischen den analogen Eingangs/Ausgangs-Systemen
zu unterbrechen, wenn der Schalter in der Zwischenstellung steht.
27. Telefon, das zwischen dem Hörer und dem Telefonapparat die Sprachgeschwindigkeit-Änderungsvorrichtung
nach Anspruch 2 aufweist.
28. Telefon-Leitungsschaltsystem, das die Sprachgeschwindigkeits-Änderungsvorrichtung
nach Anspruch 2 aufweist.
1. Procédé de conversion de vitesse de la parole pour recevoir et mémoriser une parole
d'entrée et changer une vitesse de ladite parole d'entrée sans aucun changement du
pas de ladite parole d'entrée, dans lequel la conversion de vitesse pour ladite parole
d'entrée est effectuée dans une période qui est désignée par un auditeur lorsque la
conversion de vitesse de parole est nécessaire, et aucune conversion de vitesse de
parole n'est effectuée dans une période autre que ladite période désignée,
caractérisé en ce que la quantité de retard par rapport au temps réel est ajustée
dans une période dans laquelle la parole mémorisée est reproduite dans le cas où le
retard est provoqué par une conversion de vitesse de parole ou une opération de répétition.
2. Dispositif de conversion de vitesse de la parole comportant :
des moyens (321) pour recevoir une parole d'entrée,
des moyens de mémorisation (2) pour mémoriser des informations représentatives de
ladite parole d'entrée,
des moyens de conversion de vitesse (1) pour changer la vitesse de ladite parole d'entrée,
des moyens (325) pour envoyer dans les oreilles d'un auditeur une sortie desdits moyens
de conversion de vitesse (1) en tant que sortie de parole,
un commutateur de conversion de vitesse de parole (113), et
des moyens adaptés pour émettre une parole tout en changeant la vitesse de ladite
parole d'entrée lorsque ledit commutateur de conversion de vitesse (113) est passant,
et pour émettre une parole sans changer la vitesse lorsque ledit commutateur est bloqué,
caractérisé par des moyens de rattrapage pour ajuster la quantité de retard par
rapport au temps réel dans une période dans laquelle lesdites informations mémorisées
sont reproduites dans le cas où le retard est provoqué par une conversion de vitesse
de parole ou une opération de répétition.
3. Dispositif selon la revendication 2, dans lequel lesdits moyens de mémorisation (2)
comportent des moyens pour mémoriser des données sur une base de trame par trame.
4. Dispositif selon la revendication 3, comportant de plus des moyens pour décider des
processus d'extension et de réduction de forme d'onde dans ledit processus de conversion
de vitesse sur la base d'une comparaison entre la puissance d'une trame et un seuil
fourni en tant que variable.
5. Dispositif selon la revendication 2, comportant de plus un commutateur de sélection
de vitesse (113) pour sélectionner la vitesse de ladite parole, et des moyens pour
changer la vitesse de ladite parole en celle sélectionnée par ledit commutateur de
sélection de vitesse.
6. Dispositif selon la revendication 5, comportant de plus des moyens pour commander
un dispositif audio/vidéo.
7. Dispositif selon la revendication 2, comportant de plus un commutateur de répétition
(105) et des moyens pour répéter une parole reproduite lorsque ledit commutateur de
répétition est passant.
8. Dispositif selon la revendication 7, dans lequel lesdits moyens de répétition comportent
des moyens pour faire revenir en arrière la parole de plusieurs secondes à chaque
fois que ledit commutateur de répétition (105) est actionné, des moyens pour produire
parfois des sons intermittents pendant que la parole est renvoyée en arrière, des
moyens pour stopper le retour en arrière de la parole lorsque la parole atteint la
fin d'un tampon en anneau (24), et/ou des moyens pour sélectionner la vitesse au moment
de la répétition.
9. Dispositif selon la revendication 8, dans lequel lesdits moyens pour sélectionner
la vitesse au moment de la répétition ont au moins deux des modes suivants : une répétition
à valeur de vitesse par défaut, une répétition lente, une répétition rapide et une
répétition graduellement accélérée.
10. Dispositif selon la revendication 2, dans lequel lesdits moyens de rattrapage comportent
des moyens pour commencer le rattrapage lorsqu'un mode de reproduction lente est terminé,
des moyens pour commencer le rattrapage lorsque la reproduction est renvoyée à l'instant
du début d'une répétition après la répétition, des moyens pour sélectionner la vitesse
de parole au moment du rattrapage, des moyens pour permuter automatiquement le mode
courant en un mode continu pour émettre directement la parole d'entrée lorsque le
rattrapage est terminé, et/ou des moyens pour produire un son d'avertissement lorsque
le rattrapage est terminé.
11. Dispositif selon la revendication 10, dans lequel lesdits moyens pour sélectionner
la vitesse au moment du rattrapage comportent des moyens pour effectuer un saut non
stop jusqu'au temps réel, des moyens pour rattraper le temps réel avec une écoute
rapide, et/ou des moyens pour effectuer un mouvement parallèle avec un retard de temps.
12. Dispositif selon la revendication 2, comportant de plus un commutateur de sélection
de vitesse, un commutateur de répétition (105), et/ou un commutateur de réinitialisation
(106), ledit commutateur ou chaque commutateur étant agencé dans une partie périphérique
sur une surface latérale dudit dispositif de conversion de vitesse de manière à faciliter
la manipulation.
13. Dispositif selon la revendication 12, dans lequel ledit commutateur de réinitialisation
(106) comporte des moyens pour stopper l'opération de répétition ou de rattrapage
et faire un saut jusqu'au temps réel lorsque ledit commutateur est passant au moment
de la répétition ou du rattrapage, et permuter ensuite le mode courant en mode continu.
14. Dispositif selon la revendication 2, dans lequel lesdits moyens de conversion de vitesse
sont munis d'un logiciel (11) exécuté par un processeur de signaux numériques (1)
ayant une borne d'entrée pour recevoir un signal de demande d'interruption provenant
de l'extérieur, de sorte qu'une commande du processus de conversion de vitesse ou
une commutation de la fréquence de conversion de vitesse sur la base dudit commutateur
de conversion de vitesse (113) est envoyée dans ledit processeur de signaux numériques
(1) via ladite borne d'entrée de signal de demande d'interruption.
15. Dispositif selon la revendication 2, comportant de plus des moyens pour entendre ladite
parole de sortie à travers un écouteur binaural (325).
16. An apparatus according to claim 2, further comprising:
a microphone (321) for converting a sound signal into an electric signal,
an analog amplifier (10) for amplifying an output of said microphone (321),
a low-pass filter (7) for removing high-frequency components from the output of said
analog amplifier (10),
an analog-to-digital (A/D) converter (5) for converting the analog output signal of
said low-pass filter (7) into a digital signal,
a digital signal processor (1) for executing the speed change process,
means (113) for changing a processing parameter,
a D/A converter (6) for converting the digital speech data into an analog value,
a second low-pass filter (8) for removing high-frequency components from the output
of said D/A converter (6),
a second analog amplifier for amplifying the output of said second low-pass filter
(8), and
an earphone (325) for converting the output of said second analog amplifier into a
sound signal and delivering the sound signal to both ears.
17. An apparatus according to claim 16, wherein said speed conversion means repeatedly
perform a series of processes on a whole frame through frame-by-frame pipeline
processing using a plurality of input frame buffers, said series of processes comprising
the steps of:
applying a pitch extraction process to a front portion of the frame to detect the
pitch of the front portion,
transferring data having a length of one pitch thus detected to output buffers,
multiplying data having a length of two pitches by a window function which changes
from 0 to 1, and by a window function which changes from 1 to 0,
adding the respective data obtained by the multiplications by the window functions
to produce a reproduced waveform pattern having a duration of two pitches,
inserting the reproduced waveform pattern at the rear of the previously transferred
data having a length of one pitch,
performing a pitch detection process again while advancing to a position at a distance
of two pitches from the position previously subjected to the pitch extraction process,
so as to perform pitch detection at said position, and
transferring to the output buffers data having a length of n pitches (n being an
integer) on the basis of the pitch length obtained by the final pitch detection.
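The pitch-synchronous steps of claim 17 can be illustrated with a minimal Python sketch. It is one plausible reading, not the claimed implementation: the function names, the autocorrelation-based pitch estimator, the linear windows and the lag bounds are all assumptions, and the sketch inserts a single cross-faded period between two detected periods, which simplifies the claim's wording about data two pitches long.

```python
import numpy as np

def estimate_pitch_period(x, min_lag, max_lag):
    """Hypothetical pitch extractor: lag with the highest normalized autocorrelation."""
    best_lag, best_score = min_lag, -np.inf
    for lag in range(min_lag, max_lag + 1):
        a, b = x[:-lag], x[lag:]
        denom = np.sqrt(np.dot(a, a) * np.dot(b, b)) + 1e-12
        score = np.dot(a, b) / denom
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def stretch_frame(frame, min_lag=20, max_lag=200):
    """Slow a frame down by inserting one cross-faded period per two pitch periods."""
    out = []
    pos = 0
    n = len(frame)
    # Stop when too little data remains for a pitch search over two periods.
    while n - pos >= 2 * max_lag:
        p = estimate_pitch_period(frame[pos:pos + 2 * max_lag], min_lag, max_lag)
        first = frame[pos:pos + p]            # one pitch period, copied as is
        second = frame[pos + p:pos + 2 * p]   # the following pitch period
        rise = np.linspace(0.0, 1.0, p, endpoint=False)  # window going 0 -> 1
        fall = 1.0 - rise                                # window going 1 -> 0
        # The inserted period starts like `second` and ends like `first`,
        # so it joins smoothly to its neighbours on both sides.
        blended = second * fall + first * rise
        out.extend([first, blended, second])
        pos += 2 * p                          # advance two pitches and detect again
    out.append(frame[pos:])                   # remainder is passed through unchanged
    return np.concatenate(out)
```

With these assumptions every two input periods become three output periods, so voiced speech is slowed by a factor of about 1.5 while the pitch period itself is left unchanged.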
18. An apparatus according to claim 17, wherein said speed conversion process is executed
only if the average power of the data in an input frame is greater than a preset
threshold, the data contained in said frame being transferred directly to the output
buffers if said average power is lower than said threshold.
19. An apparatus according to claim 18, wherein a second threshold is provided in the
threshold process for the average power of the data in the input frame so that, when
a frame having an average power lower than said second threshold continues for longer
than a preset time threshold, the data in the frame having an average power lower than
the second threshold and continuing for longer than said time threshold are prohibited
from being transferred to the output buffers.
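The two-threshold frame handling of claims 18 and 19 can be sketched as follows. This is an illustration under stated assumptions: the function name, the action labels, measuring the time threshold in whole frames, and the requirement that the silence threshold does not exceed the speech threshold are all hypothetical choices, not details taken from the claims.

```python
import numpy as np

def plan_frames(frames, speech_threshold, silence_threshold, max_silent_frames):
    """Return one action per frame: 'convert', 'pass' or 'drop'.

    Assumes silence_threshold <= speech_threshold (hypothetical constraint).
    """
    actions = []
    silent_run = 0  # consecutive frames below the second (silence) threshold
    for frame in frames:
        power = float(np.mean(np.square(frame)))  # average power of the frame
        if power < silence_threshold:
            silent_run += 1
        else:
            silent_run = 0
        if power > speech_threshold:
            actions.append("convert")   # run the speed conversion process
        elif silent_run > max_silent_frames:
            actions.append("drop")      # long pause: not transferred to the output
        else:
            actions.append("pass")      # copied directly to the output buffers
    return actions
```

For example, with `max_silent_frames=2` the first two frames of a pause are still passed through and only the frames after them are dropped.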
20. An apparatus according to any one of claims 2 to 15, wherein the or each of said
switches has a soft touch so that the microphone (321) does not pick up the click noise
of the switch.
21. An apparatus according to claim 20, wherein the or each of said switches has a
respectively different surface shape to the touch so that the switches can be identified
without being seen.
22. An apparatus according to any one of claims 2 to 21, further comprising rustling
noise prevention means for changing the distance between a microphone (321) and the
apparatus body (101) so that said microphone (321) does not touch clothing directly
when said apparatus body (101) is used in a breast pocket.
23. An apparatus according to claim 2, further comprising display means arranged at
a predetermined position of said speed conversion apparatus for visually indicating
the amount of time delay relative to real time.
24. An apparatus according to any one of claims 2 to 23, wherein a ring buffer (24)
is used as said storage means (2), and said apparatus further comprises means for
managing the delay time by a counter indicating the time delay on said ring buffer (24).
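A ring buffer whose delay is tracked by a counter, as recited in claim 24, might look like the sketch below. The class and method names are hypothetical, and the choice to let the delay saturate at the buffer capacity (overwriting the oldest unread sample) is an assumption rather than something the claim states.

```python
class DelayRingBuffer:
    """Ring buffer in which a counter holds the delay between input and output."""

    def __init__(self, capacity):
        self.buf = [0.0] * capacity
        self.capacity = capacity
        self.write_pos = 0  # next slot to be written (the "real time" position)
        self.delay = 0      # samples written but not yet read

    def write(self, sample):
        self.buf[self.write_pos] = sample
        self.write_pos = (self.write_pos + 1) % self.capacity
        # Assumption: when the buffer is full the oldest unread sample is
        # overwritten, so the delay saturates at the capacity.
        self.delay = min(self.delay + 1, self.capacity)

    def read(self):
        """Return the oldest unread sample, or None when output is at real time."""
        if self.delay == 0:
            return None
        read_pos = (self.write_pos - self.delay) % self.capacity
        self.delay -= 1
        return self.buf[read_pos]

    def jump_to_real_time(self):
        """Discard the backlog, as a reset operation would."""
        self.delay = 0

    def delay_seconds(self, sample_rate):
        """Delay relative to real time, e.g. for a delay display."""
        return self.delay / sample_rate
```

Because the read position is derived from the write position and the counter, the counter alone is enough to report the delay and to jump back to real time.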
25. An apparatus according to claim 16, wherein a standby mode for reducing the clock
cycle of the processor (1) and performing the same process as in the through mode is
provided in addition to the through mode.
26. An apparatus according to claim 2, further comprising a power source switch (109)
operated in three levels consisting of an ON level, an OFF level and an intermediate
ON-OFF level, and power supply means operated in an analog through mode in which analog
input-output systems are short-circuited so as to be directly connected to each other,
so as to stop the supply of power to a digital processing system between said analog
input-output systems when said switch is set to said intermediate level.
27. A telephone comprising the speech speed conversion apparatus of claim 2 between
the handset and the telephone body.
28. A telephone line switching system comprising the speech speed conversion apparatus
of claim 2.