[0001] The present disclosure relates to an action control device, an action control method,
and a program.
[0002] In the related art, various types of robots have been developed but, in recent years,
advancements have been made in the development of not only industrial robots, but
also of consumer robots such as pet robots. For example, Unexamined
Japanese Patent Application Publication No. 2001-157985 describes a robot device, provided with a pressure sensor, that determines, by a
pattern of a detected pressure detection signal, whether a person that contacts the
robot device is a user that is registered in advance.
[0003] The robot device described in the above publication determines, on the basis of the
pattern of the detected pressure detection signal from the pressure sensor, whether
the person who pets the robot device is the user. However, the "user" in Unexamined
Japanese Patent Application Publication No. 2001-157985 is a person for which the pattern of the pressure detection signal is registered
in advance, and is not a person identified on the basis of intimacy with the robot device.
Consequently, there is a problem in that it is difficult to control the robot device
so as to perform actions based on the relationship between the user and the robot
device.
[0004] The present disclosure is made with the view of the above situation, and an objective
of the present disclosure is to make it possible to cause a device that executes actions
to perform an action that takes the relationship between the device and a subject
applying an external stimulus to the device into consideration.
[0005] One aspect of an action control device according to the present disclosure is
an action control device that controls an action of a control target device, the action
control device comprising:
a controller that
acquires an external stimulus, and
in a case where the controller executes an action corresponding to the external stimulus,
controls so as to execute, based on an intimacy between a subject applying the external
stimulus and the control target device, different action content.
[0006] According to the present disclosure, it is possible to cause a device that executes
actions to perform an action that takes the relationship between the device and a
subject applying an external stimulus to the device into consideration.
[0007] A more complete understanding of this application can be obtained when the following
detailed description is considered in conjunction with the following drawings, in
which:
FIG. 1 is a drawing illustrating the appearance of a robot according to Embodiment
1;
FIG. 2 is a cross-sectional view of the robot according to Embodiment 1, viewed from
a side surface;
FIG. 3 is a drawing for explaining a housing of the robot according to Embodiment
1;
FIG. 4 is a block diagram illustrating the functional configuration of the robot according
to Embodiment 1;
FIG. 5 is a drawing for explaining an example of an action mode setting table according
to Embodiment 1;
FIG. 6 is a drawing for explaining an example of an emotion map according to Embodiment
1;
FIG. 7 is a drawing for explaining an example of a growth table according to Embodiment
1;
FIG. 8 is a drawing for explaining an example of an action content table according
to Embodiment 1;
FIG. 9 is a flowchart of action control processing according to Embodiment 1;
FIG. 10 is a flowchart of microphone input processing according to Embodiment 1;
FIG. 11 is a drawing illustrating an example of a sound buffer according to Embodiment
1;
FIG. 12 is a flowchart of similarity with voice history determination processing according
to Embodiment 1;
FIG. 13 is a flowchart of action mode setting processing according to Embodiment 1;
FIG. 14 is a flowchart of normal action mode processing according to Embodiment 1;
FIG. 15 is a flowchart of familiar action mode processing according to Embodiment
1;
FIG. 16 is a flowchart of touch response familiar action processing according to Embodiment
1;
FIG. 17 is a flowchart of sound response familiar action processing according to Embodiment
1; and
FIG. 18 is a flowchart of loud sound response familiar action processing according
to Embodiment 1.
[0008] Hereinafter, embodiments of the present disclosure are described while referencing
the drawings. Note that, in the drawings, identical or corresponding components are
denoted with the same reference numerals.
Embodiment 1
[0009] An embodiment in which an action control device according to Embodiment 1 is applied
to a robot 200 illustrated in FIG. 1 is described while referencing the drawings.
As illustrated in FIG. 1, the robot 200 according to the embodiment is a pet robot
that resembles a small animal. The robot 200 is covered with an exterior 201 provided
with bushy fur 203 and decorative parts 202 resembling eyes. A housing 207 of the
robot 200 is accommodated in the exterior 201. As illustrated in FIG. 2, the housing
207 of the robot 200 includes a head 204, a coupler 205, and a torso 206. The head
204 and the torso 206 are coupled by the coupler 205.
[0010] Regarding the torso 206, as illustrated in FIG. 2, a twist motor 221 is provided
at a front end of the torso 206, and the head 204 is coupled to the front end of the
torso 206 via the coupler 205. The coupler 205 is provided with a vertical motor 222.
Note that, in FIG. 2, the twist motor 221 is provided on the torso 206, but may be
provided on the coupler 205 or on the head 204.
[0011] The coupler 205 couples the torso 206 and the head 204 so as to enable rotation (by
the twist motor 221) around a first rotational axis that passes through the coupler
205 and extends in a front-back direction of the torso 206. The twist motor 221 rotates
the head 204, with respect to the torso 206, clockwise (right rotation) within a forward
rotation angle range around the first rotational axis (forward rotation), counter-clockwise
(left rotation) within a reverse rotation angle range around the first rotational
axis (reverse rotation), and the like. Note that, in this description, the term "clockwise"
refers to clockwise when viewing the direction of the head 204 from the torso 206.
A maximum value of the angle of twist rotation to the right (right rotation) or the
left (left rotation) can be set as desired, and the angle of the head 204 in a state,
as illustrated in FIG. 3, in which the head 204 is not twisted to the right or the
left is referred to as a "twist reference angle."
[0012] The coupler 205 couples the torso 206 and the head 204 so as to enable rotation (by
the vertical motor 222) around a second rotational axis that passes through the coupler
205 and extends in a width direction of the torso 206. The vertical motor 222 rotates
the head 204 upward (forward rotation) within a forward rotation angle range around
the second rotational axis, downward (reverse rotation) within a reverse rotation
angle range around the second rotational axis, and the like. A maximum value of the
angle of rotation upward or downward can be set as desired, and the angle of the head
204 in a state, as illustrated in FIG. 3, in which the head 204 is not rotated upward
or downward is referred to as a "vertical reference angle." Note that, in FIG. 2,
an example is illustrated in which the first rotational axis and the second rotational
axis are orthogonal to each other, but a configuration is possible in which the first
and second rotational axes are not orthogonal to each other.
[0013] The robot 200 includes a touch sensor 211 that can detect petting or striking of
the robot 200 by a user. More specifically, as illustrated in FIG. 2, the robot 200
includes a touch sensor 211H on the head 204. The touch sensor 211H can detect petting
or striking of the head 204 by the user. Additionally, as illustrated in FIGS. 2 and
3, the robot 200 includes a touch sensor 211LF and a touch sensor 211LR respectively
on the front and rear of a left-side surface of the torso 206, and a touch sensor
211RF and a touch sensor 211RR respectively on the front and rear of a right-side
surface of the torso 206. These touch sensors 211LF, 211LR, 211RF, 211RR can detect
petting or striking of the torso 206 by the user.
[0014] The robot 200 includes an acceleration sensor 212 on the torso 206. The acceleration
sensor 212 can detect an attitude (orientation) of the robot 200, and can detect being
picked up, the orientation being changed, being thrown, and the like by the user.
The robot 200 includes a gyrosensor 213 on the torso 206. The gyrosensor 213 can detect
vibrating, rolling, rotating, and the like of the robot 200.
[0015] The robot 200 includes a microphone 214 on the torso 206. The microphone 214 can
detect external sounds. Furthermore, the robot 200 includes a speaker 231 on the torso
206. The speaker 231 can be used to emit animal sounds, sing songs, and the like.
[0016] Note that, in the present embodiment, the acceleration sensor 212, the gyrosensor
213, the microphone 214, and the speaker 231 are provided on the torso 206, but a
configuration is possible in which all or a portion of these components are provided
on the head 204. Note that a configuration is possible in which, in addition to the
acceleration sensor 212, the gyrosensor 213, the microphone 214, and the speaker 231
provided on the torso 206, all or a portion of these components are also provided
on the head 204. The touch sensor 211 is provided on each of the head 204 and the
torso 206, but a configuration is possible in which the touch sensor 211 is provided
on only one of the head 204 and the torso 206. Moreover, a configuration is possible
in which a plurality of any of these components is provided.
[0017] Next, the functional configuration of the robot 200 is described. As illustrated
in FIG. 4, the robot 200 includes an action control device 100, a sensor 210, a driver
220, a sound outputter 230, and an operation inputter 240. Moreover, the action control
device 100 includes a controller 110, a storage 120, and a communicator 130. In FIG.
4, the action control device 100, and the sensor 210, the driver 220, the sound outputter
230, and the operation inputter 240 are connected to each other via a bus line BL,
but this is merely an example. A configuration is possible in which the action control
device 100, and the sensor 210, the driver 220, the sound outputter 230, and the operation
inputter 240 are connected by a wired interface such as a universal serial bus (USB)
cable or the like, or by a wireless interface such as Bluetooth (registered trademark)
or the like. Additionally, a configuration is possible in which the controller 110,
and the storage 120 and the communicator 130 are connected via the bus line BL.
[0018] The action control device 100 controls, by the controller 110 and the storage 120,
actions of the robot 200. Note that the robot 200 is a device that is controlled by
the action control device 100 and, as such, is also called a "control target device."
[0019] In one example, the controller 110 is configured from a central processing unit (CPU)
or the like, and executes various processings described later using programs stored
in the storage 120. Note that the controller 110 is compatible with multithreading
functionality, in which a plurality of processings are executed in parallel. As such,
the controller 110 can execute the various processings described below in parallel.
Additionally, the controller 110 is provided with a clock function and a timer function,
and can measure the date and time, and the like.
[0020] The storage 120 is configured from read-only memory (ROM), flash memory, random access
memory (RAM), or the like. Programs to be executed by the CPU of the controller 110,
and data needed in advance to execute these programs are stored in the ROM. The flash
memory is writable non-volatile memory, and stores data that is desired to be retained
even after the power is turned OFF. Data that is created or modified during the execution
of the programs is stored in the RAM. In one example, the storage 120 stores a voice
history, emotion data 121, emotion change data 122, a growth table 123, an action
mode setting table 126, a sound buffer 127, and the like, all described hereinafter.
[0021] The communicator 130 includes a communication module compatible with a wireless local
area network (LAN), Bluetooth (registered trademark), or the like, and carries out
data communication with a smartphone or similar external device.
[0022] The sensor 210 includes the touch sensor 211, the acceleration sensor 212, the gyrosensor
213, and the microphone 214 described above. The controller 110 acquires, as external
stimulus data, detection values detected by the various sensors of the sensor 210.
The external stimulus data expresses an external stimulus acting on the robot 200.
Note that a configuration is possible in which the sensor 210 includes sensors other
than the touch sensor 211, the acceleration sensor 212, the gyrosensor 213, and the
microphone 214. The types of external stimuli acquirable by the controller 110 can
be increased by increasing the types of sensors of the sensor 210.
[0023] The touch sensor 211 detects contacting by some sort of object. The touch sensor
211 is configured from a pressure sensor or a capacitance sensor, for example. The
controller 110 acquires a contact strength and/or a contact time on the basis of the
detection values from the touch sensor 211 and, on the basis of these values, can
detect an external stimulus such as that the robot 200 is being pet or being struck
by the user, and the like (for example, see Unexamined
Japanese Patent Application Publication No. 2019-217122). Note that a configuration is possible in which the controller 110 detects these
external stimuli by a sensor other than the touch sensor 211 (for example, see
Japanese Patent No. 6575637).
[0024] The acceleration sensor 212 detects acceleration in three axial directions, namely
the front-back direction (X-axis direction), the width (left-right) direction (Y-axis
direction), and the vertical direction (Z-axis direction) of the torso 206 of the robot
200. The acceleration sensor 212 detects gravitational acceleration when the robot
200 is stopped and, as such, the controller 110 can detect a current attitude of the
robot 200 on the basis of the gravitational acceleration detected by the acceleration
sensor 212. Additionally, when, for example, the user picks up or throws the robot
200, the acceleration sensor 212 detects, in addition to the gravitational acceleration,
acceleration caused by the movement of the robot 200. Accordingly, the controller
110 can detect the movement of the robot 200 by removing the gravitational acceleration
component from the detection value detected by the acceleration sensor 212.
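The following is a minimal sketch, in Python, of this gravity-removal approach; the low-pass filter, its coefficient, the threshold, and all identifiers are illustrative assumptions and are not part of the embodiment.

    import math

    GRAVITY_ALPHA = 0.9        # low-pass filter coefficient (illustrative)
    MOTION_THRESHOLD = 2.0     # m/s^2, illustrative threshold for "being moved"
    gravity = [0.0, 0.0, 9.8]  # running estimate of the gravitational component

    def detect_movement(accel_xyz):
        """Return True when the residual acceleration (raw reading minus the
        gravity estimate) suggests the robot is being picked up, shaken, or thrown."""
        global gravity
        # Track the slowly changing gravity vector with a low-pass filter.
        gravity = [GRAVITY_ALPHA * g + (1.0 - GRAVITY_ALPHA) * a
                   for g, a in zip(gravity, accel_xyz)]
        # The remainder is the acceleration caused by the movement of the robot 200.
        linear = [a - g for a, g in zip(accel_xyz, gravity)]
        return math.sqrt(sum(v * v for v in linear)) > MOTION_THRESHOLD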
[0025] The gyrosensor 213 detects angular velocity of the three axes of the robot 200. The
controller 110 can determine a rotation state of the robot 200 on the basis of the
angular velocities of the three axes. Additionally, the controller 110 can determine
a vibration state of the robot 200 on the basis of the maximum values of the angular
velocities of the three axes.
[0026] The microphone 214 detects ambient sound of the robot 200. The controller 110 can,
for example, detect, on the basis of a component of the sound detected by the microphone
214, that the user is speaking to the robot 200, that the user is clapping their hands,
and the like.
[0027] Specifically, the controller 110 samples, at a prescribed sampling frequency (16,384
Hz in the present embodiment) and number of quantization bits (16 bits in the present
embodiment), sound data acquired from the microphone 214, and stores the sampled sound
data in the sound buffer 127 of the storage 120. In the present embodiment, the sound
buffer 127 includes 16 consecutive buffers (storage regions) that each contain 512
samples of sampling data. Specifically, as illustrated in FIG. 11, voice similarity
is determined with the 16 consecutive buffers (storage regions) 1270 to 1285 as one
unit. In the present embodiment, the 16 consecutive buffers are expressed as array
variables. For example, buffer 1270 is expressed as buf[0] and buffer 1285 is expressed
as buf[15]. The 16 buffers together store 512 samples × 16 buffers / 16,384 Hz = 0.5 seconds
of sound data.
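The buffer arrangement and its timing can be expressed by the following Python sketch; the name buf is taken from the description above, and the other identifiers are illustrative.

    SAMPLE_RATE = 16_384        # Hz, sampling frequency of the present embodiment
    SAMPLES_PER_BUFFER = 512    # samples per buffer (16-bit quantization)
    NUM_BUFFERS = 16            # buffers 1270 to 1285, i.e. buf[0] to buf[15]

    # One unit used for the voice similarity determination.
    buf = [[0] * SAMPLES_PER_BUFFER for _ in range(NUM_BUFFERS)]

    # 512 samples x 16 buffers / 16,384 Hz = 0.5 seconds of sound data per unit.
    seconds_per_unit = SAMPLES_PER_BUFFER * NUM_BUFFERS / SAMPLE_RATE
    assert seconds_per_unit == 0.5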
[0028] Note that processing in which the controller 110 stores the sound data acquired from
the microphone 214 in the sound buffer 127 is executed in parallel with other processings
as a sound buffer storage thread. Additionally, in the present embodiment, in voice
characteristic parameter calculation processing, described later, the controller 110
performs, for the 16 buffers 1270 to 1285, processing for calculating three pieces
of Cepstrum information from the 512 samples of sampling data in one buffer. The controller
110 treats the 48 (= 3×16) pieces of data obtained thereby as a 48-dimension voice
characteristic parameter.
[0029] A history storage number (for example, 256) of this voice characteristic parameter
is stored in the storage 120 on a first-in first-out (FIFO) basis. In the present
embodiment, the FIFO storing the voice characteristic parameter is called the "VFIFO",
and the number of voice characteristic parameters stored in the VFIFO is stored in
a variable called "VFIFO_Size." A history of the voice characteristic parameter is
stored in the VFIFO and, as such, the VFIFO is also called "voice history."
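A minimal sketch of the 48-dimension voice characteristic parameter and the voice history is given below; the actual Cepstrum calculation is described in Japanese Patent Application No. 2021-158663, so the real-cepstrum computation and the choice of coefficients shown here are only illustrative assumptions.

    from collections import deque
    import numpy as np

    HISTORY_SIZE = 256          # history storage number of the present embodiment
    CEPSTRA_PER_BUFFER = 3      # pieces of Cepstrum information per buffer

    VFIFO = deque(maxlen=HISTORY_SIZE)   # voice history, first-in first-out

    def voice_characteristic_parameter(buffers):
        # Build a 48-dimension parameter (3 cepstrum values x 16 buffers).
        features = []
        for samples in buffers:                          # 16 buffers of 512 samples
            spectrum = np.abs(np.fft.rfft(samples)) + 1e-10
            cepstrum = np.fft.irfft(np.log(spectrum))    # illustrative real cepstrum
            features.extend(cepstrum[1:1 + CEPSTRA_PER_BUFFER])
        return np.array(features)                        # shape (48,)

    def store_in_voice_history(param):
        VFIFO.append(param)          # the oldest entry is discarded beyond 256
        return len(VFIFO)            # corresponds to the variable VFIFO_Size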
[0030] Returning to FIG. 4, the driver 220 includes the twist motor 221 and the vertical
motor 222, and is driven by the controller 110. The controller 110 controls the driver
220 and, as a result, the robot 200 can express actions such as, for example, lifting
the head 204 up (rotating upward around the second rotational axis), twisting the
head 204 sideways (twisting/rotating to the right or to the left around the first
rotational axis), and the like. Action control data for performing these actions are
stored in the storage 120, and the actions of the robot 200 are controlled on the
basis of the detected external stimulus, a growth value described later, and the like.
[0031] The sound outputter 230 includes the speaker 231, and sound is output from the speaker
231 as a result of sound data being input into the sound outputter 230 by the controller
110. For example, the robot 200 emits a pseudo-animal sound as a result of the controller
110 inputting animal sound data of the robot 200 into the sound outputter 230. This
animal sound data is also stored in the storage 120, and an animal sound is selected
on the basis of the detected external stimulus, a growth value described later, and
the like.
[0032] In one example, the operation inputter 240 is configured from an operation button,
a volume knob, or the like. The operation inputter 240 is an interface for receiving
operations performed by the user (owner or borrower) such as, for example, turning
the power ON/OFF, adjusting the volume of the output sound, and the like. Note that
a configuration is possible in which, in order to further enhance a sense of lifelikeness,
the robot 200 includes only a power switch as the operation inputter 240 on the inside
of the exterior 201, and does not include other operation buttons, the volume knob,
and the like. In such a case as well, operations such as adjusting the volume of the
robot 200 can be performed using an external smartphone or the like connected via
the communicator 130.
[0033] The functional configuration of the robot 200 is described above. Next, action modes
of the robot 200 set by the controller 110 of the action control device 100 are described.
In the present embodiment, the robot 200 has, as the action modes, two action modes,
namely a normal action mode and a familiar action mode. Typically, the robot 200 operates
in the normal action mode but, when a person that has high intimacy with the robot
200 (a person intimate with the robot 200, for example, the owner, a person who always
cares for the robot 200, or the like) speaks to the robot 200, the robot 200 transitions
from the normal action mode to the familiar action mode and operates in the familiar
action mode for a certain amount of time. Note that the familiar action mode is an
action mode that is transitioned to when the person intimate with the robot 200 is
near and, as such, is also called an "intimate action mode."
[0034] The normal action mode is an action mode in which an action prepared in advance is
performed on the basis of an externally-received stimulus (sound, touch, or the like),
an emotion at that time, or the like, regardless of the intimacy between the robot
200 and the user near the robot 200. For example, in the normal action mode, the robot
200 performs a surprised action when the robot 200 hears a loud sound, and performs
a happy action when petted.
[0035] The familiar action mode is an action mode that is transitioned to from the normal
action mode when a determination is made, on the basis of a likelihood (certainty)
between the robot 200 and the user near the robot, that the user near the robot 200
is a person with high intimacy to the robot 200. The familiar action mode is set for
only a certain amount of time. In the familiar action mode, the robot 200 performs
an action of playing (playing around) with the user near the robot 200 in accordance
with the intimacy.
[0036] Specifically, for an action mode setting based on voice, a recognition level is determined
in accordance with the action mode setting table 126 illustrated in FIG. 5 and on
the basis of the similarity between the voice characteristic parameter of the acquired
voice and the voice history, and one action mode, namely the normal action mode or
the familiar action mode (three minutes, four minutes, five minutes), is set. Specifically,
when the similarity between the voice characteristic parameter of the acquired voice
and the voice history is low (lower than a predetermined threshold), the controller
110 determines, in accordance with the action mode setting table 126, that the intimacy
between the robot 200 and the person speaking to the robot 200 is low (that is, that
the person is not a "person that always cares for the robot 200"). and sets the action
mode to the normal action mode.
[0037] When the similarity between the voice characteristic parameter of the acquired voice
and the voice history is high (higher than the predetermined threshold), the controller
110 determines, in accordance with the action mode setting table 126, that the intimacy
between the robot 200 and the person speaking to the robot 200 is high (that is, that
the person is "a person that always cares for the robot 200"). Moreover, the controller
110 recognizes, on the basis of the likelihood (the certainty of "definitely" or "probably"
or "maybe") corresponding to the level of the similarity, that person as a "person
that always cares for the robot 200", and sets the action mode to the familiar action
mode for a familiar amount of time corresponding to the likelihood. For example, when
the similarity is very high, a first familiar amount of time (for example, five minutes)
is set as the familiar amount of time, when the similarity is high, a second familiar
amount of time (for example, four minutes) is set as the familiar amount of time,
and when the similarity is medium, a third familiar amount of time (for example, three
minutes) is set as the familiar amount of time.
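The correspondence described above can be summarized by the following sketch; the mode constants and function name are illustrative, and the durations are the example values (three, four, and five minutes) given in the text.

    NORMAL_ACTION_MODE = "normal"
    FAMILIAR_ACTION_MODE = "familiar"

    def set_action_mode(similarity_level):
        # similarity_level: 0 = not similar, 1 = medium, 2 = high, 3 = very high.
        if similarity_level == 3:                  # "definitely" a person that always cares
            return FAMILIAR_ACTION_MODE, 5 * 60    # first familiar amount of time
        if similarity_level == 2:                  # "probably"
            return FAMILIAR_ACTION_MODE, 4 * 60    # second familiar amount of time
        if similarity_level == 1:                  # "maybe"
            return FAMILIAR_ACTION_MODE, 3 * 60    # third familiar amount of time
        return NORMAL_ACTION_MODE, 0               # low similarity: normal action mode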
[0038] In the present embodiment, the setting of the action mode is performed on the basis
of voice similarity, but the setting of the action mode is not limited to being performed
on the basis of voice similarity. For example, a configuration is possible in which
the action mode is set to the familiar action mode when the manner of petting is similar
to the past history. Additionally, a configuration is possible in which both the voice
and the manner of petting are used to define each of a familiar action mode for when
the similarities of both are high, a familiar action mode for when only the similarity
of the voice history is high, and a familiar action mode for when the similarity of
a touch history is high (for example, see
Japanese Patent Application No. 2021-158663 for a method for determining whether the method of petting is similar to the past
history).
[0039] Additionally, a configuration is possible in which, instead of absolutely setting
the robot 200 to the familiar action mode when the similarity to the history is high,
the controller 110 sets the robot 200 to the familiar action mode on the basis of
a certain probability (for example, a probability corresponding to an amount of growth
(growth value, described later) of the robot 200). Moreover, a configuration is possible
in which, when the robot 200 is not set to the familiar action mode even though the
similarity to the history is high, a familiar action (an action performed when it is recognized
that the person near the robot 200 is the owner or a person that always cares for
the robot 200) described in, for example,
Japanese Patent Application No. 2021-158663, is set on the basis of the certain probability.
[0040] Next, of the data stored in the storage 120, the emotion data 121, the emotion change
data 122, the growth table 123, the action content table 124, and the growth days
count data 125, which are pieces of data required to determine general actions determined
on the basis of the growth value and the like, are described in order. The herein
described general actions are performed in the normal action mode of the present embodiment.
[0041] The emotion data 121 is data for imparting pseudo-emotions to the robot 200, and
is data (X, Y) that represents coordinates on an emotion map 300. As illustrated in
FIG. 6, the emotion map 300 is expressed by a two-dimensional coordinate system with
a degree of relaxation (degree of worry) axis as an X axis 311, and a degree of excitement
(degree of disinterest) axis as a Y axis 312. An origin 310 (0, 0) on the emotion
map 300 represents an emotion when normal. Moreover, as the value of the X coordinate
(X value) is positive and the absolute value thereof increases, emotions for which
the degree of relaxation is high are expressed and, as the value of the Y coordinate
(Y value) is positive and the absolute value thereof increases, emotions for which
the degree of excitement is high are expressed. Additionally, as the X value is negative
and the absolute value thereof increases, emotions for which the degree of worry is
high are expressed and, as the Y value is negative and the absolute value thereof
increases, emotions for which the degree of disinterest is high are expressed. Note
that, in FIG. 6, the emotion map 300 is expressed as a two-dimensional coordinate
system, but the number of dimensions of the emotion map 300 may be set as desired.
[0042] In the present embodiment, regarding the size of the emotion map 300 as the initial
value, as illustrated by frame 301 of FIG. 6, a maximum value of both the X value
and the Y value is 100 and a minimum value is -100. Moreover, during a first period,
each time the pseudo growth days count of the robot 200 increases one day, the maximum
value and the minimum value of the emotion map 300 both increase by two. Here, the
first period is a period in which the robot 200 grows in a pseudo manner, and is,
for example, a period of 50 days from a pseudo birth of the robot 200. Note that the
pseudo birth of the robot 200 is the time of the first start up by the user of the
robot 200 after shipping from the factory. When the growth days count is 25 days,
as illustrated by frame 302 of FIG. 6, the maximum value of the X value and the Y
value is 150 and the minimum value is -150. Moreover, when the first period (in this
example, 50 days) elapses, the pseudo growth of the robot 200 ends and, as illustrated
in frame 303 of FIG. 6, the maximum value of the X value and the Y value is 200, the
minimum value is -200, and the size of the emotion map 300 is fixed.
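The growth of the emotion map range can be sketched as follows; the function name is illustrative, and whether the pseudo birth counts as day zero or day one is not restated here, so the day handling is an assumption that reproduces the frames 301 to 303 described above.

    FIRST_PERIOD_DAYS = 50   # pseudo growth period of the robot 200
    INITIAL_LIMIT = 100      # frame 301: -100 to 100
    DAILY_EXPANSION = 2      # both limits grow by 2 per growth day

    def emotion_map_range(growth_days):
        # After the first period the size of the emotion map 300 is fixed.
        days = min(growth_days, FIRST_PERIOD_DAYS)
        limit = INITIAL_LIMIT + DAILY_EXPANSION * days
        return -limit, limit

    # Growth day 25 -> (-150, 150), growth day 50 and later -> (-200, 200).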
[0043] The emotion change data 122 is data that sets an amount of change that each of an
X value and a Y value of the emotion data 121 is increased or decreased. In the present
embodiment, as emotion change data 122 corresponding to the X value of the emotion data
121, DXP that increases the X value and DXM that decreases the X value are provided
and, as emotion change data 122 corresponding to the Y value of the emotion data 121,
DYP that increases the Y value and DYM that decreases the Y value are provided. Specifically,
the emotion change data 122 includes the following four variables, and is data expressing
degrees to which the pseudo emotions of the robot 200 are changed.
DXP: Tendency to relax (tendency to change in the positive value direction of the
X value on the emotion map)
DXM: Tendency to worry (tendency to change in the negative value direction of the
X value on the emotion map)
DYP: Tendency to be excited (tendency to change in the positive value direction of
the Y value on the emotion map)
DYM: Tendency to be disinterested (tendency to change in the negative value direction
of the Y value on the emotion map)
[0044] In the present embodiment, an example is described in which the initial value of
each of these variables is set to 10, and the value increases to a maximum of 20 by
processing for learning emotion change data 122 in action control processing, described
later. Due to this learning processing, the emotion change data 122, that is, the
degree of change of emotion changes and, as such, the robot 200 assumes various personalities
in accordance with the manner in which the user interacts with the robot 200. That
is, the personality of each individual robot 200 is formed differently on the basis
of the manner in which the user interacts with the robot 200.
[0045] In the present embodiment, each piece of personality data (personality value) is
derived by subtracting 10 from each piece of emotion change data 122. Specifically,
a value obtained by subtracting 10 from DXP that expresses a tendency to be relaxed
is set as a personality value (chipper), a value obtained by subtracting 10 from DXM
that expresses a tendency to be worried is set as a personality value (shy), a value
obtained by subtracting 10 from DYP that expresses a tendency to be excited is set
as a personality value (active), and a value obtained by subtracting 10 from DYM that
expresses a tendency to be disinterested is set as a personality value (spoiled).
[0046] The initial value of each personality value is 0 and, as the robot 200 grows, each
personality value changes, with an upper limit of 10, due to external stimuli and
the like (manner in which the user interacts with the robot 200) detected by the sensor
210. In a case in which, as in the present embodiment, four personality values change
from 0 to 10, it is possible to express 14,641 types of personalities (11 to the 4th power).
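A sketch of the derivation of the personality values and the growth value follows; the dictionary keys and function names are illustrative.

    def personality_values(emotion_change):
        # emotion_change holds DXP, DXM, DYP, DYM (each between 10 and 20).
        return {
            "chipper": emotion_change["DXP"] - 10,   # tendency to relax
            "shy":     emotion_change["DXM"] - 10,   # tendency to worry
            "active":  emotion_change["DYP"] - 10,   # tendency to be excited
            "spoiled": emotion_change["DYM"] - 10,   # tendency to be disinterested
        }

    def growth_value(emotion_change):
        # The pseudo growth level is the greatest of the four personality values.
        return max(personality_values(emotion_change).values())

    # Each personality value ranges over 0 to 10, so 11**4 = 14,641 personalities
    # can be expressed; chipper 3, active 8, shy 5, spoiled 4 gives growth value 8.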
[0047] In the present embodiment, the greatest value among these four personality values
is used as growth level data (the growth value) that expresses a pseudo growth level
of the robot 200. Moreover, the controller 110 controls so that variation is introduced
into the action content of the robot 200 in accordance with the pseudo growth of the
robot 200 (as the growth value increases). As such, the data used by the controller
110 is the growth table 123.
[0048] As illustrated in FIG. 7, types of actions to be performed by the robot 200 in response
to an action trigger such as the external stimulus detected by the sensor 210 or the
like, and a probability of each action being selected in accordance with the growth
value (hereinafter referred to as "action selection probability") are stored in the
growth table 123. Note that the action trigger is information about the external stimulus
or the like that triggers the performance of some sort of action by the robot 200.
[0049] For example, a case is assumed in which, as a current personality value of the robot
200, the personality value (chipper) is 3, the personality value (active) is 8, the
personality value (shy) is 5, and the personality value (spoiled) is 4, and a loud
sound is detected by the microphone 214. In this case, the growth value is 8, which
is the maximum value of the four personality values, and the action trigger is "heard
a loud sound." Moreover, in the growth table 123 illustrated in FIG. 7, when referencing
the entry for when the action trigger is "heard a loud sound" and the growth value
is 8, it is clear that the action selection probability of "basic action 2-0" is 20%,
the action selection probability of "basic action 2-1" is 20%, the action selection
probability of "basic action 2-2" is 40%, and the action selection probability of
"personality action 2-0" is 20%.
[0050] That is, in this case, the "basic action 2-0" is selected at a probability of 20%,
the "basic action 2-1" is selected at a probability of 20%, the "basic action 2-2"
is selected at a probability of 40%, and the "personality action 2-0" is selected
at a probability of 20%. Moreover, when the "personality action 2-0" is selected,
selection according to the four personality values of one of four types of personality
actions such as those illustrated in FIG. 8 is further performed. Then, the robot
200 executes the selected action.
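The selection according to the action selection probabilities can be sketched as follows; the table layout and function name are illustrative, and only the example row for "heard a loud sound" with growth value 8 is reproduced.

    import random

    def select_action(growth_table, action_trigger, growth_value):
        # Pick one action type with the probabilities defined for this trigger
        # and growth value (values are percentages summing to 100).
        row = growth_table[action_trigger][growth_value]
        return random.choices(list(row.keys()), weights=list(row.values()), k=1)[0]

    growth_table_example = {
        "heard a loud sound": {
            8: {"basic action 2-0": 20, "basic action 2-1": 20,
                "basic action 2-2": 40, "personality action 2-0": 20},
        },
    }
    selected = select_action(growth_table_example, "heard a loud sound", 8)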
[0051] Note that, in the growth table 123 (FIG. 7) of the present embodiment, one personality
action is selected for each action trigger but, as with the basic actions, a configuration
is possible in which the types of selected personality actions are increased in accordance
with an increase in the personality values. Additionally, in the present embodiment,
only the growth table 123 (FIG. 7) for defining actions when in the normal action
mode is defined, but a configuration is possible in which a growth table for defining
actions when in the familiar action mode is separately defined. Moreover, a configuration
is possible in which the content of FIG. 5 is also incorporated into the content of
the growth table 123 (FIG. 7) and a growth table is set that defines, in the action
type, not only actions for when in the normal action mode but also actions for when
in the familiar action mode.
[0052] Provided that the growth table 123 can, for each action trigger, define a function
(growth function) that returns, with the growth value as an argument, the action selection
probability of each action type, any form may be used for the growth table 123, and
the growth table 123 need not necessarily be in the form of tabular data such as illustrated
in FIG. 7.
[0053] As illustrated in FIG. 8, the action content table 124 is a table in which specific
action content of the various action types defined in the growth table 123 is stored
and, for the personality actions, action content is defined for every type of personality.
Note that the action content table 124 is not essential data. For example, the action
content table 124 is unnecessary in a case in which the growth table 123 is constructed
such that specific action content is directly recorded in the action type field of
the growth table 123.
[0054] The growth days count data 125 has an initial value of 1, and 1 is added for each
passing day. The growth days count data 125 represents a pseudo growth days count
(number of days from a pseudo birth) of the robot 200. In the present embodiment,
a period of the growth days count expressed by the growth days count data 125 is called
a "second period."
[0055] Next, the action control processing executed by the controller 110 of the action
control device 100 is described while referencing the flowchart illustrated in FIG.
9. The action control processing is processing in which the controller 110 controls
the actions (motion, animal sound, or the like) of the robot 200 on the basis of detection
values from the sensor 210 or the like. When the user turns ON the power of the robot
200, execution of a thread of this action control processing is started in parallel
with other required processings. As a result of the action control processing, the
driver 220 and the sound outputter 230 are controlled, the motion of the robot 200
is expressed, sounds such as animal sounds and the like are output, and the like.
[0056] Firstly, the controller 110 initialization-processes the various types of data such
as the emotion data 121, the emotion change data 122, the growth days count data 125,
and the like (step S101). The various variables used in the present embodiment (BigSound_Flag,
TalkSound_Flag, Talkdefinitely_Flag, Talkprobably_Flag, Talkmaybe_Flag, Touch_Flag,
and the like) are also initialized to OFF or 0 in step S101. Additionally, the controller
110 sets the action mode to the normal action mode in step S101.
[0057] Next, the controller 110 executes microphone input processing for acquiring the external
stimulus (voice) of the subject (the user) from the microphone 214 (step S102). Next,
the controller 110 executes action mode setting processing for setting the action
mode (step S103). Details of the action mode setting processing are described later
but, mainly, the action mode setting processing is processing for setting, on the
basis of the similarity between the external stimulus acquired in step S102 and the
past history, the action mode to the normal action mode or the familiar action mode
presented in the action mode setting table 126 illustrated in FIG. 5.
[0058] Next, the controller 110 executes touch input processing for acquiring the external
stimulus from the touch sensor 211 and/or the acceleration sensor 212 (step S104).
In the touch input processing, when touched or when there is a change in acceleration
or angular velocity, the controller 110 sets the Touch_Flag to ON, calculates a touch
characteristic parameter, and determines, on the basis of the similarity between the
calculated touch characteristic parameter and a touch history, which is the history
of past touch characteristic parameters, the intimacy with the subject (the user)
applying the external stimulus (see
Japanese Patent Application No. 2021-158663 for details about the touch input processing).
[0059] Note that, in the present embodiment, to facilitate comprehension, the microphone
input processing and the touch input processing are described as separate processings,
but a configuration is possible in which processing for acquiring the external stimulus
from the various types of sensors of the sensor 210 and determining the intimacy with
the subject (the user) applying the external stimulus is executed as a single processing
(external input processing). Additionally, in the present embodiment, the action mode
setting processing is executed in step S103, but a configuration is possible in which
the action mode is set in consideration of an external input other than voice by executing
the action mode setting processing after the touch input processing or after the external
input processing.
[0060] Next, the controller 110 determines whether the external stimulus is acquired by
the sensor 210 (step S105). For example, when a sound-based external stimulus is detected,
as a result of the microphone input processing described above, the BigSound_Flag
(flag that turns ON when a loud sound is detected) or the TalkSound_Flag (flag that
turns ON when the voice of a person is detected) is set to ON and, as such, the controller
110 can determine, on the basis of the values of these flag variables, whether the
external stimulus is acquired in step S105.
[0061] When a determination is made that the external stimulus is acquired (step S105; Yes),
the controller 110 acquires, in accordance with the external stimulus acquired in
the microphone input processing and the touch input processing, the emotion change
data 122 to be added to or subtracted from the emotion data 121 (step S106). When,
for example, petting of the head 204 is detected as the external stimulus, the robot
200 obtains a pseudo sense of relaxation and, as such, the controller 110 acquires
DXP as the emotion change data 122 to be added to the X value of the emotion data
121.
[0062] Next, the controller 110 sets the emotion data 121 in accordance with the emotion
change data 122 acquired in step S106 (step S107). When, for example, DXP is acquired
as the emotion change data 122 in step S106, the controller 110 adds the DXP of the
emotion change data 122 to the X value of the emotion data 121. However, in a case
in which a value (X value, Y value) of the emotion data 121 exceeds the maximum value
of the emotion map 300 when adding the emotion change data 122, that value of the
emotion data 121 is set to the maximum value of the emotion map 300. In addition,
in a case in which a value of the emotion data 121 is less than the minimum value
of the emotion map 300 when subtracting the emotion change data 122, that value of
the emotion data 121 is set to the minimum value of the emotion map 300.
[0063] In steps S106 and S107, any settings are possible for the type of emotion
change data 122 acquired and the emotion data 121 set for each individual external
stimulus. Examples are described below (see also the sketch following the examples).
The head 204 is petted (relax): X = X+DXP
The head 204 is struck (worry): X = X-DXM
(these external stimuli can be detected by the touch sensor 211H of the head 204)
The torso 206 is petted (excite): Y = Y+DYP
The torso 206 is struck (disinterest): Y = Y-DYM
(these external stimuli can be detected by the touch sensor 211 of the torso 206)
Held with head upward (happy): X = X+DXP and Y = Y+DYP
Suspended with head downward (sad): X = X-DXM and Y = Y-DYM
(these external stimuli can be detected by the touch sensor 211 and the acceleration
sensor 212)
Spoken to in kind voice (peaceful): X = X+DXP and Y = Y-DYM
Yelled at in loud voice (upset): X = X-DXM and Y = Y+DYP
(these external stimuli can be detected by the microphone 214)
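A sketch of the setting of the emotion data 121 in step S107, including the clamping to the emotion map 300, follows; the stimulus labels and function names are illustrative.

    def clamp(value, lower, upper):
        return max(lower, min(upper, value))

    def apply_stimulus(x, y, stimulus, dxp, dxm, dyp, dym, map_min, map_max):
        # Update the emotion data (X, Y) for one external stimulus (step S107).
        if stimulus == "head petted":            # relax
            x += dxp
        elif stimulus == "head struck":          # worry
            x -= dxm
        elif stimulus == "torso petted":         # excite
            y += dyp
        elif stimulus == "torso struck":         # disinterest
            y -= dym
        elif stimulus == "held head up":         # happy
            x, y = x + dxp, y + dyp
        elif stimulus == "suspended head down":  # sad
            x, y = x - dxm, y - dym
        elif stimulus == "kind voice":           # peaceful
            x, y = x + dxp, y - dym
        elif stimulus == "loud yelling":         # upset
            x, y = x - dxm, y + dyp
        # Values exceeding the emotion map 300 are set to its maximum or minimum.
        return clamp(x, map_min, map_max), clamp(y, map_min, map_max)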
[0064] Next, the controller 110 determines whether the current action mode is the normal
action mode or the familiar action mode (step S108). When a determination is made
that the current action mode is the normal action mode (step S108; Normal action mode),
the controller 110 executes normal action mode processing, described later (step S112),
and executes step S115.
[0065] When a determination is made that the current action mode is the familiar action
mode (step S108; Familiar action mode), the controller 110 executes familiar action
mode processing described later (step S109). Then, the controller 110 determines whether
a familiar action amount of time (predetermined amount of time set from the start
of the familiar action mode) set in the action mode setting processing of step S103
has elapsed (step S110). When the familiar action amount of time has not elapsed (step
S110; No), the controller 110 executes step S115.
[0066] When a determination is made that the familiar action amount of time has elapsed
(step S110; Yes), the controller 110 sets the action mode to the normal action mode
(step S111), and executes step S115.
[0067] Meanwhile, when a determination is made in step S105 that the external stimulus is
not acquired (step S105; No), the controller 110 determines whether to perform a spontaneous
action such as a breathing action that creates the impression that the robot 200 is
breathing, or the like, by periodically driving the twist motor 221 and the vertical
motor 222 at a certain rhythm (step S113). Any method may be used as the method for
determining whether to perform the spontaneous action and, in the present embodiment,
it is assumed that the determination of step S113 is "Yes" and the breathing action
is performed every breathing cycle (for example, two seconds).
[0068] When a determination is made to perform the spontaneous action (step S113; Yes),
the controller 110 executes the spontaneous action (for example, the breathing action)
(step S114), and executes step S115.
[0069] When a determination is made to not perform the spontaneous action (step S113; No),
the controller 110 uses a built-in clock function to determine whether a date has
changed (step S115). When a determination is made that the date has not changed (step
S115; No), the controller 110 executes step S102.
[0070] Meanwhile, when a determination is made that the date has changed (step S115; Yes),
the controller 110 determines whether it is in a first period (step S116). When the
first period is, for example, a period 50 days from the pseudo birth (for example,
the first startup by the user after purchase) of the robot 200, the controller 110
determines that it is in the first period when the growth days count data 125 is 50
or less. When a determination is made that it is not in the first period (step S116;
No), the controller 110 executes step S118.
[0071] When a determination is made that it is in the first period (step S116; Yes), the
controller 110 executes learning processing of the emotion change data 122, and expands
the emotion map (step S117). The learning processing of the emotion change data 122
is, specifically, processing for updating the emotion change data 122 by adding 1
to the DXP of the emotion change data 122 when the X value of the emotion data 121
is set to the maximum value of the emotion map 300 even once in step S107 of that
day, adding 1 to the DYP of the emotion change data 122 when the Y value of the emotion
data 121 is set to the maximum value of the emotion map 300 even once in step S107
of that day, adding 1 to the DXM of the emotion change data 122 when the X value of
the emotion data 121 is set to the minimum value of the emotion map 300 even once
in step S107 of that day, and adding 1 to the DYM of the emotion change data 122 when
the Y value of the emotion data 121 is set to the minimum value of the emotion map
300 even once in step S107 of that day.
[0072] However, when the various values of the emotion change data 122 become exceedingly
large, the amount of change of one time of the emotion data 121 becomes exceedingly
large and, as such, the maximum value of the various values of the emotion change
data 122 is set to 20, for example, and the various values are limited to that maximum
value or less. Here, 1 is added to each piece of the emotion change data 122, but
the value to be added is not limited to 1. For example, a configuration is possible
in which a number of times at which the various values of the emotion data 121 are
set to the maximum value or the minimum value of the emotion map 300 is counted and,
when that number of times is great, the numerical value to be added to the emotion
change data 122 is increased.
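The learning processing of step S117 can be sketched as follows; the flag arguments and function name are illustrative, and the cap of 20 is the example maximum value given above.

    EMOTION_CHANGE_MAX = 20   # example upper limit of each emotion change datum

    def learn_emotion_change(emotion_change, hit_max_x, hit_max_y, hit_min_x, hit_min_y):
        # Add 1 to the matching emotion change datum whenever the emotion data 121
        # reached a limit of the emotion map 300 at least once that day (step S107).
        if hit_max_x:
            emotion_change["DXP"] = min(emotion_change["DXP"] + 1, EMOTION_CHANGE_MAX)
        if hit_max_y:
            emotion_change["DYP"] = min(emotion_change["DYP"] + 1, EMOTION_CHANGE_MAX)
        if hit_min_x:
            emotion_change["DXM"] = min(emotion_change["DXM"] + 1, EMOTION_CHANGE_MAX)
        if hit_min_y:
            emotion_change["DYM"] = min(emotion_change["DYM"] + 1, EMOTION_CHANGE_MAX)
        return emotion_change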
[0073] Expanding the emotion map 300 in step S117 of FIG. 9 is, specifically, processing
in which the controller 110 expands both the maximum value and the minimum value of
emotion map 300 by 2. However, the numerical value "2" to be expanded is merely an
example, and the emotion map 300 may be expanded by 3 or greater, or be expanded by
1. Additionally, the numerical values that the emotion map 300 is expanded by for
the maximum value and the minimum value need not be the same.
[0074] Then, the controller 110 adds 1 to the growth days count data 125, initializes both
the X value and the Y value of the emotion data 121 to 0 (step S118), and executes
step S102.
[0075] Next, the microphone input processing executed in step S102 of the action control
processing is described while referencing FIGS. 10 and 11.
[0076] Firstly, the controller 110 substitutes, for a variable ML, a maximum level of the
sampling data of the voice that is acquired by the microphone input processing and
stored in the sound buffer 127 (step S201). Next, the controller 110 determines whether
the value of the variable ML is greater than a BigSoundTh (step S202). Note that the
BigSoundTh is a loud sound threshold, and the robot 200 performs a surprised
action in response to sounds louder than the BigSoundTh. When a determination is made
that the variable ML is greater than the BigSoundTh (step S202; Yes), the controller
110 sets a variable BigSound_Flag, indicating that a loud sound has been input, to
ON (step S203), ends the microphone input processing, and executes step S103 of the
action control processing.
[0077] Meanwhile, when a determination is made that the variable ML is not greater than
the BigSoundTh (step S202; No), the controller 110 determines whether the value of
the variable ML is greater than a TalkSoundTh (step S204). Note that the TalkSoundTh
is a talking voice threshold, and the robot 200 does not treat, as a talking voice,
sounds that are at or below the TalkSoundTh. When a determination is made that
the variable ML is not greater than the TalkSoundTh (step S204; No), the controller
110 ends the microphone input processing, and executes step S103 of the action control
processing.
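A sketch of the level classification of steps S201 to S204 follows; the function name is illustrative, and treating the level ML as the maximum absolute sample value is an assumption.

    def classify_sound_level(sound_samples, big_sound_th, talk_sound_th):
        ml = max(abs(s) for s in sound_samples)   # maximum level ML of the sampling data
        if ml > big_sound_th:
            return "BigSound"        # BigSound_Flag is set to ON (step S203)
        if ml > talk_sound_th:
            return "TalkCandidate"   # proceed to buffering and the noise check
        return "None"                # too quiet to be heard as a talking voice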
[0078] Meanwhile, when a determination is made that the variable ML is greater than the
TalkSoundTh (step S204; Yes), the controller 110 determines whether a number of buffers
storing the sound data in the sound buffer 127 is less than a reference number (here,
the 16 buffers 1270 to 1285) (step S205). When a determination is made that the number
of buffers is less than the reference number (step S205; Yes), the controller 110
executes step S205 again, and continues until the reference number of buffers is stored.
[0079] Meanwhile, when a determination is made that the number of buffers storing the sound
data has reached the reference number (step S205; No), the controller 110 determines
whether the sound stored in the reference number of buffers is noise (step S206).
As an example of a method for determining whether the sound is noise, when the sound
stored in the buffers is a talking voice that is not noise, a sound of a level greater
than TalkSoundTh occurs for a certain amount of time (for example, 0.1 seconds or
longer). Meanwhile, when the sound stored in the buffers is noise, there is a high
possibility that the sound is a single, momentary sound. The controller 110 uses such
sound characteristics to determine whether the sound stored in each buffer is noise.
[0080] Firstly, for a predetermined number of buffers (in the present embodiment, three
sound buffers, namely, buffer 1270, buffer 1271, and buffer 1272) from the beginning
(the buffer 1270) among the reference number of buffers, the controller 110 investigates
the number of buffers in which, of the sampling data stored in each buffer, sampling
data having a maximum value greater than the TalkSoundTh is stored. When a determination
is made that there is even one buffer in which sampling data having a maximum value
less than or equal to the TalkSoundTh is stored, the sampling data of the reference
number of buffers stored this time is determined to be noise. Meanwhile, when a determination
is made that the maximum level of the sampling data stored in all of the buffers is
greater than the TalkSoundTh, the sampling data is determined to be not noise.
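The noise check of step S206 can be sketched as follows; the function name is illustrative, and taking the maximum absolute sample value as the level of a buffer is an assumption.

    def is_noise(buffers, talk_sound_th, check_count=3):
        # Investigate the first three buffers of the unit (buf[0] to buf[2]).
        for samples in buffers[:check_count]:
            if max(abs(s) for s in samples) <= talk_sound_th:
                return True    # even one quiet early buffer -> a momentary sound, noise
        return False           # all early buffers exceed TalkSoundTh -> not noise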
[0081] When a determination is made that the sound stored in the reference number of buffers
is noise (step S206; Yes), the controller 110 disregards the sampling data stored
in the current reference number of buffers (that is, determines that there are no
sound external stimuli that constitute an action trigger), ends the microphone input
processing, and executes step S103 of the action control processing.
[0082] Meanwhile, when a determination is made that the sound stored in the reference number
of buffers is not noise (step S206; No), the controller 110 determines that the sampling
data is a talking voice, substitutes ON for the variable TalkSound_Flag that indicates
that a talking voice is inputted (step S207), and performs voice characteristic parameter
calculation processing (step S208). The voice characteristic parameter calculation
processing is processing for calculating the voice characteristic parameter by calculating
a Cepstrum from the sampling data stored in the sound buffer 127 (for details, see
Japanese Patent Application No. 2021-158663).
[0083] Next, the controller 110 performs similarity with voice history determination processing
(step S209). The similarity with voice history determination processing is processing
for calculating a similarity by comparing the voice characteristic parameter calculated
by the voice characteristic parameter calculation processing and the voice history,
and outputting from return = 0 to return = 3 in accordance with the similarity (0
= not similar, 1 = medium similarity, 2 = high similarity, and 3 = very high similarity).
[0084] Then, the controller 110 determines an output result of the similarity with voice
history determination processing (step S210). When the determination result of the
similarity with voice history determination processing is return = 3 (step S210; Yes),
the controller 110 substitutes ON for a variable Talkdefinitely_Flag indicating that
the robot 200 recognizes that the speaker definitely is a person that always cares for
the robot 200 (step S211), and executes step S212.
[0085] When a determination is made that the determination result of the similarity with
voice history determination processing is not return = 3 (step S210; No), the controller
110 determines whether the determination result of the similarity with voice history
determination processing is return = 2 (that is, high similarity) (step S213). When
a determination is made that the determination result is return = 2 (step S213; Yes),
the controller 110 substitutes ON for a variable Talkprobably_Flag indicating that
the robot 200 recognizes that the speaker probably is a person that always cares for
the robot 200 (step S214), and executes step S212.
[0086] When a determination is made that the determination result of the similarity with
voice history determination processing is not return = 2 (step S213; No), the controller
110 determines whether the determination result of the similarity with voice history
determination processing is return = 1 (that is, medium similarity) (step S215). When
a determination is made that the determination result is return = 1 (step S215; Yes),
the controller 110 substitutes ON for a variable Talkmaybe_Flag indicating that the
robot 200 recognizes that the speaker maybe is a person that always cares for the robot
200 (step S216), and executes step S212.
[0087] When a determination is made that the determination result of the similarity with
voice history determination processing is not return = 1 (step S215; No), the controller
110 substitutes ON for a variable Talkgeneralaction_Flag indicating that a general
action is to be performed (step S217), and executes step S212.
[0088] Next, the controller 110 stores, in the voice history (VFIFO), the voice characteristic
parameter calculated in step S208 (step S212). Then, the controller
110 ends the microphone input processing, and executes step S103 of the action control
processing.
[0089] Next, the similarity with voice history determination processing executed in step
S209 of the microphone input processing is described while referencing FIG. 12.
[0090] Firstly, the controller 110 determines whether the stored number of the voice history
(the value of the variable VFIFO_Size) is greater than a minimum voice reference
number (in the present embodiment, 32) (step S251). When a determination is made that
the stored number is less than or equal to the minimum voice reference number (step
S251; No), the controller 110 outputs "return = 0" (expressing not similar), ends
the similarity with voice history determination processing, and executes step S210
of the microphone input processing.
[0091] When a determination is made that the stored number is greater than the minimum voice
reference number (step S251; Yes), the controller 110 initializes a variable abssimCnt
for counting the number of voice histories for which the similarity is very high,
a variable simCnt for counting the number of voice histories for which the similarity
is high, a variable maysimCnt for counting the number of voice histories for which
the similarity is medium, and a variable i for stipulating the various elements (VFIFO[0]
to VFIFO[VFIFO_Size-1]), as array variables, of the voice history VFIFO to 0 (step
S252).
[0092] Next, the controller 110 calculates a distance (L2 norm) between the voice characteristic
parameter calculated in step S208 and the VFIFO[i], and substitutes this distance
for a variable d[i] (step S253). Next, the controller 110 determines whether the value
of the variable d[i] is less than a VAbsSimTh (voice extremely highly similar threshold)
(step S254). Note that a value less than the VSimTh (voice similar threshold), described
later, is set in advance as the VAbsSimTh (voice extremely highly similar threshold).
When a determination is made that the variable d[i] is less than the VAbsSimTh (step
S254; Yes), the controller 110 adds 1 to the variable abssimCnt (step S255), and executes
step S256. When a determination is made that the variable d[i] is greater than or
equal to the VAbsSimTh (step S254; No), the controller 110 executes step S256.
[0093] Next, the controller 110 determines whether the value of the variable d[i] is less
than the VSimTh (set in advance as the voice similar threshold) (step S256). When
a determination is made that the variable d[i] is less than the VSimTh (step S256;
Yes), the controller 110 adds 1 to the variable simCnt (step S257), and executes step
S258. When a determination is made that the variable d[i] is greater than or equal
to the VSimTh (step S256; No), the controller 110 executes step S258.
[0094] Next, in step S258, the controller 110 determines whether the value of the variable
d[i] is less than a VMaySimTh (voice medium similar threshold). Note that a value
greater than the VSimTh (voice similar threshold) is set in advance as the VMaySimTh
(voice medium similar threshold). When a determination is made that the variable d[i]
is less than the VMaySimTh (step S258; Yes), the controller 110 adds 1 to the variable
maysimCnt (step S259), and executes step S260. When a determination is made that the
variable d[i] is greater than or equal to the VMaySimTh (step S258; No), the controller
110 executes step S260.
[0095] In step S260, the controller 110 adds 1 to the variable i. Next, the controller 110
determines whether the value of the variable i is less than the variable VFIFO_Size
(step S261). When a determination is made that the variable i is less than the variable
VFIFO_Size (step S261; Yes), the controller 110 executes step S253.
[0096] When a determination is made that the variable i is greater than or equal to the
variable VFIFO_Size (step S261; No), the controller 110 determines whether a ratio
of the variable abssimCnt to the variable VFIFO_Size exceeds 20% (step S262). When
a determination is made that the ratio of the variable abssimCnt to the variable
VFIFO_Size exceeds 20% (step S262; Yes), the similarity between the voice characteristic
parameter calculated in step S208 and the voice history is very high and, as such,
the controller 110 outputs "return = 3", ends the similarity with voice history determination
processing, and executes step S210 of the microphone input processing.
[0097] Meanwhile, when a determination is made that the ratio of the variable abssimCnt
to the variable VFIFO_Size is less than or equal to 20% (step S262; No), the controller
110 determines whether a ratio of the variable simCnt to the variable VFIFO_Size exceeds
20% (step S263). When a determination is made that the ratio of the variable simCnt
to the variable VFIFO_Size exceeds 20% (step S263; Yes), the similarity between the
voice characteristic parameter calculated in step S208 and the voice history is high
and, as such, the controller 110 outputs "return = 2", ends the similarity with voice
history determination processing, and executes step S210 of the microphone input processing.
[0098] When a determination is made that the ratio of the variable simCnt to the variable
VFIFO_Size is less than or equal to 20% (step S263; No), the controller 110 determines
whether a ratio of the variable maysimCnt to the variable VFIFO_Size exceeds 30% (step
S264). When a determination is made that the ratio of the variable maysimCnt to the
variable VFIFO_Size exceeds 30% (step S264; Yes), the similarity between the voice
characteristic parameter calculated in step S208 and the voice history is medium and,
as such, the controller 110 outputs "return = 1", ends the similarity with voice history
determination processing, and executes step S210 of the microphone input processing.
[0099] Meanwhile, when a determination is made that the ratio of the variable maysimCnt
to the variable VFIFO_Size is less than or equal to 30% (step S264; No), the voice
characteristic parameter calculated in step S208 and the voice history are not similar
and, as such, the controller 110 outputs "return = 0", ends the similarity with voice
history determination processing, and executes step S210 of the microphone input processing.
Note that the values of "20%" and "30%" compared against in the determinations described above
are merely examples, and can be changed as needed together with the VAbsSimTh, the
VSimTh, and the VMaySimTh.
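The similarity with voice history determination processing of FIG. 12 described above may be summarized by the following Python sketch. This is a minimal illustration only: the threshold default values and the function name are assumptions introduced for the sketch, since the embodiment fixes only the ordering VAbsSimTh < VSimTh < VMaySimTh, the minimum voice reference number of 32, and the 20%/30% ratios.

    import math

    MIN_VOICE_REF = 32  # minimum voice reference number in the present embodiment

    def similarity_with_voice_history(voice_param, vfifo,
                                      v_abs_sim_th=0.5, v_sim_th=1.0, v_may_sim_th=2.0):
        """Return 3 / 2 / 1 / 0 for very high / high / medium / no similarity."""
        size = len(vfifo)
        if size <= MIN_VOICE_REF:                   # step S251
            return 0
        abssim_cnt = sim_cnt = maysim_cnt = 0       # step S252
        for entry in vfifo:                         # steps S253 to S261
            # distance (L2 norm) between the new parameter and VFIFO[i] (step S253)
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(voice_param, entry)))
            if d < v_abs_sim_th:                    # steps S254, S255
                abssim_cnt += 1
            if d < v_sim_th:                        # steps S256, S257
                sim_cnt += 1
            if d < v_may_sim_th:                    # steps S258, S259
                maysim_cnt += 1
        if abssim_cnt / size > 0.20:                # step S262
            return 3
        if sim_cnt / size > 0.20:                   # step S263
            return 2
        if maysim_cnt / size > 0.30:                # step S264
            return 1
        return 0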
[0100] Next, the action mode setting processing that is executed in step S103 of the action
control processing (FIG. 9) is described while referencing FIG. 13.
[0101] Firstly, the controller 110 determines whether the subject (the user or the like)
applying the external stimulus definitely is a person that always cares for the robot
200 (that is, whether the Talkdefinitely_Flag is ON) (step S131). When a determination
is made that the subject definitely is a person that always cares for the robot 200
(step S131; Yes), the controller 110 sets the action mode to the familiar action mode,
sets the familiar action amount of time to a first familiar action amount of time
(for example, five minutes) (step S132), ends the action mode setting processing,
and executes step S104 of the action control processing.
[0102] When a determination is made that the subject (the user or the like) applying the
external stimulus is not definitely a person that always cares for the robot 200 (that
is, when the Talkdefinitely_Flag is not ON) (step S131; No), the controller 110 determines
whether the subject (the user or the like) applying the external stimulus probably
is a person that always cares for the robot 200 (that is, whether the Talkprobably_Flag
is ON) (step S133). When a determination is made that the subject probably is a person
that always cares for the robot 200 (step S133; Yes), the controller 110 sets the
action mode to the familiar action mode, sets the familiar action amount of time to
a second familiar action amount of time (for example, four minutes) (step S134), ends
the action mode setting processing, and executes step S104 of the action control processing.
[0103] When a determination is made that the subject (the user or the like) applying the
external stimulus is not probably a person that always cares for the robot 200 (that
is, when the Talkprobably_Flag is not ON) (step S133; No), the controller 110 determines
whether the subject (the user or the like) applying the external stimulus "maybe is
a person that always cares for the robot 200" (that is, whether the Talkmaybe_Flag
is ON) (step S135). When a determination is made that the subject "maybe is a person
that always cares for the robot 200" (step S135; Yes), the controller 110 sets the
action mode to the familiar action mode, sets the familiar action amount of time to
a third familiar action amount of time (for example, three minutes) (step S136), ends
the action mode setting processing, and executes step S104 of the action control
processing.
[0104] When a determination is made that the subject (the user or the like) applying the
external stimulus is not "maybe a person that always cares for the robot 200" (that
is, when the Talkmaybe_Flag is not ON) (step S135; No), the controller 110 sets the
action mode to the normal action mode (step S137), ends the action mode setting processing,
and executes step S104 of the action control processing.
[0105] As a result of the action mode setting processing described above, the controller
110 sets the action mode on the basis of the likelihood obtained from the level of
similarity between the subject (the user or the like) applying the external stimulus
to the robot 200 and the past voice history. In cases in which the action mode is
set to the familiar action mode, the action mode is returned to the normal action
mode when the predetermined familiar action amount of time elapses from the start
of the familiar action mode. When setting the familiar action mode again during the
period in which the familiar action mode is set, the familiar action amount of time
is re-set (updated) on the basis of a degree of confidence in the intimacy. Accordingly,
a user that always cares for the robot 200 can extend the amount of time that the
action mode is set to the familiar action mode by occasionally speaking to the robot
200 when in the familiar action mode.
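The action mode setting processing of FIG. 13 described above may be summarized by the following Python sketch, assuming a flags mapping such as the one in the earlier sketch; the function name is an assumption, and the familiar action amounts of time are the five-, four-, and three-minute examples given in the text.

    FIRST_FAMILIAR_TIME_SEC = 5 * 60   # first familiar action amount of time
    SECOND_FAMILIAR_TIME_SEC = 4 * 60  # second familiar action amount of time
    THIRD_FAMILIAR_TIME_SEC = 3 * 60   # third familiar action amount of time

    def set_action_mode(flags):
        """Return (action_mode, familiar_action_time_in_seconds)."""
        if flags.get("Talkdefinitely_Flag"):   # step S131 -> step S132
            return "familiar", FIRST_FAMILIAR_TIME_SEC
        if flags.get("Talkprobably_Flag"):     # step S133 -> step S134
            return "familiar", SECOND_FAMILIAR_TIME_SEC
        if flags.get("Talkmaybe_Flag"):        # step S135 -> step S136
            return "familiar", THIRD_FAMILIAR_TIME_SEC
        return "normal", 0                     # step S137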
[0106] Next, the normal action mode processing that is executed in step S112 of the action
control processing (FIG. 9) is described while referencing FIG. 14.
[0107] Firstly, the controller 110 determines whether there is an external stimulus such
as a touch or the like in the touch input processing (step S151). Specifically, it
is sufficient that the controller 110 determines whether the Touch_Flag is ON. When
there is a touch or the like (step S151; Yes), the controller 110 performs a touch
general action (step S152). The touch general action is a general action performed
when the user pets the body of the robot 200, holds the robot 200, or the like, and
specifically is an action set in the action type field of the growth table 123, with
"the body is petted or held" as the action trigger (in FIG. 7, the basic action 0-0
and the like). Next, the controller 110 substitutes OFF for the variable Touch_Flag
(step S153), ends the normal action mode processing, and executes step S115 of the
action control processing (FIG. 9).
[0108] Meanwhile, when there is not an external stimulus such as a touch or the like in
the touch input processing (step S151; No), the controller 110 determines whether
there is a sound as the external stimulus in the microphone input processing (step
S154). Specifically, it is sufficient that the controller 110 determines whether the
TalkSound_Flag is ON. If there is a sound (step S154; Yes), the controller 110 performs
a "Talk general action" (step S155). The "Talk general action" is a general action
performed when the user speaks to the robot 200, and specifically is an action set
in the action type field of the growth table 123, with "the robot 200 is spoken to"
as the action trigger (in FIG. 7, the basic action 1-0 and the like). Next, the controller
110 substitutes OFF for the variable TalkSound_Flag (step S156), ends the normal action
mode processing, and executes step S115 of the action control processing (FIG. 9).
[0109] Meanwhile, when there is not a sound as the external stimulus in the microphone input
processing (step S154; No), the controller 110 determines whether there is a loud
sound as the external stimulus in the microphone input processing (step S157). Specifically,
it is sufficient that the controller 110 determines whether the BigSound_Flag is ON.
When there is a loud sound (step S157; Yes), the controller 110 executes an action
of reacting to the loud sound (step S158). That is, the controller 110 executes an
action (the basic action 2-0 or the like) corresponding to "heard a loud sound" as
the action trigger of the growth table 123 illustrated in FIG. 7. Then, the controller
110 substitutes OFF for the variable BigSound_Flag (step S159), ends the normal action
mode processing, and executes step S115 of the action control processing (FIG. 9).
[0110] Meanwhile, when there is not a loud sound as the external stimulus (step S157; No),
the controller 110 executes an action corresponding to another external stimulus (when
an action trigger corresponding to the external stimulus acquired in the microphone
input processing and/or the touch input processing exists in the growth table 123,
an action corresponding to that action trigger) (step S160), ends the normal action
mode processing, and executes step S115 of the action control processing (FIG. 9).
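The flag-priority dispatch of the normal action mode processing of FIG. 14 described above may be summarized by the following Python sketch; the handler arguments stand in for the actions looked up in the growth table 123 and are assumptions introduced for the sketch. The familiar action mode processing of FIG. 15, described next, follows the same structure with the familiar action handlers in place of the general ones.

    def normal_action_mode(flags, do_touch_general, do_talk_general,
                           do_big_sound_action, do_other_action):
        if flags.get("Touch_Flag"):           # step S151 -> steps S152, S153
            do_touch_general()
            flags["Touch_Flag"] = False
        elif flags.get("TalkSound_Flag"):     # step S154 -> steps S155, S156
            do_talk_general()
            flags["TalkSound_Flag"] = False
        elif flags.get("BigSound_Flag"):      # step S157 -> steps S158, S159
            do_big_sound_action()
            flags["BigSound_Flag"] = False
        else:                                 # step S160: other external stimulus
            do_other_action()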
[0111] Next, the familiar action mode processing that is executed in step S109 of the action
control processing (FIG. 9) is described while referencing FIG. 15.
[0112] Firstly, the controller 110 determines whether there is an external stimulus such
as a touch or the like in the touch input processing (step S171). Specifically, it
is sufficient that the controller 110 determines whether the Touch_Flag is ON. When
there is a touch or the like (step S171; Yes), the controller 110 executes touch response
familiar action processing (step S172). The touch response familiar action processing
is described later. Next, the controller 110 substitutes OFF for the variable Touch_Flag
(step S173), ends the familiar action mode processing, and executes step S110 of the
action control processing (FIG. 9).
[0113] Meanwhile, when there is not an external stimulus such as a touch or the like in
the touch input processing (step S171; No), the controller 110 determines whether
there is a sound as the external stimulus in the microphone input processing (step
S174). Specifically, it is sufficient that the controller 110 determines whether the
TalkSound_Flag is ON. When there is a sound (step S174; Yes), the controller 110 executes
sound response familiar action processing (step S175). The sound response familiar
action processing is described later. Next, the controller 110 substitutes OFF for
the variable TalkSound_Flag (step S176), ends the familiar action mode processing,
and executes step S110 of the action control processing (FIG. 9).
[0114] Meanwhile, when there is not a sound as the external stimulus in the microphone input
processing (step S174; No), the controller 110 determines whether there is a loud
sound as the external stimulus in the microphone input processing (step S177). Specifically,
it is sufficient that the controller 110 determines whether the BigSound_Flag is ON.
When there is a loud sound (step S177; Yes), the controller 110 executes loud sound
response familiar action processing (step S178). The loud sound response familiar
action processing is described later. Then, the controller 110 substitutes OFF for
the variable BigSound_Flag (step S179), ends the familiar action mode processing,
and executes step S110 of the action control processing (FIG. 9).
[0115] Meanwhile, when there is not a loud sound as the external stimulus (step S177; No),
the controller 110 executes an action corresponding to another external stimulus (when
an action trigger corresponding to the external stimulus acquired in the microphone
input processing and/or the touch input processing exists in the growth table 123,
an action corresponding to that action trigger) (step S180), ends the familiar action
mode processing, and executes step S110 of the action control processing (FIG. 9).
[0116] Next, the touch response familiar action processing that is executed in step S172
of the familiar action mode processing (FIG. 15) is described while referencing FIG.
16.
[0117] Firstly, the controller 110 determines, by the touch sensor 211H, whether the head
204 is being held down (step S301). This can be determined on the basis of whether
the pressure acquired by the touch sensor 211H is greater than or equal to a predetermined
threshold. When the head 204 is not being held down (step S301; No), the controller
110 ends the touch response familiar action processing, and executes step S173 of
the familiar action mode processing (FIG. 15).
[0118] When the head 204 is being held down (step S301; Yes), the controller 110 performs
an action of raising the torso 206 (step S302). Specifically, the controller 110 raises
the head 204 using the vertical motor 222; since the user is holding the head 204
down, raising the head 204 causes the torso 206 to rise.
Note that when the force of the user holding the head 204 down is weak, it is thought
that the head 204 will rise without the torso 206 rising. As such, the predetermined
threshold used in the determination of step S301 is set to a value at which
the torso 206 will rise in step S302.
[0119] Next, the controller 110 determines whether the head 204 is still being held down
(step S303). When the head 204 is still being held down (step S303; Yes), the controller
110 executes step S302, and repeats the action of raising the torso 206.
[0120] When the head 204 is not being held down (step S303; No), the controller 110 returns
the robot 200 to the original state (typically, a cyclical breathing action) (step
S304), ends the touch response familiar action processing, and executes step S173
of the familiar action mode processing (FIG. 15).
[0121] As a result of this touch response familiar action processing, when the user holds
down the head 204 of the robot 200, the robot 200 raises the torso 206 in response
thereto and, as such, the user can be given the impression of playing with the robot
200.
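The touch response familiar action processing of FIG. 16 described above may be summarized by the following Python sketch; the sensor and motor interfaces, the polling interval, and the function names are assumptions introduced for the sketch.

    import time

    def touch_response_familiar_action(head_touch_sensor, vertical_motor,
                                       pressure_threshold, return_to_breathing):
        if head_touch_sensor.pressure() < pressure_threshold:      # step S301; No
            return                                                  # end the processing
        while head_touch_sensor.pressure() >= pressure_threshold:   # steps S302, S303
            vertical_motor.raise_head()   # raising the head 204 lifts the torso 206
            time.sleep(0.1)               # assumed polling interval
        return_to_breathing()             # step S304: return to the cyclical breathing action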
[0122] Next, the sound response familiar action processing that is executed in step S175
of the familiar action mode processing (FIG. 15) is described while referencing FIG.
17.
[0123] Firstly, the controller 110 determines, by the touch sensor 211 of the torso 206,
whether the torso 206 is being touched (step S311). When the torso 206 is not being
touched (step S311; No), the controller 110 ends the sound response familiar action
processing, and executes step S176 of the familiar action mode processing (FIG. 15).
[0124] When the torso 206 is being touched (step S311; Yes), the controller 110 performs
a trembling action (step S312). Specifically, the controller 110 causes the robot
200 to tremble by moving the head 204 left and right in small increments (for details
of this processing, see Unexamined
Japanese Patent Application Publication No. 2022-142113).
[0125] Next, the controller 110 determines whether the torso 206 is still being touched
(step S313). When the torso 206 is still being touched (step S313; Yes), the controller
110 executes step S312 and repeats the trembling action.
[0126] When the torso 206 is not being touched (step S313; No), the controller 110 performs
an action of raising the head 204 and looking around (step S314). Specifically, the
controller 110 uses the vertical motor 222 to raise the head 204, and uses the twist
motor 221 to rotate the head 204 to the left and right.
[0127] Then, the controller 110 returns the robot 200 to the original state (typically,
a cyclical breathing action) (step S315), ends the sound response familiar action
processing, and executes step S176 of the familiar action mode processing (FIG. 15).
[0128] As a result of this sound response familiar action processing, when the user holds
down the torso 206 of the robot 200, the robot 200 trembles and, as such, the robot
200 can give the impression of being frightened due to the body being held down. Moreover,
when the hand of the user is removed from the robot 200, the robot 200 raises the
head 204 and looks around as if to say "the danger has passed." As a result, the user
can feel that the robot 200 is more adorable.
[0129] Next, the loud sound response familiar action processing that is executed in step
S178 of the familiar action mode processing (FIG. 15) is described while referencing
FIG. 18.
[0130] Firstly, the controller 110 generates a random number, namely an integer from 0 to
2 (step S321). Next, the controller 110 determines whether the generated random number
is 0 (step S322). When the generated random number is 0 (step S322; Yes), the controller
110 performs an action of tilting the robot 200 to the left (step S323). Specifically,
the controller 110 uses the vertical motor 222 to lower the head 204, and uses the
twist motor 221 to rotate the head 204 to the right. As a result, the body of the
robot 200 tilts diagonally to the left. Then, the controller 110 ends the loud sound
response familiar action processing, and executes step S179 of the familiar action
mode processing (FIG. 15).
[0131] When the generated random number is not 0 (step S322; No), the controller 110 determines
whether the generated random number is 1 (step S324). When the generated random number
is 1 (step S324; Yes), the controller 110 performs an action of tilting the robot
200 to the right (step S325). Specifically, the controller 110 uses the vertical motor
222 to lower the head 204, and uses the twist motor 221 to rotate the head 204 to
the left. As a result, the body of the robot 200 tilts diagonally to the right. Then,
the controller 110 ends the loud sound response familiar action processing, and executes
step S179 of the familiar action mode processing (FIG. 15).
[0132] When the generated random number is not 1 (step S324; No), the controller 110 determines
whether the generated random number is 2 (step S326). When the generated random number
is 2 (step S326; Yes), the controller 110 causes the robot 200 to perform a swing
action (step S327). Specifically, the controller 110 repeatedly performs the action
of tilting to the left and the action of tilting to the right to give the impression
that the robot 200 is swinging. Then, the controller 110 ends the loud sound
response familiar action processing, and executes step S179 of the familiar action
mode processing (FIG. 15).
[0133] Note that a configuration is possible in which, in step S321 described above, the
controller 110 generates numbers in a regular order of, for example, 0, 1, 2, 0, 1,
2, and so on, instead of generating a random number.
[0134] As a result of this loud sound response familiar action processing, when the user
makes a loud sound, the body of the robot 200 moves so as to tilt and swing and, as
such, the user can be given the impression of playing with the robot 200 by making
sounds.
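The loud sound response familiar action processing of FIG. 18 described above may be summarized by the following Python sketch; the motor-level handlers and the function name are assumptions introduced for the sketch, and random.randint can be replaced with a regular 0, 1, 2, 0, 1, 2, ... sequence as noted in the text.

    import random

    def loud_sound_response_familiar_action(tilt_left, tilt_right, swing):
        r = random.randint(0, 2)      # step S321: integer from 0 to 2
        if r == 0:                    # step S322 -> step S323
            tilt_left()               # lower the head, twist right: body tilts left
        elif r == 1:                  # step S324 -> step S325
            tilt_right()              # lower the head, twist left: body tilts right
        else:                         # step S326: swing action
            swing()                   # alternate tilting to the left and to the right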
[0135] Note that the various actions described above of the familiar action mode are merely
examples, and the action content may be changed on the basis of the emotion data 121
at each point in time and/or the emotion change data 122.
[0136] Additionally, in the action mode setting processing (FIG. 13) described above, the
controller 110 sets the action mode to the familiar action mode only when a determination
is made that the intimacy with the user applying the external stimulus is high. However,
the present disclosure need not be limited to this setting method. For example, a
configuration is possible in which, when a determination is made that the robot 200
is spoken to, the controller 110 sets the action mode to the familiar action mode
in accordance with a number of days from the pseudo-birth of the robot 200 (for example,
the birthday of the robot 200), regardless of the similarity with the history. Additionally,
a configuration is possible in which the controller 110 occasionally (for example,
about one time per day) sets the action mode to the familiar action mode by a random
number or the like, regardless of the similarity with the history. The familiar action
amount of time in such cases can also be set as desired, and a configuration is possible
in which the controller 110 sets the familiar action amount of time to, for example,
the comparatively short third familiar action amount of time (for example, three minutes).
[0137] As a result of the action control processing described above, the controller 110
acquires the external stimulus acting on the robot 200 (the control target device);
sets, on the basis of the likelihood (owner, person that always cares for the robot
200, person that cares little for the robot 200, and the like) corresponding to the
level of intimacy between the robot 200 and the subject (the user or the like) applying
the external stimulus, the action mode in which the action content that the robot
200 performs to the subject is defined; and controls the action of the robot 200 on
the basis of the external stimulus and the action mode. As such, in accordance with
the similarity between a characteristic quantity of the manner of speaking and an
external stimulus characteristic quantity stored in the storage 120 as the history,
the robot 200 can determine that the person applying the external stimulus is "probably
a person that always cares for the robot 200" and perform an action in the familiar
action mode, and the controller 110 can cause an action to be performed that takes
the relationship between the robot 200 and the subject into consideration.
Modified Examples
[0138] The present disclosure is not limited to the embodiment described above, and various
modifications and uses are possible. For example, a configuration is possible in which,
as when in the normal action mode, the action content when in the familiar action
mode changes in accordance with the growth value and/or the personality.
[0139] The actions of the robot 200 are not limited to the actions by the driver 220 and
the outputting of sounds from the sound outputter 230. A configuration is possible
in which, in cases in which the robot 200 includes other controlled components (for
example, an LED, a display, or the like), as the action of the robot 200, the controller
110 controls a color and/or a brightness of an LED that is turned ON. It is sufficient
that the controlled components to be controlled by the controller 110 include at least
one of the driver 220 and the sound outputter 230.
[0140] The configuration of the emotion map 300, and the setting methods of the emotion
data 121, the emotion change data 122, the personality data, the growth value, and
the like in the embodiment described above are merely examples. For example, a configuration
is possible in which a numerical value obtained by dividing the growth days count
data 125 by a certain number (capped at 10 when the result exceeds 10) is set as the growth
value.
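As a minimal illustration of this modified calculation, the following Python sketch assumes an illustrative divisor of 10; the actual divisor is not specified in the text.

    def growth_value(growth_days_count, divisor=10):
        # divide the growth days count by a certain number and cap the result at 10
        return min(10, growth_days_count // divisor)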
[0141] In the embodiment described above, the action control device 100 for controlling
the robot 200 is built into the robot 200, but the action control device 100 for controlling
the robot 200 need not necessarily be built into the robot 200. For example, a configuration
is possible in which the action control device 100 is configured as a device separate
from the robot 200, and the robot 200 includes a controller 250 and a communicator
260 separate from the controller 110 and the communicator 130 of the action control
device 100. In such a case, the communicator 260 and the communicator 130 are configured
so as to send and receive data to and from each other, and the controller 110 acquires,
via the communicator 130 and the communicator 260, the external stimulus detected by
the sensor 210, and controls the driver 220, the sound outputter 230, and the like.
[0142] In the embodiments described above, a description is given in which the action programs
executed by the CPU of the controller 110 are stored in advance in the ROM or the
like of the storage 120. However, the present disclosure is not limited thereto, and
a configuration is possible in which the action programs for executing the various
processings described above are installed on an existing general-purpose computer
or the like, thereby causing that computer to function as a device corresponding to
the action control device 100 according to the embodiments described above.
[0143] Any method can be used to provide such programs. For example, the programs may be
stored and distributed on a non-transitory computer-readable recording medium (flexible
disc, Compact Disc (CD)-ROM, Digital Versatile Disc (DVD)-ROM, Magneto Optical (MO)
disc, memory card, USB memory, or the like), or may be provided by storing the programs
in a storage on a network such as the internet, and causing these programs to be downloaded.
[0144] Additionally, in cases in which the processings described above are realized by being
divided between an operating system (OS) and an application/program, or are realized
by cooperation between an OS and an application/program, it is possible to store only
the portion of the application/program on the non-transitory recording medium or in
the storage. Additionally, the programs can be superimposed on a carrier wave and distributed
via a network. For example, the programs may be posted to a bulletin board system
(BBS) on a network, and distributed via the network. Moreover, a configuration is
possible in which the processings described above are executed by starting these programs
and, under the control of the operating system (OS), executing the programs in the
same manner as other applications/programs.
[0145] Additionally, a configuration is possible in which the controller 110 is constituted
by a desired processor unit such as a single processor, a multiprocessor, a multi-core
processor, or the like, or by combining these desired processors with processing circuitry
such as an application specific integrated circuit (ASIC), a field-programmable gate
array (FPGA), or the like.
[0146] The foregoing describes some example embodiments for explanatory purposes. Although
the foregoing discussion has presented specific embodiments, persons skilled in the
art will recognize that changes may be made in form and detail without departing from
the broader spirit and scope of the invention. Accordingly, the specification and
drawings are to be regarded in an illustrative rather than a restrictive sense. This
detailed description, therefore, is not to be taken in a limiting sense, and the scope
of the invention is defined only by the included claims, along with the full range
of equivalents to which such claims are entitled.