FIELD
[0001] A certain aspect of embodiments described herein relates to a learning method and
program of a value calculation model and a selection probability estimation method.
BACKGROUND
[0002] It is desired to control the transportation of people to reduce the emission amount
of CO2 and alleviate traffic congestion. For example, when there is a plurality of transportation
options for a pair (OD) of an origin (O) and a destination (D), changing the fare
for each transportation option will cause people to change the transportation option
that they select. Therefore, by setting the fare for each transportation option appropriately,
it is possible to appropriately control the transportation of people.
[0003] Conventionally, when predicting at what rate each of options having attribute values
of different scales such as cost and time is selected, a numerical value (value) that
can express each option with a single scale is obtained by inputting the attribute
values into a predetermined formula (for example, a linear formula). Then, the degree
to which each option is selected (selection probability) is predicted from the relative
relationship between the obtained values of the options. Note that the art related
to the present disclosure is also disclosed in
Japanese Patent Application Laid-Open No. 2015-114988.
SUMMARY
[0004] To learn a value calculation model for calculating a value from attribute values
by machine learning or the like, data in which the attribute values of each of options
and the value of each of the options are associated with each other is necessary as
learning data. However, only the rate at which each option was selected (selection
probability) can be obtained as the observed value of each option. Even when the selection
probability of each option is obtained, the value of each option cannot be obtained
from the selection probability, and thus the selection probability alone is insufficient
as learning data.
[0005] In one aspect, an object of the present disclosure is to provide a learning method
and program of a value calculation model capable of learning a value calculation model
for calculating the values of options used when people act. Another object of the
present disclosure is to provide a selection probability estimation method capable
of accurately estimating the selection probabilities of options.
[0006] According to an aspect of the embodiments, there is provided a learning method of
a value calculation model for calculating a value of an option used when a person
acts from an attribute value of the option, implemented by a computer, the learning
method including: acquiring input data in which a selection probability indicating
a rate at which each option is selected from a plurality of options and attribute
values of the plurality of options when the selection probability is obtained are
associated with each other; and acquiring, for each combination of two options that
can be extracted from the plurality of options, a relationship between selection probabilities
of the two options included in each combination from the input data, and adjusting
the value calculation model so that a relationship between values calculated when
attribute values of the two options included in each combination are input to the
value calculation model and a relationship between the selection probabilities corresponding
to each combination are close to each other.
BRIEF DESCRIPTION OF DRAWINGS
[0007]
FIG. 1A to FIG. 1C are diagrams for describing an outline of a process executed by
an information processing apparatus in accordance with an embodiment.
FIG. 2 illustrates a hardware configuration of the information processing apparatus
in accordance with the embodiment.
FIG. 3 is a functional block diagram of the information processing apparatus of FIG.
2.
FIG. 4A illustrates an example of transportation data, and FIG. 4B illustrates an
example of selection probability data.
FIG. 5 illustrates an example of attribute value data.
FIG. 6 illustrates an example of learning data.
FIG. 7 is a diagram illustrating an input and an output of a value calculation model.
FIG. 8A illustrates an example of attribute value data (object OD), and FIG. 8B illustrates
an example of target selection probability data.
FIG. 9 is a diagram for describing an outline of learning by a model learning unit.
FIG. 10 illustrates an overview of a learning device of the model learning unit.
FIG. 11 is a flowchart illustrating an example of a value calculation model learning
process.
FIG. 12 is a flowchart illustrating a detailed process of step S16 in FIG. 11.
FIG. 13 is a flowchart illustrating a detailed process of step S20 in FIG. 11.
FIG. 14 is a flowchart illustrating an example of a billing amount determination process.
FIG. 15 is a diagram for describing an outline of learning by a model learning unit
in accordance with a variation.
FIG. 16 is a diagram illustrating an overview of a learning device of the model learning
unit in accordance with the variation.
DESCRIPTION OF EMBODIMENTS
[0008] Hereinafter, an embodiment will be described in detail with reference to FIG. 1 to
FIG. 14.
[0009] FIG. 1A to FIG. 1C are diagrams for describing an outline of a process executed by
an information processing apparatus 10 in accordance with the present embodiment.
For example, as illustrated in FIG. 1A, there is a pair (OD) of an origin (O) and
a destination (D), and there are an option (car), an option (train), and an option
(bus) as the options of transportation between the origin and the destination. Further,
as illustrated in FIG. 1B, cost and time are set as attribute values for each option.
In this case, it is assumed that 50% of people who move between OD select the option
(car), 30% of the people select the option (train), and 20% of the people select the
option (bus). In the example of FIG. 1B, since many people select the option (car),
the road becomes congested.
[0010] The information processing apparatus 10 of the present embodiment is an apparatus
that determines and outputs an appropriate toll (billing amount) when a user wants
to set road pricing (toll) for eliminating congestion on a road. For example, as illustrated
in FIG. 1C, when the user inputs an instruction to adjust the selection probabilities
of the option (car), the option (train), and the option (bus) to be the same (33%),
the information processing apparatus 10 calculates and outputs how much the cost (billing
amount) required to use the road should be to adjust the selection probabilities to
be the same.
[0011] FIG. 2 illustrates a hardware configuration of the information processing apparatus
10. As illustrated in FIG. 2, the information processing apparatus 10 includes a central
processing unit (CPU) 90, a read only memory (ROM) 92, a random access memory (RAM)
94, a storage (a solid state drive (SSD) or a hard disk drive (HDD)) 96, a network
interface 97, a display unit 93, an input unit 95, a portable storage medium drive
99, and the like. These components of the information processing apparatus 10 are
connected to a bus (data transmission path) 98. In the information processing apparatus
10, the CPU 90 executes a program (including a learning program of a value calculation
model) stored in the ROM 92 or the storage 96, or a program read from a portable storage
medium 91 by the portable storage medium drive 99 to implement the function of each
unit illustrated in FIG. 3. FIG. 3 also illustrates various storage units stored in
the storage 96 and the like. The function of each unit in FIG. 3 may be implemented
by an integrated circuit such as an application specific integrated circuit (ASIC)
or a field programmable gate array (FPGA).
[0012] As illustrated in FIG. 3, when the CPU 90 executes the program, the information processing
apparatus 10 functions as a transportation data acquisition unit 20, a selection probability
calculation unit 22, an attribute value acquisition unit 24, a learning data generation
unit 26, a model learning unit 28, a target selection probability acquisition unit
30, an optimum billing amount calculation unit 32, and an output unit 34. Hereinafter,
each unit will be described in detail.
[0013] The transportation data acquisition unit 20 acquires transportation data illustrated
in FIG. 4A. Here, the transportation data in FIG. 4A records which option (car, train,
bus) was used (selected) by people who have moved through each of three types of ODs.
Although the "selected option" is recorded in association with the "personal ID" in
FIG. 4A, the "personal ID" need not necessarily be recorded. That is, the form of the
transportation data is not limited as long as the number of times of selection of
each option can be known.
[0014] The selection probability calculation unit 22 calculates the rate at which each option
(car, train, bus) was selected for each OD from the transportation data of FIG. 4A,
and generates selection probability data illustrated in FIG. 4B. From the selection
probability data in FIG. 4B, it can be seen that 50%, 33%, and 17% of the people who
moved through OD1 selected cars, trains, and buses, respectively. The selection probability
calculation unit 22 stores the generated selection probability data (FIG. 4B) in a
selection probability storage unit 40.
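The counting performed by the selection probability calculation unit 22 can be sketched as follows; the record layout, numbers, and function name are illustrative assumptions, not part of the embodiment.

```python
from collections import Counter

def selection_probabilities(transportation_data):
    """Compute, for each OD, the rate at which each option was selected.

    `transportation_data` is a list of (od, selected_option) records,
    mirroring FIG. 4A (the personal ID is not needed for the counts).
    """
    counts = {}  # od -> Counter mapping option -> number of selections
    for od, option in transportation_data:
        counts.setdefault(od, Counter())[option] += 1
    probs = {}
    for od, counter in counts.items():
        total = sum(counter.values())
        probs[od] = {opt: n / total for opt, n in counter.items()}
    return probs

# Six travellers on OD1: 3 chose car, 2 train, 1 bus -> 50%, 33%, 17%.
records = [("OD1", "car")] * 3 + [("OD1", "train")] * 2 + [("OD1", "bus")]
print(selection_probabilities(records)["OD1"]["car"])  # 0.5
```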
[0015] The attribute value acquisition unit 24 acquires attribute values (in the present
embodiment, cost and time) of each option for each OD. Here, the cost is a fare when
using a train or a bus, a toll for a road when using a car, or the like. Time is the
time required to move between OD. The attribute value acquisition unit 24 acquires
attribute value data illustrated in FIG. 5 input by the user, for example, and stores
the acquired data in an attribute value storage unit 42. In the case of the attribute
value data of FIG. 5, for example, the attribute values of a car for OD1 are cost
= 100 yen and time = 10 minutes, the attribute values of a train are cost = 200 yen
and time = 6 minutes, and the attribute values of a bus are cost = 500 yen and time
= 3 minutes. Since the selection probabilities of the selection probability data of
FIG. 4B and the attribute values of the attribute value data of FIG. 5 exist for each
option for each OD, it can be said that they are associated with each other on a one-to-one
basis. In other words, it can be said that the selection probability data and
the attribute value data are input data in which a selection probability is associated
with the attribute values of each option when the selection probability is obtained.
[0016] The learning data generation unit 26 generates learning data using the selection
probability data (FIG. 4B) stored in the selection probability storage unit 40 and
the attribute value data (FIG. 5) stored in the attribute value storage unit 42. The
learning data is data illustrated in FIG. 6. The learning data generation unit 26
generates learning data corresponding to each combination of two options (car/bus,
car/train, bus/train) for OD1 (learning data ID = 001 to 003). The learning data generation
unit 26 also generates learning data corresponding to each combination of two options
(car/bus, car/train, bus/train) for OD2 and OD3 (learning data IDs = 004 to 006, 007
to 009). In each piece of learning data, the attribute values of two options included
in a combination and the ratio of the selection probabilities of the two options are
associated with each other. For example, in the case of the combination of the option
(car) and the option (bus) for OD1 (learning data ID = 001), the ratio of the selection
probability of the car (50%) to the selection probability of the bus (17%) is 50/17.
The learning data generation unit 26 stores the generated learning data (FIG. 6) in
a learning data storage unit 44. Although FIG. 6 also includes information of "data
source (remarks)", since this information is reference information, it may not be
included in the actual learning data.
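The pairwise generation described in this paragraph can be sketched as follows; the data layout and field names are assumptions for illustration.

```python
from itertools import combinations

def generate_learning_data(attribute_values, selection_probs):
    """Build pairwise learning records like those of FIG. 6.

    attribute_values: od -> option -> (cost, time)
    selection_probs:  od -> option -> selection probability
    Each record associates the attribute values of two options with the
    ratio of their selection probabilities.
    """
    learning_data = []
    for od, probs in selection_probs.items():
        for opt_a, opt_b in combinations(probs, 2):
            learning_data.append({
                "od": od,
                "attrs_a": attribute_values[od][opt_a],
                "attrs_b": attribute_values[od][opt_b],
                "prob_ratio": probs[opt_a] / probs[opt_b],
            })
    return learning_data

attrs = {"OD1": {"car": (100, 10), "train": (200, 6), "bus": (500, 3)}}
probs = {"OD1": {"car": 0.50, "train": 0.33, "bus": 0.17}}
data = generate_learning_data(attrs, probs)
print(len(data))  # 3 combinations for OD1: car/train, car/bus, train/bus
```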
[0017] The model learning unit 28 executes a process of learning the value calculation model
using the learning data stored in the learning data storage unit 44. FIG. 7 illustrates
inputs to the value calculation model and outputs of the value calculation model.
As illustrated in FIG. 7, the value calculation model is a model capable of calculating
and outputting a numerical value (value) expressed with a single scale by inputting
attribute values with different scales such as cost and time. In the present embodiment,
the value calculation model is a model using a neural network called Multi-Layer Perceptron
(MLP). As the MLP, a three-layer perceptron having two input layer nodes, one output
layer node, and six intermediate layer nodes can be used. The two input layer nodes
correspond to the attribute values (cost and time) of the option, respectively, and
the one output layer node corresponds to the value of the option. Details of the model
learning unit 28 will be described later. The model learning unit 28 stores the parameters
of the value calculation model obtained by the learning process in a model parameter
storage unit 46.
[0018] The target selection probability acquisition unit 30 acquires a target value of the
selection probability of each option for a certain OD (object OD) input by the user.
The data of the target value input by the user is target selection probability data
illustrated in FIG. 8B.
[0019] The optimum billing amount calculation unit 32 acquires the attribute value data
(see FIG. 8A) of each option for the object OD from the attribute value storage unit
42, and calculates the cost (toll) of the option (car) that causes the selection probability
of each option to become the target value (see FIG. 8B). For example, in FIG. 7, it
is assumed that the numerical values of the values output as results of inputting
the attribute values of each option (car, train, bus) to the value calculation model
are V1 = 25, V2 = 15, and V3 = 10. In this case, the optimum billing amount calculation
unit 32 calculates the selection probability P1 of the option (car) by the relative
evaluation equation represented by the following expression (1).

        P1 = V1 / (V1 + V2 + V3)    (1)
[0020] In the example illustrated in FIG. 7, P1 is calculated as 25/(25 + 15 + 10) = 0.5
= 50%. The selection probabilities P2 and P3 of the option (train) and the option
(bus) are also calculated as P2 = 30% and P3 = 20% by the same calculation. The optimum
billing amount calculation unit 32 calculates the cost (optimum billing amount) of
the option (car) such that the values of P1, P2, and P3 match the target values.
The optimum billing amount calculation unit 32 notifies
the output unit 34 of the calculated optimum billing amount.
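The relative evaluation of expression (1) amounts to dividing each option's value by the sum of all values, for example:

```python
def relative_evaluation(values):
    """Selection probabilities by the relative evaluation of expression (1):
    each value divided by the sum of all the options' values."""
    total = sum(values)
    return [v / total for v in values]

# The illustrative values of FIG. 7: V1 = 25, V2 = 15, V3 = 10.
print(relative_evaluation([25, 15, 10]))  # [0.5, 0.3, 0.2]
```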
[0021] The output unit 34 outputs the optimum billing amount notified from the optimum billing
amount calculation unit 32 to the display unit 93.
(Outline of Learning Performed by Model Learning Unit 28)
[0022] Here, an outline of learning of the model learning unit 28 will be described.
[0023] As illustrated in FIG. 7, the value calculation model used in the present embodiment
is a model in which the attribute values of each option are input and the respective
values of the options are output. Therefore, in order to learn the value calculation
model, data of the combination of attribute values and a value is required as learning
data. However, in the present embodiment, the numerical value of the value cannot
be obtained as the observed value, and only the rate at which each option was actually
selected (selection probability) can be obtained as the observed value. The numerical
value of the selection probability does not always match the numerical value of the
value (see FIG. 7), and since the calculation for obtaining the selection probability
from the value (relative evaluation in FIG. 7) is an irreversible operation, it is
impossible to obtain the value from the selection probability. Therefore, when the
attribute values and the selection probabilities are simply used as the learning data,
the parameters of the value calculation model cannot be machine-learned.
[0024] As a result of intensive studies, the inventor has focused on the fact that a relationship
(ratio) between values can be obtained from the selection probabilities, which are
observed values. For example, as illustrated in FIG. 9, the ratio of the selection
probability of the option (car) to the selection probability of the option (bus) is
50/20 (times), but the ratio of the value of the option (car) to the value of the
option (bus) is also 50/20 (times). Similarly, the ratio of the selection probability
of the option (train) to the selection probability of the option (bus) is 30/20, and
the ratio of the value of the option (train) to the value of the option (bus) is also
30/20. Based on the above-described findings, the inventor performed machine learning
on a value calculation model such that a relationship (ratio) between values output
from the value calculation model approaches a relationship (ratio) between selection
probabilities.
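This cancellation can be checked numerically with the illustrative numbers of FIG. 7 and FIG. 9; the snippet below is a demonstration, not part of the embodiment.

```python
values = [25.0, 15.0, 10.0]          # V1, V2, V3 as in FIG. 7 (car, train, bus)
total = sum(values)
probs = [v / total for v in values]  # relative evaluation -> 50%, 30%, 20%

# The normalizing sum cancels in any ratio, so the observable probability
# ratio equals the unobservable value ratio (here 50/20 for car vs. bus).
assert abs(probs[0] / probs[2] - values[0] / values[2]) < 1e-12
assert abs(probs[1] / probs[2] - values[1] / values[2]) < 1e-12
```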
[0025] FIG. 10 illustrates an overview of a learning device of the model learning unit 28.
As illustrated in FIG. 10, in the model learning unit 28, the attribute values of
the two options included in the learning data (FIG. 6) are input to the value calculation
model. Then, the model learning unit 28 calculates the relationship (ratio V1/V2)
between the values V1 and V2 output from the value calculation model. In addition,
the model learning unit 28 obtains a relationship (ratio P1/P2) between the selection
probabilities P1 and P2, which are observed values. Then, the model learning unit
28 obtains a difference (residual (V1/V2) - (P1/P2)) between the relationship between
the values (ratio V1/V2) and the relationship between the selection probabilities
(ratio P1/P2). The model learning unit 28 obtains residuals using all pieces of learning data,
and updates the parameters of the value calculation model so that the sum of all the
residuals is equal to or less than a threshold value. In this manner, the model learning
unit 28 can learn the value calculation model.
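The residual computation of FIG. 10 can be sketched as below. The embodiment leaves the parameter-update rule open, and summing absolute residuals is an assumption made here so that positive and negative residuals do not cancel; the stand-in linear model and record are illustrative only.

```python
def total_residual(value_model, learning_data):
    """Sum over all learning records of the absolute residual
    |V1/V2 - P1/P2| (FIG. 10); training would update the model's
    parameters until this sum falls to the threshold value or below."""
    total = 0.0
    for rec in learning_data:
        v1 = value_model(*rec["attrs_a"])  # value of the first option
        v2 = value_model(*rec["attrs_b"])  # value of the second option
        total += abs(v1 / v2 - rec["prob_ratio"])
    return total

# Toy check with a stand-in linear "model" and one learning record
# (OD1, car vs. bus, probability ratio 50/17 as in FIG. 6).
linear = lambda cost, time: 0.01 * cost + 1.0 * time
record = {"attrs_a": (100, 10), "attrs_b": (500, 3), "prob_ratio": 50 / 17}
loss = total_residual(linear, [record])
```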
(Process Executed by Information Processing Apparatus 10)
[0026] Next, a process executed by the information processing apparatus 10 will be described
in detail. The information processing apparatus 10 executes the "learning preparation
and learning process" illustrated in FIG. 11 (and FIG. 12 and FIG. 13) and the "billing
amount determination process" illustrated in FIG. 14 using the value calculation model.
(Learning Preparation and Learning Process)
[0027] FIG. 11 is a flowchart illustrating the learning preparation and learning process
of the value calculation model. The process illustrated in FIG. 11 is executed, for
example, at predetermined time intervals or every time a predetermined amount of transportation
data is stored.
[0028] When the process of FIG. 11 is started, first, in step S10, the transportation data
acquisition unit 20 reads the transportation data (see FIG. 4A) for a plurality of
ODs, and the attribute value acquisition unit 24 reads the attribute values (see FIG.
5) of options. The transportation data acquisition unit 20 transfers the read transportation
data to the selection probability calculation unit 22. The attribute value acquisition
unit 24 stores the read attribute value data in the attribute value storage unit 42.
[0029] Then, in step S12, the selection probability calculation unit 22 calculates the selection
probability of each option in each OD with reference to the transportation data (FIG.
4A). The selection probability calculation unit 22 stores the selection probability
data (FIG. 4B) in which the calculated selection probabilities of the respective options
for the respective ODs are collected in the selection probability storage unit 40.
[0030] Then, in step S14, the learning data generation unit 26 selects one unselected OD.
When there are three ODs (OD1 to OD3) as illustrated in FIG. 4A to FIG. 5, the learning
data generation unit 26 selects one (for example, OD1) of the three ODs.
[0031] Then, in step S16, the learning data generation unit 26 executes a learning data
generation process. In this step S16, a process according to the flowchart of FIG.
12 is executed.
(Learning Data Generation Process (S16))
[0032] In the process of FIG. 12, first, in step S30, the learning data generation unit
26 selects one unselected combination of two options. For example, the learning data
generation unit 26 selects a combination of the option (car) and the option (bus).
[0033] Then, in step S32, the learning data generation unit 26 acquires respective attribute
values of the two options. The learning data generation unit 26 refers to the attribute
value storage unit 42 and acquires the attribute values (cost and time) of, for example,
the option (car) and the option (bus) for OD1 from the attribute value data illustrated
in FIG. 5.
[0034] Then, in step S34, the learning data generation unit 26 calculates the ratio of the
selection probabilities of the two options. The learning data generation unit 26 refers
to the selection probability storage unit 40, acquires, for example, the respective
selection probabilities (50% and 17%) of the option (car) and the option (bus) for
OD1 from the selection probability data in FIG. 4B, and calculates the ratio (50/17).
[0035] Then, in step S36, the learning data generation unit 26 records the acquired attribute
values and the calculated ratio of the selection probabilities as the learning data.
In the above example, the learning data generation unit 26 stores the data with the
learning data ID = "001" in FIG. 6 in the learning data storage unit 44.
[0036] Then, in step S38, the learning data generation unit 26 determines whether all combinations
of options have been selected. When the determination at step S38 is negative, the
process returns to step S30, and the processes in and after step S30 are repeatedly
executed. On the other hand, when the determination in step S38 is affirmative, the
process proceeds to step S18 of FIG. 11.
[0037] In step S18 of FIG. 11, the learning data generation unit 26 determines whether all
ODs have been selected. When the determination in step S18 is negative, the process
returns to step S14, and the processes in steps S14 and S16 are repeatedly executed.
On the other hand, when the determination in step S18 is affirmative, the process
proceeds to step S20. At the stage of proceeding to step S20, all pieces of the learning
data in FIG. 6 are ready.
[0038] In step S20, the model learning unit 28 executes a value calculation model learning
process. In this step S20, a process according to the flowchart of FIG. 13 is executed.
(Value Calculation Model Learning Process (Step S20))
[0039] When the process of FIG. 13 is started, first, in step S40, the model learning unit
28 sets the value calculation model to MLP and initializes the parameters.
[0040] Then, when the process proceeds to step S42, the model learning unit 28 selects one
unselected piece of the learning data. For example, the model learning unit 28 selects
the learning data that is in the top (learning data ID = 001) in FIG. 6.
[0041] Then, in step S44, the model learning unit 28 inputs the attribute values of the
selected piece of the learning data to the value calculation model, and calculates
the respective values (V1 and V2 in FIG. 10) of the options.
[0042] Then, in step S46, the model learning unit 28 calculates the ratio (V1/V2) of the
values of the options.
[0043] Then, in step S48, the model learning unit 28 calculates a difference ((V1/V2) -
(P1/P2)) between the ratio of the values of the options and the ratio of the selection
probabilities of the selected piece of the learning data and records the difference
as a residual.
[0044] Then, in step S50, the model learning unit 28 determines whether all pieces of the
learning data have been selected. When the determination in step S50 is negative,
the process returns to step S42, and the processes in steps S42 to S50 are repeatedly
executed until the residuals of all the pieces of the learning data are calculated.
On the other hand, when the determination in step S50 is affirmative, the model learning
unit 28 proceeds to step S52.
[0045] In step S52, the model learning unit 28 determines whether the sum of the residuals
calculated in step S48 is equal to or less than the threshold value. When the determination
in step S52 is negative, it is necessary to adjust the value calculation model, and
thus the process proceeds to step S54.
[0046] In step S54, the model learning unit 28 updates the parameters of the value calculation
model. In addition, the model learning unit 28 sets all the pieces of the learning
data as unselected and deletes all the recorded residuals. Thereafter, the model learning
unit 28 repeatedly executes the processes of steps S42 to S54 using the updated value
calculation model. Then, when the sum of the residuals becomes equal to or less than
the threshold value, the determination in step S52 becomes affirmative, and the model
learning unit 28 proceeds to step S56.
[0047] In step S56, the model learning unit 28 stores the parameters of the value calculation
model in the model parameter storage unit 46. Thus, the process of FIG. 13 is completed,
and the entire process of FIG. 11 is also completed.
(Billing Amount Determination Process)
[0048] Next, the billing amount determination process will be described with reference to
the flowchart of FIG. 14. The billing amount determination process is a process for
determining a toll of a road using the value calculation model learned by the learning
process of FIG. 11. For example, it is assumed that the user selects "OD1" as the
OD to be considered (object OD). In addition, it is assumed that the user inputs the
target selection probability data illustrated in FIG. 8B as information about target
selection probabilities. In this case, the optimum billing amount calculation unit
32 calculates the cost of the option "car" so that the selection probability of each
option for OD1 matches the corresponding selection probability of FIG. 8B, and outputs
the calculated cost as the optimum billing amount.
[0049] When the process of FIG. 14 is started, first, in step S70, the optimum billing amount
calculation unit 32 reads the attribute values of each option for the OD (for example,
OD1) under consideration and the selection probability data to be achieved (the target
selection probability data). The optimum billing amount calculation unit 32 reads
the target selection probability data (FIG. 8B) input by the user through the target
selection probability acquisition unit 30.
[0050] Then, in step S72, the optimum billing amount calculation unit 32 selects one unselected
option. For example, the optimum billing amount calculation unit 32 selects the option
(car) from the option (car), the option (train), and the option (bus).
[0051] Then, in step S74, the optimum billing amount calculation unit 32 calculates the
value of the selected option using the value calculation model that has been learned
through the processes of FIG. 11 to FIG. 13, and stores the calculated value.
[0052] Then, in step S76, the optimum billing amount calculation unit 32 determines whether
all options have been selected. When the determination in step S76 is negative, the
process returns to step S72, and the processes in steps S72 to S76 are repeated until
the values of all the options are calculated. When the determination in step S76 is
affirmative, the optimum billing amount calculation unit 32 proceeds to step S78.
[0053] In step S78, the optimum billing amount calculation unit 32 calculates (estimates)
the selection probability of each option from the calculated value of each option.
Specifically, the optimum billing amount calculation unit 32 calculates (estimates)
the selection probability of each option using the above equation (1).
[0054] Then, in step S80, the optimum billing amount calculation unit 32 determines whether
the calculated selection probability matches the target selection probability. When
the difference between the calculated selection probability and the target selection
probability falls within a predetermined range, the optimum billing amount calculation
unit 32 may determine that the calculated selection probability and the target selection
probability match each other. When the determination in step S80 is negative, the
process proceeds to step S82, the optimum billing amount calculation unit 32 updates
the cost of the option (car), and the process returns to step S72. Thereafter, the
optimum billing amount calculation unit 32 repeats the processes in and after step
S72 until the determination in step S80 becomes affirmative, and when the determination
in step S80 becomes affirmative, the process proceeds to step S84.
[0055] In step S84, the output unit 34 outputs the cost of the option (car) when the determination
in step S80 is affirmative as the optimum billing amount. By checking the output optimum
billing amount, the user can confirm how much the toll for the car is appropriate
in order to match the selection probability of each option with the corresponding
target selection probability.
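The search of steps S72 to S82 can be sketched as a simple fixed-step adjustment of the car's cost. The value model, step size, and tolerance below are illustrative assumptions; the embodiment does not fix a particular update rule for the cost.

```python
def optimum_billing_amount(value_model, attrs, target_car_prob,
                           step=10.0, tol=0.01, max_iter=1000):
    """Search the car's cost so that the estimated selection probability
    of the car approaches its target (steps S72 to S82)."""
    cost, time = attrs["car"]
    for _ in range(max_iter):
        values = {opt: value_model(*((cost, time) if opt == "car" else attrs[opt]))
                  for opt in attrs}
        total = sum(values.values())
        p_car = values["car"] / total  # relative evaluation, expression (1)
        if abs(p_car - target_car_prob) <= tol:
            break  # close enough to the target (step S80)
        # Raise the toll while the car is over-selected, lower it otherwise.
        cost += step if p_car > target_car_prob else -step
    return cost

# Illustrative stand-in value model: value falls as cost and time rise.
value_model = lambda cost, time: 1000.0 / (cost + 10.0 * time)
attrs = {"car": (100, 10), "train": (200, 6), "bus": (500, 3)}
toll = optimum_billing_amount(value_model, attrs, target_car_prob=1 / 3)
```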
[0056] As described above in detail, the information processing apparatus 10 of the present
embodiment acquires the selection probability of each option that can be used when
a person moves between OD, and the data (FIG. 4B and FIG. 5) of the attribute values
of each option when the selection probability is obtained. In addition, the information
processing apparatus 10 calculates the relationship (ratio) between the selection
probabilities of the options for each combination of two options that can be extracted
from a plurality of options. Then, the information processing apparatus 10 adjusts
(learns) the value calculation model so that the relationship (ratio) between the
values calculated when the attribute values of the options are input to the value
calculation model and the relationship (ratio) between the selection probabilities
approach each other. As a result, in the present embodiment, even when the numerical
value of the value is not included in the learning data, the value calculation model
can be learned from the ratio of the selection probabilities, which are observed values.
In addition, by performing machine learning on the value calculation model, it is
possible to obtain a value calculation model capable of accurately calculating the
value of the option.
[0057] The value calculation model used in the present embodiment is a neural network (MLP
or the like) in which inputs are the attribute values of each option and an output
is the value of each option. This allows the user to automatically learn the value
calculation model without assuming a linear equation or the like in advance as the
value calculation model.
[0058] In the present embodiment, the model learning unit 28 calculates the differences
(residuals) between the relationships (ratio) between the values of the options obtained
from the learning data (learning data IDs = 001 to 009) and the relationships (ratio)
between the selection probabilities. Then, the model learning unit 28 adjusts the
parameters of the value calculation model so that the sum of the differences is equal
to or less than the threshold value (S42 to S54 in FIG. 13). Thus, it is possible
to obtain a value calculation model capable of calculating the value of each option
with high accuracy.
[0059] Further, in the present embodiment, the optimum billing amount calculation unit 32
calculates the value of each option by inputting the attribute values of each option
to the value calculation model learned by the processes of FIG. 11 to FIG. 13 (S74
of FIG. 14). Then, the optimum billing amount calculation unit 32 calculates the selection
probability of each option based on the calculated values of each option (S78). Thus,
the selection probability of each option can be calculated with high accuracy.
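Step S78 may be illustrated as follows; normalizing each value by the total is one plausible reading of the embodiment's relative evaluation, and the values used are illustrative:

```python
def selection_probabilities(values):
    # Relative evaluation by ratio: each option's selection probability
    # is its share of the total value (an assumed reading of the
    # embodiment's relative evaluation, not a quoted formula).
    total = sum(values)
    return [v / total for v in values]

probs = selection_probabilities([3.0, 2.0, 1.0])  # e.g. car, train, bus
```

The resulting probabilities sum to 1, and the option with the largest value receives the largest selection probability.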
[0060] Further, in the present embodiment, the optimum billing amount calculation unit 32
adjusts at least part of the attribute values of each option so that the estimated
selection probability of each option approaches the corresponding target selection
probability (S82). Accordingly, for example, by adjusting the cost of the option (car)
so that the selection probability of the option (car) is reduced, it is possible to
determine the optimum toll (road pricing) for eliminating the congestion of the road.
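The adjustment of S82 may be sketched as follows; S82 only states that attribute values are adjusted, so the greedy toll-raising loop, the value function, and all numbers below are hypothetical:

```python
def adjust_cost(costs, times, value_fn, idx, target_p, step=10.0, iters=200):
    # Greedily raise the cost (e.g. a road toll) of option `idx` until
    # its estimated selection probability falls to the target
    # (a hypothetical adjustment strategy, not the claimed method).
    for _ in range(iters):
        values = [value_fn(c, t) for c, t in zip(costs, times)]
        if values[idx] / sum(values) <= target_p:
            break
        costs[idx] += step  # a higher toll lowers the option's value
    return costs[idx]

# Hypothetical value function: value decreases as cost + time grows.
value_fn = lambda cost, time: 1000.0 / (cost + time)
costs, times = [100.0, 200.0, 150.0], [30.0, 45.0, 60.0]  # car, train, bus
toll = adjust_cost(costs, times, value_fn, 0, target_p=0.30)
```

Raising the toll in steps lowers the value of the car option until its estimated selection probability drops to the target, which corresponds to determining a road-pricing level that reduces congestion.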
[0061] In the above-described embodiment, the optimum billing amount calculation unit 32
optimizes the toll of the road. However, this does not intend to suggest any limitation,
and the fare (cost) of the train or the bus may be adjusted so that the selection
probability of each option approaches the corresponding target selection probability.
The transportation options may include other options (a motorcycle, a ship, an airplane,
or the like) in addition to or instead of at least one of a car, a train, or a bus.
(Variation)
[0062] In the above-described embodiment, the case where the relative evaluation of FIG.
7 is performed based on the above equation (1) has been described, but this does not
intend to suggest any limitation. For example, as illustrated in FIG. 15, a logit
model, which is frequently used in behavior selection models, can be used for the
relative evaluation. When the logit model is used for the relative evaluation, the
selection probability Pi of each option can be obtained from the following equation (2).

Pi = exp(Vi) / Σj exp(Vj) ... (2)
[0063] In this case, the relationship between P1 and P2 can be expressed by the following
equation (3).

lnP1 - lnP2 = V1 - V2 ... (3)
[0064] From the above equation (3), it can be seen that when the logit model is used as
the relative evaluation, the difference between the values can be used as the relationship
between the values. Additionally, it can be seen that the value calculation model
is to be learned so that the difference between the values approaches the relationship
between the selection probabilities (the difference between the numerical values of
the natural logarithms of the selection probabilities).
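Equations (2) and (3) may be checked with a short Python sketch; the values 1.5 and 0.5 are illustrative:

```python
import math

def logit_probabilities(values):
    # Equation (2): Pi = exp(Vi) / sum_j exp(Vj).
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

p1, p2 = logit_probabilities([1.5, 0.5])
# Equation (3): lnP1 - lnP2 equals V1 - V2, since the normalizing
# denominator cancels when the logarithms are subtracted.
assert abs((math.log(p1) - math.log(p2)) - (1.5 - 0.5)) < 1e-9
```

This confirms that, under the logit model, the difference between the natural logarithms of the selection probabilities directly gives the difference between the values.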
[0065] FIG. 16 illustrates an outline of a learning device of the model learning unit 28
in accordance with the present variation. As illustrated in FIG. 16, also in the present
variation, the model learning unit 28 inputs the attribute values of two options to
the value calculation model at the time of learning, similarly to the above-described
embodiment. Then, the model learning unit 28 calculates the relationship between the
values (the difference (V1 - V2)) from the values V1 and V2 output from the value
calculation model. In addition, the model learning unit 28 obtains the relationship
(lnP1 - lnP2) between the selection probabilities P1 and P2 as observed values. Then,
the model learning unit 28 obtains the difference ((lnP1 - lnP2) - (V1 - V2)) between
the relationship between the values (V1 - V2) and the relationship between the selection
probabilities (lnP1 - lnP2). The model learning unit 28 obtains such differences (residuals)
using all pieces of the learning data, and updates the parameters of the value calculation
model so that the sum of the differences is equal to or less than the threshold value.
In this manner, learning of the value calculation model can be performed also in the
present variation.
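The residual of the present variation follows the form stated above; the sample tuples used below are illustrative, not taken from the learning data:

```python
import math

def variation_residual(v1, v2, p1, p2):
    # Residual of the present variation: (lnP1 - lnP2) - (V1 - V2).
    return (math.log(p1) - math.log(p2)) - (v1 - v2)

def total_variation_residual(samples):
    # Sum of absolute residuals over all learning data; the parameters
    # of the value calculation model are updated until this sum is
    # equal to or less than the threshold value.
    return sum(abs(variation_residual(v1, v2, p1, p2))
               for v1, v2, p1, p2 in samples)

# Illustrative (V1, V2, P1, P2) samples.
samples = [(1.0, 0.0, 0.7, 0.3), (0.5, 0.0, 0.6, 0.4)]
```

When the model fits the observations exactly, every residual is zero, so the sum is trivially within any non-negative threshold.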
[0066] In the case of the present variation, since the difference between V1 and V2
is used as presented in the above equation (3), the calculation can be performed even
when the numerical values of the values V1 and V2 are 0, for example. Accordingly,
since the loss function of the machine learning does not have a singular point, the
calculation in the machine learning can be stabilized.
[0067] In the above-described embodiment, the case where the value calculation model is
a model of a neural network such as an MLP has been described, but this does not intend
to suggest any limitation. As the value calculation model, a linear equation such
as V = w1 × cost + w2 × time (where w1 and w2 are weight coefficients and V is the
value) may be used.
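Such a linear value calculation model is a one-line function; the negative weight coefficients used below are illustrative defaults, chosen to express that a higher cost or a longer time lowers the value:

```python
def linear_value(cost, time, w1=-0.01, w2=-0.05):
    # Linear value calculation model V = w1 * cost + w2 * time.
    # w1 and w2 are illustrative weight coefficients; in practice they
    # would be learned in the same way as the neural-network parameters.
    return w1 * cost + w2 * time

v = linear_value(100.0, 10.0)  # V = -0.01 * 100.0 + -0.05 * 10.0 = -1.5
```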
[0068] In the above-described embodiment, transportation options (car, train, bus) have
been described as an example of the option used when people act, but this does not
intend to suggest any limitation. There are various options used when people act,
and for example, net shopping and an actual store used when people shop also correspond
to options used when people act. That is, in a situation where a person selects an
option from among a plurality of options when performing a certain action, when the
value of each option and the selection probability of each option are obtained, the
above-described embodiment can be appropriately modified and used. In the above-described
embodiment, the case in which the attribute values are cost and time has been described,
but the attribute values may be something other than cost or time.
[0069] In the above-described embodiment, the case in which the information processing apparatus
10 used by the user has the functions of FIG. 3 has been described, but this does
not intend to suggest any limitation. For example, a server apparatus connected to
the information processing apparatus 10 used by the user via a network or the like
may have the functions of FIG. 3.
[0070] The above-described processing functions are implemented by a computer. In this case,
a program describing the processing details of the functions that the processing device
is to have is provided. The aforementioned processing functions are implemented
on the computer by the computer executing the program.
The program in which the processing details are written can be stored in a computer-readable
recording medium (however, excluding carrier waves).
[0071] When the program is distributed, it may be sold in the form of a portable storage
medium such as a digital versatile disc (DVD) or a compact disc read only memory (CD-ROM)
storing the program. The program may be stored in a storage device of a server computer,
and the program may be transferred from the server computer to another computer over
a network.
[0072] A computer executing the program stores the program stored in a portable storage
medium or transferred from a server computer in its own storage device. The computer
then reads the program from its own storage device, and executes processes according
to the program. The computer may directly read the program from a portable storage
medium, and execute processes according to the program. Alternatively, every time
the program is transferred from a server computer, the computer may successively execute
a process according to the received program.
[0073] Although the exemplary embodiments of the present invention have been illustrated
in detail, the present invention is not limited to the above-mentioned embodiments,
and other embodiments, variations and modifications may be made without departing
from the scope of the present invention.