(57) The present disclosure provides an image-based human-computer interaction method
and apparatus, a device, and a storage medium, relating to the field of artificial
intelligence and, in particular, to the field of image processing. A specific implementation
is as follows: acquiring a to-be-analyzed image, and determining image layout
information and image content information of the to-be-analyzed image, where the to-be-analyzed
image includes data of multiple modalities, the image layout information represents the distribution
of image elements at a preset granularity in the to-be-analyzed image, and the image
content information represents the content expressed by the modal data in the to-be-analyzed
image; and determining, in response to acquiring question information, response information
corresponding to the question information according to the image layout information
and the image content information, where the question information represents a question
posed by a user about the to-be-analyzed image, and the response information represents
an answer corresponding to the question information. By extracting layout information
and content information from an image, the accuracy of answering a question and user
experience of human-computer interaction are improved.
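
For illustration only, the two steps summarized above (determining layout and content information, then answering a question from both) might be sketched as follows. This is a minimal toy under stated assumptions, not the disclosed implementation: the names `LayoutElement`, `ImageAnalysis`, `analyze_image`, and `answer_question` are hypothetical, the analysis step takes pre-annotated elements rather than pixels, and a simple word-overlap match stands in for whatever model actually produces the response.

```python
from dataclasses import dataclass


@dataclass
class LayoutElement:
    # One image element at a preset granularity (hypothetical schema).
    element_id: str
    modality: str   # e.g. "text", "table", "chart"
    bbox: tuple     # (x0, y0, x1, y1) position in the image


@dataclass
class ImageAnalysis:
    layout: list    # list of LayoutElement: where each element sits
    content: dict   # element_id -> content expressed by that element


def analyze_image(elements):
    """Stand-in for layout/content determination.

    Assumption: takes pre-annotated (element_id, modality, bbox, content)
    tuples instead of running detection or recognition on pixels.
    """
    layout = [LayoutElement(i, m, b) for i, m, b, _ in elements]
    content = {i: c for i, _, _, c in elements}
    return ImageAnalysis(layout, content)


def answer_question(analysis, question):
    """Toy responder: choose the element whose content shares the most
    words with the question, and report its content with its layout."""
    q_words = set(question.lower().split())
    best, best_score = None, 0
    for el in analysis.layout:
        words = set(analysis.content[el.element_id].lower().split())
        score = len(q_words & words)
        if score > best_score:
            best, best_score = el, score
    if best is None:
        return "No matching element found."
    return f"{analysis.content[best.element_id]} ({best.modality} at {best.bbox})"


analysis = analyze_image([
    ("e1", "text", (0, 0, 100, 20), "Quarterly revenue report"),
    ("e2", "table", (0, 30, 100, 90), "revenue 2023 was 5 million"),
])
reply = answer_question(analysis, "what was revenue in 2023")
print(reply)
```

The sketch keeps layout (position and modality) separate from content so that a response can draw on both, which mirrors the division described in the abstract.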