(19)
(11)EP 3 859 732 A3

(12)EUROPEAN PATENT APPLICATION

(88)Date of publication A3:
08.12.2021 Bulletin 2021/49

(43)Date of publication A2:
04.08.2021 Bulletin 2021/31

(21)Application number: 21175049.2

(22)Date of filing: 20.05.2021
(51)International Patent Classification (IPC): 
G10L 15/16(2006.01)
G10L 15/32(2013.01)
G10L 15/06(2013.01)
G06N 3/08(2006.01)
G10L 15/20(2006.01)
G06N 3/04(2006.01)
(52)Cooperative Patent Classification (CPC):
G10L 15/20; G10L 15/16; G10L 15/063; G01S 3/8006; G06N 3/084; G06N 3/0454; G06N 3/0445; G10L 15/32
(84)Designated Contracting States:
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
Designated Extension States:
BA ME
Designated Validation States:
KH MA MD TN

(30)Priority: 20.07.2020 CN 202010697077

(71)Applicant: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO. LTD.
100085 Beijing (CN)

(72)Inventors:
  • LI, Xin
    Beijing, 100085 (CN)
  • HUANG, Bin
    Beijing, 100085 (CN)
  • ZHANG, Ce
    Beijing, 100085 (CN)
  • BAI, Jinfeng
    Beijing, 100085 (CN)
  • JIA, Lei
    Beijing, 100085 (CN)

(74)Representative: V.O. 
P.O. Box 87930
2508 DH Den Haag (NL)




    (54)METHOD AND APPARATUS FOR RECOGNIZING VOICE


    (57) The present disclosure provides a method and apparatus for recognizing a voice, and relates to the fields of voice and deep learning technology. A specific embodiment includes: inputting a target voice into a pre-trained voice recognition model to obtain an initial text output by at least one recognition network in the voice recognition model, where the recognition network includes a plurality of preset types of processing layers, and at least one type of processing layer of the recognition network is obtained by training based on a voice sample in a preset direction interval; and determining a voice recognition result of the target voice based on the initial text. Because at least part of the processing layers of each recognition network in the voice recognition model is obtained by training on voice in a preset direction interval, the accuracy of voice recognition in the preset direction interval may be improved.
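
    The two steps of the embodiment (obtaining initial texts from the recognition networks, then determining a result from them) can be illustrated with a minimal Python sketch. It is not the applicant's implementation. Every name in it (RecognitionNetwork, VoiceRecognitionModel, recognize, direction_interval, decode) is hypothetical, the direction interval is assumed to be an angle range in degrees, and the rule of picking the initial text with the highest confidence is an assumption made only for illustration, since the abstract does not state how the result is determined.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration; none of these names come from the application.
Audio = Sequence[float]          # raw samples of the target voice
InitialText = Tuple[str, float]  # (decoded text, confidence score): assumed shape


@dataclass
class RecognitionNetwork:
    """One recognition network whose layers are (partly) trained on
    voice samples arriving from a preset direction interval."""
    direction_interval: Tuple[float, float]  # assumed: (start, end) in degrees
    decode: Callable[[Audio], InitialText]   # stands in for the trained layers


@dataclass
class VoiceRecognitionModel:
    """Pre-trained model holding at least one recognition network."""
    networks: List[RecognitionNetwork]

    def initial_texts(self, voice: Audio) -> List[InitialText]:
        # Step 1: feed the target voice to every recognition network
        # and collect the initial text each one outputs.
        return [net.decode(voice) for net in self.networks]


def recognize(model: VoiceRecognitionModel, voice: Audio) -> str:
    """Step 2: determine the recognition result from the initial texts.

    Assumption: the candidate with the highest confidence wins.
    """
    candidates = model.initial_texts(voice)
    if not candidates:
        raise ValueError("the model needs at least one recognition network")
    text, _confidence = max(candidates, key=lambda c: c[1])
    return text
```

    In this sketch each network would be constructed with the decoder trained for its own direction interval, and recognize(model, voice) returns a single text for the target voice.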







    Search report








