Global Patent Index - EP 3753018 A4

EP 3753018 A4 20210512 - A METHOD AND DEVICE FOR PROCESSING WHISPERED SPEECH

Title (en)

A METHOD AND DEVICE FOR PROCESSING WHISPERED SPEECH

Title (de)

VERFAHREN UND VORRICHTUNG ZUR VERARBEITUNG GEFLÜSTERTER SPRACHE

Title (fr)

PROCÉDÉ ET DISPOSITIF DESTINÉ AU TRAITEMENT D'UN DISCOURS CHUCHOTÉ

Publication

EP 3753018 A4 20210512 (EN)

Application

EP 19786188 A 20190327

Priority

  • US 201815949358 A 20180410
  • CN 2019079776 W 20190327

Abstract (en)

[origin: US2019311711A1] The present disclosure provides embodiments of a method, system and devices that use deep learning to convert whispered speech to normal speech, so that whispered speech becomes more robust to interference and more intelligible to the listener. A first audio signal including first whispered speech is received; a first plurality of computations is performed on the first audio signal to extract first features; the first features are provided as input to a trained deep neural network (DNN) model to obtain output of the DNN model comprising second features; and an inverse of the first plurality of computations is performed on the second features to produce a second audio signal corresponding to a non-whispered version of the first whispered speech.
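
The extract, map, invert flow described in the abstract can be illustrated with a minimal sketch. The abstract does not say which computations or features are used, so everything specific here is an assumption made for illustration: log-magnitude short-time Fourier features, the frame and hop sizes, reuse of the input phase during reconstruction, and an identity function standing in for the trained DNN. All function names are hypothetical and are not taken from the patent.

```python
import numpy as np

FRAME = 512  # assumed frame length in samples
HOP = 128    # assumed hop size (75% overlap)


def extract_features(signal):
    """Forward computations (assumed): framing, Hann window, FFT, log-magnitude."""
    window = np.hanning(FRAME)
    n_frames = 1 + (len(signal) - FRAME) // HOP
    frames = np.stack(
        [signal[i * HOP : i * HOP + FRAME] * window for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames, axis=1)
    log_mag = np.log(np.abs(spectrum) + 1e-8)  # the "first features"
    phase = np.angle(spectrum)                 # kept aside for the inverse step
    return log_mag, phase


def placeholder_model(log_mag):
    """Stand-in for the trained DNN: an identity mapping, NOT a real model."""
    return log_mag


def reconstruct(log_mag, phase, length):
    """Inverse computations: exponentiate, inverse FFT, windowed overlap-add."""
    window = np.hanning(FRAME)
    spectrum = np.exp(log_mag) * np.exp(1j * phase)
    frames = np.fft.irfft(spectrum, n=FRAME, axis=1) * window
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        out[i * HOP : i * HOP + FRAME] += frame
        norm[i * HOP : i * HOP + FRAME] += window ** 2
    # Divide by the summed squared window to undo the analysis/synthesis windows.
    return out / np.maximum(norm, 1e-8)


# Synthetic noise stands in for a whispered recording (one second at an assumed 16 kHz).
rng = np.random.default_rng(0)
whispered = rng.standard_normal(16000)

features, phase = extract_features(whispered)
second_features = placeholder_model(features)
restored = reconstruct(second_features, phase, len(whispered))
```

With the identity placeholder, the output matches the input away from the signal edges, which only confirms that the inverse step undoes the forward step. In a real system the placeholder would be replaced by a trained network that maps whispered-speech features to normal-speech features.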

IPC 8 full level

G10L 21/02 (2013.01); G10L 25/30 (2013.01); G10L 21/0364 (2013.01); G10L 25/18 (2013.01)

CPC (source: EP US)

G10L 15/02 (2013.01 - US); G10L 15/063 (2013.01 - US); G10L 15/16 (2013.01 - US); G10L 15/22 (2013.01 - US); G10L 21/003 (2013.01 - EP); G10L 25/30 (2013.01 - EP); G10L 21/0364 (2013.01 - EP); G10L 25/18 (2013.01 - EP)

Citation (search report)

  • [I] SRINIVAS DESAI ET AL: "Spectral Mapping Using Artificial Neural Networks for Voice Conversion", IEEE TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, IEEE, US, vol. 18, no. 5, 1 July 2010 (2010-07-01), pages 954 - 964, XP011329177, ISSN: 1558-7916, DOI: 10.1109/TASL.2010.2047683
  • [I] LING-HUI CHEN ET AL: "Voice conversion using deep neural networks with layer-wise generative training", IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, IEEE, USA, vol. 22, no. 12, 1 December 2014 (2014-12-01), pages 1859 - 1872, XP058065951, ISSN: 2329-9290, DOI: 10.1109/TASLP.2014.2353991
  • [I] ZHI TAO ET AL: "Reconstruction of Normal Speech from Whispered Speech Based on RBF Neural Network", INTELLIGENT INFORMATION TECHNOLOGY AND SECURITY INFORMATICS (IITSI), 2010 THIRD INTERNATIONAL SYMPOSIUM ON, IEEE, PISCATAWAY, NJ, USA, 2 April 2010 (2010-04-02), pages 374 - 377, XP031661781, ISBN: 978-1-4244-6730-3
  • [I] KANEKO TAKUHIRO ET AL: "Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks", INTERSPEECH 2017, 1 January 2017 (2017-01-01), ISCA, pages 1283 - 1287, XP055791467, Retrieved from the Internet <URL:https://www.isca-speech.org/archive/Interspeech_2017/pdfs/0970.PDF> [retrieved on 20210329], DOI: 10.21437/Interspeech.2017-970

Citation (examination)

  • JANKE M ET AL: "Fundamental frequency generation for whisper-to-audible speech conversion", 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), IEEE, 4 May 2014 (2014-05-04), pages 2579 - 2583, XP032616925, DOI: 10.1109/ICASSP.2014.6854066
  • SRINIVAS DESAI ET AL: "Voice conversion using Artificial Neural Networks", ACOUSTICS, SPEECH AND SIGNAL PROCESSING, 2009. ICASSP 2009. IEEE INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 19 April 2009 (2009-04-19), pages 3893 - 3896, XP031460124, ISBN: 978-1-4244-2353-8
  • See also references of WO 2019196648A1

Designated contracting state (EPC)

AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

Designated extension state (EPC)

BA ME

DOCDB simple family (publication)

US 10832660 B2 20201110; US 2019311711 A1 20191010; CN 111902862 A 20201106; EP 3753018 A1 20201223; EP 3753018 A4 20210512; WO 2019196648 A1 20191017

DOCDB simple family (application)

US 201815949358 A 20180410; CN 2019079776 W 20190327; CN 201980021280 A 20190327; EP 19786188 A 20190327