EP 4281965 A1 20231129 - QUALITY ESTIMATION FOR AUTOMATIC SPEECH RECOGNITION
Title (en)
QUALITY ESTIMATION FOR AUTOMATIC SPEECH RECOGNITION
Title (de)
QUALITÄTSSCHÄTZUNG FÜR AUTOMATISCHE SPRACHERKENNUNG
Title (fr)
ESTIMATION DE LA QUALITÉ POUR LA RECONNAISSANCE AUTOMATIQUE DE LA PAROLE
Publication
EP 4281965 A1 20231129
Application
EP 21920245 A 20210121
Priority
CN 2021073073 W 20210121
Abstract (en)
[origin: WO2022155842A1] A method, a system, and a computer-readable storage medium are provided for implementing quality estimation for automatic speech recognition, and more specifically training an ASR model, and training a QE model to perform word error rate prediction upon the trained ASR model. The ASR model may be a transformer learning model having an architecture including an encoder including multi-head attention layers, and a memory encoder including a masking multi-head attention layer. The QE model may include a binary classification model and a regression model, where the binary classification model is based on a discrete statistical distribution, and the regression model is based on a continuous statistical distribution. Training the ASR model may produce output having variable word error rates, and the QE model may be trained based on empirical word error rates of the ASR model. The QE model may predict performance of the ASR model without labor-intensive labeling to generate ground truth.
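The empirical word error rate on which the QE model is trained can be illustrated with a minimal sketch. It assumes the conventional definition of WER (word-level edit distance divided by the number of reference words) and whitespace tokenization; the abstract does not specify either, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are split on whitespace; the record itself does not
    say how the ASR output is tokenized or normalized.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over six
# reference words, i.e. 2 / 6.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Computing this value requires a reference transcript, which is the labeling effort the abstract says the QE model is meant to avoid at prediction time.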
IPC 8 full level
G10L 15/01 (2013.01); G10L 15/06 (2013.01)
CPC (source: EP US)
G10L 15/01 (2013.01 - EP US); G10L 15/063 (2013.01 - EP US); G10L 15/16 (2013.01 - EP)
Designated contracting state (EPC)
AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR
Designated extension state (EPC)
BA ME
Designated validation state (EPC)
KH MA MD TN
DOCDB simple family (publication)
WO 2022155842 A1 20220728; EP 4281965 A1 20231129; EP 4281965 A4 20240821; US 2023360636 A1 20231109
DOCDB simple family (application)
CN 2021073073 W 20210121; EP 21920245 A 20210121; US 202318224514 A 20230720