Application of Extractive Text Summarization Algorithms to Speech-to-Text Media

Domínguez M., Victor; Fidalgo F., Eduardo; Rubel Biswas; Enrique Alegre; Laura Fernández-Robles

Domínguez M., Victor ¹
Fidalgo F., Eduardo ¹²
Rubel Biswas ¹²
Enrique Alegre ¹²
Laura Fernández-Robles ¹²

1 Universidad de León, León, Spain (ROR: https://ror.org/02tzt0b78)
2 INCIBE (Spanish National Institute of Cybersecurity, León)

Book:

Hybrid Artificial Intelligent Systems. 14th International Conference, HAIS 2019: León, Spain, September 4–6, 2019. Proceedings

Hilde Pérez García (coord.)
Lidia Sánchez González (coord.)
Manuel Castejón Limas (coord.)
Héctor Quintián Pardo (coord.)
Emilio Corchado Rodríguez (coord.)

Publisher: Springer Switzerland

ISBN: 978-3-030-29859-3, 978-3-030-29858-6

Year of publication: 2019

Pages: 540-550

Conference: Hybrid Artificial Intelligent Systems (14. 2019. León)

Type: Conference paper

Abstract

This paper shows how speech-to-text summarization can be performed with extractive text summarization algorithms. Our objective is to recommend which of the six text summarization algorithms evaluated in the study is most suitable for audio summarization. First, we selected six text summarization algorithms: Luhn, TextRank, LexRank, LSA, SumBasic, and KLSum. We then evaluated them on two datasets, DUC2001 and OWIDSum, using six ROUGE metrics. Next, we selected five speech documents from the ICSI Corpus dataset and transcribed them with the Automatic Speech Recognition (ASR) service of the Google Cloud Speech API. Finally, we applied the six extractive summarization algorithms to these five transcripts to obtain a text summary of each original audio file. In our experiments, Luhn and TextRank obtained the best performance for extractive speech-to-text summarization on the samples evaluated.
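The two core steps named in the abstract, extractive sentence selection and ROUGE scoring, can be illustrated with a minimal sketch. It is not the authors' implementation and is not a faithful version of Luhn or SumBasic: it is a simplified frequency-based sentence scorer, and the function names, the tiny stopword list, and the naive sentence splitter are all assumptions made for illustration. The ROUGE-1 recall function follows the usual definition (clipped unigram overlap divided by the number of reference unigrams).

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "on", "for", "we", "it"}


def tokenize(text):
    """Return lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=2):
    """Select the n highest-scoring sentences and return them in document order.

    A sentence's score is the mean document frequency of its non-stopword
    tokens. This is a simplification, not Luhn's cluster-based scoring.
    """
    sentences = split_sentences(text)
    freq = Counter(w for w in tokenize(text) if w not in STOPWORDS)

    def score(sentence):
        words = [w for w in tokenize(sentence) if w not in STOPWORDS]
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]


def rouge1_recall(candidate, reference):
    """ROUGE-1 recall: clipped unigram overlap / number of reference unigrams."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    total = sum(ref.values())
    overlap = sum(min(cand[w], count) for w, count in ref.items())
    return overlap / total if total else 0.0
```

In a speech-to-text setting, the input to `summarize` would be the ASR transcript rather than written text, so missing punctuation in the transcript would directly affect a sentence splitter like the one assumed here.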

Data source: Dialnet
