REVIEW OF CURRENT WAYS TO SPEED UP WORK WITH TRAINING DATA IN AUTOMATIC SPEECH RECOGNITION SYSTEMS

Ю.А. Шульга; С.С. Забара

doi:10.36994/2788-5518-2022-02-04-24

Keywords:

speech recognition, distributed systems, decentralized systems, deep learning

Abstract

As the number of voice samples to be processed grows, the main task is to speed up work with training data while preserving the performance and fault tolerance of recognition systems as a whole, which becomes achievable with distributed and decentralized systems. Multi-cloud approaches make it possible to combine the advantages of each cloud, gaining scalability, more computations per unit of time, and a shorter wait for results.
