[Information Technology] [2015.03] Speech Recognition Based on Deep Neural Networks

This is a doctoral dissertation from Stanford University, USA (author: Andrew Lee Maas), 191 pages in total.

Spoken language is an increasingly pervasive interface choice as computing devices permeate many aspects of daily life. Automatically understanding spoken language poses significant challenges because it requires both converting a speech signal into words and extracting meaning from the words themselves. Spoken language understanding tasks can roughly be broken into distinct components which perform (1) low-level processing of the audio signal, (2) speech transcription, and (3) natural language understanding. We describe approaches to improving individual components for each sub-task associated with spoken language understanding. Our methods primarily rely on machine-learning-based approaches to replace hand-engineered approaches and consistently find that learning from data with minimal assumptions about a problem results in improved performance. In particular, we focus on neural network approaches to problems. Neural networks have seen a recent resurgence of interest thanks to their ability to scale to learn increasingly complex functions when more data becomes available. Neural networks have recently driven tremendous progress in the field of computer vision, where many tasks easily translate into classification and regression problems. In spoken language understanding, however, it is more difficult to define tasks which are easily formalized into problems for a neural network to solve. Our work integrates with these complex systems and shows that, like in computer vision, neural networks can significantly improve spoken language understanding systems.

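Introduction
Background
The TREPAN Algorithm
Empirical Evaluation of TREPAN
Analytical Evaluation of TREPAN
The MOFN-SWS Algorithm: A Local Method for Extracting M-of-N Rules
A Boosting-Based Perceptron Learning Algorithm
Other Related Work
Conclusions
Appendix A: Representing Trees Extracted by TREPAN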
