Some ICASSP 2020 Keynotes and Tutorials

https://cmsworkshops.com/ICASSP2020/TechnicalProgram.asp


T-1: Machine Learning and Wireless Communications

by
Yonina Eldar, Vince Poor and Nir Shlezinger
Weizmann Institute of Science, Princeton University, Weizmann Institute of Science
Mobile communications and machine learning are two of the most exciting and rapidly developing technological fields of our time. In the past few years, these two fields have begun to merge in two fundamental ways. First, while mobile communications has developed largely as a model-driven field, the complexities of many emerging communication scenarios raise the need to introduce data-driven methods into the design and analysis of mobile networks. Second, many machine learning problems are by their nature distributed due to either physical limitations or privacy concerns. This distributed nature can be exploited by using mobile networks as part of the learning mechanisms, i.e., as platforms for machine learning.
In this tutorial we will illuminate these two perspectives, presenting a representative set of relevant problems which have been addressed in the recent literature, and discussing the multitude of exciting research directions which arise from the combination of machine learning and wireless communications. We will begin with the application of machine learning methods for optimizing wireless networks: Here, we will first survey some of the challenges in communication networks which can be treated using machine learning tools. Then, we will focus on one of the fundamental problems in digital communications – receiver design. We will review different designs of data-driven receivers, and discuss how they can be related to conventional and emerging approaches for combining machine learning and model-based algorithms. We will conclude this part of the tutorial with a set of communication-related problems which can be tackled in a data-driven manner.
The second part of the tutorial will be dedicated to wireless networks as a platform for machine learning: We will discuss communication issues arising in distributed learning problems such as federated learning and collaborative learning. We will explain how established communications and coding methods can contribute to the development of these emerging distributed learning technologies, illustrating these ideas through examples from recent research in the field. We will conclude with a set of open machine learning related problems, which we believe can be tackled using established communications and signal processing techniques.
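To make the federated learning setting mentioned above concrete, the server-side aggregation step of federated averaging (FedAvg) can be sketched in a few lines. The function name and toy parameter vectors below are illustrative, not taken from the tutorial itself:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Aggregate client models by a data-size-weighted average (FedAvg)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Three clients hold locally trained models (here: flat parameter vectors)
# on datasets of different sizes; the server never sees the raw data, only
# the model updates, which is what makes the scheme privacy-friendly.
clients = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
sizes = [10, 30, 60]
global_model = fedavg(clients, sizes)
print(global_model)  # weighted toward the clients with more data
```

The communication cost of repeatedly exchanging such updates over wireless links is exactly where the coding and communications methods discussed in the tutorial come into play.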

T-2: Distributed and Efficient Deep Learning

by
Wojciech Samek and Felix Sattler
Fraunhofer Heinrich Hertz Institute
Deep neural networks have recently demonstrated their incredible ability to solve complex tasks. Today’s models are trained on millions of examples using powerful GPU cards and are able to reliably annotate images, translate text, understand spoken language or play strategic games such as chess or Go. Furthermore, deep learning will also be an integral part of many future technologies, e.g., autonomous driving, Internet of Things (IoT) or 5G networks. Especially with the advent of IoT, the number of intelligent devices has rapidly grown in the last couple of years. Many of these devices are equipped with sensors that allow them to collect and process data at unprecedented scales. This opens unique opportunities for deep learning methods.
However, these new applications come with several additional constraints and requirements, which limit the out-of-the-box use of current models.
Embedded devices, IoT gadgets and smartphones have limited memory & storage capacities and restricted energy resources. Deep neural networks such as VGG-16 require over 500 MB for storing the parameters and up to 15 giga-operations for performing a single forward pass. Such models in their current (uncompressed) form cannot be used on-device.
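The storage figure quoted above can be sanity-checked with simple arithmetic, assuming the commonly cited count of roughly 138 million parameters for VGG-16, each stored as a 32-bit float:

```python
# Back-of-the-envelope check of the "over 500 MB" claim for VGG-16.
params = 138_000_000   # approximate parameter count of VGG-16
bytes_per_param = 4    # float32
megabytes = params * bytes_per_param / 1e6
print(f"{megabytes:.0f} MB")  # ~552 MB, consistent with the text
```

This is why compression techniques such as pruning and quantization, discussed in the tutorial, are a prerequisite for on-device deployment.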
Training data is often distributed over devices and cannot simply be collected at a central server due to privacy issues or limited resources (bandwidth). Since local training of a model with only a few data points is often not promising, new collaborative training schemes are needed to bring the power of deep learning to these distributed applications.
This tutorial will discuss recently proposed techniques to tackle these two problems.

T-3: Graph Filters with Applications to Distributed Optimization and Neural Networks

by
Geert Leus, Elvin Isufi and Mario Coutino
TU Delft
Although processing and analyzing audio, images and video is still of great importance in current society, more and more data is originating from networks with an irregular structure, e.g., social networks, brain networks, sensor networks, and communication networks, to name a few. To handle such signals, graph signal processing has recently emerged as a suitable tool set. In graph signal processing, the irregular structure of the network is captured by means of a graph, and the data is viewed as a signal on top of this graph, i.e., a graph signal. Graph signal processing extends concepts and tools from classical signal processing to the field of graph signals, e.g., the Fourier transform, filtering, sampling, stationarity, etc. Since nowadays many researchers and engineers work in the field of network data processing, this tutorial is attractive, timely and critical. Further, most existing tutorials in this field focus on the basics of graph signal processing. Hence, it is urgent to go one step beyond and discuss the latest advances in graph signal processing as well as connections to the exciting fields of distributed optimization and neural networks, both of which draw inspiration from fundamental signal processing techniques.
More specifically, in this tutorial, we will emphasize the concept of graph filtering, one of the cornerstones of the field of graph signal processing. Graph filters are direct analogues of time-domain filters but intended for signals defined on graphs. They find applications in image denoising, network data interpolation, signal and link prediction, learning of graph signals and building recommender systems. More recently, connections to distributed optimization as well as neural networks have been established. These last two applications rely heavily on core signal processing techniques such as iterative inversion algorithms and linear time-invariant filters. Graph filters extend these concepts to graphs, leading to key developments in distributed optimization and neural networks.
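A graph filter of this kind is typically a polynomial in the graph shift operator S (e.g., the adjacency or Laplacian matrix), y = Σ_k h_k S^k x. The following minimal sketch, with an illustrative toy graph and filter taps, shows the idea:

```python
import numpy as np

def graph_filter(S, x, h):
    """Apply a polynomial graph filter y = sum_k h[k] * S^k @ x.

    S : graph shift operator (adjacency or Laplacian matrix)
    x : graph signal, one value per node
    h : filter taps
    """
    y = np.zeros_like(x, dtype=float)
    Skx = x.astype(float)
    for hk in h:
        y += hk * Skx
        Skx = S @ Skx  # shift the signal one hop further along the graph
    return y

# Toy example: a 3-node path graph and a two-tap averaging filter.
S = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
x = np.array([1.0, 0.0, 0.0])  # impulse at node 0
y = graph_filter(S, x, h=[0.5, 0.5])
print(y)  # the impulse is smoothed onto node 0's neighbor
```

Because each application of S only mixes values between neighboring nodes, such filters can be implemented distributedly, which is exactly the link to distributed optimization emphasized in the tutorial.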

T-4: Signal and Image Processing for Art Investigation

by
Ingrid Daubechies, Pier Luigi Dragotti, Nathan Daly, Catherine Higgitt and Miguel Rodrigues
Duke University, National Gallery, University College London
The cultural heritage sector is experiencing a digital revolution driven by the growing adoption of non-invasive, non-destructive spectroscopic imaging approaches generating multi-dimensional data from entire artworks. Such approaches include ‘macro X-ray fluorescence’ (MA-XRF) scanning or hyper-spectral imaging (HSI) in the visible and infrared ranges and are highly complementary to more traditional broad-band digital imaging techniques such as X-ray radiography (XRR) or infrared reflectography (IRR).
This data, spanning both the spatial and spectral domains, holds information not only about materials at the surface of an artwork but also about sub-surface layers or features of interest otherwise invisible to the naked eye. The ability to interrogate the wealth of data yielded by these techniques can potentially provide insights into an artist’s materials, techniques and creative process; reveal the changing condition of an artwork over time and its restoration history; help inform strategies for the conservation and preservation of artworks; and, importantly, offer means by which to present artwork to the public in new ways.
However, doing this successfully also calls for sophisticated new signal and image processing tools capable of addressing the various challenges associated with the analysis, interrogation, and processing of such massive multi-dimensional datasets. These challenges derive from the fact that paintings are very complex objects where:
Materials are often present in intimate mixtures applied over multi-layered systems, so signals deriving from spectroscopic imaging techniques are highly nonlinearly mixed.
Materials also age/degrade over time, so signals collected from spectroscopic imaging techniques often cannot be compared to signals present in reference libraries.
A ‘ground-truth’ is often unavailable or limited because each painting is unique, with original materials often unknown, and materials’ aging process also unknown.
In addition, different spectroscopic imaging techniques reveal different details about artwork, so there is also a need to develop new signal and image processing tools combining different datasets in order to understand artwork.
This tutorial, which is offered by experts in applied mathematics, signal & image processing, machine learning, and heritage science, (a) reviews the state-of-the-art in signal and image processing for art investigation, (b) reviews signal and image processing challenges arising in the examination of datasets acquired on artwork, and (c) overviews emerging directions in signal processing for art investigation.

T-5: Quantum Signal Processing & Communications – a Glimpse beyond Moore’s Law

by
Lajos Hanzo, Angela Sera Cacciapuoti and Marcello Caleffi
University of Southampton; University of Naples Federico II
Moore’s law has indeed prevailed since Gordon Moore outlined his empirical rule of thumb in 1965, but following this trend, the scale of integration is set to depart from classical physics and enter nano-scale integration, where the postulates of quantum physics have to be obeyed. The quest for quantum-domain communication and processing solutions was inspired by Feynman’s revolutionary idea in 1985: particles such as photons or electrons might be relied upon for encoding, processing and delivering information. Hence, in the light of these trends, it is extremely timely to build an interdisciplinary momentum in the area of quantum signal processing and communications, where there is an abundance of open problems for a broad community to solve collaboratively. In this workshop-style interactive presentation we will address the following issues:
We commence by highlighting the nature of the quantum channel, followed by techniques of mitigating the effects of quantum decoherence using quantum codes.
Then we bridge the subject areas of large-scale search problems in wireless communications and exploit the benefits of quantum search algorithms in multi-user detection, joint channel estimation and data detection, localization, and network routing problems, for example.

T-6: Robust Data Science: Modern Tools for Detection, Clustering and Cluster Enumeration

by
Michael Fauß, Michael Muma and Abdelhak M. Zoubir
Princeton University, TU Darmstadt
With rapid developments in signal processing and data analytics, driven by technological advances towards a more intelligent networked world, there is an ever-increasing need for reliable and robust information extraction and processing. Robust statistical methods account for the fact that the postulated models for the data are fulfilled only approximately, not exactly. In contrast to classical parametric procedures, robust methods are not significantly affected by small changes in the data, such as outliers or minor model departures. In practice, many engineering applications involve measurements that are not Gaussian and that may contain corrupted measurements and outliers, which cause the data distributions to be heavy-tailed. This leads to a breakdown in performance of traditional signal processing techniques that are based on Gaussian models.
The focus of this tutorial is on recent advances in the related areas of robust detection and robust cluster analysis for unsupervised learning. This tutorial is organized into two parts. In the first part, we discuss robust detection for a given number of hypotheses. In the second part, we move to robust cluster analysis with a focus on recent advances in robust cluster enumeration.

Amazon Workshop: Creating AI driven voice experiences with Alexa Conversations

Alexa Conversations is a new deep learning-based approach for creating more natural voice experiences on Alexa. As an AI driven dialog manager, Alexa Conversations provides implicit support for context carry-over, slot over- and under-filling, and user driven corrections with less training data and back-end code than traditional skill development techniques. In this session you will learn about the science behind Alexa Conversations in a 30-minute technical deep dive, followed by a 60-minute hands-on workshop. In the workshop you will participate in a guided walk-through to create an Alexa skill, learn how to train the conversational AI to handle increasingly complex user interactions, and enhance Alexa’s spoken and visual responses to add personality. At the end of the session you will be able to speak to your skill on your Alexa enabled devices, and have the knowledge to build on what you have learned for new and innovative use cases.
Alexa users in the US can try an experience created with Alexa Conversations by saying “Alexa, open Night Out”.
Presenter Bio: Maryam Fazel-Zarandi is currently a Senior Applied Scientist at Amazon working on conversational AI for Alexa. Before joining Amazon in 2017, she was a Senior Research Scientist at Nuance Communications. Maryam received her Ph.D. in Computer Science from University of Toronto in 2013. Her research interests are in the areas of machine learning and knowledge representation and reasoning with a focus on natural language understanding and dialogue management.
Presenter Bio: Adam Hasham is a Senior Product Manager for Alexa Skills Kit (ASK) at Amazon. Before that, Adam founded a full-stack on-demand delivery technology platform called Hurrier in 2013, and sold it within 2 years to a Delivery Hero subsidiary (foodora) for its logistics technology and market footprint – generating a 2x+ return for investors (Delivery Hero subsequently IPO’d in 2017 on FSE). He holds a degree in Computer Engineering from University of Waterloo (2004), MBA from Queen’s University (2007) and CFA (2010) and has a broad range of experience from working in technology as a microchip designer, working in finance in Singapore as a Corporate Banker, heading product at foodora and wearing many hats as an entrepreneur.
Presenter Bio: Josey Sandoval is a Senior Product Manager for Alexa AI at Amazon, based in Seattle. He has spent the last three years with Alexa focused on dialog management capabilities and enabling more natural conversational experiences. Before joining Amazon in 2016, Josey led product and engineering teams for a B2B human resources technology startup, and holds a Master of Science in Electrical Engineering from the University of Washington with a focus on signal processing and signal path.

Sony Workshop: AI x Audio: From Research to Production

AI has changed the way we process and create audio, especially music. This opens new possibilities and enables new products that could not be envisioned some years ago. In this industry session, we want to give an overview of Sony’s activities in this field.
We start this session with an introduction to music source separation. Sony has been active in AI-based source separation since 2013, and our systems have repeatedly won international evaluation campaigns. In recent years, we have successfully integrated this technology into a number of products, which we will also introduce.
Recently, INRIA released, in collaboration with Sony, open-unmix, an open-source implementation of our music source separation. open-unmix is available for NNabla as well as PyTorch.
Finally, in this first part, we will briefly introduce the NNabla open-source project. NNabla is Sony’s Deep Learning Library, which we are actively developing worldwide. We will give a brief overview of its main features and compare it to other popular DL frameworks. We will highlight its focus on network compression and speed, making it a good choice for audio and music product development and prototyping.
In the second part of the session, we will present our activities on music creation, where we envision technologies that could drive music for the years to come. Through deep learning-based approaches, we develop tools that enhance a composer’s creativity and augment their capabilities. In our talk, we briefly present our research activities, including details about the underlying machine learning models. For these tools to be relevant, we rely on close collaboration with artists from Sony Music Entertainment, which can sometimes be tricky. Indeed, we often experience a gap between scientific research and the music industry on many levels, such as timeliness or profitability. Hence, the presentation will also address our efforts to bridge that gap.
Presenter bio: Mototsugu Abe is a Senior General Manager and Chief Distinguished Researcher at the R&D Center of Sony Corporation. As a researcher, he specializes in audio signal processing, intelligent sensing and pattern recognition. As a manager, he supervises fundamental technology R&D in the information technology field, including video, image, audio, speech, natural language, communication, RF, robotics, sensing and machine learning technologies. He received a Ph.D. in engineering from the University of Tokyo in 1999 and has been with Sony Corporation since then. From 2003 to 2004, he was a visiting scholar at Stanford University, where he worked with Prof. Julius O. Smith III.
Presenter bio: Marc Ferras received the B.S. degree in computer science, the M.S. degree in telecommunications, and the European Master in Language and Speech from the Universitat Politecnica de Catalunya (UPC), Spain, in 1999 and 2005, respectively. He received his Ph.D. degree from Université Paris-Sud XI, France, in 2009, researching the use of automatic speech recognition in speaker recognition tasks. Since then, he has held two post-doc positions, one at the Tokyo Institute of Technology, Japan (2009-2011), and one at the Idiap Research Institute, Switzerland (2011-2016), both focused on automatic speech and speaker recognition. He is currently working at Sony’s Stuttgart Technology Center as a Senior Engineer on speech recognition technology.
Presenter bio: Stefan Lattner is a research associate at Sony CSL Paris, where he works on transformation and invariance learning with artificial neural networks. Using this paradigm, he targets rhythm generation (i.e., DrumNet) and is also involved in music information retrieval, audio generation, and recommendation. He obtained his doctorate in the area of music structure modeling from the Johannes Kepler University in Linz, Austria.
Presenter bio: Cyran Aouameur is an assistant researcher at Sony CSL. A graduate of the Ircam-organized ATIAM Master’s program, he joined CSL two years ago. Passionate about urban music since childhood, he has been focusing on developing AI-based solutions that let artists quickly design unique drum sounds and rhythms, which he considers to be elements of top importance. He is now partly responsible for communication with the artists, seeking to get the research and music industry worlds to understand each other.

Young Professional Development Workshop

Monday, May 4, 09:30 – 13:00
Navigating social media as a scientist
Facebook, Twitter or LinkedIn, by now, are no longer a new thing. Also for scientists, social media platforms have become an integral networking tool to connect globally, exchange research ideas and advance careers. But, what’s a proper way for scientists to make use of these platforms?
In this workshop part, you will gain a better understanding of the current state of digital science communication. In detail, you will learn how scientists may integrate social media into their activities in a helpful and productive way. The workshop advocates reflective media usage that keeps a close eye on how and when it is advisable for you to “go digital”. This workshop provides…
Professional assistance in clarifying your objectives for engaging with social media. Why should I consider social media usage? What are my goals?
Help figuring out which of the many media platforms is the right one for you.
Assistance on how social media may help you explore your career options (e.g. after a PhD or postdoc).
Help taking first steps towards brushing up your personal professional online profiles.
Speaker: Peter Kronenberg from NaturalScience.Careers
https://naturalscience.careers/

Tutorial T-7: Signal Processing for MIMO Communications Beyond 5G

by
Emil Björnson and Jiayi Zhang
Linköping University, Beijing Jiaotong University
Signal processing is at the core of the 5G communication technology. The use of large arrays with 64 or more antennas is becoming mainstream and the commercial deployment started in 2019. This technology is known as Massive MIMO (multiple-input multiple-output) and was viewed as science fiction just ten years ago, but with the combination of advanced signal processing and innovative protocols, it is now a reality. Just as the seminal papers on Massive MIMO were published ten years ago, this is likely the time when the new technology components for 6G will be identified. In this tutorial, we will consider two such promising research directions, which might be utilized in the conventional cellular spectrum as well as in mmWave or sub-THz bands.
The first new direction is Cell-free Massive MIMO, which refers to a large-scale distributed antenna system made practical by innovative signal processing and radio resource allocation algorithms. Unlike in cellular communications, each user is served by all of the antennas or by a user-specific subset of them. The system is designed to achieve high spectral and energy efficiency, but under the unusual constraint of being scalable, from a computational and cost perspective, to enable large network deployments. The main goal is to achieve uniformly good and reliable service via excessive macro-diversity, as compared to the micro-diversity achieved by conventional Massive MIMO with large arrays. We will cover the basic theory as well as recent algorithmic and implementation developments.
The second new direction is intelligent reflecting surfaces, which are also known as software-controlled meta-surfaces and reconfigurable intelligent surfaces. These are semi-passive surfaces consisting of an array of meta-atoms with reconfigurable properties that can be controlled to reflect an incoming wave in a controllable way. While only the transmitter and receiver can be optimized in conventional wireless communication systems, the addition of intelligent reflecting surfaces also enables optimization of the channels (i.e., the creation of smart radio environments). We will derive the propagation model from physics and tackle difficult issues such as channel estimation and real-time operation.

Tutorial T-8: Adversarial Robustness of Deep Learning Models: Attack, Defense and Verification

by
Pin-Yu Chen
IBM Research, Yorktown Heights

Despite achieving high accuracy in a variety of machine learning tasks, deep learning models built upon neural networks have recently been shown to lack adversarial robustness. The decision making of well-trained deep learning models can be easily falsified and manipulated, resulting in ever-increasing concerns in safety-critical and security-sensitive applications requiring certified robustness and guaranteed reliability.

This tutorial will provide an overview of recent advances in the research of adversarial robustness, featuring both comprehensive research topics and technical depth. We will cover three fundamental pillars in adversarial robustness: attack, defense and verification. Attack refers to efficient generation of adversarial examples for robustness assessment under different attack assumptions (e.g., white-box or black-box attacks). Defense refers to adversary detection and robust training algorithms to enhance model robustness. Verification refers to attack-agnostic metrics and certification algorithms for proper evaluation of adversarial robustness and standardization. For each pillar, we will emphasize the tight connection between signal processing and the research in adversarial robustness, ranging from fundamental techniques such as first-order and zero-order optimization, minimax optimization, geometric analysis, model compression, data filtering and quantization, subspace analysis, active sampling, frequency component analysis to specific applications such as computer vision, automatic speech recognition, natural language processing and data regression.
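As a minimal illustration of the attack pillar, the fast gradient sign method (FGSM) perturbs an input along the sign of the loss gradient. The toy logistic-regression "model" and all numbers below are made up for illustration; real attacks target deep networks:

```python
import numpy as np

# A toy linear classifier: p(y=1|x) = sigmoid(w @ x + b).
w = np.array([2.0, -1.0])
b = 0.0
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 1.0])  # clean input, predicted as class 1 (p > 0.5)
y = 1.0                   # true label

# FGSM: for cross-entropy loss, d(loss)/dx = (sigmoid(w@x+b) - y) * w,
# so step each coordinate by eps in the direction of the gradient's sign.
grad_x = (sigmoid(w @ x + b) - y) * w
eps = 0.6
x_adv = x + eps * np.sign(grad_x)

# The small, bounded perturbation flips the predicted class.
print(sigmoid(w @ x + b), sigmoid(w @ x_adv + b))
```

The same first-order-optimization viewpoint underlies the white-box attacks surveyed in the tutorial, while black-box attacks replace the exact gradient with zero-order estimates.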

This tutorial aims to serve as a short lecture for researchers and students to access the emerging field of adversarial robustness from the viewpoint of signal processing.

Tutorial T-9: Graph Neural Networks

by
Alejandro Ribeiro and Fernando Gama
University of Pennsylvania

Neural networks have achieved resounding success in a variety of learning tasks. Although sometimes overlooked, this success has not been uniform across all learning problems, nor has it been achieved by generic architectures. The most remarkable accomplishments are in the processing of time signals and images, and have been attained by Convolutional Neural Networks (CNNs). This is because convolutions successfully exploit the regular structure of Euclidean space and enable learning in high-dimensional spaces.

In this tutorial we will develop the novel concept of Graph Neural Networks (GNNs), which intend to extend the success of CNNs to the processing of high dimensional signals in non-Euclidean domains. They do so by leveraging possibly irregular signal structures described by graphs. The following topics will be covered:

Graph Convolutions and GNN Architectures. The key concept enabling the definition of GNNs is the graph convolutional filter introduced in the graph signal processing (GSP) literature. GNN architectures compose graph filters with pointwise nonlinearities. Illustrative examples on authorship attribution and recommendation systems will be covered.
Fundamental Properties of GNNs. Graph filters and GNNs are suitable architectures to process signals on graphs because of their permutation equivariance. GNNs tend to work better than graph filters because they are Lipschitz stable to deformations of the graph that describes their structure. This is a property that regular graph filters can’t have.
Distributed Control of Multiagent Systems. An exciting application domain for GNNs is the distributed control of large scale multiagent systems. Applications to the control of robot swarms and wireless communication networks will be covered.
Attendees to this tutorial will be prepared to tackle research on the practice and theory of GNNs. Coding examples will be provided throughout.
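To make the first topic concrete: a graph convolutional filter applies a polynomial of a graph shift operator S to a signal x, and a GNN layer composes such a filter with a pointwise nonlinearity. The sketch below uses an illustrative 4-node cycle graph and made-up filter taps; the permutation-equivariance property discussed above can be checked directly by relabeling the nodes:

```python
import numpy as np

def graph_filter(S, x, h):
    """Graph convolution: y = sum_k h[k] * S^k x, with S a graph shift operator."""
    y = np.zeros_like(x, dtype=float)
    xk = x.astype(float)      # S^0 x
    for hk in h:
        y += hk * xk
        xk = S @ xk           # advance to S^(k+1) x
    return y

def gnn_layer(S, x, h):
    """One GNN layer: graph filter followed by a pointwise nonlinearity (ReLU)."""
    return np.maximum(graph_filter(S, x, h), 0.0)

# Illustrative 4-node cycle graph; adjacency matrix used as the shift operator.
S = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
x = np.array([1.0, 0.0, 0.0, 0.0])   # impulse at node 0
h = [0.5, 0.25, 0.125]               # illustrative filter taps

print(gnn_layer(S, x, h))
```

Because (PSPᵀ)ᵏPx = PSᵏx for any permutation matrix P, relabeling the nodes permutes the layer output in exactly the same way, which is the equivariance property that makes these architectures suitable for graph signals.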

Tutorial T-10: Biomedical Image Reconstruction—From Foundations to Deep Neural Networks

by
Michaël Unser and Pol del Aguila Pla
CIBM, EPFL

Biomedical imaging plays a key role in medicine and biology. Its range of applications and its impact in research and medical practice have increased steadily during the past 4 decades. Part of the astonishing improvements in image quality and resolution is due to the use of increasingly sophisticated signal-processing techniques. This, in itself, would justify the tutorial. Nonetheless, the field is now transitioning towards the deep-learning era, where disruptive improvements and lack of theoretical background go hand-in-hand. To harness the power of these new techniques without suffering from their pitfalls, a structured understanding of the field is fundamental.

We start the tutorial by presenting the building blocks of an image-reconstruction problem, from the underlying image that lives in function spaces to its observed discrete measurements. Most importantly, we detail the small collection of forward and sampling operators that allow one to model most biomedical imaging problems, including magnetic resonance imaging, bright-field microscopy, structured-illumination microscopy, x-ray computed tomography, and optical diffraction tomography. This leads up to our exposition of 1st-generation methods (e.g., filtered back-projection, Tikhonov regularization), the regimes in which they are most attractive, and how to implement them efficiently.
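As a sketch of the 1st-generation machinery, Tikhonov regularization of a linear forward model H admits the closed-form solution x̂ = (HᵀH + λI)⁻¹Hᵀy. The tiny NumPy illustration below uses a made-up 3×3 "blurring" operator, noise level, and λ, chosen only to show the mechanics:

```python
import numpy as np

def tikhonov(H, y, lam):
    """Tikhonov-regularized reconstruction: argmin_x ||Hx - y||^2 + lam*||x||^2."""
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n), H.T @ y)

# Illustrative blurring forward operator and a noisy measurement.
rng = np.random.default_rng(0)
H = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.5],
              [0.0, 0.5, 1.0]])
x_true = np.array([1.0, 0.0, -1.0])
y = H @ x_true + 0.01 * rng.standard_normal(3)

x_hat = tikhonov(H, y, lam=1e-2)
print(np.round(x_hat, 2))
```

In practice the normal equations are solved iteratively (e.g., conjugate gradients) rather than by a dense solve, since realistic forward operators are far too large to form explicitly.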

We then transition to 2nd-generation methods (non-quadratic regularization, sparsity, and compressive sensing) and show how advanced signal processing allows image reconstruction with shorter acquisition times, less invasive procedures, and lower radioactive and irradiation dosage. We expose the foundations of these methods (results in compressed-sensing recovery, representer theorems, infinitely divisible distributions) and the most useful algorithms in imaging (proximal operators, projected gradient descent, the alternating-direction method of multipliers), again exemplifying their efficient implementation.

Finally, we present the state of the art in 3rd-generation methods (deep-learning reconstruction of images), categorizing them using the building-block terminology introduced throughout the tutorial. In this manner, we emphasize the links to 1st- and 2nd-generation methods in order to provide intuition and guidelines to devise and understand novel 3rd-generation methods. Furthermore, we state the benefits of each proposal and give cautionary examples of the dangers of overreliance on training data.

Tutorial T-11: Game theoretic learning and applications to spectrum collaboration

by
Amir Leshem and Kobi Cohen
Bar Ilan University, Ben-Gurion University of the Negev

Recent years have seen significant advances in many signal processing tasks based on machine learning techniques. Deep learning as well as reinforcement learning techniques have shown tremendous value for classification, noise reduction, and many other tasks. Recent advances in transferring the learning process to the edge of the network — in order to protect the privacy of users' data and to exploit the computational resources available at mobile devices — have stimulated the development of techniques such as federated learning. In contrast, learning over networks of selfish agents is much less understood and holds the potential for the next leap in learning techniques. To allow distributed learning, in both the federated and distributed contexts, efficient communication techniques are required to save both energy and bandwidth. The tutorial will present recent results on distributed learning under communication constraints. We will survey basic protocols which can be utilized to achieve efficient learning, and then apply them to multiple examples of collaborative spectrum access as well as other resource-sharing problems.
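A minimal instance of such a learning protocol is a single user treating channel selection as a stochastic multi-armed bandit. The sketch below uses epsilon-greedy learning over hypothetical channel success probabilities; the probabilities, horizon, and epsilon are made up, and real spectrum-collaboration settings involve multiple interacting (possibly selfish) agents rather than one learner:

```python
import random

def eps_greedy_spectrum_access(success_prob, n_rounds=5000, eps=0.1, seed=0):
    """Epsilon-greedy channel selection: explore with probability eps,
    otherwise exploit the channel with the highest empirical success rate."""
    rng = random.Random(seed)
    n = len(success_prob)
    counts = [0] * n
    values = [0.0] * n            # running mean reward per channel
    for _ in range(n_rounds):
        if rng.random() < eps:
            ch = rng.randrange(n)                        # explore
        else:
            ch = max(range(n), key=lambda i: values[i])  # exploit
        reward = 1.0 if rng.random() < success_prob[ch] else 0.0
        counts[ch] += 1
        values[ch] += (reward - values[ch]) / counts[ch]
    return counts, values

# Illustrative: channel 2 is the least congested (highest success probability).
counts, values = eps_greedy_spectrum_access([0.3, 0.5, 0.9])
print(max(range(3), key=lambda i: counts[i]))  # most-used channel
```

With enough rounds the learner concentrates its transmissions on the best channel while the constant exploration rate keeps its estimates of the others current — the basic exploration/exploitation trade-off underlying the game-theoretic protocols surveyed in the tutorial.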

Tutorial T-12: Information Extraction in Joint Millimeter-Wave State Sensing and Communications: Fundamentals to Applications

by
Kumar Vijay Mishra, Bhavani Shankar and Mari Kobayashi
US Army Research Laboratory; University of Luxembourg;
TU Munich

Extreme crowding of the electromagnetic spectrum in recent years has led to complex challenges in designing sensing and communications systems. The advent of novel technologies – such as drone-based customer services, autonomous driving, radio-frequency identification, and weather monitoring – implies that sensors such as radars are now deployed in urban environments and operate in bands that were earlier reserved for communications services. Similarly, with the rapid surge in mobile network operators, there is a growing concern that the amount of mobile data traffic poses a formidable challenge to realizing future wireless networks. Both radar and communications systems need wide bandwidth to provide a designated quality of service, resulting in competing interests in exploiting the spectrum. Hence, sharing the spectral and hardware resources of communications and radar is imperative for efficient spectrum utilization.

Specifically, in the automotive sector, state sensing and communication are two major tasks enabling future high-mobility applications such as Vehicle-to-Everything (V2X), where a node must continuously track its dynamically changing environment and react accordingly by exchanging information with others. This field has, therefore, witnessed concerted and intense efforts towards realizing these joint radar-communications (JRC) systems. Most modern automotive JRC systems are envisaged to operate at millimeter-wave (mm-Wave); compared with centimeter-wave (cm-Wave) JRC, this brings a new set of challenges and opportunities for system engineers. This band is characterized by severe penetration losses, short coherence times, and the availability of wide bandwidth. While wide bandwidth is useful for attaining high vehicular communications data rates and high-resolution automotive radar, the losses must be compensated by using a large number of antennas at the transmitter and receiver. There has, therefore, been a recent surge in research on joint multiple-input multiple-output (MIMO)-Radar-MIMO-Communications (MRMC) systems, where the antenna positions of radar and communications are shared with each other. Both systems may share information with each other to benefit from an increased number of design degrees-of-freedom (DoFs).

This tutorial takes a focused view on mm-Wave JRC, spanning the entire breadth of this field. After attending the tutorial, participants will be able to understand:

Current challenges and design criteria associated with mm-Wave JRC.
Information theoretic modeling and fundamental limits of joint sensing-communications.
Overview of communication and radar systems, including waveform design, data/target detection-estimation-tracking theoretic criteria, and receiver processing algorithms for mm-Wave JRC.
Hardware design aspects of example JRC designs.
Emerging research challenges and solutions in MRMC.

Huawei Workshop: Shaping the vertical industry in 2025

Monday, May 4, 14:30 – 18:00
Shaping the vertical industry in 2025
This is an era of digitalization, in which industrial factories are being transformed with 5G technologies to relieve people from hard labor. Emerging technologies include, but are not limited to, URLLC, V2X, high-accuracy positioning, intelligent sensing, massive IoT, and low-power IoT. It is predicted that 2025 will be a suitable time to widely commercialize the vertical industry. However, challenges remain in reaching KPIs such as low latency, high reliability, high-accuracy positioning, and massive connectivity. This workshop discusses the promising issues in the vertical industry.

Presenter Bio: Dr. Peiying Zhu (Chair) is a Huawei Fellow. She is currently leading 5G wireless system research in Huawei. The focus of her research is advanced wireless access technologies, with more than 150 granted patents. She has been regularly giving talks and panel discussions on 5G vision and enabling technologies. She served as a guest editor for the IEEE Signal Processing Magazine special issue on the 5G revolution and co-chaired various 5G workshops. She is actively involved in IEEE 802 and 3GPP standards development. She is currently a WiFi Alliance Board member. Prior to joining Huawei in 2009, Peiying was a Nortel Fellow and Director of Advanced Wireless Access Technology in the Nortel Wireless Technology Lab. She led the team that pioneered research and prototyping on MIMO-OFDM and multi-hop relay. Many of the technologies developed by the team have been adopted into LTE standards and 4G products.

Presenter Bio: Dr. Thomas Haustein (Panelist) received the Dr.-Ing. (Ph.D.) degree in mobile communications from the University of Technology Berlin, Germany, in 2006. In 1997, he was with the Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute (HHI), Berlin, where he was involved in wireless infrared systems and radio communications with multiple antennas and OFDM. He was also involved in real-time algorithms for baseband processing and advanced multiuser resource allocation. From 2006 to 2008, he was with Nokia Siemens Networks, where he conducted research on 4G. Since 2009, he has been the Head of the Wireless Communications Department, Fraunhofer HHI, and is currently involved in research on 5G and industrial wireless. He has led several national and European funded projects on the topics of cognitive radio and millimeter-wave technology, and has been active in several H2020 5GPPP projects. He has served as an Academic Advisor to NGMN since 2012 and has contributed to 3GPP standardization since 2015.

Presenter Bio: Prof. Joerg Widmer (Panelist) is Research Professor and Research Director of IMDEA Networks in Madrid, Spain. Before that, he held positions at DOCOMO Euro-Labs in Munich, Germany, and EPFL, Switzerland. His research focuses on wireless networks, ranging from extremely high frequency millimeter-wave communication and MAC-layer design to mobile network architectures. Joerg Widmer has authored more than 150 conference and journal papers and three IETF RFCs, and holds 13 patents. He was awarded an ERC Consolidator Grant, the Friedrich Wilhelm Bessel Research Award of the Alexander von Humboldt Foundation, a Mercator Fellowship of the German Research Foundation, and a Spanish Ramón y Cajal grant, as well as eight best paper awards. He is an IEEE Fellow and a Distinguished Member of the ACM.

Presenter Bio: Prof. Petar Popovski (panelist) is a Professor in Connectivity at Aalborg University, Denmark. He received his Dipl.-Ing./ Mag.-Ing. in communication engineering from Sts. Cyril and Methodius University in Skopje, R. of Macedonia, and his Ph.D. from Aalborg University. He is a Fellow of IEEE and featured in the list of Highly Cited Researchers 2018, compiled by Web of Science. He received an ERC Consolidator Grant (2015), the Danish Elite Researcher award (2016), IEEE Fred W. Ellersick prize (2016) and IEEE Stephen O. Rice prize (2018). He is currently an Area Editor for IEEE Transactions on Wireless Communications and a Steering Board member of IEEE SmartGridComm. He served as a General Chair for IEEE SmartGridComm 2018 and General Chair for IEEE Communication Theory Workshop 2019. He co-founded RESEIWE A/S, a company delivering ultra-reliable wireless solutions. His research interests are in the area of communication theory, with focus on wireless communication and networks.

Presenter Bio: Eichinger Josef (Panelist) joined Huawei Technologies in 2013 to strengthen the 5G research team in Munich. He started his professional career as a technical expert in the field of industrial energy and electronic systems. After his studies, he joined Siemens AG in 1994, working on the development of high-frequency radar systems and optical networks, and as a researcher on radio technologies such as HSPA and LTE. He moved to Nokia Siemens Networks in 2007 as LTE Product Manager and was head of the LTE-Advanced FastTrack Programs from 2010 to the end of 2012, pushing forward LTE-A. He currently leads research on 5G-enabled industrial communication at the Huawei Munich Research Center, focusing on 5G for Industry 4.0 and vehicle-to-vehicle communication. Complementary to his research and standardization work, he is also responsible for proving new concepts through trials and live experiments, e.g., robot control in the cloud, robot-as-a-service, and tele-operated driving. Since April 2018 he has also been a member of the 5G-ACIA steering board, leading the Huawei delegation.

Presenter Bio: Dr. Mehdi Bennis is an Associate Professor at the Centre for Wireless Communications, University of Oulu, Finland, an Academy of Finland Research Fellow, and head of the Intelligent Connectivity and Networks/Systems group (ICON). His main research interests are in radio resource management, heterogeneous networks, game theory, and machine learning in 5G networks and beyond. He has co-authored one book and published more than 200 research papers in international conferences, journals, and book chapters. He has received several prestigious awards, including the 2015 Fred W. Ellersick Prize from the IEEE Communications Society, the 2016 Best Tutorial Prize from the IEEE Communications Society, the 2017 EURASIP Best Paper Award for the Journal of Wireless Communications and Networking, and the all-University of Oulu award for research. In 2019, Dr. Bennis received the IEEE ComSoc Radio Communications Committee Early Achievement Award.

MathWorks Workshop: Developing Next Generation AI-based Signal Processing Systems using MATLAB & Simulink

Monday, May 4, 14:30 – 18:00
Signal Processing Meets Deep Learning in MATLAB – From Getting Started to Developing Real-World Applications
The adoption of deep learning across a wide range of signal processing applications has been attracting an increasing level of attention over the last few years. Deep learning for real-world signal processing systems has accentuated the need for application-specific tools and expertise for creating, labelling, augmenting and processing the vast amounts of signal data required to train and evaluate the learning models.

Using MATLAB code and new features, we will start from the basics of designing and training a network. We will then move onto more advanced topics, including data annotation, advanced feature extraction, training acceleration on GPUs and GPU clouds, and real-time implementation of deep networks on embedded devices. While focusing on a practical speech-based example, we will also discuss applications to other types of signals, such as Communications, Radar, and Medical Devices.

Presenter Bio: Jihad Ibrahim is a principal software developer and product lead of Audio Toolbox at MathWorks. He joined MathWorks in 2006 and has contributed to the development of Signal Toolbox, DSP System Toolbox, Communications Toolbox, and Audio Toolbox. He received his PhD in Electrical Engineering from Virginia Tech.

Presenter Bio: Gabriele Bunkheila is a senior product manager at MathWorks, where he coordinates the strategy of MATLAB toolboxes for audio and DSP. After joining MathWorks in 2008, he worked as a signal processing application engineer for several years, supporting MATLAB and Simulink users across industries from algorithm design to real-time implementations. Before MathWorks, he held a number of research and development positions, and he was a lecturer of sound theory and technologies at the national film school of Rome. He has a master’s degree in physics and a Ph.D. in communications engineering.

Plenary: Deep Representation Learning (Yoshua Bengio)

Tuesday, 5 May
10:00 - 11:00
Deep Representation Learning
Abstract: A crucial ingredient of deep learning is that of learning representations, more specifically with the objective to discover higher-level representations which capture and disentangle explanatory factors. This is a very ambitious goal, and current state-of-the-art techniques still fall short, often capturing mostly superficial features of the data, which leaves them vulnerable to adversarial attacks and with insufficient out-of-distribution robustness. This talk will review these original objectives, supervised and unsupervised approaches, and outline research ideas towards better representation learning.

Yoshua Bengio

Yoshua Bengio is recognized as one of the world's artificial intelligence leaders and a pioneer of deep learning. A professor at the Université de Montréal since 1993, he received the 2018 A.M. Turing Award together with Geoff Hinton and Yann LeCun, widely considered the Nobel Prize of computing. Holder of the Canada Research Chair in Statistical Learning Algorithms, he is also the founder and scientific director of Mila, the Quebec Institute of Artificial Intelligence, which is the world's largest university-based research group in deep learning. In 2018, he collected the largest number of new citations in the world for a computer scientist. He earned the prestigious Killam Prize from the Canada Council for the Arts and the Marie-Victorin Quebec Prize. Concerned about the social impact of AI, he actively contributed to the Montreal Declaration for the Responsible Development of Artificial Intelligence.

ISS 1.1: Accelerating IoT for Industries: How?

Internet of Things, connectivity, and data analytics are the fundamental enablers for industrial digitalisation. Deploying and operating an IoT system at scale is no trivial task, as its success will depend on a well-oiled ecosystem, favourable rules and regulations, a solid business model, and the presence of a pervasive network that can meet the desired quality of service. This talk reflects on how mobile network operators are working towards accelerating the adoption of IoT in vertical industries, providing an overview of key standardization and open-source activities, new industry bodies (e.g., the 5G Automotive Association and the 5G Alliance for Connected Industries and Automation), coverage and performance trade-offs, and key issues related to spectrum usage.
Speaker: Dr Ilaria Thibault, IoT Strategy Manager, Vodafone Business

ISS 1.2: DataNet – Doing for Data What the Internet Did for Communications

This talk presents a new approach to national infrastructure called DataNet. Building on successful examples such as X-Road, DataNet provides a unique approach to data as a national infrastructure. Through providing Identity as a Utility, Data Exchange, and a Data Sovereign Wealth Fund, DataNet reduces the required investment in a national infrastructure for data, provides a unique economic model for the delivery of such services, and protects end-user and citizen privacy. Designed to be led by the national government and delivered by a combination of the private and public sector in an open-architecture format, DataNet provides a flexible, modular, and adaptable approach to data in the emerging digital era.
Speaker: Dr Cathy Mulligan, CTO, GovTech Labs

ISS 1.3: Enabling Industrial IoT with 5G and Beyond

5G is seen as a key enabler for achieving flexible implementation of industrial IoT (IIoT), where rigid wired connections can be replaced with low-latency and high-reliability wireless communications. Such flexible implementation paves the way for scalable operations, where the production can be more easily altered and scaled up, e.g., by adding new machines to the production cells of a production line. Nevertheless, the current industrial communications are supported by a diverse set of communication protocols addressing the associated application requirements. Moreover, new use cases, such as distributed machine controllers providing the needed flexibility, should be supported by the 5G system (5GS) design. In this talk, we will outline the key requirements for IIoT and the associated challenges for 5G Release 17 and beyond. We will further elaborate on the architectural framework for supporting distributed machine controllers.
Speaker: Dr. Malte Schellmann, Principal Researcher, Huawei Tech. GRC

ISS 3.1: Intelligent ear-level devices for hearing enhancement and health and wellness monitoring

With the resurgence of artificial intelligence (AI) and machine learning, sensor miniaturization, and increased wireless connectivity, ear-level hearing devices are going through a major revolution, transforming themselves from traditional hearing aids into modern hearing-enhancement and health and wellness monitoring devices. For the aging user population of hearing aids, sound quality and speech understanding in challenging listening environments remain unsatisfactory. To improve quality of life and reduce health-care costs, it is highly desirable for the devices to provide effective health and wellness monitoring on a continuous basis in everyday life. Finally, as device functionality becomes more complex and dexterity is a major challenge for our user population, easy and intuitive user interaction with the devices is becoming increasingly important. In this talk, we will present examples of such transformation in the areas of hearing enhancement, health and wellness monitoring, and user experience. In the process, we will highlight how AI and machine learning, miniaturized sensors, and wireless connectivity are enabling and accelerating the transformation. In addition, we will discuss practical challenges for the transformation in the areas of power consumption, non-volatile and volatile storage, audio latency, and wireless reliability. Finally, we will provide an outlook on future directions and opportunities for intelligent ear-level devices.
Speaker: Dr. Tao Zhang, Ph.D., Director of Algorithms , Starkey Hearing Technologies, USA

ISS 3.2: Digitization of Urban Sound in a Smart Nation

In this digital era, sensing and processing are being integrated into IoT devices that can be easily and economically deployed in our urban environment. In this talk, the speaker will describe some of the digital sound and active noise mitigation technologies developed in his lab, along with some pilot deployments in the urban environment. In order to achieve a holistic understanding of our urban environment, we must rely on intelligent sound sensing that can operate 24/7 and be deployed widely under different environmental conditions. These intelligent sound sensors also serve as digital ears that complement and activate the digital eyes of CCTV cameras. Comprehensive, large-scale aural sound data allows public agencies to better formulate complete and accurate sound-mitigation policies. Sound pressure level (SPL) readings have been the de facto standard for quantifying our noise environment; however, SPL alone cannot accurately indicate how humans actually perceive noise, or whether they like or dislike a sound even at the same SPL. The latest ISO standards (i.e., ISO 12913-1:2014, ISO 12913-2:2018) have been moving towards measurement based on how humans perceive sound. With the advent of powerful and low-cost embedded processors, analog-to-digital converters, and acoustic sensors, we are now seeing widespread usage of digital active noise control (ANC) technologies in consumer products, such as hearables, and in automobiles. In this talk, the speaker will also showcase the latest work in extending active noise control to larger regions of control, such as open windows and openings of noise sources. Digitization plays a key role in advancing the art of ANC to incorporate artificial intelligence to select the most annoying noise to cancel, and provides ways to further mitigate noise through perceptual-based sound augmentation.
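The SPL readings mentioned above are computed from the RMS sound pressure relative to the 20 µPa hearing-threshold reference; a minimal sketch (the 1 kHz calibration-tone samples below are illustrative):

```python
import math

P_REF = 20e-6  # reference pressure: 20 micropascals

def spl_db(pressure_samples):
    """Sound pressure level in dB: 20*log10(p_rms / p_ref)."""
    rms = math.sqrt(sum(p * p for p in pressure_samples) / len(pressure_samples))
    return 20.0 * math.log10(rms / P_REF)

# A sine wave with 1 Pa RMS pressure corresponds to about 94 dB SPL,
# the level produced by a common acoustic calibrator.
samples = [math.sqrt(2) * math.sin(2 * math.pi * 1000 * t / 48000)
           for t in range(48000)]
print(round(spl_db(samples)))
```

Perception-oriented measures such as those in ISO 12913 go beyond this single number by adding frequency weighting and psychoacoustic descriptors, which is precisely the gap the talk highlights.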
在這個數字時代,傳感和處理正被集成到物聯網設備中,這些設備可以方便、經濟地部署在我們的城市環境中。在本次演講中,演講者將介紹其實驗室開發的一些數字聲音和主動降噪技術,以及在城市環境中的一些試驗性部署。爲了全面瞭解我們的城市環境,我們必須依靠能夠在不同環境條件下全天候運行和廣泛部署的智能聲音傳感。這些智能聲音傳感器還充當數字耳朵,以補充和激活閉路電視攝像機的數字眼睛。通過擁有全面而龐大的聲音數據,公共機構可以更好地制定完整而準確的噪聲緩解政策。聲壓級(SPL)讀數已成爲量化噪聲環境的事實標準;然而,單憑SPL無法準確反映人類如何實際感知噪聲:即使在相同的SPL下,他們是喜歡還是不喜歡這個聲音。最新的ISO標準(即ISO 12913-1:2014、ISO 12913-2:2018)已經朝着基於人類感知聲音的測量方向發展。隨着功能強大、成本低廉的嵌入式處理器、模數轉換器和聲學傳感器的出現,數字有源噪聲控制(ANC)技術已在消費品(如可聽設備和汽車)中得到廣泛應用。在本次演講中,演講者還將展示其實驗室在將有源噪聲控制擴展到更大控制區域方面的最新工作,例如打開的窗戶和噪聲源的開口處。數字化在推進ANC技術中起着關鍵作用:它使人工智能能夠選擇最令人討厭的噪聲成分加以抵消,並通過基於感知的聲音增強提供進一步降噪的途徑。
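As a quick illustration of the SPL measure discussed above, the sketch below computes SPL from raw pressure samples (a minimal illustration; the 20 µPa reference is the standard threshold-of-hearing value, and the example tone is invented, not taken from the talk):

```python
import numpy as np

P_REF = 20e-6  # reference pressure in pascals (20 µPa, threshold of hearing)

def spl_db(pressure_samples):
    """Root-mean-square sound pressure level in dB re 20 µPa."""
    p_rms = np.sqrt(np.mean(np.square(pressure_samples)))
    return 20.0 * np.log10(p_rms / P_REF)

# A 1 kHz tone with 0.02 Pa amplitude: p_rms = 0.02/sqrt(2) ≈ 0.0141 Pa
t = np.linspace(0, 1, 48000, endpoint=False)
tone = 0.02 * np.sin(2 * np.pi * 1000 * t)
print(round(spl_db(tone)))  # → 57
```

As the abstract notes, two signals with identical SPL can still be perceived very differently, which is exactly the gap the ISO 12913 soundscape standards aim to address.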
Speaker: Dr. Woon-Seng Gan, Director of the Smart Nation Lab at Nanyang Technological University, Singapore
演講人:新加坡南洋理工大學智能國家實驗室主任吳森幹博士

ISS 3.3: Mechanical Noise Suppression: Debutant of phase in signal enhancement after 30 years of silence

ISS 3.3:機械噪聲抑制:沉默30年後相位在信號增強中首次登場
This talk presents challenges, solutions, and applications of mechanical noise suppression in commercial products. The topic has grown in importance with the spread of consumer products that process environmental signals in addition to human speech. Three typical classes of mechanical noise, with small, medium, and large signal power, are covered, represented respectively by feature phones and camcorders, digital cameras, and standard and tablet PCs. Small-power mechanical noise is suppressed by continuous spectral template subtraction using a noise template dictionary. Medium-power mechanical noise is suppressed in a similar manner, but only when its presence is signaled by the parent system, such as the digital camera. When the power is large, explicit detection of the mechanical noise based on phase information determines the suppression timing. In all three scenarios, the phase of the input noisy signal is randomized in frequency bins where noise is dominant, rendering the residual noise inaudible. Phase had been left unaltered in the 30 years since Lim's work; these suppression algorithms thus opened the door to a new signal enhancement paradigm.
本演講介紹機械噪聲抑制在商業產品中的挑戰、解決方案和應用。隨着除人類語音外還處理環境信號的消費品的普及,這一課題變得越來越重要。演講涵蓋小、中、大三種典型信號功率的機械噪聲,分別以功能手機和攝像機、數碼相機、標準和平板電腦爲代表。小功率機械噪聲採用帶噪聲模板字典的連續譜模板減法進行抑制。中等功率機械噪聲僅在數碼相機等父系統通知其存在時才以類似方式抑制。當功率較大時,基於相位信息的機械噪聲顯式檢測決定抑制時機。在這三種情況下,在噪聲占主導地位的頻率單元中隨機化輸入含噪信號的相位,使殘餘噪聲聽不見。在Lim之後的30年裏,相位一直未被改變,因此這些抑制算法爲新的信號增強範式打開了大門。

From Speech AI to Finance AI and Back

從語音人工智能到金融人工智能,再回歸語音人工智能
Tuesday, May 5, 16:30 – 17:30
5月5日,星期二,16:30–17:30
A brief review will first be provided on how deep learning has disrupted the speech recognition and language processing industries since 2009. Connections will then be drawn between the techniques (deep learning or otherwise) for modeling speech and language and those for financial markets, and the similarities and differences of the two fields will be explored. In particular, three technical challenges unique to financial investment are addressed: extremely low signal-to-noise ratio, extremely strong nonstationarity (with an adversarial nature), and heterogeneous big data. Finally, the talk will discuss how potential solutions to these challenges can come back to benefit and further advance speech recognition and language processing technology.
首先簡要回顧自2009年以來深度學習對語音識別和語言處理行業的顛覆。然後,將在語音和語言建模的技術(深度學習或其他)與金融市場建模技術之間建立聯繫,並探討這兩個領域的異同。特別是金融投資面臨的三個獨特技術挑戰:極低的信噪比、極強的非平穩性(具有對抗性)和異構大數據。最後,將討論這些挑戰的潛在解決方案如何反過來造福並進一步推動語音識別和語言處理技術的發展。
Speaker: Li Deng, IEEE Fellow, Chief AI Officer, Citadel, USA
演講人:鄧立,IEEE會士,美國Citadel首席人工智能官
