[Paper Reading] A Self-supervised Approach for Adversarial Robustness (CVPR 2020)

The paper combines the benefits of adversarial training and input processing, and proposes a self-supervised adversarial training mechanism in the input space. Code is available at: https://github.com/Muzammal-Naseer/NRP

Introduction:

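Building on the complementary strengths of adversarial training (AT) and input-processing methods, we propose a self-supervised AT mechanism in the input space. Our approach (Fig. 1) uses a min-max (saddle-point) formulation to learn an optimal input-processing function that enhances model robustness. In this way, our optimization rule implicitly performs AT. The main advantage of our approach is its generalization ability: once trained on one dataset, it can be applied off the shelf to safeguard a completely different model. This makes it a more attractive solution than popular AT approaches, which are computationally expensive (and therefore scale less well to large datasets). Furthermore, compared with previous pre-processing based defenses, which have been found vulnerable to recent attacks, our defense demonstrates better robustness.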

Main contributions:

Task Generalizability: To ensure a task-independent AT mechanism, we propose to adversarially train a purifying model named Neural Representation Purifier (NRP). Once trained, NRP can be deployed to safeguard across different tasks, e.g., classification, detection and segmentation, without any additional training (Sec. 3).
Self-Supervision: The supervisory signal used for AT should be self-supervised to make it independent of label space. To this end, we propose an algorithm to train NRP on adversaries found in the feature space in random directions to avoid any label leakage (Sec. 3.1).
Defense against strong perturbations: Attacks are continuously evolving. In order for NRP to generalize, it should be trained on worst-case perturbations that are transferable across different tasks. We propose to find highly transferable perceptual adversaries (Sec. 4.3).
Maintaining Accuracy: A strong defense must concurrently maintain accuracy on the original data distribution. We propose to train the NRP with an additional discriminator to bring adversarial examples close to original samples by recovering the fine texture details (Sec. 4.2).

Details:

1. Self-supervision

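The goal is to design a self-supervised perturbation mechanism that generalizes across networks and tasks, and thereby yields a transferable defense. The self-supervised perturbation is built on feature distortion because feature distortion directly affects the transferability of a perturbation.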

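Feature distortion:

We propose to find adversaries by maximizing the feature loss. Our approach does not rely on decision-boundary information, because our "representation-based" attack perturbs the feature space directly by solving the following optimization problem: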
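
The optimization problem itself is not reproduced in these notes. As a rough illustration of the idea only, the sketch below maximizes the distance between clean and perturbed features while keeping the perturbation inside an L-infinity ball of radius eps. Everything specific here is an assumption rather than the paper's setup: the feature extractor is a toy linear map `W` instead of a deep network, the distance is squared L2, and the names `feature_attack`, `eps`, `step` and `iters` are hypothetical.

```python
import random

def features(W, x):
    """Toy linear feature extractor F(x) = W x (assumed stand-in for a deep network)."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def distortion(W, x, x_adv):
    """Squared L2 distance between clean and perturbed features (assumed distance)."""
    fx, fa = features(W, x), features(W, x_adv)
    return sum((a - b) ** 2 for a, b in zip(fa, fx))

def distortion_grad(W, x, x_adv):
    """Gradient of the distortion w.r.t. x_adv: 2 * W^T (W x_adv - W x)."""
    diff = [a - b for a, b in zip(features(W, x_adv), features(W, x))]
    return [2 * sum(W[i][j] * diff[i] for i in range(len(W)))
            for j in range(len(x))]

def sign(v):
    return (v > 0) - (v < 0)

def feature_attack(W, x, eps=0.1, step=0.02, iters=20, seed=0):
    """Label-free attack: ascend the feature distortion inside an L-inf ball."""
    rng = random.Random(seed)
    # Random start: at x_adv == x the gradient of the distortion is zero.
    x_adv = [xi + rng.uniform(-eps, eps) for xi in x]
    for _ in range(iters):
        g = distortion_grad(W, x, x_adv)
        # Signed gradient ascent step on the feature distortion.
        x_adv = [xa + step * sign(gj) for xa, gj in zip(x_adv, g)]
        # Project back into the L-inf ball of radius eps around x.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

W = [[1.0, -2.0, 0.5], [0.3, 0.8, -1.0]]
x = [0.5, 0.5, 0.5]
x_adv = feature_attack(W, x)
print(distortion(W, x, x_adv))
```

Note that no labels appear anywhere in the loop, which is what makes the perturbation self-supervised. Clipping to the valid pixel range is omitted to keep the sketch short.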

2. NRP loss functions

A hybrid loss function is proposed for training the purifier network.

1. Feature loss: (experiments

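2. Pixel loss: (Smoothing an image helps mitigate adversarial effects, because perturbation patterns resemble noise patterns. To encourage image smoothness, we therefore use an L2 loss in image pixel space.)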

3. Adversarial loss: (motivated by the structural similarity between the network and a GAN)

The overall loss:
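
The formula for the overall loss is not reproduced in these notes. The sketch below shows one plausible way to combine the three terms above as a weighted sum. It is an assumption-laden illustration, not the paper's definition: the weights `w_feat`, `w_pix` and `w_adv` are hypothetical, the feature term uses mean absolute error as an assumed distance, and the adversarial term is a plain non-saturating GAN generator loss swapped in for whatever GAN objective the paper actually uses.

```python
import math

def mean_abs(a, b):
    """Mean absolute error between two equal-length vectors."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

def mean_sq(a, b):
    """Mean squared error (the L2 pixel loss mentioned above)."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def hybrid_loss(purified, clean, feat_purified, feat_clean, d_score,
                w_feat=1.0, w_pix=1.0, w_adv=1.0):
    """Weighted sum of feature, pixel and adversarial terms.

    d_score is the discriminator's probability, in (0, 1], that the
    purified image is a real clean image. All weights are hypothetical.
    """
    feat_term = mean_abs(feat_purified, feat_clean)  # feature loss (assumed MAE)
    pix_term = mean_sq(purified, clean)              # pixel loss (L2)
    adv_term = -math.log(d_score)                    # generator-side GAN loss
    return w_feat * feat_term + w_pix * pix_term + w_adv * adv_term

total = hybrid_loss(
    purified=[0.5, 0.6], clean=[0.5, 0.5],
    feat_purified=[1.0, 2.0], feat_clean=[1.0, 1.5],
    d_score=0.5,
)
print(total)
```

When the purified image matches the clean image in both pixel and feature space and the discriminator is fully fooled (`d_score=1.0`), every term vanishes and the loss is zero.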

 
