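Top Recommended Resource for Getting Started with Deep Learning: Sharing the Complete Notes for Andrew Ng's Deep Learning Courses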

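This article was first published on the WeChat official account "StrongerTang". To follow it and read more articles, search for it in WeChat or scan the QR code at the end of this article.
Original link: https://mp.weixin.qq.com/s?__biz=Mzg3NDEzOTAzMw==&mid=2247483739&idx=1&sn=90a0ef1ff6c90514f397e9b6f3c74b83&chksm=ced41dadf9a394bbba683cf597ee7deb2078442d72bff670078c1b8d89643df173f91fee1f2a&token=769811946&lang=zh_CN#rd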

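Recently a junior schoolmate asked me whether I had any introductory material on deep learning. He said he had some free time and wanted to learn something. So I shared with him a resource I was lucky enough to stumble upon back when I first encountered machine learning (which was really just this time last year). My foundation was too weak at the time for me to read much of it carefully, but I have to say it is an excellent resource that has benefited me greatly, so I am taking this opportunity to share it with everyone.

These notes were compiled by Huang Haiguang, a well-known volunteer AI educator in China, and his team, from the full videos of the 5 courses in Andrew Ng's Deep Learning Specialization. I won't explain who Andrew Ng is here; a quick search on Baidu will surely give you one more person to admire.

This groundbreaking course by Andrew Ng has earned countless praise. Its greatest strengths are comprehensive content, accessible explanations, and a rich set of hands-on projects. The notes, too, can be summed up in one word: comprehensive!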

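1. Course Overview

Andrew Ng's Deep Learning Specialization is highly recommended both by me (Xiao Tang) and by my classmates. It is very helpful for understanding the principles behind various algorithms, and it offers many application scenarios covering images, speech, natural language understanding, and more, along with some utility functions and datasets. You could say this series is an essential course for moving from machine learning to deep learning!

In these 5 courses, you will learn the foundations of deep learning and how to build neural networks, including CNNs and RNNs. The courses also include many hands-on projects to help you apply the deep learning techniques you have learned to real-world problems. These projects cover popular fields such as healthcare, autonomous driving, and natural language processing, as well as music generation.

These courses are designed for computer professionals who already have some background (basic programming knowledge, familiarity with Python, and a basic understanding of machine learning) and want to move into artificial intelligence. The course language is Python. (In fact, it's fine if you lack this background; just look things up whenever you get stuck.)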

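2. Table of Contents of the Notes

The notes run to 746 pages and have been revised many times; the latest released version is V5.44. They are very detailed, and the table of contents is as follows: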

Course 1: Neural Networks and Deep Learning

Week 1: Introduction to Deep Learning

1.1 Welcome

1.2 What is a Neural Network?

1.3 Supervised Learning with Neural Networks

1.4 Why is Deep Learning taking off?

1.5 About this Course

1.6 Course Resources

Week 2: Basics of Neural Network programming

2.1 Binary Classification

2.2 Logistic Regression

2.3 Logistic Regression Cost Function

2.4 Gradient Descent

2.5 Derivatives

2.6 More Derivative Examples

2.7 Computation Graph

2.8 Derivatives with a Computation Graph

2.9 Logistic Regression Gradient Descent

2.10 Gradient Descent on m Examples

2.11 Vectorization

2.12 More Examples of Vectorization

2.13 Vectorizing Logistic Regression

2.14 Vectorizing Logistic Regression's Gradient

2.15 Broadcasting in Python

2.16 A note on Python/NumPy vectors

2.17 Quick tour of Jupyter/iPython Notebooks

2.18 (Optional) Explanation of logistic regression cost function

Week 3: Shallow neural networks

3.1 Neural Network Overview

3.2 Neural Network Representation

3.3 Computing a Neural Network's output

3.4 Vectorizing across multiple examples

3.5 Justification for vectorized implementation

3.6 Activation functions

3.7 Why do you need a nonlinear activation function?

3.8 Derivatives of activation functions

3.9 Gradient descent for neural networks

3.10 (Optional) Backpropagation intuition

3.11 Random Initialization

Week 4: Deep Neural Networks

4.1 Deep L-layer neural network

4.2 Forward and backward propagation

4.3 Forward propagation in a Deep Network

4.4 Getting your matrix dimensions right

4.5 Why deep representations?

4.6 Building blocks of deep neural networks

4.7 Parameters vs Hyperparameters

4.8 What does this have to do with the brain?

Course 2: Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization

Week 1: Practical aspects of Deep Learning

1.1 Train / Dev / Test sets

1.2 Bias / Variance

1.3 Basic Recipe for Machine Learning

1.4 Regularization

1.5 Why regularization reduces overfitting?

1.6 Dropout Regularization

1.7 Understanding Dropout

1.8 Other regularization methods

1.9 Normalizing inputs

1.10 Vanishing / Exploding gradients

1.11 Weight Initialization for Deep Networks

1.12 Numerical approximation of gradients

1.13 Gradient checking

1.14 Gradient Checking Implementation Notes

Week 2: Optimization algorithms

2.1 Mini-batch gradient descent

2.2 Understanding mini-batch gradient descent

2.3 Exponentially weighted averages

2.4 Understanding exponentially weighted averages

2.5 Bias correction in exponentially weighted averages

2.6 Gradient descent with Momentum

2.7 RMSprop

2.8 Adam optimization algorithm

2.9 Learning rate decay

2.10 The problem of local optima

Week 3: Hyperparameter tuning, Batch Normalization and Programming Frameworks

3.1 Tuning process

3.2 Using an appropriate scale to pick hyperparameters

3.3 Hyperparameters tuning in practice: Pandas vs. Caviar

3.4 Normalizing activations in a network

3.5 Fitting Batch Norm into a neural network

3.6 Why does Batch Norm work?

3.7 Batch Norm at test time

3.8 Softmax regression

3.9 Training a Softmax classifier

3.10 Deep Learning frameworks

3.11 TensorFlow

Course 3: Structuring Machine Learning Projects

Week 1: ML Strategy (1)

1.1 Why ML Strategy?

1.2 Orthogonalization

1.3 Single number evaluation metric

1.4 Satisficing and optimizing metrics

1.5 Train/dev/test distributions

1.6 Size of dev and test sets

1.7 When to change dev/test sets and metrics

1.8 Why human-level performance?

1.9 Avoidable bias

1.10 Understanding human-level performance

1.11 Surpassing human-level performance

1.12 Improving your model performance

Week 2: ML Strategy (2)

2.1 Carrying out error analysis

2.2 Cleaning up incorrectly labeled data

2.3 Build your first system quickly, then iterate

2.4 Training and testing on different distributions

2.5 Bias and Variance with mismatched data distributions

2.6 Addressing data mismatch

2.7 Transfer learning

2.8 Multi-task learning

2.9 What is end-to-end deep learning?

2.10 Whether to use end-to-end learning?

Course 4: Convolutional Neural Networks

Week 1: Foundations of Convolutional Neural Networks

1.1 Computer vision

1.2 Edge detection example

1.3 More edge detection

1.4 Padding

1.5 Strided convolutions

1.6 Convolutions over volumes

1.7 One layer of a convolutional network

1.8 A simple convolution network example

1.9 Pooling layers

1.10 Convolutional neural network example

1.11 Why convolutions?

Week 2: Deep convolutional models: case studies

2.1 Why look at case studies?

2.2 Classic networks

2.3 Residual Networks (ResNets)

2.4 Why ResNets work?

2.5 Network in Network and 1×1 convolutions

2.6 Inception network motivation

2.7 Inception network

2.8 Using open-source implementations

2.9 Transfer Learning

2.10 Data augmentation

2.11 The state of computer vision

Week 3: Object detection

3.1 Object localization

3.2 Landmark detection

3.3 Object detection

3.4 Convolutional implementation of sliding windows

3.5 Bounding box predictions

3.6 Intersection over union

3.7 Non-max suppression

3.8 Anchor Boxes

3.9 Putting it together: YOLO algorithm

3.10 Region proposals (Optional)

Week 4: Special applications: Face recognition & Neural style transfer

4.1 What is face recognition?

4.2 One-shot learning

4.3 Siamese network

4.4 Triplet loss

4.5 Face verification and binary classification

4.6 What is neural style transfer?

4.7 What are deep ConvNets learning?

4.8 Cost function

4.9 Content cost function

4.10 Style cost function

4.11 1D and 3D generalizations of models

Course 5: Sequence Models

Week 1: Recurrent Neural Networks

1.1 Why Sequence Models?

1.2 Notation

1.3 Recurrent Neural Network Model

1.4 Backpropagation through time

1.5 Different types of RNNs

1.6 Language model and sequence generation

1.7 Sampling novel sequences

1.8 Vanishing gradients with RNNs

1.9 Gated Recurrent Unit (GRU)

1.10 LSTM (long short term memory) unit

1.11 Bidirectional RNN

1.12 Deep RNNs

Week 2: Natural Language Processing and Word Embeddings

2.1 Word Representation

2.2 Using Word Embeddings

2.3 Properties of Word Embeddings

2.4 Embedding Matrix

2.5 Learning Word Embeddings

2.6 Word2Vec

2.7 Negative Sampling

2.8 GloVe Word Vectors

2.9 Sentiment Classification

2.10 Debiasing Word Embeddings

Week 3: Sequence models & Attention mechanism

3.1 Various sequence to sequence architectures

3.2 Picking the most likely sentence

3.3 Beam Search

3.4 Refinements to Beam Search

3.5 Error analysis in beam search

3.6 Bleu Score (optional)

3.7 Attention Model Intuition

3.8 Attention Model

3.9 Speech recognition

3.10 Trigger Word Detection

3.11 Conclusion and thank you

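Appendix

The Power of Role Models: Andrew Ng's Interviews with AI Masters

Andrew Ng interviews Geoffrey Hinton

Andrew Ng interviews Ian Goodfellow

Andrew Ng interviews Ruslan Salakhutdinov

Andrew Ng interviews Yoshua Bengio

Andrew Ng interviews Yuanqing Lin

Andrew Ng interviews Pieter Abbeel

Andrew Ng interviews Andrej Karpathy

Deep Learning Notation Guide (translated from the original course)

Mathematical Foundations of Machine Learning

Advanced Mathematics

Linear Algebra

Probability Theory and Mathematical Statistics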

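The notes are truly detailed and are well suited to studying alongside the videos; they are a real treasure for beginners.

3. Appendix
Andrew Ng's video course address: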

http://www.deeplearning.ai/

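Disclaimer: These notes are free and are shared for educational purposes only; do not use them for any commercial purpose. Give a rose to others, and its fragrance lingers on your hand!

4. Resource Download

Finally, the Word version of the 746-page "Andrew Ng Deep Learning Core Notes" has been packaged. If you need it, you can get it as follows:

1. Scan the QR code below to follow the "StrongerTang" official account

2. Reply to the official account with the keyword: 吳恩達筆記 (Andrew Ng notes)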

