$q = x w + b$
$x: N \times D$
$w: D \times M$
$b: M$
$q: N \times M$
N: the number of samples (images)
D: the dimension of one sample (image) after flattening
M: the number of classes
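The forward pass and its shapes can be sketched in NumPy as below. This is only a minimal illustration: the function name `affine_forward` and the sizes N=4, D=3, M=2 are assumptions chosen for the example, not part of the original notes.

```python
import numpy as np

def affine_forward(x, w, b):
    """Compute q = x @ w + b for a batch of flattened samples."""
    # x: (N, D), w: (D, M), b: (M,) -> q: (N, M).
    # b is broadcast across the N rows, one bias per class.
    return x @ w + b

# Hypothetical sizes: N=4 samples, D=3 features per sample, M=2 classes.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))
w = rng.standard_normal((3, 2))
b = rng.standard_normal(2)

q = affine_forward(x, w, b)
print(q.shape)  # (4, 2)
```

If the samples are still image-shaped, something like `x.reshape(x.shape[0], -1)` would flatten each one into a row of length D first.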
The backward pass supplies the upstream gradient: $\frac{\partial f}{\partial q} = dout$
$dout: N \times M$
Solving for $dx$:
$$
\begin{aligned}
\frac{\partial q_{i,k}}{\partial x_{i,j}} &= w_{j,k} \\
\frac{\partial f}{\partial q_{i,k}} \cdot \frac{\partial q_{i,k}}{\partial x_{i,j}} &= dout_{i,k} \, w_{j,k} \\
&\because x_{i,j} \text{ feeds every } q_{i,k} \text{ in row } i \text{, so sum over } k \\
\frac{\partial f}{\partial x_{i,j}} = \sum_{k=1}^{M} \frac{\partial f}{\partial q_{i,k}} \cdot \frac{\partial q_{i,k}}{\partial x_{i,j}} &= \sum_{k=1}^{M} dout_{i,k} \, w_{j,k} \\
&\because dx \text{ must have the same shape as } x \ (N \times D) \\
dx = \frac{\partial f}{\partial x} &= dout \, w^{T}
\end{aligned}
$$
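As a sanity check on the $dx$ result, the element-wise sum over $k$ and the matrix form can be compared directly. The sketch below assumes NumPy and small made-up sizes (N=4, D=3, M=2); the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, M = 4, 3, 2  # hypothetical sizes
w = rng.standard_normal((D, M))
dout = rng.standard_normal((N, M))

# Element-wise form: dx[i, j] = sum over k of dout[i, k] * w[j, k]
dx_loop = np.zeros((N, D))
for i in range(N):
    for j in range(D):
        for k in range(M):
            dx_loop[i, j] += dout[i, k] * w[j, k]

# Matrix form: (N, M) @ (M, D) -> (N, D), the same shape as x
dx = dout @ w.T

print(np.allclose(dx, dx_loop))  # True
```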
Solving for $dw$:
$$
\begin{aligned}
\frac{\partial q_{i,k}}{\partial w_{j,k}} &= x_{i,j} \\
\frac{\partial f}{\partial q_{i,k}} \cdot \frac{\partial q_{i,k}}{\partial w_{j,k}} &= dout_{i,k} \, x_{i,j} \\
&\because w_{j,k} \text{ feeds every } q_{i,k} \text{ in column } k \text{, so sum over } i \\
\frac{\partial f}{\partial w_{j,k}} = \sum_{i=1}^{N} \frac{\partial f}{\partial q_{i,k}} \cdot \frac{\partial q_{i,k}}{\partial w_{j,k}} &= \sum_{i=1}^{N} dout_{i,k} \, x_{i,j} \\
&\because dw \text{ must have the same shape as } w \ (D \times M) \\
dw = \frac{\partial f}{\partial w} &= x^{T} \, dout
\end{aligned}
$$
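The $dw$ result can be checked the same way. Here the sum over $i$ is written with `np.einsum` so that the index pattern mirrors the derivation. The sizes and names are again assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, M = 4, 3, 2  # hypothetical sizes
x = rng.standard_normal((N, D))
dout = rng.standard_normal((N, M))

# Index form: dw[j, k] = sum over i of dout[i, k] * x[i, j]
dw_sum = np.einsum("ik,ij->jk", dout, x)

# Matrix form: (D, N) @ (N, M) -> (D, M), the same shape as w
dw = x.T @ dout

print(np.allclose(dw, dw_sum))  # True
```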
Solving for $db$:
$$
\begin{aligned}
\frac{\partial q_{i,k}}{\partial b_{k}} &= 1 \\
\frac{\partial f}{\partial q_{i,k}} \cdot \frac{\partial q_{i,k}}{\partial b_{k}} &= dout_{i,k} \cdot 1 \\
&\because b_{k} \text{ is added to every } q_{i,k} \text{ in column } k \text{, so sum over } i \\
\frac{\partial q_{:,k}}{\partial b_{k}} &= [1, 1, \dots, 1]^{T} \quad (N \text{ ones}) \\
\frac{\partial f}{\partial b_{k}} = \sum_{i=1}^{N} \frac{\partial f}{\partial q_{i,k}} \cdot \frac{\partial q_{i,k}}{\partial b_{k}} &= \sum_{i=1}^{N} dout_{i,k} \\
db = \frac{\partial f}{\partial b} &= \left[ \sum_{i=1}^{N} dout_{i,1}, \ \sum_{i=1}^{N} dout_{i,2}, \ \dots, \ \sum_{i=1}^{N} dout_{i,M} \right]
\end{aligned}
$$
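Putting the three results together, a backward pass might look like the sketch below, checked against central-difference numerical gradients. The names `affine_backward` and `numerical_grad`, the toy scalar `f = sum(q * dout)` (chosen so that $\partial f / \partial q = dout$ exactly), and the sizes are all assumptions for this example rather than anything specified in the notes.

```python
import numpy as np

def affine_backward(dout, x, w):
    """Return (dx, dw, db) for q = x @ w + b, given dout = df/dq."""
    dx = dout @ w.T        # (N, M) @ (M, D) -> (N, D)
    dw = x.T @ dout        # (D, N) @ (N, M) -> (D, M)
    db = dout.sum(axis=0)  # sum over the N samples -> (M,)
    return dx, dw, db

def numerical_grad(f, a, h=1e-6):
    """Central-difference gradient of the scalar f() with respect to array a."""
    grad = np.zeros_like(a)
    for idx in np.ndindex(*a.shape):
        old = a[idx]
        a[idx] = old + h   # perturb one entry in place
        f_plus = f()
        a[idx] = old - h
        f_minus = f()
        a[idx] = old       # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

# Hypothetical sizes: N=4, D=3, M=2.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))
w = rng.standard_normal((3, 2))
b = rng.standard_normal(2)
dout = rng.standard_normal((4, 2))

def f():
    # Toy scalar whose gradient with respect to q is exactly dout.
    return np.sum((x @ w + b) * dout)

dx, dw, db = affine_backward(dout, x, w)
print(np.allclose(dx, numerical_grad(f, x), atol=1e-5))  # True
print(np.allclose(dw, numerical_grad(f, w), atol=1e-5))  # True
print(np.allclose(db, numerical_grad(f, b), atol=1e-5))  # True
```

Summing `dout` over axis 0 is the code counterpart of multiplying by the column of $N$ ones in the derivation.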