Backward-pass gradient derivation for q = x*w + b (computing dx, dw, db)

$q = x*w + b$
$x: N*D$
$w: D*M$
$b: M$
$q: N*M$

N: number of samples (images)
D: dimension of one sample (image) after flattening
M: number of classes
The backward pass provides the upstream gradient: $\frac{\partial f}{\partial q} = dout$
$dout: N*M$
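The shapes above can be checked with a short NumPy sketch. The sizes `N, D, M = 4, 6, 3` and the random values are assumptions chosen only for illustration, and `dout` is a random stand-in for the upstream gradient, not a real loss gradient.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
N, D, M = 4, 6, 3

rng = np.random.default_rng(0)
x = rng.standard_normal((N, D))   # N samples, each flattened to D values
w = rng.standard_normal((D, M))   # weight matrix
b = rng.standard_normal(M)        # one bias per class

# Forward pass: b (shape M) is broadcast across the N rows of x @ w.
q = x @ w + b

# Stand-in for the upstream gradient df/dq; it always has the shape of q.
dout = rng.standard_normal(q.shape)

print(q.shape, dout.shape)  # (4, 3) (4, 3)
```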

Solving for $dx$
$$
\begin{aligned}
\frac{\partial q_{i,k}}{\partial x_{i,j}} = w_{j,k}\\ \\
\frac{\partial f}{\partial q_{i,k}} * \frac{\partial q_{i,k}}{\partial x_{i,j}} = dout_{i,k} * w_{j,k} \\ \\
\because \text{q and x share a common dimension (the row index } i \text{)}\\ \\
\frac{\partial q_{i,\_}}{\partial x_{i,j}} = w_{j,\_}\\ \\
\frac{\partial f}{\partial q_{i,\_}} * \frac{\partial q_{i,\_}}{\partial x_{i,j}} = \sum_{k=1}^{M} dout_{i,k} * w_{j,k} \\ \\
\because \text{dx must have the same dimensions as x}\\ \\
dx = \frac{\partial f}{\partial x} = dout * w^{T}\\
\end{aligned}
$$
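As a rough check of the dx derivation, the sketch below computes the element-wise sum over k with explicit loops and compares it with the matrix form. The array sizes and random inputs are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, M = 4, 6, 3                     # hypothetical sizes
x = rng.standard_normal((N, D))
w = rng.standard_normal((D, M))
dout = rng.standard_normal((N, M))    # stand-in for df/dq

# Element-wise form: dx[i, j] = sum over k of dout[i, k] * w[j, k]
dx_loop = np.zeros_like(x)
for i in range(N):
    for j in range(D):
        for k in range(M):
            dx_loop[i, j] += dout[i, k] * w[j, k]

# Matrix form: (N, M) @ (M, D) -> (N, D), the same shape as x.
dx = dout @ w.T

print(dx.shape == x.shape, np.allclose(dx, dx_loop))
```

The loop version is only there to mirror the index notation; the matrix product is what one would normally use.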

Solving for $dw$

$$
\begin{aligned}
\frac{\partial q_{i,k}}{\partial w_{j,k}} = x_{i,j} \\ \\
\frac{\partial f}{\partial q_{i,k}} * \frac{\partial q_{i,k}}{\partial w_{j,k}} = dout_{i,k} * x_{i,j}\\ \\
\because \text{q and w share a common dimension (the column index } k \text{)}\\ \\
\frac{\partial q_{\_,k}}{\partial w_{j,k}} = x_{\_,j}\\ \\
\frac{\partial f}{\partial q_{\_,k}} * \frac{\partial q_{\_,k}}{\partial w_{j,k}} = \sum_{i=1}^{N} dout_{i,k} * x_{i,j}\\ \\
\because \text{dw must have the same dimensions as w}\\ \\
dw = \frac{\partial f}{\partial w} = x^{T} * dout\\
\end{aligned}
$$
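One way to sanity-check the dw formula is a centered finite-difference test. The sketch below assumes a scalar surrogate `f(w) = sum(q * dout)`, picked because its gradient with respect to q is exactly `dout`; the surrogate, the step size `h`, and the array sizes are all assumptions of this example rather than part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, M = 4, 6, 3                     # hypothetical sizes
x = rng.standard_normal((N, D))
w = rng.standard_normal((D, M))
b = rng.standard_normal(M)
dout = rng.standard_normal((N, M))    # stand-in for df/dq

def f(w_):
    # Scalar surrogate whose gradient with respect to q is exactly dout.
    return np.sum((x @ w_ + b) * dout)

# Analytic gradient: (D, N) @ (N, M) -> (D, M), the same shape as w.
dw = x.T @ dout

# Centered finite differences, one entry of w at a time.
h = 1e-5
dw_num = np.zeros_like(w)
for j in range(D):
    for k in range(M):
        step = np.zeros_like(w)
        step[j, k] = h
        dw_num[j, k] = (f(w + step) - f(w - step)) / (2 * h)

max_err = np.max(np.abs(dw - dw_num))
print(dw.shape == w.shape, max_err < 1e-6)
```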

Solving for $db$
$$
\begin{aligned}
\frac{\partial q_{i,k}}{\partial b_{k}} = 1 \\ \\
\frac{\partial f}{\partial q_{i,k}} * \frac{\partial q_{i,k}}{\partial b_{k}} = dout_{i,k} * 1\\ \\
\because \text{q and b share a common dimension (the column index } k \text{)}\\ \\
\frac{\partial q_{\_,k}}{\partial b_{k}} = [1,1,\ldots,1]^T\\ \\
\frac{\partial f}{\partial q_{\_,k}} * \frac{\partial q_{\_,k}}{\partial b_{k}} = \sum_{i=1}^{N} dout_{i,k} \\ \\
db = \frac{\partial f}{\partial b} = [\sum_{i=1}^{N} dout_{i,1},\sum_{i=1}^{N} dout_{i,2},\ldots,\sum_{i=1}^{N} dout_{i,M}] \\
\end{aligned}
$$
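The db result amounts to summing `dout` over the sample axis. A minimal sketch, assuming random stand-in values for `dout`, shows that this matches multiplying by the all-ones vector from the derivation:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 4, 3                           # hypothetical sizes
dout = rng.standard_normal((N, M))    # stand-in for df/dq

# Column sums: db[k] = sum over i of dout[i, k]
db = dout.sum(axis=0)

# Same result via the all-ones vector: (N,) @ (N, M) -> (M,)
db_ones = np.ones(N) @ dout

print(db.shape, np.allclose(db, db_ones))
```

Summing over `axis=0` is the reverse of the broadcast that adds `b` to every row in the forward pass, which is why `db` ends up with shape `(M,)` like `b`.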
