Understanding nn.Linear() in PyTorch
Formula
$ y = xA^{T}+b $
Here $A$ is the weight matrix and $b$ is the bias.
Code
Initialization
class Linear(Module):
    ...
    __constants__ = ['bias']

    def __init__(self, in_features, out_features, bias=True):
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        # weight is stored as [out_features, in_features], i.e. already transposed
        self.weight = Parameter(torch.Tensor(out_features, in_features))
        if bias:
            self.bias = Parameter(torch.Tensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()
Forward computation
    @weak_script_method
    def forward(self, input):
        return F.linear(input, self.weight, self.bias)
The return value is input · weightᵀ + bias: `F.linear` multiplies the input by the *transposed* weight matrix and then adds the bias.
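The equivalence between `forward`, `F.linear`, and the explicit matrix expression can be checked directly. A minimal sketch (the layer sizes 4 and 3 are arbitrary choices for illustration):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
layer = torch.nn.Linear(4, 3)
x = torch.randn(2, 4)

# forward() delegates to F.linear, which computes x @ weight.T + bias
out_forward = layer(x)
out_functional = F.linear(x, layer.weight, layer.bias)
out_manual = x @ layer.weight.t() + layer.bias

print(torch.allclose(out_forward, out_functional))  # True
print(torch.allclose(out_forward, out_manual))      # True
```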
bias and weight
weight: the learnable weights of the module of shape
    :math:`(\text{out\_features}, \text{in\_features})`. The values are
    initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
    :math:`k = \frac{1}{\text{in\_features}}`
bias: the learnable bias of the module of shape :math:`(\text{out\_features})`.
    If :attr:`bias` is ``True``, the values are initialized from
    :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
    :math:`k = \frac{1}{\text{in\_features}}`
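The documented bound can be verified empirically: with `in_features = 100`, $k = 1/100$, so every initialized value should lie in $(-0.1, 0.1)$. A quick check:

```python
import math
import torch

layer = torch.nn.Linear(100, 50)
bound = math.sqrt(1.0 / layer.in_features)  # sqrt(k) = sqrt(1/100) = 0.1

# every weight and bias value should lie within U(-sqrt(k), sqrt(k))
print(layer.weight.abs().max().item() <= bound)  # True
print(layer.bias.abs().max().item() <= bound)    # True
```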
Example
>>> import torch
>>> nn1 = torch.nn.Linear(100, 50)
>>> input1 = torch.randn(140, 100)
>>> output1 = nn1(input1)
>>> output1.size()
torch.Size([140, 50])
In the example above, we create an input of shape [140, 100]. Declaring the linear layer creates a weight and a bias initialized according to the given dimensions, where the weight has shape [50, 100]. In the formula, $A$ is the weight and $b$ is the bias; because $A$ is transposed in the computation, the weight is stored as [50, 100] rather than [100, 50].
Concretely: [140, 100] × ([50, 100])ᵀ + bias = [140, 100] × [100, 50] + bias, giving an output of shape [140, 50].
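The shape bookkeeping above can be sketched in a few lines, using the same [140, 100] input:

```python
import torch

x = torch.randn(140, 100)   # input: [140, 100]
layer = torch.nn.Linear(100, 50)

print(layer.weight.shape)   # torch.Size([50, 100]) -- stored transposed
print(layer.bias.shape)     # torch.Size([50])
print(layer(x).shape)       # torch.Size([140, 50])
```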
As for the initialization of bias and weight: as the docstring quoted above states, both are drawn from a uniform distribution $\mathcal{U}(-\sqrt{k}, \sqrt{k})$ with $k = 1/\text{in\_features}$, so the range depends on the input dimension.