TIP1: Find the official explanation of a class
Take torch.nn.BCEWithLogitsLoss as an example:
In [1]: import torch.nn as nn
In [2]: print(nn.BCEWithLogitsLoss.__doc__)
outputs:
This loss combines a `Sigmoid` layer and the `BCELoss` in one single
class. This version is more numerically stable than using a plain `Sigmoid`
followed by a `BCELoss` as, by combining the operations into one layer,
we take advantage of the log-sum-exp trick for numerical stability.
The loss can be described as:
.. math::
    \ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad
    l_n = - w_n \left[ t_n \cdot \log \sigma(x_n)
    + (1 - t_n) \cdot \log (1 - \sigma(x_n)) \right],
where :math:`N` is the batch size. If reduce is ``True``, then
.. math::
    \ell(x, y) = \begin{cases}
        \operatorname{mean}(L), & \text{if}\; \text{size_average} = \text{True},\\
        \operatorname{sum}(L), & \text{if}\; \text{size_average} = \text{False}.
    \end{cases}
This is used for measuring the error of a reconstruction in for example
an auto-encoder. Note that the targets `t[i]` should be numbers
between 0 and 1.
Args:
weight (Tensor, optional): a manual rescaling weight given to the loss
of each batch element. If given, has to be a Tensor of size
"nbatch".
size_average (bool, optional): By default, the losses are averaged
over observations for each minibatch. However, if the field
size_average is set to ``False``, the losses are instead summed for
each minibatch. Default: ``True``
reduce (bool, optional): By default, the losses are averaged or summed over
observations for each minibatch depending on size_average. When reduce
is False, returns a loss per input/target element instead and ignores
size_average. Default: True
Shape:
- Input: :math:`(N, *)` where `*` means, any number of additional
dimensions
- Target: :math:`(N, *)`, same shape as the input
Examples::
>>> loss = nn.BCEWithLogitsLoss()
>>> input = torch.randn(3, requires_grad=True)
>>> target = torch.empty(3).random_(2)
>>> output = loss(input, target)
>>> output.backward()
TIP2: Visualizing the TeX formula
Take the output in TIP1 as an example:
The loss can be described as:
.. math::
    \ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad
    l_n = - w_n \left[ t_n \cdot \log \sigma(x_n)
    + (1 - t_n) \cdot \log (1 - \sigma(x_n)) \right],
where :math:`N` is the batch size. If reduce is ``True``, then
.. math::
    \ell(x, y) = \begin{cases}
        \operatorname{mean}(L), & \text{if}\; \text{size_average} = \text{True},\\
        \operatorname{sum}(L), & \text{if}\; \text{size_average} = \text{False}.
    \end{cases}
Copy the TeX source and paste it into the MathType app:
TIP3: MathType Settings
Parameters -> Transform -> convert to “TeX – Plain TeX”
Don’t choose any of the optional items.
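As an alternative to MathType, the TeX from TIP2 can be rendered with any LaTeX engine. A minimal standalone document (my own wrapper; note that the underscore in ``size_average`` must be escaped as ``size\_average`` in plain LaTeX):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
\begin{gather*}
\ell(x, y) = L = \{l_1,\dots,l_N\}^\top, \quad
l_n = - w_n \left[ t_n \cdot \log \sigma(x_n)
    + (1 - t_n) \cdot \log (1 - \sigma(x_n)) \right],
\\
\ell(x, y) = \begin{cases}
    \operatorname{mean}(L), & \text{if}\; \text{size\_average} = \text{True},\\
    \operatorname{sum}(L),  & \text{if}\; \text{size\_average} = \text{False}.
\end{cases}
\end{gather*}
\end{document}
```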