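Now let's learn how to use pretrained networks to solve challenging computer vision problems. You will use networks from torchvision that were trained on ImageNet.
ImageNet is a massive dataset with over 1 million labeled images in 1000 categories. It is used to train deep neural networks built from convolutional layers. I won't go into the details of convolutional networks here, but you can watch this video to learn more about them.
Once trained, these models work astonishingly well as feature detectors for images they were not trained on. Using a pretrained network on images that are not in its training set is called transfer learning. Here we will use transfer learning to train a network that classifies cat and dog photos with very high accuracy.
With torchvision.models, you can download these pretrained networks and use them in your own applications. Let's import models now.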
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms, models
import helper
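Most of the pretrained models require the input to be 224x224 images. We also need to match the normalization used when the models were trained. Each color channel was normalized separately, with means [0.485, 0.456, 0.406] and standard deviations [0.229, 0.224, 0.225].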
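To make the normalization concrete, here is a minimal sketch of the per-channel arithmetic, (value - mean) / std, applied to a single RGB pixel that has already been scaled to [0, 1]. The helper name normalize_pixel is made up for illustration; in the notebook itself transforms.Normalize applies the same formula to whole image tensors.

```python
# Per-channel ImageNet statistics quoted in the text above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of one RGB pixel.

    `rgb` holds three floats in [0, 1], the range ToTensor() produces.
    This helper is hypothetical and only illustrates the arithmetic.
    """
    return [(value - m) / s for value, m, s in zip(rgb, MEAN, STD)]


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel(MEAN))
# Pure white ends up above 2 in every channel, well outside [0, 1].
print(normalize_pixel([1.0, 1.0, 1.0]))
```

Normalized values can fall outside [0, 1], so an image shown without undoing the normalization will have its values clipped by matplotlib.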
data_dir = 'Cat_Dog_data'
# TODO: Define transforms for the training data and testing data
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomResizedCrop(224),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.485, 0.456, 0.406],
                                                            [0.229, 0.224, 0.225])])
test_transforms = transforms.Compose([transforms.Resize(255),
                                      transforms.CenterCrop(224),
                                      transforms.ToTensor(),
                                      transforms.Normalize([0.485, 0.456, 0.406],
                                                           [0.229, 0.224, 0.225])])
# Pass transforms in here, then run the next cell to see how the transforms look
train_data = datasets.ImageFolder(data_dir + '/train', transform=train_transforms)
test_data = datasets.ImageFolder(data_dir + '/test', transform=test_transforms)
trainloader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)
testloader = torch.utils.data.DataLoader(test_data, batch_size=64)
data_iter = iter(testloader)
images, labels = next(data_iter)
fig, axes = plt.subplots(figsize=(10,4), ncols=4)
for ii in range(4):
    ax = axes[ii]
    helper.imshow(images[ii], ax=ax, normalize=False)
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
DenseNet network
We can load a model such as DenseNet. Let's print out the model architecture so we can see what's going on under the hood.
model = models.densenet121(pretrained=True)
#model
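The model is built from two main parts: the features and the classifier. The features part is a stack of convolutional layers that works as a feature detector whose output is fed into the classifier. The classifier is a single fully connected layer, (classifier): Linear(in_features=1024, out_features=1000). This layer was trained on the ImageNet dataset, so it won't work for our problem. That means we need to replace the classifier, but the features will work perfectly well on their own. You can think of a pretrained network as a very good feature detector whose output serves as the input to a simple feed-forward classifier.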
# Freeze parameters so we don't backprop through them
for param in model.parameters():
    param.requires_grad = False
from collections import OrderedDict
classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(1024, 500)),
    ('relu', nn.ReLU()),
    ('fc2', nn.Linear(500, 2)),
    ('output', nn.LogSoftmax(dim=1))
]))
model.classifier = classifier
model
DenseNet(
(features): Sequential(
(conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu0): ReLU(inplace=True)
(pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(denseblock1): _DenseBlock(
(denselayer1): _DenseLayer(
(norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
...
...
...
(denselayer16): _DenseLayer(
(norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(classifier): Sequential(
(fc1): Linear(in_features=1024, out_features=500, bias=True)
(relu): ReLU()
(fc2): Linear(in_features=500, out_features=2, bias=True)
(output): LogSoftmax()
)
)
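One way to confirm that freezing worked is to count trainable parameters. The sketch below uses a tiny stand-in network (TinyNet, a hypothetical class) rather than DenseNet so that it runs without downloading weights; the same loop over parameters() should apply to the real model.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in with the same features/classifier split as DenseNet."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())
        self.classifier = nn.Linear(8, 1000)

    def forward(self, x):
        x = self.features(x)
        x = x.mean(dim=(2, 3))  # global average pool to shape (batch, 8)
        return self.classifier(x)


net = TinyNet()

# Freeze everything that exists so far.
for param in net.parameters():
    param.requires_grad = False

# A freshly created module has requires_grad=True by default,
# so only the new classifier will be updated during training.
net.classifier = nn.Sequential(nn.Linear(8, 2), nn.LogSoftmax(dim=1))

trainable = sum(p.numel() for p in net.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in net.parameters() if not p.requires_grad)
print(f"trainable: {trainable}, frozen: {frozen}")
```

The order matters: the classifier is replaced after the freezing loop, which is why its parameters stay trainable.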
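With the model built, we need to train the classifier. The problem is that this is now a very deep neural network. If you train it on a CPU as usual, it will take a long, long time. Instead, we will use the GPU for the computation. On a GPU the linear algebra operations run in parallel, which can make training on the order of 100 times faster. It is also possible to train on multiple GPUs, cutting training time further.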
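PyTorch, like other deep learning frameworks, uses CUDA to run the forward and backward passes efficiently on the GPU. In PyTorch, you move the model parameters and other tensors to GPU memory with model.to('cuda'). You can move them back from the GPU to the CPU with model.to('cpu'), which you will commonly do when you need to work on the network output outside of PyTorch. To demonstrate the speedup, I'll run the forward and backward passes both with and without a GPU.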
import time
for device in ['cpu', 'cpu']:  # 'cuda'
    criterion = nn.NLLLoss()
    # Only train the classifier parameters, feature parameters are frozen
    optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)
    model.to(device)
    for ii, (inputs, labels) in enumerate(trainloader):
        # Move input and label tensors to the GPU
        inputs, labels = inputs.to(device), labels.to(device)
        start = time.time()
        outputs = model.forward(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        if ii==3:
            break
    print(f"Device = {device}; Time per batch: {(time.time() - start)/3:.3f} seconds")
Device = cpu; Time per batch: 1.038 seconds
Device = cpu; Time per batch: 1.030 seconds
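Because the loop above resets start on every batch, the printed figure covers only the final batch. A sketch of one alternative that averages over several batches follows. The names time_per_batch and step_fn are hypothetical, and the toy usage times time.sleep instead of a real training step. On a GPU it would probably also be necessary to call torch.cuda.synchronize() before reading the clock, since CUDA kernels run asynchronously.

```python
import time


def time_per_batch(step_fn, batches, warmup=1):
    """Return the mean wall-clock seconds step_fn takes per batch.

    `step_fn` is any callable that processes one batch (for example a
    forward/backward/step closure). The first `warmup` batches are run
    but not timed. Both names are illustrative, not part of PyTorch.
    """
    timed = 0
    elapsed = 0.0
    for index, batch in enumerate(batches):
        start = time.perf_counter()
        step_fn(batch)
        if index >= warmup:
            elapsed += time.perf_counter() - start
            timed += 1
    if timed == 0:
        raise ValueError("need more batches than warmup steps")
    return elapsed / timed


# Toy usage: each "batch" is just a sleep duration in seconds.
mean_seconds = time_per_batch(time.sleep, [0.01, 0.02, 0.02, 0.02])
print(f"Time per batch: {mean_seconds:.3f} seconds")
```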
You can write device-agnostic code that first checks whether a GPU is available and automatically uses CUDA if it is enabled:
# at beginning of the script
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
...
# then whenever you get a new Tensor or Module
# this won't copy if they are already on the desired device
input = data.to(device)
model = MyModule(...).to(device)
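The snippet above is a template in which data and MyModule are placeholders. A runnable sketch of the same pattern, assuming a small nn.Linear layer as a stand-in for a real model and random numbers as a stand-in for a batch, might look like this:

```python
import torch
from torch import nn

# Pick the GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

layer = nn.Linear(4, 2).to(device)      # stand-in for a real model
batch = torch.randn(8, 4).to(device)    # stand-in for a batch of inputs

output = layer(batch)
print(output.shape, output.device)

# Move results back to the CPU before handing them to non-PyTorch code.
values = output.detach().cpu().tolist()
print(len(values), len(values[0]))
```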
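From here, it's up to you to finish training the model. The process is the same as before, except that the model is now much more powerful. You should be able to reach better than 95% accuracy easily.
**Exercise:** Train a pretrained model to classify the cat and dog images. Continue with the DenseNet model or try ResNet; both are good choices. Remember that you only need to train the classifier; the parameters of the features part should stay frozen.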
## TODO: Use a pretrained model to classify the cat and dog images
# at beginning of the script
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model2 = models.densenet121(pretrained=True)
for param in model2.parameters():
    param.requires_grad = False
model2.classifier = nn.Sequential(nn.Linear(1024,256),
                                  nn.ReLU(),
                                  nn.Dropout(0.2),
                                  nn.Linear(256,2),
                                  nn.LogSoftmax(dim=1))
criterion = nn.NLLLoss()
optimizer = optim.Adam(model2.classifier.parameters(),lr=0.003)
model2.to(device)
DenseNet(
(features): Sequential(
(conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu0): ReLU(inplace=True)
...
...
...
(denselayer16): _DenseLayer(
(norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(classifier): Sequential(
(0): Linear(in_features=1024, out_features=256, bias=True)
(1): ReLU()
(2): Dropout(p=0.2, inplace=False)
(3): Linear(in_features=256, out_features=2, bias=True)
(4): LogSoftmax()
)
)
epochs = 1
steps = 0
running_loss = 0
print_every = 5
for epoch in range(epochs):
    for inputs, labels in trainloader:
        steps += 1
        # Move input and label tensors to the selected device
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        logps = model2.forward(inputs)
        loss = criterion(logps, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        if steps % print_every == 0:
            test_loss = 0
            accuracy = 0
            model2.eval()
            with torch.no_grad():
                for inputs, labels in testloader:
                    inputs, labels = inputs.to(device), labels.to(device)
                    logps = model2.forward(inputs)
                    batch_loss = criterion(logps, labels)
                    test_loss += batch_loss.item()
                    # Calculate accuracy
                    ps = torch.exp(logps)
                    top_p, top_class = ps.topk(1, dim=1)
                    equals = top_class == labels.view(*top_class.shape)
                    accuracy += torch.mean(equals.type(torch.FloatTensor)).item()
            print(f"Epoch {epoch+1}/{epochs}.. "
                  f"step: {steps}..."
                  f"Train loss: {running_loss/print_every:.3f}.. "
                  f"Test loss: {test_loss/len(testloader):.3f}.. "
                  f"Test accuracy: {accuracy/len(testloader):.3f}")
            # Stop early once test accuracy passes 98%
            if accuracy/len(testloader) > 0.98:
                break
            running_loss = 0
            model2.train()
Epoch 1/1.. step: 5...Train loss: 0.857.. Test loss: 0.748.. Test accuracy: 0.568
...
...
...
Epoch 1/1.. step: 50...Train loss: 0.159.. Test loss: 0.052.. Test accuracy: 0.982
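The accuracy bookkeeping in the validation loop can be checked by hand on a tiny batch. In this sketch the probabilities and labels are made-up values for four images and two classes:

```python
import torch

# Made-up log-probabilities for 4 images and 2 classes.
logps = torch.log(torch.tensor([[0.9, 0.1],
                                [0.2, 0.8],
                                [0.6, 0.4],
                                [0.3, 0.7]]))
labels = torch.tensor([0, 1, 1, 1])

ps = torch.exp(logps)                       # back to probabilities
top_p, top_class = ps.topk(1, dim=1)        # most likely class per row, shape (4, 1)
equals = top_class == labels.view(*top_class.shape)
accuracy = equals.float().mean().item()     # fraction of correct predictions

print(top_class.squeeze(1).tolist())
print(accuracy)
```

Averaging per-batch accuracies, as the loop does, gives a smaller final batch the same weight as a full one, so the reported figure can differ slightly from the exact fraction over the whole test set.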
def test_model(my_model, my_device, my_epochs, my_trainloader, my_testloader):
    steps = 0
    running_loss = 0
    print_every = 5
    my_criterion = nn.NLLLoss()
    # Only the classifier parameters are optimized; the features stay frozen
    my_optimizer = optim.Adam(my_model.classifier.parameters(), lr=0.003)
    # Use the device passed in rather than the global `device`
    my_model.to(my_device)
    for epoch in range(my_epochs):
        for inputs, labels in my_trainloader:
            steps += 1
            inputs, labels = inputs.to(my_device), labels.to(my_device)
            my_optimizer.zero_grad()
            logps = my_model.forward(inputs)
            loss = my_criterion(logps, labels)
            loss.backward()
            my_optimizer.step()
            running_loss += loss.item()
            if steps % print_every == 0:
                test_loss = 0
                accuracy = 0
                my_model.eval()
                with torch.no_grad():
                    for inputs, labels in my_testloader:
                        inputs, labels = inputs.to(my_device), labels.to(my_device)
                        logps = my_model.forward(inputs)
                        batch_loss = my_criterion(logps, labels)
                        test_loss += batch_loss.item()
                        # Calculate accuracy
                        ps = torch.exp(logps)
                        top_p, top_class = ps.topk(1, dim=1)
                        equals = top_class == labels.view(*top_class.shape)
                        accuracy += torch.mean(equals.type(torch.FloatTensor)).item()
                print(f"Epoch {epoch+1}/{my_epochs}.. "
                      f"step: {steps}..."
                      f"Train loss: {running_loss/print_every:.3f}.. "
                      f"Test loss: {test_loss/len(my_testloader):.3f}.. "
                      f"Test accuracy: {accuracy/len(my_testloader):.3f}")
                running_loss = 0
                # Stop early once test accuracy passes 92%
                if accuracy/len(my_testloader) > 0.92:
                    break
                my_model.train()
import time
# at beginning of the script
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model_type = models.densenet121(pretrained=True)
model_type
DenseNet(
(features): Sequential(
(conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
...
...
...
(denselayer16): _DenseLayer(
(norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(classifier): Linear(in_features=1024, out_features=1000, bias=True)
)
for param in model_type.parameters():
    param.requires_grad = False
model_type.classifier = nn.Sequential(nn.Linear(1024,256),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(256,2),
                                      nn.LogSoftmax(dim=1))
model_type
DenseNet(
(features): Sequential(
(conv0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu0): ReLU(inplace=True)
(pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
...
...
...
(denselayer16): _DenseLayer(
(norm1): BatchNorm2d(992, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(992, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
)
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(classifier): Sequential(
(0): Linear(in_features=1024, out_features=256, bias=True)
(1): ReLU()
(2): Dropout(p=0.2, inplace=False)
(3): Linear(in_features=256, out_features=2, bias=True)
(4): LogSoftmax()
)
)
EPOCHS = 1
start = time.time()
test_model(model_type,device,EPOCHS,trainloader,testloader)
print(f"Device = {device}; model_type='densenet121',Time per batch: {(time.time() - start):.3f} seconds")
Epoch 1/1.. step: 5...Train loss: 0.727.. Test loss: 0.233.. Test accuracy: 0.957
Epoch 1/1.. step: 10...Train loss: 0.370.. Test loss: 0.175.. Test accuracy: 0.940
Epoch 1/1.. step: 15...Train loss: 0.305.. Test loss: 0.109.. Test accuracy: 0.962
Epoch 1/1.. step: 20...Train loss: 0.214.. Test loss: 0.090.. Test accuracy: 0.970
Epoch 1/1.. step: 25...Train loss: 0.192.. Test loss: 0.145.. Test accuracy: 0.943
Epoch 1/1.. step: 30...Train loss: 0.235.. Test loss: 0.073.. Test accuracy: 0.973
Device = cpu; model_type='densenet121',Time per batch: 1363.256 seconds
AlexNet network
model_type = models.alexnet(pretrained=True)
model_type
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Dropout(p=0.5, inplace=False)
(4): Linear(in_features=4096, out_features=4096, bias=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
)
for param in model_type.parameters():
    param.requires_grad = False
model_type.classifier = nn.Sequential(nn.Linear(9216,4096),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(4096,256),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(256,2),
                                      nn.LogSoftmax(dim=1))
model_type
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Linear(in_features=9216, out_features=4096, bias=True)
(1): ReLU()
(2): Dropout(p=0.2, inplace=False)
(3): Linear(in_features=4096, out_features=256, bias=True)
(4): ReLU()
(5): Dropout(p=0.2, inplace=False)
(6): Linear(in_features=256, out_features=2, bias=True)
(7): LogSoftmax()
)
)
EPOCHS = 1
start = time.time()
test_model(model_type,device,EPOCHS,trainloader,testloader)
print(f"Device = {device}; model_type='resnet',Time per batch: {(time.time() - start)/1:.3f} seconds")
Epoch 1/1.. step: 5...Train loss: 24.561.. Test loss: 1.131.. Test accuracy: 0.490
Epoch 1/1.. step: 10...Train loss: 0.979.. Test loss: 0.261.. Test accuracy: 0.896
Epoch 1/1.. step: 15...Train loss: 0.499.. Test loss: 0.432.. Test accuracy: 0.791
Epoch 1/1.. step: 20...Train loss: 0.503.. Test loss: 0.277.. Test accuracy: 0.896
Epoch 1/1.. step: 25...Train loss: 0.395.. Test loss: 0.188.. Test accuracy: 0.922
Device = cpu; model_type='resnet',Time per batch: 157.419 seconds
VGG16 network
model_type = models.vgg16(pretrained=True)
model_type
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /home/leon/.cache/torch/checkpoints/vgg16-397923af.pth
100.0%
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
...
(29): ReLU(inplace=True)
(30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace=True)
(2): Dropout(p=0.5, inplace=False)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace=True)
(5): Dropout(p=0.5, inplace=False)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
)
for param in model_type.parameters():
    param.requires_grad = False
model_type.classifier = nn.Sequential(nn.Linear(25088,4096),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(4096,256),
                                      nn.ReLU(),
                                      nn.Dropout(0.2),
                                      nn.Linear(256,2),
                                      nn.LogSoftmax(dim=1))
model_type
VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): ReLU(inplace=True)
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(3): ReLU(inplace=True)
(4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(6): ReLU(inplace=True)
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): ReLU(inplace=True)
(9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(13): ReLU(inplace=True)
(14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): ReLU(inplace=True)
(16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(18): ReLU(inplace=True)
(19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(20): ReLU(inplace=True)
(21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(22): ReLU(inplace=True)
(23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(25): ReLU(inplace=True)
(26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(27): ReLU(inplace=True)
(28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(29): ReLU(inplace=True)
(30): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU()
(2): Dropout(p=0.2, inplace=False)
(3): Linear(in_features=4096, out_features=256, bias=True)
(4): ReLU()
(5): Dropout(p=0.2, inplace=False)
(6): Linear(in_features=256, out_features=2, bias=True)
(7): LogSoftmax()
)
)
EPOCHS = 1
start = time.time()
test_model(model_type,device,EPOCHS,trainloader,testloader)
print(f"Device = {device}; model_type='resnet',Time per batch: {(time.time() - start)/1:.3f} seconds")
Epoch 1/1.. step: 5...Train loss: 45.882.. Test loss: 3.707.. Test accuracy: 0.518
Epoch 1/1.. step: 10...Train loss: 1.980.. Test loss: 0.121.. Test accuracy: 0.957
Device = cpu; model_type='resnet',Time per batch: 672.757 seconds
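Watch those shapes
You need to check that the tensors going through your model and other code have the right shapes. Use the .shape attribute while debugging and developing.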
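For instance, a quick shape check on fake data can catch a mismatch between the feature size and the classifier's first layer before training starts. The tensors below are random stand-ins with the sizes used in this notebook, not real images or features:

```python
import torch
from torch import nn

images = torch.randn(8, 3, 224, 224)    # fake batch: (batch, channels, height, width)
print(images.shape)

# Classifier head sized for DenseNet-121's 1024 features.
classifier = nn.Sequential(nn.Linear(1024, 256),
                           nn.ReLU(),
                           nn.Linear(256, 2),
                           nn.LogSoftmax(dim=1))

features = torch.randn(8, 1024)          # stand-in for the frozen feature output
logps = classifier(features)
print(logps.shape)

# A wrong feature size fails loudly with a RuntimeError.
mismatch_caught = False
try:
    classifier(torch.randn(8, 512))
except RuntimeError as err:
    mismatch_caught = True
    print("shape mismatch:", type(err).__name__)
```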
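If your network isn't training well, check the following:
Clear the gradients in the training loop with optimizer.zero_grad().
If you run a validation loop, set the network to evaluation mode with model.eval(), then switch it back to training mode with model.train().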
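Both checks can be seen on toy tensors. The sketch below first shows gradients piling up in .grad when nothing clears them (optimizer.zero_grad() resets these fields for the parameters the optimizer manages), then shows that a dropout layer only drops values in training mode. The tensor sizes are arbitrary choices for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# 1) Gradients accumulate across backward() calls unless they are cleared.
weight = torch.ones(3, requires_grad=True)
(weight * 2).sum().backward()
first = weight.grad.clone()          # tensor([2., 2., 2.])
(weight * 2).sum().backward()
accumulated = weight.grad.clone()    # tensor([4., 4., 4.]) because nothing was zeroed
print(first.tolist(), accumulated.tolist())

# 2) Dropout is active in train mode and disabled in eval mode.
dropout = nn.Dropout(p=0.5)
x = torch.ones(1000)

dropout.train()
zeros_in_train = int((dropout(x) == 0).sum())

dropout.eval()
zeros_in_eval = int((dropout(x) == 0).sum())
print(zeros_in_train, zeros_in_eval)
```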
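CUDA errors
Sometimes you'll see this error:
RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #1 ‘mat1’
The second type is torch.cuda.FloatTensor, which means the tensor has been moved to the GPU. The operation expected a tensor of type torch.FloatTensor, with no .cuda in the name, meaning a tensor on the CPU. PyTorch can only perform operations on tensors that are on the same device, so both must be on the CPU or both on the GPU. If you're trying to run your network on the GPU, make sure to move the model and all necessary tensors to the GPU with .to(device), where device is either "cuda" or "cpu".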
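One defensive pattern, sketched here with a small nn.Linear layer as an assumed stand-in for the real network, is to read the device from the model's own parameters and move the inputs there before the forward pass:

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)
inputs = torch.randn(3, 4)            # freshly created tensors start on the CPU

# Compare devices before the forward pass and move the inputs if needed.
param_device = next(model.parameters()).device
if inputs.device != param_device:
    inputs = inputs.to(param_device)

outputs = model(inputs)
print(outputs.device)
```

On a machine without a GPU both devices are already the CPU, so the branch is skipped and the code runs unchanged.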