1 Answer

I suspect your issue has to do with your outputs / data[1] (it would help if you showed a sample of your train_set). Running the snippet below does not give nan, but note that I forced the shape of the output by hand before calling loss_fn(pred, outputs):
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable


class BaselineModel(nn.Module):
    def __init__(self, feature_dim=5, hidden_size=5, num_layers=2, batch_size=32):
        super(BaselineModel, self).__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size=feature_dim,
                            hidden_size=hidden_size, num_layers=num_layers)

    def forward(self, x, hidden):
        lstm_out, hidden = self.lstm(x, hidden)
        return lstm_out, hidden

    def init_hidden(self, batch_size):
        # Uninitialized hidden/cell state on the same device/dtype as the weights;
        # .new_zeros(...) would be the safer choice here.
        hidden = Variable(next(self.parameters()).data.new(
            self.num_layers, batch_size, self.hidden_size))
        cell = Variable(next(self.parameters()).data.new(
            self.num_layers, batch_size, self.hidden_size))
        return (hidden, cell)


model = BaselineModel(batch_size=32)
optimizer = optim.Adam(model.parameters(), lr=0.01, weight_decay=0.0001)
loss_fn = torch.nn.MSELoss(reduction='sum')

hidden = model.init_hidden(10)
model.zero_grad()
pred, hidden = model(torch.randn(2, 10, 5), hidden)
pred.size()  # torch.Size([2, 10, 5])

# Target forced to the same shape as pred
outputs = torch.zeros(2, 10, 5)

loss = loss_fn(pred, outputs)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
optimizer.step()
print(loss)  # finite value, no nan
Note that one common cause of nan values is numerical instability during the learning phase, but in that case you usually see finite values for the first few steps before the divergence happens, which apparently is not the case here.
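If you want to rule out the data itself, here is a minimal sketch of a sanity check; it assumes each batch from your loader is a (data, target) pair of tensors, and the names and shapes used are only illustrative, not taken from your code. You can also run the training step with torch.autograd.set_detect_anomaly(True) so PyTorch reports the operation that first produced the nan.
# Sanity check for nan/inf in a batch, assuming (data, target) tensor pairs;
# the function name and shapes below are illustrative only.
def batch_is_finite(data, target):
    for name, t in (("data", data), ("target", target)):
        if torch.isnan(t).any() or torch.isinf(t).any():
            print(f"nan/inf found in {name}")
            return False
    return True

# Example with tensors shaped like the snippet above:
print(batch_is_finite(torch.randn(2, 10, 5), torch.zeros(2, 10, 5)))  # True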