
PyTorch LSTM gives nan for MSELoss

梵蒂岡之花 2022-08-16 10:53:27
My model is:

class BaselineModel(nn.Module):

    def __init__(self, feature_dim=5, hidden_size=5, num_layers=2, batch_size=32):
        super(BaselineModel, self).__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size=feature_dim,
                            hidden_size=hidden_size, num_layers=num_layers)

    def forward(self, x, hidden):
        lstm_out, hidden = self.lstm(x, hidden)
        return lstm_out, hidden

    def init_hidden(self, batch_size):
        hidden = Variable(next(self.parameters()).data.new(
            self.num_layers, batch_size, self.hidden_size))
        cell = Variable(next(self.parameters()).data.new(
            self.num_layers, batch_size, self.hidden_size))
        return (hidden, cell)

and training looks like this:

train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=BATCH_SIZE, shuffle=True, **params)

model = BaselineModel(batch_size=BATCH_SIZE)
optimizer = optim.Adam(model.parameters(), lr=0.01, weight_decay=0.0001)
loss_fn = torch.nn.MSELoss(reduction='sum')

for epoch in range(250):
    # hidden = (torch.zeros(2, 13, 5),
    #           torch.zeros(2, 13, 5))
    # model.hidden = hidden
    for i, data in enumerate(train_loader):
        hidden = model.init_hidden(13)
        inputs = data[0]
        outputs = data[1]
        print('inputs', inputs.size())
        # print('outputs', outputs.size())
        # optimizer.zero_grad()
        model.zero_grad()
        # print('inputs', inputs)
        pred, hidden = model(inputs, hidden)
        loss = loss_fn(pred, outputs)
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
        print('Epoch: ', epoch, '\ti: ', i, '\tLoss: ', loss)

I have already set up gradient clipping, which seems to be the recommended solution, but even after the very first step I get:

Epoch: 0    i: 0    Loss: tensor(nan, grad_fn=)
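For context, a minimal sanity check one could run on the batches before training (assuming the DataLoader above yields (inputs, targets) tuples; this check is not from the original post, just a generic data pass):

import torch

# Assuming train_loader is the DataLoader built above; the check simply
# looks for nan/inf values and prints the shapes of each batch.
for i, (inputs, targets) in enumerate(train_loader):
    if torch.isnan(inputs).any() or torch.isinf(inputs).any():
        print('batch', i, 'has nan/inf in inputs')
    if torch.isnan(targets).any() or torch.isinf(targets).any():
        print('batch', i, 'has nan/inf in targets')
    print('batch', i, 'inputs', tuple(inputs.shape), 'targets', tuple(targets.shape))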

1 Answer

慕神8447489

Contributed 1780 experience points, received 1+ upvotes

I suspect your issue has to do with your outputs / data[1] (it would help if you showed a sample of your train_set). Running the snippet below does not give nan, but note that I manually forced the shape of the outputs before calling loss_fn(pred, outputs):


import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable


class BaselineModel(nn.Module):

    def __init__(self, feature_dim=5, hidden_size=5, num_layers=2, batch_size=32):
        super(BaselineModel, self).__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size=feature_dim,
                            hidden_size=hidden_size, num_layers=num_layers)

    def forward(self, x, hidden):
        lstm_out, hidden = self.lstm(x, hidden)
        return lstm_out, hidden

    def init_hidden(self, batch_size):
        hidden = Variable(next(self.parameters()).data.new(
            self.num_layers, batch_size, self.hidden_size))
        cell = Variable(next(self.parameters()).data.new(
            self.num_layers, batch_size, self.hidden_size))
        return (hidden, cell)


model = BaselineModel(batch_size=32)
optimizer = optim.Adam(model.parameters(), lr=0.01, weight_decay=0.0001)
loss_fn = torch.nn.MSELoss(reduction='sum')

hidden = model.init_hidden(10)
model.zero_grad()
pred, hidden = model(torch.randn(2, 10, 5), hidden)
pred.size()  # torch.Size([2, 10, 5])

# targets forced to the same shape as pred
outputs = torch.zeros(2, 10, 5)

loss = loss_fn(pred, outputs)
loss  # inspect the loss value (in an interactive session)

loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
optimizer.step()
print(loss)
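The reason the shape matters: if the targets coming out of the DataLoader do not match pred's shape, MSELoss will broadcast the two tensors against each other, which silently changes which elements are compared, so the summed loss is no longer the per-element error you expect. A minimal sketch of that effect (the mismatched shape below is hypothetical, just to illustrate the broadcasting):

import torch

pred = torch.randn(2, 10, 5)          # (seq_len, batch, hidden), as in the snippet above
good_targets = torch.randn(2, 10, 5)  # same shape: element-wise squared error, summed
bad_targets = torch.randn(10, 5)      # hypothetical mismatched shape from data[1]

loss_fn = torch.nn.MSELoss(reduction='sum')
print(loss_fn(pred, good_targets))    # well-defined summed MSE
# The call below broadcasts bad_targets over the first dimension and PyTorch
# emits a UserWarning about the mismatched target size; it still returns a
# number, but not the loss you intended.
print(loss_fn(pred, bad_targets))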

Note that a common source of nan values can be the numerical stability of the learning phase, but in that case you usually see finite loss values for the first steps before the divergence happens, which is apparently not the case here.
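If the nan persists after checking the data and shapes, a general way to localize where it first appears (standard PyTorch tooling, not specific to this model) is anomaly detection combined with explicit finite checks:

import torch

# Make autograd raise an error at the first backward operation that produces
# nan/inf, with a traceback pointing at the corresponding forward-pass op.
torch.autograd.set_detect_anomaly(True)

def assert_finite(name, t):
    # Fail early if a tensor already contains nan or inf values.
    if torch.isnan(t).any() or torch.isinf(t).any():
        raise ValueError(f"{name} contains nan/inf")

# Inside the training loop one could call, for example:
#   assert_finite("inputs", inputs)
#   assert_finite("targets", outputs)
#   assert_finite("pred", pred)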

