NaNs are usually caused by a too-high learning rate or a similar instability in the optimization, which makes the gradients explode. This can also be prevented by setting clipnorm. Set up an optimizer with a suitable learning rate:
opt = keras.optimizers.Adam(0.001, clipnorm=1.)  # clipnorm caps each gradient's L2 norm at 1
model.compile(loss=["mse", "mse"], loss_weights=[0.9, 0.1], optimizer=opt)
which trains much better in the notebook:
Epoch 1/20
363/363 [==============================] - 1s 2ms/step - loss: 1547.7197 - main_output_loss: 967.1940 - aux_output_loss: 6772.4609 - val_loss: 19.9807 - val_main_output_loss: 20.0967 - val_aux_output_loss: 18.9365
Epoch 2/20
363/363 [==============================] - 1s 2ms/step - loss: 13.2916 - main_output_loss: 14.0150 - aux_output_loss: 6.7812 - val_loss: 14.6868 - val_main_output_loss: 14.5820 - val_aux_output_loss: 15.6298
Epoch 3/20
363/363 [==============================] - 1s 2ms/step - loss: 11.0539 - main_output_loss: 11.6683 - aux_output_loss: 5.5244 - val_loss: 10.5564 - val_main_output_loss: 10.2116 - val_aux_output_loss: 13.6594
Epoch 4/20
363/363 [==============================] - 1s 1ms/step - loss: 7.4646 - main_output_loss: 7.7688 - aux_output_loss: 4.7269 - val_loss: 13.2672 - val_main_output_loss: 11.5239 - val_aux_output_loss: 28.9570
Epoch 5/20
363/363 [==============================] - 1s 2ms/step - loss: 5.6873 - main_output_loss: 5.8091 - aux_output_loss: 4.5909 - val_loss: 5.0464 - val_main_output_loss: 4.5089 - val_aux_output_loss: 9.8839
Its performance is nothing amazing, but from here you would have to tune all the hyperparameters to get it to a satisfactory level.
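For reference, here is a minimal sketch of the kind of two-output model the compile call above assumes. Only the output names main_output and aux_output come from the logs; the 8-feature input and layer sizes are assumptions (they match the common California-housing two-output example):

from tensorflow import keras

inputs = keras.layers.Input(shape=(8,))                         # 8 input features (assumption)
hidden1 = keras.layers.Dense(30, activation="relu")(inputs)
hidden2 = keras.layers.Dense(30, activation="relu")(hidden1)
main_output = keras.layers.Dense(1, name="main_output")(hidden2)
aux_output = keras.layers.Dense(1, name="aux_output")(hidden1)  # auxiliary head off an earlier layer
model = keras.Model(inputs=[inputs], outputs=[main_output, aux_output])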
You can also use SGD, as you originally intended, to observe the effect of clipnorm:
opt = keras.optimizers.SGD(0.001, clipnorm=1.)
model.compile(loss=["mse", "mse"], loss_weights=[0.9, 0.1], optimizer=opt)
This trains properly. However, as soon as you remove clipnorm, you get NaNs again.
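For intuition, clipnorm clips each gradient tensor individually so that its L2 norm stays at or below the threshold before the update is applied, which is roughly what tf.clip_by_norm does. A minimal sketch:

import tensorflow as tf

g = tf.constant([3.0, 4.0])        # a "gradient" with L2 norm 5
clipped = tf.clip_by_norm(g, 1.0)  # rescaled to [0.6, 0.8], norm 1
print(clipped.numpy())

This is why the updates stay bounded even when an occasional batch produces a huge gradient.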