How to get an autoencoder to work on a small image dataset

Python

不負(fù)相思意 2021-04-05 17:39:31

I have a dataset of three images. When I create an autoencoder and train it on those three images, the output I get is exactly the same for every image, and it looks like a blend of all three.

My results look like this:

(Images: input image 1, output image 1; input image 2, output image 2; input image 3, output image 3.)

As you can see, the output is exactly the same for each input, and although it matches each input reasonably well, it is not perfect. This is a dataset of only three images, so the reconstruction should be perfect (or at least different for each image).

The three-image case worries me because when I train on a 500-image dataset, all I get back is a blank white image, since that is the best average of all the images.

I am using Keras, and the code is very simple.

    from keras.models import Sequential
    from keras.layers import Dense, Flatten, Reshape
    import numpy as np

    # returns a numpy array with shape (3, 24, 32, 1)
    # there are 3 images that are each 24x32 and are black and white (1 color channel)
    x_train = get_data()

    # this is the size of our encoded representations
    # encode down to two numbers (I have tested using 3; I still have the same issue)
    encoding_dim = 2

    # the shape without the batch amount
    input_shape = x_train.shape[1:]

    # how many output neurons we need to create an image
    input_dim = np.prod(input_shape)

    # simple feedforward network
    # I've also tried convolutional layers; same issue
    autoencoder = Sequential([
        Flatten(),             # flatten
        Dense(encoding_dim),   # encode
        Dense(input_dim),      # decode
        Reshape(input_shape)   # reshape decoding
    ])

    # adadelta optimizer works better than adam, same issue with both
    autoencoder.compile(optimizer='adadelta', loss='mse')

    # train it to output the same thing it gets as input
    # I've tried epochs up to 30000 with no improvement;
    # still predicts the same image for all three inputs
    autoencoder.fit(x_train, x_train, epochs=10, batch_size=1, verbose=1)

    out = autoencoder.predict(x_train)

I then take the outputs (out[0], out[1], out[2]) and convert them back into images. Those are the output images shown above.

I am worried because this suggests the autoencoder is not preserving any information about the input image, which is not how an autoencoder should behave. How can I get it to produce outputs that differ depending on the input image?

EDIT: One of my coworkers suggested dropping the autoencoder and using a 1-layer feedforward neural network instead. I tried that and the same thing happened, until I set the batch size to 1 and trained for 1400 epochs, at which point it worked perfectly. This makes me think more epochs would solve the problem, but I am not sure yet.

EDIT: Training the autoencoder for 10,000 epochs (batch size 3) made the second output image look different from the first and third. That is exactly what happened with the non-autoencoder version after about 400 epochs (also with batch size 3), which is further evidence that training for more epochs may be the solution. Next I will test with a batch size of 1 to see whether that helps further, and then try training for many more epochs to see whether that fully solves the problem.
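The symptom described above (identical outputs that look like a blend) can be measured rather than judged by eye. For a model that predicts one fixed image for every input, the MSE-minimizing choice is the pixel-wise mean of the training images, which matches the "average" outputs in the question. Below is a minimal sketch, assuming only NumPy and arrays shaped like `x_train`. The names `mean_image_baseline`, `collapse_report`, and `x_demo` are hypothetical helpers for illustration, not part of Keras, and the random data only stands in for the real images.

```python
import numpy as np

def mean_image_baseline(x):
    """Pixel-wise mean image: the best single output (in MSE) if the
    model ignores its input and predicts the same image every time."""
    return x.mean(axis=0)

def collapse_report(x, out):
    """Summarize how similar the reconstructions are to each other
    and to the mean image. Expects at least two images."""
    n = len(out)
    # largest mean absolute difference between any two reconstructions
    max_pairwise = max(
        float(np.abs(out[i] - out[j]).mean())
        for i in range(n)
        for j in range(i + 1, n)
    )
    # average distance of the reconstructions from the mean input image
    to_mean = float(np.abs(out - mean_image_baseline(x)).mean())
    # reconstruction error, the same quantity as the 'mse' training loss
    mse = float(((out - x) ** 2).mean())
    return {
        "max_pairwise_diff": max_pairwise,
        "mean_abs_diff_to_mean_image": to_mean,
        "mse": mse,
    }

# Stand-in data (an assumption) with the shape from the question: (3, 24, 32, 1)
rng = np.random.default_rng(0)
x_demo = rng.random((3, 24, 32, 1))

# A collapsed model: the same (mean) image for every input
collapsed = np.repeat(mean_image_baseline(x_demo)[None], 3, axis=0)
# A perfect model: every input reproduced exactly
perfect = x_demo.copy()

print("collapsed:", collapse_report(x_demo, collapsed))
print("perfect:  ", collapse_report(x_demo, perfect))
```

With the real model, the same check would be `collapse_report(x_train, autoencoder.predict(x_train))` after each experiment (more epochs, batch size 1, and so on). A `max_pairwise_diff` near zero means the outputs are still identical, and a small `mean_abs_diff_to_mean_image` means they are still close to the blended average.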
