
How to implement and understand preprocessing and data augmentation with tensorflow_datasets (tfds)?

Python

偶然的你 2022-10-18 17:22:41

I am learning segmentation and data augmentation from the TF 2.0 tutorial that uses Oxford-IIIT Pets. For preprocessing/data augmentation, they provide a set of functions for a specific pipeline:

    # Import dataset
    dataset, info = tfds.load('oxford_iiit_pet:3.*.*', with_info=True)

    def normalize(input_image, input_mask):
        input_image = tf.cast(input_image, tf.float32) / 255.0
        input_mask -= 1
        return input_image, input_mask

    @tf.function
    def load_image_train(datapoint):
        input_image = tf.image.resize(datapoint['image'], (128, 128))
        input_mask = tf.image.resize(datapoint['segmentation_mask'], (128, 128))

        if tf.random.uniform(()) > 0.5:
            input_image = tf.image.flip_left_right(input_image)
            input_mask = tf.image.flip_left_right(input_mask)

        input_image, input_mask = normalize(input_image, input_mask)

        return input_image, input_mask

    TRAIN_LENGTH = info.splits['train'].num_examples
    BATCH_SIZE = 64
    BUFFER_SIZE = 1000
    STEPS_PER_EPOCH = TRAIN_LENGTH // BATCH_SIZE

Given the tf syntax, this code raises several doubts for me. To avoid just doing a ctrl C ctrl V and to really understand how tensorflow works, I would like to ask some questions:

1) In the normalize function, can tf.cast(input_image, tf.float32) / 255.0 be replaced by tf.image.convert_image_dtype(input_image, tf.float32)?

2) In the normalize function, is it possible to change my segmentation_mask values while staying in tf.tensor format, without converting to numpy? What I want is to work with only two possible mask values (0 and 1) instead of (0, 1 and 2). Using numpy I did something like this:

    segmentation_mask_numpy = segmentation_mask.numpy()
    segmentation_mask_numpy[(segmentation_mask_numpy == 2) | (segmentation_mask_numpy == 3)] = 0

Is it possible to do this without the numpy conversion?

3) In the load_image_train function, they say this function is doing data augmentation, but how? As I see it, they are changing the original image with a flip that depends on a random number, not providing the dataset with another image derived from the original. So is the goal of the function to change an image rather than to add an aug_image to my dataset while keeping the original? If I am correct, how can I change this function to produce an aug_image and keep the original image in the dataset?

4) In other questions, such as "How to apply data augmentation in TensorFlow 2.0 after tfds.load()" and "TensorFlow 2.0 Keras: How to write image summary for TensorBoard", they use many sequential .map() calls or .map().map().cache().batch().repeat(). My question is: is this necessary? Is there a simpler way to do this? I tried to read the tf documentation, but without success.

5) Would you recommend the keras ImageDataGenerator presented here, or is this tf approach better?


1 Answer

有只小跳蛙


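4 - The point of these sequential calls is that they simplify the work of manipulating a dataset to apply transformations, and they are also claimed to be a more performant way to load and process data. As for modularity/simplicity, I think it does its job, since you can easily load the data, pass it through a whole preprocessing pipeline, shuffle it, and iterate over batches with a few lines of code.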

    # Create a training dataset
    train_dataset = tf.data.TFRecordDataset(filenames=train_records_paths).map(parsing_fn)
    train_dataset = train_dataset.shuffle(buffer_size=12000)
    train_dataset = train_dataset.batch(batch_size)
    train_dataset = train_dataset.repeat()

    # Create a test dataset
    test_dataset = tf.data.TFRecordDataset(filenames=test_records_paths).map(parsing_fn)
    test_dataset = test_dataset.batch(batch_size)
    test_dataset = test_dataset.repeat(1)

    # Use integer division so the number of validation steps is an int
    validation_steps = test_size // batch_size

    history = transferred_resnet50.fit(x=train_dataset,
                                       epochs=epochs,
                                       steps_per_epoch=steps_per_epoch,
                                       validation_data=test_dataset,
                                       validation_steps=validation_steps)

For example, this is all I have to do to load my dataset and feed preprocessed data to my model.
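The snippet above depends on TFRecord files and a model that are not shown. To try the same chain without them, here is a minimal sketch on in-memory tensors. The random `images`, the `labels`, and the `preprocess` function are hypothetical stand-ins, not part of the original answer, and `tf.data.AUTOTUNE` assumes TensorFlow 2.4 or later.

```python
import tensorflow as tf

# Hypothetical stand-in data: 10 random "images" with integer labels.
images = tf.random.uniform((10, 8, 8, 3), maxval=255.0)
labels = tf.range(10, dtype=tf.int64)

def preprocess(image, label):
    """Scale pixel values to [0, 1]; runs each time a sample is requested."""
    return image / 255.0, label

batch_size = 4
dataset = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)  # per-sample transform
    .shuffle(buffer_size=10)       # reshuffled on every pass over the data
    .batch(batch_size)             # group samples into batches
    .repeat()                      # loop forever; steps_per_epoch bounds an epoch
    .prefetch(tf.data.AUTOTUNE)    # overlap preprocessing with consumption
)

# With a repeated dataset, the number of batches per epoch is given explicitly.
steps_per_epoch = 10 // batch_size

# Pull one batch to inspect what a model would receive.
first_images, first_labels = next(iter(dataset))
```

Each call returns a new dataset, so the chain is only a compact way of writing the same reassignments used above. None of the steps is mandatory; a pipeline can be as short as a single `.batch()` call.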

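3 - They define a preprocessing function that their dataset is mapped through, which means the mapped function is applied every time a sample is requested. In my case, for instance, I use a parsing function to parse my data from the TFRecord format before using it: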

    import tensorflow as tf

    def parsing_fn(serialized):
        # Describe the fields stored in each serialized example.
        features = {
            'image': tf.io.FixedLenFeature([], tf.string),
            'label': tf.io.FixedLenFeature([], tf.int64),
        }

        # Parse the serialized data so we get a dict with our data.
        parsed_example = tf.io.parse_single_example(serialized=serialized,
                                                    features=features)

        # Get the image as raw bytes.
        image_raw = parsed_example['image']

        # Decode the raw bytes so it becomes a tensor with type.
        image = tf.io.decode_jpeg(image_raw)
        image = tf.image.resize(image, size=[224, 224])

        # Get the label associated with the image.
        label = parsed_example['label']

        # The image and label are now correct TensorFlow types.
        return image, label
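Because the mapped function runs each time a sample is requested, a random flip inside it changes what is seen on a given pass; it does not enlarge the dataset. If the goal is to keep the originals and also add flipped copies, one option is to concatenate the original dataset with a mapped copy. The sketch below assumes toy tensors in place of the Oxford-IIIT Pet images and masks. `random_flip` is a hypothetical helper, and `tf.cond` is used so that image and mask share a single flip decision.

```python
import tensorflow as tf

def random_flip(image, mask):
    """Flip image and mask together with probability 0.5."""
    do_flip = tf.random.uniform(()) > 0.5
    image = tf.cond(do_flip, lambda: tf.image.flip_left_right(image), lambda: image)
    mask = tf.cond(do_flip, lambda: tf.image.flip_left_right(mask), lambda: mask)
    return image, mask

# Toy stand-in data: 4 "images" of shape (2, 2, 1) and masks derived pixel-wise from them.
images = tf.reshape(tf.range(16, dtype=tf.float32), (4, 2, 2, 1))
masks = tf.cast(images > 7, tf.int32)

originals = tf.data.Dataset.from_tensor_slices((images, masks))

# On-the-fly augmentation: same number of samples, randomly flipped on each pass.
augmented = originals.map(random_flip)

# Keep the originals and append the augmented copies: twice as many samples.
combined = originals.concatenate(augmented)

num_original = len(list(originals))
num_combined = len(list(combined))
```

In practice the on-the-fly variant is often sufficient, since the model sees a different random variant of each image across epochs without the dataset growing.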

(Another example) - With the parsing function above, I can use the code below to create a dataset, iterate over my test set images, and plot them.

    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf

    records_path = DATA_DIR + '/TFRecords/test/test_0.tfrecord'

    # Create a dataset
    dataset = tf.data.TFRecordDataset(filenames=records_path)

    # Parse the dataset using a parsing function
    parsed_dataset = dataset.map(parsing_fn)

    # Iterate over (at most) the first 100 samples and plot each image
    for image, label in parsed_dataset.take(100):
        img_array = image.numpy().astype(np.uint8)
        plt.imshow(img_array)
        plt.show()
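Questions 1 and 2 can also be tried directly on tensors. The sketch below uses small hand-made tensors; the mask values and variable names are illustrative assumptions, and the remapping follows the numpy snippet in the question (values 2 and 3 become 0). It uses `tf.where` to stay in TensorFlow, and compares `tf.image.convert_image_dtype` with the cast-and-divide form.

```python
import tensorflow as tf

# Question 2: remap mask values without leaving TensorFlow.
mask = tf.constant([[1, 2], [3, 1]], dtype=tf.int32)
is_background = tf.logical_or(tf.equal(mask, 2), tf.equal(mask, 3))
binary_mask = tf.where(is_background, tf.zeros_like(mask), mask)

# Question 1: for a uint8 input, both forms scale pixel values to [0, 1].
uint8_image = tf.constant([[[0, 128, 255]]], dtype=tf.uint8)
via_cast = tf.cast(uint8_image, tf.float32) / 255.0
via_convert = tf.image.convert_image_dtype(uint8_image, tf.float32)
max_difference = tf.reduce_max(tf.abs(via_cast - via_convert))

# Caveat: tf.image.resize (default bilinear method) returns float32, and
# convert_image_dtype does not rescale when the input already has the
# requested dtype, so after a resize the values stay in [0, 255].
resized = tf.image.resize(uint8_image, (2, 2))
not_rescaled = tf.image.convert_image_dtype(resized, tf.float32)
```

In a pipeline that resizes first, as `load_image_train` does, the two forms are therefore not interchangeable unless the conversion is applied before the resize.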

2022-10-18
