
How do I extract a feature vector from a single image in PyTorch?

Python

ibeautiful 2023-06-20 13:23:41

I am trying to learn more about computer vision models and explore how they work. To better understand how to interpret feature vectors, I am trying to use PyTorch to extract one. Below is the code I have pieced together from various places.

import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms as transforms
from torch.autograd import Variable
from PIL import Image

img=Image.open("Documents/01235.png")

# Load the pretrained model
model = models.resnet18(pretrained=True)

# Use the model object to select the desired layer
layer = model._modules.get('avgpool')

# Set model to evaluation mode
model.eval()

transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def get_vector(image_name):
    # Load the image with Pillow library
    img = Image.open("Documents/Documents/Driven Data Competitions/Hateful Memes Identification/data/01235.png")
    # Create a PyTorch Variable with the transformed image
    t_img = transforms(img)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)
    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.data)
    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    model(t_img)
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding

pic_vector = get_vector(img)

When I do this, I get the following error:

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [3, 224, 224] instead

I'm sure this is a basic error, but I can't seem to figure out how to fix it. I was under the impression that the "totensor" transform would make my data 4-D, but either it isn't working correctly or I am misunderstanding it. I would appreciate any help or resources I can use to learn more!


3 Answers

蕭十郎

Contributed 1815 experience points, received over 13 likes

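The default nn.Modules in PyTorch expect an extra batch dimension. If a module's input has shape (B, ...), then its output will also have shape (B, ...) (although the later dimensions may change depending on the layer). This behavior allows efficient inference on a batch of B inputs at once. To make your code conform, you can unsqueeze an extra singleton dimension onto the front of the t_img tensor before sending it to the model, making it a (1, ...) tensor. You will also need to flatten the output of layer before storing it if you want to copy it into your one-dimensional my_embedding tensor.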
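The shape bookkeeping described above can be sketched without downloading a model. This is a minimal illustration only: the random tensors stand in for a preprocessed image and for the (1, 512, 7, 7) feature map that reaches the average-pooling layer in ResNet18 at 224x224 input, and the `AdaptiveAvgPool2d` module is created here purely as a stand-in for the model's `avgpool` layer.

```python
import torch
import torch.nn as nn

# Stand-in for the output of ToTensor/Normalize: one image, shape (C, H, W).
t_img = torch.rand(3, 224, 224)

# unsqueeze(0) prepends a batch dimension of size 1.
batched = t_img.unsqueeze(0)
print(batched.shape)  # torch.Size([1, 3, 224, 224])

# Stand-in for the feature map entering avgpool (assumed shape for ResNet18).
feature_map = torch.rand(1, 512, 7, 7)
avgpool = nn.AdaptiveAvgPool2d((1, 1))
pooled = avgpool(feature_map)
print(pooled.shape)  # torch.Size([1, 512, 1, 1])

# copy_ needs matching (or broadcastable) shapes, so flatten to (512,) first.
my_embedding = torch.zeros(512)
my_embedding.copy_(pooled.flatten())
print(my_embedding.shape)  # torch.Size([512])
```

Without the flatten call, the (1, 512, 1, 1) tensor cannot be copied into a 512-element vector, which is why both changes are needed together.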

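A few other things:

You should run inference inside a torch.no_grad() context to avoid computing gradients, since you will not need them (note that model.eval() only changes the behavior of certain layers such as dropout and batch normalization; it does not disable construction of the computation graph, whereas torch.no_grad() does).
I assume this is just a copy-paste issue, but transforms is both the name of an imported module and the name of a global variable.
o.data just returns the data of o as a tensor detached from autograd. In the old Variable interface (around PyTorch 0.3.1 and earlier) this used to be necessary, but the Variable interface was deprecated in PyTorch 0.4.0 and no longer does anything useful; using it now only causes confusion. Unfortunately, many tutorials are still written with this old and unnecessary interface.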
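The difference between model.eval() and torch.no_grad() mentioned in the first point can be checked on a small module. This is a sketch using a hypothetical two-layer network (`net`), not the ResNet from the question; the behavior it shows is the same for any module with trainable parameters.

```python
import torch
import torch.nn as nn

# Hypothetical small network, used only to illustrate the point.
net = nn.Sequential(nn.Linear(4, 2), nn.Dropout(p=0.5))
x = torch.rand(1, 4)

# eval() switches dropout (and batch norm) to inference behavior,
# but autograd still records the forward pass.
net.eval()
out_eval = net(x)
print(out_eval.requires_grad)  # True

# no_grad() is what actually stops the graph from being built.
with torch.no_grad():
    out_no_grad = net(x)
print(out_no_grad.requires_grad)  # False
```

For feature extraction, using both together is the usual choice: eval() for deterministic layer behavior and no_grad() to save memory and time.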

The updated code is as follows:

import torch
import torchvision
import torchvision.models as models
from PIL import Image

img = Image.open("Documents/01235.png")

# Load the pretrained model
model = models.resnet18(pretrained=True)

# Use the model object to select the desired layer
layer = model._modules.get('avgpool')

# Set model to evaluation mode
model.eval()

transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256),
    torchvision.transforms.CenterCrop(224),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def get_vector(image):
    # Create a PyTorch tensor with the transformed image
    t_img = transforms(image)
    # Create a vector of zeros that will hold our feature vector
    # The 'avgpool' layer has an output size of 512
    my_embedding = torch.zeros(512)
    # Define a function that will copy the output of a layer
    def copy_data(m, i, o):
        my_embedding.copy_(o.flatten())                 # <-- flatten
    # Attach that function to our selected layer
    h = layer.register_forward_hook(copy_data)
    # Run the model on our transformed image
    with torch.no_grad():                               # <-- no_grad context
        model(t_img.unsqueeze(0))                       # <-- unsqueeze
    # Detach our copy function from the layer
    h.remove()
    # Return the feature vector
    return my_embedding

pic_vector = get_vector(img)
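The forward-hook pattern itself does not depend on ResNet. As a rough sketch, the same idea can be written as a small helper that works on whole batches and always removes the hook, even if the forward pass raises. The `tiny` network and the `extract` helper below are hypothetical names introduced for illustration; they are not part of the original answer.

```python
import torch
import torch.nn as nn

# Hypothetical tiny CNN standing in for a real backbone such as ResNet18.
tiny = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((1, 1)),   # the layer whose output we want
    nn.Flatten(),
    nn.Linear(8, 10),
)
tiny.eval()

def extract(model, layer, batch):
    """Run `batch` through `model` and return `layer`'s output as (B, features)."""
    captured = {}

    def hook(module, inputs, output):
        # Keep the batch dimension, flatten everything after it.
        captured["features"] = output.flatten(start_dim=1)

    handle = layer.register_forward_hook(hook)
    try:
        with torch.no_grad():
            model(batch)
    finally:
        handle.remove()  # never leave a stale hook attached
    return captured["features"]

feats = extract(tiny, tiny[2], torch.rand(4, 3, 32, 32))
print(feats.shape)  # torch.Size([4, 8])
```

Storing the output in a dictionary instead of a preallocated `torch.zeros(512)` buffer avoids hard-coding the feature size, at the cost of holding a reference to the layer's output tensor.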

2023-06-20

qq_笑_17

Contributed 1818 experience points, received over 7 likes

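You can use create_feature_extractor from torchvision.models.feature_extraction to extract the features of the desired layers from the model.

The node name of the last hidden layer in ResNet18 is flatten, which is essentially the avgpool output flattened to 1D. You can extract any layers you want by adding them to the return_nodes dictionary below.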

from torchvision.io import read_image
from torchvision.models import resnet18, ResNet18_Weights
from torchvision.models.feature_extraction import create_feature_extractor

# Step 1: Initialize the model with the best available weights
weights = ResNet18_Weights.DEFAULT
model = resnet18(weights=weights)
model.eval()

# Step 2: Initialize the inference transforms
preprocess = weights.transforms()

# Step 3: Create the feature extractor with the required nodes
return_nodes = {'flatten': 'flatten'}
feature_extractor = create_feature_extractor(model, return_nodes=return_nodes)

# Step 4: Load the image(s) and apply inference preprocessing transforms
image = "?"
image = read_image(image).unsqueeze(0)
model_input = preprocess(image)

# Step 5: Extract the features
features = feature_extractor(model_input)
flatten_fts = features["flatten"].squeeze()

print(flatten_fts.shape)

2023-06-20

瀟瀟雨雨

Contributed 1833 experience points, received over 4 likes

Instead of model(t_img), do this:

model(t_img[None])

This adds an extra dimension, so the image will have shape [1, 3, 224, 224] and the model will accept it.
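Indexing with None is shorthand for the unsqueeze call used in the first answer. A quick check, using a random tensor as a stand-in for the transformed image:

```python
import torch

t_img = torch.rand(3, 224, 224)  # stand-in for the transformed image

via_none = t_img[None]             # index with None to add a leading axis
via_unsqueeze = t_img.unsqueeze(0)

print(via_none.shape)  # torch.Size([1, 3, 224, 224])
print(torch.equal(via_none, via_unsqueeze))  # True

# Both are views of the original tensor, so no image data is copied.
print(via_none.data_ptr() == t_img.data_ptr())  # True
```

Which spelling to use is a matter of taste; unsqueeze(0) states the intent more explicitly, while [None] is shorter.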

2023-06-20
