Gridsearch for NLP - How to combine CountVec and other features?

Python

郎朗坤 2023-10-26 16:33:16

I'm working on a basic NLP project on sentiment analysis and I want to use GridSearchCV to optimize my model. The code below shows a sample of the data frame I'm working with. 'content' is the column to be passed to CountVectorizer, 'label' is the y column to be predicted, and feature_1 and feature_2 are columns I would also like to include in the model.

    df = pd.DataFrame([
        {'content': 'Got flat way today Pot hole Another thing tick crap thing happen week list', 'feature_1': '1', 'feature_2': '34', 'label':1},
        {'content': 'UP today Why doe head hurt badly', 'feature_1': '5', 'feature_2': '142', 'label':1},
        {'content': 'spray tan fail leg foot Ive scrubbing foot look better ', 'feature_1': '7', 'feature_2': '123', 'label':0},
    ])

I'm following a StackOverflow answer: Perform feature selection using pipeline and gridsearch

    from sklearn.pipeline import FeatureUnion, Pipeline
    from sklearn.base import TransformerMixin, BaseEstimator

    class CustomFeatureExtractor(BaseEstimator, TransformerMixin):
        def __init__(self, feature_1=True, feature_2=True):
            self.feature_1=feature_1
            self.feature_2=feature_2

        def extractor(self, tweet):
            features = []
            if self.feature_2:
                features.append(df['feature_2'])
            if self.feature_1:
                features.append(df['feature_1'])
            return np.array(features)

        def fit(self, raw_docs, y):
            return self

        def transform(self, raw_docs):
            return np.vstack(tuple([self.extractor(tweet) for tweet in raw_docs]))

Below is the grid search I tried to fit my data frame into:

    lr = LogisticRegression()

    # Pipeline
    pipe = Pipeline([('features', FeatureUnion([("vectorizer", CountVectorizer(df['content'])),
                                                ("extractor", CustomFeatureExtractor())]))
                     ,('classifier', lr())
                    ])

But this yields: TypeError: 'LogisticRegression' object is not callable

Is there a simpler way to do this?


1 Answer

catspeake

Contributed 1111 experience points, received 0 upvotes

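The TypeError in the question comes from `lr()`: `lr` is already a `LogisticRegression` instance, so calling it again fails. Passing `df['content']` to the `CountVectorizer` constructor is also not how the text column reaches the vectorizer. A minimal sketch of one possibly simpler setup follows. It swaps the custom extractor for scikit-learn's `ColumnTransformer`, which is a substitution and not the approach from the linked answer. The `StandardScaler` step, the parameter grid values, `max_iter=1000`, and repeating the three sample rows so that cross-validation has enough data are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# The three sample rows from the question.
rows = [
    {'content': 'Got flat way today Pot hole Another thing tick crap thing happen week list',
     'feature_1': '1', 'feature_2': '34', 'label': 1},
    {'content': 'UP today Why doe head hurt badly',
     'feature_1': '5', 'feature_2': '142', 'label': 1},
    {'content': 'spray tan fail leg foot Ive scrubbing foot look better ',
     'feature_1': '7', 'feature_2': '123', 'label': 0},
]
# Assumption: rows are repeated only so 2-fold CV has both classes in each fold.
df = pd.DataFrame(rows * 4)
# The numeric columns are stored as strings in the sample; cast them to float.
df[['feature_1', 'feature_2']] = df[['feature_1', 'feature_2']].astype(float)

X = df[['content', 'feature_1', 'feature_2']]
y = df['label']

features = ColumnTransformer([
    # A plain string selects a 1-D column, which is what CountVectorizer expects.
    ('vectorizer', CountVectorizer(), 'content'),
    # A list of names selects a 2-D block for the numeric transformer.
    ('numeric', StandardScaler(), ['feature_1', 'feature_2']),
])

pipe = Pipeline([
    ('features', features),
    # Pass the estimator instance itself; do not call it a second time.
    ('classifier', LogisticRegression(max_iter=1000)),
])

# Nested parameters are addressed as <step>__<transformer>__<parameter>.
param_grid = {
    'features__vectorizer__ngram_range': [(1, 1), (1, 2)],
    'classifier__C': [0.1, 1.0, 10.0],
}

search = GridSearchCV(pipe, param_grid, cv=2)
search.fit(X, y)
print(search.best_params_)
```

Because the whole data frame is passed as `X`, each cross-validation split slices the text and numeric columns together, which avoids referencing the global `df` inside a transformer as the custom extractor in the question does.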

2023-10-26
