import numpy as np
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.base import TransformerMixin, BaseEstimator

class CustomFeatureExtractor(BaseEstimator, TransformerMixin):
    def __init__(self, feature_1=True, feature_2=True):
        self.feature_1 = feature_1
        self.feature_2 = feature_2

    def extractor(self, tweet):
        # Note: this reads whole columns from the global df, not from `tweet`.
        features = []
        if self.feature_2:
            features.append(df['feature_2'])
        if self.feature_1:
            features.append(df['feature_1'])
        return np.array(features)

    def fit(self, raw_docs, y):
        return self

    def transform(self, raw_docs):
        return np.vstack(tuple([self.extractor(tweet) for tweet in raw_docs]))
Here is the pipeline for the grid search that I tried to feed the data frame into:
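from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression()
# Pipeline
pipe = Pipeline([
    ('features', FeatureUnion([
        ("vectorizer", CountVectorizer(df['content'])),
        ("extractor", CustomFeatureExtractor()),
    ])),
    ('classifier', lr()),
])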
But this raises: TypeError: 'LogisticRegression' object is not callable
Is there a simpler way to do this?
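
The TypeError comes from `lr()`: `lr` is already a `LogisticRegression` instance, so calling it again fails, and the pipeline step should receive `lr` itself. One possibly simpler route, sketched below, is to swap the custom transformer and `FeatureUnion` for scikit-learn's `ColumnTransformer`, which selects data frame columns by name. This is only a sketch under assumptions: the toy data, the labels `y`, and the `C` grid are hypothetical stand-ins, and only the column names come from the question.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Hypothetical toy data; only the column names follow the question.
df = pd.DataFrame({
    "content": ["good day", "bad day", "good mood", "bad mood",
                "very good", "very bad", "good good", "bad bad"],
    "feature_1": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
    "feature_2": [0.5, 0.1, 0.7, 0.2, 0.9, 0.3, 0.6, 0.0],
})
y = [1, 0, 1, 0, 1, 0, 1, 0]  # hypothetical labels

features = ColumnTransformer([
    # A string column name hands the vectorizer a 1-D series of documents.
    ("vectorizer", CountVectorizer(), "content"),
    # A list of names passes the numeric columns through as a 2-D block.
    ("extractor", "passthrough", ["feature_1", "feature_2"]),
])

lr = LogisticRegression()
pipe = Pipeline([
    ("features", features),
    ("classifier", lr),  # pass the instance itself, not lr()
])

# Hypothetical parameter grid; step parameters use the "<step>__<param>" form.
grid = GridSearchCV(pipe, {"classifier__C": [0.1, 1.0, 10.0]}, cv=2)
grid.fit(df, y)  # the whole data frame goes to fit, not to a constructor
predictions = grid.predict(df)
```

The data frame is passed to `fit` and `predict` rather than to `CountVectorizer(...)`, so each cross-validation fold builds its vocabulary from its own training rows only.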