
Replace user-dictionary words with 1 and all other words with 0

慕哥6287543 2021-09-11 15:36:05
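I have a dataset of reviews, with reviews such as:

Simply the best. I bought this last year. Still using. No problems faced till date.Amazing battery life. Works fine in darkness or broad daylight. Best gift for any book lover.

(This is from the original dataset. In the dataset I am processing, I have removed all punctuation and converted everything to lowercase.)

What I want to do is replace some words with 1 (according to my dictionary) and all other words with 0. My dictionary is:

dict = {"amazing":"1","super":"1","good":"1","useful":"1","nice":"1","awesome":"1","quality":"1","resolution":"1","perfect":"1","revolutionary":"1","and":"1","good":"1","purchase":"1","product":"1","impression":"1","watch":"1","quality":"1","weight":"1","stopped":"1","i":"1","easy":"1","read":"1","best":"1","better":"1","bad":"1"}

I would like my output to look like this:

0010000000000001000000000100000

I used this code:

df['newreviews'] = df['reviews'].map(dict).fillna("0")

This always returns 0 as the output. I did not want that, so I stored 1 and 0 as strings, but I still get the same result. Any suggestions on how to solve this?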
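For readers wondering why the attempt above yields only 0: `Series.map` looks up each whole cell value as a dictionary key. A complete review string is never a key, so every such row becomes NaN and is then filled with "0". A minimal sketch reproducing this, using a shortened review and a two-word dictionary that are assumptions chosen only for illustration:

```python
import pandas as pd

# Illustrative data only (not the asker's full dataset or dictionary).
word_flags = {"best": "1", "amazing": "1"}
df = pd.DataFrame({"reviews": ["simply the best", "best"]})

# map() looks up the entire cell value, not each word inside it.
mapped = df["reviews"].map(word_flags)
filled = mapped.fillna("0")

print(mapped.isna().tolist())  # the multi-word review has no matching key
print(filled.tolist())
```

Only the row whose entire content is a dictionary key gets a "1", which is why the reviews need to be split into words first.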

3 Answers

慕田峪7331174

Contributed 1828 experience points, received over 13 likes

You can do:


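# `sent` is assumed to hold one review string and `mydict` the dictionary from the question
import re

# clean the sentence: strip periods
sent = re.sub(r'\.', '', sent)

# convert to a list of lowercase words
sent = sent.lower().split()

# get values from dict using comprehension: '1' if the word is a key, '0' otherwise
new_sent = ''.join(['1' if x in mydict else '0' for x in sent])
print(new_sent)

001100000000000000000000100000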
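The same idea can be wrapped in a small reusable function. This is a minimal sketch, not the answer's exact code: the function name `encode_review`, the shortened review, and the three-word `vocab` are assumptions made for illustration. It extracts words with `re.findall` instead of deleting periods, so words separated only by punctuation are still counted individually.

```python
import re

def encode_review(text, vocab):
    """Return one character per word: '1' if the word is in vocab, else '0'."""
    # \w+ matches runs of letters, digits, and underscores,
    # so "date.Amazing" yields two separate words.
    words = re.findall(r"\w+", text.lower())
    return "".join("1" if w in vocab else "0" for w in words)

vocab = {"best", "i", "amazing"}  # illustrative subset of the asker's dictionary
review = "Simply the best. I bought this last year. No problems faced till date.Amazing battery life."
print(encode_review(review, vocab))
```

Using a set (or the dictionary itself) for `vocab` keeps each membership test constant-time.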


Answered 2021-09-11
浮云間

Contributed 1829 experience points, received over 4 likes

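First, do not use dict as a variable name: it is the name of Python's built-in dictionary type (a built-in, not a reserved word), and reassigning it shadows the built-in. Then use a list comprehension with get to replace unmatched words with 0.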


Note:

If the data looks like date.Amazing, with no space after the punctuation, the punctuation needs to be replaced with a space.


import pandas as pd

df = pd.DataFrame({'reviews':['Simply the best. I bought this last year. Still using. No problems faced till date.Amazing battery life. Works fine in darkness or broad daylight. Best gift for any book lover.']})

d = {"amazing":"1","super":"1","good":"1","useful":"1","nice":"1","awesome":"1","quality":"1","resolution":"1","perfect":"1","revolutionary":"1","and":"1","good":"1","purchase":"1","product":"1","impression":"1","watch":"1","quality":"1","weight":"1","stopped":"1","i":"1","easy":"1","read":"1","best":"1","better":"1","bad":"1"}

# replace punctuation with a space, then lowercase
# (regex=True is required in pandas 2.0+, where str.replace defaults to literal matching)
df['reviews'] = df['reviews'].str.replace(r'[^\w\s]+', ' ', regex=True).str.lower()

# '1' for words found in d, '0' otherwise
df['newreviews'] = [''.join(d.get(y, '0') for y in x.split()) for x in df['reviews']]

Alternative:

df['newreviews'] = df['reviews'].apply(lambda x: ''.join(d.get(y, '0') for y in x.split()))
print(df)

                                             reviews  \
0  simply the best  i bought this last year  stil...   

                        newreviews  
0  0011000000000001000000000100000  
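The note in this answer about replacing punctuation with a space, rather than deleting it, can be checked in isolation. A small sketch using only the standard library, with a shortened sample string chosen for illustration:

```python
import re

text = "till date.Amazing battery life."

# Deleting punctuation fuses the two neighbouring words into one token.
deleted = re.sub(r"[^\w\s]+", "", text).lower().split()

# Replacing punctuation with a space keeps them separate.
spaced = re.sub(r"[^\w\s]+", " ", text).lower().split()

print(deleted)
print(spaced)
```

With deletion, "dateamazing" becomes a single token that is not in the dictionary, so the match on "amazing" is lost and the output is one character shorter.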


Answered 2021-09-11
人到中年有點甜

Contributed 1895 experience points, received over 7 likes

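You can do this with

df.replace(repl, regex=True, inplace=True)

where df is your DataFrame and repl is your dictionary. Note that with regex=True the dictionary keys are treated as regular expressions and also match inside longer words, and words that are not in the dictionary are left unchanged rather than set to 0.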
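A regex-based replacement can also be applied word by word, which covers the "everything else becomes 0" part of the question. The following is a minimal sketch and not the call suggested in this answer: it swaps in `Series.str.replace` with a callable replacement, and the sample reviews and the three-word `vocab` are assumptions for illustration.

```python
import pandas as pd

vocab = {"best", "i", "amazing"}  # illustrative subset of the dictionary
s = pd.Series(["simply the best i bought this", "amazing battery life"])

# Replace every whole word with "1" or "0"; the callable receives a re.Match object.
flags = s.str.replace(
    r"\w+",
    lambda m: "1" if m.group(0) in vocab else "0",
    regex=True,
)

# Drop the spaces left between the flags.
flags = flags.str.replace(r"\s+", "", regex=True)

print(flags.tolist())
```

Because the pattern `\w+` consumes each word in full, a short key such as "i" cannot match inside a longer word.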


Answered 2021-09-11