I am trying to get some very basic sentiment analysis from the CNBC website. I put this code together and it runs fine.

    from bs4 import BeautifulSoup
    import urllib.request
    from pandas import DataFrame
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    resp = urllib.request.urlopen("https://www.cnbc.com/finance/")
    # name the parser explicitly so BeautifulSoup does not have to guess one
    soup = BeautifulSoup(resp, "html.parser", from_encoding=resp.info().get_param('charset'))

    substring = 'https://www.cnbc.com/'
    # start empty; seeding the list with 'review' would add a bogus first row
    df = []
    for link in soup.find_all('a', href=True):
        print(link['href'])
        if link['href'].find(substring) == 0:
            # keep only links that start with the CNBC prefix
            df.append(link['href'])

    # convert list to data frame
    df = DataFrame(df)
    # add column name
    df.columns = ['review']

    # clean up: remove digits (regex=True is required in recent pandas versions)
    df['review'] = df['review'].str.replace(r'\d+', '', regex=True)
    # get rid of special characters
    df['review'] = df['review'].str.replace(r'[^\w\s]+', '', regex=True)

    sid = SentimentIntensityAnalyzer()
    df['sentiment'] = df['review'].apply(lambda x: sid.polarity_scores(x))

    def convert(x):
        if x < 0:
            return "negative"
        elif x > .2:
            return "positive"
        else:
            return "neutral"

    df['result'] = df['sentiment'].apply(lambda x: convert(x['compound']))
    df['result']

When I run the code above I get positive and negative results, but they are not mapped back to the original "review". How can I show each sentiment in the data frame next to the text of each link? Thanks!
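One possible way to get the label next to each link, as a minimal sketch rather than a definitive fix: the last line of the code above displays only the `result` column, so selecting several columns at once shows the label beside the text it came from. The sketch also keeps the untouched URL in its own column before cleaning. The names `build_sentiment_table`, `toy_compound`, `sample_links`, and the `link` and `compound` columns are hypothetical and not from the original code. The scorer is passed in as a function so the example runs without NLTK; `toy_compound` is only a stand-in for VADER. The sketch also replaces special characters with a space instead of an empty string, which is an assumption about the intent, since deleting them glues the words of a URL into one token.

```python
import re

import pandas as pd


def convert(compound):
    """Map a compound score to a label, using the thresholds from the question."""
    if compound < 0:
        return "negative"
    elif compound > 0.2:
        return "positive"
    return "neutral"


def build_sentiment_table(links, compound_fn):
    """Return one row per link: original URL, cleaned text, score, and label."""
    df = pd.DataFrame({"link": list(links)})
    # Leave the original URL in "link" and clean a copy into "review".
    df["review"] = df["link"].map(lambda s: re.sub(r"\d+", "", s))
    # Assumption: swap special characters for spaces so words stay separate.
    df["review"] = df["review"].map(lambda s: re.sub(r"[^\w\s]+", " ", s).strip())
    df["compound"] = df["review"].map(compound_fn)
    df["result"] = df["compound"].map(convert)
    # Selecting several columns keeps each label beside its source row.
    return df[["link", "review", "compound", "result"]]


def toy_compound(text):
    """Hypothetical stand-in scorer; it is NOT VADER and only serves the demo."""
    if "gain" in text:
        return 0.5
    if "loss" in text:
        return -0.5
    return 0.0


sample_links = [
    "https://www.cnbc.com/2020/01/01/stocks-gain.html",
    "https://www.cnbc.com/2020/01/02/bank-loss.html",
    "https://www.cnbc.com/finance/",
]

table = build_sentiment_table(sample_links, toy_compound)
print(table[["link", "result"]])
```

With the real analyzer from the question, the scorer argument could be `lambda text: sid.polarity_scores(text)["compound"]`. In the original code, replacing the final `df['result']` with `df[['review', 'result']]` should give the same side-by-side view without any other changes.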