
Python code to return elements in a Series

Python

DIEA 2023-05-09 09:45:47

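I'm currently putting together a script for topic modeling of scraped tweets, but I've run into a few problems. I want to be able to search for all instances of a word and return each instance plus the words before and after it, to give better context for how the word is used.

I've tokenized all the tweets and added them to a Series, where the relative index position is used to identify the surrounding words. The code I currently have is:

myseries = pd.Series(["it", 'was', 'a', 'bright', 'cold', 'day', 'in', 'april'],
                     index= [0,1,2,3,4,5,6,7])

def phrase(w):
    search_word= myseries[myseries == w].index[0]
    before = myseries[[search_word- 1]].index[0]
    after = myseries[[search_word+ 1]].index[0]
    print(myseries[before], myseries[search_word], myseries[after])

The code mostly works, but it raises an error when I search for the first or last word, because the lookup goes outside the index range of the Series. Is there a way to ignore out-of-range indexes and simply return whatever is within range?

The current code also returns only the single word before and after the search word. I'd like to pass a number into the function and get back that many words on each side, but my current code is hard-coded. Is there a way to have it return a specified range of elements?

I'm also having trouble creating a loop to search the entire Series. Depending on what I write, it either returns the first element and nothing else, or prints the first element over and over instead of continuing the search. The offending code that keeps repeating the first element is:

def ws(word):
    for element in tokened_df:
        if word == element:
            search_word = tokened_df[tokened_df == word].index[0]
            before = tokened_df[[search_word - 1]].index[0]
            after = tokened_df[[search_word + 1]].index[0]
            print(tokened_df[before], word, tokened_df[after])

Obviously I'm overlooking something simple, but I can't for the life of me figure out what it is. How can I modify the code so that, if the same word is repeated in the Series, it returns every instance of the word along with the surrounding words? I want it to follow the logic of "if the condition is true, run the phrase function; if not, continue through the Series".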


2 answers

紅顏莎娜

Contributed 1842 experience points, received over 13 likes

Something like this? I added a repeated word ("bright") to your example, and added n_before and n_after parameters to set the number of surrounding words.

import pandas as pd

myseries = pd.Series(["it", 'was', 'a', 'bright', 'bright', 'cold', 'day', 'in', 'april'],
                     index= [0,1,2,3,4,5,6,7,8])

def phrase(w, n_before=1, n_after=1):
    # Index labels of every occurrence of w
    search_words = myseries[myseries == w].index
    for index in search_words:
        # Clamp the window so it never runs past either end of the Series
        start_index = max(index - n_before, 0)
        end_index = min(index + n_after+1, myseries.shape[0])
        print(myseries.iloc[start_index: end_index])

phrase("bright", n_before=2, n_after=3)

This gives:

1       was
2         a
3    bright
4    bright
5      cold
6       day
dtype: object
2         a
3    bright
4    bright
5      cold
6       day
7        in
dtype: object
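Note that the function above passes index labels to .iloc, which works here only because the labels happen to be 0 through 8. As a minimal sketch, assuming you would rather get the words back as lists and want the lookup to work with any index, something like the following could be used. The name context_windows and its parameters are hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd


def context_windows(series, word, n_before=1, n_after=1):
    """Return one list of words per occurrence of `word` (hypothetical helper)."""
    # Integer positions (not index labels) of every match, so any index works.
    positions = np.flatnonzero((series == word).to_numpy())
    windows = []
    for pos in positions:
        start = max(pos - n_before, 0)  # clamp at the start of the Series
        stop = pos + n_after + 1        # slicing past the end is safe
        windows.append(series.iloc[start:stop].tolist())
    return windows


tokens = pd.Series(
    ["it", "was", "a", "bright", "bright", "cold", "day", "in", "april"],
    index=list("abcdefghi"),  # deliberately non-numeric labels
)
print(context_windows(tokens, "bright", n_before=2, n_after=3))
print(context_windows(tokens, "it"))
```

Only the lower bound needs clamping: a negative start would wrap around to the end of the Series, whereas a stop beyond the last element is simply truncated by slicing.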

Replied 2023-05-09

茅侃侃

Contributed 1842 experience points, received over 22 likes

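This isn't very elegant, but you probably need some conditionals to handle words that appear at the start or end of the phrase. To handle repeated words, find every instance of the word and loop over them with your print statement. In the variable myseries I repeated the word cold twice, so there should be two printed lines.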

import pandas as pd

myseries = pd.Series(["it", 'was', 'a', 'cold', 'bright', 'cold', 'day', 'in', 'april'],
                     index= [0,1,2,3,4,5,6,7,8])

def phrase(w):
    # Loop over the index of every occurrence of w
    for i in myseries[myseries == w].index.tolist():
        search_word= i
        if search_word == 0:
            # First word: there is nothing before it
            print(myseries[search_word], myseries[i+1])
        elif search_word == len(myseries)-1:
            # Last word: there is nothing after it
            print(myseries[i-1], myseries[search_word])
        else:
            print(myseries[i-1], myseries[search_word], myseries[i+1])

Output:

>>> myseries
0        it
1       was
2         a
3      cold
4    bright
5      cold
6       day
7        in
8     april
dtype: object
>>> phrase("was")
it was a
>>> phrase("cold")
a cold bright
bright cold day
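As for why the loop in the question keeps printing the first match: tokened_df[tokened_df == word].index[0] is evaluated again on every iteration and always picks the first occurrence, regardless of which element the loop has reached. A minimal sketch of one way around this, assuming the tokens are available as a plain list (for example via myseries.tolist()), is to track the position with enumerate. The name find_contexts and its parameter n are hypothetical.

```python
def find_contexts(tokens, word, n=1):
    """Yield the words around each occurrence of `word` (hypothetical helper)."""
    for pos, token in enumerate(tokens):
        if token != word:
            continue  # not a match: move on to the next token
        start = max(pos - n, 0)  # clamp so a negative start does not wrap around
        # A stop index past the end of the list is truncated by slicing.
        yield " ".join(tokens[start:pos + n + 1])


tokens = ["it", "was", "a", "cold", "bright", "cold", "day", "in", "april"]
for context in find_contexts(tokens, "cold"):
    print(context)
```

Because the position comes from the loop itself, each occurrence is reported once, and the slice bounds make the three-way conditional for the first and last word unnecessary.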

Replied 2023-05-09
