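How can I split a string at any punctuation mark or whitespace, except the # character?
tweet = "I went on #Russia to see the world cup. We lost!"
I want to split the string like this:
["I", "went", "on", "#Russia", "to", "see", "the", "world", "cup", "We", "lost"]
My attempt:
p = re.compile(r"\w+|[^\w\s]", re.UNICODE)
This does not work because it produces "Russia" instead of "#Russia" (the "#" is matched as a separate token).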
3 Answers

守候你守候我
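Using the re.findall function:
import re
tweet = "I went on #Russia to see the world cup. We lost!"
# Match runs of word characters and "#"; everything else acts as a separator.
words = re.findall(r'[\w#]+', tweet)
print(words)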
Output:
['I', 'went', 'on', '#Russia', 'to', 'see', 'the', 'world', 'cup', 'We', 'lost']
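Since the question asks for a split, it may help to compare the pattern from the question with a re.split variant. The following is only a sketch: the variable names are illustrative, and the filtering step is an assumption about how empty strings should be handled.

```python
import re

tweet = "I went on #Russia to see the world cup. We lost!"

# The pattern from the question: "#" is not a word character, so the second
# alternative matches it as a standalone punctuation token.
original = re.findall(r"\w+|[^\w\s]", tweet)
print(original[3:5])  # ['#', 'Russia']

# Split on runs of characters that are neither word characters nor "#".
# The trailing "!" leaves an empty string at the end, so it is filtered out.
split_words = [w for w in re.split(r"[^\w#]+", tweet) if w]
print(split_words)
```

Here re.split gives the same list as re.findall once the empty trailing element is dropped, so which one to use is mostly a matter of taste.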

牧羊人nacy
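Using re.sub
Example:
import re
tweet = "I went on #Russia to see the world cup. We lost!"
# Split on whitespace first, then strip every character that is not a word character or "#".
res = [re.sub(r"[^\w#]", "", word) for word in tweet.split()]
print(res)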
print(res)
Output:
['I', 'went', 'on', '#Russia', 'to', 'see', 'the', 'world', 'cup', 'We', 'lost']
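Both answers give the same result for the tweet in the question, but they can differ on other input. The sketch below uses a made-up sample string (not from the question) to show two cases where they diverge: punctuation inside a word and a punctuation mark standing on its own.

```python
import re

# Hypothetical sample, chosen only to expose the difference.
sample = "Don't stop - #WorldCup2018!"

# re.findall treats the apostrophe as a separator, so "Don't" becomes two tokens.
by_findall = re.findall(r"[\w#]+", sample)
print(by_findall)  # ['Don', 't', 'stop', '#WorldCup2018']

# The re.sub approach splits on whitespace only, so "Don't" collapses to "Dont"
# and the lone "-" becomes an empty string.
by_sub = [re.sub(r"[^\w#]", "", word) for word in sample.split()]
print(by_sub)  # ['Dont', 'stop', '', '#WorldCup2018']
```

If empty strings are unwanted with the re.sub approach, they would need to be filtered out afterwards.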