Some unlucky coworker saved some data to a file like this:
s = b'The em dash: \xe2\x80\x94'
with open('foo.txt', 'w') as f:
    f.write(str(s))
when they should have used
s = b'The em dash: \xe2\x80\x94'
with open('foo.txt', 'w') as f:
    f.write(s.decode())
Now foo.txt looks like
b'The em dash: \xe2\x80\x94'
instead of
The em dash: —
I have already read this file as a string:
with open('foo.txt') as f:
    bad_foo = f.read()
How can I now convert bad_foo from the incorrectly saved format into the string that should have been saved?
3 Answers

忽然笑
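You can try literal_eval:
from ast import literal_eval
# Raw string, so the backslashes stay literal, just as they are in the file
test = r"b'The em-dash: \xe2\x80\x94'"
print(test)
# Parse the text as a Python bytes literal
res = literal_eval(test)
# Decode the recovered bytes (UTF-8 by default)
print(res.decode())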
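Applied to the text read from the file, the same idea might look like the sketch below. The helper name `repair_bytes_repr` is hypothetical, and the sketch assumes the file holds exactly one bytes literal that was originally UTF-8. The type check is there because `literal_eval` will happily return other literals (numbers, lists, plain strings) if the file contains something unexpected.

```python
from ast import literal_eval


def repair_bytes_repr(bad_text: str, encoding: str = "utf-8") -> str:
    """Turn the text "b'...'" back into the string that should have been saved.

    Hypothetical helper: assumes bad_text is the repr of a single bytes object.
    """
    value = literal_eval(bad_text.strip())
    if not isinstance(value, bytes):
        raise ValueError("expected a bytes literal, got %s" % type(value).__name__)
    return value.decode(encoding)


# Stand-in for the text returned by f.read() in the question.
bad_foo = str(b'The em dash: \xe2\x80\x94')
fixed_foo = repair_bytes_repr(bad_foo)
```

Writing `fixed_foo` back with `open('foo.txt', 'w', encoding='utf-8')` would then give the file the contents it should have had.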

BIG陽
If you trust that the input is not malicious, you can use ast.literal_eval on the broken string.
import ast
# Create a sad broken string
s = r"b'The em-dash: \xe2\x80\x94'"  # raw string: the file holds literal backslashes
# Parse and evaluate the string as raw Python source, creating a `bytes` object
s_bytes = ast.literal_eval(s)
# Now decode the `bytes` as normal
s_fixed = s_bytes.decode()
Otherwise, you will have to parse the string by hand and remove or replace the offending repr escapes.
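One possible shape for the manual route is sketched below. It is not part of the answer above, and the function name `unrepr_bytes` is hypothetical. It assumes the text is exactly what `str()` produces for a bytes object: pure ASCII, wrapped in `b'...'` or `b"..."`, with non-printable bytes written as `\xNN` escapes. It strips the wrapper and lets the standard `unicode_escape` codec undo the escapes, so nothing is evaluated as Python source.

```python
def unrepr_bytes(text: str, encoding: str = "utf-8") -> str:
    """Undo str(some_bytes) without evaluating the text as Python source.

    Hypothetical helper: assumes text is the repr of one bytes object.
    """
    text = text.strip()
    # Expect the b'...' or b"..." wrapper that repr() puts around bytes.
    if len(text) < 3 or text[0] != "b" or text[1] not in "'\"" or text[-1] != text[1]:
        raise ValueError("input does not look like the repr of a bytes object")
    body = text[2:-1]
    # The repr of bytes is pure ASCII. unicode_escape maps each \xNN escape to
    # the code point NN, and latin-1 maps code points 0-255 back to single bytes.
    raw = body.encode("ascii").decode("unicode_escape").encode("latin-1")
    return raw.decode(encoding)


# Stand-in for the text read from foo.txt in the question.
bad_foo = str(b'The em dash: \xe2\x80\x94')
fixed_foo = unrepr_bytes(bad_foo)
```

Input that is not a genuine bytes repr (for example, text containing non-ASCII characters or `\u` escapes above code point 255) raises an exception from the `encode` calls instead of being silently accepted.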