3 Answers

Contributor: 1842 experience points, 13+ upvotes
Loop over the input file, group the lines into a dictionary keyed by `rpl-id`, and append each value to that key's list:
rpl_dict = {}
with open('rpl_input.txt') as rpl_input_file:
    lines = rpl_input_file.readlines()
    for line in lines:
        # Fetching current `rpl-id`
        if line.startswith('>rpl'):
            rpl_key = line.strip()
        # Fetching current `rpl-value`
        else:
            rpl_value = line.strip()
            # Appending current `rpl-value`
            if rpl_key not in rpl_dict:
                rpl_dict[rpl_key] = []
            rpl_dict[rpl_key].append(rpl_value)
# {'>rpl-7': ['ATGGCTCCAAC', 'AAGAAAGTGCCACAGGTTCCAGAAAC'], '>rpl-8': ['AAGAACAAGGAGAAGAAGACCCAATACTTCAAGCGTGC', 'GCTCTCCAGATCCTCCGTCTTCGTCAGATCAA', 'AAGTTCAACATCATCTGTCTTGAGGA']}
print(rpl_dict)
with open('rpl_output.txt', 'w') as rpl_output_file:
    for rpl_id, rpl_values in rpl_dict.items():
        # Write the id of the current group (not the last key seen while reading)
        rpl_output_file.write(f'{rpl_id}\n')
        for v in rpl_values:
            rpl_output_file.write(f'{v}\n')
Output file:
>rpl-7
ATGGCTCCAAC
AAGAAAGTGCCACAGGTTCCAGAAAC
>rpl-8
AAGAACAAGGAGAAGAAGACCCAATACTTCAAGCGTGC
GCTCTCCAGATCCTCCGTCTTCGTCAGATCAA
AAGTTCAACATCATCTGTCTTGAGGA
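
The same grouping idea can also be sketched with `collections.defaultdict`, which removes the explicit "is the key already there" check. This is only a minimal sketch: the function name `group_records` and the sample data are made up for illustration, and it assumes every sequence line belongs to the most recent line starting with `>`.

```python
from collections import defaultdict

def group_records(lines):
    """Group sequence lines under the most recent '>' header line.

    `group_records` is a hypothetical helper, not part of the answer above.
    """
    groups = defaultdict(list)
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            current_id = line
        elif current_id is None:
            raise ValueError('sequence line found before any header')
        else:
            groups[current_id].append(line)
    return dict(groups)

# Shortened, made-up sample data in the same header/sequence layout
sample = [">rpl-7\n", "ATGG\n", ">rpl-7\n", "CCAA\n", ">rpl-8\n", "GGTT\n"]
grouped = group_records(sample)
print(grouped)
```

Because a file object yields its lines when iterated, the open file can be passed straight in (for example `group_records(rpl_input_file)`) without calling `readlines()` first.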

Contributor: 1848 experience points, 6+ upvotes
Here is another solution:
input_ = """>rpl-7
ATGGCTCCAAC
>rpl-7
AAGAAAGTGCCACAGGTTCCAGAAAC
>rpl-8
AAGAACAAGGAGAAGAAGACCCAATACTTCAAGCGTGC
>rpl-8
GCTCTCCAGATCCTCCGTCTTCGTCAGATCAA
>rpl-8
AAGTTCAACATCATCTGTCTTGAGGA"""
results = {}
lines = input_.splitlines()
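# Even-indexed lines are headers, odd-indexed lines are their sequences
for i, j in zip(lines[::2], lines[1::2]):
    results.setdefault(i, []).append(j)
# Print each header once, followed by all of its sequences
for i, j in results.items():
    print(i)
    print("\n".join(j))
Output: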
>rpl-7
ATGGCTCCAAC
AAGAAAGTGCCACAGGTTCCAGAAAC
>rpl-8
AAGAACAAGGAGAAGAAGACCCAATACTTCAAGCGTGC
GCTCTCCAGATCCTCCGTCTTCGTCAGATCAA
AAGTTCAACATCATCTGTCTTGAGGA
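
The slicing with `lines[::2]` and `lines[1::2]` only works if the input strictly alternates between a header line and a single sequence line. If that is not guaranteed, a small check like the one below can make the assumption explicit. It is a sketch only: `pair_lines` is a hypothetical name and the sample string is shortened, made-up data.

```python
def pair_lines(text):
    """Pair each header with the line after it, assuming strict alternation.

    `pair_lines` is a hypothetical helper used only for illustration.
    """
    lines = text.splitlines()
    if len(lines) % 2:
        raise ValueError('expected an even number of lines (header, sequence, ...)')
    results = {}
    for header, seq in zip(lines[::2], lines[1::2]):
        if not header.startswith('>'):
            raise ValueError(f'expected a header line, got {header!r}')
        results.setdefault(header, []).append(seq)
    return results

paired = pair_lines(">rpl-7\nATGG\n>rpl-7\nCCAA\n>rpl-8\nGGTT")
print(paired)
```

If a record can span several sequence lines, the pairing no longer lines up and the check raises instead of silently mixing headers and sequences.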

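Contributor: 1934 experience points, 2+ upvotes
You can do this with a regular expression. Since you mentioned a file, I included newline characters in the test string; you can replace it with the contents of your file.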
import re
regex = r'rpl-\d\n.*(?:$|\n)'
dic = {}
test_str = (">rpl-7\n"
"ATGGCTCCAAC\n"
">rpl-7\n"
"AAGAAAGTGCCACAGGTTCCAGAAAC\n"
">rpl-8\n"
"AAGAACAAGGAGAAGAAGACCCAATACTTCAAGCGTGC\n"
">rpl-8\n"
"GCTCTCCAGATCCTCCGTCTTCGTCAGATCAA\n"
">rpl-8\n"
"AAGTTCAACATCATCTGTCTTGAGGA\n")
matches = re.finditer(regex, test_str, re.MULTILINE)
for match in matches:
    # Each match is "rpl-N\nSEQUENCE"; split it into the id and the sequence
    rpl, pro = match.group().split('\n')
    if rpl in dic:
        dic[rpl] = dic[rpl] + pro
    else:
        dic[rpl] = pro
print(dic)
Output:
{'rpl-7': 'ATGGCTCCAACAAGAAAGTGCCACAGGTTCCAGAAAC',
'rpl-8': 'AAGAACAAGGAGAAGAAGACCCAATACTTCAAGCGTGCGCTCTCCAGATCCTCCGTCTTCGTCAGATCAAAAGTTCAACATCATCTGTCTTGAGGA'}
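
Note that `rpl-\d` matches exactly one digit, so an id such as `rpl-10` would be skipped by the pattern above. A variant with capture groups and `\d+` might look like the sketch below. The pattern, the name `merge_sequences`, and the sample string are assumptions for illustration, and it still assumes one sequence line per header.

```python
import re
from collections import defaultdict

# Hypothetical pattern: a ">rpl-<digits>" header line followed by one sequence line
RECORD = re.compile(r'^>(rpl-\d+)\n(.*)$', re.MULTILINE)

def merge_sequences(text):
    """Concatenate the sequence lines that share the same rpl id."""
    parts = defaultdict(list)
    for rpl_id, seq in RECORD.findall(text):
        parts[rpl_id].append(seq.strip())
    # Join once at the end instead of growing a string inside the loop
    return {rpl_id: ''.join(seqs) for rpl_id, seqs in parts.items()}

# Shortened, made-up sample data including a two-digit id
sample = ">rpl-7\nATGG\n>rpl-7\nCCAA\n>rpl-10\nGGTT\n"
merged = merge_sequences(sample)
print(merged)
```

With capture groups, `findall` returns `(id, sequence)` tuples directly, so no `split('\n')` on the matched text is needed.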