
Accessing the x+1 element with "for x in list" in Python

Python

慕標琳琳 2021-12-26 10:33:31

I'm trying to parse a newline-delimited text file into blocks of lines, which are appended to a .txt file. I'd like to be able to grab x lines after my end string, because the contents of those lines vary, so setting an "end string" that tries to match them would miss lines.

Example file:

    "Start"
    "..."
    "..."
    "..."
    "..."
    "---" ##End here
    "xxx" ##Unique data here
    "xxx" ##And here

Here is the code:

    first = "Start"
    first_end = "---"

    with open('testlog.log') as infile, open('parsed.txt', 'a') as outfile:
        copy = False
        for line in infile:
            if line.strip().startswith(first):
                copy = True
                outfile.write(line)
            elif line.strip().startswith(first_end):
                copy = False
                outfile.write(line)
                ##Want to also write next 2 lines here
            elif copy:
                outfile.write(line)

Is there any way to do this using for line in infile, or do I need a different type of loop?


3 Answers

肥皂起泡泡

Contributed 1829 experience points, received more than 6 likes

You can use next or, in Python 3 and later, readline to retrieve the next line in the file:

    elif line.strip().startswith(first_end):
        copy = False
        outfile.write(line)
        # write the two lines that follow the end marker
        outfile.write(next(infile))
        outfile.write(next(infile))

Or:

    #note: not compatible with Python 2.7 and below
    elif line.strip().startswith(first_end):
        copy = False
        outfile.write(line)
        # write the two lines that follow the end marker
        outfile.write(infile.readline())
        outfile.write(infile.readline())

This also advances the file pointer by two extra lines, so the next iteration of for line in infile: will skip the two lines you read with next or readline.
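To make the skipping behaviour concrete, here is a minimal sketch that uses an in-memory io.StringIO as a stand-in for the log file (the sample contents are made up). It also passes a default to next, which is one way to avoid a StopIteration if the file ends right after the end marker; that safeguard is an addition, not part of the snippets above.

```python
import io

# Hypothetical in-memory stand-in for the log file.
infile = io.StringIO("a\n---\nx1\nx2\nb\n")

seen_by_loop = []
grabbed = []
for line in infile:
    seen_by_loop.append(line.strip())
    if line.startswith("---"):
        for _ in range(2):
            # The second argument is returned instead of raising
            # StopIteration if the file ends early.
            extra = next(infile, None)
            if extra is None:
                break
            grabbed.append(extra.strip())

print(seen_by_loop)  # ['a', '---', 'b']
print(grabbed)       # ['x1', 'x2']
```

The loop never sees x1 or x2, because next pulled them from the same iterator that the for statement is consuming.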

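Additional terminology nitpick: a file object is not a list, and methods for accessing the x+1th element of a list probably won't work for accessing the next line of a file, and vice versa. If you do want to access the next item of a proper list object, you can use enumerate so that you can perform arithmetic on the list's indices. For example: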

    seq = ["foo", "bar", "baz", "qux", "troz", "zort"]

    #find all instances of "baz" and also the first two elements after "baz"
    for idx, item in enumerate(seq):
        if item == "baz":
            print(item)
            print(seq[idx+1])
            print(seq[idx+2])

Note that, unlike readline, indexing does not advance the iterator, so for idx, item in enumerate(seq): will still iterate over "qux" and "troz".
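One caveat with idx+1 and idx+2 is that they raise IndexError when "baz" sits within the last two positions of the list. A possible workaround, sketched here with a slightly extended sample list of my own, is to take a slice, which simply comes back shorter when fewer items remain:

```python
seq = ["foo", "bar", "baz", "qux", "troz", "baz", "zort"]

found = []
for idx, item in enumerate(seq):
    if item == "baz":
        # A slice never raises IndexError; it just comes back shorter
        # when fewer than two items remain after the match.
        found.append([item] + seq[idx + 1:idx + 3])

print(found)  # [['baz', 'qux', 'troz'], ['baz', 'zort']]
```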

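An approach that works for any iterable is to use an additional variable to track state across iterations. The advantage is that you don't need to know how to advance the iterable manually; the disadvantage is that the logic inside the loop is harder to reason about, because it exposes additional side effects.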

    first = "Start"
    first_end = "---"

    with open('testlog.log') as infile, open('parsed.txt', 'a') as outfile:
        copy = False
        num_items_to_write = 0
        for line in infile:
            if num_items_to_write > 0:
                outfile.write(line)
                num_items_to_write -= 1
            elif line.strip().startswith(first):
                copy = True
                outfile.write(line)
            elif line.strip().startswith(first_end):
                copy = False
                outfile.write(line)
                num_items_to_write = 2
            elif copy:
                outfile.write(line)
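As a rough illustration of the "works for any iterable" point, the same counter logic can be wrapped in a generator and fed a plain list instead of a file. The name copy_blocks, its parameters, and the sample data are hypothetical and only serve the example.

```python
def copy_blocks(lines, start, end, extra=2):
    """Yield lines from `start` through `end`, plus `extra` lines after it."""
    copying = False
    remaining = 0  # lines still owed after the most recent end marker
    for line in lines:
        stripped = line.strip()
        if remaining > 0:
            remaining -= 1
            yield line
        elif stripped.startswith(start):
            copying = True
            yield line
        elif stripped.startswith(end):
            copying = False
            remaining = extra
            yield line
        elif copying:
            yield line


# Made-up sample data; any iterable of strings would do.
log = ["junk", "Start", "foo", "---", "troz", "zort",
       "alice", "Start", "Fred", "---", "Wilma"]
result = list(copy_blocks(log, "Start", "---"))
print(result)
```

Because nothing here calls next or readline, the function does not care whether lines is a file, a list, or another generator. If the input ends before the extra lines are available, it just yields what is there.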

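In the specific case of extracting repeated groups of data from a delimited file, it may be appropriate to skip iteration entirely and use a regular expression instead. For data like yours, that might look like this: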

    import re

    with open("testlog.log") as file:
        data = file.read()

    pattern = re.compile(r"""
        ^Start$         #"Start" by itself on a line
        (?:\n.*$)*?     #zero or more lines, matched non-greedily
                        #use (?:) for all groups so `findall` doesn't capture them later
        \n---$          #"---" by itself on a line
        (?:\n.*$){2}    #exactly two lines
        """, re.MULTILINE | re.VERBOSE)

    #equivalent one-line regex:
    #pattern = re.compile("^Start$(?:\n.*$)*?\n---$(?:\n.*$){2}", re.MULTILINE)

    for group in pattern.findall(data):
        print("Found group:")
        print(group)
        print("End of group.\n\n")

When run on a log that looks like this:

    Start
    foo
    bar
    baz
    qux
    ---
    troz
    zort
    alice
    bob
    carol
    dave
    Start
    Fred
    Barney
    ---
    Wilma
    Betty
    Pebbles

...this will produce the output:

    Found group:
    Start
    foo
    bar
    baz
    qux
    ---
    troz
    zort
    End of group.


    Found group:
    Start
    Fred
    Barney
    ---
    Wilma
    Betty
    End of group.
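If only the two "unique data" lines after each "---" are of interest, one possible variation is to wrap that part of the pattern in a capturing group and use finditer. This is a sketch on a shortened, made-up sample string; the list name trailing is just for illustration.

```python
import re

# Shortened, made-up sample in the same shape as the log above.
data = ("Start\nfoo\n---\ntroz\nzort\nalice\n"
        "Start\nFred\n---\nWilma\nBetty\nPebbles\n")

# Same pattern as before, but the final two lines are now captured.
pattern = re.compile(r"^Start$(?:\n.*$)*?\n---$((?:\n.*$){2})", re.MULTILINE)

trailing = [m.group(1).strip("\n").split("\n") for m in pattern.finditer(data)]
print(trailing)  # [['troz', 'zort'], ['Wilma', 'Betty']]
```

Keep in mind that file.read() loads the whole log into memory, so for very large files the line-by-line approaches may be the safer choice.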

2021-12-26

慕村225694

Contributed 1880 experience points, received more than 4 likes

The simplest approach is to write a generator function that parses infile:

    def read_file(file_handle, start_line, end_line, extra_lines=2):
        start = False
        while True:
            try:
                line = next(file_handle)
            except StopIteration:
                return

            if not start and line.strip().startswith(start_line):
                start = True
                yield line
            elif not start:
                continue
            elif line.strip().startswith(end_line):
                yield line
                # stop copying after the extra lines, like copy = False in the question
                start = False
                try:
                    for _ in range(extra_lines):
                        yield next(file_handle)
                except StopIteration:
                    return
            else:
                yield line

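The inner try-except clause isn't needed if you know every file is well-formed, meaning each end line is followed by at least extra_lines more lines. The outer one should stay: on Python 3.7 and later, a StopIteration that escapes a generator body is converted to a RuntimeError.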

You can use this generator like so:

    if __name__ == "__main__":
        first = "Start"
        first_end = "---"

        with open("testlog.log") as infile, open("parsed.txt", "a") as outfile:
            output = read_file(
                file_handle=infile,
                start_line=first,
                end_line=first_end,
                extra_lines=1,
            )
            outfile.writelines(output)
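If you would rather not handle StopIteration by hand, one possible variation (not part of the answer above) is to loop with for and pull the extra lines with itertools.islice, which stops quietly when the input runs out. The name read_blocks is hypothetical, and this sketch assumes copying should stop once the extra lines have been emitted, as in the question's code.

```python
import io
from itertools import islice


def read_blocks(lines, start_line, end_line, extra_lines=2):
    """Yield each start..end block plus up to `extra_lines` lines after it."""
    lines = iter(lines)  # make sure the loop and islice share one iterator
    started = False
    for line in lines:
        stripped = line.strip()
        if not started:
            if stripped.startswith(start_line):
                started = True
                yield line
        elif stripped.startswith(end_line):
            started = False
            yield line
            # islice consumes at most extra_lines items and never raises
            # if fewer are available.
            yield from islice(lines, extra_lines)
        else:
            yield line


# Made-up sample input standing in for testlog.log.
sample = io.StringIO("junk\nStart\nfoo\n---\ntroz\nzort\nalice\n")
out = list(read_blocks(sample, "Start", "---"))
print(out)  # ['Start\n', 'foo\n', '---\n', 'troz\n', 'zort\n']
```

The result can be passed to outfile.writelines in the same way as the generator above.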

2021-12-26

紅顏莎娜

Contributed 1842 experience points, received more than 13 likes

With a tri-state variable and less code duplication:

    first = "Start"
    first_end = "---"
    # Lines to read after end flag
    extra_count = 2

    with open('testlog.log') as infile, open('parsed.txt', 'a') as outfile:
        # Do not copy by default
        copy = 0

        for line in infile:
            # Strip once only
            clean_line = line.strip()

            # Enter "infinite copy" state
            if clean_line.startswith(first):
                copy = -1
            # Copy this line plus the extra amount
            elif clean_line.startswith(first_end):
                copy = extra_count + 1

            # If in a "must-copy" state
            if copy != 0:
                # One less line to copy if end flag passed
                if copy > 0:
                    copy -= 1
                # Copy current line
                outfile.write(line)
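To see the three states in action (0 for "don't copy", -1 for "copy until told otherwise", a positive number for "copy this many more lines"), here is a small trace over a made-up list of lines. It records, for each line, the value of copy after the line is processed and whether the line would have been written.

```python
first, first_end, extra_count = "Start", "---", 2
lines = ["junk", "Start", "foo", "---", "troz", "zort", "alice"]  # made-up sample

copy = 0
trace = []
for line in lines:
    if line.startswith(first):
        copy = -1
    elif line.startswith(first_end):
        copy = extra_count + 1
    written = copy != 0
    if copy > 0:
        copy -= 1
    trace.append((line, copy, written))

for row in trace:
    print(row)
# ('junk', 0, False)
# ('Start', -1, True)
# ('foo', -1, True)
# ('---', 2, True)
# ('troz', 1, True)
# ('zort', 0, True)
# ('alice', 0, False)
```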

2021-12-26
