Building a dictionary from a txt file using re

Python

慕哥6287543 2023-09-12 16:36:48

Consider a standard web log file in assets/logdata.txt. This file records the accesses a user makes when visiting a web page (like this one!). Each line of the log has the following items:

a host (e.g., '146.204.224.152')
a user_name (e.g., 'feest6811'; note: sometimes the user name is missing! In that case, use '-' as the value for the user name.)
the time a request was made (e.g., '21/Jun/2019:15:45:24 -0700')
the post request type (e.g., 'POST /incentivize HTTP/1.1'; note: not everything is a POST!)

Your task is to convert this into a list of dictionaries, where each dictionary looks like the following:

    example_dict = {"host":"146.204.224.152", "user_name":"feest6811", "time":"21/Jun/2019:15:45:24 -0700", "request":"POST /incentivize HTTP/1.1"}

This is a sample of the txt data file. I wrote these lines of code:

    import re
    def logs():
        with open("assets/logdata.txt", "r") as file:
            logdata = file.read()
        #print(logdata)
        pattern="""
        (?P<host>.*)
        (-\s)
        (?P<user_name>\w*)
        (\s)
        ([POST]*)
        (?P<time>\w*)
        """
        for item in re.finditer(pattern,logdata,re.VERBOSE):
            print(item.groupdict())
        return(item)
    logs()

It gets me "host" and "user_name", but I can't work out the rest of the requirements. Can anyone help?

4 Answers

呼喚遠方

Contributed 1856 experience points, received over 11 likes

Try this, my friend:

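    import re

    def logs():
        # One dict per matching log line; named "entries" so it does not reuse the function's name.
        entries = []
        # Raw string so \d, \s and \[ reach the regex engine unchanged.
        pattern = r'(?P<host>(?:\d+\.){3}\d+)\s+(?:\S+)\s+(?P<user_name>\S+)\s+\[(?P<time>[-+\w\s:/]+)\]\s+"(?P<request>.+?)"'
        with open("assets/logdata.txt", "r") as f:
            logdata = f.read()
        for m in re.finditer(pattern, logdata):
            entries.append(m.groupdict())
        return entries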

Answered 2023-09-12

千萬里不及你

Contributed 1784 experience points, received over 9 likes

Please see the code below:

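    import re

    regex = re.compile(
        r'(?P<host>(?:\d+\.){3}\d+)\s+-\s+'   # IPv4 host, then the "-" identity field
        r'(?P<user_name>[\w+\-]+)?\s+'         # optional user name
        r'\[(?P<time>[-+\w\s:/]+)\]\s+'        # timestamp; "+" allows positive UTC offsets
        r'"(?P<request>\w+.+?)"'               # request line without the quotes
    )

    def logs():
        data = []
        with open("assets/logdata.txt", "r") as f:
            logdata = f.read()
        for item in regex.finditer(logdata):
            x = item.groupdict()
            # Fall back to "-" if the optional user_name group did not match.
            if x["user_name"] is None:
                x["user_name"] = "-"
            data.append(x)
        return data

    logs()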

Part of the output is shown below:

    [{'host': '146.204.224.152', 'user_name': 'feest6811', 'time': '21/Jun/2019:15:45:24 -0700', 'request': 'POST /incentivize HTTP/1.1'},
     {'host': '197.109.77.178', 'user_name': 'kertzmann3129', 'time': '21/Jun/2019:15:45:25 -0700', 'request': 'DELETE /virtual/solutions/target/web+services HTTP/2.0'},
     {'host': '156.127.178.177', 'user_name': 'okuneva5222', 'time': '21/Jun/2019:15:45:27 -0700', 'request': 'DELETE /interactive/transparent/niches/revolutionize HTTP/1.1'},
     {'host': '100.32.205.59', 'user_name': 'ortiz8891', 'time': '21/Jun/2019:15:45:28 -0700', 'request': 'PATCH /architectures HTTP/1.0'},
     {'host': '168.95.156.240', 'user_name': 'stark2413', 'time': '21/Jun/2019:15:45:31 -0700', 'request': 'GET /engage HTTP/2.0'},
     .....]

The list contains 979 dictionaries, one for each line of the text file.
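A side note on the user_name fallback above: `Match.groupdict()` accepts a `default` argument that is used for any named group that did not take part in the match, so the explicit `is None` check can be folded into that call. The sketch below is only an illustration. The pattern `LINE_RE` and the two inline sample lines are assumptions made for the demo, not the poster's code and not the real assets/logdata.txt.

```python
import re

# Hypothetical pattern: the user_name group only participates when a real
# name is present; a bare "-" placeholder is consumed by the alternative.
LINE_RE = re.compile(
    r'(?P<host>(?:\d+\.){3}\d+)\s+\S+\s+'
    r'(?:(?P<user_name>\w+)|-)\s+'
    r'\[(?P<time>[^\]]+)\]\s+'
    r'"(?P<request>[^"]+)"'
)

# Assumed sample input: the second line has "-" in place of the user name.
sample = (
    '146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1"\n'
    '146.204.224.152 - - [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1"\n'
)

# default="-" replaces the None that an unmatched named group would yield.
rows = [m.groupdict(default="-") for m in LINE_RE.finditer(sample)]

for row in rows:
    print(row["user_name"])
```

With this input the first row keeps its real user name and the second row gets "-" from the default, without a separate `if` branch.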

Answered 2023-09-12

阿波羅的戰車

Contributed 1862 experience points, received over 6 likes

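    import re

    def logs():
        mydata = []
        with open("assets/logdata.txt", "r") as file:
            logdata = file.read()
        # Raw string; with re.VERBOSE, whitespace and # comments inside the pattern are ignored.
        pattern = r"""
        (?P<host>\S+)            # host
        \s+
        \S+                      # identity field, usually "-"
        \s+
        (?P<user_name>\S+)       # user name, or "-" when it is missing
        \s+
        \[(?P<time>[^\]]+)\]     # timestamp between literal brackets
        \s+
        "(?P<request>[^"]*)"     # request line without the surrounding quotes
        """
        for item in re.finditer(pattern, logdata, re.VERBOSE):
            mydata.append(item.groupdict())
        return mydata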

Answered 2023-09-12

繁星淼淼

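Contributed 1775 experience points, received over 11 likes

You are using \w to get user_names, but \w does not include -, which can appear in the log (Common Log Format, CLF), so you can use \S+ (one or more of anything except whitespace) as an alternative. For time you can create a capturing group that allows only the characters expected in that field (a character class: \w and \s, -+ for the time zone, / for the date and : for the time), enclosed in literal square brackets. The same idea can be used for request, using " as the delimiter.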

    import re

    regex = re.compile(
        r'(?P<host>(?:\d+\.){3}\d+)\s+'
        r'(?:\S+)\s+'
        r'(?P<user_name>\S+)\s+'
        r'\[(?P<time>[-+\w\s:/]+)\]\s+'
        r'"(?P<request>POST.+?)"'  # matches POST requests only; use .+? here to capture every method
    )

    def logs():
        data = []
        with open("sample.txt", "r") as f:
            logdata = f.read()
        for m in regex.finditer(logdata):
            data.append(m.groupdict())
        return data

    print(logs())

(For testing, the second entry reuses the first line with its user_name replaced by "-".)

    [
       {
          "host":"146.204.224.152",
          "user_name":"feest6811",
          "time":"21/Jun/2019:15:45:24 -0700",
          "request":"POST /incentivize HTTP/1.1"
       },
       {
          "host":"146.204.224.152",
          "user_name":"-",
          "time":"21/Jun/2019:15:45:24 -0700",
          "request":"POST /incentivize HTTP/1.1"
       },
       {
          "host":"144.23.247.108",
          "user_name":"auer7552",
          "time":"21/Jun/2019:15:45:35 -0700",
          "request":"POST /extensible/infrastructures/one-to-one/enterprise HTTP/1.1"
       },
        ...
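The character-class approach described in this answer can be extended to capture every request method rather than only POST, which is what the question asks for. What follows is a minimal sketch under stated assumptions, not the poster's code: `LOG_RE` and `parse_log_lines` are hypothetical names, the request group is loosened to `[^"]+`, and the sample lines are built from values quoted elsewhere in this thread instead of being read from the real file.

```python
import re

# Same structure as the pattern above, but the request group accepts any method.
LOG_RE = re.compile(
    r'(?P<host>(?:\d+\.){3}\d+)\s+'
    r'\S+\s+'
    r'(?P<user_name>\S+)\s+'
    r'\[(?P<time>[-+\w\s:/]+)\]\s+'
    r'"(?P<request>[^"]+)"'
)

def parse_log_lines(lines):
    """Return one dict per line that matches LOG_RE; skip lines that do not."""
    records = []
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            records.append(m.groupdict())
    return records

# Assumed sample input (the last entry is deliberately malformed).
sample = [
    '146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1"',
    '197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0"',
    'not a log line',
]

records = parse_log_lines(sample)
for record in records:
    print(record)
```

Matching line by line keeps a malformed line from affecting its neighbours. For the real file, passing the open file object (for example `parse_log_lines(open("assets/logdata.txt"))`) should work the same way, since iterating a text file yields its lines.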

Answered 2023-09-12
