
Dictionary from a text file in Python

縹緲止盈 2021-11-30 18:24:39
Question: I have a txt file in this format:

Intestinal infectious diseases (001-003)
001 Cholera
002 Fever
003 Salmonella
Zoonotic bacterial diseases (020-022)
020 Plague
021 Tularemia
022 Anthrax
External Cause Status (E000)
E000 External cause status
Activity (E001-E002)
E001 Activities involving x and y
E002 Other activities

Every line that starts with a 3-integer code, an E+3-integer code, or a V+3-integer code is a value belonging to the header that precedes it, and the headers are the keys of my dictionary. In the other questions I have seen, each line could be split on a column or a colon to produce a key/value pair, but the format of my txt file does not allow that.

Is there a way to turn a txt file like this into a dictionary whose keys are the group names and whose values are the code + disease name? I also need to parse the code and the disease name into a second dictionary, so that I end up with a dictionary that has the group names as keys and, as values, a second dictionary with the codes as keys and the disease names as values.

Script:

def process_file(filename):
    myDict={}
    f = open(filename, 'r')
    for line in f:
        if line[0] is not int:
            if line.startswith("E"):
                if line[1] is int:
                    line = dictionary1_values
                else:
                    break
            else:
                line = dictionary1_key
        myDict[dictionary1_key].append[line]

The desired output format is:

{"Intestinal infectious diseases (001-003)": {"001": "Cholera", "002": "Fever", "003": "Salmonella"},
 "Zoonotic bacterial diseases (020-022)": {"020": "Plague", "021": "Tularemia", "022": "Anthrax"},
 "External Cause Status (E000)": {"E000": "External cause status"},
 "Activity (E001-E002)": {"E001": "Activities involving x and y", "E002": "Other activities"}}

3 Answers

慕桂英4014372

Contributed 1871 experience points · earned more than 13 likes

def process_file(filename):
    myDict = {}
    rootkey = None
    with open(filename, 'r') as f:        # the with-block closes the file even if an error occurs
        for line in f:
            line = line.rstrip()          # str.rstrip() removes the newline and other trailing whitespace
            if not line:
                continue                  # skip blank lines
            # On an entry line ("001 ...", "E001 ...") the second and third characters are digits.
            if line[1:3].isdigit():
                subkey, data = line.split(" ", 1)   # the code (with or without a leading "E") and the description
                myDict[rootkey][subkey] = data
            else:
                rootkey = line            # any other line is a header and starts a new group
                myDict[rootkey] = {}      # prepare an empty inner dictionary for the new rootkey
    return myDict

Testing in the Python console:

>>> d = process_file('/tmp/file.txt')
>>>
>>> d['Intestinal infectious diseases (001-003)']
{'003': 'Salmonella', '002': 'Fever', '001': 'Cholera'}
>>> d['Intestinal infectious diseases (001-003)']['002']
'Fever'
>>> d['Activity (E001-E002)']
{'E001': 'Activities involving x and y', 'E002': 'Other activities'}
>>> d['Activity (E001-E002)']['E001']
'Activities involving x and y'
>>>
>>> d
{'Activity (E001-E002)': {'E001': 'Activities involving x and y', 'E002': 'Other activities'}, 'External Cause Status (E000)': {'E000': 'External cause status'}, 'Intestinal infectious diseases (001-003)': {'003': 'Salmonella', '002': 'Fever', '001': 'Cholera'}, 'Zoonotic bacterial diseases (020-022)': {'021': 'Tularemia', '020': 'Plague', '022': 'Anthrax'}}

The key order shown here reflects an older Python version; from Python 3.7 on, dictionaries keep insertion order, so the keys appear in file order.

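Warning: the first line of the file must be a "rootkey", not a "subkey" or data! Otherwise an error is raised (a KeyError, because no rootkey exists yet) :-)

Note: maybe you want to strip the leading "E" character. Or is that not an option? Do you need to keep the "E" somewhere?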
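The first-line requirement in the warning above can be turned into an explicit check with a readable message. This is a minimal sketch, assuming the same file layout as in the question; the function name `parse_lines` and the wording of the error are illustrative and not part of the original answer. It takes any iterable of lines, so an open file can be passed directly.

```python
def parse_lines(lines):
    """Group entry lines under the most recent header line.

    `lines` is any iterable of strings, for example an open text file.
    """
    result = {}
    rootkey = None
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line[1:3].isdigit():  # same entry test as in the answer above
            if rootkey is None:
                # An entry appeared before any header: fail with a clear message.
                raise ValueError(
                    f"line {lineno}: entry {line!r} has no preceding header"
                )
            subkey, data = line.split(" ", 1)
            result[rootkey][subkey] = data
        else:
            rootkey = line
            result[rootkey] = {}
    return result
```

Raising `ValueError` with the line number is a design choice for easier debugging; the grouping logic itself is unchanged.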


Answered 2021-11-30
陪伴而非守候

Contributed 1757 experience points · earned more than 8 likes

One solution is to use regular expressions to help you characterize and parse the two kinds of lines you may encounter in this file:

import re

header_re = re.compile(r'([\w\s]+) \(([\w\s\-]+)\)')
entry_re = re.compile(r'([EV]?\d{3}) (.+)')
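To see what the two patterns capture, they can be tried on a few lines taken from the question's sample file. This is only an illustrative sketch; the `samples` list and the `classify` helper are assumptions made for the demonstration.

```python
import re

header_re = re.compile(r'([\w\s]+) \(([\w\s\-]+)\)')
entry_re = re.compile(r'([EV]?\d{3}) (.+)')

# Lines copied from the sample file in the question.
samples = [
    "Intestinal infectious diseases (001-003)",
    "001 Cholera",
    "E001 Activities involving x and y",
]


def classify(text):
    """Return ('header', groups), ('entry', groups) or ('unknown', ())."""
    header = header_re.match(text)
    if header:
        return "header", header.groups()
    entry = entry_re.match(text)
    if entry:
        return "entry", entry.groups()
    return "unknown", ()


for text in samples:
    print(classify(text))
```

Note that `match` anchors only at the start of the string, so trailing text after the closing parenthesis of a header would be ignored rather than rejected.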

This lets you very easily check which kind of line you have encountered and split it up as needed:

# Check if a line is a header:
header = header_re.match(line)
if header:
    header_name, header_codes = header.groups()  # e.g. ('Intestinal infectious diseases', '001-003')
    # Do whatever you need to do when you encounter a new group
    # ...
else:
    entry = entry_re.match(line)
    # If the line wasn't a header, it ought to be an entry,
    # otherwise we've encountered something we didn't expect
    assert entry is not None
    entry_number, entry_name = entry.groups()  # e.g. ('001', 'Cholera')
    # Do whatever you need to do when you encounter an entry in a group
    # ...

Using this to rework your function, we can write the following:

import re


def process_file(filename):
    header_re = re.compile(r'([\w\s]+) \(([\w\s\-]+)\)')
    entry_re = re.compile(r'([EV]?\d{3}) (.+)')

    all_groups = {}
    current_group = None

    with open(filename, 'r') as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines instead of failing the assertion below
            # Check if a line is a header:
            header = header_re.match(line)
            if header:
                current_group = {}
                # group(0) is the whole matched header text, without the trailing newline
                all_groups[header.group(0)] = current_group
            else:
                entry = entry_re.match(line)
                # If the line wasn't a header, it ought to be an entry,
                # otherwise we've encountered something we didn't expect
                assert entry is not None
                entry_number, entry_name = entry.groups()  # e.g. ('001', 'Cholera')

                current_group[entry_number] = entry_name

    return all_groups
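The regex approach can be checked against the desired output from the question without touching the file system. The sketch below is a variant under stated assumptions, not the answer's own code: `parse_groups` is a hypothetical name, it accepts any iterable of lines, it tests for an entry before testing for a header, and it uses `fullmatch` so that a line must fit a pattern completely.

```python
import io
import re

HEADER_RE = re.compile(r'([\w\s]+) \(([\w\s\-]+)\)')
ENTRY_RE = re.compile(r'([EV]?\d{3}) (.+)')


def parse_groups(lines):
    """Build {header: {code: name}} from an iterable of text lines."""
    all_groups = {}
    current_group = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        entry = ENTRY_RE.fullmatch(line)
        if entry:
            if current_group is None:
                raise ValueError(f"entry before any header: {line!r}")
            code, name = entry.groups()
            current_group[code] = name
        elif HEADER_RE.fullmatch(line):
            current_group = all_groups.setdefault(line, {})
        else:
            raise ValueError(f"unexpected line: {line!r}")
    return all_groups


# The sample data from the question, held in memory instead of a file.
sample = io.StringIO(
    "Intestinal infectious diseases (001-003)\n"
    "001 Cholera\n"
    "002 Fever\n"
    "003 Salmonella\n"
    "Zoonotic bacterial diseases (020-022)\n"
    "020 Plague\n"
    "021 Tularemia\n"
    "022 Anthrax\n"
    "External Cause Status (E000)\n"
    "E000 External cause status\n"
    "Activity (E001-E002)\n"
    "E001 Activities involving x and y\n"
    "E002 Other activities\n"
)
groups = parse_groups(sample)
print(groups["Zoonotic bacterial diseases (020-022)"])
```

Checking the entry pattern first is a deliberate choice here: an entry whose name happens to contain parentheses would otherwise also fit the header pattern.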


Answered 2021-11-30
守著一只汪

Contributed 1872 experience points · earned more than 4 likes

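Try using a regular expression to determine whether a line is a header or a disease:

import re

def process_file(filename):
    mydict = {}
    with open(filename, "r") as f:
        header = None
        for line in f:
            line = line.strip()               # drop the trailing newline
            if not line:
                continue                      # skip blank lines
            # A disease line is a code (optional E or V, then three digits), a space, then the name.
            match_disease = re.match(r"([EV]?\d\d\d) (.*)", line)
            if not match_disease:
                header = line
                mydict[header] = {}           # create the inner dictionary before it is filled
            else:
                code = match_disease.group(1)
                disease = match_disease.group(2)
                mydict[header][code] = disease
    return mydict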
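Another way to have the inner dictionaries created on demand is `collections.defaultdict`. The following is a hedged sketch rather than the answer's method; the function name `group_codes` and the extension of the pattern to a `V` prefix (mentioned in the question) are assumptions.

```python
import re
from collections import defaultdict

# Optional E or V prefix, three digits, one space, then the disease name.
ENTRY = re.compile(r"([EV]?\d{3}) (.*)")


def group_codes(lines):
    """Map each header to a {code: disease} dictionary."""
    groups = defaultdict(dict)  # a missing header gets an empty dict on first access
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        match = ENTRY.match(line)
        if match:
            groups[header][match.group(1)] = match.group(2)
        else:
            header = line  # any non-entry line is treated as a header
    return dict(groups)  # convert back to a plain dict for the caller
```

Two consequences of this design are worth knowing: a header with no entries never appears in the result, and entries that come before any header are stored under the key `None` instead of raising an error.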


Answered 2021-11-30