

Is there a way to create an XML element tree?


慕的地6264312 2024-01-24 16:19:33
I'm currently writing some XSDs and DTDs to validate a set of XML files. I'm writing them by hand because I've had very bad experiences with XSD generators (Oxygen, for example). I already have a sample XML file to work from, but it is huge: one element, for instance, has 4312 child elements. Since I don't want to use a generator, I'd like to build an XML tree that contains only the unique tags and attributes, so that I don't have to wade through repeated elements while reading the XML I'm writing an XSD for. To illustrate, take this XML (provided by W3):

<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food some_attribute="1.0">
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>
   Two of our famous Belgian Waffles with plenty of real maple syrup
   </description>
    <calories>650</calories>
</food>
<food>
    <name>Strawberry Belgian Waffles</name>
    <price>$7.95</price>
    <description>
    Light Belgian waffles covered with strawberries and whipped cream
    </description>
    <calories>900</calories>
</food>
<food>
    <name>Berry-Berry Belgian Waffles</name>
    <price>$8.95</price>
    <description>
    Belgian waffles covered with assorted fresh berries and whipped cream
    </description>
    <calories>900</calories>
</food>
<food>
    <name>French Toast</name>
    <price>$4.50</price>
    <description>
    Thick slices made from our homemade sourdough bread
    </description>
    <calories>600</calories>
    <some_complex_type_element_1>
      <some_simple_type_element_1>Text.</some_simple_type_element_1>
    </some_complex_type_element_1>
</food>
<food>
    <name>Homestyle Breakfast</name>
    <price>$6.95</price>
    <description>
    Two eggs, bacon or sausage, toast, and our ever-popular hash browns
    </description>
    <calories>950</calories>
    <some_simple_type_element_2>Text.</some_simple_type_element_2>
</food>
</breakfast_menu>

As you can see, there are 4 distinct kinds of element under the root: element 1 (which has an attribute), elements 2 and 3 (which share the same structure), element 4 (which contains another complex-type element), and element 5 (which contains another simpleType element). What I want is some kind of tree representation of this XML that contains only the unique elements and no text.

1 answer

小唯快跑啊


See whether this meets your needs.


from simplified_scrapy import SimplifiedDoc, utils

xml = '''
<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
    <food some_attribute="1.0">
        <name>Belgian Waffles</name>
        <price>$5.95</price>
        <description>
    Two of our famous Belgian Waffles with plenty of real maple syrup
    </description>
        <calories>650</calories>
    </food>
    <food>
        <name>Strawberry Belgian Waffles</name>
        <price>$7.95</price>
        <description>
        Light Belgian waffles covered with strawberries and whipped cream
        </description>
        <calories>900</calories>
    </food>
    <food>
        <name>Berry-Berry Belgian Waffles</name>
        <price>$8.95</price>
        <description>
        Belgian waffles covered with assorted fresh berries and whipped cream
        </description>
        <calories>900</calories>
    </food>
    <food>
        <name>French Toast</name>
        <price>$4.50</price>
        <description>
        Thick slices made from our homemade sourdough bread
        </description>
        <calories>600</calories>
        <some_complex_type_element_1>
        <some_simple_type_element_1>Text.</some_simple_type_element_1>
        </some_complex_type_element_1>
    </food>
    <food>
        <name>Homestyle Breakfast</name>
        <price>$6.95</price>
        <description>
        Two eggs, bacon or sausage, toast, and our ever-popular hash browns
        </description>
        <calories>950</calories>
        <some_simple_type_element_2>Text.</some_simple_type_element_2>
    </food>
</breakfast_menu>
'''

def loop(node):
    para = {}
    for k in node:
        if k=='tag' or k=='html': continue
        para[k] = ''
    if para: node.setAttrs(para) # Remove attributes
    children = node.children
    if children:
        for c in children:
            loop(c)
    else:
        if node.text:
            node.setContent('') # Remove value

doc = SimplifiedDoc(xml)
# Remove values and attributes
loop(doc.breakfast_menu)

dicNode = {}
for node in doc.breakfast_menu.children:
    key = node.outerHtml
    if dicNode.get(key):
        node.remove() # Delete duplicate
    else:
        dicNode[key] = True

print(doc.html)

Result:


<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
    <food some_attribute="">
        <name></name>
        <price></price>
        <description></description>
        <calories></calories>
    </food>
    <food>
        <name></name>
        <price></price>
        <description></description>
        <calories></calories>
    </food>
    <food>
        <name></name>
        <price></price>
        <description></description>
        <calories></calories>
        <some_complex_type_element_1>
        <some_simple_type_element_1></some_simple_type_element_1>
        </some_complex_type_element_1>
    </food>
    <food>
        <name></name>
        <price></price>
        <description></description>
        <calories></calories>
        <some_simple_type_element_2></some_simple_type_element_2>
    </food>
</breakfast_menu>
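If installing a third-party package is not an option, roughly the same idea can be sketched with the standard library's xml.etree.ElementTree. This is only a minimal sketch, not the answerer's code: the helper names strip_values and unique_children and the shortened SAMPLE document are made up for illustration, and, like the code above, it only de-duplicates the direct children of the root.

```python
import xml.etree.ElementTree as ET

def strip_values(elem):
    """Blank out text, tails and attribute values in place, recursively."""
    elem.text = None
    elem.tail = None
    for name in list(elem.attrib):
        elem.set(name, "")          # keep the attribute name, drop its value
    for child in elem:
        strip_values(child)

def unique_children(root):
    """Remove every direct child whose stripped structure was already seen."""
    seen = set()
    for child in list(root):        # copy: we remove from root while looping
        signature = ET.tostring(child, encoding="unicode")
        if signature in seen:
            root.remove(child)
        else:
            seen.add(signature)
    return root

# Hypothetical, shortened sample in the spirit of the question's XML.
SAMPLE = """<breakfast_menu>
  <food some_attribute="1.0"><name>Belgian Waffles</name><price>$5.95</price></food>
  <food><name>Strawberry Belgian Waffles</name><price>$7.95</price></food>
  <food><name>French Toast</name><price>$4.50</price></food>
  <food><name>Homestyle Breakfast</name><extra>Text.</extra></food>
</breakfast_menu>"""

root = ET.fromstring(SAMPLE)
strip_values(root)
unique_children(root)
ET.indent(root)                     # pretty-printing helper, Python 3.9+
print(ET.tostring(root, encoding="unicode"))
```

Values must be stripped before comparing, otherwise two food elements with the same structure but different text would count as different.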

For large files, try the following approach.


from simplified_scrapy import SimplifiedDoc, utils
from simplified_scrapy.core.regex_helper import replaceReg

filePath = 'test.xml'
doc = SimplifiedDoc()
doc.loadFile(filePath, lineByline=True)

utils.appendFile('dest.xml','<?xml version="1.0" encoding="UTF-8"?><breakfast_menu>')
dicNode = {}
for node in doc.getIterable('food'):
    key = node.outerHtml
    key = replaceReg(key, '>[^>]*?<', '><')
    key = replaceReg(key, '"[^"]*?"', '""')

    if not dicNode.get(key):
        dicNode[key] = True
        utils.appendFile('dest.xml', key)

utils.appendFile('dest.xml', '</breakfast_menu>')
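A comparable streaming pass can also be sketched with the standard library's ET.iterparse, which parses the file incrementally so the whole document never has to sit in memory. The helpers shape and write_unique and the small SAMPLE are hypothetical names for illustration. The sketch assumes the repeated elements are direct children of the root and that the document uses no XML namespaces.

```python
import io
import xml.etree.ElementTree as ET

def shape(elem):
    """Serialise only the structure: tag, attribute names, child shapes."""
    attrs = "".join(f' {name}=""' for name in sorted(elem.attrib))
    inner = "".join(shape(child) for child in elem)
    return f"<{elem.tag}{attrs}>{inner}</{elem.tag}>"

def write_unique(source, dest, tag):
    """Stream `source`, write one copy of each distinct <tag> shape to `dest`."""
    seen = set()
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)             # first event: start of the root element
    dest.write(f"<{root.tag}>")
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            key = shape(elem)
            if key not in seen:
                seen.add(key)
                dest.write(key)
            root.clear()                # drop parsed children to free memory
    dest.write(f"</{root.tag}>")
    return len(seen)

# Hypothetical in-memory sample; a real run would pass a file path instead.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
  <food some_attribute="1.0"><name>Belgian Waffles</name></food>
  <food><name>French Toast</name></food>
  <food><name>Homestyle Breakfast</name></food>
  <food><name>Toast</name><extra>Text.</extra></food>
</breakfast_menu>"""

out = io.StringIO()
count = write_unique(io.BytesIO(SAMPLE), out, "food")
print(count)
print(out.getvalue())
```

For a file on disk, the call would look like write_unique('test.xml', open('dest.xml', 'w', encoding='utf-8'), 'food'). Building the key from the parsed element rather than from regular expressions over the raw markup avoids mismatches caused by whitespace or quoting differences.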


Answered 2024-01-24