Is there a way to create an XML element tree?

Python

慕的地6264312 2024-01-24 16:19:33

I'm currently writing some XSDs and DTDs to validate XML files. I'm writing them by hand because I've had very bad experiences with XSD generators (Oxygen, for example). However, I already have a sample XML that I need to do this for, and it is huge: one element alone has 4312 children. Given my experience with XSD generators, I'd like to create a kind of XML tree that contains only the unique tags and attributes, so that I don't have to wade through repeated elements while looking at the XML I'm writing an XSD for.

What I mean is, I have this XML (courtesy of W3):

<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food some_attribute="1.0">
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>
    Two of our famous Belgian Waffles with plenty of real maple syrup
    </description>
    <calories>650</calories>
</food>
<food>
    <name>Strawberry Belgian Waffles</name>
    <price>$7.95</price>
    <description>
    Light Belgian waffles covered with strawberries and whipped cream
    </description>
    <calories>900</calories>
</food>
<food>
    <name>Berry-Berry Belgian Waffles</name>
    <price>$8.95</price>
    <description>
    Belgian waffles covered with assorted fresh berries and whipped cream
    </description>
    <calories>900</calories>
</food>
<food>
    <name>French Toast</name>
    <price>$4.50</price>
    <description>
    Thick slices made from our homemade sourdough bread
    </description>
    <calories>600</calories>
    <some_complex_type_element_1>
        <some_simple_type_element_1>Text.</some_simple_type_element_1>
    </some_complex_type_element_1>
</food>
<food>
    <name>Homestyle Breakfast</name>
    <price>$6.95</price>
    <description>
    Two eggs, bacon or sausage, toast, and our ever-popular hash browns
    </description>
    <calories>950</calories>
    <some_simple_type_element_2>Text.</some_simple_type_element_2>
</food>
</breakfast_menu>

As you can see, there are 4 kinds of unique elements under the root:

- element 1 (has an attribute),
- elements 2 and 3,
- element 4 (has another complex type element),
- element 5 (has another simpleType element).

What I'd like to achieve is some kind of tree representation of this XML that contains only the unique elements and no text.

1 Answer

小唯快跑啊

Contributed 1863 experience points, received 2+ likes

See whether this meets your needs.

from simplified_scrapy import SimplifiedDoc

xml = '''
<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food some_attribute="1.0">
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>
    Two of our famous Belgian Waffles with plenty of real maple syrup
    </description>
    <calories>650</calories>
</food>
<food>
    <name>Strawberry Belgian Waffles</name>
    <price>$7.95</price>
    <description>
    Light Belgian waffles covered with strawberries and whipped cream
    </description>
    <calories>900</calories>
</food>
<food>
    <name>Berry-Berry Belgian Waffles</name>
    <price>$8.95</price>
    <description>
    Belgian waffles covered with assorted fresh berries and whipped cream
    </description>
    <calories>900</calories>
</food>
<food>
    <name>French Toast</name>
    <price>$4.50</price>
    <description>
    Thick slices made from our homemade sourdough bread
    </description>
    <calories>600</calories>
    <some_complex_type_element_1>
        <some_simple_type_element_1>Text.</some_simple_type_element_1>
    </some_complex_type_element_1>
</food>
<food>
    <name>Homestyle Breakfast</name>
    <price>$6.95</price>
    <description>
    Two eggs, bacon or sausage, toast, and our ever-popular hash browns
    </description>
    <calories>950</calories>
    <some_simple_type_element_2>Text.</some_simple_type_element_2>
</food>
</breakfast_menu>
'''

def loop(node):
    # Collect every attribute name on this node ('tag' and 'html' are
    # bookkeeping keys of the node itself, not XML attributes).
    para = {}
    for k in node:
        if k == 'tag' or k == 'html':
            continue
        para[k] = ''
    if para:
        node.setAttrs(para)  # Clear attribute values
    children = node.children
    if children:
        # Recurse into child elements
        for c in children:
            loop(c)
    else:
        if node.text:
            node.setContent('')  # Remove value

doc = SimplifiedDoc(xml)

# Remove values and attributes
loop(doc.breakfast_menu)

# Keep only the first child for each distinct stripped-down markup
dicNode = {}
for node in doc.breakfast_menu.children:
    key = node.outerHtml
    if dicNode.get(key):
        node.remove()  # Delete duplicate
    else:
        dicNode[key] = True

print(doc.html)

Result:

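<?xml version="1.0" encoding="UTF-8"?>
<breakfast_menu>
<food some_attribute="">
    <name></name>
    <price></price>
    <description></description>
    <calories></calories>
</food>
<food>
    <name></name>
    <price></price>
    <description></description>
    <calories></calories>
</food>
<food>
    <name></name>
    <price></price>
    <description></description>
    <calories></calories>
    <some_complex_type_element_1>
        <some_simple_type_element_1></some_simple_type_element_1>
    </some_complex_type_element_1>
</food>
<food>
    <name></name>
    <price></price>
    <description></description>
    <calories></calories>
    <some_simple_type_element_2></some_simple_type_element_2>
</food>
</breakfast_menu>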
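If you would rather not add a dependency, the same idea (strip text and attribute values, then keep one child per distinct structure) can probably be sketched with the standard library's xml.etree.ElementTree. This is not the simplified_scrapy approach above, only an illustration of the technique; the helper names `skeleton` and `unique_children` and the shortened `sample` document are made up for this example.

```python
import xml.etree.ElementTree as ET

def skeleton(elem):
    """Copy elem keeping tag and attribute names only (values blanked, text dropped)."""
    copy = ET.Element(elem.tag, {name: "" for name in sorted(elem.attrib)})
    for child in elem:
        copy.append(skeleton(child))
    return copy

def unique_children(root):
    """Return a skeleton of root that keeps one child per distinct structure."""
    out = ET.Element(root.tag, {name: "" for name in sorted(root.attrib)})
    seen = set()
    for child in root:
        skel = skeleton(child)
        key = ET.tostring(skel)  # the serialized skeleton is the dedupe key
        if key not in seen:
            seen.add(key)
            out.append(skel)
    return out

# Shortened, hypothetical sample in the spirit of the breakfast menu
sample = """<breakfast_menu>
<food some_attribute="1.0"><name>Belgian Waffles</name><price>$5.95</price></food>
<food><name>French Toast</name><price>$4.50</price></food>
<food><name>Homestyle Breakfast</name><price>$6.95</price></food>
<food><name>Toast</name><extra><inner>Text.</inner></extra></food>
</breakfast_menu>"""

result = unique_children(ET.fromstring(sample))
print(ET.tostring(result, encoding="unicode"))
```

Only the direct children of the root are deduplicated here, which mirrors the loop over doc.breakfast_menu.children in the answer; repeated elements deeper in the tree would need the same treatment applied recursively.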

For large files, try the following approach.

from simplified_scrapy import SimplifiedDoc, utils
from simplified_scrapy.core.regex_helper import replaceReg

filePath = 'test.xml'
doc = SimplifiedDoc()
doc.loadFile(filePath, lineByline=True)  # Read the file line by line

# Write the header of the output file
utils.appendFile('dest.xml', '<?xml version="1.0" encoding="UTF-8"?><breakfast_menu>')

dicNode = {}
for node in doc.getIterable('food'):
    key = node.outerHtml
    key = replaceReg(key, '>[^>]*?<', '><')  # Drop the text between tags
    key = replaceReg(key, '"[^"]*?"', '""')  # Blank out attribute values
    if not dicNode.get(key):
        # First time this structure is seen: record it and write it out
        dicNode[key] = True
        utils.appendFile('dest.xml', key)

utils.appendFile('dest.xml', '</breakfast_menu>')
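For comparison, a streaming pass can also be sketched with the standard library's xml.etree.ElementTree.iterparse, which hands over each element as soon as its end tag has been read. This swaps the regex normalization above for a structural signature (tag, attribute names, child signatures); `signature`, `unique_signatures` and the in-memory `data` are hypothetical names and sample input used only for illustration, and in practice a file path could be passed instead of the BytesIO object.

```python
import io
import xml.etree.ElementTree as ET

def signature(elem):
    """Nested tuple: (tag, sorted attribute names, child signatures)."""
    return (elem.tag,
            tuple(sorted(elem.attrib)),
            tuple(signature(child) for child in elem))

def unique_signatures(source, tag):
    """Stream source and yield each distinct signature of `tag` elements once."""
    seen = set()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            sig = signature(elem)
            if sig not in seen:
                seen.add(sig)
                yield sig
            # Drop the element's children, text and attributes to save memory.
            # The emptied element itself stays attached to its parent.
            elem.clear()

# Hypothetical in-memory sample standing in for a large file
data = (b"<breakfast_menu>"
        b"<food some_attribute='1.0'><name>A</name><price>1</price></food>"
        b"<food><name>B</name><price>2</price></food>"
        b"<food><name>C</name><price>3</price></food>"
        b"<food><name>D</name><extra>Text.</extra></food>"
        b"</breakfast_menu>")

sigs = list(unique_signatures(io.BytesIO(data), "food"))
for sig in sigs:
    print(sig)
```

Comparing parsed structure rather than raw markup means that differences in whitespace or attribute order between otherwise identical elements do not produce extra entries.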

Answered 2024-01-24
