1 Answer

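You can sort the rule/conf pairs from the two columns by confidence value and then by rule length. The lowest conf scores come first, and among rules with the same conf score the shortest list comes first.

We then iterate over the sorted pairs using a "two-finger" approach. The first finger is the current rule/conf pair. The second finger moves forward until it reaches a pair that either has a different conf score (e.g. 0.5 when the first finger is on 0.1) or whose rule is not a superset of the first finger's rule (e.g. it encounters ['Hamster'] while the first finger is on ['Dog']). When we find such a pair, we append the first finger's pair to the result and advance the first finger to the pair we just examined. We keep iterating this way: pairs that meet the removal criteria are skipped, and a pair that does not meet them triggers an append and an advance. Hope this makes sense.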
rules = [['Dog'], ['Dog', 'Cat'], ['Dog', 'Cat', 'Hamster', 'Goldfish'], ['Dog', 'Cat', 'Hamster']]
confs = [0.1, 0.5, 0.1, 0.5]

# sort by conf value, then by rule length, so the shortest sub-rule comes first
ruleConfPairs = sorted(zip(rules, confs), key=lambda x: (x[1], len(x[0])))

# initialize iteration (assumes at least one rule/conf pair)
new_rules = []
new_confs = []
current_rule = ruleConfPairs[0][0]
current_conf = ruleConfPairs[0][1]

for rule, conf in ruleConfPairs[1:]:
    if current_conf == conf and set(current_rule).issubset(rule):
        # skip (i.e. remove) the pair if it has the same confidence value
        # AND the current rule is a subset of it
        continue
    # otherwise the confidence differs OR the current rule is not a subset:
    # keep the current rule/conf pair
    new_rules.append(current_rule)
    new_confs.append(current_conf)
    # advance our pair
    current_rule = rule
    current_conf = conf

# make sure to append the last pair
new_rules.append(current_rule)
new_confs.append(current_conf)

print(new_rules)
print(new_confs)
Output:
[['Dog'], ['Dog', 'Cat']]
[0.1, 0.5]
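
One caveat about the loop above: it only compares each rule against the current first finger, so a superset can slip through if an unrelated rule with the same confidence sits between it and its subset in the sorted order. Below is a minimal sketch of a variant that compares against every kept rule of equal confidence. The function name `drop_redundant_rules` and the sample data are my own assumptions for illustration, not part of the original question.

```python
def drop_redundant_rules(rules, confs):
    """Hypothetical helper: drop a rule when an already-kept rule with the
    same confidence is a subset of it."""
    # Same ordering as above: lowest conf first, shortest rule first within a conf.
    pairs = sorted(zip(rules, confs), key=lambda pair: (pair[1], len(pair[0])))
    kept = []  # entries are (set(rule), rule, conf)
    for rule, conf in pairs:
        rule_set = set(rule)
        # Check against every kept rule of equal confidence, not only the latest one.
        redundant = any(
            kept_conf == conf and kept_set <= rule_set
            for kept_set, _, kept_conf in kept
        )
        if not redundant:
            kept.append((rule_set, rule, conf))
    return [rule for _, rule, _ in kept], [conf for _, _, conf in kept]


# Sample data (assumed): ['Hamster'] sorts between ['Dog'] and ['Dog', 'Cat'].
rules = [['Dog'], ['Hamster'], ['Dog', 'Cat'], ['Dog', 'Cat', 'Hamster']]
confs = [0.1, 0.1, 0.1, 0.5]
new_rules, new_confs = drop_redundant_rules(rules, confs)
print(new_rules)  # [['Dog'], ['Hamster'], ['Dog', 'Cat', 'Hamster']]
print(new_confs)  # [0.1, 0.1, 0.5]
```

On this sample the two-finger loop would keep ['Dog', 'Cat'] at 0.1, because its first finger has already moved on to ['Hamster'] by the time that rule is reached. The trade-off is cost: this variant is quadratic in the number of rules sharing a confidence value, whereas the two-finger loop is a single pass after sorting. It also returns two empty lists for empty input instead of raising an IndexError.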