Nested dictionary replaces the previous key + value instead of appending

Python

江戶川亂折騰 2022-01-05 13:24:38

I am working on a vector space model, and the data set consists of 50 text files. I iterate over them, split them into words, and store the words in a dictionary. Now I want to use a nested dictionary like:

dictionary = { {someword: {Doc1:23},{Doc21:2},{Doc34:3}},{someword: {Doc1:23},{Doc21:2},{Doc34:3}},{someword: {Doc1:23},{Doc21:2},{Doc34:3}} }

But when I run my program, it not only replaces the document entry, it also fails to compute the frequency by adding up the number of times "someword" appears in a particular document.

for iterator in range(1, 51):
    f = open(directory + str(iterator) + ext, "r")
    for line in f.read().lower().split():
        line = getwords(line)
        for word in line:
            if check(word, stopwords) == 0:
                if existence(word, terms, iterator) != 1:
                    terms[word] = {}
                    terms[word]["Doc"+str(iterator)] = 1
                else:
                    terms[word]["Doc"+str(iterator)] = int(terms[word]["Doc"+str(iterator)]) + 1
    f.close()

The existence function is:

def existence(tok, diction, iteration):
    if tok in diction:
        temp = "Doc"+str(iteration)
        if temp in diction:
            return 1
        else:
            return 0
    else:
        return 0

The result looks something like this:

{'blunder': {'Doc1': 1}, 'by': {'Doc50': 1}, 'anton': {'Doc27': 1}, 'chekhov': {'Doc27': 1}, 'an': {'Doc50': 1}, 'illustration': {'Doc48': 1}, 'story': {'Doc48': 1}, 'author': {'Doc48': 1}, 'portrait'...

1 Answer

收到一只叮咚

Do you want to know how many times each word appears in each file? That is easily done with a defaultdict of Counters, both provided by the collections module.
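To see how that structure behaves before touching any files, here is a minimal sketch. The `docs` dictionary and its contents are made up for illustration and simply stand in for the text files. The point is that neither the word nor the document key has to be initialised before incrementing.

```python
from collections import defaultdict, Counter

# Hypothetical in-memory "documents" standing in for the text files.
docs = {
    "Doc1": "the story of a blunder",
    "Doc2": "a story a portrait",
}

word_counter = defaultdict(Counter)
for doc_id, text in docs.items():
    for word in text.split():
        # An unseen word gets an empty Counter automatically,
        # and an unseen doc_id inside that Counter starts at 0.
        word_counter[word][doc_id] += 1

print(dict(word_counter["story"]))
print(dict(word_counter["a"]))
```

Because existing inner Counters are never reassigned, counts from earlier documents stay in place when a later document contains the same word.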

I think you have the right idea: loop over the files, read them line by line, and split each line into words. It is the counting part you need help with.

from collections import defaultdict, Counter
from string import punctuation

fnames = ['1.txt', '2.txt', '3.txt', '4.txt', '5.txt']

# Maps word -> Counter of {file name: occurrences in that file}
word_counter = defaultdict(Counter)

for fname in fnames:
    with open(fname, 'r') as txt:
        for line in txt:
            words = line.lower().strip().split()
            for word in words:
                # Drop leading/trailing punctuation; skip tokens that become empty
                word = word.strip(punctuation)
                if word:
                    word_counter[word][fname] += 1

The data inside word_counter looks like this:

{
    'within': {
        '1.txt': 2,
    },
    'we': {
        '1.txt': 3,
        '2.txt': 2,
        '3.txt': 2,
        '4.txt': 2,
        '5.txt': 4,
    },
    'do': {
        '1.txt': 7,
        '2.txt': 8,
        '3.txt': 8,
        '4.txt': 6,
        '5.txt': 5,
    },
    ...
}
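As for why the code in the question overwrites entries, it looks like there are two causes. First, existence tests `temp in diction`, which looks for the "DocN" key in the outer dictionary (whose keys are words) rather than in `diction[tok]`, so it effectively always returns 0. Second, `terms[word] = {}` then replaces whatever was already stored for that word. If you would rather stay with plain dictionaries, a minimal sketch of a fix could look like the following. The `count_terms` name and the sample `docs` are hypothetical, and the getwords/check helpers are left out because they are not shown in the question.

```python
def count_terms(docs):
    """Return {word: {doc_id: count}} using plain dictionaries."""
    terms = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            # Create the inner dict only the first time a word is seen,
            # so entries from earlier documents are not wiped out.
            per_doc = terms.setdefault(word, {})
            # Start from 0 when this document has no count for the word yet.
            per_doc[doc_id] = per_doc.get(doc_id, 0) + 1
    return terms


# Hypothetical sample data in place of the 50 text files.
docs = {"Doc1": "story by an author", "Doc2": "an author an illustration"}
terms = count_terms(docs)
print(terms["an"])
print(terms["author"])
```

The setdefault/get pair does by hand what defaultdict(Counter) does automatically, so either approach should give the same nested structure.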

2022-01-05
