A few years ago, a former developer on my team wrote the following Python code. It calls word2vec and passes in the locations of the training file and the output file. He worked on Linux. I have been asked to get it running on a Windows machine. Please keep in mind that I know almost no Python. I have installed Gensim, which I gather now implements word2vec, but I don't know how to rewrite the code to use the library instead of the executable, which doesn't seem to compile on Windows boxes. Can someone help me update this code?

    #!/usr/bin/env python3
    import os
    import csv
    import subprocess
    import shutil
    from gensim.models import word2vec

    def train_word2vec(trainFile, output):
        # run word2vec:
        subprocess.run(["word2vec", "-train", trainFile, "-output", output,
                        "-cbow", "0", "-window", "10", "-size", "100"],
                       shell=False)
        # Remove some invalid unicode:
        with open(output, 'rb') as input_,\
             open('%s.new' % output, 'w') as new_output:
            for line in input_:
                try:
                    print(line.decode('utf-8'), file=new_output, end='')
                except UnicodeDecodeError:
                    print(line)
                    pass
        shutil.move('%s.new' % output, output)

    def main():
        train_word2vec("c:/temp/wc/test1_BigF.txt", "c:/temp/wc/test1_w2v_model.txt")

    if __name__ == '__main__':
        main()