I have the following code that inserts documents into MongoDB. The problem is that it is very slow: I cannot run it across multiple processes, and because I have to check whether each document already exists before inserting it, I believe a bulk insert is not possible. I would like to know whether there is a faster way to do this.

After profiling the code below, I found that check_record() and update_upstream() are the two most time-consuming functions, so optimizing them should improve the overall speed. Any suggestions on how to optimize the following would be highly appreciated. Thank you!

    import multiprocessing  # needed for multiprocessing.cpu_count() below
    import os

    import pymongo
    from directory import Directory
    from pymongo import ASCENDING
    from pymongo import DESCENDING
    from pymongo import MongoClient
    from storage_config import StorageConfig
    from tqdm import tqdm

    dir = Directory()


    def DB_collections(collection_type):
        types = {'p': 'player_stats',
                 't': 'team_standings',
                 'f': 'fixture_stats',
                 'l': 'league_standings',
                 'pf': 'fixture_players_stats'}
        return types.get(collection_type)


    class DB():

        def __init__(self, league, season, func=None):
            self.db_user = os.environ.get('DB_user')
            self.db_pass = os.environ.get('DB_pass')
            self.MONGODB_URL = f'mongodb+srv://{self.db_user}:{self.db_pass}@cluster0-mbqxj.mongodb.net/<dbname>?retryWrites=true&w=majority'
            self.league = league
            self.season = str(season)
            self.client = MongoClient(self.MONGODB_URL)
            self.DATABASE = self.client[self.league + self.season]

            self.pool = multiprocessing.cpu_count()

            self.playerfile = f'{self.league}_{self.season}_playerstats.json'
            self.teamfile = f'{self.league}_{self.season}_team_standings.json'
            self.fixturefile = f'{self.league}_{self.season}_fixturestats.json'
            self.leaguefile = f'{self.league}_{self.season}_league_standings.json'
            self.player_fixture = f'{self.league}_{self.season}_player_fixture.json'
            self.func = func

        def execute(self):
            if self.func is not None:
                return self.func(self)


    def import_json(file):
        """Imports a json file in read mode
        Args:
            file(str): Name of file
        """
        return dir.load_json(file, StorageConfig.DB_DIR)
How can I improve insert write speed with pymongo?
慕的地8271018
2023-10-11 22:53:59
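On the concern in the question that a per-document existence check rules out bulk inserts: one possible approach is to check a whole batch with a single `$in` query and then pass only the missing documents to `insert_many`. The following is a minimal sketch under stated assumptions, not the asker's actual `check_record()` logic. It assumes each document carries one hashable field that identifies it (the `key` parameter, defaulting to `_id`), and the names `chunked` and `insert_missing` are hypothetical helpers introduced here for illustration. Only `find` and `insert_many` are called on the collection.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def insert_missing(collection, docs, key="_id", batch_size=1000):
    """Insert only the documents whose `key` value is not stored yet.

    `collection` is assumed to behave like a pymongo Collection.
    `key` is an assumed identifying field; adapt it to the real schema.
    Returns the number of documents handed to insert_many.
    """
    inserted = 0
    for batch in chunked(docs, batch_size):
        # Drop duplicates inside the batch, keeping the first occurrence.
        unique = {}
        for doc in batch:
            unique.setdefault(doc[key], doc)

        # One query per batch instead of one query per document.
        cursor = collection.find({key: {"$in": list(unique)}}, {key: 1})
        existing = {found[key] for found in cursor}

        new_docs = [doc for k, doc in unique.items() if k not in existing]
        if new_docs:
            # ordered=False: the server attempts every document rather
            # than stopping at the first error.
            collection.insert_many(new_docs, ordered=False)
            inserted += len(new_docs)
    return inserted
```

With a real connection this would be called along the lines of `insert_missing(db.DATABASE['player_stats'], documents, key='_id')`. The sketch is not safe against concurrent writers, since another process could insert the same key between the `find` and the `insert_many`. A unique index on the key field, or `bulk_write` with `UpdateOne(..., upsert=True)` operations, may be a better fit if existing documents also need updating.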