1 Answer

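After some research, I found a solution. I'm not entirely sure why it works, but I'm quite sure that it does.
The instantiation of LocalCluster and Client, and all the code that follows it (the code whose execution will be distributed), must not sit at the module level of the Python script. Instead, this code has to live inside a method or inside a __main__ block, as follows: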
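import pandas as pd
import dask.dataframe as dd
import numpy as np
from dask.distributed import Client, LocalCluster

# Everything that creates the cluster or submits work stays inside this guard,
# so it does not run at module level.
if __name__ == "__main__":
    print("Generating LocalCluster...")
    cluster = LocalCluster()
    print("Generating Client...")
    client = Client(cluster, processes=False)
    print("Scaling client...")
    client.scale(8)

    # BASE_DATA_SOURCE has to be defined elsewhere; the original snippet does not show it.
    data = dd.read_csv(
        BASE_DATA_SOURCE + '/Data-BIGDATFILES-*.csv',
        delimiter=';',
    )

    def get_min_dt():
        # Trigger the distributed computation and print the result.
        min_dt = data.datetime.min().compute()
        print("Min is {}".format(min_dt))

    print("Getting min dt...")
    get_min_dt()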
This simple change made the difference. I found the solution in this issue thread: https://github.com/dask/distributed/issues/2520#issuecomment-470817810
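A likely explanation, offered here as an assumption rather than something this answer confirms: when worker processes are started with the "spawn" method (the default on Windows and macOS), each child re-imports the main script, so any module-level code that starts processes would run again in every child. The sketch below reproduces the same pattern with only the standard library's multiprocessing module, without dask; the names square and run_in_workers are hypothetical and chosen purely for illustration.

```python
import multiprocessing as mp


def square(x):
    # Work executed inside a child process.
    return x * x


def run_in_workers(values, n_workers=2):
    # "spawn" starts a fresh interpreter for each worker; that interpreter
    # re-imports the main script before it can run square().
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=n_workers) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Only the original process enters this block. In the spawned children
    # __name__ is not "__main__", so they do not try to start pools of their own.
    print(run_in_workers([1, 2, 3]))
```

Calling run_in_workers at module level instead of inside the guard would make each spawned child attempt the same call while it is still starting up, which Python reports as a RuntimeError about the bootstrapping phase.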