I am unable to read a snappy-compressed Parquet file via pyarrow on Windows.

    import dask.dataframe as dd
    import pandas as pd
    import numpy as np

    df = pd.DataFrame(np.random.randint(0, 100, size=(15, 4)), columns=list('ABCD'))
    dd_df = dd.from_pandas(df, npartitions=1)
    dd_df.to_parquet("my_df.snappy.parquet", engine="pyarrow", compression="snappy")
    dd_df_copy = dd.read_parquet("my_df.snappy.parquet", engine="pyarrow")
    dd_df_copy.compute()  # <--- This is where it crashes

I have reproduced the problem in a clean Anaconda environment with Python 3.8. After creating the environment, I ran

    pip install "dask[complete]"
    pip install pyarrow

The error is:

    Problem signature:
      Problem Event Name: APPCRASH
      Application Name: python.exe
      Application Version: 3.8.3150.1013
      Application Timestamp: 5ed53446
      Fault Module Name: arrow.dll
      Fault Module Version: 0.0.0.0
      Fault Module Timestamp: 5ebd3029
      Exception Code: c000001d
      Exception Offset: 00000000007abfc7
      OS Version: 6.3.9600.2.0.0.16.7
      Locale ID: 1033
      Additional Information 1: d8e4
      Additional Information 2: d8e42c04b828d96accf490cd13472bea
      Additional Information 3: aebe
      Additional Information 4: aebe917bfb5c1b58e884baa1f9c3d3d2

I get a similar crash when I try using conda -c conda-forge dask pyarrow:

    Problem signature:
      Problem Event Name: APPCRASH
      Application Name: python.exe
      Application Version: 3.8.3150.1013
      Application Timestamp: 5ed53446
      Fault Module Name: arrow.dll
      Fault Module Version: 0.0.0.0
      Fault Module Timestamp: 5ecf56ac
      Exception Code: c000001d
      Exception Offset: 0000000000521587
      OS Version: 6.3.9600.2.0.0.16.7
      Locale ID: 1033
      Additional Information 1: e863
      Additional Information 2: e8638a01b9fb70505b0604ef9b98f3c6
      Additional Information 3: 1e47
      Additional Information 4: 1e47c852f479606e071f3ea8f80878a1
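For anyone reading the crash reports: the exception code c000001d is the Windows NTSTATUS value STATUS_ILLEGAL_INSTRUCTION, meaning the CPU hit an opcode it could not execute inside arrow.dll. The sketch below is only a reading aid. It pulls the fault module, exception code, and offset out of the first report. The helper name `parse_problem_signature` and the small `NTSTATUS_NAMES` table are hypothetical and written for this post, not part of pyarrow, Dask, or Windows tooling, and the table is deliberately not exhaustive.

```python
import re

# Small, non-exhaustive lookup of well-known NTSTATUS exception codes.
# (Hypothetical helper table for illustration only.)
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC0000094: "STATUS_INTEGER_DIVIDE_BY_ZERO",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def parse_problem_signature(text):
    """Extract the faulting module, exception code, and offset from an APPCRASH report."""
    module = re.search(r"Fault Module Name:\s*(\S+)", text)
    code = re.search(r"Exception Code:\s*([0-9a-fA-F]+)", text)
    offset = re.search(r"Exception Offset:\s*([0-9a-fA-F]+)", text)
    code_value = int(code.group(1), 16) if code else None
    return {
        "module": module.group(1) if module else None,
        "code": code_value,
        "code_name": NTSTATUS_NAMES.get(code_value, "unknown"),
        "offset": int(offset.group(1), 16) if offset else None,
    }


# Fields copied from the first crash report in the question.
report = (
    "Problem Event Name: APPCRASH Application Name: python.exe "
    "Fault Module Name: arrow.dll Fault Module Version: 0.0.0.0 "
    "Exception Code: c000001d Exception Offset: 00000000007abfc7"
)

info = parse_problem_signature(report)
print(info["module"], hex(info["code"]), info["code_name"])
# prints: arrow.dll 0xc000001d STATUS_ILLEGAL_INSTRUCTION
```

An illegal-instruction fault often, though not always, points to a binary compiled with CPU instructions that the machine does not support. Whether that is what happens here is an assumption; the reports alone do not confirm it.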
Reading a snappy Parquet file on Windows causes Python to crash
動(dòng)漫人物
2023-01-04 15:29:34