I tried:

```python
# Generate data
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(100, 5), columns=['a', 'b', 'c', 'd', 'e'])
df["y"] = (df['a'] > 0.5).astype(int)
df.head()

from mleap.sklearn.ensemble.forest import RandomForestClassifier

forestModel = RandomForestClassifier()
forestModel.mlinit(input_features='a', feature_names='a', prediction_column='e_binary')
forestModel.fit(df[['a']], df[['y']])
forestModel.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleaptestmodelforestpysparkzip", "randomforest.zip")
```

and got this error:

No such file or directory: 'jar:file:/dbfs/FileStore/tables/mleaptestmodelforestpysparkzip/randomforest.zip.node'

I also tried:

```python
forestModel.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleaptestmodelforestpysparkzip/randomforest.zip")
```

and got an error saying the "model_name" attribute is missing. Could you please help me? Here is everything I tried and the result each attempt produced.

Pipeline to zip:

1. pipeline.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip", model_name="forest")
   => FileNotFoundError: [Errno 2] No such file or directory: 'jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip/model.json'
2. pipeline.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip", model_name="forest", init=True)
   => FileNotFoundError: [Errno 2] No such file or directory: 'jar:file:/dbfs/FileStore/tables/mleap/pipeline_zip/1/model.zip/forest'

Model to zip:

1. forest.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleap/random_forest_zip/1/model.zip", model_name="forest")
   => FileNotFoundError: [Errno 2] No such file or directory: 'jar:file:/dbfs/FileStore/tables/mleap/random_forest_zip/1/model.zip/forest.node'
2. forest.serialize_to_bundle("jar:file:/dbfs/FileStore/tables/mleap/random_forest_zip/1", model_name="model.zip")
   => FileNotFoundError: [Errno 2] No such file or directory: 'jar:file:/dbfs/FileStore/tables/mleap/random_forest_zip/1/model.zip.node'
3. forest.serialize_to_bundle("/dbfs/FileStore/tables/mleap/random_forest_zip/1", model_name="model.zip")
   => Does not save a zip; it saves a bundle directory instead.
1 Answer

天涯盡頭無(wú)女友
I found the problem and a workaround.
Random writes are no longer possible with Databricks, as described here: https://docs.databricks.com/data/databricks-file-system.html?_ga=2.197884399.1151871582.1592826411-509486897.1589442523#local-file-apis
The workaround is to write the zip file to the local file system first and then copy it into DBFS. So:
1. Serialize your model in a pipeline with init=True, saving it to a local directory.
2. Copy it to your data lake with dbutils.fs.cp(source, destination), as in the sketch below.
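A minimal sketch of this workaround, assuming the `pipeline` object from the question; the local path, the bundle name MLeap writes (`forest` here), and the DBFS destination are all illustrative:

```python
import os

# Serialize to the driver's local disk first; paths under /dbfs do not
# support the random writes that bundle serialization needs.
local_dir = "/tmp/mleap_models"  # illustrative local (non-DBFS) directory
os.makedirs(local_dir, exist_ok=True)

# init=True marks this transformer as the root of the serialized pipeline.
pipeline.serialize_to_bundle(local_dir, model_name="forest", init=True)

# dbutils is predefined in Databricks notebooks. "file:" addresses the
# driver's local file system; recurse=True copies the whole bundle.
dbutils.fs.cp("file:/tmp/mleap_models/forest",
              "dbfs:/FileStore/tables/mleap/forest",
              recurse=True)
```

Once the bundle is in DBFS, it can be read back through the /dbfs mount as usual.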