I am trying to extend the Splitter class in sklearn, which is used with sklearn's decision tree classes. More specifically, I want to add a feature_weights variable to the new class, which will affect the determination of the best split point by scaling the purity calculation in proportion to the feature weights.

The new class is almost an exact copy of sklearn's BestSplitter class ( https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_splitter.pyx ), with only minor changes. This is what I have so far:

    cdef class WeightedBestSplitter(WeightedBaseDenseSplitter):

        cdef object feature_weights  # new variable - 1D array of feature weights

        def __reduce__(self):
            # same as sklearn BestSplitter (basically)

        # NEW METHOD
        def set_weights(self, object feature_weights):
            feature_weights = np.asfortranarray(feature_weights, dtype=DTYPE)
            self.feature_weights = feature_weights

        cdef int node_split(self, double impurity, SplitRecord* split,
                            SIZE_t* n_constant_features) nogil except -1:
            # .... same as sklearn BestSplitter ....
            current_proxy_improvement = self.criterion.proxy_impurity_improvement()
            current_proxy_improvement *= self.feature_weights[<int>(current.feature)]  # new line
            # .... same as sklearn BestSplitter ....

A few notes about the above: I am using the object variable type and np.asfortranarray because that is how the variable X is defined and set elsewhere, and X is indexed the same way I am trying to index feature_weights. Also, current.feature has the variable type SIZE_t, per the _splitter.pxd file ( https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/tree/_splitter.pxd ).

The problem seems to be caused by the variable type of self.feature_weights. The code above throws multiple errors, but even trying to reference something like self.feature_weights[0] and assign it to another variable throws this error:

    Indexing Python object not allowed without gil

I would like to know what I need to do to index self.feature_weights, get a scalar value, and use it as a multiplier.
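For context on the error message: an attribute declared as `cdef object` is an ordinary Python object, so indexing it goes through Python's item lookup, which needs the GIL. One commonly used alternative is a typed memoryview, which Cython can index at the C level. The block below is only a minimal sketch of that idea under stated assumptions, not sklearn's code: the class `WeightHolder` and its methods `scale` and `scaled` are hypothetical names, the weights are assumed to be float64, and whether the same declaration fits into a subclass of sklearn's splitter is not verified here.

```cython
# Hypothetical sketch, not part of sklearn: holding weights in a typed
# memoryview so they can be indexed inside a nogil function.
import numpy as np

cdef class WeightHolder:
    # Typed memoryview attribute: a C-level view of a contiguous buffer
    # of doubles (assumption: weights are stored as float64).
    cdef double[::1] feature_weights

    def set_weights(self, object feature_weights):
        # Runs with the GIL held: convert once to a contiguous float64 array
        # and let Cython acquire a memoryview of it.
        self.feature_weights = np.ascontiguousarray(feature_weights,
                                                    dtype=np.float64)

    cdef double scale(self, double improvement, Py_ssize_t feature) nogil:
        # Indexing the memoryview is a C array access, so no Python
        # object is indexed here.
        return improvement * self.feature_weights[feature]

    def scaled(self, double improvement, Py_ssize_t feature):
        # Python-visible wrapper that calls the nogil method without the GIL.
        cdef double result
        with nogil:
            result = self.scale(improvement, feature)
        return result
```

After compiling the module, a call such as `h = WeightHolder(); h.set_weights([0.5, 2.0]); h.scaled(3.0, 1)` would exercise the nogil path. In the splitter from the question, the analogous change would be to the declaration of `feature_weights` and the conversion in `set_weights`; that part is an assumption about how the sketch would carry over.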
Cython - indexing a numpy array in a nogil function
慕的地8271018
2022-12-20 11:06:19