Working loop, expected result

I am trying to vectorize a slow for loop in my code, which runs over a very large dataset, in order to remove duplicates based on a test. The result should keep only the rows whose first 3 elements are unique, with the 4th element being the largest among all of the duplicates. For example,

    inout = np.array(((0, 12, 13, 1), (0, 12, 13, 10), (1, 12, 13, 2)))

should become

    out = np.array(((0, 12, 13, 10), (1, 12, 13, 2)))

Achieving this with a for loop is simple but, as I mentioned, very slow.

    import numpy as np

    # One row per distinct combination of the first three columns
    unique = np.unique(inout[:, :3], axis=0)
    out = np.empty((0, 4))
    for i in unique:
        # Append the key plus the largest 4th-column value among its duplicates
        out = np.vstack((out, np.hstack((i[:], np.max(inout[np.all(inout[:, :3] == i[:], axis=1)][:, 3])))))

What I tried (1)

When I try to remove the for loop by indexing, replacing each `i[:]` with `unique[np.arange(unique.shape[0])]`:

    out = np.vstack((out, np.hstack((unique[np.arange(unique.shape[0])], np.max(inout[np.all(inout[:, :3].astype(int) == unique[np.arange(unique.shape[0])], axis=1)][:, 3])))))

NumPy complains about the shape of the input to `all`:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<__array_function__ internals>", line 6, in all
      File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 2351, in all
        return _wrapreduction(a, np.logical_and, 'all', axis, None, out, keepdims=keepdims)
      File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
        return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
    numpy.AxisError: axis 1 is out of bounds for array of dimension 0

What I tried (2)

Following a suggestion StackOverflow offered while I was typing this question (Broadcasting/Vectorizing inner and outer for loops in python/NumPy):

    newout = np.vstack((newout, np.hstack((tempunique[:, None], np.max(inout[np.all(inout[:, :3].astype(int) == tempunique[:, None], axis=1)][:, 3])))))

I get an error complaining about a size mismatch between the input and the output:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: boolean index did not match indexed array along dimension 0; dimension is 3 but corresponding boolean dimension is 2

Question restated

Is there a correct way to broadcast my indexing so that the for loop can be eliminated?
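
One possible loop-free approach, offered as a sketch rather than a definitive answer, swaps broadcasting for sorting. It sorts the rows so that duplicates of the same three-column key sit next to each other with the 4th column ascending, then keeps the last row of each group. The function name `dedupe_keep_max` is hypothetical, and the example array is the one from the question (named `inout` here because `in` is a reserved word in Python).

```python
import numpy as np


def dedupe_keep_max(arr):
    """Keep one row per unique (col0, col1, col2) key: the one with the largest col3."""
    arr = np.asarray(arr)
    # np.lexsort treats the LAST key as the primary one, so this sorts by
    # column 0, then 1, then 2, and breaks ties by column 3 (ascending).
    order = np.lexsort((arr[:, 3], arr[:, 2], arr[:, 1], arr[:, 0]))
    s = arr[order]
    # A row closes its group when the next row has a different key.
    # The final row always closes a group, so the mask starts as all True.
    is_last = np.ones(len(s), dtype=bool)
    is_last[:-1] = np.any(s[1:, :3] != s[:-1, :3], axis=1)
    # Within each group column 3 is ascending, so the last row holds the maximum.
    return s[is_last]


inout = np.array(((0, 12, 13, 1), (0, 12, 13, 10), (1, 12, 13, 2)))
out = dedupe_keep_max(inout)
print(out)
```

Unlike the loop, this keeps the dtype of the input instead of converting to float, and the rows come back sorted by key, which is the same order `np.unique(..., axis=0)` produces. A purely broadcast comparison of every unique key against every row is also possible, but it builds a boolean mask with one entry per (unique key, row) pair, which can become very large on a big dataset.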