1 Answer

Using your functions and the following code to profile the speed:
import time
import numpy as np

# generate_searches and search_and_book_availability are the functions from the question
shape = (10, 1440)
matrix = np.zeros(shape)
sim_start = 0
sim_end = 1440
searches = generate_searches(1000000, sim_start, sim_end)

def reset():
    # clear all bookings in place
    matrix[:] = 0

def test_matrix_speed():
    for i in searches:
        search_start = i[0]
        search_end = i[1]
        availability = search_and_book_availability(matrix, search_start, search_end)

def timeit(func):
    # warmup, so one-time costs such as JIT compilation are not measured
    reset()
    func()
    reset()
    start = time.perf_counter()
    func()
    end = time.perf_counter()
    return end - start

print(timeit(test_matrix_speed))
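I measure about 11.5 s for the jitted version, which is slower than the version without jit. I'm no expert in numba, but it is designed to optimize numerical code written in a non-vectorized way, in particular explicit for loops. Your code has none; you only use vectorized operations. So I expected jit not to beat the baseline solution, but I must admit I was surprised to see it do worse. If you want to optimize your solution, you can reduce the execution time (at least on my PC) with the following code: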
def search_and_book_availability_opt(matrix, search_start, search_end):
    search_slice = matrix[:, search_start:search_end]
    # we don't need to sum in order to check if all elements are 0.
    # ndarray.any() can use short-circuiting and is therefore faster.
    # Also, we don't need the selected values from np.where, only the
    # indexes, so np.nonzero is faster
    bookable, = np.nonzero(~search_slice.any(axis=1))
    # short circuit
    if bookable.size == 0:
        return False
    # we can perform random choice even if size is 1
    id_to_book = np.random.choice(bookable)
    matrix[id_to_book, search_start:search_end] = 1
    return True
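In addition, initialize matrix with np.zeros(shape, dtype=bool) instead of the default float64 (the np.bool alias was deprecated in NumPy 1.20 and removed in 1.24, while the builtin bool works everywhere). With both changes I get an execution time of about 3.8 s, roughly 50% better than your unjitted solution and roughly 70% better than the jitted version. Hope that helps.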
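For a quick sanity check of the optimized function on a boolean matrix, here is a minimal sketch. The toy shape (3, 8) and the specific slot ranges are my own assumptions for illustration, not values from the question; the function body is the one given above.

```python
import numpy as np


def search_and_book_availability_opt(matrix, search_start, search_end):
    search_slice = matrix[:, search_start:search_end]
    # rows with no booked slot in the requested range
    bookable, = np.nonzero(~search_slice.any(axis=1))
    if bookable.size == 0:
        return False
    id_to_book = np.random.choice(bookable)
    matrix[id_to_book, search_start:search_end] = 1
    return True


# Toy size chosen only for illustration (the answer uses (10, 1440)).
matrix = np.zeros((3, 8), dtype=bool)

# Three rows are free for slots 2..5, so three bookings succeed,
# each one taking a different row.
results = [search_and_book_availability_opt(matrix, 2, 6) for _ in range(3)]

# A fourth request overlapping slots 4..5 finds every row occupied.
fourth = search_and_book_availability_opt(matrix, 4, 7)

# Slots 6..7 are still free in every row, so this request succeeds.
fifth = search_and_book_availability_opt(matrix, 6, 8)

print(results, fourth, fifth)
print(matrix.astype(int))
```

Which row receives the last booking is random, but the True/False results do not depend on that choice. Assigning 1 into a boolean array stores True, so the function body needs no change when the dtype is switched.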