Numba @jit(nopython=True) gives no speedup on a NumPy-heavy function

Question (votes: -1, answers: 1)

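I am currently running test_matrix_speed() to see how fast my search_and_book_availability function is. Using the PyCharm profiler, I can see that each search_and_book_availability call takes 0.001 ms on average. Adding the Numba @jit(nopython=True) decorator has no effect on this function's performance. Is that because there is nothing left to improve and NumPy already runs as fast as it can here? (I am not concerned with the speed of generate_searches.)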

This is the code I am running:

import random

import numpy as np
from numba import jit


def generate_searches(number, sim_start, sim_end):
    searches = []
    for i in range(number):
        start_slot = random.randint(sim_start, sim_end - 1)
        end_slot = random.randint(start_slot + 1, sim_end)
        searches.append((start_slot, end_slot))
    return searches


@jit(nopython=True)
def search_and_book_availability(matrix, search_start, search_end):
    search_slice = matrix[:, search_start:search_end]
    output = np.where(np.sum(search_slice, axis=1) == 0)[0]
    number_of_bookable_vecs = output.size
    if number_of_bookable_vecs > 0:
        if number_of_bookable_vecs == 1:
            id_to_book = output[0]
        else:
            id_to_book = np.random.choice(output)
        matrix[id_to_book, search_start:search_end] = 1
        return True
    else:
        return False


def test_matrix_speed():
    shape = (10, 1440)
    matrix = np.zeros(shape)
    sim_start = 0
    sim_end = 1440
    searches = generate_searches(1000000, sim_start, sim_end)
    for i in searches:
        search_start = i[0]
        search_end = i[1]
        availability = search_and_book_availability(matrix, search_start, search_end)
python performance numpy numba
1 Answer

Votes: 1

I used your function and the following code to measure the speed:

import time

shape = (10, 1440)
matrix = np.zeros(shape)
sim_start = 0
sim_end = 1440
searches = generate_searches(1000000, sim_start, sim_end)

def reset():
    matrix[:] = 0

def test_matrix_speed():
    for i in searches:
        search_start = i[0]
        search_end = i[1]
        availability = search_and_book_availability(matrix, search_start, search_end)

def timeit(func):
    # warmup
    reset()
    func()

    reset()
    start = time.perf_counter()  # monotonic, high-resolution clock for benchmarking
    func()
    end = time.perf_counter()

    return end - start

print(timeit(test_matrix_speed))

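I measured 11.5 s for the jitted version and 7.5 s without jit. I am not a numba expert, but its purpose is to optimize numerical code written in a non-vectorized style, in particular explicit for loops. Your code has none: you only use vectorized operations. I therefore expected jit not to beat the baseline, although I must admit I was surprised to see it perform worse. If you are looking for an optimized solution, the following code shortens the execution time (at least on my PC):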

def search_and_book_availability_opt(matrix, search_start, search_end):
    search_slice = matrix[:, search_start:search_end]

    # We don't need to sum in order to check whether all elements are 0.
    # ndarray.any() can use short-circuiting and is therefore faster.
    # Also, single-argument np.where(cond) only forwards to np.nonzero(cond),
    # so we call np.nonzero directly.
    bookable, = np.nonzero(~search_slice.any(axis=1))

    # short circuit
    if bookable.size == 0:
        return False

    # we can perform random choice even if size is 1
    id_to_book = np.random.choice(bookable)
    matrix[id_to_book, search_start:search_end] = 1
    return True
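As a sanity check on the rewrite above, the sketch below compares the original row selection (sum == 0 with np.where) against the any/np.nonzero form on the same slice. The random occupancy pattern, the seed, and the slice bounds are made up for illustration; the two forms agree here because every entry is non-negative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

# Hypothetical occupancy matrix: 10 rows, 1440 slots, about 0.1% of slots booked.
matrix = (rng.random((10, 1440)) < 0.001).astype(np.float64)
search_slice = matrix[:, 100:400]

# Selection as written in the question.
via_sum = np.where(np.sum(search_slice, axis=1) == 0)[0]

# Selection as written in the optimized version.
via_any, = np.nonzero(~search_slice.any(axis=1))

# Both must pick exactly the rows that are free over the whole interval.
same_rows = np.array_equal(via_sum, via_any)
print(same_rows)
```

This only checks that the two expressions select the same rows; it says nothing about their relative speed, which depends on the machine and the NumPy version.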

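In addition, initialize matrix as np.zeros(shape, dtype=bool) rather than with the default float64. With both changes I got an execution time of about 3.8 s, roughly 50% better than the non-jitted solution and roughly 70% better than the jitted version. Hope this helps.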
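For contrast, here is a minimal sketch of the kind of code numba is aimed at: the same booking logic written with explicit loops over a boolean matrix. It is not part of the measurements above, and whether it beats the vectorized version has not been timed here. The name search_and_book_loop is hypothetical, and the try/except fallback is an assumption added so the sketch still runs as plain Python when numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # optional; compiles the loops when available
except ImportError:
    def njit(func):  # fallback (assumption): run the function uncompiled
        return func


@njit
def search_and_book_loop(matrix, search_start, search_end):
    """Book one free row for [search_start, search_end) using explicit loops.

    Assumes matrix is a 2-D boolean array where True marks a booked slot.
    """
    n_rows = matrix.shape[0]
    free_rows = np.empty(n_rows, dtype=np.int64)
    n_free = 0
    for row in range(n_rows):
        is_free = True
        for col in range(search_start, search_end):
            if matrix[row, col]:
                is_free = False
                break  # stop scanning this row at the first booked slot
        if is_free:
            free_rows[n_free] = row
            n_free += 1

    if n_free == 0:
        return False

    # Pick one of the free rows uniformly at random and mark the interval.
    chosen = free_rows[np.random.randint(0, n_free)]
    for col in range(search_start, search_end):
        matrix[chosen, col] = True
    return True
```

It can be dropped into the timing harness in place of search_and_book_availability, provided matrix is created with a boolean dtype. The warm-up call in timeit matters here, since the first call includes compilation when numba is present.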
