Python: which one is faster, np.vstack, np.append, np.concatenate, or a manual function written in Cython?

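I wrote a program that updates a NumPy array on every iteration and performs some operations on it. The number of iterations depends on time: in 1 second, for example, there may be 1000 to 2500 iterations. This means the NumPy array will hold no more than 2500 items for a 1-second run.

I have implemented a basic algorithm, and I am not sure whether it is the fastest way to compute bonus: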

import numpy as np

cdef int[:, :] pl_list
cdef list pl_length
cdef list bonus
pl_list = np.array([[8, 7]], dtype=np.int32)

def modify(pl_list, pl_length):
    cdef int k_const = 10
    mean = np.mean(pl_list, axis=0)
    mean = np.subtract(mean, pl_length)
    dev = np.std(pl_list, axis=0)
    # Scale each column by its standard deviation, guarding against division by zero
    mean[0] = mean[0] / dev[0] if dev[0] != 0 else 0
    mean[1] = mean[1] / dev[1] if dev[1] != 0 else 0

    bonus = -1 + (2 / (1 + np.exp(-k_const * mean)))
    return list(bonus)

for i in range(2499):  # I just simplified the loop. The main loop works like startTime - time.clock() < seconds
    rand = np.random.randint(8, 64)
    pl_length = [rand, rand-1]

    pl_list = np.append(pl_list, [pl_length], axis=0)
    bonus = modify(pl_list, pl_length)

I was thinking about using the following ideas to speed this program up:

Using np.vstack, np.stack, or np.concatenate instead of np.append(pl_list, [pl_length]). (Which one might be faster?)
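
One way to compare the candidates directly would be a small timeit harness along the following lines. It is only a sketch: the helper names (grow_with, time_grow, CANDIDATES) and the row count are illustrative assumptions, and the timings will depend on the machine and the NumPy version. As far as I can tell, np.append and np.vstack are thin wrappers around np.concatenate, and all three allocate a new array and copy the old rows on every call.

```python
import timeit

import numpy as np


def grow_with(stack_fn, n_rows=500):
    """Grow a (1, 2) int32 array one row at a time using stack_fn."""
    arr = np.array([[8, 7]], dtype=np.int32)
    row = np.array([[9, 8]], dtype=np.int32)
    for _ in range(n_rows):
        arr = stack_fn(arr, row)
    return arr


# Each candidate takes (existing array, new 1x2 row) and returns a new array.
CANDIDATES = {
    "np.append": lambda a, r: np.append(a, r, axis=0),
    "np.vstack": lambda a, r: np.vstack((a, r)),
    "np.concatenate": lambda a, r: np.concatenate((a, r), axis=0),
}


def time_grow(n_rows=500, repeats=3):
    """Return the best wall-clock time in seconds for each candidate."""
    results = {}
    for name, fn in CANDIDATES.items():
        timings = timeit.repeat(lambda: grow_with(fn, n_rows), number=1, repeat=repeats)
        results[name] = min(timings)
    return results


if __name__ == "__main__":
    for name, seconds in time_grow().items():
        print(f"{name:15s} {seconds:.6f} s")
```

Taking the minimum over a few repeats reduces noise from other processes; the absolute numbers matter less than the ranking between the three calls.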

Using self-made functions to compute np.std and np.mean (because iterating over memoryviews is so fast in Cython):

cdef int i, sm = 0
for i in range(pl_list.shape[0]):
    sm += pl_list[i, 0]  # pl_list is 2-D, so sum one column at a time
mean = sm / pl_list.shape[0]
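
The hand-written mean idea could be taken one step further: rather than re-scanning the whole array on every iteration, running totals could be updated as each row arrives. The sketch below uses Welford's online algorithm, which is a swapped-in technique and not what the code above does. It is written in plain Python for readability (the same loop could be typed with cdef in Cython), and the class name RunningStats is hypothetical. It computes the population standard deviation, which matches the default of np.std (ddof=0).

```python
import math


class RunningStats:
    """Per-column running mean and population std (Welford's algorithm)."""

    def __init__(self, n_cols=2):
        self.count = 0
        self.mean = [0.0] * n_cols
        self.m2 = [0.0] * n_cols  # sum of squared deviations from the mean

    def push(self, row):
        """Fold one new row into the running totals, one update per column."""
        self.count += 1
        for j, x in enumerate(row):
            delta = x - self.mean[j]
            self.mean[j] += delta / self.count
            self.m2[j] += delta * (x - self.mean[j])

    def std(self):
        """Population standard deviation per column (like np.std with ddof=0)."""
        if self.count == 0:
            return [0.0] * len(self.mean)
        return [math.sqrt(m2 / self.count) for m2 in self.m2]


stats = RunningStats()
for row in ([8, 7], [10, 9], [12, 11]):
    stats.push(row)
print(stats.mean)   # per-column mean of the three rows
print(stats.std())  # per-column population standard deviation
```

With this approach the cost per iteration no longer grows with the number of rows collected so far, at the price of keeping a little extra state.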

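I was also thinking about defining a static length (like 2500) for the memoryview, so I would not need np.append and could build a queue structure on that NumPy array. (What about the queue library? Is it faster than NumPy arrays for this kind of operation?)

Sorry if my questions are too many and too complicated. I am just trying to get the best possible performance in terms of speed.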
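
The fixed-length idea might look roughly like the following in plain NumPy. This is a minimal sketch, assuming a capacity of 2500 rows as mentioned above; the class name RowBuffer and its methods are illustrative and not part of any library. Each new row is written into an array allocated once up front, and only the filled prefix is handed out, so nothing is reallocated inside the loop.

```python
import numpy as np


class RowBuffer:
    """Fixed-capacity row store backed by a single preallocated array."""

    def __init__(self, capacity=2500, n_cols=2, dtype=np.int32):
        self._data = np.empty((capacity, n_cols), dtype=dtype)  # allocated once
        self._size = 0

    def append(self, row):
        """Copy row into the next free slot; raise if the buffer is full."""
        if self._size >= self._data.shape[0]:
            raise IndexError("buffer is full")
        self._data[self._size] = row
        self._size += 1

    def view(self):
        """Return the filled rows as a view (no copy is made)."""
        return self._data[:self._size]


buf = RowBuffer(capacity=2500)
buf.append((8, 7))  # same starting row as pl_list in the code above
for _ in range(2499):
    rand = np.random.randint(8, 64)
    buf.append((rand, rand - 1))

print(buf.view().shape)  # (2500, 2)
```

The view returned by view() can be passed to np.mean and np.std unchanged; because it shares memory with the buffer, it should be treated as read-only by the caller.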