Fast random weighted selection across all rows of a stochastic matrix

numpy.random.choice allows weighted selection from a vector, i.e.

arr = numpy.array([1, 2, 3])
weights = numpy.array([0.2, 0.5, 0.3])
choice = numpy.random.choice(arr, p=weights)

This selects 1 with probability 0.2, 2 with probability 0.5, and 3 with probability 0.3.

What if we want to do this quickly, in a vectorized fashion, for a 2-D array (matrix) whose rows are probability vectors? That is, we want a vector of choices from a stochastic matrix? Here is a very slow way to do it:

import numpy as np

m = 10
n = 100 # Or some very large number

items = np.arange(m)
prob_weights = np.random.rand(m, n)
prob_matrix = prob_weights / prob_weights.sum(axis=0, keepdims=True)

choices = np.zeros((n,))
# This is slow, because of the loop in Python
for i in range(n):
    choices[i] = np.random.choice(items, p=prob_matrix[:,i])

print(choices)

array([ 4.,  7.,  8.,  1.,  0.,  4.,  3.,  7.,  1.,  5.,  7.,  5.,  3.,
        1.,  9.,  1.,  1.,  5.,  9.,  8.,  2.,  3.,  2.,  6.,  4.,  3.,
        8.,  4.,  1.,  1.,  4.,  0.,  1.,  8.,  5.,  3.,  9.,  9.,  6.,
        5.,  4.,  8.,  4.,  2.,  4.,  0.,  3.,  1.,  2.,  5.,  9.,  3.,
        9.,  9.,  7.,  9.,  3.,  9.,  4.,  8.,  8.,  7.,  6.,  4.,  6.,
        7.,  9.,  5.,  0.,  6.,  1.,  3.,  3.,  2.,  4.,  7.,  0.,  6.,
        3.,  5.,  8.,  0.,  8.,  3.,  4.,  5.,  2.,  2.,  1.,  1.,  9.,
        9.,  4.,  3.,  3.,  2.,  8.,  0.,  6.,  1.])

This post suggests that cumsum and bisect could be a potential approach, and a fast one. But while numpy.cumsum(arr, axis=1) can compute the cumulative sum along one axis of a numpy array, the bisect.bisect function only works on a single array at a time. Likewise, numpy.searchsorted only works on 1-D arrays.
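For a single probability vector, that cumsum-and-binary-search idea looks roughly like this (a minimal sketch using arr and weights from the first snippet above, not code from the linked post):

c = numpy.cumsum(weights)                 # CDF: [0.2, 0.7, 1.0]
u = numpy.random.rand()                   # one uniform draw in [0, 1)
choice = arr[numpy.searchsorted(c, u)]    # index of the first CDF value >= u

The difficulty is doing this search for every column of a matrix at once, without a Python loop.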

Is there a fast way to do this using only vectorized operations?


Here's a fully vectorized version that's pretty fast:

def vectorized(prob_matrix, items):
    s = prob_matrix.cumsum(axis=0)
    r = np.random.rand(prob_matrix.shape[1])
    k = (s < r).sum(axis=0)
    return items[k]

In theory, searchsorted is the right function to use for looking up the random values in the cumulative summed probabilities, but with m being relatively small, k = (s < r).sum(axis=0) ends up being a lot faster. Its time complexity is O(m), while the searchsorted method is O(log(m)), but that only matters for much larger m. Also, cumsum is O(m), so both vectorized and @perimosocordiae's improved are O(m). (If your m is, in fact, much larger, you'll have to run some tests to see how large m can be before this method is slower.)
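As a quick illustration of that equivalence (not part of the original answer), the index computed by (s < r).sum(axis=0) is the same as a per-column searchsorted, assuming the setup code from the question:

s = prob_matrix.cumsum(axis=0)
r = np.random.rand(prob_matrix.shape[1])
k_sum = (s < r).sum(axis=0)                        # count the CDF entries strictly below r, per column
k_search = np.array([np.searchsorted(s[:, j], r[j])
                     for j in range(prob_matrix.shape[1])])
print(np.array_equal(k_sum, k_search))             # True: searchsorted (side='left') counts the same entries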

Here are my timings for m = 10 and n = 10000 (using the functions original and improved from @perimosocordiae's answer):

In [115]: %timeit original(prob_matrix, items)
1 loops, best of 3: 270 ms per loop

In [116]: %timeit improved(prob_matrix, items)
10 loops, best of 3: 24.9 ms per loop

In [117]: %timeit vectorized(prob_matrix, items)
1000 loops, best of 3: 1 ms per loop

The full script where the functions are defined is:

import numpy as np


def improved(prob_matrix, items):
    # transpose here for better data locality later
    cdf = np.cumsum(prob_matrix.T, axis=1)
    # random numbers are expensive, so we'll get all of them at once
    ridx = np.random.random(size=n)
    # the one loop we can't avoid, made as simple as possible
    idx = np.zeros(n, dtype=int)
    for i, r in enumerate(ridx):
        idx[i] = np.searchsorted(cdf[i], r)
    # fancy indexing all at once is faster than indexing in a loop
    return items[idx]


def original(prob_matrix, items):
    choices = np.zeros((n,))
    # This is slow, because of the loop in Python
    for i in range(n):
        choices[i] = np.random.choice(items, p=prob_matrix[:,i])
    return choices


def vectorized(prob_matrix, items):
    s = prob_matrix.cumsum(axis=0)
    r = np.random.rand(prob_matrix.shape[1])
    k = (s < r).sum(axis=0)
    return items[k]


m = 10
n = 10000 # Or some very large number

items = np.arange(m)
prob_weights = np.random.rand(m, n)
prob_matrix = prob_weights / prob_weights.sum(axis=0, keepdims=True)
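
As a sanity check (not part of the answer above), you can confirm that vectorized draws each item with the intended probability by repeating one column many times and comparing the empirical frequencies with that column:

reps = 200000
col = np.repeat(prob_matrix[:, :1], reps, axis=1)  # the same distribution in every column
draws = vectorized(col, items)
freq = np.bincount(draws, minlength=m) / reps
print(np.round(freq, 3))                 # empirical frequencies
print(np.round(prob_matrix[:, 0], 3))    # target probabilities; should match the line above closely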


I don't think it's possible to vectorize this completely, but you can still get a decent speedup by vectorizing as much as you can. Here's what I came up with:

def improved(prob_matrix, items):
    # transpose here for better data locality later
    cdf = np.cumsum(prob_matrix.T, axis=1)
    # random numbers are expensive, so we'll get all of them at once
    ridx = np.random.random(size=n)
    # the one loop we can't avoid, made as simple as possible
    idx = np.zeros(n, dtype=int)
    for i, r in enumerate(ridx):
        idx[i] = np.searchsorted(cdf[i], r)
    # fancy indexing all at once is faster than indexing in a loop
    return items[idx]
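
To make the transpose-then-cumsum step concrete (an illustration added here, not part of the answer), take a small 3x2 matrix whose columns sum to 1:

tiny = np.array([[0.2, 0.5],
                 [0.3, 0.1],
                 [0.5, 0.4]])
print(np.cumsum(tiny.T, axis=1))
# [[0.2 0.5 1. ]
#  [0.5 0.6 1. ]]

Row i of the result is the CDF of column i of the original matrix, so cdf[i] inside the loop is a contiguous, sorted 1-D array, which is exactly what np.searchsorted expects.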

Testing against the version in the question:

def original(prob_matrix, items):
    choices = np.zeros((n,))
    # This is slow, because of the loop in Python
    for i in range(n):
        choices[i] = np.random.choice(items, p=prob_matrix[:,i])
    return choices

And here's the speedup (using the setup code given in the question):

In [45]: %timeit original(prob_matrix, items)
100 loops, best of 3: 2.86 ms per loop

In [46]: %timeit improved(prob_matrix, items)
The slowest run took 4.15 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 157 μs per loop

I'm not sure why the timings for my version show such a large variance, but even the slowest run (~650 μs) is still almost 5x faster.
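
One way to look at that spread directly (a sketch added here, not part of the answer) is timeit.repeat, which returns each run's total time so the fastest and slowest runs can be compared:

import timeit

runs = timeit.repeat("improved(prob_matrix, items)",
                     setup="from __main__ import improved, prob_matrix, items",
                     repeat=7, number=100)
print(["%.0f us" % (t / 100 * 1e6) for t in runs])  # per-call time for each of the 7 runs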