In SciPy, why does custom.rvs() with uniform probabilities return values only from the beginning of the range?

If I generate an array

custom = np.ones(800, dtype=np.float32)

and then create a custom probability distribution from it using

custom = normalize(custom)[0]
customPDF = stats.rv_discrete(name='pdfX', values=(np.arange(800), custom))

and then call

customPDF.rvs()

the values returned are all in the range 0 to 20, whereas I expected random numbers spread over 0 to 800.

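The following code gives me the desired output,

random.uniform(0,800)

but since I need to be able to manipulate the probability distribution by changing the custom array, I have to use customPDF.rvs().

Is there a solution for this, or an explanation of why it happens?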

In [206]: custom=np.ones(800, dtype=np.float32)

In [207]: custom=normalize(custom)[0]
/usr/local/lib/python3.4/dist-packages/sklearn/utils/validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
DeprecationWarning)

In [208]: customPDF = stats.rv_discrete(name='pdfX', values=(np.arange(800), custom))

In [209]: customPDF.rvs()
Out[209]: 7

In [210]: customPDF.rvs()
Out[210]: 13

In [211]: customPDF.rvs()
Out[211]: 15

In [212]: customPDF.rvs()
Out[212]: 3

In [213]: customPDF.rvs()
Out[213]: 8

In [214]: customPDF.rvs()
Out[214]: 10

In [215]: customPDF.rvs()
Out[215]: 10

In [216]: customPDF.rvs()
Out[216]: 11

In [217]: customPDF.rvs()
Out[217]: 15

In [218]: customPDF.rvs()
Out[218]: 6

In [219]: customPDF.rvs()
Out[219]: 7

In [220]: random.uniform(0,800)
Out[220]: 707.0265562968543

Related discussion

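The problem is this line:

custom = normalize(custom)[0]

Judging by the warning, normalize appears to be sklearn.preprocessing.normalize. normalize expects a 2D array of shape [n_samples, n_features]. Because you gave it a 1D vector, it inserts a new dimension and treats the input as a [1, n_features] array (which is why you have to index the 0th element of the output).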

By default, it rescales each row of features so that its L2 (Euclidean) norm equals 1. This is not the same as making the elements sum to 1:

print(normalize(np.ones(800))[0].sum())
# 28.2843
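To make the difference concrete, here is a small NumPy-only sketch. It does not call sklearn; it assumes that, for a single row, the default L2 normalization amounts to dividing by the Euclidean norm, and the names `l2_scaled` and `l1_scaled` are chosen purely for illustration.

```python
import numpy as np

v = np.ones(800)

# Dividing by the Euclidean norm gives a unit-length vector ...
l2_scaled = v / np.linalg.norm(v)
# ... whereas dividing by the sum gives a valid probability vector.
l1_scaled = v / v.sum()

print(np.linalg.norm(l2_scaled))  # ~1.0
print(l2_scaled.sum())            # ~28.2843, i.e. sqrt(800)
print(l1_scaled.sum())            # ~1.0
```

If you would rather stay with sklearn, normalize also accepts norm='l1', which should rescale each row so that its absolute values sum to 1.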

Since the sum of custom is much greater than 1, the cumulative probability reaches 1 long before you get to the end of the probability vector:

print(custom.cumsum().searchsorted(1))
# 28

The result is that you will never draw an integer greater than 28:

print(customPDF.rvs(size=100000).max())
# 28
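The cap at 28 can be reproduced without SciPy. The sketch below uses plain inverse-CDF sampling (draw u uniformly from [0, 1) and look it up in the cumulative sum). This is an assumption about how a discrete sampler behaves in general, not a copy of SciPy's internals, and the names `sample_indices`, `pk_l2` and `pk_ok` are hypothetical.

```python
import numpy as np

def sample_indices(pk, size, rng):
    """Inverse-CDF sampling: map uniform draws onto the cumulative sum of pk."""
    cdf = np.cumsum(pk)
    u = rng.random(size)              # uniform draws in [0, 1)
    return np.searchsorted(cdf, u)    # first index whose cumulative value is >= u

rng = np.random.default_rng(0)

# What L2 normalization produces: every entry is 1/sqrt(800), so the sum is ~28.28.
pk_l2 = np.ones(800) / np.sqrt(800)
# A proper probability vector: every entry is 1/800, so the sum is 1.
pk_ok = np.ones(800) / 800

draws_l2 = sample_indices(pk_l2, 100_000, rng)
draws_ok = sample_indices(pk_ok, 100_000, rng)

print(draws_l2.max())  # at most 28, because the cumulative sum already exceeds 1 there
print(draws_ok.max())  # close to 799
```

Because u never reaches 1, every index past the point where the cumulative sum crosses 1 is unreachable.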

To normalize custom, all you need to do is divide it by its sum:

custom /= custom.sum()

# or alternatively:
custom = np.repeat(1./800, 800)
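Putting it together, a minimal end-to-end sketch might look like this. It assumes a reasonably recent SciPy and NumPy; float64 is used instead of float32 only to keep the sum as close to 1 as possible, and the fixed random_state is there just to make the run repeatable.

```python
import numpy as np
from scipy import stats

# Uniform weights over 800 outcomes, normalized so they sum to 1.
custom = np.ones(800, dtype=np.float64)
custom /= custom.sum()

customPDF = stats.rv_discrete(name='pdfX', values=(np.arange(800), custom))

# Draw many samples at once and check that they cover the whole range.
samples = customPDF.rvs(size=10000, random_state=0)
print(samples.min(), samples.max())
```

To skew the distribution, change the entries of custom before dividing by the sum. As far as I know, newer SciPy releases refuse to build an rv_discrete whose probabilities do not sum to 1 and raise a ValueError instead, so there the original mistake would surface immediately rather than silently truncating the range.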