1 Answer

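The reason is that plt.hist is designed mainly for continuous distributions. If you don't provide explicit bin boundaries, plt.hist simply creates 10 equally spaced bins between the minimum and maximum values, and most of those bins end up empty. When the data can take only two possible values, there should be just two bins, and therefore 3 boundaries: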
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
trials = 10**3
p = 0.5
sample_bernoulli = stats.bernoulli.rvs(p, size=trials) # Generate Bernoulli random variates
plt.plot((0,1), stats.bernoulli.pmf((0,1), p), 'bo', ms=8, label='bernoulli pmf')
# Density histogram of generated values
plt.hist(sample_bernoulli, density=True, alpha=0.5, color='steelblue', edgecolor='none', bins=np.linspace(-0.5, 1.5, 3))
plt.show()
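As a quick sanity check of the bin choice above, here is a minimal sketch that uses np.histogram instead of plotting. The fixed seed and the use of rng.binomial as a stand-in for stats.bernoulli.rvs are assumptions made for this illustration only. With the three boundaries -0.5, 0.5 and 1.5, each bin has width 1, so the density heights should equal the sample fractions of 0 and 1.

```python
import numpy as np

# Assumption: fixed seed and rng.binomial(1, p) as a stand-in for stats.bernoulli.rvs
rng = np.random.default_rng(0)
p = 0.5
sample = rng.binomial(1, p, size=1000)

# Two bins centered on 0 and 1, hence three boundaries
edges = np.linspace(-0.5, 1.5, 3)
heights, _ = np.histogram(sample, bins=edges, density=True)

# Fraction of zeros and ones in the sample
fractions = np.bincount(sample, minlength=2) / sample.size

print("bin edges:", edges)
print("density heights:", heights)
print("sample fractions:", fractions)
```

Because each bin is one unit wide, the bar heights can be compared directly with the pmf markers at 0 and 1.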
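Below is a visualization of the default bin boundaries and of how the samples fall into the bins. Note that with density=True the histogram is normalized so that the areas of all bars sum to 1. In this example, two bars are 0.1 wide and about 5.0 high, while the other 8 bars have zero height. The total area is therefore 2*0.1*5 + 8*0.0 = 1.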
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
trials = 10 ** 3
p = 0.5
sample_bernoulli = stats.bernoulli.rvs(p, size=trials) # Generate Bernoulli random variates
# Density histogram of generated values with default bins
values, binbounds, bars = plt.hist(sample_bernoulli, density=True, alpha=0.2, color='steelblue', edgecolor='none')
# show the bin boundaries
plt.vlines(binbounds, 0, max(values) * 1.05, color='crimson', ls=':')
# show the sample values with a random displacement
plt.scatter(sample_bernoulli * 0.9 + np.random.uniform(0, 0.1, trials),
            np.random.uniform(0, max(values), trials), color='lime')
# show the index of each bin
for i in range(len(binbounds) - 1):
    plt.text((binbounds[i] + binbounds[i + 1]) / 2, max(values) / 2, str(i), ha='center', va='center', fontsize=20, color='crimson')
plt.show()
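The area arithmetic for the default bins can also be checked numerically. The following is a minimal sketch under the same assumptions as before (fixed seed, rng.binomial standing in for stats.bernoulli.rvs), using np.histogram with its default of 10 bins. Only the first and last bins should be non-empty, each with a height of 10 times the corresponding sample fraction, so roughly 5 for p = 0.5.

```python
import numpy as np

# Assumption: fixed seed and rng.binomial(1, p) as a stand-in for stats.bernoulli.rvs
rng = np.random.default_rng(0)
p = 0.5
sample = rng.binomial(1, p, size=1000)

# Default binning: 10 equal-width bins between the sample minimum and maximum
heights, edges = np.histogram(sample, density=True)
widths = np.diff(edges)

# Fraction of zeros and ones in the sample
fractions = np.bincount(sample, minlength=2) / sample.size

print("number of bins:", len(heights))
print("bin width:", widths[0])
print("non-empty bins:", np.flatnonzero(heights))
print("heights of first and last bin:", heights[0], heights[-1])
print("total area:", np.sum(heights * widths))
```

The heights exceed 1 only because the bins are 0.1 wide; the quantity that is normalized is the area, not the height.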