I am using (or trying to use) scipy.stats.ttest_ind as follows:
import numpy as np
from scipy import stats
from scipy.stats import ttest_ind

def test_ttest_ind():
    rng = np.random.default_rng(43)
    # Two samples drawn from the same distribution.
    rvs1 = stats.norm.rvs(loc=15, scale=10, size=500, random_state=rng)
    rvs2 = stats.norm.rvs(loc=15, scale=10, size=500, random_state=rng)
    print("Same same")
    print(ttest_ind(rvs1, rvs2))
    print(ttest_ind(rvs1, rvs2, equal_var=False))
    print()
    # Same size and mean, but twice the standard deviation.
    print("Same size diff variance")
    rvs3 = stats.norm.rvs(loc=15, scale=20, size=500, random_state=rng)
    print(ttest_ind(rvs1, rvs3))
    print(ttest_ind(rvs1, rvs3, equal_var=False))
    print()
    # Smaller sample with twice the standard deviation.
    print("Diff size diff variance")
    rvs4 = stats.norm.rvs(loc=15, scale=20, size=100, random_state=rng)
    print(ttest_ind(rvs1, rvs4))
    print(ttest_ind(rvs1, rvs4, equal_var=False))
    print()
But the results confuse me...
Same same
TtestResult(statistic=np.float64(0.7788368), pvalue=np.float64(0.43626), df=np.float64(998.0))
TtestResult(statistic=np.float64(0.7788368), pvalue=np.float64(0.43626), df=np.float64(997.167))
# OK all the above looks good.
Same size diff variance
TtestResult(statistic=np.float64(0.5116953), pvalue=np.float64(0.60897), df=np.float64(998.0))
TtestResult(statistic=np.float64(0.5116953), pvalue=np.float64(0.60901), df=np.float64(730.112))
# The variance *is* different... So how come pooling it or not has no effect?
Diff size diff variance
TtestResult(statistic=np.float64(-5.125124), pvalue=np.float64(4.0e-07), df=np.float64(598.0))
TtestResult(statistic=np.float64(-3.577155), pvalue=np.float64(0.00051), df=np.float64(111.951))
# Now the sample size differs, pooling the variance seems to have a huge effect! Why?
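The two questions in the comments above can be checked by computing both statistics by hand. The sketch below assumes the textbook formulas for the pooled (Student) and Welch tests; `manual_t` is a hypothetical helper written for this illustration, and the seed and samples are arbitrary, not the ones from the run above. With equal sample sizes the two standard errors are algebraically identical, so only the degrees of freedom differ; with unequal sizes the pooled variance is dominated by the larger sample.

```python
import numpy as np
from scipy.stats import ttest_ind


def manual_t(a, b, equal_var=True):
    """Return (t statistic, degrees of freedom) from the textbook formulas."""
    n1, n2 = len(a), len(b)
    v1, v2 = np.var(a, ddof=1), np.var(b, ddof=1)
    if equal_var:
        # Student: pool the variances, weighting each by its degrees of freedom.
        df = n1 + n2 - 2
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
        se = np.sqrt(pooled * (1 / n1 + 1 / n2))
    else:
        # Welch: keep the variances separate; df from Welch-Satterthwaite.
        se = np.sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return (np.mean(a) - np.mean(b)) / se, df


rng = np.random.default_rng(0)  # arbitrary seed, not the one used in the post
a = rng.normal(15, 10, size=500)
b_same_n = rng.normal(15, 20, size=500)   # same size, larger variance
b_small_n = rng.normal(15, 20, size=100)  # smaller size, larger variance

for label, b in [("equal n", b_same_n), ("unequal n", b_small_n)]:
    t_pooled, df_pooled = manual_t(a, b, equal_var=True)
    t_welch, df_welch = manual_t(a, b, equal_var=False)
    # Cross-check the hand-computed Welch statistic against SciPy.
    t_scipy = ttest_ind(a, b, equal_var=False).statistic
    print(f"{label}: pooled t={t_pooled:.4f} (df={df_pooled:.1f}), "
          f"Welch t={t_welch:.4f} (df={df_welch:.1f}), SciPy Welch t={t_scipy:.4f}")
```

In the equal-size case the pooled standard error reduces to sqrt((v1 + v2) / n), which is exactly the Welch standard error, so the t statistic cannot change. In the unequal-size case the 500-point low-variance sample dominates the pooled variance, which shrinks the standard error and inflates |t|.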
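Sorry if this is a really silly question. I am stuck on "more advanced stuff", so I would like to understand the basics before I work out what confuses me about the more complicated things.
Now that I look at my code, I think the "huge effect" I was worried about is just a coincidence of generating a brand-new sample for each test.
If I instead just "extend" the same sample to change its size, I see results that look more "expected"...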
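import numpy as np
from scipy.stats import norm, ttest_ind

def print_ttest_ind(result):
    # Helper not included in the original post; reconstructed to match the output format shown below.
    print(f"t-stat: {result.statistic:.3f} p-value: {result.pvalue:.5f} df: {result.df:.1f}")

def test_ttest_ind2():
    rng = np.random.default_rng()
    rvs1 = norm.rvs(loc=15, scale=10, size=100, random_state=rng)
    rvs2 = norm.rvs(loc=35, scale=80, size=200, random_state=rng)
    # Label kept as in the original run; the variances actually differ (scale 10 vs 80).
    print("Same size same variance")
    print_ttest_ind(ttest_ind(rvs1, rvs2[:100]))
    print_ttest_ind(ttest_ind(rvs1, rvs2[:100], equal_var=False))
    print()
    # Grow the second sample by slicing more of the same draw.
    print("Diff size ( 1) diff variance")
    print_ttest_ind(ttest_ind(rvs1, rvs2[:101]))
    print_ttest_ind(ttest_ind(rvs1, rvs2[:101], equal_var=False))
    print()
    print("Diff size ( 50) diff variance")
    print_ttest_ind(ttest_ind(rvs1, rvs2[:150]))
    print_ttest_ind(ttest_ind(rvs1, rvs2[:150], equal_var=False))
    print()
    print("Diff size (100) diff variance")
    print_ttest_ind(ttest_ind(rvs1, rvs2[:200]))
    print_ttest_ind(ttest_ind(rvs1, rvs2[:200], equal_var=False))
    print()

for _ in range(10):
    test_ttest_ind2()
    print()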
It spits out something like this:
Same size same variance
t-stat: -2.052 p-value: 0.04144 df: 198.0
t-stat: -2.052 p-value: 0.04269 df: 101.7
Diff size ( 1) diff variance
t-stat: -2.079 p-value: 0.03887 df: 199.0
t-stat: -2.089 p-value: 0.03914 df: 102.8
Diff size ( 50) diff variance
t-stat: -1.792 p-value: 0.07437 df: 248.0
t-stat: -2.184 p-value: 0.03050 df: 155.2
Diff size (100) diff variance
t-stat: -2.211 p-value: 0.02782 df: 298.0
t-stat: -3.095 p-value: 0.00224 df: 210.7
Same size same variance
t-stat: -2.299 p-value: 0.02255 df: 198.0
t-stat: -2.299 p-value: 0.02354 df: 102.1
Diff size ( 1) diff variance
t-stat: -2.406 p-value: 0.01703 df: 199.0
t-stat: -2.418 p-value: 0.01736 df: 103.1
Diff size ( 50) diff variance
t-stat: -3.437 p-value: 0.00069 df: 248.0
t-stat: -4.187 p-value: 0.00005 df: 155.5
Diff size (100) diff variance
t-stat: -3.123 p-value: 0.00197 df: 298.0
t-stat: -4.374 p-value: 0.00002 df: 210.0
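So basically, not pooling the variance (which is the correct choice given these inputs) tends to give a somewhat lower p-value once the second, higher-variance sample is noticeably larger than the first. The p-value (and the gap between the two p-values) is a function of the second sample's size, which is entirely expected: more data provides more evidence against the null hypothesis.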
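One way to check that not pooling is the right choice when the variances differ is to simulate many experiments in which the null hypothesis is true and count how often each variant rejects it. The sketch below is only an illustration under stated assumptions: it uses the 500 vs 100 sizes and scales 10 vs 20 from the first experiment, and the number of repetitions and the seed are arbitrary choices.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(12345)  # arbitrary seed
n_reps = 2000  # number of simulated experiments (illustrative choice)

# Both groups share the same true mean, so every rejection is a false positive.
big_low_var = rng.normal(15, 10, size=(n_reps, 500))
small_high_var = rng.normal(15, 20, size=(n_reps, 100))

# axis=1 runs one test per row, i.e. one test per simulated experiment.
p_pooled = ttest_ind(big_low_var, small_high_var, axis=1).pvalue
p_welch = ttest_ind(big_low_var, small_high_var, axis=1, equal_var=False).pvalue

rate_pooled = np.mean(p_pooled < 0.05)
rate_welch = np.mean(p_welch < 0.05)
print(f"false-positive rate at alpha=0.05: pooled={rate_pooled:.3f}, Welch={rate_welch:.3f}")
```

With this setup the pooled test should reject far more often than the nominal 5%, while the Welch test should stay close to it. This suggests that a very small pooled p-value for unequal sizes and variances can be an artifact of the equal-variance assumption.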
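I think I was confused about what the test does. It looks for a difference in means, not a difference in variances, although the variances do matter for how the test (for a difference in means) is formulated.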
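To make that distinction concrete, here is a small sketch that draws two samples with the same mean but different spread, then runs a test on the means and a test on the variances. Using `scipy.stats.levene` as the variance test is my own choice for illustration and is not part of the original experiment; the seed and sample sizes are arbitrary.

```python
import numpy as np
from scipy.stats import levene, ttest_ind

rng = np.random.default_rng(7)  # arbitrary seed
x = rng.normal(15, 10, size=500)
y = rng.normal(15, 20, size=500)  # same mean, twice the standard deviation

welch = ttest_ind(x, y, equal_var=False)  # asks: do the means differ?
spread = levene(x, y)                     # asks: do the variances differ?

print(f"Welch t-test p-value: {welch.pvalue:.4f}")
print(f"Levene test p-value:  {spread.pvalue:.3g}")
```

The Levene p-value should be tiny because the spreads really do differ. The t-test p-value depends on the particular draw, because the true means are equal and it has nothing systematic to detect.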
So I guess I am not confused anymore ;-)