Python: sampling from different distributions

Question (votes: 0, answers: 1)

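I may have asked this before, but the question was probably unclear. I am trying to build the sampling distribution of the sample median by repeating a random sampling step 300 times. Each sample has size 50 and is drawn from 3 different normal distributions: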

subpop1: mean = 100, std dev = 40 (14 of the 50 sample from subpop1)
subpop2: mean = 200, std dev = 70 (20 of the 50 sample from subpop2)
subpop3: mean = 300, std dev = 80 (16 of the 50 sample from subpop3)

How can I solve this? Here is what I have done so far:

import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

repeat = 300
samplesize_list = [14, 20, 16]
std_list = [40, 70, 80]
mean_list = [100, 200, 300]
repeat_median = np.empty(repeat, dtype = float)
for j in range(len(samplesize_list)):
    size = samplesize_list[j]
    for m in range(len(mean_list)):
        mean = mean_list[m]
        for z in range(len(std_list)):
            std = std_list[m]
            for i in range(repeat): 
                sample_data = np.random.normal(mean, std, size)
                repeat_median[i] = np.median(sample_data)
sns.distplot(repeat_median, color = 'blue')
plt.show()

I don't know where I went wrong. I'm taking an introductory Python course and need help with my code!

python statistics sampling
1 Answer (votes: 0)

I'm not familiar with plotting, but as far as the math goes:

import numpy as np

groups = [
    {'label': 'sub_one', 'mean': 100, 'std_dev': 40, 'size': 14},
    {'label': 'sub_two', 'mean': 200, 'std_dev': 70, 'size': 20},
    {'label': 'sub_three', 'mean': 300, 'std_dev': 80, 'size': 16},
]

def sample_median(groups):
    # Draw each subpopulation's share of the 50 observations, pool them,
    # and return the median of the pooled sample.
    parts = [np.random.normal(g['mean'], g['std_dev'], g['size']) for g in groups]
    return np.median(np.concatenate(parts))


# Repeat the sampling 300 times to build the distribution of sample medians.
group_all = [sample_median(groups) for _ in range(300)]

print(len(group_all))
(xenial)vash@localhost:~/python/stack_overflow$ python3.7 median.py 
300
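
For the plotting part, a vectorized variant can be sketched as follows. In the question's nested loops, `repeat_median` is overwritten on every pass, so only the last size/mean/std combination reaches the plot, and no single sample ever mixes the three subpopulations. The sketch below instead draws all 300 pooled samples of 50 at once and takes one median per row. `GROUPS`, `sample_medians`, `plot_medians`, and the `seed` argument are illustrative names that are not in the original post. It uses `plt.hist` on the assumption that matplotlib is installed, since `sns.distplot` is deprecated in recent seaborn releases.

```python
import numpy as np

# (mean, std_dev, size) per subpopulation, taken from the question; sizes sum to 50.
GROUPS = [(100, 40, 14), (200, 70, 20), (300, 80, 16)]


def sample_medians(groups, repeats=300, seed=None):
    """Return one pooled-sample median per repetition (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # One (repeats, size) block per subpopulation, joined into (repeats, 50).
    pooled = np.concatenate(
        [rng.normal(mean, std, size=(repeats, size)) for mean, std, size in groups],
        axis=1,
    )
    return np.median(pooled, axis=1)


def plot_medians(medians, bins=30):
    """Show a histogram of the medians (hypothetical helper; needs matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so sampling works without it

    plt.hist(medians, bins=bins, color="blue")
    plt.xlabel("sample median")
    plt.ylabel("count")
    plt.show()


medians = sample_medians(GROUPS, repeats=300, seed=0)
print(medians.shape)  # one median per repetition
# plot_medians(medians)  # uncomment to display the histogram
```

Passing a fixed `seed` makes the run reproducible. Leave it as `None` to get a fresh set of medians each time.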