How do I plot a probability density function in Python?


I want to plot the probability density function of the offsets recorded in a log file. Here is the code:

timestamps = []
sequences = []

log_Name = 'test_rtt_25-01-17_13-07-41_values5_rate50.log'
log_Path = "/home/ubuntu/results-25-01-09-docker/"
true_Path = log_Path + log_Name
with open(true_Path, "r") as f:
    for line in f:
        if not line.startswith('#'):
            time_part, seq_part = line.strip().split('(')
            base, offset = time_part.split('+')

            timestamps.append(float(offset))

            seq = int(seq_part[:-1])
            sequences.append(seq)

The code reads the log file and stores the offsets in 'timestamps' and the sequence numbers in 'sequences'.
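The log format itself is not shown, so the following is only a minimal sketch of the parsing step. It assumes each data line has the form base+offset(seq), which is what the split calls above imply. The helper name parse_line and the sample line are hypothetical and used only for illustration.

```python
def parse_line(line):
    """Parse a line of the assumed form 'base+offset(seq)' into (offset, seq)."""
    time_part, seq_part = line.strip().split('(')
    base, offset = time_part.split('+')      # base is ignored, as in the loop above
    return float(offset), int(seq_part[:-1])  # drop the trailing ')'

# Hypothetical sample line; the real log contents are not shown in the question.
sample = "1737115661+0.001009023(0)\n"
offset, seq = parse_line(sample)
print(offset, seq)
```

A line that does not match this shape makes the tuple unpacking raise ValueError, which is also how the original loop would fail on malformed input.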

Here is a sample of 'timestamps' and 'sequences':

[0.001009023, 0.001055868, 0.000992934, 0.001148472, 0.001086814, 0.001110649, 0.001066759, 0.00126167, 0.001231778, 0.000944345]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

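As you can see, there are 10 offsets and 10 sequence numbers. Each offset has its own number; for example, 0.001009023 is number 1. I want to plot the probability density function, so I tried this: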

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

source = {'seqs': sequences, 'times': timestamps}
df = pd.DataFrame(source)
df.sort_values(by = ['times'], inplace=True)
df_mean = np.mean(df['times'])
df_std = np.std(df['times'])
pdf = stats.norm.pdf(df['times'], df_mean, df_std)
plt.plot(df['times'], pdf)
plt.xlabel('Offsets')  # Label for the x-axis
plt.savefig('/home/ubuntu/')

However, the y-axis of the output looks really strange.

I don't understand why the probability is much larger than 1; it should be less than 1. Does anyone know what I am doing wrong?

python dataframe numpy matplotlib
1 Answer

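Here is how to estimate the PDF of the distribution underlying the data in three ways: a maximum likelihood estimate (assuming a normal distribution), a kernel density estimate, and Rosenblatt's shifted histogram.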

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats, integrate

times = [0.001009023, 0.001055868, 0.000992934, 0.001148472, 0.001086814, 
         0.001110649, 0.001066759, 0.00126167, 0.001231778, 0.000944345]
times = np.asarray(times)
x = np.linspace(0.0008, 0.0014, 300)

# Maximum likelihood estimate normal distribution
mu, sigma = stats.norm.fit(times)  # the sample mean and uncorrected (ddof=0) standard deviation
X = stats.Normal(mu=mu, sigma=sigma)  # needs SciPy >= 1.15; on older versions use stats.norm(mu, sigma)

# Kernel density estimate
Y = stats.gaussian_kde(times)

# Rosenblatt's Shifted Histogram
z = stats.mstats.rsh(times, points=x)

plt.plot(x, X.pdf(x), label='MLE Normal Distribution')
plt.plot(x, Y.evaluate(x), label='KDE')
plt.plot(x, z, label='RSH')
plt.legend()
plt.title("PDF Estimates")
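As for why the y-axis goes far above 1: the curve is a probability density, not a probability. Only the area under it has to equal 1. For a normal distribution the peak height is 1 / (sigma * sqrt(2 * pi)), so a standard deviation around 1e-4 gives a peak in the thousands. The sketch below checks this on the sample data from the question. The interval bounds 0.0010 and 0.0011 and the conversion to milliseconds are illustrative choices, not part of the original code.

```python
import numpy as np
from scipy import stats, integrate

times = np.array([0.001009023, 0.001055868, 0.000992934, 0.001148472, 0.001086814,
                  0.001110649, 0.001066759, 0.00126167, 0.001231778, 0.000944345])

mu, sigma = times.mean(), times.std()  # same estimates as stats.norm.fit(times)
dist = stats.norm(mu, sigma)           # frozen normal distribution

# Peak density: far above 1 because sigma is tiny.
peak = dist.pdf(mu)

# The area under the density over +/- 8 sigma is (numerically) 1.
area, _ = integrate.quad(dist.pdf, mu - 8 * sigma, mu + 8 * sigma)

# An actual probability comes from integrating over an interval (illustrative bounds).
prob = dist.cdf(0.0011) - dist.cdf(0.0010)

# Density has units of 1/x: expressing the offsets in milliseconds
# divides the density values by 1000.
times_ms = times * 1e3
peak_ms = stats.norm(times_ms.mean(), times_ms.std()).pdf(times_ms.mean())

print(f"peak density: {peak:.1f}")
print(f"area under curve: {area:.6f}")
print(f"P(0.0010 < X < 0.0011): {prob:.3f}")
print(f"peak density in ms units: {peak_ms:.3f}")
```

So the plot in the question is not wrong; if values below a few units are easier to read, rescaling the x-axis (for example to milliseconds) is one option.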
