Linear regression to fit a power law

Question

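I have two data sets, `index_list` and `frequency_list`, which I plot in a log-log plot with `plt.loglog(index_list, freq_list)`. Now I am trying to fit a power law `a*x^(-b)` with linear regression. I expect the fitted curve to follow the initial curve closely, but the following code seems to output a similar curve that is mirrored on the y-axis. I suspect I am using `curve_fit` badly.

Why is this curve mirrored on the x-axis, and how can I get it to properly fit my initial curve?

Using this data: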

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

f = open ("input.txt", "r")
index_list = []
freq_list = []
index = 0
for line in f:
    split_line = line.split()
    freq_list.append(int(split_line[1]))
    index_list.append(index)
    index += 1

plt.loglog(index_list, freq_list)
def power_law(x, a, b):
    return a * np.power(x, -b)

popt, pcov = curve_fit(power_law, index_list, freq_list)
plt.plot(index_list,  power_law(freq_list, *popt))
plt.show()

Answer 1

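The code below makes the following changes:

- For the scipy functions to work, it is best that both `index_list` and `freq_list` are numpy arrays rather than Python lists. Also, so that `power` does not overflow too quickly, these arrays should be of type `float` (not `int`).
- Since `0` raised to a negative power causes a division by zero, it makes sense to start `index_list` at `1`.
- Because of the powers, floats can overflow too. It therefore makes sense to add bounds to `curve_fit`. In particular, `b` should be limited to about 50 at most (the largest value is about `power(100000, b)`, which overflows when `b` is, for example, `100`). Setting initial values (`p0=...`) also helps guide the fitting process.
- Plotting with `index_list` as `x` and `power_law(freq_list, ...)` as `y` generates a very strange curve. The plot and the function must use the same `x`.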
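
The points about the zero index and about using the same `x` can be checked on a tiny example. This is a minimal sketch, not the code from the question or the answer: the parameters `a = 1000` and `b = 1.5` and the five ranks are hypothetical values chosen for illustration. It shows that a zero index yields an infinite value, and that evaluating the model at the frequencies instead of the ranks turns a decreasing curve into an increasing one, which may be why the plot in the question looks mirrored.

```python
import numpy as np

def power_law(x, a, b):
    return a * np.power(x, -b)

# Hypothetical parameters and ranks, chosen only for illustration
a, b = 1000.0, 1.5
x = np.arange(1, 6, dtype=float)  # ranks start at 1 and are stored as float
y = power_law(x, a, b)            # stand-in for the frequencies

# 0 raised to a negative power is a division by zero (NumPy returns inf)
with np.errstate(divide="ignore"):
    at_zero = power_law(np.array([0.0]), a, b)[0]

# Evaluating the model at the y values instead of the x values
wrong = power_law(y, a, b)

print(np.isinf(at_zero))         # True
print(np.all(np.diff(y) < 0))    # True: the data decrease with rank
print(np.all(np.diff(wrong) > 0))  # True: the mis-evaluated curve increases
```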

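Note that calling `plt.loglog()` changes both axes of the plot to logarithmic. All subsequent plots on the same axes will continue to use the logarithmic scale.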
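
The persistence of the log scale can be verified directly. The following sketch uses the object-oriented `Axes` interface rather than `plt.loglog()` so that the scales can be queried. The two curves are made-up data, and the `Agg` backend is selected only so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 101, dtype=float)  # made-up x values

fig, ax = plt.subplots()
ax.loglog(x, 1000 * x**-1.5, label="first curve")  # switches both axes to log scale
ax.plot(x, 500 * x**-1.0, label="second curve")    # plain plot() keeps the log axes

print(ax.get_xscale(), ax.get_yscale())  # log log
```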

import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import pandas as pd
import numpy as np

def power_law(x, a, b):
    return a * np.power(x, -b)

# Each line of the file holds a word and its count, separated by whitespace
df = pd.read_csv("https://norvig.com/google-books-common-words.txt", sep=r"\s+", header=None)

# Ranks start at 1 (0 to a negative power is a division by zero); use float arrays
index_list = df.index.to_numpy(dtype=float) + 1
freq_list = df[1].to_numpy(dtype=float)

plt.loglog(index_list, freq_list, label='given data')

# Initial guess and bounds keep the optimizer away from overflowing powers
popt, pcov = curve_fit(power_law, index_list, freq_list, p0=[1, 1], bounds=[[1e-3, 1e-3], [1e20, 50]])

# Evaluate the fitted model at the same x values that are plotted
plt.plot(index_list, power_law(index_list, *popt), label='power law')
plt.legend()
plt.show()
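
The code above uses nonlinear least squares through `curve_fit`. Since the question asks about linear regression, here is a hedged sketch of that alternative, which is not part of the original answer. Taking logarithms turns `y = a*x^(-b)` into the straight line `log(y) = log(a) - b*log(x)`, which `np.polyfit` can fit directly. The data are synthetic (an assumed power law with `a = 1000`, `b = 1.5` and mild multiplicative noise), not the word-frequency file.

```python
import numpy as np

# Synthetic data, assumed for illustration: y = 1000 * x^(-1.5) with mild noise
rng = np.random.default_rng(0)
x = np.arange(1, 501, dtype=float)
y = 1000.0 * x**-1.5 * np.exp(rng.normal(0.0, 0.05, size=x.size))

# log(y) = log(a) - b*log(x) is a straight line, so fit a degree-1 polynomial
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
a_fit, b_fit = np.exp(intercept), -slope

print(a_fit, b_fit)  # close to the assumed values 1000 and 1.5
```

A fit in log space minimizes relative errors, whereas `curve_fit` on the raw values minimizes absolute errors, so the two approaches can give noticeably different parameters on real data. The values from the log-log fit could also serve as the `p0` starting point for `curve_fit`.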
