Today I wrote my own function to compute quartiles/deciles/percentiles:
from math import floor, ceil

def mean(x: list) -> float:
    """Arithmetic mean (average) of a list:
    add all elements and divide by the number of elements."""
    if len(x) < 1:
        raise TypeError("mean need at least one element")
    else:
        sum_of_elements = sum(x)
        mean_value = sum_of_elements / len(x)
        return mean_value

def percentiles(percentile, x: list) -> float:
    '''Percentile is the value below which the values of a given percentage of samples fall.
    percentile = 25, 50, 75, 100 is 1st, 2nd, 3rd and 4th quartile
    percentile = 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 is 1st, 2nd, 3rd,
    4th, 5th, 6th, 7th, 8th, 9th and 10th decile.'''
    x = sorted(x)
    n = len(x)
    if percentile < 1 or percentile > 100:
        raise TypeError("percentile should be in range 1-100")
    if n < 0:
        raise TypeError("list cannot be empty")
    location = percentile * (n + 1) / 100
    if int(location) == location:
        return x[int(location) - 1]
    else:
        down = floor(location)
        up = ceil(location)
        if down == 0:
            return x[up]
        else:
            even = mean([x[down - 1], x[up - 1]])
            return even
When I compare the function above with np.quantile and stats.quantiles, I get different results. For example, for the list [1,2,3,4,5,6,7,8,10,11,12,15,16,21]: my Q1: 3.5; statistics Q1: 3.75 (method='exclusive') or 4.25 (method='inclusive'); NumPy Q1: 4.25 (?)
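For reference, the two standard-library figures quoted above can be reproduced with a short script. This is a minimal sketch that assumes "stats" refers to Python's built-in statistics module; the variable names are only illustrative.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 15, 16, 21]

# statistics.quantiles returns the n - 1 cut points; with n=4 the
# first element of the result is the first quartile (Q1)
q1_exclusive = statistics.quantiles(data, n=4, method="exclusive")[0]
q1_inclusive = statistics.quantiles(data, n=4, method="inclusive")[0]

print(q1_exclusive)  # 3.75
print(q1_inclusive)  # 4.25
```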
What is wrong with my percentiles() function? When I work it out on paper, I always get the same result as percentiles().
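There is more than one convention for locating a percentile, which is why the results differ. To reproduce the 4.25 returned by np.quantile (and by method='inclusive'), compute the position in the sorted list as
(n-1)*percentile/100
which is a 0-based index into the sorted list, not a multiplier. Here is the corrected percentiles function: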
from math import floor, ceil

def percentiles(percentile, x: list) -> float:
    '''Percentile is the value below which the values of a given percentage of samples fall.
    percentile = 25, 50, 75, 100 is 1st, 2nd, 3rd and 4th quartile
    percentile = 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 is 1st, 2nd, 3rd,
    4th, 5th, 6th, 7th, 8th, 9th and 10th decile.'''
    x = sorted(x)
    n = len(x)
    if percentile < 1 or percentile > 100:
        raise ValueError("percentile should be in the range 1-100")
    if n < 1:
        raise ValueError("list cannot be empty")
    location = (n - 1) * percentile / 100  # 0-based position in the sorted list
    if int(location) == location:
        return x[int(location)]
    else:
        down = floor(location)
        up = ceil(location)
        # Interpolate linearly between the two neighbours. A plain average
        # would give 4.5 instead of 4.25 for Q1 of the example list.
        fraction = location - down
        return x[down] + fraction * (x[up] - x[down])
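
To see where the three Q1 values in the question come from, here is a small sketch that applies each convention to the example list. The helper interpolate_at and the variable names are hypothetical and only serve the illustration. It suggests that the original function uses the same (n+1) position as method='exclusive' but averages the two neighbours instead of interpolating between them.

```python
from math import floor

def interpolate_at(sorted_x, pos):
    """Linearly interpolate at a 0-based fractional position (illustrative helper)."""
    # Clamp the position so it always falls inside the list
    pos = max(0.0, min(pos, len(sorted_x) - 1))
    lower = floor(pos)
    upper = min(lower + 1, len(sorted_x) - 1)
    fraction = pos - lower
    return sorted_x[lower] + fraction * (sorted_x[upper] - sorted_x[lower])

data = sorted([1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 15, 16, 21])
n = len(data)
p = 25

# "Inclusive" convention: 0-based position (n - 1) * p / 100
q1_inclusive = interpolate_at(data, (n - 1) * p / 100)

# "Exclusive" convention: 1-based position (n + 1) * p / 100, shifted to 0-based
q1_exclusive = interpolate_at(data, (n + 1) * p / 100 - 1)

# Averaging the two neighbours at the exclusive position,
# which is what the function in the question does
pos = (n + 1) * p / 100
q1_averaged = (data[floor(pos) - 1] + data[floor(pos)]) / 2

print(q1_inclusive, q1_exclusive, q1_averaged)  # 4.25 3.75 3.5
```

Neither 3.75 nor 4.25 is wrong; they follow different, equally common definitions, so the choice depends on which library or textbook you want to agree with.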