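Check the distribution of a specific value, for example how many times the weather in a given column is exactly Cloudy. Feel free to check other values as well. You can do this by calling the function clear with the appropriate arguments.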
#Importing the modules
import pandas as pd
import numpy as np
from scipy.stats import mode
weather = pd.read_csv(path, sep = ",", delimiter = None)
#code to check distribution of variable
def clear(df,col,val):
    """ Check distribution of variable
    df,col,val
    This function accepts a dataframe,column(feature) and value which returns count of the value,
    containing the value counts of a variable(value_counts)
    Keyword arguments:
    df - Pandas dataframe
    col - Feature of the dataframe
    val - value of the feature
    Returns:
    value_counts - Value count of the feature
    """
    value_counts = df[(df.col == val)] # filtering dataframe
    print(len(value_counts))
clear(weather,weather.Weather,'Cloudy') # call the function clear
Perhaps pass only the name of the column rather than the whole column:
def clear(df, column_name, column_value):
    value_counts = df.loc[(df[column_name] == column_value)] # filtering dataframe
    return len(value_counts)
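The reason this helps: `df.col` asks pandas for a column literally named "col", not for the column held in the variable `col`, so the original lookup raises an AttributeError. Indexing with `df[column_name]` uses the string stored in the variable. Below is a minimal sketch of how the function might be called. The small `weather` frame is made-up sample data standing in for the CSV, which is an assumption since the real file is not shown here.

```python
import pandas as pd

# Hypothetical sample data standing in for the weather CSV (not the real file).
weather = pd.DataFrame(
    {"Weather": ["Cloudy", "Clear", "Cloudy", "Rain", "Mostly Cloudy"]}
)

def clear(df, column_name, column_value):
    """Return how many rows of df have exactly column_value in column_name."""
    matches = df.loc[df[column_name] == column_value]  # filtering dataframe
    return len(matches)

# Pass the column name as a string, not the Series weather.Weather.
cloudy_count = clear(weather, "Weather", "Cloudy")
print(cloudy_count)

# The same count without building the filtered frame: sum the boolean mask.
alt_count = int((weather["Weather"] == "Cloudy").sum())
```

Because the comparison is an exact match, "Mostly Cloudy" is not counted as "Cloudy" in this sample.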