Counting all missing values for specific data using a pivot table in pandas

Question

I am working with a dataset named titanic.csv. To simplify the problem, here is some of the data:

I need to count all missing values for child, which, as you can see, is one of the values in the who column. This has to be done with a pivot table.

I tried this solution:

pd.pivot_table(df[df['who'] == 'child'], 
index='sex', 
aggfunc=lambda x: x.isnull().sum(), 
 margins=True) # to sum all missing values based on gender

But I get this output, in which, as you can see, the All row does not sum the missing values across the sexes.

What is wrong with my code? Should I use a different approach to build the pivot table?

Answer 1

If you only want the number of missing values per feature for the children, you can use isna():

import pandas as pd

# Small sample that mimics the structure of titanic.csv
data = {'survived': [0, 1, 1, 1, 0], 
        'pclass': [3, 1, None, 1, 3], 
        'sex': ['male', 'female', 'female', 'female', 'male'], 
        'age': [22, 38, None, None, 35], 
        'class': ['Third', 'First', None, 'First', 'Third'], 
        'who': ['man', 'woman', 'child', 'child', 'man'], 
        'deck': [None, 'C', None, 'C', None], 
        'alive': ['no', 'yes', 'yes', 'yes', 'no'], 
        'alone': [False, False, True, False, True] } 
df = pd.DataFrame(data)

# Keep only the child rows, then count NaN/None per column.
# display() is available in Jupyter/IPython; use print() in a plain script.
display(df[df["who"] == "child"].isna().sum())

survived    0
pclass      1
sex         0
age         2
class       1
who         0
deck        1
alive       0
alone       0
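
If the result really has to come from a pivot table, as the question requires, one possible sketch is to turn the child rows into 0/1 missing-value indicators first and only then pivot with a plain sum. Once the frame holds indicators, it contains no NaN for pivot_table to handle, so the All row produced by margins=True should be a straightforward column total. This is a sketch rather than a diagnosis of the original code; a plausible cause there is that pivot_table defaults to dropna=True, which can discard rows containing NaN before the margins are computed. The names children, missing and table are chosen here for illustration, and the data is the same small sample as above, in which both children happen to be female.

```python
import pandas as pd

# Same small sample as in the answer above (not the full titanic.csv).
data = {'survived': [0, 1, 1, 1, 0],
        'pclass': [3, 1, None, 1, 3],
        'sex': ['male', 'female', 'female', 'female', 'male'],
        'age': [22, 38, None, None, 35],
        'class': ['Third', 'First', None, 'First', 'Third'],
        'who': ['man', 'woman', 'child', 'child', 'man'],
        'deck': [None, 'C', None, 'C', None],
        'alive': ['no', 'yes', 'yes', 'yes', 'no'],
        'alone': [False, False, True, False, True]}
df = pd.DataFrame(data)

# Keep only the rows where who == 'child'.
children = df[df['who'] == 'child']

# Replace every feature with a 0/1 flag: 1 where the value is missing.
missing = children.drop(columns='sex').isna().astype(int)

# Put the grouping key back; it is kept as-is, not turned into a flag.
missing['sex'] = children['sex']

# Summing the flags per sex gives the missing-value counts;
# margins=True appends an 'All' row with the totals over all children.
table = pd.pivot_table(missing, index='sex', aggfunc='sum', margins=True)
print(table)
```

With the full dataset, the same code would give one row per sex plus the All row. A groupby('sex').sum() on the indicator frame is an alternative if the All row is not needed.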
