How to get rid of the zeros in the last column

Problem description · Votes: 0 · Answers: 4

I am working on an Applied Data Science assignment.

Question: Cut % Renewable into 5 bins. Group the Top15 by Continent, as well as these new % Renewable bins. How many countries are in each group? This function should return a Series with a MultiIndex of Continent, then the bins for % Renewable. Do not include groups with no countries.

Here is my code:

def answer_twelve():

    Top15 = answer_one()
    ContinentDict  = {'China':'Asia', 
                  'United States':'North America', 
                  'Japan':'Asia', 
                  'United Kingdom':'Europe', 
                  'Russian Federation':'Europe', 
                  'Canada':'North America', 
                  'Germany':'Europe', 
                  'India':'Asia',
                  'France':'Europe', 
                  'South Korea':'Asia', 
                  'Italy':'Europe', 
                  'Spain':'Europe', 
                  'Iran':'Asia',
                  'Australia':'Australia', 
                  'Brazil':'South America'}
    Top15['Continent'] = Top15.index.to_series().map(ContinentDict)
    Top15['bins'] = pd.cut(Top15['% Renewable'],5)
    return pd.Series(Top15.groupby(by = ['Continent', 'bins']).size())#,apply(lambda x:s if x['Rank']==0 continue))
answer_twelve()

This is the output of the code above:

Continent      bins            
Asia           (2.212, 15.753]     4
               (15.753, 29.227]    1
               (29.227, 42.701]    0
               (42.701, 56.174]    0
               (56.174, 69.648]    0
Australia      (2.212, 15.753]     1
               (15.753, 29.227]    0
               (29.227, 42.701]    0
               (42.701, 56.174]    0
               (56.174, 69.648]    0
Europe         (2.212, 15.753]     1
               (15.753, 29.227]    3
               (29.227, 42.701]    2
               (42.701, 56.174]    0
               (56.174, 69.648]    0
North America  (2.212, 15.753]     1
               (15.753, 29.227]    0
               (29.227, 42.701]    0
               (42.701, 56.174]    0
               (56.174, 69.648]    1
South America  (2.212, 15.753]     0
               (15.753, 29.227]    0
               (29.227, 42.701]    0
               (42.701, 56.174]    0
               (56.174, 69.648]    1
dtype: int64

The desired output is:

Continent      bins            
Asia           (2.212, 15.753]     4
               (15.753, 29.227]    1
Australia      (2.212, 15.753]     1
Europe         (2.212, 15.753]     1
               (15.753, 29.227]    3
               (29.227, 42.701]    2
North America  (2.212, 15.753]     1
               (56.174, 69.648]    1
South America  (56.174, 69.648]    1
Name: Countries, dtype: int64

I want to get rid of the zeros, so I tried using

pd.Series(Top15.groupby(by = ['Continent', 'bins']).size().apply(lambda x:s if x['Rank']==0 continue))

but I keep getting the following error:

File "<ipython-input-317-14bc05bb2137>", line 20
    return pd.Series(Top15.groupby(by = ['Continent', 'bins']).size().apply(lambda x:s if x['Rank']==0 continue))
                                                                                                              ^
SyntaxError: invalid syntax

I cannot figure out my mistake. Please help me!

python pandas-groupby
4 Answers
1 vote

With pandas, drop the rows where the column is zero.

If column_name is your column:

df = df[df.column_name != 0]
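
As a self-contained illustration of that boolean filter (a minimal sketch; the DataFrame and the column name here are made up for the example):

import pandas as pd

# toy DataFrame; column_name stands in for whatever column holds the counts
df = pd.DataFrame({'column_name': [4, 0, 1, 0, 3]})
df = df[df.column_name != 0]   # keep only the rows whose value is non-zero
print(df)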

0 votes
lambda x:s if x['Rank']==0 continue

This makes no sense, because continue is only useful inside a loop. Also note that the lambda needs to return a value. Instead, leave that branch empty:

lambda x:"" if x['Rank']==0 else s

0 votes

You can iterate over the values with a for loop, then use replace() to replace 0 with NaN; after that you can drop them with dropna(). I tried using drop() and droplevel() instead of replacing them, but that did not work. Here is my code:

import numpy as np

# pd_series is the groupby().size() result from the question
for k, v in pd_series.items():
    if v == 0:
        # turn the zero counts into NaN so dropna() can remove them
        pd_series.replace(to_replace=v, value=np.nan, inplace=True)
        pd_series.dropna(axis=0, inplace=True)
print(pd_series)

You may need to change the dtype of the result. The output is:

Continent      bins            
Asia           (2.212, 15.753]     4
               (15.753, 29.227]    1
Australia      (2.212, 15.753]     1
Europe         (2.212, 15.753]     1
               (15.753, 29.227]    3
               (29.227, 42.701]    2
North America  (2.212, 15.753]     1
               (56.174, 69.648]    1
South America  (56.174, 69.648]    1
dtype: int64
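
The same replace-then-dropna idea can also be written without the explicit loop (a sketch, assuming the Series from the question is named pd_series):

import numpy as np

# replace zero counts with NaN, drop them, then restore the integer dtype
pd_series = pd_series.replace(0, np.nan).dropna().astype(int)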

0 votes

Since your final result is a Series, you need to replace

return pd.Series(Top15.groupby(by = ['Continent', 'bins']).size())

with:

temp_df = Top15.groupby(by=['Continent', 'bins']).size()
return temp_df[temp_df != 0]
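
Putting that change back into the original function, a minimal sketch (assuming answer_one() and the ContinentDict mapping from the question are available) could look like:

import pandas as pd

def answer_twelve():
    # assumes answer_one() returns the Top15 DataFrame with a '% Renewable' column
    # and ContinentDict is the country-to-continent mapping from the question
    Top15 = answer_one()
    Top15['Continent'] = Top15.index.to_series().map(ContinentDict)
    Top15['bins'] = pd.cut(Top15['% Renewable'], 5)
    counts = Top15.groupby(['Continent', 'bins']).size()
    return counts[counts != 0]   # drop the (Continent, bin) groups with no countries

On newer pandas versions, passing observed=True to groupby when grouping by a categorical column (the bins produced by pd.cut are categorical) also keeps only the combinations that actually occur.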