How to merge values across rows and replace NaN values in pandas

Question

I am doing some operations on a DataFrame:

df

  Node        Interface      Speed      carrier    1-May  9-May   2-Jun    21-Jun  
  Server1      internet1     10          ATT       20     30     50      90    
  Server1      wan3.0        20          Comcast   NaN    NaN    NaN     100
  Server1      wan3.0        50          Comcast   30     40     40      NaN
  Server2      wan2          100         Sprint    90     70     NaN     NaN
  Server2      wan2          20          Sprint    NaN    NaN    88      70
  Server2      Internet2     40          Verizon   10     60     90      70

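I need to merge the rows within each group of Node and Interface, filling each NaN value with the value from another row in the same group, and then keep the maximum Speed for each interface.

The expected DataFrame should look like this: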

df1

  Node        Interface      Speed      carrier    1-May  9-May   2-Jun    21-Jun
  Server1      internet1     10          ATT       20      30      50       90    
  Server1      wan3.0        50          Comcast   30      40      40       100
  Server2      wan2          100         Sprint    90      70      88       70
  Server2      Internet2     40          Verizon   10      60      90       70

I tried this:

df2 = df.groupby(['Node', 'Interface', 'carrier']).agg({'Speed': 'max'}).reset_index()

df3 = df.drop('Speed', axis=1)

df4 = df3.ffill().drop_duplicates()

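It doesn't quite work. Is there a simple way to merge the rows, fill the NaN values from the other rows, and pick the maximum value for the Speed column?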

python pandas
1 Answer

Code:

cols = ['carrier', '1-May', '9-May', '2-Jun', '21-Jun']
g = df.groupby(['Node', 'Interface'], sort=False, as_index=False)
out = g.agg({**{'Speed': 'max'}, **dict.fromkeys(cols, 'first')})

Output:

      Node  Interface  Speed  carrier  1-May  9-May  2-Jun  21-Jun
0  Server1  internet1     10      ATT   20.0   30.0   50.0    90.0
1  Server1     wan3.0     50  Comcast   30.0   40.0   40.0   100.0
2  Server2       wan2    100   Sprint   90.0   70.0   88.0    70.0
3  Server2  Internet2     40  Verizon   10.0   60.0   90.0    70.0
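
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand (the construction of `df` is my assumption about how the data is loaded; `value_cols` and `agg_spec` are illustrative names) and then applies the same aggregation. It relies on the groupby `'first'` aggregation skipping NaN by default, so each column gets the first non-null value in its group, while `'max'` picks the highest Speed.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed construction).
df = pd.DataFrame({
    'Node': ['Server1', 'Server1', 'Server1', 'Server2', 'Server2', 'Server2'],
    'Interface': ['internet1', 'wan3.0', 'wan3.0', 'wan2', 'wan2', 'Internet2'],
    'Speed': [10, 20, 50, 100, 20, 40],
    'carrier': ['ATT', 'Comcast', 'Comcast', 'Sprint', 'Sprint', 'Verizon'],
    '1-May': [20, np.nan, 30, 90, np.nan, 10],
    '9-May': [30, np.nan, 40, 70, np.nan, 60],
    '2-Jun': [50, np.nan, 40, np.nan, 88, 90],
    '21-Jun': [90, 100, np.nan, np.nan, 70, 70],
})

# Columns where the first non-null value per group should be kept.
value_cols = ['carrier', '1-May', '9-May', '2-Jun', '21-Jun']

# 'max' for Speed, 'first' (which skips NaN) for everything else.
agg_spec = {'Speed': 'max', **dict.fromkeys(value_cols, 'first')}

# sort=False keeps the groups in order of first appearance;
# as_index=False keeps Node and Interface as regular columns.
out = df.groupby(['Node', 'Interface'], sort=False, as_index=False).agg(agg_spec)
print(out)
```

Note that if a group has two different non-null values in the same column, `'first'` silently keeps the earlier one, so this only merges cleanly when the rows in a group complement each other, as they do in the sample data.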