Filter a DataFrame and replace non-matching strings with NaN

Question

I want to keep only the fruits that match %G% or %P%. Any value that does not match %G% or %P% should be replaced with NaN.

UniqueID	Fruit1	Fruit2	Fruit3
1234	Banana	Peach	Guava
1235	Orange	Grapes	NaN
1236	Pear	Papaya	Apricot
1237	Guava	NaN	NaN
1238	Kiwi	Cherry	Peach

My result needs to look like this:

UniqueID	Fruit1	Fruit2	Fruit3
1234	NaN	Peach	Guava
1235	NaN	Grapes	NaN
1236	Pear	Papaya	NaN
1237	Guava	NaN	NaN
1238	NaN	NaN	Peach

Answer 1

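import numpy as np

# Create a list with the strings to match (the match is case-sensitive)
to_search = ["G", "P"]
# Replace a string with np.nan if it doesn't contain "G" or "P";
# non-string values (the integer IDs, existing NaN) are returned unchanged
rep = lambda value: np.nan if isinstance(value, str) and not any(s in value for s in to_search) else value
# Apply the function to every cell of the DataFrame
# (on pandas 2.1+, DataFrame.map is the non-deprecated name for applymap)
df = df.applymap(rep)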

Assuming your DataFrame is called df, this should do the trick.
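To check the element-wise approach end to end, here is a minimal sketch that rebuilds the sample table from the question and applies the same test column by column. The column names (UniqueID, Fruit1, Fruit2, Fruit3) are assumed from the second answer, and the helper name keep_if_match is made up for illustration. It uses Series.map on the fruit columns only, which avoids touching the ID column.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; column names are an assumption.
df = pd.DataFrame({
    "UniqueID": [1234, 1235, 1236, 1237, 1238],
    "Fruit1": ["Banana", "Orange", "Pear", "Guava", "Kiwi"],
    "Fruit2": ["Peach", "Grapes", "Papaya", np.nan, "Cherry"],
    "Fruit3": ["Guava", np.nan, "Apricot", np.nan, "Peach"],
})

to_search = ["G", "P"]


def keep_if_match(value):
    """Hypothetical helper: keep strings containing "G" or "P", else NaN."""
    if isinstance(value, str) and any(s in value for s in to_search):
        return value
    return np.nan


fruit_cols = ["Fruit1", "Fruit2", "Fruit3"]
for col in fruit_cols:
    # Series.map calls the helper once per cell in this column.
    df[col] = df[col].map(keep_if_match)

print(df)
```

Because the test is case-sensitive, "Orange" and "Apricot" (lowercase g and p) are replaced, which is what the expected output in the question shows.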

Answer 2

If you only want to change certain columns, you can do this:

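to_replace = ["Fruit1", "Fruit2", "Fruit3"]  # these columns will be changed
for column in to_replace:
    # Keep values that contain "G" or "P"; everything else becomes NaN.
    # na=False makes cells that are already NaN count as non-matches.
    df[column] = df[column].where(
        df[column].str.contains("G|P", na=False)
    )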

By default, str.contains treats the pattern as a regular expression, so this may be more convenient if you are familiar with regex.
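The same idea can be applied to all fruit columns at once instead of looping. The following is a rough sketch under the same assumptions as the answer (column names Fruit1 to Fruit3, sample data taken from the question): it builds one boolean mask with str.contains and passes it to DataFrame.where.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; column names are an assumption.
df = pd.DataFrame({
    "UniqueID": [1234, 1235, 1236, 1237, 1238],
    "Fruit1": ["Banana", "Orange", "Pear", "Guava", "Kiwi"],
    "Fruit2": ["Peach", "Grapes", "Papaya", np.nan, "Cherry"],
    "Fruit3": ["Guava", np.nan, "Apricot", np.nan, "Peach"],
})

fruit_cols = ["Fruit1", "Fruit2", "Fruit3"]
pattern = "G|P"  # regex alternation: uppercase "G" or uppercase "P"

# One boolean column per fruit column; na=False treats missing cells as non-matches.
mask = df[fruit_cols].apply(lambda col: col.str.contains(pattern, na=False))

# where() keeps cells where the mask is True and writes NaN everywhere else.
df[fruit_cols] = df[fruit_cols].where(mask)

print(df)
```

The pattern is case-sensitive by default. Passing case=False to str.contains would also keep values such as "Orange" and "Apricot", which is not what the expected output in the question shows.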
