Sorting a DataFrame on multiple columns with a custom order

Question

I have the following DataFrame:

import pandas as pd

df = pd.DataFrame({
        'label': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4],
        'condition1': ['c','c','f','f','c','c','f','f','c','c','f','f','c','c','f','f'],
        'condition2': ['c','f','c','f','c','f','c','f','c','f','c','f','c','f','c','f']})

I have already sorted df with the following code:

df = df.sort_values(by=['label', 'condition1'], ascending=[True, True])

I also want to sort on 'condition2' so that the result looks like this:

df = pd.DataFrame({
        'label': [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4], 
        'condition1': ['c','c','f','f','c','c','f','f','c','c','f','f','c','c','f','f'], 
        'condition2': ['c','f','f','c','c','f','f','c','c','f','f','c','c','f','f','c']})

How can I achieve this? I tried adding condition2 to sort_values, but the order did not change.

Answer 1

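The logic still needs clarifying, but assuming you want the first condition2 value in each (label, condition1) group to match condition1, you can compute the sort order from whether the two columns are equal: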

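import numpy as np

# np.lexsort treats the LAST key as the primary one: label, then condition1,
# then whether condition2 differs from condition1 (False sorts before True).
order = np.lexsort([df['condition2'].ne(df['condition1']), df['condition1'], df['label']])

# order holds positions, so index with iloc.
out = df.iloc[order]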

Output:

    label condition1 condition2
0       1          c          c
1       1          c          f
3       1          f          f
2       1          f          c
4       2          c          c
5       2          c          f
7       2          f          f
6       2          f          c
8       3          c          c
9       3          c          f
11      3          f          f
10      3          f          c
12      4          c          c
13      4          c          f
15      4          f          f
14      4          f          c
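
The same ordering can probably be reached with pandas alone, by materializing the equality test as a temporary column and passing it to sort_values as the last key. This is only a sketch under the same assumption as above (the row where condition2 equals condition1 should come first in each group). The column name mismatch is a hypothetical helper, not part of the original data, and the frame is shortened to two labels for brevity.

```python
import pandas as pd

df = pd.DataFrame({
    'label': [1, 1, 1, 1, 2, 2, 2, 2],
    'condition1': ['c', 'c', 'f', 'f', 'c', 'c', 'f', 'f'],
    'condition2': ['c', 'f', 'c', 'f', 'c', 'f', 'c', 'f'],
})

# Hypothetical helper column: False where condition2 equals condition1,
# True otherwise. False sorts before True, so the matching row comes first
# within each (label, condition1) group.
out = (
    df.assign(mismatch=df['condition2'].ne(df['condition1']))
      .sort_values(by=['label', 'condition1', 'mismatch'])
      .drop(columns='mismatch')
)

print(out)
```

Sorting on condition2 directly changes nothing because, within each group, 'c' already precedes 'f' alphabetically. Sorting on the comparison result is what puts 'f' first in the condition1 == 'f' groups.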
