DataFrame manipulation in Python pandas

Question

What is the most efficient way to turn each row of a pandas DataFrame like this:

    p1  p2  prog
0   A   B   C

into three rows like this?

    n1   n2 edge_type
0    A  A/B  marriage
1    B  A/B  marriage
2  A/B    C     child

Or, equivalently, how do I transform df into dF as follows:

df = pd.DataFrame({'prog':['C'], 'p1': ['A'], 'p2': ['B']})
dF = pd.DataFrame({'edge_type':['marriage', 'marriage', 'child'], 'n1': ['A', 'B', 'A/B'], 'n2': ['A/B', 'A/B', 'C']})

In R this is simple: define a worker function and use it with mapply. I am still stuck on how to do it in Python.

Answer 1

Use apply:

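def func(s):
    # Label for the couple node, e.g. 'A/B'
    combo = '/'.join([s['p1'], s['p2']])
    # Two marriage edges (parent -> couple) and one child edge (couple -> child)
    l = [[s['p1'], combo, 'marriage'], [s['p2'], combo, 'marriage'], [combo, s['prog'], 'child']]
    # unstack() flattens the 3x3 frame into one Series indexed by (column, row)
    return pd.DataFrame(l, columns=['n1', 'n2', 'edge_type']).unstack()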

Then, with your example:

df.apply(func, axis=1).stack().reset_index(drop=True)

This returns:

    n1   n2 edge_type
0    A  A/B  marriage
1    B  A/B  marriage
2  A/B    C     child
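To see why the apply/stack round trip works, it may help to look at what func returns for a single row. The sketch below rebuilds the 3x3 edge table for the example row by hand (the names rows, block, and flat are illustrative, not from the answer) and shows that unstack() turns it into one Series keyed by (column name, row label). That shape is what lets apply place all nine cells in a single result row before stack() spreads them back out.

```python
import pandas as pd

# The 3x3 edge table that func builds for the example row
# (p1='A', p2='B', prog='C'), written out by hand here.
rows = [['A', 'A/B', 'marriage'],
        ['B', 'A/B', 'marriage'],
        ['A/B', 'C', 'child']]
block = pd.DataFrame(rows, columns=['n1', 'n2', 'edge_type'])

# unstack() on a frame with a plain (single-level) index returns a Series
# whose MultiIndex pairs each column name with each row label.
flat = block.unstack()

print(flat.index.tolist()[:3])  # first three (column, row) keys
print(flat.loc[('n2', 2)])      # the cell in column 'n2', row 2
```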

Answer 2

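import pandas as pd

df = pd.DataFrame({'prog':['C'], 'p1': ['A'], 'p2': ['B']})

data = []
for row in df.itertuples(index=False):
    # Read fields by name: positional access (row[1], row[2], ...) depends on
    # column order, and recent pandas keeps the dict's order (prog, p1, p2).
    combo = '/'.join([row.p1, row.p2])
    data.append(('marriage', row.p1, combo))
    data.append(('marriage', row.p2, combo))
    data.append(('child', combo, row.prog))
dF = pd.DataFrame.from_records(data, columns=['edge_type', 'n1', 'n2'])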

I tried using apply, but it turned into a very hackish solution. I am sure there is a better one.
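One possible alternative that avoids a Python-level loop over rows is to build the three edge kinds as whole columns and concatenate them. This is only a sketch under the assumption that p1, p2, and prog are all string columns; the names combo and edges are illustrative and do not come from either answer.

```python
import pandas as pd

df = pd.DataFrame({'prog': ['C'], 'p1': ['A'], 'p2': ['B']})

# Couple label shared by all three edges of a row, e.g. 'A/B'.
combo = df['p1'] + '/' + df['p2']

# One frame per edge kind; each keeps the original row index.
edges = pd.concat(
    [
        pd.DataFrame({'n1': df['p1'], 'n2': combo, 'edge_type': 'marriage'}),
        pd.DataFrame({'n1': df['p2'], 'n2': combo, 'edge_type': 'marriage'}),
        pd.DataFrame({'n1': combo, 'n2': df['prog'], 'edge_type': 'child'}),
    ],
    keys=[0, 1, 2],  # outer index level records which block an edge came from
)

# Sort by original row first, then by block, so each input row's three
# edges stay together in the order marriage, marriage, child.
dF = edges.swaplevel().sort_index().reset_index(drop=True)
print(dF)
```

Whether this is actually faster than itertuples depends on the size of df; for a handful of rows the difference is unlikely to matter, so it is worth timing on real data before choosing.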
