Transforming a DataFrame with pandas in a Jupyter notebook?


I am trying to transform a DataFrame using pandas in a Jupyter notebook.

Table 1: the existing DataFrame

     Dept Name  Jan_Base  Feb_Base  Mar_Base  Jan_Count  Feb_Count  Mar_Count
    Sales    A         8        46        28          5         70         62
    Sales    B        62        43        20         27         10         44
Marketing    A        39        30        54         41         32         60
Marketing    B        41        13        67         31         20         11
     Tech    A        46        64        39         73         24         46
     Tech    B        41        35        70         44         51          8
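
For anyone who wants to reproduce the problem, Table 1 can be rebuilt by hand roughly as follows. This is only a sketch: the variable name df and the integer dtypes are assumptions, since the original post shows the data only as a printed table.

```python
import pandas as pd

# Table 1 rebuilt by hand from the values shown in the question.
# The name `df` is an assumption; it matches the name used in the answer.
df = pd.DataFrame({
    "Dept": ["Sales", "Sales", "Marketing", "Marketing", "Tech", "Tech"],
    "Name": ["A", "B", "A", "B", "A", "B"],
    "Jan_Base": [8, 62, 39, 41, 46, 41],
    "Feb_Base": [46, 43, 30, 13, 64, 35],
    "Mar_Base": [28, 20, 54, 67, 39, 70],
    "Jan_Count": [5, 27, 41, 31, 73, 44],
    "Feb_Count": [70, 10, 32, 20, 24, 51],
    "Mar_Count": [62, 44, 60, 11, 46, 8],
})

print(df)
```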

Table 2: the output DataFrame I want

     Dept Name  Month  Base  Count
    Sales    A      1     8      5
    Sales    A      2    46     70
    Sales    A      3    28     62
    Sales    B      1    62     27
    Sales    B      2    43     10
    Sales    B      3    20     44
Marketing    A      1    39     41
Marketing    A      2    30     32
Marketing    A      3    54     60
Marketing    B      1    41     31
Marketing    B      2    13     20
Marketing    B      3    67     11
     Tech    A      1    46     73
     Tech    A      2    64     24
     Tech    A      3    39     46
     Tech    B      1    41     44
     Tech    B      2    35     51
     Tech    B      3    70      8

Please help me do this with pandas and Python in a Jupyter notebook.

I want to convert df 1 into df 2.

python-3.x pandas dataframe jupyter transform
1 answer

You can create a MultiIndex that splits the column names, then reshape the DataFrame from wide to long format:

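import pandas as pd

# Split each wide column name ("Jan_Base") into a (Month, measure) pair
mi = pd.MultiIndex.from_frame(df.columns[2:].str.extract(r'([^_]+)_?(.*)'),
                              names=['Month', None])

# Index by Dept/Name, swap in the two-level columns, then move Month into the rows;
# sort=False keeps the months in their original column order
out = (df.set_index(df.columns[:2].tolist()).set_axis(mi, axis=1)
         .stack('Month', sort=False).reset_index())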

Output:

>>> out
         Dept Name Month  Base  Count
0       Sales    A   Jan     8      5
1       Sales    A   Feb    46     70
2       Sales    A   Mar    28     62
3       Sales    B   Jan    62     27
4       Sales    B   Feb    43     10
5       Sales    B   Mar    20     44
6   Marketing    A   Jan    39     41
7   Marketing    A   Feb    30     32
8   Marketing    A   Mar    54     60
9   Marketing    B   Jan    41     31
10  Marketing    B   Feb    13     20
11  Marketing    B   Mar    67     11
12       Tech    A   Jan    46     73
13       Tech    A   Feb    64     24
14       Tech    A   Mar    39     46
15       Tech    B   Jan    41     44
16       Tech    B   Feb    35     51
17       Tech    B   Mar    70      8
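
Note that Month here holds abbreviations (Jan, Feb, Mar), whereas Table 2 in the question uses month numbers. One possible way to bridge that gap is an explicit lookup, sketched below on a few rows shaped like out. The dictionary month_numbers is a hypothetical helper, not part of the original answer, and it only covers the three months that appear in the data.

```python
import pandas as pd

# A few rows shaped like the answer's `out` (values copied from the output above).
out = pd.DataFrame({
    "Dept": ["Sales"] * 3,
    "Name": ["A"] * 3,
    "Month": ["Jan", "Feb", "Mar"],
    "Base": [8, 46, 28],
    "Count": [5, 70, 62],
})

# Hypothetical lookup table; extend it with the remaining months as needed.
month_numbers = {"Jan": 1, "Feb": 2, "Mar": 3}

# Replace each abbreviation with its month number.
out["Month"] = out["Month"].map(month_numbers)

print(out)
```

An explicit dictionary avoids depending on locale-specific month names, at the cost of having to list every month yourself.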

>>> mi
MultiIndex([('Jan',  'Base'),
            ('Feb',  'Base'),
            ('Mar',  'Base'),
            ('Jan', 'Count'),
            ('Feb', 'Count'),
            ('Mar', 'Count')],
           names=['Month', None])
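
To see how the regular expression produces this MultiIndex, the splitting step can be run on its own. The sketch below assumes only the six column names from Table 1; the variable names cols and parts are illustrative and do not appear in the answer.

```python
import pandas as pd

# The six wide column names from Table 1.
cols = pd.Index(["Jan_Base", "Feb_Base", "Mar_Base",
                 "Jan_Count", "Feb_Count", "Mar_Count"])

# Group 1 captures everything before the first underscore (the month);
# group 2 captures whatever follows it (the measure).
parts = cols.str.extract(r'([^_]+)_?(.*)')
print(parts)

# Each row of `parts` becomes one (Month, measure) tuple of the MultiIndex.
mi = pd.MultiIndex.from_frame(parts, names=['Month', None])
print(mi)
```

Because the month level is named, stack('Month') can later refer to it by name, and the unnamed second level (Base, Count) stays behind as the column labels.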