Combine two dataframes into a single representation of the values contained in each


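I have two large dataframes, cl and cb, which describe a trading limit order book over time. cl contains the levels (think prices) and cb contains the sizes (think orders).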

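I would like to combine the two so that the result is a single dataframe in which every price (value) that appears in cl becomes a column, and each row holds, for a given time of day, the corresponding sizes from cb under the matching price columns.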
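To make the intended mapping concrete, here is a minimal sketch on made-up data (three levels per timestamp instead of twenty; the numbers are hypothetical and not taken from the order book below). It builds the target row by row, which is only one possible reading of the request and is not meant as the definitive approach.

```python
import pandas as pd

# Hypothetical miniature inputs: 3 levels per timestamp instead of 20.
idx = pd.to_datetime(["2023-08-14 06:30:01", "2023-08-14 06:30:02"])
cl = pd.DataFrame([[10.0, 10.25, 10.5], [10.25, 10.5, 10.75]], index=idx)  # prices
cb = pd.DataFrame([[5.0, 3.0, -2.0], [4.0, 1.0, -6.0]], index=idx)         # sizes

# For each timestamp, label that row's sizes with that row's prices.
rows = {
    t: pd.Series(cb.loc[t].to_numpy(), index=cl.loc[t].to_numpy())
    for t in cl.index
}

# Aligning the per-timestamp Series produces one column per distinct price;
# prices that are absent at a timestamp come out as NaN and are set to 0.
df = pd.DataFrame(rows).T.sort_index(axis=1).fillna(0)
print(df)
```

Each row of `df` is one timestamp, each column one distinct price, and a 0 marks a price that was not in the book at that time.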

cl

2023-08-14 06:30:01 4470.75 4471.0  4471.25 4471.5  4471.75 4472.0  4472.25 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25 4474.5  4474.75 4475.0  4475.25 4475.5
2023-08-14 06:30:02 4472.0  4472.25 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25 4474.5  4474.75 4475.0  4475.25 4475.5  4475.75 4476.0  4476.25 4476.5  4476.75
2023-08-14 06:30:03 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25 4474.5  4474.75 4475.0  4475.25 4475.5  4475.75 4476.0  4476.25 4476.5  4476.75 4477.0  4477.25
2023-08-14 06:30:04 4471.25 4471.5  4471.75 4472.0  4472.25 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25 4474.5  4474.75 4475.0  4475.25 4475.5  4475.75 4476.0
2023-08-14 06:30:05 4471.5  4471.75 4472.0  4472.25 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25 4474.5  4474.75 4475.0  4475.25 4475.5  4475.75 4476.0  4476.25
2023-08-14 06:30:06 4471.0  4471.25 4471.5  4471.75 4472.0  4472.25 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25 4474.5  4474.75 4475.0  4475.25 4475.5  4475.75
2023-08-14 06:30:07 4471.0  4471.25 4471.5  4471.75 4472.0  4472.25 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25 4474.5  4474.75 4475.0  4475.25 4475.5  4475.75
2023-08-14 06:30:08 4470.0  4470.25 4470.5  4470.75 4471.0  4471.25 4471.5  4471.75 4472.0  4472.25 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25 4474.5  4474.75
2023-08-14 06:30:09 4469.5  4469.75 4470.0  4470.25 4470.5  4470.75 4471.0  4471.25 4471.5  4471.75 4472.0  4472.25 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25
2023-08-14 06:30:10 4470.0  4470.25 4470.5  4470.75 4471.0  4471.25 4471.5  4471.75 4472.0  4472.25 4472.5  4472.75 4473.0  4473.25 4473.5  4473.75 4474.0  4474.25 4474.5  4474.75

cb

2023-08-14 06:30:01 38.0    45.0    105.0   53.0    49.0    42.0    68.0    49.0    32.0    26.0    -21.0   -33.0   -33.0   -60.0   -49.0   -47.0   -48.0   -72.0   -76.0   -70.0
2023-08-14 06:30:02 69.0    64.0    55.0    59.0    53.0    59.0    41.0    46.0    51.0    26.0    -41.0   -48.0   -66.0   -61.0   -67.0   -44.0   -78.0   -72.0   -54.0   -61.0
2023-08-14 06:30:03 54.0    54.0    54.0    56.0    54.0    50.0    43.0    41.0    52.0    40.0    -1.0    -41.0   -56.0   -41.0   -73.0   -44.0   -47.0   -47.0   -58.0   -76.0
2023-08-14 06:30:04 100.0   43.0    53.0    67.0    59.0    41.0    41.0    40.0    42.0    23.0    -25.0   -34.0   -54.0   -57.0   -61.0   -67.0   -49.0   -55.0   -40.0   -93.0
2023-08-14 06:30:05 43.0    53.0    69.0    50.0    42.0    43.0    43.0    41.0    31.0    6.0    -36.0    -45.0   -58.0   -62.0   -60.0   -48.0   -56.0   -41.0   -94.0   -45.0
2023-08-14 06:30:06 70.0    101.0   44.0    53.0    72.0    51.0    43.0    43.0    41.0    42.0    -13.0   -41.0   -41.0   -56.0   -59.0   -61.0   -59.0   -45.0   -56.0   -41.0
2023-08-14 06:30:07 66.0    101.0   42.0    54.0    48.0    51.0    45.0    42.0    30.0    30.0    -15.0   -39.0   -53.0   -61.0   -57.0   -60.0   -57.0   -41.0   -53.0   -42.0
2023-08-14 06:30:08 67.0    46.0    48.0    36.0    67.0    99.0    39.0    50.0    36.0    46.0    -8.0    -39.0   -43.0   -50.0   -47.0   -51.0   -49.0   -58.0   -53.0   -79.0
2023-08-14 06:30:09 94.0    54.0    59.0    45.0    46.0    30.0    45.0    95.0    27.0    30.0    -26.0   -44.0   -42.0   -53.0   -56.0   -50.0   -44.0   -47.0   -46.0   -55.0

I would like something like this (not the actual expected output for my example, just an illustration of the desired output):

df

4469.5  4469.75 4470    4470.25 4470.5  4470.75 4471    4471.25 4471.5  4471.75 4472    4472.25 4472.5  4472.75 4473    4473.25 4473.5  4473.75 4474    4474.25 4474.5  4474.75 4475    4475.25 4475.5  4475.75 4476    4476.25 4476.5  4476.75 4477    4477.25
0   0   0   0   0   0   0   38  45  105 53  49  42  68  49  32  26  -21 -33 -33 -60 -49 -47 -48 -72 -76 -70 0   0   0   0   0
0   0   0   0   0   0   69  64  55  59  53  59  41  46  51  26  -41 -48 -66 -61 -67 -44 -78 -72 -54 -61 0   0   0   0   0   0
0   0   0   0   54  54  54  56  54  50  43  41  52  40  -1  -41 -56 -41 -73 -44 -47 -47 -58 -76 0   0   0   0   0   0   0   0
0   0   0   0   0   0   100 43  53  67  59  41  41  40  42  23  -25 -34 -54 -57 -61 -67 -49 -55 -40 -93 0   0   0   0   0   0
0   0   0   0   0   0   0   0   43  53  69  50  42  43  43  41  31  6   -36 -45 -58 -62 -60 -48 -56 -41 -94 -45 0   0   0   0
0   0   0   0   0   0   0   0   0   0   70  101 44  53  72  51  43  43  41  42  -13 -41 -41 -56 -59 -61 -59 -45 -56 -41 0   0
0   0   0   0   0   0   0   0   0   0   0   0   66  101 42  54  48  51  45  42  30  30  -15 -39 -53 -61 -57 -60 -57 -41 -53 -42
0   0   0   0   0   0   0   0   0   0   67  46  48  36  67  99  39  50  36  46  -8  -39 -43 -50 -47 -51 -49 -58 -53 -79 0   0
0   0   0   0   0   0   0   0   94  54  59  45  46  30  45  95  27  30  -26 -44 -42 -53 -56 -50 -44 -47 -46 -55 0   0   0   0
0   0   0   0   0   0   0   61  46  50  36  51  95  35  42  31  26  -17 -37 -56 -46 -46 -44 -45 -52 -56 -60 0   0   0   0   0
0   0   0   0   0   0   30  45  99  43  48  39  48  30  25  35  -30 -50 -47 -47 -54 -54 -60 -61 -41 -60 0   0   0   0   0   0
0   0   0   0   0   43  42  29  48  99  32  39  39  44  19  -10 -36 -44 -56 -48 -49 -56 -55 -60 -62 0   0   0   0   0   0   0
0   0   0   0   61  46  50  32  37  90  33  43  42  1   -35 -54 -54 -61 -57 -49 -51 -56 -57 -68 0   0   0   0   0   0   0   0
0   0   0   47  35  41  110 43  45  49  33  31  29  -19 -57 -59 -52 -51 -58 -57 -62 -76 -54 0   0   0   0   0   0   0   0   0
0   0   48  35  41  110 34  44  49  32  32  12  -9  -44 -60 -51 -52 -58 -57 -62 -75 -54 0   0   0   0   0   0   0   0   0   0
0   49  34  40  109 34  47  49  34  39  23  -12 -42 -56 -52 -52 -60 -58 -63 -74 -54 0   0   0   0   0   0   0   0   0   0   0
46  33  37  111 32  42  50  34  46  28  -15 -24 -54 -50 -49 -58 -58 -62 -75 -54 0   0   0   0   0   0   0   0   0   0   0   0
0   48  35  40  111 39  49  56  41  55  28  -21 -39 -59 -51 -54 -61 -58 -63 -76 -54 0   0   0   0   0   0   0   0   0   0   0
48  46  42  116 37  46  53  38  59  31  -20 -44 -61 -61 -54 -61 -61 -63 -76 -55 0   0   0   0   0   0   0   0   0   0   0   0
0   0   46  51  116 35  47  53  42  65  38  30  -28 -59 -63 -56 -62 -58 -63 -77 -56 -68 0   0   0   0   0   0   0   0   0   0
python pandas dataframe data-structures
1 Answer

Solution: unpivot + merge + pivot

Code

import pandas as pd

def read_book(path):
    # Each line is "date time v1 ... v20"; str.split() with no argument
    # splits on any run of whitespace, and blank lines are skipped.
    with open(path, 'r', encoding="utf-8") as file:
        return [line.split() for line in file if line.strip()]

levels = [str(i) for i in range(1, 21)]
columns = ['date', 'time'] + levels

df_cl = pd.DataFrame(read_book('cl.txt'), columns=columns)
df_cb = pd.DataFrame(read_book('cb.txt'), columns=columns)

# Unpivot: one row per (date, time, level)
df_cb_unpivot = pd.melt(df_cb, id_vars=['date', 'time'], value_vars=levels)
df_cl_unpivot = pd.melt(df_cl, id_vars=['date', 'time'], value_vars=levels)

# Merge: pair each price with the size at the same timestamp and level
df_cl_cb_join = pd.merge(df_cl_unpivot, df_cb_unpivot, on=['date', 'time', 'variable'])

df_final = df_cl_cb_join.drop('variable', axis=1)
df_final = df_final.rename(columns={'value_x': 'column', 'value_y': 'value'})

# The values were read as strings; convert them so price columns sort numerically
df_final[['column', 'value']] = df_final[['column', 'value']].astype(float)

# Pivot: prices become columns; prices absent at a timestamp are filled with 0,
# as in the desired output shown in the question
df_final = df_final.pivot(index=['date', 'time'], columns='column', values='value').fillna(0)

Output: [image: unpivot pivot output]
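Because the result depends on the contents of the input files, here is a small self-contained run of the same unpivot, merge, pivot idea on made-up inline data. The column names `ts`, `level`, `price` and `size` and all the numbers are assumptions chosen for illustration; treat it as a sketch rather than a drop-in replacement.

```python
import pandas as pd

# Hypothetical miniature inputs: a timestamp column plus one column
# per book level ('1', '2', '3').
cl = pd.DataFrame({"ts": ["06:30:01", "06:30:02"],
                   "1": [10.0, 10.25], "2": [10.25, 10.5], "3": [10.5, 10.75]})
cb = pd.DataFrame({"ts": ["06:30:01", "06:30:02"],
                   "1": [5.0, 4.0], "2": [3.0, 1.0], "3": [-2.0, -6.0]})

# Unpivot: one row per (timestamp, level).
cl_long = cl.melt(id_vars="ts", var_name="level", value_name="price")
cb_long = cb.melt(id_vars="ts", var_name="level", value_name="size")

# Merge on timestamp and level so each price meets its size.
merged = cl_long.merge(cb_long, on=["ts", "level"])

# Pivot: distinct prices become columns; absent prices are filled with 0.
wide = merged.pivot(index="ts", columns="price", values="size").fillna(0)
print(wide)
```

Naming the melted columns with `var_name` and `value_name` avoids the `value_x`/`value_y` suffixes that the merge would otherwise generate, so no rename step is needed.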
