Let's say I have a long road with mile markers, and it has been inspected several times:

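(Miles 0 to 5, inspected 1/1/2025)

(Miles 2 to 5, inspected 1/8/2025)

(Miles 0 to 3, inspected 1/15/2025)

(Miles 0 to 2, inspected 1/22/2025)

The table/DataFrame "all_df" representing these inspections looks like: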

import pandas as pd

all_df = pd.DataFrame(
    data=(
        (0, 5, pd.Timestamp("2025-1-1")),
        (2, 5, pd.Timestamp("2025-1-8")),
        (0, 3, pd.Timestamp("2025-1-15")),
        (0, 2, pd.Timestamp("2025-1-22")),
    ),
    columns=["From_Mi", "To_Mi", "Date"],
)

The desired DataFrame "recent_df", which shows only the most recent inspection of each stretch, looks like:

| From_Mi | To_Mi | Last Date |
| ------- | ----- | --------- |
| 0       | 2     | 1/22/2025 |
| 2       | 3     | 1/15/2025 |
| 3       | 5     | 1/8/2025  |

Might this be an operation involving .split() and .intersection()?

Any help is much appreciated!

Maybe you can convert From_Mi / To_Mi to a range(), explode it, and use .groupby to get last:
python pandas intervals
1 answer

# sort if necessary:
# all_df = all_df.sort_values(by='Date')

# one row per (inspection, mile marker)
all_df['mile'] = all_df.apply(lambda row: range(row['From_Mi'], row['To_Mi'] + 1), axis=1)
all_df = all_df.explode('mile')

out = (
    all_df.groupby('mile')['Date']
    .last()
    .reset_index()
    .groupby('Date', sort=False)['mile']
    .last()
    .reset_index(name='To_Mi')
)
out['From_Mi'] = out['To_Mi'].shift().fillna(all_df['From_Mi'].iat[0]).astype(int)

print(out[['Date', 'From_Mi', 'To_Mi']])

Prints:
        Date  From_Mi  To_Mi
0 2025-01-22        0      2
1 2025-01-15        2      3
2 2025-01-08        3      5
    
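I couldn't figure out a more vectorized approach, but you can use the following (it relies on the third-party piso package):
import pandas as pd
import piso

# ensure sorted
all_df = all_df.sort_values("Date", ascending=True)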
# interval array
arr = pd.arrays.IntervalArray.from_arrays(all_df.From_Mi, all_df.To_Mi)
# unique intervals
unique_intervals = piso.split(arr, set(arr.left).union(arr.right)).unique()

# dates with interval index
dates = all_df.set_index(pd.IntervalIndex(arr)).Date
# create dataframe with left and right intervals and last date where interval overlaps
output = pd.DataFrame(
    [
        {
            "From_Mi": ui.left,
            "To_Mi": ui.right,
            "Date": dates[dates.index.overlaps(ui)].iloc[-1]
        }
        for ui in unique_intervals
    ]
)
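
For comparison, here is a minimal sketch of the same idea using only NumPy broadcasting, without piso. It is not taken from either answer above: it assumes every inspection endpoint is a possible segment boundary, and the names `points`, `covers`, `segments`, and `new_group` are purely illustrative.

```python
import numpy as np
import pandas as pd

all_df = pd.DataFrame(
    data=(
        (0, 5, pd.Timestamp("2025-1-1")),
        (2, 5, pd.Timestamp("2025-1-8")),
        (0, 3, pd.Timestamp("2025-1-15")),
        (0, 2, pd.Timestamp("2025-1-22")),
    ),
    columns=["From_Mi", "To_Mi", "Date"],
)

# Oldest inspection first, so the last covering row is the most recent one.
df = all_df.sort_values("Date").reset_index(drop=True)

# Assumption: every endpoint may start or end a result segment.
points = np.unique(df[["From_Mi", "To_Mi"]].to_numpy())
seg_from, seg_to = points[:-1], points[1:]

# covers[i, j] is True when inspection i spans elementary segment j.
starts = df["From_Mi"].to_numpy()[:, None]
ends = df["To_Mi"].to_numpy()[:, None]
covers = (starts <= seg_from) & (ends >= seg_to)

# Index of the last (most recent) covering inspection for each segment.
covered = covers.any(axis=0)
last_row = len(df) - 1 - covers[::-1].argmax(axis=0)

segments = pd.DataFrame(
    {
        "From_Mi": seg_from[covered],
        "To_Mi": seg_to[covered],
        "Last Date": df["Date"].to_numpy()[last_row[covered]],
    }
)

# Merge neighbouring segments that touch and share the same date.
new_group = (segments["Last Date"] != segments["Last Date"].shift()) | (
    segments["From_Mi"] != segments["To_Mi"].shift()
)
recent_df = (
    segments.groupby(new_group.cumsum())
    .agg({"From_Mi": "first", "To_Mi": "last", "Last Date": "first"})
    .reset_index(drop=True)
)
print(recent_df)
```

On the sample data this yields the three rows from the question (0 to 2, 2 to 3, 3 to 5). The boolean matrix has one cell per inspection and elementary segment, so memory grows with the product of the two; for a very large number of inspections the interval-based version above may be the safer choice.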

	
