(miles 0 to 2, inspected 1/22/2025)
The table/DataFrame "all_df" representing these inspections looks like:
import pandas as pd

all_df = pd.DataFrame(
    data=(
        (0, 5, pd.Timestamp("2025-1-1")),
        (2, 5, pd.Timestamp("2025-1-8")),
        (0, 3, pd.Timestamp("2025-1-15")),
        (0, 2, pd.Timestamp("2025-1-22")),
    ),
    columns=["From_Mi", "To_Mi", "Date"],
)
The desired DataFrame "recent_df" shows only the most recent inspection for each stretch and looks like:
| From_Mi | To_Mi | Last Date |
| ------- | ----- | --------- |
| 0 | 2 | 1/22/2025 |
| 2 | 3 | 1/15/2025 |
| 3 | 5 | 1/8/2025 |
Could this be an operation involving .split() and .intersection()?
Any help is much appreciated!
Maybe you could convert From_Mi/To_Mi to a range(), explode it, and use .groupby to get the last date per mile:
# sort if necessary:
# all_df = all_df.sort_values(by='Date')
all_df['mile'] = all_df.apply(lambda row: range(row['From_Mi'], row['To_Mi'] + 1), axis=1)
all_df = all_df.explode('mile')
out = (
    all_df.groupby('mile')['Date']
    .last()
    .reset_index()
    .groupby('Date', sort=False)['mile']
    .last()
    .reset_index(name='To_Mi')
)
out['From_Mi'] = out['To_Mi'].shift().fillna(all_df['From_Mi'].iat[0]).astype(int)
print(out[['Date', 'From_Mi', 'To_Mi']])
        Date  From_Mi  To_Mi
0 2025-01-22        0      2
1 2025-01-15        2      3
2 2025-01-08        3      5
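One caveat with the range/explode idea: it treats mile markers as points and uses each date's last mile as the end of its stretch, so it appears to rely on gap-free coverage and on every date ending up as a single contiguous stretch. Below is a minimal sketch of a variant that explodes unit segments instead and groups consecutive segments that share a date. It assumes integer mile markers with From_Mi < To_Mi in every row; the names `start`, `latest`, `new_run` and `recent_df` are only illustrative.

```python
import pandas as pd

all_df = pd.DataFrame(
    data=(
        (0, 5, pd.Timestamp("2025-1-1")),
        (2, 5, pd.Timestamp("2025-1-8")),
        (0, 3, pd.Timestamp("2025-1-15")),
        (0, 2, pd.Timestamp("2025-1-22")),
    ),
    columns=["From_Mi", "To_Mi", "Date"],
)

# One row per unit segment [start, start + 1) covered by an inspection
# (assumption: integer mile markers and From_Mi < To_Mi).
segments = (
    all_df.assign(
        start=[list(range(f, t)) for f, t in zip(all_df["From_Mi"], all_df["To_Mi"])]
    )
    .explode("start")
    .astype({"start": int})
)

# Most recent inspection date of each unit segment, ordered by position.
latest = segments.groupby("start", as_index=False)["Date"].max()

# A new run begins when the date changes or the previous segment is not adjacent.
new_run = (latest["Date"] != latest["Date"].shift()) | (
    latest["start"] != latest["start"].shift() + 1
)

recent_df = (
    latest.groupby(new_run.cumsum())
    .agg(From_Mi=("start", "first"), To_Mi=("start", "last"), Date=("Date", "first"))
    .reset_index(drop=True)
)
recent_df["To_Mi"] += 1  # a unit segment ends one mile after its start
print(recent_df)
```

Taking the maximum date per segment means the result does not depend on the row order of all_df, at the cost of one exploded row per inspected mile.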
I couldn't figure out a more vectorized approach, but you could use the following (it relies on the piso package):
import pandas as pd
import piso  # third-party package, installed separately from pandas

# ensure sorted, so the last overlapping row is the most recent one
all_df = all_df.sort_values("Date", ascending=True)
# interval array
arr = pd.arrays.IntervalArray.from_arrays(all_df.From_Mi, all_df.To_Mi)
# unique intervals, split at every endpoint
unique_intervals = piso.split(arr, set(arr.left).union(arr.right)).unique()
# dates with interval index
dates = all_df.set_index(pd.IntervalIndex(arr)).Date
# create dataframe with left and right intervals and last date where interval overlaps
output = pd.DataFrame(
    [
        {
            "From_Mi": ui.left,
            "To_Mi": ui.right,
            "Date": dates[dates.index.overlaps(ui)].iloc[-1],
        }
        for ui in unique_intervals
    ]
)
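If adding piso as a dependency is not an option, the same breakpoint idea can be sketched with plain pandas and NumPy: split the line at every distinct endpoint, cross-join the resulting elementary segments with the inspections, keep the pairs where the inspection covers the segment, and take the latest date. This is only a sketch, assuming From_Mi < To_Mi in every row. The names `points`, `segments`, `pairs` and the `_insp` suffix are made up for illustration, and `how="cross"` needs pandas 1.2 or later.

```python
import numpy as np
import pandas as pd

all_df = pd.DataFrame(
    data=(
        (0, 5, pd.Timestamp("2025-1-1")),
        (2, 5, pd.Timestamp("2025-1-8")),
        (0, 3, pd.Timestamp("2025-1-15")),
        (0, 2, pd.Timestamp("2025-1-22")),
    ),
    columns=["From_Mi", "To_Mi", "Date"],
)

# Sorted distinct endpoints; consecutive pairs form the elementary segments.
points = np.unique(all_df[["From_Mi", "To_Mi"]].to_numpy())
segments = pd.DataFrame({"From_Mi": points[:-1], "To_Mi": points[1:]})

# Pair every segment with every inspection; inspection columns get "_insp".
pairs = segments.merge(all_df, how="cross", suffixes=("", "_insp"))

# Keep only the pairs where the inspection fully covers the segment.
covered = (pairs["From_Mi_insp"] <= pairs["From_Mi"]) & (
    pairs["To_Mi_insp"] >= pairs["To_Mi"]
)

# Latest inspection date per segment; segments no inspection covers drop out.
recent_df = (
    pairs[covered]
    .groupby(["From_Mi", "To_Mi"], as_index=False)["Date"]
    .max()
)
print(recent_df)
```

Like the piso version, this does not merge neighbouring segments that happen to share a date, and the cross join grows with the number of segments times the number of inspections, so it is best suited to modest table sizes.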