Matching positive and negative values based on datetime distance

Question (votes: 0, answers: 1)

See the figure (image not reproduced here):

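I want to match every positive stem above the upper red dashed line with the nearest negative stem below the lower red line. The match is based on how far apart in time the stems are, so stem A would be matched with stem B. A negative stem can only be matched once, so B cannot also be matched with C. Because stem D is too far away from A and C (say, a time delta >= X), it is not considered.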

            BOX_187_084_11
2005-12-01     -190.379230  D
2008-03-01     -261.853410  B
2008-09-01      268.353538  A
2011-09-01      258.084186  C

This is the corresponding DataFrame. How can I solve this easily in an idiomatic pandas way?

python pandas datetime group
1 Answer
0 votes

This problem is not a good fit for vectorization, which is what pandas does best. You can still solve it with a for loop.

I assume the dates are your index and have a datetime type. If not, convert them with df.index = pd.to_datetime(df.index) before running this snippet:

import pandas as pd

# The two red lines
lowerbound, upperbound = -125, 125
# The time limit for a match
time_limit = pd.Timedelta(days=365)

# The signal is considered positive or negative only if it exceeds the red lines
s = df["BOX_187_084_11"]
is_positive = s > upperbound
is_negative = s < lowerbound

# What a positive signal is matched to
df["MatchedTo"] = None
# Whether the negative signal has been matched
df["Matched"] = False

# Loop through each positive signal and find a match
for index, value in s[is_positive].items():
    # Absolute time distance from this positive signal to every row
    distance = pd.Series(abs(df.index - index), index=df.index)
    # A matching signal must be negative, never matched before, and within the
    # time limit
    cond = is_negative & ~df["Matched"] & (distance < time_limit)
    if not cond.any():
        continue

    # Store the matched data: the nearest negative signal that qualifies
    matched_index = distance[cond].idxmin()
    df.loc[index, "MatchedTo"] = matched_index
    df.loc[matched_index, "Matched"] = True
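
For a quick check against the sample data from the question, the same greedy idea can be sketched as a standalone function. The name match_stems and its parameters are hypothetical, not part of pandas, and the 365-day limit and the -125/125 thresholds are the same assumptions used above. It treats "nearest" as the smallest absolute time difference and excludes any negative stem whose distance is at or above the limit.

```python
import pandas as pd


def match_stems(df, column, lower, upper, time_limit):
    """Match each positive stem to the nearest unmatched negative stem.

    Hypothetical helper: returns {positive_timestamp: negative_timestamp}.
    """
    s = df[column]
    negatives = s[s < lower]
    used = set()  # negative stems that are already taken
    matches = {}
    for pos_time in s[s > upper].index:
        # Candidates: negative, not used yet, strictly within the time limit
        candidates = [
            t for t in negatives.index
            if t not in used and abs(t - pos_time) < time_limit
        ]
        if not candidates:
            continue
        # Pick the candidate with the smallest absolute time distance
        nearest = min(candidates, key=lambda t: abs(t - pos_time))
        matches[pos_time] = nearest
        used.add(nearest)
    return matches


# Sample data copied from the question
df = pd.DataFrame(
    {"BOX_187_084_11": [-190.379230, -261.853410, 268.353538, 258.084186]},
    index=pd.to_datetime(["2005-12-01", "2008-03-01", "2008-09-01", "2011-09-01"]),
)

result = match_stems(df, "BOX_187_084_11", -125, 125, pd.Timedelta(days=365))
# A (2008-09-01) is matched to B (2008-03-01); C and D stay unmatched
print(result)
```

Note that this is a greedy pass in index order: an earlier positive stem can claim a negative stem that a later positive stem would have been closer to. If that matters for your data, a globally optimal assignment would need a different approach.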