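I am trying to write an if statement on both the time and a value to create a new classification column.
Condition: if the time is between 05:00:00 and 19:00:00 and t_value > 0 and t_value <= 13, classify as "C"; otherwise classify as "IC".
If the time is outside that range, classify as NA.
Sample input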
t_value
2020-05-17 00:00:00 0
2020-05-17 01:00:00 0
2020-05-17 02:00:00 0
2020-05-17 03:00:00 0
2020-05-17 04:00:00 0
2020-05-17 05:00:00 0
2020-05-17 06:00:00 0
2020-05-17 07:00:00 8
2020-05-17 08:00:00 9
2020-05-17 09:00:00 10
2020-05-17 10:00:00 11
2020-05-17 11:00:00 12
I am not sure what approach to take here.
Expected output
t_value C/IC
2020-05-17 00:00:00 0 NA
2020-05-17 01:00:00 0 NA
2020-05-17 02:00:00 0 NA
2020-05-17 03:00:00 0 NA
2020-05-17 04:00:00 0 NA
2020-05-17 05:00:00 0 IC
2020-05-17 06:00:00 0 IC
2020-05-17 07:00:00 8 C
2020-05-17 08:00:00 9 C
2020-05-17 09:00:00 10 C
2020-05-17 10:00:00 11 C
2020-05-17 11:00:00 12 C
import pandas as pd
#convert to datetime index (df is the DataFrame from the question)
df.index = pd.to_datetime(df.index)
#get condition for time boundary
cond1 = df.between_time( '05:00:00', '19:00:00')
print(cond1.index)
DatetimeIndex(['2020-05-17 05:00:00', '2020-05-17 06:00:00',
'2020-05-17 07:00:00', '2020-05-17 08:00:00',
'2020-05-17 09:00:00', '2020-05-17 10:00:00',
'2020-05-17 11:00:00'],
dtype='datetime64[ns]', freq=None)
#get index to match the t_value conditions
#indices that match time boundary, but not t_value boundary
#(the ~ must negate the whole t_value condition, so values above 13 are also "IC")
ic = cond1.loc[~(cond1.t_value.gt(0) & cond1.t_value.le(13))].index
#indices that match time boundary and t_value boundary
c = cond1.loc[(cond1.t_value.gt(0)) & (cond1.t_value.le(13))].index
#assign value
df.loc[c,'C/IC'] = "C"
df.loc[ic,'C/IC'] = "IC"
print(df)
t_value C/IC
2020-05-17 00:00:00 0 NaN
2020-05-17 01:00:00 0 NaN
2020-05-17 02:00:00 0 NaN
2020-05-17 03:00:00 0 NaN
2020-05-17 04:00:00 0 NaN
2020-05-17 05:00:00 0 IC
2020-05-17 06:00:00 0 IC
2020-05-17 07:00:00 8 C
2020-05-17 08:00:00 9 C
2020-05-17 09:00:00 10 C
2020-05-17 10:00:00 11 C
2020-05-17 11:00:00 12 C
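
As an alternative, the same classification can be sketched with boolean masks instead of index lookups. This is only a minimal sketch under stated assumptions: the frame is rebuilt here from the sample input in the question, and the names in_window, in_range and labels are illustrative, not part of the original code.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (hourly rows, 00:00 to 11:00)
idx = pd.to_datetime([f"2020-05-17 {h:02d}:00:00" for h in range(12)])
df = pd.DataFrame({"t_value": [0, 0, 0, 0, 0, 0, 0, 8, 9, 10, 11, 12]}, index=idx)

# Boolean mask: True where the time of day is within 05:00:00-19:00:00 (inclusive)
in_window = np.zeros(len(df), dtype=bool)
in_window[df.index.indexer_between_time("05:00:00", "19:00:00")] = True

# Boolean mask: True where 0 < t_value <= 13
in_range = df["t_value"].gt(0) & df["t_value"].le(13)

# Label every row "C" or "IC" by value, then blank out rows outside the time window
labels = pd.Series(np.where(in_range, "C", "IC"), index=df.index)
df["C/IC"] = labels.where(in_window)

print(df)
```

Rows outside the time window end up as missing values, and any row inside the window whose t_value falls outside (0, 13] is labelled "IC".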