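I want to create a new column whose values depend on another column that contains datetime or time data. So when the time is between [x] and [y], the value of the new column is Z, where Z is an integer.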
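I know that df['newColumn'] = np.where(df['price']>=100, 'yes', 'no') creates a new column of 'yes' and 'no' values depending on whether the price is at least 100. I would like to do something similar with my pandas DataFrame: if the time falls in the range between X and Y, put "1" in the new column; elif it falls in the range between X2 and Y2, put "2"; and so on.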
I did come across df.loc['2002-1-1 01:00:00':'2002-1-1 04:00:00'] as a way to select a time range, but I can't work out how to put the two together. Any ideas?
I believe you need between together with np.where, or cast the boolean mask to ints and then to strings:
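import numpy as np
import pandas as pd

# '20h' instead of '20H': the uppercase alias is deprecated in recent pandas
rng = pd.date_range('2010-03-12 15:00:00', periods=10, freq='20h')
X = pd.DataFrame({'Datetime': rng})
# True where Datetime lies inside the range (both endpoints included)
mask = X['Datetime'].between('2010-03-12 16:49:00', '2010-03-14 16:49:00')
# solution 1: map True/False to the strings '1'/'0'
X['CLASS'] = np.where(mask, '1', '0')
# solution 2: cast bool -> int -> str (same result)
# X['CLASS'] = mask.astype(int).astype(str)
print(X)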
Datetime CLASS
0 2010-03-12 15:00:00 0
1 2010-03-13 11:00:00 1
2 2010-03-14 07:00:00 1
3 2010-03-15 03:00:00 0
4 2010-03-15 23:00:00 0
5 2010-03-16 19:00:00 0
6 2010-03-17 15:00:00 0
7 2010-03-18 11:00:00 0
8 2010-03-19 07:00:00 0
9 2010-03-20 03:00:00 0
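One detail worth checking is how the endpoints are treated: by default between includes both of them. If a half-open range suits your data better, plain comparison operators can be combined instead. The following is only a small sketch; the column names closed and half_open and the chosen timestamps are made up for illustration, and the result is kept as int since the question asks for an integer Z.

```python
import pandas as pd

# Four timestamps spaced 20 hours apart (first rows of the sample above)
rng = pd.date_range('2010-03-12 15:00:00', periods=4, freq=pd.Timedelta(hours=20))
df = pd.DataFrame({'Datetime': rng})

# Illustrative bounds that coincide exactly with two of the rows
start = pd.Timestamp('2010-03-13 11:00:00')
end = pd.Timestamp('2010-03-14 07:00:00')

# between() includes both endpoints by default
closed = df['Datetime'].between(start, end)
# Half-open interval [start, end): the row equal to `end` is excluded
half_open = (df['Datetime'] >= start) & (df['Datetime'] < end)

# Cast the boolean masks to integers
df['closed'] = closed.astype(int)
df['half_open'] = half_open.astype(int)
print(df)
```

The row that falls exactly on the upper bound is flagged in closed but not in half_open, which matters when consecutive ranges share a boundary.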
Edit:
# 'min' instead of 'T': the 'T' alias is deprecated in recent pandas
rng = pd.date_range('2010-03-12 16:35:00', periods=30, freq='min')
X = pd.DataFrame({'Datetime': rng})
# same output as the conditions below
# conditions = [X['Datetime'] < '2010-03-12 16:39:00',
#               (X['Datetime'] < '2010-03-12 16:49:00') & (X['Datetime'] >= '2010-03-12 16:39:00'),
#               X['Datetime'] >= '2010-03-12 17:01:00']
conditions = [X['Datetime'] < '2010-03-12 16:39:00',
              X['Datetime'].between('2010-03-12 16:39:00', '2010-03-12 16:48:00'),
              X['Datetime'] >= '2010-03-12 17:01:00']
classes = ['0', '1', '2']
# rows matching no condition (the gaps) get the string 'nan';
# newer NumPy raises a TypeError when string choices are mixed with a float np.nan default
X['CLASS'] = np.select(conditions, classes, default='nan')
# if you want to bin all rows without gaps, use pd.cut
bins = pd.DatetimeIndex(['1900-01-01', '2010-03-12 16:38:00', '2010-03-12 16:48:00', '2262-04-11'])
labels = ['0', '1', '2']
X['label'] = pd.cut(X['Datetime'], bins=bins, labels=labels, right=True)
print(X)
Datetime CLASS label
0 2010-03-12 16:35:00 0 0
1 2010-03-12 16:36:00 0 0
2 2010-03-12 16:37:00 0 0
3 2010-03-12 16:38:00 0 0
4 2010-03-12 16:39:00 1 1
5 2010-03-12 16:40:00 1 1
6 2010-03-12 16:41:00 1 1
7 2010-03-12 16:42:00 1 1
8 2010-03-12 16:43:00 1 1
9 2010-03-12 16:44:00 1 1
10 2010-03-12 16:45:00 1 1
11 2010-03-12 16:46:00 1 1
12 2010-03-12 16:47:00 1 1
13 2010-03-12 16:48:00 1 1
14 2010-03-12 16:49:00 nan 2
15 2010-03-12 16:50:00 nan 2
16 2010-03-12 16:51:00 nan 2
17 2010-03-12 16:52:00 nan 2
18 2010-03-12 16:53:00 nan 2
19 2010-03-12 16:54:00 nan 2
20 2010-03-12 16:55:00 nan 2
21 2010-03-12 16:56:00 nan 2
22 2010-03-12 16:57:00 nan 2
23 2010-03-12 16:58:00 nan 2
24 2010-03-12 16:59:00 nan 2
25 2010-03-12 17:00:00 nan 2
26 2010-03-12 17:01:00 2 2
27 2010-03-12 17:02:00 2 2
28 2010-03-12 17:03:00 2 2
29 2010-03-12 17:04:00 2 2
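Since the question asks for integer values, the same np.select idea can also be driven from a list of (start, end, value) tuples so that adding another range is a one-line change. This is only a sketch under my own assumptions: the ranges list, its bounds, and the choice of 0 as the value for rows outside every range are illustrative, not something from the question.

```python
import numpy as np
import pandas as pd

# One timestamp per minute, as in the example above
rng = pd.date_range('2010-03-12 16:35:00', periods=30, freq=pd.Timedelta(minutes=1))
X = pd.DataFrame({'Datetime': rng})

# Hypothetical (start, end, value) ranges; both endpoints are inclusive
ranges = [
    ('2010-03-12 16:39:00', '2010-03-12 16:48:00', 1),
    ('2010-03-12 17:01:00', '2010-03-12 17:04:00', 2),
]

# One boolean mask per range, plus the integer to assign where it matches
conditions = [X['Datetime'].between(start, end) for start, end, _ in ranges]
values = [value for _, _, value in ranges]

# Rows outside every range get 0 (an assumed sentinel); if ranges overlap,
# the first matching condition wins
X['CLASS'] = np.select(conditions, values, default=0)
print(X['CLASS'].value_counts().sort_index())
```

Because the choices and the default are all integers, the resulting column keeps an integer dtype, so there is no need to convert from strings afterwards.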