I want to count the number of blocks of 1s in a data frame, based on the LabelId column. For example, given the following data frame:
Input DF:
eventTime velocity LabelId
1 2017-08-19 12:53:55.050 3 0
2 2017-08-19 12:53:55.100 4 1
3 2017-08-19 12:53:55.150 180 1
4 2017-08-19 12:53:55.200 2 1
5 2017-08-19 12:53:55.250 5 0
6 2017-08-19 12:53:55.050 3 0
7 2017-08-19 12:53:55.100 4 1
8 2017-08-19 12:53:55.150 70 1
9 2017-08-19 12:53:55.200 2 1
10 2017-08-19 12:53:55.250 5 0
Output=2
because there are two blocks of 1s: Block_1 = rows 2-4 and Block_2 = rows 7-9.
Any help is greatly appreciated.
Best regards, Carlo
We can use diff(). Something like this:
d = df.LabelId.diff()
d.iloc[0] = df.LabelId.iloc[0]
This gives you:
[0, 1, 0, 0, -1, 0, 1, 0, 0, -1]
The number of blocks is the number of times the diff equals 1, since each block starts with a 0-to-1 transition. So:
(d == 1).sum()
gives you the answer.
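To check this end to end, the question's sample data can be rebuilt and run through the diff() approach. This is only a minimal sketch: it assumes the labels are 0/1 and keeps just the LabelId column, leaving out eventTime and velocity since they play no part in the count.

```python
import pandas as pd

# LabelId values copied from the question's sample data (rows 1-10);
# the other columns are omitted because they do not affect the count
df = pd.DataFrame({"LabelId": [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]})

# Difference between each row and the previous one; the first row is NaN
d = df.LabelId.diff()
# Use the first label itself, so a block starting on the first row is counted
d.iloc[0] = df.LabelId.iloc[0]

# A block starts wherever the label goes from 0 to 1
num_blocks = int((d == 1).sum())
print(num_blocks)
```

Setting the first element by hand matters only when the data starts with a 1; without it that leading block would be missed, because diff() has nothing to compare the first row against.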
Here is another simple way:
INTERESTING_LABEL = 1
df = ... # Make data frame
# Find positions where the label is not present
s = (df.LabelId != INTERESTING_LABEL)
# Counter that increases where the label is not present
# Then select where the label is present and count unique values
num_blocks = s.cumsum()[~s].nunique()
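To see why the cumsum() trick works, here is a sketch that wraps it in a function and runs it on the question's labels. The name count_blocks is a hypothetical helper made up for illustration, not part of pandas.

```python
import pandas as pd

def count_blocks(labels: pd.Series, label=1) -> int:
    """Count consecutive runs of `label` (hypothetical helper, not a pandas API)."""
    # True where the label is absent
    s = labels != label
    # The running count of absences stays constant inside a run of `label`
    # and changes between runs, so each run maps to one distinct counter value
    return int(s.cumsum()[~s].nunique())

# LabelId values copied from the question's sample data
labels = pd.Series([0, 1, 1, 1, 0, 0, 1, 1, 1, 0])
print(count_blocks(labels))
```

For the sample data the counter is 1 throughout rows 2-4 and 3 throughout rows 7-9, so there are two distinct values. Unlike the diff() version, this needs no special handling for a block that starts on the first row, and it works for labels other than 0/1.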