这是一个数据帧:
id value next_value step
0 10764 612486.33921 800000.0 0.0
1 10764 612486.33921 1000000.0 200000.0
2 10822 97604.30520 100000.0 0.0
3 10823 97805.45851 100000.0 0.0
4 10823 97805.45851 165000.0 65000.0
5 10823 97805.45851 250000.0 85000.0
6 10823 97805.45851 400000.0 150000.0
7 10823 97805.45851 600000.0 200000.0
8 10823 97805.45851 800000.0 200000.0
9 10823 97805.45851 1000000.0 200000.0
10 10823 97805.45851 1200000.0 200000.0
11 10882 33843.33701 40000.0 0.0
12 10882 33843.33701 60000.0 20000.0
13 10882 33843.33701 90000.0 30000.0
14 10882 33843.33701 125000.0 35000.0
15 10882 33843.33701 150000.0 25000.0
16 10882 33843.33701 175000.0 25000.0
17 10882 33843.33701 205000.0 30000.0
18 10882 33843.33701 230000.0 25000.0
19 10882 33843.33701 300000.0 70000.0
20 10882 33843.33701 440000.0 140000.0
21 10758 241890.94525 385000.0 0.0
22 10817 138519.73560 150000.0 0.0
23 10817 138519.73560 200000.0 50000.0
24 10817 138519.73560 250000.0 50000.0
25 10817 138519.73560 300000.0 50000.0
26 10817 138519.73560 350000.0 50000.0
27 10817 138519.73560 400000.0 50000.0
28 10859 73140.30048 80000.0 0.0
29 10859 73140.30048 100000.0 20000.0
我需要在step
的一组中加入next_value
value
和id
之间的第一个区别。
即对于id=10764
step = step + 800000.0 - 612486.33921
,
对于id=10822
step = step + 100000.0 - 97604.30520
,
等等
我试过这个:
df['step'] += df.groupby('id')[['next_value', 'value']].apply
(lambda x: x.iloc[0, 0] - x.iloc[0, 1])
但它给了我所有step
值等于NaN
。
我能在这做什么?
使用groupby
id
并获得first
值。然后,使用s
为df.step
In [717]: s = df.groupby('id').transform('first')
In [718]: df.step + s.next_value - s.value
Out[718]:
0 187513.66079
1 387513.66079
2 2395.69480
3 2194.54149
4 67194.54149
5 87194.54149
6 152194.54149
7 202194.54149
8 202194.54149
9 202194.54149
10 202194.54149
11 6156.66299
12 26156.66299
13 36156.66299
14 41156.66299
15 31156.66299
16 31156.66299
17 36156.66299
18 31156.66299
19 76156.66299
20 146156.66299
21 143109.05475
22 11480.26440
23 61480.26440
24 61480.26440
25 61480.26440
26 61480.26440
27 61480.26440
28 6859.69952
29 26859.69952
dtype: float64
细节
In [719]: s
Out[719]:
value next_value step
0 612486.33921 800000.0 0.0
1 612486.33921 800000.0 0.0
2 97604.30520 100000.0 0.0
3 97805.45851 100000.0 0.0
4 97805.45851 100000.0 0.0
5 97805.45851 100000.0 0.0
6 97805.45851 100000.0 0.0
7 97805.45851 100000.0 0.0
8 97805.45851 100000.0 0.0
9 97805.45851 100000.0 0.0
10 97805.45851 100000.0 0.0
11 33843.33701 40000.0 0.0
12 33843.33701 40000.0 0.0
13 33843.33701 40000.0 0.0
14 33843.33701 40000.0 0.0
15 33843.33701 40000.0 0.0
16 33843.33701 40000.0 0.0
17 33843.33701 40000.0 0.0
18 33843.33701 40000.0 0.0
19 33843.33701 40000.0 0.0
20 33843.33701 40000.0 0.0
21 241890.94525 385000.0 0.0
22 138519.73560 150000.0 0.0
23 138519.73560 150000.0 0.0
24 138519.73560 150000.0 0.0
25 138519.73560 150000.0 0.0
26 138519.73560 150000.0 0.0
27 138519.73560 150000.0 0.0
28 73140.30048 80000.0 0.0
29 73140.30048 80000.0 0.0