This article shows how to compute a cumulative count within groups in pandas that resets when a condition is met.
Problem description
I have a dataframe with two columns, ID and Activity. Activity is either 0 or 1. I want a new column containing an increasing count of the rows since the last row where Activity was 1. The count should only run within one group (ID). Whenever Activity is 1, the counting column should reset to 0 and the count starts again.
So, I have a dataframe containing the following:
What I want is this:
Can someone help me?
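To pin down the requested behaviour before vectorising it, here is a minimal plain-Python sketch of the counter. The function name count_since_last_active is hypothetical. It also assumes that rows before the first 1 in a group count up from 1, which agrees with the output shown in the answer.

```python
def count_since_last_active(ids, activity):
    """Reference loop: one counter per ID, reset to 0 whenever activity is 1."""
    counters = {}  # running count for each ID seen so far
    result = []
    for key, active in zip(ids, activity):
        if active == 1:
            counters[key] = 0  # activity seen: reset the counter
        else:
            # no activity: one more row since the last reset (or group start)
            counters[key] = counters.get(key, 0) + 1
        result.append(counters[key])
    return result


ids = list("AAABBB")
activity = [0, 1, 0, 0, 0, 1]
print(count_since_last_active(ids, activity))  # [1, 0, 1, 1, 2, 0]
```

A loop like this is slow on large frames, but it is a handy reference for checking a vectorised solution.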
Solution
We use a new helper column 'G' here. It marks the rows where the activity is 1, which are the rows where the count resets.
df['G'] = df['Activeity'].eq(1)
df.groupby([df['ID'], df['G'].cumsum()])['G'].transform(lambda x: (~x).cumsum())
Out[713]:
0 1
1 2
2 0
3 1
4 2
5 1
6 2
7 0
8 1
9 0
10 1
11 1
12 0
13 0
14 1
15 2
Name: G, dtype: int32
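For convenience, the two lines above can be wrapped into a single self-contained function. This is only a sketch: the function name count_since_active and its parameters are hypothetical, and it uses transform so that the result keeps the original row index.

```python
import pandas as pd


def count_since_active(df, id_col="ID", activity_col="Activeity"):
    """Per-ID counter that resets to 0 on rows where the activity is 1."""
    is_reset = df[activity_col].eq(1)           # True on every reset row
    block = is_reset.cumsum().rename("block")   # label grows at each reset
    # Inside each (ID, block) pair, count the rows that are not resets.
    counts = is_reset.groupby([df[id_col], block]).transform(
        lambda s: (~s).cumsum()
    )
    return counts.astype(int)


df = pd.DataFrame({
    "ID": list("AAAAABBBBBBCCCCC"),
    "Activeity": [0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0],
})
result = count_since_active(df)
print(result.tolist())
# [1, 2, 0, 1, 2, 1, 2, 0, 1, 0, 1, 1, 0, 0, 1, 2]
```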
Data input
import pandas as pd
df = pd.DataFrame({'ID': list('AAAAABBBBBBCCCCC'), 'Activeity': [0,0,1,0,0,0,0,1,0,1,0,0,1,1,0,0]})
Explanation:
First we build the helper column 'G', which is True on every row where the activity is 1:
df['G'] = df['Activeity'].eq(1)
df
Out[134]:
Activeity ID G
0 0 A False
1 0 A False
2 1 A True
3 0 A False
4 0 A False
5 0 B False
6 0 B False
7 1 B True
8 0 B False
9 1 B True
10 0 B False
11 0 C False
12 1 C True
13 1 C True
14 0 C False
15 0 C False
Then we take the cumulative sum of G. It increases by 1 at every reset row, so it labels each counting cycle and tells us where the count has to go back to 0.
df.G.cumsum()
Out[135]:
0 0
1 0
2 1
3 1
4 1
5 1
6 1
7 2
8 2
9 3
10 3
11 3
12 4
13 5
14 5
15 5
Name: G, dtype: int32
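The cycle labels alone are not yet the answer. Label 1, for example, covers rows 2 to 6 and so spans IDs A and B, which is why ID has to be part of the grouping key. A minimal sketch of the remaining step follows, written without a lambda. The names block, not_reset and the output column count are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": list("AAAAABBBBBBCCCCC"),
    "Activeity": [0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0],
})
df["G"] = df["Activeity"].eq(1)

# Cycle label: grows by 1 at every reset row (global, so it can span IDs).
block = df["G"].cumsum().rename("block")

# 1 for ordinary rows, 0 for reset rows, so a running sum restarts at 0.
not_reset = (~df["G"]).astype(int)

# Running sum inside each (ID, block) pair gives the final counter.
df["count"] = not_reset.groupby([df["ID"], block]).cumsum()

print(df.assign(block=block))
```

Grouping by the pair (ID, block) keeps the counter from carrying over between IDs even though the labels themselves are computed over the whole frame.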