Pandas - Duplicate Row based on condition
Question
I'm trying to duplicate a row when it meets a condition. In the table below, I created a cumulative count based on a groupby, then a second calculation for the MAX of each group.
df['PathID'] = df.groupby('Date Completed').cumcount() + 1
df['MaxPathID'] = df.groupby('Date Completed')['PathID'].transform('max')
Date Completed PathID MaxPathID
1/31/17 1 3
1/31/17 2 3
1/31/17 3 3
2/1/17 1 1
2/2/17 1 2
2/2/17 2 2
In this case, I want to duplicate only the record for 2/1/17, since there is only one instance of that date (i.e. where MaxPathID == 1).
Desired output:
Date Completed PathID MaxPathID
1/31/17 1 3
1/31/17 2 3
1/31/17 3 3
2/1/17 1 1
2/1/17 1 1
2/2/17 1 2
2/2/17 2 2
Thanks in advance!
Recommended answer
I think you need to get the rows whose Date Completed value is unique and then concat them to the original:
df1 = df.loc[~df['Date Completed'].duplicated(keep=False), ['Date Completed']]
print (df1)
Date Completed
3 2/1/17
df = pd.concat([df,df1], ignore_index=True).sort_values('Date Completed')
df['PathID'] = df.groupby('Date Completed').cumcount() + 1
df['MaxPathID'] = df.groupby('Date Completed')['PathID'].transform('max')
print (df)
Date Completed PathID MaxPathID
0 1/31/17 1 3
1 1/31/17 2 3
2 1/31/17 3 3
3 2/1/17 1 2
6 2/1/17 2 2
4 2/2/17 1 2
5 2/2/17 2 2
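For anyone who wants to run the steps above end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question's table, and the names `singles` and `out` are mine, not from the answer. I also pass kind="mergesort" to sort_values, which is an addition on my part: it guarantees a stable sort so the appended copy stays right after its original. Note that the dates are plain strings here, so the sort is lexicographic; it happens to match date order for this sample only.

```python
import pandas as pd

# Sample data rebuilt from the question's table (dates kept as strings).
df = pd.DataFrame({"Date Completed": ["1/31/17", "1/31/17", "1/31/17",
                                      "2/1/17", "2/2/17", "2/2/17"]})

# duplicated(keep=False) flags every member of a repeated group, so
# negating it leaves only the dates that occur exactly once.
singles = df.loc[~df["Date Completed"].duplicated(keep=False), ["Date Completed"]]

# Append the single-occurrence rows, then sort with a stable algorithm
# (assumption: mergesort is used here instead of the default quicksort).
out = (pd.concat([df, singles], ignore_index=True)
         .sort_values("Date Completed", kind="mergesort"))

# Recompute both counters on the enlarged frame.
out["PathID"] = out.groupby("Date Completed").cumcount() + 1
out["MaxPathID"] = out.groupby("Date Completed")["PathID"].transform("max")
print(out)
```

Because the counters are recomputed after the concat, the 2/1/17 rows end up with PathID 1 and 2 and MaxPathID 2, which differs slightly from the desired output in the question.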
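If the DataFrame has other columns, select the whole rows instead of only the Date Completed column:
print (df)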
Date Completed a b
0 1/31/17 4 5
1 1/31/17 3 5
2 1/31/17 6 3
3 2/1/17 7 9
4 2/2/17 2 0
5 2/2/17 6 7
df1 = df[~df['Date Completed'].duplicated(keep=False)]
#alternative - boolean indexing by numpy array
#df1 = df[~df['Date Completed'].duplicated(keep=False).values]
print (df1)
Date Completed a b
3 2/1/17 7 9
df = pd.concat([df,df1], ignore_index=True).sort_values('Date Completed')
print (df)
Date Completed a b
0 1/31/17 4 5
1 1/31/17 3 5
2 1/31/17 6 3
3 2/1/17 7 9
6 2/1/17 7 9
4 2/2/17 2 0
5 2/2/17 6 7
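If the goal is to reproduce the question's desired output exactly (the copied row keeping PathID 1 and MaxPathID 1), one possible alternative is to repeat index labels rather than concat and sort. This is a different technique from the answer above, sketched under the assumption that the frame has a unique index; `repeats` and `out` are illustrative names.

```python
import pandas as pd

# Sample data rebuilt from the question's table, with both counters computed.
df = pd.DataFrame({"Date Completed": ["1/31/17", "1/31/17", "1/31/17",
                                      "2/1/17", "2/2/17", "2/2/17"]})
df["PathID"] = df.groupby("Date Completed").cumcount() + 1
df["MaxPathID"] = df.groupby("Date Completed")["PathID"].transform("max")

# Repeat count per row: 2 where MaxPathID == 1, otherwise 1.
repeats = df["MaxPathID"].eq(1).astype(int) + 1

# Index.repeat preserves the original row order, so no sort is needed;
# .loc then selects each label as many times as it appears.
out = df.loc[df.index.repeat(repeats)].reset_index(drop=True)
print(out)
```

The counters are not recomputed here, so the duplicated row is an exact copy of the original. Any other condition can be swapped in by changing the boolean expression used to build `repeats`.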