Pivoting a pandas DataFrame into the right format: `DataError: No numeric types to aggregate`

2023-10-19 Python development questions


Problem description

Here is a pandas DataFrame I would like to manipulate:

import pandas as pd

data = {"grouping": ["item1", "item1", "item1", "item2", "item2", "item2", "item2", ...],
        "labels": ["A", "B", "C", "A", "B", "C", "D", ...],
        "count": [5, 1, 8, 3, 731, 189, 9, ...]}

df = pd.DataFrame(data)

print(df)
>>>   grouping            labels       count
0        item1             A            5
1        item1             B            1
2        item1             C            8
3        item2             A            3
4        item2             B          731
5        item2             C          189
6        item2             D            9
7        ...               ...         ....

I would like to "unfold" this dataframe into the following format:

grouping    A    B    C    D
item1       5    1    8    3
item2       3    731  189  9
....        ........

How would one do this? I would think that this would work:

pd.pivot_table(df, index=["grouping", "labels"])

But I get the following error:

DataError: No numeric types to aggregate
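
The question does not show the column dtypes, so the following is only a plausible explanation. In older pandas versions this error typically appeared when the column being aggregated was not numeric, for example counts stored as strings. Separately, putting "labels" in `index` keeps it as rows; spreading it across columns needs `columns="labels"`. A minimal sketch, assuming a hypothetical variant of the data in which "count" holds strings:

```python
import pandas as pd

# Hypothetical variant of the question's data: counts stored as strings.
raw = pd.DataFrame({
    "grouping": ["item1", "item1", "item1", "item2", "item2", "item2", "item2"],
    "labels": ["A", "B", "C", "A", "B", "C", "D"],
    "count": ["5", "1", "8", "3", "731", "189", "9"],
})

# Inspect the dtypes first: "count" is not numeric here.
print(raw.dtypes)

# Convert the column to a numeric dtype before aggregating.
fixed = raw.assign(count=pd.to_numeric(raw["count"]))

# With numeric values and labels passed as columns, pivot_table can aggregate.
wide = fixed.pivot_table(values="count", index="grouping", columns="labels")
print(wide)
```

If the real data already has a numeric "count" column, the conversion step is unnecessary and only the `columns="labels"` change applies.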

Recommended answer

There are four idiomatic pandas ways to do this.

  • No duplicates between the grouping columns, so no aggregation is needed:
    • pivot
    • set_index
    • pivot_table
    • groupby

    pivot

    df.pivot(index='grouping', columns='labels', values='count')
    

    set_index

    df.set_index(['grouping', 'labels'])['count'].unstack()
    

    pivot_table

    df.pivot_table('count', 'grouping', 'labels')
    

    groupby

    df.groupby(['grouping', 'labels'])['count'].sum().unstack()
    

    All of these yield:

    labels      A      B      C    D
    grouping                        
    item1     5.0    1.0    8.0  NaN
    item2     3.0  731.0  189.0  9.0
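
To check that the four approaches above agree, here is a self-contained sketch. It uses only the seven rows visible in the question (the elided rows are left out), and the `via_*` variable names are chosen for illustration.

```python
import pandas as pd

# Only the rows shown in the question; the elided rows are omitted.
df = pd.DataFrame({
    "grouping": ["item1", "item1", "item1", "item2", "item2", "item2", "item2"],
    "labels": ["A", "B", "C", "A", "B", "C", "D"],
    "count": [5, 1, 8, 3, 731, 189, 9],
})

via_pivot = df.pivot(index="grouping", columns="labels", values="count")
via_set_index = df.set_index(["grouping", "labels"])["count"].unstack()
via_pivot_table = df.pivot_table(values="count", index="grouping", columns="labels")
via_groupby = df.groupby(["grouping", "labels"])["count"].sum().unstack()

# Compare labels and values; dtypes are not checked because they may
# differ between approaches and pandas versions.
for other in (via_set_index, via_pivot_table, via_groupby):
    pd.testing.assert_frame_equal(via_pivot, other, check_dtype=False)

print(via_pivot)
```

The missing ("item1", "D") combination shows up as NaN in every result.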
    

    With the groupby, set_index, or pivot_table approach, you can easily fill in missing values with fill_value=0:

    df.pivot_table('count', 'grouping', 'labels', fill_value=0)
    
    df.groupby(['grouping', 'labels'])['count'].sum().unstack(fill_value=0)
    
    df.set_index(['grouping', 'labels'])['count'].unstack(fill_value=0)
    

    All of these yield:

    labels    A    B    C  D
    grouping                
    item1     5    1    8  0
    item2     3  731  189  9
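
The sketch below runs the three `fill_value=0` variants on the seven visible rows. `DataFrame.pivot` is not in the list above because it has no `fill_value` parameter; filling afterwards with `fillna(0)` and casting back to `int` is one possible workaround, shown here as an assumption rather than part of the original answer.

```python
import pandas as pd

# Only the rows shown in the question; the elided rows are omitted.
df = pd.DataFrame({
    "grouping": ["item1", "item1", "item1", "item2", "item2", "item2", "item2"],
    "labels": ["A", "B", "C", "A", "B", "C", "D"],
    "count": [5, 1, 8, 3, 731, 189, 9],
})

filled_pt = df.pivot_table(values="count", index="grouping",
                           columns="labels", fill_value=0)
filled_gb = df.groupby(["grouping", "labels"])["count"].sum().unstack(fill_value=0)
filled_si = df.set_index(["grouping", "labels"])["count"].unstack(fill_value=0)

# pivot has no fill_value argument, so fill the gaps after reshaping
# and cast back to integers (the NaN had forced a float dtype).
filled_pivot = (
    df.pivot(index="grouping", columns="labels", values="count")
      .fillna(0)
      .astype(int)
)

print(filled_gb)
```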
    

    Additional thoughts on groupby

    Because we don't need any aggregation, if we want to use groupby we can minimize the effect of the implicit aggregation by choosing a less impactful aggregator such as max or first:

    df.groupby(['grouping', 'labels'])['count'].max().unstack()
    

    df.groupby(['grouping', 'labels'])['count'].first().unstack()
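
The choice of aggregator only matters when the no-duplicates assumption does not hold. The sketch below uses hypothetical data (not from the question) in which the pair ("item1", "A") appears twice: `pivot` refuses to reshape it, while each groupby aggregator silently picks a different value.

```python
import pandas as pd

# Hypothetical data: the ("item1", "A") pair occurs twice.
dup = pd.DataFrame({
    "grouping": ["item1", "item1", "item2"],
    "labels": ["A", "A", "A"],
    "count": [5, 7, 3],
})

# pivot cannot place two values in one cell and raises ValueError.
try:
    dup.pivot(index="grouping", columns="labels", values="count")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print("pivot failed:", exc)

# groupby always aggregates, so the result depends on the aggregator.
grouped = dup.groupby(["grouping", "labels"])["count"]
as_sum = grouped.sum().unstack()      # adds the duplicates together
as_max = grouped.max().unstack()      # keeps the largest duplicate
as_first = grouped.first().unstack()  # keeps the first one encountered

print(as_sum, as_max, as_first, sep="\n\n")
```

With unique (grouping, labels) pairs, all three aggregators return the same table, which is why the answer treats them as interchangeable here.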
    

