将一列从一个 DataFrame 复制到另一个会给出 NaN 值?

2023-08-31Python开发问题
15

本文介绍了将一列从一个 DataFrame 复制到另一个会给出 NaN 值?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着跟版网的小编来一起学习吧!

问题描述

This question has been asked so many times, and it seemed to work for others, however, I am getting NaN values when I copy a column from a different DataFrame(df1 and df2 are same length).

df1

        date     hour      var1
a   2017-05-01  00:00:00   456585
b   2017-05-01  01:00:00   899875
c   2017-05-01  02:00:00   569566
d   2017-05-01  03:00:00   458756
e   2017-05-01  04:00:00   231458
f   2017-05-01  05:00:00   986545

df2

      MyVar1     MyVar2 
 0  6169.719338 3688.045368
 1  5861.148007 3152.238704
 2  5797.053347 2700.469871
 3  5779.102340 2730.471948
 4  6708.219647 3181.298291
 5  8550.380343 3793.580394

I need like this in my df2

       MyVar1    MyVar2        date        hour
 0  6169.719338 3688.045368  2017-05-01  00:00:00
 1  5861.148007 3152.238704  2017-05-01  01:00:00
 2  5797.053347 2700.469871  2017-05-01  02:00:00
 3  5779.102340 2730.471948  2017-05-01  03:00:00
 4  6708.219647 3181.298291  2017-05-01  04:00:00
 5  8550.380343 3793.580394  2017-05-01  05:00:00

I tried the following,

df2['date'] = df1['date']
df2['hour'] = df1['hour']

type(df1)
>> pandas.core.frame.DataFrame

type(df2)
>> pandas.core.frame.DataFrame

I am getting the following,

       MyVar1    MyVar2      date       hour
 0  6169.719338 3688.045368  NaN        NaN
 1  5861.148007 3152.238704  NaN        NaN
 2  5797.053347 2700.469871  NaN        NaN

Why is this happening? There is another post that discusses merge, but I just need to copy it. Any help would be appreciated.

解决方案

The culprit is unalignable indexes

Your DataFrames' indexes are different (and correspondingly, the indexes for each columns), so when trying to assign a column of one DataFrame to another, pandas will try to align the indexes, and failing to do so, insert NaNs.

Consider the following examples to understand what this means:

# Setup
A = pd.DataFrame(index=['a', 'b', 'c']) 
B = pd.DataFrame(index=['b', 'c', 'd', 'f'])                                  
C = pd.DataFrame(index=[1, 2, 3])

# Example of alignable indexes - A & B (complete or partial overlap of indexes)
A.index B.index
      a        
      b       b   (overlap)
      c       c   (overlap)
              d
              f

# Example of unalignable indexes - A & C (no overlap at all)
A.index C.index
      a        
      b        
      c        
              1
              2
              3

When there are no overlaps, pandas cannot match even a single value between the two DataFrames to put in the result of the assignment, so the output is a column full of NaNs.

If you're working on an IPython notebook, you can check that this is indeed the root cause using,

df1.index.equals(df2.index)
# False
df1.index.intersection(df2.index).empty
# True


You can use any of the following solutions to fix this issue.

Solution 1: Reset both DataFrames' indexes

You may prefer this option if you didn't mean to have different indices in the first place, or if you don't particularly care about preserving the index.

# Optional, if you want a RangeIndex => [0, 1, 2, ...]
# df1.index = pd.RangeIndex(len(df))
# Homogenize the index values,
df2.index = df1.index
# Assign the columns.
df2[['date', 'hour']] = df1[['date', 'hour']]

If you want to keep the existing index, but as a column, you may use reset_index() instead.


Solution 2: Assign NumPy arrays (bypass index alignment)

This solution will only work if the lengths of the two DataFrames match.

# pandas >= 0.24
df2['date'] = df1['date'].to_numpy()
# pandas < 0.24
df2['date'] = df1['date'].values

To assign multiple columns easily, use,

df2[['date', 'hour']] = df1[['date', 'hour']].to_numpy()

这篇关于将一列从一个 DataFrame 复制到另一个会给出 NaN 值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!

The End

相关推荐

在xarray中按单个维度的多个坐标分组
groupby multiple coords along a single dimension in xarray(在xarray中按单个维度的多个坐标分组)...
2024-08-22 Python开发问题
15

Pandas中的GROUP BY AND SUM不丢失列
Group by and Sum in Pandas without losing columns(Pandas中的GROUP BY AND SUM不丢失列)...
2024-08-22 Python开发问题
17

pandas 有从特定日期开始的按月分组的方式吗?
Is there a way of group by month in Pandas starting at specific day number?( pandas 有从特定日期开始的按月分组的方式吗?)...
2024-08-22 Python开发问题
10

GROUP BY+新列+基于条件的前一行抓取值
Group by + New Column + Grab value former row based on conditionals(GROUP BY+新列+基于条件的前一行抓取值)...
2024-08-22 Python开发问题
18

PANDA中的Groupby算法和插值算法
Groupby and interpolate in Pandas(PANDA中的Groupby算法和插值算法)...
2024-08-22 Python开发问题
11

PANAS-基于列对行进行分组,并将NaN替换为非空值
Pandas - Group Rows based on a column and replace NaN with non-null values(PANAS-基于列对行进行分组,并将NaN替换为非空值)...
2024-08-22 Python开发问题
10