Pass a pd.Series to a dataframe?

I tried the following code, but the new column contains only NaN values.

df['new'] = pd.Series(np.repeat(1, len(df)))

Can someone explain to me what the problem is here?

It is likely that the index of the DataFrame df does not match the index of the newly created Series. For example,

import pandas as pd
import numpy as np
df = pd.DataFrame({'a': [11, 22, 33, 44, 55]}, index=['r1','r2','r3','r4','r5'])
df['new'] = pd.Series(np.repeat(1, len(df)))
print(df)

and the output will be:

     a  new
r1  11  NaN
r2  22  NaN
r3  33  NaN
r4  44  NaN
r5  55  NaN

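since the index of pd.Series(np.repeat(1, len(df))) is a default integer index running from 0 to 4 (displayed as Int64Index([0, 1, 2, 3, 4], dtype='int64') in older pandas versions and as a RangeIndex in newer ones), which shares no labels with df's index ('r1' to 'r5'). Assigning a Series to a column aligns on index labels, so every row without a matching label becomes NaN.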
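One way to confirm this kind of mismatch is to compare the two indexes directly. The following is a minimal sketch that reuses the df from the example; the names s, common, and aligned are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [11, 22, 33, 44, 55]},
                  index=['r1', 'r2', 'r3', 'r4', 'r5'])
s = pd.Series(np.repeat(1, len(df)))  # gets the default 0..4 index

# Labels present in both indexes; an empty result means nothing can align
common = df.index.intersection(s.index)
print(len(common))

# Reindexing the Series to df's labels previews what the assignment stores
aligned = s.reindex(df.index)
print(aligned.isna().all())
```

If the intersection is empty, every value in the new column will be NaN.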

To prevent that, specify the index argument when creating the Series:

df['new'] = pd.Series(np.repeat(1, len(df)), index=df.index)

Alternatively, you can just pass a numpy array if the index is to be ignored:

df['new'] = np.repeat(1, len(df))

without needing to create a Series (in fact, df['new'] = 1 will do for this case). Using a Series is helpful when you need to align the new column with the existing DataFrame using the index.
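To illustrate the case where alignment is what you want, here is a small sketch with a made-up Series whose labels are in a different order, omit 'r2', and include an extra 'r9' (all values are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({'a': [11, 22, 33]}, index=['r1', 'r2', 'r3'])

# Hypothetical data: shuffled labels, 'r2' missing, 'r9' not in df
s = pd.Series({'r3': 300, 'r1': 100, 'r9': 900})

# Values are matched by label, not by position
df['new'] = s
print(df)
```

Here 'r1' and 'r3' receive their matching values regardless of order, 'r2' becomes NaN because s has no such label, and 'r9' is dropped because df has no such row.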
