Pass a pd.Series to a dataframe?
I tried the following code, but the new column contains only NaN values.
df['new'] = pd.Series(np.repeat(1, len(df)))
Can someone explain to me what the problem is here?
It is possible that the index of the DataFrame df does not match the index of the newly created Series. For example:
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': [11, 22, 33, 44, 55]}, index=['r1','r2','r3','r4','r5'])
df['new'] = pd.Series(np.repeat(1, len(df)))
print(df)
and the output will be:
     a  new
r1  11  NaN
r2  22  NaN
r3  33  NaN
r4  44  NaN
r5  55  NaN
since the index of pd.Series(np.repeat(1, len(df))) is Int64Index([0, 1, 2, 3, 4], dtype='int64') (recent pandas versions report a RangeIndex instead). It shares no labels with the index of df ('r1' through 'r5'), and because column assignment aligns on index labels, every row ends up as NaN.
To prevent that, specify the index argument when creating the Series:
df['new'] = pd.Series(np.repeat(1, len(df)), index=df.index)
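As a quick sanity check, the mismatch can be confirmed by comparing the two indexes directly. The following is only a minimal sketch; the names s_default and s_aligned are illustrative and do not appear in the original question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [11, 22, 33, 44, 55]},
                  index=['r1', 'r2', 'r3', 'r4', 'r5'])

# Default index: integer positions 0..4, which share no labels with df.index
s_default = pd.Series(np.repeat(1, len(df)))
print(df.index.equals(s_default.index))   # False

# Reusing df.index makes the labels match, so assignment fills every row
s_aligned = pd.Series(np.repeat(1, len(df)), index=df.index)
print(df.index.equals(s_aligned.index))   # True

df['new'] = s_aligned
print(df['new'].isna().any())             # False
```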
Alternatively, you can just pass a numpy array if the index is to be ignored:
df['new'] = np.repeat(1, len(df))
without needing to create a Series (in fact, df['new'] = 1 will do for this case). Using a Series is helpful when you need to align the new column with the existing DataFrame using the index.
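To illustrate when index alignment is actually useful, here is a small sketch in which the Series arrives out of order and lacks one label. The column name score and its values are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'a': [11, 22, 33]}, index=['r1', 'r2', 'r3'])

# Hypothetical values, deliberately out of order and missing 'r2'
scores = pd.Series({'r3': 30, 'r1': 10})

# Assignment aligns on index labels, not on position:
# 'r1' receives 10, 'r3' receives 30, and 'r2' has no match so it becomes NaN
df['score'] = scores
print(df)
```

Had the values been passed as a plain numpy array instead, pandas would have matched them by position and raised an error here because the lengths differ.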