Python Pandas Dataframe - Iterating rows and adding dictionary issue
import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)

def calculation(text):
    return text*2

for idx, row in df.iterrows():
    df.at[idx, 'col3'] = dict(cats=calculation(row['col1']))

df
As you can see from the code above, I have tried a few different things.
Basically, I am trying to get the dictionary into col3.
However, when you run the loop for the first time on a new dataframe, you get:
   col1  col2         col3
0     1     3         cats
1     2     4  {'cats': 4}
If you run the for loop again on the same dataframe, you get what I am looking for, which is:
   col1  col2         col3
0     1     3  {'cats': 2}
1     2     4  {'cats': 4}
How do I get the dictionary in there on the first pass, without having to run the loop again?
I have tried other approaches such as df.loc, still with no luck.
Try to stay away from df.iterrows(). You can use df.apply instead:
import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)

def calculation(text):
    return text*2

def calc_dict(row):
    return dict(cats=calculation(row['col1']))

df['col3'] = df.apply(calc_dict, axis=1)

df
This outputs the result you expect.
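When only one input column is involved, a plain list comprehension is another way to build the same column without iterrows(). The sketch below reuses calculation from the code above; it is an illustrative alternative, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

def calculation(text):
    return text * 2

# Build one dict per value of col1 and assign the whole list as a new column.
df['col3'] = [dict(cats=calculation(value)) for value in df['col1']]

print(df)
```

Because the whole column is assigned in one step, no cell is ever written into a column that does not exist yet.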
The error seems to creep in with the creation and assignment of an object-dtype value to the column col3. I tried to pre-allocate the column with NaNs using df['col3'] = pd.np.NaN, which had no effect (inspect with print(df.dtypes)). In any case, this looks like buggy behaviour. Use df.apply instead; it's faster and less prone to these kinds of issues.
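If the loop has to stay, one possible workaround is to create col3 with object dtype before assigning into it, so that df.at only ever replaces a single existing cell. This is a sketch, under the assumption that the odd first-pass result comes from col3 not yet existing as an object column when the first dict is assigned; behaviour may differ between pandas versions:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

def calculation(text):
    return text * 2

# Assumption: pre-create col3 as an object column so each cell can hold a dict.
df['col3'] = pd.Series([None] * len(df), index=df.index, dtype=object)

for idx, row in df.iterrows():
    df.at[idx, 'col3'] = dict(cats=calculation(row['col1']))

print(df.dtypes)
print(df)
```

Unlike pre-allocating with NaN, which gives a float column, this starts col3 out with a dtype that can store arbitrary Python objects.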