简体   繁体   English

Pandas:更快地迭代数据框并根据操作添加新数据

[英]Pandas: Faster way to iterate through a dataframe and add new data based on operations

I want to pandas to look into values in 2 columns in each row of df1, look for the match in another df2, and paste this in a new column in df1 in the same row and continue 我想让pandas在df1的每一行中查看2列中的值,在另一个df2中查找匹配,并将其粘贴到同一行的df1中的新列中并继续

alp=list("ABCDEFGHIJKLMNOPQRTSUVQXYZ")
df1['NewCol'] = (np.random.choice(alp)) #create new col and input random values

for i in range(len(df1['code1'])):
    a = df1['code2'].iloc[i].upper()
    b = df1['code1'].str[-3:].iloc[i]
    df1['NewCol'].iloc[i] = df2.loc[b,a]
df1['code3'] = df1[['code3','NewCol']].max(axis=1)
df1 =df1.drop('NewCol',axis=1)

My inputs as below: df1: 我的输入如下:df1:

    code1 code2  code3
0  XXXHYG     a     12
1  XXXTBG     a     23
2  XXXECT     b     34
3  XXXKOL     b     45
4  XXXBTW     c     56

df2: DF2:

    A   B   C   D   E
HYG 33  38  40  41  30
TBG 20  46  41  43  45
ECT 53  42  39  34  45
KOL 45  51  54  47  30
BTW 37  36  49  48  58

output needed: 需要输出:

    code1 code2  code3
0  XXXHYG     a     33
1  XXXTBG     a     23
2  XXXECT     b     42
3  XXXKOL     b     51
4  XXXBTW     c     56

When I do this over just 4200 rows in df1, it takes 222 seconds for just the loop.. there has got to be a way to utilize the power of pandas to do this faster? 当我在df1中仅仅4200行执行此操作时,只需要222秒即可完成循环..必须有一种方法来利用大熊猫的力量来更快地完成这项工作吗?

thanks a lot for your time! 非常感谢你的时间!

You could use apply ( docs ), but a faster way to do this if you have a lot of data would be to create a version of df2 with a MultiIndex ( docs ) using stack ( docs ) and look up values on this new DataFrame. 您可以使用applydocs ),但如果您有大量数据,则更快的方法是使用stackdocs )创建带有MultiIndexdocs )的df2版本,并在此新DataFrame上查找值。

df3 = df1.copy()
tups = list(zip(df1['code1'].str[-3:], df1['code2'].str.upper()))
df3['code3'] = df2.stack()[tups].values
print(df3)

outputs 输出

    code1 code2  code3
0  XXXHYG     a     33
1  XXXTBG     a     20
2  XXXECT     b     42
3  XXXKOL     b     51
4  XXXBTW     c     49

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM