
Looking up values in one dataframe based on conditions from rows in another

I want to add a column to df1 that looks up values from df2, using conditions based on the rows of df1.

For example:

df1

Name  Date     Score  Country
John  1st Jan  5      US
John  2nd Jan  6      US
Phil  1st Jan  4      Canada
Phil  2nd Jan  8      Canada
Phil  3rd Jan  7      Canada

I want a formula that looks up the value of another column in df2 where Name = John, Date > 1st Jan, and Country = US, and does the same for all the other rows.

Many thanks.

Try this...

import pandas as pd
import numpy as np

columns = ["Name", "Date", "Score", "Country"]
# Score is stored as strings here, so it is compared against "5", not 5.
data = [
    ["John", "1st Jan", "5", "US"],
    ["John", "2nd Jan", "6", "US"],
    ["Phil", "1st Jan", "4", "Canada"],
    ["Phil", "2nd Jan", "8", "Canada"],
    ["Phil", "3rd Jan", "7", "Canada"]
]

columns2 = ["Col1", "Col2", "Col3", "Col4"]
data2 = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]

df = pd.DataFrame(data, columns=columns)
df2 = pd.DataFrame(data2, columns=columns2)
print(df)
print(df2)

# Assigning a Series through .loc aligns on the index, so each matching row
# of df receives the df2 value that has the same row label.
df.loc[(df['Name'] == "John") &
       (df['Date'] == "1st Jan") &
       (df['Score'] == "5") &
       (df['Country'] == "US"), 'New'] = df2["Col1"]

# The same lookup with the conditions held in variables.
name = "Phil"
date = "1st Jan"
score = "4"
country = "Canada"
df.loc[(df['Name'] == name) &
       (df['Date'] == date) &
       (df['Score'] == score) &
       (df['Country'] == country), 'New'] = df2["Col2"]

print(df)

OUTPUT:

   Name     Date Score Country
0  John  1st Jan     5      US
1  John  2nd Jan     6      US
2  Phil  1st Jan     4  Canada
3  Phil  2nd Jan     8  Canada
4  Phil  3rd Jan     7  Canada
   Col1  Col2  Col3  Col4
0     1     2     3     4
1     1     2     3     4
2     1     2     3     4
3     1     2     3     4
   Name     Date Score Country  New
0  John  1st Jan     5      US  1.0
1  John  2nd Jan     6      US  NaN
2  Phil  1st Jan     4  Canada  2.0
3  Phil  2nd Jan     8  Canada  NaN
4  Phil  3rd Jan     7  Canada  NaN
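
The NaN rows above come from how pandas handles the assignment: a Series assigned through .loc is aligned on row labels, not copied positionally. The following is a minimal sketch of that behaviour; the frames `left` and `right` and the column names "New" and "First" are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical frames, used only to show index alignment.
left = pd.DataFrame({"Name": ["John", "John", "Phil"]})
right = pd.DataFrame({"Col1": [10, 20, 30]})

# Only row label 2 satisfies the condition.
mask = left["Name"] == "Phil"

# The Series is aligned on the index, so row 2 receives right.loc[2, "Col1"].
# Rows outside the mask stay NaN in the newly created column.
left.loc[mask, "New"] = right["Col1"]

# To write one specific value to every matching row, pass a scalar instead.
left.loc[left["Name"] == "John", "First"] = right["Col1"].iloc[0]

print(left)
```

If the rows of df2 do not share row labels with the rows they belong to in df1, this kind of assignment will not pick the intended values, and a key-based lookup is likely the safer choice.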

Edit

You can make this more automated by using df.apply() with a function and a lambda that calls it.

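def lambdafunc(row):
    # Use column labels; positional access such as row[0] is deprecated
    # in recent pandas versions.
    # This assumes df2['Col1'] holds names. With the sample df2 above
    # (integers), no row matches and NaN is returned.
    match = df2.loc[df2['Col1'] == row['Name'], 'Col4']
    return match.iloc[0] if not match.empty else np.nan


# Return the looked-up value rather than modifying df inside apply().
df['New'] = df.apply(lambda x: lambdafunc(x), axis=1)

print(df)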
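
The question also asks for a date condition (Date > 1st Jan), which string comparison cannot express. Below is a rough sketch of one possible approach, not the method from the answer above: parse the dates, left-merge on the key columns, then keep the looked-up value only where the condition holds. The layout of df2 (with Name, Country, and Value columns), the year 2024, and the column names ParsedDate and New are all assumptions, since the question does not show df2.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Name": ["John", "John", "Phil", "Phil", "Phil"],
    "Date": ["1st Jan", "2nd Jan", "1st Jan", "2nd Jan", "3rd Jan"],
    "Score": [5, 6, 4, 8, 7],
    "Country": ["US", "US", "Canada", "Canada", "Canada"],
})

# Hypothetical lookup table: one value per (Name, Country) pair.
df2 = pd.DataFrame({
    "Name": ["John", "Phil"],
    "Country": ["US", "Canada"],
    "Value": [100, 200],
})

# Parse "1st Jan" style strings: drop the ordinal suffix and append an
# assumed year so the dates can be compared.
day_month = df1["Date"].str.replace(r"(?<=\d)(st|nd|rd|th)", "", regex=True)
df1["ParsedDate"] = pd.to_datetime(day_month + " 2024", format="%d %b %Y")

# A left merge keeps every row of df1 and brings in the matching Value.
merged = df1.merge(df2, on=["Name", "Country"], how="left")

# Keep the looked-up value only where the date condition holds.
cutoff = pd.Timestamp("2024-01-01")
merged["New"] = merged["Value"].where(merged["ParsedDate"] > cutoff)

print(merged[["Name", "Date", "Score", "Country", "New"]])
```

Merging on keys avoids relying on the two frames sharing row labels, and it handles every Name and Country combination in one pass instead of one .loc assignment per combination.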

