
How to use numpy vectorization for multiple datasets, and then call a function?

I have a dataset that contains names and dates. I need to compare it against other datasets that also have names and dates, and call another function if the name appears in one of them. In the example I just mocked a return value, which would be assigned to a new column in the dataframe. But I couldn't work out how. Here's what I have so far (I need to use numpy vectorization):

def getName(name, date, df1, df2):
    if name  == df1['NAME'].values:
       return name
    if name  == df2['NAME'].values:
       return 'HEY'

df = pd.DataFrame({
    "NAME": ["JOE", "CHRIS", "AARON"],
    "DATE": [10, 20, 30]
})
df1 = pd.DataFrame({
    "NAME": ["JOE", "JASON", "GUS"],
    "DATE": [10, 20, 30]
})

df2 = pd.DataFrame({
    "NAME": ["STEPHEN", "CHRIS", "AARON"],
    "DATE": [10, 20, 30]
})

df['NAME_'] = getName(df['NAME'].values, df['DATE'].values, df1, df2)

The output should be:

df = 
NAME DATE NAME_
JOE   10   JOE
CHRIS 20   HEY
AARON 30   HEY

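So you are testing equality with the == operator, which compares element by element: name == df1['NAME'].values produces a boolean array rather than a single True or False, and using an array with more than one element as an if condition raises a ValueError. I think you want to test whether name is in a column. You can do this with a construct like if name in df1['NAME'].values.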
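To make the difference concrete, here is a minimal sketch using df1 from the question. The variable names elementwise, found and missing are only illustrative and are not part of the original post.

```python
import pandas as pd

df1 = pd.DataFrame({"NAME": ["JOE", "JASON", "GUS"], "DATE": [10, 20, 30]})

# == compares the string against every element and returns one bool per row
elementwise = "JOE" == df1["NAME"].values
print(list(elementwise))

# `in` answers the membership question with a single value
found = "JOE" in df1["NAME"].values
missing = "CHRIS" in df1["NAME"].values
print(found, missing)
```

Only the second form gives a single value that can safely be used as an if condition.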

But even with the function fixed, you can't call getName just once on whole arrays and get the result you are looking for, because the function handles a single name at a time. Typically, you could use apply so the function is called for every row of df, for example df.apply(lambda row: getName(row['NAME'], row['DATE'], df1, df2), axis=1). But this isn't vectorization, as apply is a loop behind the scenes.
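For reference, the row-by-row approach could look like the sketch below. It assumes the function is rewritten with the membership test, and that returning None for names found in neither table is acceptable; the question does not say what should happen in that case. The name get_name is a renamed stand-in for the question's getName.

```python
import pandas as pd

def get_name(name, date, df1, df2):
    # date mirrors the question's signature but is not used by this mock
    if name in df1["NAME"].values:
        return name
    if name in df2["NAME"].values:
        return "HEY"
    return None  # assumption: no match yields None

df = pd.DataFrame({"NAME": ["JOE", "CHRIS", "AARON"], "DATE": [10, 20, 30]})
df1 = pd.DataFrame({"NAME": ["JOE", "JASON", "GUS"], "DATE": [10, 20, 30]})
df2 = pd.DataFrame({"NAME": ["STEPHEN", "CHRIS", "AARON"], "DATE": [10, 20, 30]})

# axis=1 passes each row to the lambda, so get_name runs once per row
df["NAME_"] = df.apply(
    lambda row: get_name(row["NAME"], row["DATE"], df1, df2), axis=1
)
print(df)
```

This works, but it is still a Python-level loop over the rows.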

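So perhaps you could use join instead: tag each lookup table with the value it should produce, concatenate the two, and left-join the result onto df: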

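# Tag every name with the value it should map to
df1['NAME_'] = df1['NAME']
df2['NAME_'] = 'HEY'
# Build one lookup table indexed by name, then left-join it onto df
df3 = pd.concat([df1, df2]).set_index('NAME')
df = df.join(df3['NAME_'], on='NAME', how='left')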

Output

    NAME  DATE NAME_
0    JOE    10   JOE
1  CHRIS    20   HEY
2  AARON    30   HEY
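Since the question asks specifically for numpy vectorization, the same result can also be sketched with np.isin and np.select. This is a swapped-in alternative, not the approach from the answer above. Like the answer, it ignores DATE, and it assumes names found in neither table should become None.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"NAME": ["JOE", "CHRIS", "AARON"], "DATE": [10, 20, 30]})
df1 = pd.DataFrame({"NAME": ["JOE", "JASON", "GUS"], "DATE": [10, 20, 30]})
df2 = pd.DataFrame({"NAME": ["STEPHEN", "CHRIS", "AARON"], "DATE": [10, 20, 30]})

names = df["NAME"].to_numpy()

# One boolean mask per lookup table, computed for all rows at once
in_df1 = np.isin(names, df1["NAME"].to_numpy())
in_df2 = np.isin(names, df2["NAME"].to_numpy())

# np.select takes the first condition that is True for each element:
# keep the name if it is in df1, else "HEY" if it is in df2, else None
df["NAME_"] = np.select([in_df1, in_df2], [names, "HEY"], default=None)
print(df)
```

Because the conditions are checked in order, a name present in both tables keeps its own value here, whereas the join approach would produce one row per match.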
