How to return column values given the values of other columns in the same row using pandas?

Question

I have a data frame like this:

df
col1     col2     col3     col4
 1         2        P        Q
 4         2        R        S
 5         3        P        R

I want to create a function that returns the col1 and col2 values, given the col3 and col4 values as input.

For example, if the function is f, the output of f(['P', 'Q']) should be:

col1    col2
 1       2

How can I do this in the most efficient way using pandas?

Answer 1

If you need the most efficient approach, compare the underlying NumPy arrays:

def f(a, b):
    # pandas 0.24+
    mask = (df['col3'].to_numpy() == a) & (df['col4'].to_numpy() == b)
    # works in all pandas versions
    # mask = (df['col3'].values == a) & (df['col4'].values == b)
    return df.loc[mask, ['col1', 'col2']]
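
To try this on the sample data from the question, a minimal self-contained sketch could look like the following. The DataFrame is reconstructed from the question's table, and passing df as an explicit argument (instead of relying on a global) is a choice made here for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({'col1': [1, 4, 5],
                   'col2': [2, 2, 3],
                   'col3': ['P', 'R', 'P'],
                   'col4': ['Q', 'S', 'R']})

def f(df, a, b):
    # Compare the underlying NumPy arrays (to_numpy needs pandas 0.24+)
    mask = (df['col3'].to_numpy() == a) & (df['col4'].to_numpy() == b)
    # Keep only the matching rows and the two requested columns
    return df.loc[mask, ['col1', 'col2']]

result = f(df, 'P', 'Q')
print(result)
```

Only the first row has col3 equal to 'P' and col4 equal to 'Q', so the result should contain that single row with col1 = 1 and col2 = 2.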

Performance: this depends on the data, the number of rows, and the number of matched rows, but in general comparing 1-D NumPy arrays is faster here:

import numpy as np
import pandas as pd

np.random.seed(123)
N = 10000
L = list('PQRSTU')
df = pd.DataFrame({'col1': np.random.randint(10, size=N),
                   'col2': np.random.randint(10, size=N),
                   'col3': np.random.choice(L, N),
                   'col4': np.random.choice(L, N)})
print(df)

def f(a, b):
    # pandas 0.24+
    mask = (df['col3'].to_numpy() == a) & (df['col4'].to_numpy() == b)
    # works in all pandas versions
    # mask = (df['col3'].values == a) & (df['col4'].values == b)
    return df.loc[mask, ['col1', 'col2']]

def f1(first, second):
    return df.loc[(df['col3'] == first) & (df['col4'] == second), ['col1', 'col2']]

In [91]: %timeit (f('P', 'Q'))
2.05 ms ± 13.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [92]: %timeit (f1('P', 'Q'))
3.52 ms ± 24.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
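
The %timeit lines above are IPython magics. Outside IPython, a rough equivalent can be sketched with the standard-library timeit module, as below. The variable names t_numpy and t_pandas and the choice of 100 repetitions are assumptions made for this sketch, and the measured times will vary with the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Same synthetic data as in the benchmark above
np.random.seed(123)
N = 10000
L = list('PQRSTU')
df = pd.DataFrame({'col1': np.random.randint(10, size=N),
                   'col2': np.random.randint(10, size=N),
                   'col3': np.random.choice(L, N),
                   'col4': np.random.choice(L, N)})

def f(a, b):
    # Mask built from NumPy arrays
    mask = (df['col3'].to_numpy() == a) & (df['col4'].to_numpy() == b)
    return df.loc[mask, ['col1', 'col2']]

def f1(first, second):
    # Mask built from pandas Series
    return df.loc[(df['col3'] == first) & (df['col4'] == second), ['col1', 'col2']]

# Sanity check: both versions must select exactly the same rows
same = f('P', 'Q').equals(f1('P', 'Q'))

runs = 100  # number of calls per measurement (arbitrary choice)
t_numpy = timeit.timeit(lambda: f('P', 'Q'), number=runs)
t_pandas = timeit.timeit(lambda: f1('P', 'Q'), number=runs)
print(f"NumPy mask:  {t_numpy / runs * 1e3:.3f} ms per call")
print(f"pandas mask: {t_pandas / runs * 1e3:.3f} ms per call")
```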

Answer 2

Just use boolean masking:

def f(first, second):
    return df.loc[(df['col3'] == first) & (df['col4'] == second), ['col1', 'col2']]
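
The question calls the function with a single list, as in f([P, Q]). A small variation that accepts such a pair might look like this sketch. The sample DataFrame is reconstructed from the question, and the parameter names pair and frame are hypothetical, introduced only for this example.

```python
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({'col1': [1, 4, 5],
                   'col2': [2, 2, 3],
                   'col3': ['P', 'R', 'P'],
                   'col4': ['Q', 'S', 'R']})

def f(pair, frame=df):
    # Unpack the two lookup values: first matches col3, second matches col4
    first, second = pair
    mask = (frame['col3'] == first) & (frame['col4'] == second)
    return frame.loc[mask, ['col1', 'col2']]

matched = f(['P', 'Q'])
unmatched = f(['X', 'Y'])  # no such row, so the result is an empty DataFrame
print(matched)
```

When no row matches, the function returns an empty DataFrame with the col1 and col2 columns rather than raising an error, so callers can check the result with .empty.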

Answer 3

**A single line of code can do this**

In place of 'P' and 'Q', put the values you want to match.

df[(df.col3 == 'P') & (df.col4 == 'Q')][['col1', 'col2']]

Answer 4

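You can try the code below, which calls the function f from the answers above for each row:

def func(x):
    # x is one row; look up the col1/col2 values matching its col3/col4
    matches = f(x['col3'], x['col4'])
    # append was removed in pandas 2.0, so combine with pd.concat instead
    return pd.concat([matches, x.to_frame().T])

# axis=1 passes each row (not each column) to func
dataframe = dataframe.apply(func, axis=1)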

solution1
3 ACCEPTED 2019-04-15 06:51:45

solution2
3 2019-04-15 06:51:46

solution3
2 2019-04-15 07:00:18

solution4
0 2019-04-15 06:58:07
