
Extracting single value from column in pandas

I have a simple pandas question about extracting a single value from a column.

import pandas as pd
df = pd.DataFrame({'A' : [15,56,23,84], 'B' : [10,20,33,25]})
df

     A    B
0    15   10
1    56   20
2    23   33
3    84   25

x = df[df['A'] == 23]
x

outputs

    A    B
2  23    33

However, I only want the value in column B, i.e. 33. How do I get that?

My preferred way is Jeff's using loc (it's generally good practice to avoid working on copies, especially if you might later do assignment).
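As a minimal sketch of the loc approach on the question's frame (the names `matches` and `value` are illustrative, not from the original answer): the boolean mask picks the rows, the label picks column B, and the same indexer can later be used for assignment on the original frame.

```python
import pandas as pd

df = pd.DataFrame({'A': [15, 56, 23, 84], 'B': [10, 20, 33, 25]})

# Row mask plus column label in a single .loc call returns a Series.
matches = df.loc[df['A'] == 23, 'B']

# Pull out the scalar; .iloc[0] takes the first match
# (it raises IndexError if nothing matched).
value = matches.iloc[0]
print(value)  # 33

# Assignment through the same indexer writes to df itself, not a copy.
df.loc[df['A'] == 23, 'B'] = 99
print(df.loc[2, 'B'])  # 99
```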

You can eke out some more performance by not creating a Series for the boolean mask, just a numpy array:

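import numpy as np
import pandas as pd

# 1000 rows of random integers in [1, 100), split into columns A and B
df = pd.DataFrame(np.random.randint(1, 100, 2000).reshape(-1, 2),
                  columns=list('AB'))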

In [21]: %timeit df.loc[df.A == 23, 'B']
1000 loops, best of 3: 532 µs per loop

In [22]: %timeit df['B'][df.A == 23]
1000 loops, best of 3: 432 µs per loop

In [23]: %timeit df.loc[df.A.values == 23, 'B']  # preferred
1000 loops, best of 3: 294 µs per loop

In [24]: %timeit df['B'].loc[df.A.values == 23]
1000 loops, best of 3: 197 µs per loop

I'm not sure why this is so slow, to be honest; maybe this use case could be improved? (I'm not sure where the extra 100 µs is spent.)

However, if you are only interested in the values of B and not their corresponding index (or the subframe), it's much faster to use the numpy arrays directly:

In [25]: %timeit df.B.values[df.A.values == 23]
10000 loops, best of 3: 60.3 µs per loop
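To make the trade-off concrete, here is a small sketch on the question's four-row frame (the names `mask`, `b_values`, `b_series` and `first` are illustrative). It assumes you only need the matching values; the array route drops the index labels that the Series route keeps.

```python
import pandas as pd

df = pd.DataFrame({'A': [15, 56, 23, 84], 'B': [10, 20, 33, 25]})

# Build the mask from the underlying NumPy array, skipping the boolean Series.
mask = df['A'].values == 23

# Index B's array directly: the result is a plain ndarray with no index labels.
b_values = df['B'].values[mask]
print(b_values)  # [33]

# The Series-based route keeps the original row label (2 here).
b_series = df.loc[mask, 'B']
print(b_series.index.tolist())  # [2]

# Guard against "no match" before taking the first element.
first = b_values[0] if b_values.size else None
```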

Simply: df['B'][df['A'] == 23]
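Note that this expression returns a one-element Series rather than the bare number. A short sketch of getting the scalar out, using the question's data (the names `result` and `single` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': [15, 56, 23, 84], 'B': [10, 20, 33, 25]})

# Select column B first, then filter it with the boolean mask.
result = df['B'][df['A'] == 23]
print(type(result).__name__)  # Series
print(result.tolist())        # [33]

# .item() returns the lone element as a Python scalar and raises
# ValueError when the Series does not hold exactly one element.
single = result.item()
print(single)  # 33
```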

Thanks @Jeff.

And the speed comparisons:

In [30]: %timeit df['B'][df['A'] == 23].values
1000 loops, best of 3: 813 µs per loop

In [31]: %timeit df.loc[df['A'] == 23, 'B']
1000 loops, best of 3: 976 µs per loop
