Given a list of values from one column of a pandas DataFrame, how do I get the values from another column in the same row?

Question

The problem is simple. The input is a list of non-container objects (int, str, etc.), and every element of the list appears in one column of a DataFrame. The task is, for each element in the list, to find the object (only its value, not the array) in another column of the same row.

The problem is better demonstrated in code:

from pandas import DataFrame
digits = '0123456789abcdef'
df = DataFrame([(a,b) for a, b in zip(digits, range(16))], columns=['hex', 'dec'])
df
df.loc[df.dec == 12, 'hex']
df.loc[df.dec == 12, 'hex'].values[0]
import random
eight = random.sample(range(16), 8)
eight
fun = lambda x: df.loc[df.dec == x, 'hex'].values[0]
''.join(fun(i) for i in eight)
''.join(map(fun, eight))

As you can see, I can already do this, but I am using a for loop and the performance isn't very impressive. I know pandas and numpy are all about vectorization, so I wonder whether there is a built-in way to do this...

In [1]: from pandas import DataFrame

In [2]: digits = '0123456789abcdef'

In [3]: df = DataFrame([(a,b) for a, b in zip(digits, range(16))], columns=['hex', 'dec'])

In [4]: df
Out[4]:
   hex  dec
0    0    0
1    1    1
2    2    2
3    3    3
4    4    4
5    5    5
6    6    6
7    7    7
8    8    8
9    9    9
10   a   10
11   b   11
12   c   12
13   d   13
14   e   14
15   f   15

In [5]: df.loc[df.dec == 12, 'hex']
Out[5]:
12    c
Name: hex, dtype: object

In [6]: df.loc[df.dec == 12, 'hex'].values[0]
Out[6]: 'c'

In [7]: import random

In [8]: eight = random.sample(range(16), 8)

In [9]: eight
Out[9]: [9, 7, 1, 6, 11, 12, 14, 10]

In [10]: fun = lambda x: df.loc[df.dec == x, 'hex'].values[0]

In [11]: ''.join(fun(i) for i in eight)
Out[11]: '9716bcea'

In [12]: ''.join(map(fun, eight))
Out[12]: '9716bcea'

In [13]: %timeit ''.join(fun(i) for i in eight)
2.34 ms ± 136 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [14]: %timeit ''.join(map(fun, eight))
2.34 ms ± 134 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So what is a vectorized way to achieve the same result as the method demonstrated in the code?

Answer 1

A vectorized way would be to construct a Series:

series = df.set_index('dec')['hex']
''.join(series[eight])

Output: '9716bcea'
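
For reference, here is a self-contained sketch of the same idea that can be run outside the IPython session. It fixes the sample list to the values shown in the question and assumes the values in 'dec' are unique, so that each label maps to exactly one row. The reindex variant at the end is an addition for illustration, not part of the original answer: it shows one way to tolerate keys that are missing from the column.

```python
import pandas as pd

digits = '0123456789abcdef'
df = pd.DataFrame({'hex': list(digits), 'dec': range(16)})

# Fixed sample taken from the question (originally drawn with random.sample).
eight = [9, 7, 1, 6, 11, 12, 14, 10]

# Build the lookup once: 'dec' becomes the index, 'hex' the values.
# Assumption: 'dec' holds unique values.
series = df.set_index('dec')['hex']

# Label-based selection with a list returns the rows in the order of the list.
result = ''.join(series.loc[eight])
print(result)

# Illustrative variant: reindex yields NaN for keys absent from the index
# instead of raising KeyError (99 is a made-up missing key).
with_missing = series.reindex([9, 99, 12])
print(with_missing.isna().tolist())
```

The lookup Series is built once, so each query is a single indexed selection rather than one boolean scan of the column per element.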

Answer 1: accepted, 2021-10-06 04:07:57
