
Python: pandas: match row value to column name/key's value

Problem
"How to match a row value to a column name and take that intersecting value in pandas"

Context
We have a pandas df like this:

import pandas as pd
df = pd.DataFrame([{'name': 'john', 'john': 1, 'mac': 10}, {'name': 'mac', 'john': 2, 'mac': 20}], columns=["name", "john", "mac"])

It looks like this:

name | john | mac
john |  1   | 10
mac  |  2   | 20


Desired output

name | john | mac  | value
john |  1   | 10   | 1
mac  |  2   | 20   | 20

In words, the column "value" should take the number from the column whose label matches that row's name.

So, if name == 'john', then take the value from column 'john'.
Likewise, if name == 'mac', then take the value from column 'mac'.

Tried so far
A bunch of lambdas (none successful).

Specifications
Python: 3.5.2
Pandas: 0.18.1

You could use DataFrame.lookup, which accepts the row and column labels to use:

In [66]: df
Out[66]: 
   name  john  mac
0  john     1   10
1   mac     2   20

In [67]: df["value"] = df.lookup(df.index, df.name)

In [68]: df
Out[68]: 
   name  john  mac  value
0  john     1   10      1
1   mac     2   20     20

Note that this will have problems with duplicate row labels (which can be trivially worked around by adding a reset_index). It should be faster than calling apply, which can be pretty slow, but if your frames aren't too large, both should work well enough.
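For readers on newer pandas: DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0, so the call above only works on older versions such as the 0.18.1 mentioned in the question. One possible replacement, offered here as a sketch rather than the canonical method, is to translate each name into a column position and index the underlying NumPy array. The helper names values and col_idx are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [{"name": "john", "john": 1, "mac": 10},
     {"name": "mac", "john": 2, "mac": 20}],
    columns=["name", "john", "mac"],
)

# Work only on the numeric columns so the resulting array keeps an integer dtype.
values = df.drop(columns="name")

# Position of the matching column for each row.
# get_indexer returns -1 for a name that has no column, which would silently
# select the last column below, so this sketch assumes every name has a match.
col_idx = values.columns.get_indexer(df["name"])

# Pick one cell per row: row i, column col_idx[i].
df["value"] = values.to_numpy()[np.arange(len(df)), col_idx]

print(df)
```

Because the rows are selected by position, this variant is not affected by duplicate row labels.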

Well, IMO a lambda is the way to go, and you can make it very short, such as:

df = pd.DataFrame([{'name': 'john', 'john': 5, 'mac': 10}, {'name': 'mac', 'john': 10, 'mac': 15}], columns=["name", "john", "mac"])
df = df.set_index('name')
df
Out[64]: 
      john  mac
name           
john     5   10
mac     10   15

df['values'] = df.apply(lambda x: x[x.name], axis=1)
In[68]: df
Out[68]: 
      john  mac  values
name                   
john     5   10       5
mac     10   15      15

I set the index to name for convenience, but you could do it without that, like this:

df = pd.DataFrame([{'name': 'john', 'john': 5, 'mac': 10}, {'name': 'mac', 'john': 10, 'mac': 15}], columns=["name", "john", "mac"])
df['values'] = df.apply(lambda x: x[x['name']], axis=1)
df
Out[71]: 
   name  john  mac  values
0  john     5   10       5
1   mac    10   15      15
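One caveat with the lambda above: x[x['name']] raises a KeyError when a row's name has no matching column. If that can happen in your data, a small variation using Series.get with a default may help. This is only a sketch; the extra 'zoe' row is a made-up example, and NaN as the fallback is an assumption about what you want for unmatched names.

```python
import pandas as pd

df = pd.DataFrame(
    [{"name": "john", "john": 5, "mac": 10},
     {"name": "mac", "john": 10, "mac": 15},
     {"name": "zoe", "john": 7, "mac": 8}],  # hypothetical row with no 'zoe' column
    columns=["name", "john", "mac"],
)

# Series.get returns the default instead of raising when the label is missing.
df["values"] = df.apply(lambda row: row.get(row["name"], float("nan")), axis=1)

print(df)
```

Note that introducing NaN makes the new column floating point rather than integer.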
