Python: pandas: match row value to column name/ key's value
Problem
"How to match a row value to a column name and take that intersecting value in pandas"
Context
We have a pandas df like this:
df = pd.DataFrame([{'name': 'john', 'john': 1, 'mac': 10}, {'name': 'mac', 'john': 2, 'mac': 20}], columns=["name", "john", "mac"])
Looking like this:
name | john | mac
john | 1 | 10
mac | 2 | 20
Desired output
name | john | mac | value
john | 1 | 10 | 1
mac | 2 | 20 | 20
In words, the column "value" should take the number from the column whose name matches the row's name.
So, if name == 'john', then take the value from column 'john'.
So, if name == 'mac', then take the value from column 'mac'.
Tried so far
Bunch of lambdas (none successful).
Specifications
Python: 3.5.2
Pandas: 0.18.1
You could use DataFrame.lookup, which accepts the row and column labels to use:
In [66]: df
Out[66]:
name john mac
0 john 1 10
1 mac 2 20
In [67]: df["value"] = df.lookup(df.index, df.name)
In [68]: df
Out[68]:
name john mac value
0 john 1 10 1
1 mac 2 20 20
Note that this will have problems with duplicate row labels (which could be trivially worked around by adding a reset_index). It should be faster than calling apply, which can be pretty slow, but if your frames aren't too large, both should work well enough.
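For readers on newer pandas: DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0, so the call above will not run there. A minimal sketch of the same label-based pick using NumPy positional indexing follows. The value_cols list is an assumption for illustration (it names the columns that hold the numbers), and this is one possible replacement, not the answer author's method.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [{"name": "john", "john": 1, "mac": 10},
     {"name": "mac", "john": 2, "mac": 20}],
    columns=["name", "john", "mac"],
)

# Assumption: these are the columns holding the numbers to pick from.
value_cols = ["john", "mac"]

# Position of each row's name within value_cols (-1 if the name is not there).
col_idx = pd.Index(value_cols).get_indexer(df["name"])
row_idx = np.arange(len(df))

# Pick one cell per row by (row position, column position).
df["value"] = df[value_cols].to_numpy()[row_idx, col_idx]
print(df)
```

Because the rows are addressed by position rather than by label, duplicate row labels are not an issue here. A name without a matching column yields -1 from get_indexer, which NumPy would treat as "last column", so such rows need to be checked for separately.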
Well, IMO a lambda is the way to go, but you can make it very short, such as:
df = pd.DataFrame([{'name': 'john', 'john': 5, 'mac': 10}, {'name': 'mac', 'john': 10, 'mac': 15}], columns=["name", "john", "mac"])
df = df.set_index('name')
df
Out[64]:
john mac
name
john 5 10
mac 10 15
df['values'] = df.apply(lambda x: x[x.name], axis=1)
In[68]: df
Out[68]:
john mac values
name
john 5 10 5
mac 10 15 15
I set the index to name for convenience, but you could do it without that, such as:
df = pd.DataFrame([{'name': 'john', 'john': 5, 'mac': 10}, {'name': 'mac', 'john': 10, 'mac': 15}], columns=["name", "john", "mac"])
df['values'] = df.apply(lambda x: x[x['name']], axis=1)
df
Out[71]:
name john mac values
0 john 5 10 5
1 mac 10 15 15
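If some names might not have a matching column, the lambda above would raise a KeyError. A hedged sketch of one way around that, using Series.get with a default: the 'bob' row is hypothetical and is added only to show the fallback.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [{"name": "john", "john": 5, "mac": 10},
     {"name": "mac", "john": 10, "mac": 15},
     {"name": "bob", "john": 0, "mac": 0}],  # hypothetical row: no 'bob' column
    columns=["name", "john", "mac"],
)

# Series.get returns the default instead of raising when the label is missing.
df["values"] = df.apply(lambda row: row.get(row["name"], np.nan), axis=1)
print(df)
```

The 'bob' row ends up with NaN in the values column, while the other rows are filled as before.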