Pandas get the row value by column name in an apply function in Python?
I have the following dataframe:
>>> import pandas as pd
>>> d = {'route1': ['a', 'b'], 'route2': ['c', 'd'], 'val1': [1, 2]}
>>> df = pd.DataFrame(data=d)
>>> df
  route1 route2  val1
0      a      c     1
1      b      d     2
What I am trying to do is pass a list of column names and print the row values associated with those columns:
>>> def row_eval(row, cols):
...     print(row.loc[cols])
In the dataframe above, I first find all the columns whose names contain "route" and then apply the row_eval function to each row. However, I get the following error:
>>> route_cols = [col for col in df.columns if 'route' in col]
>>> route_cols
['route1', 'route2']
>>> df.apply(lambda row: row_eval(row, route_cols))
KeyError: "None of [Index(['route1', 'route2'], dtype='object')] are in the [index]"
Result should look like this:
route1 a
route2 c
route1 b
route2 d
Add axis=1 to the apply function:
df.apply(lambda row: row_eval(row, route_cols), axis=1)
Without axis=1, apply works column-wise: it passes each column to your function, so the Series it receives is indexed by the row labels 0 and 1 instead of the column names route1 and route2 that you want to match. Hence the error message. What you want is a row-wise operation (i.e. passing each row to the function), so you need axis=1.
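To make the difference concrete, here is a minimal sketch that rebuilds the question's dataframe and records which labels the function sees under each axis setting. The names by_column, by_row and picked are illustrative only and not part of the original posts.

```python
import pandas as pd

df = pd.DataFrame({'route1': ['a', 'b'], 'route2': ['c', 'd'], 'val1': [1, 2]})
route_cols = [col for col in df.columns if 'route' in col]

# Default axis=0: the function receives one column at a time,
# so the Series it sees is indexed by the row labels (0 and 1).
by_column = df.apply(lambda s: ','.join(map(str, s.index)))

# axis=1: the function receives one row at a time,
# so the Series it sees is indexed by the column names.
by_row = df.apply(lambda s: ','.join(map(str, s.index)), axis=1)

# With axis=1 the column names exist in the row's index,
# so selecting them with .loc no longer raises KeyError.
picked = df.apply(lambda row: row.loc[route_cols], axis=1)

print(by_column)
print(by_row)
print(picked)
```

Only with axis=1 do route1 and route2 appear among the labels, which is why the .loc lookup in row_eval succeeds in that case.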
To get you started, you can use either .melt() or .stack() to get really close to your expected output. I'm not 100% sure whether you're looking for 2 dataframes or not.
df[route_cols].stack()
0  route1    a
   route2    c
1  route1    b
   route2    d
dtype: object
df[route_cols].melt()
  variable value
0   route1     a
1   route1     b
2   route2     c
3   route2     d
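If the goal is to print exactly the column-name/value pairs from the question, one possible follow-up is to iterate over the stacked Series. This is only a sketch: the variable names stacked and lines are made up for illustration, and it assumes the dataframe from the question.

```python
import pandas as pd

df = pd.DataFrame({'route1': ['a', 'b'], 'route2': ['c', 'd'], 'val1': [1, 2]})
route_cols = [col for col in df.columns if 'route' in col]

# stack() yields a Series whose MultiIndex pairs each row label
# with a column name, ordered row by row.
stacked = df[route_cols].stack()

# Unpack each (row label, column name) key and keep the column name and value.
lines = [f'{col_name} {value}' for (row_label, col_name), value in stacked.items()]

print('\n'.join(lines))
```

Because stack() orders the values row by row, the printed lines follow the order shown in the question, whereas melt() groups them by column.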
One way to print all of the values in these columns is to loop over all of the rows and, within that loop, iterate over the selected columns, printing each column name together with its value. The if statement is optional, but it will give a line break between rows, like in your example.
for idx in df.index:
    for column in route_cols:
        print(f'{column} {df.loc[idx, column]}')
        if column == route_cols[-1]:
            print('\n')