
Pandas dataframe: in each row, find the max in selected columns, and find the value of another column based on that

I have a dataframe like this:

import pandas as pd
df = pd.DataFrame({'x1':[20,25],'y1':[5,8],'x2':[22,27],'y2':[10,2]})

df

       x1  y1  x2  y2
    0  20   5  22  10
    1  25   8  27   2

X and Y pair together. I need to compare y1 and y2 and get the max in every row, then find the corresponding x. Hence the max of row [0] is y2 (=10), and the corresponding x is x2 (=22). For the second row it is y1 (=8) and x1 (=25). Expected result, with new columns x and y:

   x1  y1  x2  y2   x   y
0  20   5  22  10  22  10
1  25   8  27   2  25   8

This is a simple dataframe I made to illustrate the question. In my case, there can be 30 X and Y pairs.

import numpy as np

# select the "y*" columns
y_cols = df.filter(like="y")

# get the suffix of each row's maximal y column, then prepend "x" to it
max_x_vals = y_cols.idxmax(axis=1).str.extract(r"(\d+)$", expand=False).radd("x")
# get the locations of those x* values
max_x_ids = df.columns.get_indexer(max_x_vals)

# now we have the indexes of x*'s in the columns; NumPy's indexing
# helps to get a cross section
df["max_xs"] = df.to_numpy()[np.arange(len(df)), max_x_ids]

# for y*'s, it's directly the maximum per row
df["max_ys"] = y_cols.max(axis=1)

to get

>>> df

   x1  y1  x2  y2  max_xs  max_ys
0  20   5  22  10      22      10
1  25   8  27   2      25       8
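Since the question mentions 30 pairs, the same idea can also be sketched with plain NumPy indexing. This is only a sketch under an assumption that is not stated in the thread: every pair is named x<k> / y<k> with a matching numeric suffix. The names suffixes, pos and result are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x1': [20, 25], 'y1': [5, 8], 'x2': [22, 27], 'y2': [10, 2]})

# Assumption: pairs are named x<k> / y<k> with the same numeric suffix k.
suffixes = sorted(int(c[1:]) for c in df.columns if c.startswith("y"))
x = df[[f"x{k}" for k in suffixes]].to_numpy()
y = df[[f"y{k}" for k in suffixes]].to_numpy()

# Column position of the largest y in each row (the first one wins on ties).
pos = y.argmax(axis=1)[:, None]

# Pick the x and y sitting at that position, row by row.
result = df.assign(
    x=np.take_along_axis(x, pos, axis=1).ravel(),
    y=np.take_along_axis(y, pos, axis=1).ravel(),
)
print(result)
```

Building the x and y blocks from the same sorted suffix list keeps the pairing correct even if the columns are not interleaved in the dataframe.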

You can do it with the help of the .apply function.

import pandas as pd
import numpy as np

df = pd.DataFrame({'x1':[20,25],'y1':[5,8],'x2':[22,27],'y2':[10,2]})
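# x* and y* columns; pairs are matched by position, so both lists must be in the same order
y_cols = [col for col in df.columns if col.startswith('y')]
x_cols = [col for col in df.columns if col.startswith('x')]

def find_corresponding_x(row):
    # position of the largest y within this row
    max_y_index = np.argmax(row[y_cols].to_numpy())
    return row[x_cols[max_y_index]]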

df['corresponding_x'] = df.apply(find_corresponding_x, axis = 1)

This is one solution:

a = df[df['y1'] < df['y2']].drop(columns=['y1','x1']).rename(columns={'y2':'y', 'x2':'x'})
b = df[df['y1'] >= df['y2']].drop(columns=['y2','x2']).rename(columns={'y1':'y', 'x1':'x'})

result = pd.concat([a,b])

If you need to keep the original order, you can sort by the index after concatenation (for example result.sort_index()), since each row keeps its original index label.
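For the two-pair case, a different technique that never splits the frame is np.where, which chooses per row between the two candidates and so leaves the row order untouched. This is a minimal sketch rather than the answerer's method; the names first_wins and out are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x1': [20, 25], 'y1': [5, 8], 'x2': [22, 27], 'y2': [10, 2]})

# True where the first pair has the larger (or equal) y.
first_wins = df['y1'] >= df['y2']

# Take x and y from pair 1 where it wins, otherwise from pair 2.
out = df.assign(
    x=np.where(first_wins, df['x1'], df['x2']),
    y=np.where(first_wins, df['y1'], df['y2']),
)
print(out)
```

This only covers exactly two pairs; with more pairs the comparison would have to be chained or replaced by an argmax-based approach.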

You can use the function below. Remember to import pandas and numpy as I did in this code. Load your data set and call the Max_number function.

import pandas as pd
import numpy as np
df = pd.DataFrame({'x1':[20,25],'y1':[5,8],'x2':[22,27],'y2':[10,2]})

def Max_number (df):
    columns = list(df.columns)
    rows = df.shape[0]
    max_value = []
    column_name = []

    for i in range(rows):
        row_array = list(np.array(df[i:i+1])[0])
        maximum = max(row_array)
        max_value.append(maximum)
        index=row_array.index(maximum)
        column_name.append(columns[index])
    
    return pd.DataFrame({"column":column_name,"max_value":max_value})

Calling Max_number(df) returns this (note that the maximum is taken over every column in the row, not only the y columns):

  column  max_value
0     x2         22
1     x2         27
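To match the question, the search would need to be limited to the y columns and then mapped back to the paired x. A possible label-based sketch follows, assuming the pairs share a suffix (y2 pairs with x2); the names best_y, best_x and paired are illustrative and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({'x1': [20, 25], 'y1': [5, 8], 'x2': [22, 27], 'y2': [10, 2]})

# Only the y columns take part in the comparison.
y_cols = [c for c in df.columns if c.startswith("y")]

# Label of the largest y per row, e.g. "y2" for row 0.
best_y = df[y_cols].idxmax(axis=1)

# Assumption: the paired x column has the same suffix, so "y2" -> "x2".
best_x = "x" + best_y.str[1:]

paired = pd.DataFrame({
    "column": best_y,
    "y": df[y_cols].max(axis=1),
    # Look up each row's paired x value by (row label, column label).
    "x": [df.at[i, xc] for i, xc in best_x.items()],
})
print(paired)
```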

If the x1 column comes first, then y1, then x2, y2 and so on, you can just try:

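# assumes numpy is imported as np; y_cols holds the "y*" columns as a DataFrame
y_cols = df.filter(like="y")
a = df.columns.get_indexer(y_cols.idxmax(axis=1))
df[['y', 'x']] = df.to_numpy()[np.arange(len(df)), [a, a - 1]].T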

I hope it works for your solution:

import pandas as pd
df = pd.DataFrame({'x1':[20,25],'y1':[5,8],'x2':[22,27],'y2':[10,2]})
df['x_max'] = df[['x1', 'x2']].max(axis=1)
df['y_max'] = df[['y1', 'y2']].max(axis=1)
df

