
How to get maximum value from dataframe with numeric indexes and headings

I have the following dataframe:

df:

Unnamed: 0          0          1
0          0.0   0.000000        NaN
1          1.0   2.236068   0.000000
2          2.0   3.000000   2.236068
3          NaN   0.000000   1.000000
4          0.0   0.000000        NaN
5          1.0   1.414214   0.000000
6          2.0   2.828427   1.414214
7          NaN   0.000000   1.000000
8          0.0   0.000000        NaN
9          1.0   3.162278   0.000000
10         2.0  11.401754   3.162278
11         NaN   0.000000   1.000000
12         0.0   0.000000        NaN
13         1.0  14.142136   0.000000
14         2.0   2.828427  14.142136

I'm trying to get the maximum value from each set of data. The problem is that I generated this dataframe through several operations, and in the end the indexes and headings are numbers, so I cannot use groupby or loc. What I need is something like the following:

df1
        0
 1   3.000000
 2   2.828427
 3  11.401754
 4  14.142136

You absolutely can use loc! The problem is that you aren't paying attention to whether df.columns are integers or strings. Since you're having issues, I'm guessing strings.
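A quick way to check which case applies, sketched on two small hypothetical frames (`df_int`, `df_str`, and the helper `has_string_labels` are illustrative names, not from the question):

```python
import pandas as pd

# Hypothetical frames: same data, integer labels vs. string labels.
df_int = pd.DataFrame({0: [0.0, 2.236068], 1: [float("nan"), 0.0]})
df_str = pd.DataFrame({"0": [0.0, 2.236068], "1": [float("nan"), 0.0]})


def has_string_labels(frame):
    """Return True if every column label is a Python str."""
    return all(isinstance(label, str) for label in frame.columns)


print(has_string_labels(df_int))  # integer labels
print(has_string_labels(df_str))  # string labels

# Select with a label of the matching type.
col_int = df_int.loc[:, 0]
col_str = df_str.loc[:, "0"]
```

A frame read back from a CSV file typically ends up with string labels such as `'0'`, which is why `df[0]` can fail while `df['0']` works.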

However, what you're trying to do is not at all clear.

IIUC:

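m = df['0'] == 0                   # True on the zero rows that separate the groups
g = m.cumsum()[~m]                 # running count of zero rows, kept for non-zero rows only
df.loc[~m, '0'].groupby(g).max()   # max of column '0' within each run of non-zero rows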
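A self-contained sketch of this mask approach, assuming string column labels and using only the first eight rows of the question's frame. Because each block contains two zero rows in column `'0'`, the raw group keys are not consecutive, so the last step (an addition, not part of the answer) renumbers them from 1:

```python
import pandas as pd

nan = float("nan")

# First two blocks of the question's data, with string column labels (assumed).
df = pd.DataFrame({
    "Unnamed: 0": [0.0, 1.0, 2.0, nan, 0.0, 1.0, 2.0, nan],
    "0": [0.0, 2.236068, 3.0, 0.0, 0.0, 1.414214, 2.828427, 0.0],
    "1": [nan, 0.0, 2.236068, 1.0, nan, 0.0, 1.414214, 1.0],
})

m = df["0"] == 0              # zero rows act as separators
g = m.cumsum()[~m]            # group key for each non-zero row
maxes = df.loc[~m, "0"].groupby(g).max()

raw_keys = list(maxes.index)  # keys produced by the cumulative sum

# Renumber the groups 1..n to match the layout asked for in the question.
maxes.index = range(1, len(maxes) + 1)
print(maxes)
```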

Use .iloc and cumsum:

df.groupby((~df.iloc[:,0].astype(bool)).cumsum()).max()

Output:

            Unnamed: 0          0          1
Unnamed: 0                                  
1                  2.0   3.000000   2.236068
2                  2.0   2.828427   1.414214
3                  2.0  11.401754   3.162278
4                  2.0  14.142136  14.142136

To get just the maxes for the column at position 1:

df.groupby((~df.iloc[:,0].astype(bool)).cumsum()).max().iloc[:,1]

Output:

Unnamed: 0
1     3.000000
2     2.828427
3    11.401754
4    14.142136
Name: 0, dtype: float64
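The one-liner works because `astype(bool)` turns both the non-zero counter values and NaN into True, so only the rows where the first column is 0 start a new group. A more explicit sketch of the same idea, assuming the first column is labelled `'Unnamed: 0'` and restarts at 0 for every block (the name `group` is illustrative):

```python
import pandas as pd

nan = float("nan")

# First two blocks of the question's data, with string column labels (assumed).
df = pd.DataFrame({
    "Unnamed: 0": [0.0, 1.0, 2.0, nan, 0.0, 1.0, 2.0, nan],
    "0": [0.0, 2.236068, 3.0, 0.0, 0.0, 1.414214, 2.828427, 0.0],
    "1": [nan, 0.0, 2.236068, 1.0, nan, 0.0, 1.414214, 1.0],
})

# A new group starts wherever the counter column is exactly 0.
group = (df["Unnamed: 0"] == 0).cumsum().rename("group")

# Max of column '0' per group; NaN values are skipped by default.
maxes = df.groupby(group)["0"].max()
print(maxes)
```

Comparing against 0 directly avoids depending on how NaN converts to a boolean, at the cost of naming the column explicitly.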

file.csv:

0,1,2,3
9,6,7,
0,,,
5,6,2

Try:

import pandas as pd

df = pd.read_csv('file.csv', header=None)  # header=None: the file has no header row (header=-1 is rejected by current pandas)
# keep only max per row
print(df.max(axis=1))

Output:

0    3.0
1    9.0
2    0.0
3    6.0
dtype: float64
