How to get maximum value from dataframe with numeric indexes and headings
I have the following dataframe:
df:
Unnamed: 0 0 1
0 0.0 0.000000 NaN
1 1.0 2.236068 0.000000
2 2.0 3.000000 2.236068
3 NaN 0.000000 1.000000
4 0.0 0.000000 NaN
5 1.0 1.414214 0.000000
6 2.0 2.828427 1.414214
7 NaN 0.000000 1.000000
8 0.0 0.000000 NaN
9 1.0 3.162278 0.000000
10 2.0 11.401754 3.162278
11 NaN 0.000000 1.000000
12 0.0 0.000000 NaN
13 1.0 14.142136 0.000000
14 2.0 2.828427 14.142136
I'm trying to get the maximum value from each set of data. The problem is that I generated this dataframe through several operations, and in the end the index and column headings are numbers, so I cannot use groupby or loc.
What I need is something like the following:
df1
0
1 3.000000
2 2.828427
3 11.401754
4 14.142136
You absolutely can use loc! The problem is that you aren't paying attention to whether the labels in df.columns are integers or strings. Since you're having issues, I'm guessing strings.
However, what you're trying to do is not at all clear.
IIUC:
m = df['0'] == 0
g = m.cumsum()[~m]
df.loc[~m, '0'].groupby(g).max()
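For anyone who wants to run this, here is a minimal sketch that rebuilds the question's data by hand. It assumes the column labels are strings, as guessed above; the names `result` and `renumbered` are only illustrative. It first prints the type of each column label, then applies the same mask and cumsum idea: each zero in column '0' starts a new block, and the zero rows themselves are left out of the maximum.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Data copied from the question; string column labels are an assumption.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, nan, 0, 1, 2, nan, 0, 1, 2, nan, 0, 1, 2],
    "0": [0.0, 2.236068, 3.0, 0.0, 0.0, 1.414214, 2.828427, 0.0,
          0.0, 3.162278, 11.401754, 0.0, 0.0, 14.142136, 2.828427],
    "1": [nan, 0.0, 2.236068, 1.0, nan, 0.0, 1.414214, 1.0,
          nan, 0.0, 3.162278, 1.0, nan, 0.0, 14.142136],
})

# Check whether the labels are str or int before choosing df['0'] or df[0].
print([type(c).__name__ for c in df.columns])

m = df["0"] == 0                 # True on the rows that separate blocks
g = m.cumsum()[~m]               # running block id, kept only for non-zero rows
result = df.loc[~m, "0"].groupby(g).max()

# The block ids come out as 1, 3, 5, 7; renumber them to 1..4 if needed.
renumbered = result.copy()
renumbered.index = range(1, len(renumbered) + 1)
print(renumbered)
```

If the labels turn out to be integers, df[0] would be used in place of df['0'].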
Use .iloc and cumsum:
df.groupby((~df.iloc[:,0].astype(bool)).cumsum()).max()
Output:
Unnamed: 0 0 1
Unnamed: 0
1 2.0 3.000000 2.236068
2 2.0 2.828427 1.414214
3 2.0 11.401754 3.162278
4 2.0 14.142136 14.142136
To just get the maxes for column index 1:
df.groupby((~df.iloc[:,0].astype(bool)).cumsum()).max().iloc[:,1]
Output:
Unnamed: 0
1 3.000000
2 2.828427
3 11.401754
4 14.142136
Name: 0, dtype: float64
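To see what the grouping key looks like, here is a rough sketch on hand-built data. It deliberately mixes a string label with integer labels (an assumption, to show that positional access does not care), and it renames the key to a hypothetical "block" so it does not share a name with a column. `eq(0).cumsum()` is used as a more explicit spelling of the `~astype(bool)` trick: both start a new block wherever the first column is 0, and NaN does not start one.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Data copied from the question; the mixed str/int labels are an assumption.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, nan, 0, 1, 2, nan, 0, 1, 2, nan, 0, 1, 2],
    0: [0.0, 2.236068, 3.0, 0.0, 0.0, 1.414214, 2.828427, 0.0,
        0.0, 3.162278, 11.401754, 0.0, 0.0, 14.142136, 2.828427],
    1: [nan, 0.0, 2.236068, 1.0, nan, 0.0, 1.414214, 1.0,
        nan, 0.0, 3.162278, 1.0, nan, 0.0, 14.142136],
})

first = df.iloc[:, 0]                      # first column, selected by position
key = first.eq(0).cumsum().rename("block") # 1,1,1,1,2,2,2,2,3,...
orig_key = (~first.astype(bool)).cumsum()  # expression used in the answer

# Group by block id, take the max per column, then pick the second column by position.
maxes = df.groupby(key).max().iloc[:, 1]
print(key.tolist())
print(maxes)
```

Because everything goes through iloc, this works the same whether the column labels are integers or strings.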
file.csv:
0,1,2,3
9,6,7,
0,,,
5,6,2
Try:
import pandas as pd
df = pd.read_csv('file.csv', header=None)  # header=None: the file has no header row
# keep only max per row
print(df.max(axis=1))
Output:
0 3.0
1 9.0
2 0.0
3 6.0
dtype: float64
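The same thing can be tried without a file on disk. This is only a sketch: the CSV text is pasted into a string (`csv_text` is an illustrative name) and read through io.StringIO. Note that it gives one maximum per row, not one per block of rows as in the question.

```python
import io

import pandas as pd

# Same content as file.csv above, held in memory instead of on disk.
csv_text = "0,1,2,3\n9,6,7,\n0,,,\n5,6,2\n"

# header=None tells pandas the first line is data, not column names.
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Missing cells become NaN, which max() skips by default.
row_max = df.max(axis=1)
print(row_max)
```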