I have the following dataframe:
df:
Unnamed: 0 0 1
0 0.0 0.000000 NaN
1 1.0 2.236068 0.000000
2 2.0 3.000000 2.236068
3 NaN 0.000000 1.000000
4 0.0 0.000000 NaN
5 1.0 1.414214 0.000000
6 2.0 2.828427 1.414214
7 NaN 0.000000 1.000000
8 0.0 0.000000 NaN
9 1.0 3.162278 0.000000
10 2.0 11.401754 3.162278
11 NaN 0.000000 1.000000
12 0.0 0.000000 NaN
13 1.0 14.142136 0.000000
14 2.0 2.828427 14.142136
I'm trying to get the maximum value from each set of data. The problem is that I generated this dataframe from several operations, and in the end the indexes and headings are numbers, so I cannot use groupby or loc.
What I need is something as follows:
df1
0
1 3.000000
2 2.828427
3 11.401754
4 14.142136
You absolutely can use loc! The problem is that you aren't paying attention to whether df.columns are integers or strings. Since you're having issues, I'm guessing strings.
However, what you're trying to do is not at all clear.
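If I understand correctly:
# mask of the zero rows in column '0' that separate the blocks
m = df['0'] == 0
# running count of zeros, kept only for the non-zero rows, labels each block
g = m.cumsum()[~m]
# maximum of column '0' within each block
df.loc[~m, '0'].groupby(g).max()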
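The point about string versus integer column labels can be checked directly. The following is a minimal sketch, assuming a small hypothetical frame (`df_str`) whose labels are the strings '0' and '1', as often happens after a CSV round trip; the names `df_str`, `df_int`, `label_types`, `first_value` and `same_value` are illustrative and not from the original post.

```python
import pandas as pd

# Hypothetical frame with string column labels (assumption for illustration).
df_str = pd.DataFrame({
    "0": [0.0, 2.236068, 3.0],
    "1": [float("nan"), 0.0, 2.236068],
})

# Inspect the type of each column label before indexing.
label_types = [type(c).__name__ for c in df_str.columns]
print(label_types)

# String labels need a string key with .loc.
first_value = df_str.loc[1, "0"]

# Optionally convert purely numeric labels to integers so an integer key works.
df_int = df_str.rename(columns=int)
same_value = df_int.loc[1, 0]
```

If the labels turn out to be integers already, `df.loc[:, 0]` works as is; if they are strings, either index with `'0'` or convert the labels as shown.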
Use .iloc and cumsum:
df.groupby((~df.iloc[:,0].astype(bool)).cumsum()).max()
Output:
Unnamed: 0 0 1
Unnamed: 0
1 2.0 3.000000 2.236068
2 2.0 2.828427 1.414214
3 2.0 11.401754 3.162278
4 2.0 14.142136 14.142136
To just get the maxes for column index 1:
df.groupby((~df.iloc[:,0].astype(bool)).cumsum()).max().iloc[:,1]
Output:
Unnamed: 0
1 3.000000
2 2.828427
3 11.401754
4 14.142136
Name: 0, dtype: float64
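For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the frame from the question and applies the same cumsum grouping idea. It assumes the column labels are the strings 'Unnamed: 0', '0' and '1', and it uses `.eq(0)` to mark block starts, which is a variant of the `~astype(bool)` expression above rather than the exact same call; `starts`, `block_id` and `maxes` are illustrative names.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Data copied from the question; the string column labels are an assumption.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, nan, 0, 1, 2, nan, 0, 1, 2, nan, 0, 1, 2],
    "0": [0.0, 2.236068, 3.0, 0.0, 0.0, 1.414214, 2.828427, 0.0,
          0.0, 3.162278, 11.401754, 0.0, 0.0, 14.142136, 2.828427],
    "1": [nan, 0.0, 2.236068, 1.0, nan, 0.0, 1.414214, 1.0,
          nan, 0.0, 3.162278, 1.0, nan, 0.0, 14.142136],
})

# A new block starts wherever the first column restarts at 0 (NaN compares False).
starts = df.iloc[:, 0].eq(0)

# The running count of block starts gives every row its block number.
block_id = starts.cumsum().rename("block")

# Maximum of column '0' within each block.
maxes = df.groupby(block_id)["0"].max()
print(maxes)
```

Renaming the grouping key to `block` avoids having an index and a column that share the name 'Unnamed: 0', as happens in the output above.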
file.csv:
0,1,2,3
9,6,7,
0,,,
5,6,2
Try:
import pandas as pd
df = pd.read_csv('file.csv', header=None)  # header=None: the file has no header row (header=-1 is rejected by current pandas)
# keep only max per row
print(df.max(axis=1))
Output:
0 3.0
1 9.0
2 0.0
3 6.0
dtype: float64