
Pandas dataframe max and min value

I have a pandas dataframe that looks like the following:

+-----+---+---+
|     | A | B |
+-----+---+---+
| 288 | 1 | 4 |
+-----+---+---+
| 245 | 2 | 3 |
+-----+---+---+
| 543 | 3 | 6 |
+-----+---+---+
| 867 | 1 | 9 |
+-----+---+---+
| 345 | 2 | 7 |
+-----+---+---+
| 122 | 3 | 8 |
+-----+---+---+
| 233 | 1 | 1 |
+-----+---+---+
| 346 | 2 | 6 |
+-----+---+---+
| 765 | 3 | 3 |
+-----+---+---+

What I want to do is get the max and min values from column 'B' for each run of rows in which column 'A' goes from 1 to 3.

For example:

loop on A in range 1 to 3:
       get max and min values from column 'B'
       max = 6
       min = 3
loop on the next range of A from 1 to 3:
       get max and min values from column 'B'
       max = 9
       min = 7           
loop on the next range of A from 1 to 3:
       get max and min values from column 'B'
       max = 6
       min = 1

and add the min and max values as new columns, like:

+-----+---+---+-----+-----+
|     | A | B | min | max |
+-----+---+---+-----+-----+
| 288 | 1 | 4 | 3   | 6   |
+-----+---+---+-----+-----+
| 245 | 2 | 3 |     |     |
+-----+---+---+-----+-----+
| 543 | 3 | 6 |     |     |
+-----+---+---+-----+-----+
| 867 | 1 | 9 | 7   | 9   |
+-----+---+---+-----+-----+
| 345 | 2 | 7 |     |     |
+-----+---+---+-----+-----+
| 122 | 3 | 8 |     |     |
+-----+---+---+-----+-----+
| 233 | 1 | 1 | 1   | 6   |
+-----+---+---+-----+-----+
| 346 | 2 | 6 |     |     |
+-----+---+---+-----+-----+
| 765 | 3 | 3 |     |     |
+-----+---+---+-----+-----+

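If you don't need the empty values:

import numpy as np

# Integer-divide each row position by 3 so every block of three rows shares one label
g = df.groupby(np.arange(len(df.index)) // 3)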
df['min'] = g.B.transform('min')
df['max'] = g.B.transform('max')
print (df)
     A  B  min  max
288  1  4    3    6
245  2  3    3    6
543  3  6    3    6
867  1  9    7    9
345  2  7    7    9
122  3  8    7    9
233  1  1    1    6
346  2  6    1    6
765  3  3    1    6
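
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question's table and assumes, as the snippet above does, that every block is exactly three rows long. The name `block_id` is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3],
     "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]},
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)

# Row positions 0-2 get label 0, positions 3-5 get label 1, and so on.
# (block_id is an illustrative name, not part of the original answer.)
block_id = np.arange(len(df)) // 3

g = df.groupby(block_id)
# transform broadcasts each block's result back onto every row of that block.
df["min"] = g["B"].transform("min")
df["max"] = g["B"].transform("max")
print(df)
```

Because the grouping key is built from row positions, the index labels (288, 245, ...) play no role here.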

For empty values, it is possible to assign empty strings, but then the min and max columns stop being numeric, because they now hold strings as well:

g = df.groupby(np.arange(len(df.index)) // 3)
df['min'] = g.B.transform('min')
df['max'] = g.B.transform('max')
df.loc[df.A != 1, ['min','max']] = ''
print (df)
     A  B min max
288  1  4   3   6
245  2  3        
543  3  6        
867  1  9   7   9
345  2  7        
122  3  8        
233  1  1   1   6
346  2  6        
765  3  3    
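
If the columns should stay numeric, one possible alternative (not part of the original answer) is to mask the unwanted rows with NaN instead of empty strings. The sketch below assumes the same three-row blocks and uses `Series.where`, which keeps values where the condition is True and fills NaN elsewhere. The name `first_in_block` is illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3],
     "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]},
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)

g = df.groupby(np.arange(len(df)) // 3)

# True only on the first row of each block (where A == 1).
first_in_block = df["A"].eq(1)

# Keep the block min/max on the first row, NaN on the others.
df["min"] = g["B"].transform("min").where(first_in_block)
df["max"] = g["B"].transform("max").where(first_in_block)
print(df)
```

The trade-off is that the columns become float64 to accommodate NaN, so the values print as 3.0 rather than 3, but arithmetic and aggregation still work on them.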

EDIT1:

df['range']='range' + pd.Series(np.arange(len(df.index))//3 + 1, index=df.index).astype(str) 
g = df.groupby('range')
df['min'] = g.B.transform('min')
df['max'] = g.B.transform('max')
print (df)
     A  B   range  min  max
288  1  4  range1    3    6
245  2  3  range1    3    6
543  3  6  range1    3    6
867  1  9  range2    7    9
345  2  7  range2    7    9
122  3  8  range2    7    9
233  1  1  range3    1    6
346  2  6  range3    1    6
765  3  3  range3    1    6

Another solution with cumsum of boolean mask:

df['range'] = 'range' + (df.A == 1).cumsum().astype(str)
g = df.groupby('range')
df['min'] = g.B.transform('min')
df['max'] = g.B.transform('max')
print (df)
     A  B   range  min  max
288  1  4  range1    3    6
245  2  3  range1    3    6
543  3  6  range1    3    6
867  1  9  range2    7    9
345  2  7  range2    7    9
122  3  8  range2    7    9
233  1  1  range3    1    6
346  2  6  range3    1    6
765  3  3  range3    1    6
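
The cumsum approach does not depend on blocks having a fixed length; it only assumes that each block starts with A == 1. The sketch below illustrates that on hypothetical data (not from the question) with blocks of three, two and four rows, and skips the 'range' string prefix for brevity.

```python
import pandas as pd

# Hypothetical data: blocks of unequal length, each starting with A == 1.
df = pd.DataFrame({
    "A": [1, 2, 3, 1, 2, 1, 2, 3, 4],
    "B": [4, 3, 6, 9, 7, 1, 6, 3, 8],
})

# Each True (A == 1) bumps the running total, so all rows up to the
# next A == 1 share the same block number.
block_id = df["A"].eq(1).cumsum()

g = df.groupby(block_id)
df["min"] = g["B"].transform("min")
df["max"] = g["B"].transform("max")
print(df)
```

With fixed-size positional chunks (`// 3`), the second and third blocks here would be split in the wrong places, which is the main reason to prefer the boolean mask when block lengths can vary.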

General solution, grouping rows by how many times their value of A has already occurred:

g = df.groupby(df.groupby('A').cumcount())
df['min'] = g.B.transform('min')
df['max'] = g.B.transform('max')
print (df)
     A  B  min  max
288  1  4    3    6
245  2  3    3    6
543  3  6    3    6
867  1  9    7    9
345  2  7    7    9
122  3  8    7    9
233  1  1    1    6
346  2  6    1    6
765  3  3    1    6
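
To see why the cumcount key works, it can help to print it. The sketch below uses hypothetical data in which the values of A are shuffled inside each block; it assumes every value of A appears exactly once per block, which is what makes the occurrence number a valid block label.

```python
import pandas as pd

# Hypothetical data: same B values as the question, but A is shuffled
# within each block of three rows.
df = pd.DataFrame({
    "A": [2, 1, 3, 1, 3, 2, 3, 2, 1],
    "B": [4, 3, 6, 9, 7, 8, 1, 6, 3],
})

# For each row, count how many earlier rows have the same value of A.
# The first occurrence of every value gets 0, the second gets 1, and so on.
occurrence = df.groupby("A").cumcount()
print(occurrence.tolist())

g = df.groupby(occurrence)
df["min"] = g["B"].transform("min")
df["max"] = g["B"].transform("max")
print(df)
```

Unlike the `A == 1` mask, this key does not care which value opens a block. If some value of A is missing from a block or repeated within it, however, the occurrence numbers drift out of step and the labels no longer line up with the blocks.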
