I have a pandas dataframe that looks like the following:
+-----+---+---+
|     | A | B |
+-----+---+---+
| 288 | 1 | 4 |
+-----+---+---+
| 245 | 2 | 3 |
+-----+---+---+
| 543 | 3 | 6 |
+-----+---+---+
| 867 | 1 | 9 |
+-----+---+---+
| 345 | 2 | 7 |
+-----+---+---+
| 122 | 3 | 8 |
+-----+---+---+
| 233 | 1 | 1 |
+-----+---+---+
| 346 | 2 | 6 |
+-----+---+---+
| 765 | 3 | 3 |
+-----+---+---+
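For anyone who wants to reproduce this, the table above can be built roughly as follows. This is only a minimal sketch: the values are copied from the table, and the name `df` is simply the usual convention.

```python
import pandas as pd

# Sample data copied from the table above; the left-most numbers are the index labels.
df = pd.DataFrame(
    {
        "A": [1, 2, 3, 1, 2, 3, 1, 2, 3],
        "B": [4, 3, 6, 9, 7, 8, 1, 6, 3],
    },
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)
print(df)
```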
What I want to do is get the max and min values of column 'B' for each run of values from 1 to 3 in column 'A'.
For example:
loop on A in range 1 to 3:
    get max and min values from column 'B'
    max = 6
    min = 3
loop on the next range of A from 1 to 3:
    get max and min values from column 'B'
    max = 9
    min = 7
loop on the next range of A from 1 to 3:
    get max and min values from column 'B'
    max = 6
    min = 1
and add the min and max values as new columns, like this:
+-----+---+---+-----+-----+
|     | A | B | min | max |
+-----+---+---+-----+-----+
| 288 | 1 | 4 |  3  |  6  |
+-----+---+---+-----+-----+
| 245 | 2 | 3 |     |     |
+-----+---+---+-----+-----+
| 543 | 3 | 6 |     |     |
+-----+---+---+-----+-----+
| 867 | 1 | 9 |  7  |  9  |
+-----+---+---+-----+-----+
| 345 | 2 | 7 |     |     |
+-----+---+---+-----+-----+
| 122 | 3 | 8 |     |     |
+-----+---+---+-----+-----+
| 233 | 1 | 1 |  1  |  6  |
+-----+---+---+-----+-----+
| 346 | 2 | 6 |     |     |
+-----+---+---+-----+-----+
| 765 | 3 | 3 |     |     |
+-----+---+---+-----+-----+
If you don't need the empty values:
import numpy as np

# label every block of 3 consecutive rows: 0, 0, 0, 1, 1, 1, ...
g = df.groupby(np.arange(len(df.index)) // 3)
df['min'] = g.B.transform('min')
df['max'] = g.B.transform('max')
print(df)
     A  B  min  max
288  1  4    3    6
245  2  3    3    6
543  3  6    3    6
867  1  9    7    9
345  2  7    7    9
122  3  8    7    9
233  1  1    1    6
346  2  6    1    6
765  3  3    1    6
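To see why the integer division works, it may help to look at the labels on their own. The sketch below rebuilds the sample data and uses the variable names `labels` and `summary`, which are only illustrative. It also shows `agg`, which returns one row per block instead of broadcasting back to every row as `transform` does.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3], "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]},
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)

# Row positions 0..8 integer-divided by the block size give one label per block.
labels = np.arange(len(df)) // 3
print(labels)  # [0 0 0 1 1 1 2 2 2]

# agg computes both statistics at once, one row per block.
summary = df.groupby(labels)["B"].agg(["min", "max"])
print(summary)
```

Note that this relies purely on row position, so it assumes every block has exactly 3 rows.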
To leave the other rows empty you can assign empty strings, but then the min and max columns are no longer numeric: they become object dtype, holding a mix of integers and strings:
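g = df.groupby(np.arange(len(df.index)) // 3)
# cast to object first: recent pandas versions warn about (or reject)
# assigning '' into an integer column
df['min'] = g.B.transform('min').astype(object)
df['max'] = g.B.transform('max').astype(object)
# blank out every row that does not start a block
df.loc[df.A != 1, ['min', 'max']] = ''
print(df)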
     A  B  min  max
288  1  4    3    6
245  2  3
543  3  6
867  1  9    7    9
345  2  7
122  3  8
233  1  1    1    6
346  2  6
765  3  3
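If the columns should stay numeric, one alternative (not part of the approach above, just a sketch) is to use missing values instead of empty strings. It assumes a pandas version that supports the nullable `Int64` dtype; the name `first_in_block` is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3], "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]},
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)

g = df.groupby(np.arange(len(df)) // 3)["B"]
first_in_block = df["A"] == 1

# where() keeps values on rows where the condition is True and puts NaN elsewhere;
# casting to the nullable "Int64" dtype turns NaN into <NA> and keeps integers as integers.
df["min"] = g.transform("min").where(first_in_block).astype("Int64")
df["max"] = g.transform("max").where(first_in_block).astype("Int64")
print(df)
```

The non-leading rows then display as `<NA>` rather than as blanks, but sums, comparisons and other numeric operations still work on both columns.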
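EDIT1: to label each range explicitly, add a 'range' column and group by it:
import pandas as pd

# build the labels 'range1', 'range2', ... for every block of 3 rows
df['range'] = 'range' + pd.Series(np.arange(len(df.index)) // 3 + 1, index=df.index).astype(str)
g = df.groupby('range')
df['min'] = g.B.transform('min')
df['max'] = g.B.transform('max')
print(df)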
     A  B   range  min  max
288  1  4  range1    3    6
245  2  3  range1    3    6
543  3  6  range1    3    6
867  1  9  range2    7    9
345  2  7  range2    7    9
122  3  8  range2    7    9
233  1  1  range3    1    6
346  2  6  range3    1    6
765  3  3  range3    1    6
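Another solution uses the cumulative sum of a boolean mask (this assumes every range starts with A == 1):
# (df.A == 1) is True on the first row of each range, so its cumsum numbers the ranges
df['range'] = 'range' + (df.A == 1).cumsum().astype(str)
g = df.groupby('range')
df['min'] = g.B.transform('min')
df['max'] = g.B.transform('max')
print(df)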
     A  B   range  min  max
288  1  4  range1    3    6
245  2  3  range1    3    6
543  3  6  range1    3    6
867  1  9  range2    7    9
345  2  7  range2    7    9
122  3  8  range2    7    9
233  1  1  range3    1    6
346  2  6  range3    1    6
765  3  3  range3    1    6
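The intermediate steps of the cumsum approach can be inspected like this. It is a sketch only: the names `starts`, `block_id` and the `uneven` frame are hypothetical and made up for illustration, to show that ranges of different lengths are handled as long as each one begins with A == 1.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3], "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]},
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)

starts = df["A"] == 1        # True on the first row of each range
block_id = starts.cumsum()   # running count of starts: 1, 1, 1, 2, 2, 2, 3, 3, 3
print(pd.DataFrame({"A": df["A"], "starts": starts, "block_id": block_id}))

# Hypothetical data with ranges of unequal length (1..2, then 1..4).
uneven = pd.DataFrame({"A": [1, 2, 1, 2, 3, 4], "B": [5, 2, 9, 7, 8, 1]})
uneven_id = (uneven["A"] == 1).cumsum()
uneven["min"] = uneven.groupby(uneven_id)["B"].transform("min")
uneven["max"] = uneven.groupby(uneven_id)["B"].transform("max")
print(uneven)
```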
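General solution, which groups rows by how many times their value of A has already appeared:
# cumcount is 0 for the first occurrence of each A value, 1 for the second, ...
g = df.groupby(df.groupby('A').cumcount())
df['min'] = g.B.transform('min')
df['max'] = g.B.transform('max')
print(df)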
     A  B  min  max
288  1  4    3    6
245  2  3    3    6
543  3  6    3    6
867  1  9    7    9
345  2  7    7    9
122  3  8    7    9
233  1  1    1    6
346  2  6    1    6
765  3  3    1    6
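To illustrate what `cumcount` contributes here, the sketch below prints the group keys it produces for the sample data, then applies the same idea to a small hypothetical frame (`shuffled`, invented for this example) in which the values of A do not appear in order inside each range.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [1, 2, 3, 1, 2, 3, 1, 2, 3], "B": [4, 3, 6, 9, 7, 8, 1, 6, 3]},
    index=[288, 245, 543, 867, 345, 122, 233, 346, 765],
)

# For each row: how many earlier rows share the same A value.
occurrence = df.groupby("A").cumcount()
print(occurrence.tolist())  # [0, 0, 0, 1, 1, 1, 2, 2, 2]

# Hypothetical data: each range contains 1, 2 and 3 once, but not in order.
shuffled = pd.DataFrame({"A": [2, 1, 3, 3, 1, 2], "B": [3, 4, 6, 8, 9, 7]})
key = shuffled.groupby("A").cumcount()
shuffled["min"] = shuffled.groupby(key)["B"].transform("min")
shuffled["max"] = shuffled.groupby(key)["B"].transform("max")
print(shuffled)
```

A mask such as `A == 1` would split the shuffled frame in the wrong places, whereas the occurrence count only requires that each value of A appears once per range.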