Can Pandas perform row-wise min() and max() functions?

Question

In my DataFrame I wish to clip the value of a particular column between 0 and 100. For instance, given the following:

I want to get:

  a  b   c
0 10 90  90
1 20 150 100
2 30 -30 0

I know that in Pandas certain arithmetic operations work across columns. For instance, I could double every number in column b like so:

>>>df["c"] = df["b"] * 2
>>>df
  a  b   c
0 10 90  180
1 20 150 300
2 30 -30 -60

However this doesn't work for built-in functions like min and max :

>>>df["c"] = min(100, max(0, df["b"]))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Is there some way to accomplish what I want efficiently?

Answer 1

You can use Series.clip :

df['c'] = df['b'].clip(0,100)
print (df)
    a    b    c
0  10   90   90
1  20  150  100
2  30  -30    0

Answer 2

You can use the Pandas min function across an axis. Then combine it with min/max

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.min.html

For example

df.max(axis=1)

But it looks like you want to clip the values instead of min/max.

Answer 3

A numpy view. Not as elegant as clip .

Option 1

df.assign(c=np.minimum(np.maximum(df.b.values, 0), 100))

    a    b    c
0  10   90   90
1  20  150  100
2  30  -30    0

Option 2

b = df.b.values
df.assign(c=np.where(b > 100, 100, np.where(b < 0, 0, b)))

    a    b    c
0  10   90   90
1  20  150  100
2  30  -30    0

Timing
Code Below

res.div(res.min(1), 0)

            pir1  pir2       jez1
10     30.895514   1.0  75.210427
30     28.611177   1.0  49.913498
100    20.658307   1.0  50.823106
300    19.842134   1.0  39.162901
1000   14.078159   1.0  25.148937
3000    8.767133   1.0  15.066847
10000   4.377849   1.0   8.849138
30000   2.634263   1.0   4.653956

res = pd.DataFrame(
    index=[10, 30, 100, 300, 1000, 3000, 10000, 30000],
    columns=['pir1', 'pir2', 'jez1'],
    dtype=float
)

jez1 = lambda d: d.assign(c=df.b.clip(0, 1))
pir1 = lambda d: d.assign(c=np.minimum(np.maximum(d.b.values, 0), 100))
pir2 = lambda d: (lambda b: np.where(b > 100, 100, np.where(b < 0, 0, b)))(d.b.values)

for i in res.index:
    d = pd.concat([df] * i, ignore_index=True)
    for j in res.columns:
        stmt = '{}(d)'.format(j)
        setp = 'from __main__ import d, {}'.format(j)
        res.at[i, j] = timeit(stmt, setp, number=10)

res.plot(loglog=True)

Can Pandas perform row-wise min() and max() functions?

Question

3 answers

solution1
8 ACCPTED 2017-08-30 17:33:28

solution2
3 2017-08-30 17:33:24

solution3
0 2017-08-30 18:20:41

Can Pandas perform row-wise min() and max() functions?

Question

3 answers

solution1 8 ACCPTED 2017-08-30 17:33:28

solution2 3 2017-08-30 17:33:24

solution3 0 2017-08-30 18:20:41

solution1
8 ACCPTED 2017-08-30 17:33:28

solution2
3 2017-08-30 17:33:24

solution3
0 2017-08-30 18:20:41