I've got a DataFrame which looks something like this:
x1 x2
0 4 1
1 0 2
2 5 1
3 0 3
4 4 2
Now I want to create another column which takes the average of columns x1 and x2, or returns 0 if x1 is 0:
x1 x2 ave
0 4 1 2.5
1 0 2 0
2 5 1 3
3 0 3 0
4 4 2 3
Neither this
data['ave'] = (data['x1'] + data['x2'])/2 if data['x1'] > 0 else 0
nor this
data['ave'] = (data['x1'] != 0)*(data['x1'] + data['x2'])/2
works for obvious reasons (series can't be used in these operations).
I do know that this is easy to accomplish using a loop, but is there a shorthand pythonic way of doing it?
Proper python data is below:
data = pd.DataFrame({'x1': (4,0,5,0,4), 'x2': (1,2,1,3,2)})
You're very close. Both of your approaches should work with only a tweak or two.
Method #1:
>>> df = pd.DataFrame({'x1': (4,0,5,0,4), 'x2': (1,2,1,3,2)})
>>> df["ave"] = (df["x1"] != 0) * (df["x1"] + df["x2"])/2.
>>> df
x1 x2 ave
0 4 1 2.5
1 0 2 0.0
2 5 1 3.0
3 0 3 0.0
4 4 2 3.0
If you leave off the . in 2. and your columns are integers, you might not get the results you expect due to integer division under Python 2 (in Python 3, / is always true division), but Series can be used without any problems.
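To see why the multiplication in Method #1 works, here is a minimal sketch that splits it into steps. The comparison yields a boolean Series, and in arithmetic True and False behave as 1 and 0, so rows where x1 is 0 are zeroed out. The names mask and total are only illustrative and do not appear in the original.

```python
import pandas as pd

df = pd.DataFrame({'x1': (4, 0, 5, 0, 4), 'x2': (1, 2, 1, 3, 2)})

# Boolean Series: True where x1 is non-zero (illustrative name)
mask = df["x1"] != 0
# Element-wise sum of the two columns (illustrative name)
total = df["x1"] + df["x2"]

# True/False act as 1/0 in arithmetic, so rows with x1 == 0 become 0
df["ave"] = mask * total / 2.
```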
Method #2:
df["ave"] = df.apply(lambda r: (r["x1"] + r["x2"])/2. if r["x1"] > 0 else 0, axis=1)
Pass a function to apply and specify axis=1 so that it is called once per row.
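If the lambda is hard to read, the same idea can be sketched with a named function. The helper name row_average is hypothetical; it is just Method #2 written out longhand.

```python
import pandas as pd

df = pd.DataFrame({'x1': (4, 0, 5, 0, 4), 'x2': (1, 2, 1, 3, 2)})

def row_average(row):
    # Hypothetical helper: `row` is a single row of df, passed in as a Series
    if row["x1"] > 0:
        return (row["x1"] + row["x2"]) / 2.
    return 0

# axis=1 makes apply call row_average once per row instead of once per column
df["ave"] = df.apply(row_average, axis=1)
```

This runs a Python-level function call for every row, so on large frames it will likely be slower than the vectorized Method #1.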
Method #3a, 3b:
df["ave"] = df[["x1", "x2"]].mean(axis=1) * (df["x1"] != 0)
or
df["ave"] = df[["x1", "x2"]].mean(axis=1)
df.loc[df["x1"] == 0, "ave"] = 0
Et cetera.
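One further possibility in the same spirit, not one of the methods listed above, is numpy.where, which picks between two values element by element based on a condition. A minimal sketch, assuming NumPy is imported as np:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x1': (4, 0, 5, 0, 4), 'x2': (1, 2, 1, 3, 2)})

# Where x1 is non-zero take the mean of the two columns, otherwise use 0
df["ave"] = np.where(df["x1"] != 0, (df["x1"] + df["x2"]) / 2., 0)
```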