Using pandas apply to modify a DataFrame in place

I have a DataFrame like the one below and want to turn it into the result df shown further down, using the logic below with the pandas 'apply' method.
As far as I know, the 'apply' method returns a new Series and does not modify the original df in place.

id a b
-------
a  1 4
b  2 5
c  6 2

if df['a'] > df['b'] :
    df['a'] = df['b']
else :
    df['b'] = df['a']

Expected result df:

id a b
-------
a  4 4
b  5 5
c  6 6

I am not sure what you need, since the expected output differs from what your condition produces, so here I can only fix your code:

for x,y in df.iterrows():
    if y['a'] > y['b']:
        df.loc[x,'a'] = df.loc[x,'b']
    else:
        df.loc[x,'b'] = df.loc[x,'a']

df
Out[40]: 
  id  a  b
0  a  1  1
1  b  2  2
2  c  2  2
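
The same literal condition can also be written without a Python-level loop by using a boolean mask with .loc. This is a minimal sketch, assuming the question's data rebuilt by hand; the name mask is illustrative and not from the original answer:

```python
import pandas as pd

# The question's frame, rebuilt by hand for this sketch.
df = pd.DataFrame({"id": ["a", "b", "c"], "a": [1, 2, 6], "b": [4, 5, 2]})

# True for rows where a is larger than b.
mask = df["a"] > df["b"]

# Rows where a > b: copy b into a. All other rows: copy a into b.
df.loc[mask, "a"] = df.loc[mask, "b"]
df.loc[~mask, "b"] = df.loc[~mask, "a"]

print(df)
```

Both assignments write into df itself, so this version does modify the frame in place, just like the loop.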

If I understand your problem correctly

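import numpy as np

# Run on the original df from the question (not the one modified by the loop above).
# np.where keeps the larger of a and b in each row; assign returns a new DataFrame.
df.assign(**dict.fromkeys(['a','b'],np.where(df.a>df.b,df.a,df.b)))
Out[43]: 
  id  a  b
0  a  4  4
1  b  5  5
2  c  6  6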

Like the others, I'm not totally sure what you're trying to do. I'm going to ASSUME you want to set both the "a" and "b" values in each row to the larger of the two values in that row. If that assumption is correct, here's how it can be done with ".apply()".

First, most "clean" uses of ".apply()" (bearing in mind that ".apply()" is generally not recommended when a vectorized alternative exists) pass it a function that takes the row handed over by ".apply()" and returns the same kind of object, modified as needed. With your dataframe in mind, here is a function that produces the desired output, followed by the call that runs it against the dataframe using ".apply()".

# Create the function to be used within .apply()
def comparer(row):
    # Copy first so the row object passed in by .apply() is not mutated
    row = row.copy()
    if row["a"] > row["b"]:
        row["b"] = row["a"]
    elif row["b"] > row["a"]:
        row["a"] = row["b"]
    return row

# Run the function on every row (axis=1). .apply() returns a new dataframe, so assign the result back to "df".
df = df.apply(comparer, axis=1)

Most people, if not everyone, seem to rail against using ".apply()", however. I'd probably heed their wisdom :)
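
On the question's point about in-place behaviour: a row-wise apply builds and returns a new object, so the original frame only changes if the result is assigned back. Here is a minimal sketch of that, with a hypothetical helper name fill_with_row_max and the question's data rebuilt by hand:

```python
import pandas as pd

# The question's frame, rebuilt by hand for this sketch.
df = pd.DataFrame({"id": ["a", "b", "c"], "a": [1, 2, 6], "b": [4, 5, 2]})

def fill_with_row_max(row):
    # Hypothetical helper: work on a copy so the row handed in by apply is untouched.
    out = row.copy()
    larger = max(row["a"], row["b"])
    out["a"] = larger
    out["b"] = larger
    return out

# apply returns a new DataFrame; df keeps its original values until reassigned.
result = df.apply(fill_with_row_max, axis=1)

print(result)
print(df)
```

Writing df = df.apply(fill_with_row_max, axis=1) instead would rebind the name df to the new frame, which is the closest apply gets to an in-place update.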

Try:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 6], 'b': [4, 5, 2]})

# Row-wise maximum of a and b, written to both columns
df['a'] = df.max(axis=1)
df['b'] = df['a']
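
The frame above leaves out the id column from the question. With id present as a regular column, df.max(axis=1) would also see the strings and, depending on the pandas version, may raise an error or silently skip them. A safer sketch restricts the maximum to the numeric columns. The cols list is an illustrative name, not part of the original answer:

```python
import pandas as pd

# The question's frame, including the id column.
df = pd.DataFrame({"id": ["a", "b", "c"], "a": [1, 2, 6], "b": [4, 5, 2]})

# Only these columns take part in the row-wise maximum.
cols = ["a", "b"]
row_max = df[cols].max(axis=1)

# Write the row-wise maximum back to each of the selected columns.
for col in cols:
    df[col] = row_max

print(df)
```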

