
Alternative to nested np.where in Pandas DataFrame

I have this code (which works): a set of nested conditional statements that sets the value in the 'paragenesis1' row of a dataframe (myOxides['cpx']), depending on the values in various other rows of the frame.

I'm very new to Python and programming in general. I am thinking that I should write a function to perform this, but how would I then apply that function elementwise? Nested np.where is the only way I have found to avoid the 'truth value of a series is ambiguous' error.

Any help greatly appreciated!

myOxides['cpx'].loc['paragenesis1'] = np.where(
            ((cpxCrOx>=0.5) & (cpxAlOx<=4)),
            "GtPeridA", 
            np.where(
                    ((cpxCrOx>=2.25) & (cpxAlOx<=5)), 
                    "GtPeridB", 
                    np.where(
                            ((cpxCrOx>=0.5)&
                             (cpxCrOx<=2.25)) &
                             ((cpxAlOx>=4) & (cpxAlOx<=6)),
                             "SpLhzA",
                             np.where(
                                     ((cpxCrOx>=0.5) &
                                      (cpxCrOx<=(5.53125 - 
                                                 0.546875 * cpxAlOx))) &
                                      ((cpxAlOx>=4) & 
                                       (cpxAlOx <= ((cpxCrOx - 
                                                     5.53125)/ -0.546875))),
                             "SpLhzB",
                             "Eclogite, Megacryst, Cognate"))))

Or, in general form:

df.loc['a'] = np.where(
            (some_condition),
            "value", 
            np.where(
                    ((condition_1) & (condition_2)), 
                    "some_value", 
                    np.where(
                            ((condition_3)& (condition_4)),
                             "some_other_value",
                              np.where(
                                      (condition_5),
                                        "another_value",
                                        "other_value"))))

One possible solution is to use numpy.select:

import numpy as np

# One boolean mask per category; np.select uses the first mask that matches.
m1 = (cpxCrOx>=0.5) & (cpxAlOx<=4)
m2 = (cpxCrOx>=2.25) & (cpxAlOx<=5)
m3 = ((cpxCrOx>=0.5) & (cpxCrOx<=2.25)) & ((cpxAlOx>=4) & (cpxAlOx<=6))
m4 = ((cpxCrOx>=0.5) & (cpxCrOx<=(5.53125 - 0.546875 * cpxAlOx))) & \
     ((cpxAlOx>=4) & (cpxAlOx <= ((cpxCrOx - 5.53125) / -0.546875)))

vals = [ "GtPeridA", "GtPeridB", "SpLhzA", "SpLhzB"]
default = 'Eclogite, Megacryst, Cognate'

# Assign to the same target as the code in the question.
myOxides['cpx'].loc['paragenesis1'] = np.select([m1,m2,m3,m4], vals, default=default)
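
To try this out in isolation, here is a minimal, self-contained sketch. The sample numbers are made up purely for illustration, and the classify helper is a hypothetical name that is not part of the question's code. It also touches on the "write a function and apply it elementwise" idea: a plain scalar function with the same rules in the same order should give the same labels as np.select, only more slowly on large inputs.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, invented for this example only.
cpxCrOx = pd.Series([1.0, 3.0, 1.5, 0.6, 0.1])
cpxAlOx = pd.Series([3.0, 4.5, 5.0, 7.0, 2.0])

# Vectorised version: one boolean mask per category.
m1 = (cpxCrOx >= 0.5) & (cpxAlOx <= 4)
m2 = (cpxCrOx >= 2.25) & (cpxAlOx <= 5)
m3 = (cpxCrOx >= 0.5) & (cpxCrOx <= 2.25) & (cpxAlOx >= 4) & (cpxAlOx <= 6)
m4 = ((cpxCrOx >= 0.5) & (cpxCrOx <= (5.53125 - 0.546875 * cpxAlOx))
      & (cpxAlOx >= 4) & (cpxAlOx <= ((cpxCrOx - 5.53125) / -0.546875)))

vals = ["GtPeridA", "GtPeridB", "SpLhzA", "SpLhzB"]
default = "Eclogite, Megacryst, Cognate"

# Where several masks are True, np.select takes the first one in the list,
# which mirrors the order of the nested np.where calls.
result = np.select([m1, m2, m3, m4], vals, default=default)


def classify(cr, al):
    """Scalar version of the same rules, checked in the same order."""
    if cr >= 0.5 and al <= 4:
        return "GtPeridA"
    if cr >= 2.25 and al <= 5:
        return "GtPeridB"
    if 0.5 <= cr <= 2.25 and 4 <= al <= 6:
        return "SpLhzA"
    if (0.5 <= cr <= 5.53125 - 0.546875 * al
            and 4 <= al <= (cr - 5.53125) / -0.546875):
        return "SpLhzB"
    return "Eclogite, Megacryst, Cognate"


# Elementwise application: plain `and`/`if` works here because each call
# sees single numbers rather than a whole Series.
elementwise = [classify(cr, al) for cr, al in zip(cpxCrOx, cpxAlOx)]

print(list(result))
print(elementwise)
```

The scalar function avoids the 'truth value of a series is ambiguous' error only because it never sees a whole Series at once. For large frames the np.select version is usually the better choice, since it stays vectorised.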
