
Creating new column based on multiple different values

My current code, shown below, creates a new column from the values of an existing column. Several different values represent similar things, such as Car and Van, or Ship, Boat and Submarine, and I want each group classified under a single value in the new column, such as Vehicle or Boat.

Code with a simplified dataset example:

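def f(row):
    # Map each raw value in the 'Type' column to its broader category.
    if row['Type'] == 'Car':
        val = 'Vehicle'
    elif row['Type'] == 'Van':
        val = 'Vehicle'
    elif row['Type'] == 'Ship':
        val = 'Boat'
    elif row['Type'] == 'Scooter':
        val = 'Bike'
    elif row['Type'] == 'Segway':
        val = 'Bike'
    else:
        # Avoid an UnboundLocalError when no branch matches.
        val = None
    return val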

What is the best method, something similar to using wildcards, that avoids typing out each value when there are many values (30 or more) that I want to bucket into the same new value in the new column?

Thanks

One way is to use np.select with isin:

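import numpy as np
import pandas as pd

df = pd.DataFrame({"Type":["Car","Van","Ship","Scooter","Segway"]})

# Each condition is paired with the choice at the same position; "Boat" is the default.
df["new"] = np.select([df["Type"].isin(["Car","Van"]),
                       df["Type"].isin(["Scooter","Segway"])],
                      ["Vehicle","Bike"],"Boat")

print(df)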

      Type      new
0      Car  Vehicle
1      Van  Vehicle
2     Ship     Boat
3  Scooter     Bike
4   Segway     Bike
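
With 30 or more raw values, another option is to keep the buckets in a plain dictionary and invert it into a value-to-category lookup for Series.map. The following is only a minimal sketch and not part of the answer above: the names buckets and lookup, the extra "Tram" row, and the "Other" label for unmatched values are all assumptions made for illustration.

```python
import pandas as pd

# Sample data; "Tram" is an extra, unmapped value added for illustration.
df = pd.DataFrame({"Type": ["Car", "Van", "Ship", "Scooter", "Segway", "Tram"]})

# Hypothetical bucket definition: category -> raw values that belong to it.
buckets = {
    "Vehicle": ["Car", "Van"],
    "Boat": ["Ship", "Boat", "Submarine"],
    "Bike": ["Scooter", "Segway"],
}

# Invert to raw value -> category so Series.map can look up each row directly.
lookup = {raw: category for category, raws in buckets.items() for raw in raws}

# Values missing from the lookup become NaN; "Other" is an assumed default label.
df["new"] = df["Type"].map(lookup).fillna("Other")

print(df)
```

Adding a new raw value then only means extending one list in buckets, rather than adding another condition.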


 