Iterate dataframe in pandas looking for a string and generate new column

I have the following dataframe:

      import pandas as pd
      df = pd.DataFrame({'Id_Sensor': [1, 2, 3, 4],'Old_Column': ['P55X', 'MEC8901', 'P58Y', 'M59X']})

      print(df)

        Id_Sensor   Old_Column
           1           P55X
           2           MEC8901
           3           P58Y
           4           M59X

I need to create a new column in this dataframe. If the first letter of Old_Column is P, the new column should receive 'computer_typeA'. If the first three letters are MEC, it should receive 'computer_typeB'. Otherwise, it should receive 'computer_other'.

I tried to do the following:

      # This code segment is incorrect
      for i in range(0, len(df)):
          if df['Old_Column'].iloc[i][:1] == 'P':
              df['New_Column'].iloc[i] == 'computer_typeA'
          elif df['Old_Column'].iloc[i][:3] == 'MEC':
              df['New_Column'].iloc[i] == 'computer_typeB'
          else:
              df['New_Column'].iloc[i] == 'computer_other'

The result is incorrect:

      print(df)
        Id_Sensor   Old_Column  New_Column
           1           P55X       NaN
           2          MEC8901     NaN
           3           P58Y       NaN
           4           M59X       NaN

I would like the result to look like this:

        Id_Sensor   Old_Column       New_Column
           1           P55X       computer_typeA
           2          MEC8901     computer_typeB
           3           P58Y       computer_typeA
           4           M59X       computer_other

You can use numpy.select for this kind of conditional assignment. Passing default='computer_other' labels the rows that match neither condition directly, instead of relying on the implicit default of 0 and replacing it afterwards:

      import numpy as np

      cond1 = df['Old_Column'].str.startswith('P')
      cond2 = df['Old_Column'].str.startswith('MEC')
      condlist = [cond1, cond2]
      choicelist = ['computer_typeA', 'computer_typeB']
      df['New_Column'] = np.select(condlist, choicelist, default='computer_other')

Result:

         Id_Sensor Old_Column      New_Column
      0          1       P55X  computer_typeA
      1          2    MEC8901  computer_typeB
      2          3       P58Y  computer_typeA
      3          4       M59X  computer_other
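
For reference, here is a minimal end-to-end sketch of the numpy.select approach using the dataframe from the question. The `na=False` argument is an assumption added here: it treats missing values in Old_Column as "no match", although the question's data contains none. The names `conditions` and `choices` are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Id_Sensor': [1, 2, 3, 4],
    'Old_Column': ['P55X', 'MEC8901', 'P58Y', 'M59X'],
})

# na=False treats missing values as "no match" (an assumption; the
# question's data has no missing values).
conditions = [
    df['Old_Column'].str.startswith('P', na=False),
    df['Old_Column'].str.startswith('MEC', na=False),
]
choices = ['computer_typeA', 'computer_typeB']

# Rows that satisfy none of the conditions receive the default label.
df['New_Column'] = np.select(conditions, choices, default='computer_other')
print(df)
```

Conditions are evaluated in order and the first match wins, so more specific prefixes should generally come first if they can overlap.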

This simple code should do the job:

      # Start with the fallback label for every row.
      df["New_Column"] = "computer_other"
      # Overwrite the rows whose Old_Column starts with the given prefix.
      df.loc[df["Old_Column"].str.startswith("P"), "New_Column"] = "computer_typeA"
      df.loc[df["Old_Column"].str.startswith("MEC"), "New_Column"] = "computer_typeB"

Note: New_Column is initialized to computer_other so that only the rows matching a prefix need to be overwritten.

Hope this helps.
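
As for why the loop in the question leaves the column unchanged: `==` is a comparison, so each line in the loop body computes True or False and discards it without assigning anything. Even with `=`, chained indexing such as `df['New_Column'].iloc[i] = ...` is not a reliable way to write into a dataframe. If a row-by-row loop is still preferred, a minimal sketch of a corrected version could look like the following, assuming the default label is set up front and `df.at` is used for single-cell assignment:

```python
import pandas as pd

df = pd.DataFrame({
    'Id_Sensor': [1, 2, 3, 4],
    'Old_Column': ['P55X', 'MEC8901', 'P58Y', 'M59X'],
})

# Create the column first so every row already holds the fallback label.
df['New_Column'] = 'computer_other'

for idx in df.index:
    value = df.at[idx, 'Old_Column']
    if value.startswith('P'):
        # "=" assigns; the "==" in the question only compared values.
        df.at[idx, 'New_Column'] = 'computer_typeA'
    elif value.startswith('MEC'):
        df.at[idx, 'New_Column'] = 'computer_typeB'

print(df)
```

The vectorized versions above are usually preferable on large dataframes, since an explicit Python loop touches one cell at a time.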
