I have the following dataframe:
import pandas as pd
df = pd.DataFrame({'Id_Sensor': [1, 2, 3, 4],'Old_Column': ['P55X', 'MEC8901', 'P58Y', 'M59X']})
print(df)
Id_Sensor Old_Column
1 P55X
2 MEC8901
3 P58Y
4 M59X
I need to create a new column on this dataframe. If the first letter is P, the column should receive 'computer_typeA'. If the first three letters are MEC, the column should receive 'computer_typeB'. Otherwise, it should receive 'computer_other'.
I tried to do the following:
# This code segment is incorrect
for i in range(0, len(df)):
    if(df['Old_Column'].iloc[i][:1] == 'P'):
        df['New_Column'].iloc[i] == 'computer_typeA'
    elif(df['Old_Column'].iloc[i][:3] == 'MEC'):
        df['New_Column'].iloc[i] == 'computer_typeB'
    else:
        df['New_Column'].iloc[i] == 'computer_other'
The result is incorrect:
print(df)
Id_Sensor Old_Column New_Column
1 P55X NaN
2 MEC8901 NaN
3 P58Y NaN
4 M59X NaN
I would like the answer to be like this:
Id_Sensor Old_Column New_Column
1 P55X computer_typeA
2 MEC8901 computer_typeB
3 P58Y computer_typeA
4 M59X computer_other
You can use numpy.select for conditional statements:
import numpy as np

# Boolean masks, checked in order; the first matching condition wins
cond1 = df['Old_Column'].str.startswith('P')
cond2 = df['Old_Column'].str.startswith('MEC')
condlist = [cond1, cond2]
choicelist = ['computer_typeA', 'computer_typeB']
# default fills the rows that match neither condition
df['New_Column'] = np.select(condlist, choicelist, default='computer_other')
print(df)
Id_Sensor Old_Column New_Column
0 1 P55X computer_typeA
1 2 MEC8901 computer_typeB
2 3 P58Y computer_typeA
3 4 M59X computer_other
This simple code should do the job:
# Start with the fallback label, then overwrite the rows that match a prefix
df["New_Column"] = "computer_other"
df.loc[df["Old_Column"].str.startswith("P"), "New_Column"] = "computer_typeA"
df.loc[df["Old_Column"].str.startswith("MEC"), "New_Column"] = "computer_typeB"
Note: New_Column is initialized to computer_other so that only the rows matching one of the prefixes need to be overwritten afterwards.
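As for why the loop in the question leaves the column empty: `==` is a comparison, so each line compares a value and throws the result away instead of assigning anything. Even with `=`, writing through `df['New_Column'].iloc[i]` is chained indexing, which pandas may not write back to the original dataframe. If you really want a loop, here is a minimal sketch of a corrected version, assuming the same `df` as in the question. The `classify` helper is a name made up for this illustration, not something from the original post.

```python
import pandas as pd

df = pd.DataFrame({'Id_Sensor': [1, 2, 3, 4],
                   'Old_Column': ['P55X', 'MEC8901', 'P58Y', 'M59X']})

def classify(code):
    # Hypothetical helper for this sketch: picks a label from the code's prefix
    if code.startswith('P'):
        return 'computer_typeA'
    elif code.startswith('MEC'):
        return 'computer_typeB'
    return 'computer_other'

# Create the column first, then assign with "=" through a single .loc call
df['New_Column'] = ''
for i in df.index:
    df.loc[i, 'New_Column'] = classify(df.loc[i, 'Old_Column'])

print(df)
```

The vectorized versions above are still preferable on large dataframes, since a row-by-row loop is much slower.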
Hope this helps.