Iterate dataframe in pandas looking for a string and generate new column
I have the following dataframe:
import pandas as pd
df = pd.DataFrame({'Id_Sensor': [1, 2, 3, 4],'Old_Column': ['P55X', 'MEC8901', 'P58Y', 'M59X']})
print(df)
Id_Sensor  Old_Column
1          P55X
2          MEC8901
3          P58Y
4          M59X
I need to create a new column on this dataframe. If the first letter is P, the column should receive 'computer_typeA'. If the first three letters are MEC, the column should receive 'computer_typeB'. Any other value should receive 'computer_other'.
I tried to do the following:
# This code segment is incorrect
for i in range(0, len(df)):
    if(df['Old_Column'].iloc[i][:1] == 'P'):
        df['New_Column'].iloc[i] == 'computer_typeA'
    elif(df['Old_Column'].iloc[i][:3] == 'MEC'):
        df['New_Column'].iloc[i] == 'computer_typeB'
    else:
        df['New_Column'].iloc[i] == 'computer_other'
The result is incorrect:
print(df)
Id_Sensor  Old_Column  New_Column
1          P55X        NaN
2          MEC8901     NaN
3          P58Y        NaN
4          M59X        NaN
I would like the result to look like this:
Id_Sensor  Old_Column  New_Column
1          P55X        computer_typeA
2          MEC8901     computer_typeB
3          P58Y        computer_typeA
4          M59X        computer_other
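For what it's worth, the loop in the question leaves New_Column unchanged because `==` is a comparison, not an assignment: each branch computes a boolean and throws it away. A minimal sketch of a corrected loop follows. It assumes New_Column is created up front with the fallback label, and it writes through df.loc rather than chained indexing such as df['New_Column'].iloc[i] = ..., which may assign to a temporary copy.

```python
import pandas as pd

df = pd.DataFrame({'Id_Sensor': [1, 2, 3, 4],
                   'Old_Column': ['P55X', 'MEC8901', 'P58Y', 'M59X']})

# Assumption: create the column first, filled with the fallback label.
df['New_Column'] = 'computer_other'

for i in range(len(df)):
    value = df['Old_Column'].iloc[i]
    row_label = df.index[i]
    if value[:1] == 'P':
        # A single "=" assigns; writing through .loc avoids chained indexing.
        df.loc[row_label, 'New_Column'] = 'computer_typeA'
    elif value[:3] == 'MEC':
        df.loc[row_label, 'New_Column'] = 'computer_typeB'

print(df)
```

A row-by-row loop like this is fine for a handful of rows, but a vectorized approach is usually preferable on larger frames.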
You can use numpy.select for conditional statements:
import numpy as np

# Boolean masks, one per prefix rule
cond1 = df.Old_Column.str.startswith('P')
cond2 = df.Old_Column.str.startswith('MEC')
condlist = [cond1, cond2]
choicelist = ['computer_typeA', 'computer_typeB']
# default fills rows that match no condition, so no '0' placeholder has to be replaced afterwards
df['New_Column'] = np.select(condlist, choicelist, default='computer_other')
print(df)
   Id_Sensor Old_Column      New_Column
0          1       P55X  computer_typeA
1          2    MEC8901  computer_typeB
2          3       P58Y  computer_typeA
3          4       M59X  computer_other
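One detail worth keeping in mind: for each row, np.select picks the first condition in condlist that is True, so order matters when prefixes overlap. The sketch below wraps the same idea in a small helper. The name label_by_prefix, the rules list, the extra 'M' rule with its 'computer_typeM' label, and the sample values are all hypothetical, introduced only to illustrate the ordering.

```python
import numpy as np
import pandas as pd

def label_by_prefix(series, rules, default):
    """Label each value using the first (prefix, label) rule that matches."""
    # na=False treats missing values as non-matching.
    condlist = [series.str.startswith(prefix, na=False) for prefix, _ in rules]
    choicelist = [label for _, label in rules]
    return np.select(condlist, choicelist, default=default)

# Invented sample data, including a missing value.
sample = pd.Series(['P55X', 'MEC8901', 'M59X', None])

# The more specific prefix 'MEC' is listed before 'M' so that it wins.
rules = [('P', 'computer_typeA'),
         ('MEC', 'computer_typeB'),
         ('M', 'computer_typeM')]  # hypothetical extra rule

labels = label_by_prefix(sample, rules, default='computer_other')
print(labels.tolist())
```

If the 'M' rule were listed before 'MEC', the value 'MEC8901' would be labeled by the 'M' rule instead.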
This simple code should do the job:
df["New_Column"] = "computer_other"
df.loc[df.Old_Column.apply(lambda x: x[0] == "P"), "New_Column"] = "computer_typeA"
df.loc[df.Old_Column.apply(lambda x: x[:3] == "MEC"), "New_Column"] = "computer_typeB"
Note: New_Column is initialized to computer_other first so that only the rows matching a prefix need to be overwritten afterwards.
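The same default-then-overwrite pattern can also be written with the vectorized .str.startswith accessor instead of apply with a lambda. This is only a sketch of an alternative, not the answer's original code. It sidesteps the IndexError that x[0] would raise on an empty string, and na=False is an added assumption so that missing values simply keep the default label. The empty-string row is invented to show that edge case.

```python
import pandas as pd

df = pd.DataFrame({
    'Id_Sensor': [1, 2, 3, 4, 5],
    # The trailing '' is an invented edge case, not part of the original data.
    'Old_Column': ['P55X', 'MEC8901', 'P58Y', 'M59X', ''],
})

# Start from the fallback label, then overwrite only the matching rows.
df['New_Column'] = 'computer_other'
df.loc[df['Old_Column'].str.startswith('P', na=False), 'New_Column'] = 'computer_typeA'
df.loc[df['Old_Column'].str.startswith('MEC', na=False), 'New_Column'] = 'computer_typeB'

print(df)
```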
Hope this helps.