Add new column, classifying according to multiple conditions
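I have a DataFrame (df), and I am trying to assign the values from the class DataFrames A, B and C to it, creating a new column called Str2. The real df has 1 million rows. What is the most efficient way to merge them, taking both the class (Class) and the stratum (Stra) into account? I have been trying to do it with an if function, but I could not get it to work.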
df = pd.DataFrame(data={'Class': ["A", "B", "C", "A", "B", "C", "A", "B", "C"], 'Stra': [1,1,1,2,2,2,3,3,3], 'Energy':[41,3,22,21,32,2,23,2,6]})
df
  Class  Stra  Energy
0     A     1      41
1     B     1       3
2     C     1      22
3     A     2      21
4     B     2      32
5     C     2       2
6     A     3      23
7     B     3       2
8     C     3       6
The class DataFrames:
A = pd.DataFrame(data={'Stra': [1,2,3], 'Str2':["INT1","INT2","INT2"]})
B = pd.DataFrame(data={'Stra': [1,2,3], 'Str2':["INT1","INT3","INT4"]})
C = pd.DataFrame(data={'Stra': [1,2,3], 'Str2':["INT2","INT5","INT6"]})
The expected result would then look like this:
  Class  Stra  Energy  Str2
0     A     1      41  INT1
1     B     1       3  INT1
2     C     1      22  INT2
3     A     2      21  INT2
4     B     2      32  INT3
5     C     2       2  INT5
6     A     3      23  INT2
7     B     3       2  INT4
8     C     3       6  INT6
Try the code snippet below:
import pandas as pd

df = pd.DataFrame(data={'Class': ["A", "B", "C", "A", "B", "C", "A", "B", "C"], 'Stra': [1,1,1,2,2,2,3,3,3], 'Energy':[41,3,22,21,32,2,23,2,6]})
A = pd.DataFrame(data={'Stra': [1,2,3], 'Str2':["INT1","INT2","INT2"]})
B = pd.DataFrame(data={'Stra': [1,2,3], 'Str2':["INT1","INT3","INT4"]})
C = pd.DataFrame(data={'Stra': [1,2,3], 'Str2':["INT2","INT5","INT6"]})
# Store the class DataFrames as attributes of a holder class,
# so they can be looked up by name ("A", "B", "C")
class Vars:
    pass
setattr(Vars, "A", A)
setattr(Vars, "B", B)
setattr(Vars, "C", C)
# Return the Str2 value for a given class name (val) and Stra value (val2)
def ret_str2(val, val2):
    class_df = getattr(Vars, val)
    # Take the first Str2 whose Stra matches; raises IndexError if none does
    str2 = class_df[class_df['Stra'] == val2]['Str2'].values[0]
    return str2
# Apply ret_str2 to every row of df
df["Str2"] = df[["Class", "Stra"]].apply(lambda x: ret_str2(x["Class"], x["Stra"]), axis=1)
print(df)
  Class  Stra  Energy  Str2
0     A     1      41  INT1
1     B     1       3  INT1
2     C     1      22  INT2
3     A     2      21  INT2
4     B     2      32  INT3
5     C     2       2  INT5
6     A     3      23  INT2
7     B     3       2  INT4
8     C     3       6  INT6
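Since the question mentions 1 million rows, note that a row-wise apply calls a Python function once per row and may be slow at that size. A possible alternative, shown here only as a sketch and not benchmarked, is to stack the three class DataFrames into a single lookup table and do one merge on both Class and Stra. The name lookup is made up for this illustration; the data is the same as in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Class": ["A", "B", "C", "A", "B", "C", "A", "B", "C"],
    "Stra": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Energy": [41, 3, 22, 21, 32, 2, 23, 2, 6],
})
A = pd.DataFrame({"Stra": [1, 2, 3], "Str2": ["INT1", "INT2", "INT2"]})
B = pd.DataFrame({"Stra": [1, 2, 3], "Str2": ["INT1", "INT3", "INT4"]})
C = pd.DataFrame({"Stra": [1, 2, 3], "Str2": ["INT2", "INT5", "INT6"]})

# Stack the per-class tables into one lookup table. The dict keys become
# the outer index level, which is named "Class" and moved back to a column.
# ("lookup" is a name chosen for this sketch.)
lookup = (
    pd.concat({"A": A, "B": B, "C": C}, names=["Class"])
    .reset_index(level="Class")
    .reset_index(drop=True)
)

# A left merge keeps every row of df in its original order.
# validate="m:1" raises if a (Class, Stra) pair is duplicated in lookup.
result = df.merge(lookup, on=["Class", "Stra"], how="left", validate="m:1")
print(result)
```

With how="left", a (Class, Stra) pair that has no entry in the lookup table ends up with NaN in Str2 instead of raising an error, so it may be worth checking result["Str2"].isna().any() afterwards.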