Extracting features from dataframe
I have a pandas dataframe like this:
ID Phone ex
0 1 5333371000 533
1 2 5354321938 535
2 3 3840812 384
3 4 5451215 545
4 5 2125121278 212
For example, if "ex" starts with 533, 535, or 545, the new variable should be "personal"; otherwise it should be "notpersonal".
Sample output:
ID Phone ex iswhat
0 1 5333371000 533 personal
1 2 5354321938 535 personal
2 3 3840812 384 notpersonal
3 4 5451215 545 personal
4 5 2125121278 212 notpersonal
How can I do that?
We can use np.where along with str.contains:
# Assumes "ex" holds strings; if it is numeric, use df["ex"].astype(str) first.
df["iswhat"] = np.where(df["ex"].str.contains(r'^(?:533|535|545)$'),
                        'personal', 'notpersonal')
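For reference, here is a minimal self-contained sketch of this approach. It rebuilds the sample data from the question and assumes "ex" is stored as integers, which is why it converts the column with .astype(str) before using the .str accessor. The variable name is_personal is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; "ex" is assumed to be numeric.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Phone": [5333371000, 5354321938, 3840812, 5451215, 2125121278],
    "ex": [533, 535, 384, 545, 212],
})

# Cast to string so the .str accessor works, then test the anchored pattern.
# The non-capturing group (?:...) keeps the alternation inside the ^...$ anchors.
is_personal = df["ex"].astype(str).str.contains(r"^(?:533|535|545)$")

# np.where picks the first label where the mask is True, the second otherwise.
df["iswhat"] = np.where(is_personal, "personal", "notpersonal")
print(df)
```

If "ex" already holds strings, the .astype(str) call is harmless but unnecessary.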
You can use np.where:
df['iswhat'] = np.where(df['ex'].isin([533, 535, 545]), 'personal', 'not personal')
print(df)
# Output
ID Phone ex iswhat
0 1 5333371000 533 personal
1 2 5354321938 535 personal
2 3 3840812 384 not personal
3 4 5451215 545 personal
4 5 2125121278 212 not personal
Update
You can also use your Phone column directly:
df['iswhat'] = np.where(df['Phone'].astype(str).str.match('533|535|545'),
'personal', 'not personal')
Note: If the Phone column contains strings, you can safely remove .astype(str).
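As a regex-free variant of the Phone-based approach, one could slice the first three characters of each number and compare them with isin. This is only a sketch: it assumes every prefix of interest is exactly three digits long, and the name PERSONAL_PREFIXES is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question; Phone is assumed to be numeric.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Phone": [5333371000, 5354321938, 3840812, 5451215, 2125121278],
})

# Hypothetical constant holding the prefixes from the question, as strings.
PERSONAL_PREFIXES = ["533", "535", "545"]

# Take the first three characters of each phone number and test membership.
prefix = df["Phone"].astype(str).str[:3]
df["iswhat"] = np.where(prefix.isin(PERSONAL_PREFIXES), "personal", "not personal")
print(df)
```

Like str.match, this only looks at the start of the number, so it does not need the separate "ex" column.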