How to replace a string in a list if it contains a substring in Pandas DataFrame column
Replace string in Pandas column by substring from list
I have a DataFrame:
DF
camp, value
asd_abcd_gr_yxz_aaaa, 5
efgh_kr_ijk, 10
hjssaasd_kr_adsad, 15
asdas_kr_asd, 2
asd_fr_asda_bb_bbbbbbb, 12
adklasdj_gr_asdsad, 3
and it goes on for many more rows.
After comparing camp against the elements of a list such as [_gr_, _kr_, _fr_, etc..]
I would like the result to be
DF
camp, value
gr, 8
kr, 27
fr, 12
Preferably as short as possible and without looping through the DF. The real list is longer than just _gr_, _kr_, _fr_.
Thanks in advance!
You can try loc with str.contains:
print(df)
                 camp  value
0         abcd_gr_yxz      5
1         efgh_kr_ijk     10
2   hjssaasd_kr_adsad     15
3        asdas_kr_asd      2
4         asd_fr_asda     12
5  adklasdj_gr_asdsad      3

ABR = ['_gr_', '_kr_', '_fr_']

# Overwrite camp with the token wherever the token occurs in the string
for x in ABR:
    df.loc[df['camp'].str.contains(x), 'camp'] = x

print(df)
   camp  value
0  _gr_      5
1  _kr_     10
2  _kr_     15
3  _kr_      2
4  _fr_     12
5  _gr_      3

# Sum value per token
print(df.groupby('camp')['value'].sum().reset_index())
   camp  value
0  _fr_     12
1  _gr_      8
2  _kr_     27
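For anyone who wants to run the loop version end to end, here is a minimal sketch. It assumes Python 3 and a reasonably recent pandas, rebuilds the sample frame by hand, and passes regex=False to str.contains so each token is matched literally (that flag is my addition, not part of the answer). The helper name sum_by_token_loop is hypothetical.

```python
import pandas as pd

# Sample data copied from the answer above.
df = pd.DataFrame({
    "camp": ["abcd_gr_yxz", "efgh_kr_ijk", "hjssaasd_kr_adsad",
             "asdas_kr_asd", "asd_fr_asda", "adklasdj_gr_asdsad"],
    "value": [5, 10, 15, 2, 12, 3],
})
ABR = ["_gr_", "_kr_", "_fr_"]


def sum_by_token_loop(frame, tokens, col="camp", value="value"):
    """Hypothetical helper: label rows by the token they contain, then sum."""
    out = frame.copy()  # leave the caller's frame untouched
    for tok in tokens:
        # regex=False treats the token as a plain substring, not a pattern.
        mask = out[col].str.contains(tok, regex=False)
        out.loc[mask, col] = tok
    out = out.groupby(col, as_index=False)[value].sum()
    # Drop the surrounding underscores to get gr / kr / fr.
    out[col] = out[col].str.strip("_")
    return out


result = sum_by_token_loop(df, ABR)
print(result)
```

Note that the loop runs over the short token list, not over the rows of the DataFrame. A row that contains none of the tokens keeps its original camp string and ends up as its own group.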
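Alternatively, join the list into a single regular expression and use str.extract, which avoids the loop:

ABR = ['_gr_', '_kr_', '_fr_']

# Build one pattern with a capture group holding all the alternatives
s = '(' + '|'.join(ABR) + ')'
print(s)
(_gr_|_kr_|_fr_)

# Keep only the matched token, sum per token, then drop the underscores
df['camp'] = df['camp'].str.extract(s, expand=False)
df = df.groupby('camp', as_index=False)['value'].sum()
df['camp'] = df['camp'].str.strip('_')
print(df)
  camp  value
0   fr     12
1   gr      8
2   kr     27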
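The str.extract approach can also be wrapped up as a self-contained sketch. It assumes Python 3 and a recent pandas, and uses the longer sample strings from the question. Escaping each token with re.escape is my addition, in case the real list contains regex metacharacters, and sum_by_token_extract is a hypothetical name.

```python
import re

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "camp": ["asd_abcd_gr_yxz_aaaa", "efgh_kr_ijk", "hjssaasd_kr_adsad",
             "asdas_kr_asd", "asd_fr_asda_bb_bbbbbbb", "adklasdj_gr_asdsad"],
    "value": [5, 10, 15, 2, 12, 3],
})
ABR = ["_gr_", "_kr_", "_fr_"]


def sum_by_token_extract(frame, tokens, col="camp", value="value"):
    """Hypothetical helper: extract the first matching token, then sum."""
    # One capture group with every token as an escaped alternative.
    pattern = "(" + "|".join(re.escape(t) for t in tokens) + ")"
    out = frame.copy()  # leave the caller's frame untouched
    # expand=False returns a Series; rows without a match become NaN.
    out[col] = out[col].str.extract(pattern, expand=False)
    out = out.groupby(col, as_index=False)[value].sum()
    out[col] = out[col].str.strip("_")
    return out


result = sum_by_token_extract(df, ABR)
print(result)
```

One difference from the loop version is worth knowing: rows that match no token become NaN after str.extract, and groupby drops NaN keys by default, so those rows disappear from the totals instead of forming their own groups.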