Replace string in Pandas column by substring from list
I have a DF:
DF
camp, value
asd_abcd_gr_yxz_aaaa, 5
efgh_kr_ijk, 10
hjssaasd_kr_adsad, 15
asdas_kr_asd, 2
asd_fr_asda_bb_bbbbbbb, 12
adklasdj_gr_asdsad, 3
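and it goes on like that (the real DF is much longer).
After comparing each camp value against the elements of a list such as [_gr_, _kr_, _fr_, etc..],
I would like the result to be: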
DF
camp, value
gr, 8
kr, 27
fr, 12
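Ideally the solution should be as short as possible and should not loop over the DF. The real list is longer than _gr_, _kr_, _fr_.
Thanks in advance!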
You can use loc together with str.contains:
print(df)
camp value
0 abcd_gr_yxz 5
1 efgh_kr_ijk 10
2 hjssaasd_kr_adsad 15
3 asdas_kr_asd 2
4 asd_fr_asda 12
5 adklasdj_gr_asdsad 3
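ABR = ['_gr_', '_kr_', '_fr_']
# Loop over the (short) list of substrings, not over the rows of the DF
for x in ABR:
    # Overwrite camp with the substring wherever camp contains it
    df.loc[df['camp'].str.contains(x), 'camp'] = x
print(df)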
camp value
0 _gr_ 5
1 _kr_ 10
2 _kr_ 15
3 _kr_ 2
4 _fr_ 12
5 _gr_ 3
print(df.groupby('camp')['value'].sum().reset_index())
camp value
0 _fr_ 12
1 _gr_ 8
2 _kr_ 27
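For anyone who wants to run the loc / str.contains approach end to end, here is a minimal self-contained sketch. The DataFrame construction is an assumption that mirrors the printed sample above, and regex=False is an optional addition (not in the answer) so that each list element is matched as a literal substring rather than as a regular expression.

```python
import pandas as pd

# Sample data mirroring the printed frame above (constructed here for illustration)
df = pd.DataFrame({
    'camp': ['abcd_gr_yxz', 'efgh_kr_ijk', 'hjssaasd_kr_adsad',
             'asdas_kr_asd', 'asd_fr_asda', 'adklasdj_gr_asdsad'],
    'value': [5, 10, 15, 2, 12, 3],
})
ABR = ['_gr_', '_kr_', '_fr_']

# The loop runs once per list element, not once per row
for x in ABR:
    # regex=False treats x as a plain substring (an addition to the original answer)
    df.loc[df['camp'].str.contains(x, regex=False), 'camp'] = x

# Sum the values per replaced camp label
result = df.groupby('camp')['value'].sum().reset_index()
print(result)
```

Note that a row matching several list elements ends up with whichever element is processed first, because after the replacement the cell no longer contains the other substrings.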
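Alternatively, use str.extract, which avoids the loop entirely:
ABR = ['_gr_', '_kr_', '_fr_']
# Join the list into one regex alternation inside a capture group
s = '(' + '|'.join(ABR) + ')'
print(s)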
(_gr_|_kr_|_fr_)
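# Keep only the first matching substring from each camp value
df['camp'] = df['camp'].str.extract(s, expand=False)
# Sum the values per extracted substring
df = df.groupby('camp', as_index=False)['value'].sum()
# Drop the surrounding underscores to get gr, kr, fr
df['camp'] = df['camp'].str.strip('_')
print(df)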
camp value
0 fr 12
1 gr 8
2 kr 27
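The str.extract approach can also be written as one runnable chain. The sketch below is an illustration under a few assumptions: it builds the DataFrame from the question's sample rows, and it adds re.escape (not part of the answer) in case a longer real-world list contains regex metacharacters.

```python
import re
import pandas as pd

# Sample data taken from the question (constructed here for illustration)
df = pd.DataFrame({
    'camp': ['asd_abcd_gr_yxz_aaaa', 'efgh_kr_ijk', 'hjssaasd_kr_adsad',
             'asdas_kr_asd', 'asd_fr_asda_bb_bbbbbbb', 'adklasdj_gr_asdsad'],
    'value': [5, 10, 15, 2, 12, 3],
})
ABR = ['_gr_', '_kr_', '_fr_']

# One capture group with all substrings as alternatives; re.escape is a
# precaution for list elements that contain regex metacharacters
pattern = '(' + '|'.join(re.escape(x) for x in ABR) + ')'

out = (
    # Extract the first matching substring and strip its underscores
    df.assign(camp=df['camp'].str.extract(pattern, expand=False).str.strip('_'))
      # Sum the values per extracted label
      .groupby('camp', as_index=False)['value'].sum()
)
print(out)
```

Rows whose camp matches none of the list elements get NaN from str.extract, and groupby drops NaN keys by default, so such rows would silently disappear from the result.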