Remove certain strings from list of strings as column in pandas.DataFrame
I have a pandas.DataFrame:
index question_id tag
0 1858 [pset3, game-of-fifteen]
1 2409 [pset4]
2 4346 [pset6, cs50submit]
3 9139 [pset8, pset5, gradebook]
4 9631 [pset4, recover]
I need to remove every string from the list of strings in the tag column, except the pset* strings.
So I need to end up with something like this:
index question_id tag
0 1858 [pset3]
1 2409 [pset4]
2 4346 [pset6]
3 9139 [pset8, pset5]
4 9631 [pset4]
How can I do that, please?
One option: use the apply method to loop through the items in the tag column; for each item, use a list comprehension to keep only the strings that have the wanted prefix, checked with the startswith method:
df['tag'] = df.tag.apply(lambda lst: [x for x in lst if x.startswith("pset")])
df
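For anyone who wants to try this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, so the construction code is an assumption about how the data is stored (each tag cell holding a real Python list of strings, not a string that looks like a list).

```python
import pandas as pd

# Sample data reconstructed from the question; each "tag" cell is a list of strings.
df = pd.DataFrame({
    "question_id": [1858, 2409, 4346, 9139, 9631],
    "tag": [
        ["pset3", "game-of-fifteen"],
        ["pset4"],
        ["pset6", "cs50submit"],
        ["pset8", "pset5", "gradebook"],
        ["pset4", "recover"],
    ],
})

# Keep only the strings that start with "pset" in every list.
df["tag"] = df["tag"].apply(lambda lst: [x for x in lst if x.startswith("pset")])

print(df)
```

If the column actually holds strings such as "[pset3, game-of-fifteen]", they would have to be parsed into lists first; that step is not covered here.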
You can apply a function to the tag series that constructs a list using only the elements that start with 'pset':
df.tag.apply(lambda x: [xx for xx in x if xx.startswith('pset')])
# returns:
0 [pset3]
1 [pset4]
2 [pset6]
3 [pset8, pset5]
4 [pset4]
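Note that the line above only returns a new Series; it does not change df unless the result is assigned back. A small sketch of the difference, using a hypothetical two-row frame and a variable name (filtered) chosen just for illustration:

```python
import pandas as pd

# Hypothetical two-row frame, only to show the assignment behaviour.
df = pd.DataFrame({"tag": [["pset3", "game-of-fifteen"], ["pset4", "recover"]]})

# apply returns a new Series; the original column is left untouched.
filtered = df["tag"].apply(lambda x: [xx for xx in x if xx.startswith("pset")])
unchanged = df["tag"].tolist()

# Assign the result back to actually replace the column.
df["tag"] = filtered
updated = df["tag"].tolist()

print(unchanged)
print(updated)
```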
You can even use Python's in operator:
df.tag = df.tag.apply(lambda x: [elem for elem in x if 'pset' in elem])
0 [pset3]
1 [pset4]
2 [pset6]
3 [pset8, pset5]
4 [pset4]
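One thing to keep in mind with this variant: the in operator checks for 'pset' anywhere in the string, while startswith only checks the beginning. On the sample data both give the same result, but they can differ on other tags. A short sketch with a made-up tag ("my-pset-notes" does not appear in the question and is purely illustrative):

```python
# Hypothetical tag list; "my-pset-notes" is invented to show the difference.
tags = ["pset4", "my-pset-notes", "recover"]

# Prefix test: keeps only strings that begin with "pset".
by_prefix = [t for t in tags if t.startswith("pset")]

# Substring test: keeps any string that contains "pset" anywhere.
by_substring = [t for t in tags if "pset" in t]

print(by_prefix)
print(by_substring)
```

If only tags of the form pset* should survive, the startswith version is probably the safer choice.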