
Remove certain strings from list of strings as column in pandas.DataFrame

I have a pandas.DataFrame:

    index    question_id    tag
    0        1858           [pset3, game-of-fifteen]
    1        2409           [pset4]
    2        4346           [pset6, cs50submit]
    3        9139           [pset8, pset5, gradebook]
    4        9631           [pset4, recover]

I need to remove every string from the lists in the tag column, except the strings that start with pset (pset*).

So I need to end up with something like this:

    index    question_id    tag
    0        1858           [pset3]
    1        2409           [pset4]
    2        4346           [pset6]
    3        9139           [pset8, pset5]
    4        9631           [pset4]

How can I do that, please?

One option: use the apply method to loop over the items in the tag column; for each item, use a list comprehension with the startswith method to keep only the strings that begin with the prefix:

df['tag'] = df.tag.apply(lambda lst: [x for x in lst if x.startswith("pset")])
df
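For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the DataFrame from the question and applies the same filter. The helper name keep_pset is only illustrative, and the construction of df is an assumption about how the data is stored (each tag cell holding a Python list of strings).

```python
import pandas as pd

# Rebuild the sample data from the question; each tag cell is a list of strings.
df = pd.DataFrame({
    "question_id": [1858, 2409, 4346, 9139, 9631],
    "tag": [
        ["pset3", "game-of-fifteen"],
        ["pset4"],
        ["pset6", "cs50submit"],
        ["pset8", "pset5", "gradebook"],
        ["pset4", "recover"],
    ],
})


def keep_pset(tags):
    """Return only the tags that start with 'pset' (illustrative helper)."""
    return [t for t in tags if t.startswith("pset")]


# apply calls keep_pset once per row and returns a new Series of lists.
df["tag"] = df["tag"].apply(keep_pset)
print(df)
```

A named function does the same job as the lambda and is a bit easier to reuse or test on its own.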


You can apply a function to the tag Series that builds a new list from only the elements that start with 'pset':

df.tag.apply(lambda x: [xx for xx in x if xx.startswith('pset')])

# returns:
0           [pset3]
1           [pset4]
2           [pset6]
3    [pset8, pset5]
4           [pset4]
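Note that apply returns a new Series and leaves the DataFrame untouched until the result is assigned back. A small sketch of that behaviour, using a two-row DataFrame made up for illustration (the variable name filtered is also an assumption):

```python
import pandas as pd

# Two-row sample, made up for illustration.
df = pd.DataFrame({"tag": [["pset3", "game-of-fifteen"], ["pset4", "recover"]]})

# apply builds a new Series; df itself is not modified by this line.
filtered = df["tag"].apply(lambda tags: [t for t in tags if t.startswith("pset")])
unchanged = df["tag"].tolist()  # still contains the non-pset tags

# Assign the result back to update the column.
df["tag"] = filtered
print(df)
```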

You can also use Python's in operator. Note that it keeps any string that contains 'pset' anywhere, not only at the start:

df.tag = df.tag.apply(lambda x: [elem for elem in x if 'pset' in elem])

0           [pset3]
1           [pset4]
2           [pset6]
3    [pset8, pset5]
4           [pset4]
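On the sample data, the in test and startswith give the same result, but they can differ on other tags. A short sketch of the difference, using a hypothetical tag "my-pset-notes" that does not appear in the question:

```python
# "my-pset-notes" is a hypothetical tag, used only to show the difference.
tags = ["pset3", "my-pset-notes", "recover"]

# Prefix test: keeps only strings that begin with "pset".
by_prefix = [t for t in tags if t.startswith("pset")]

# Substring test: keeps any string that contains "pset" anywhere.
by_substring = [t for t in tags if "pset" in t]

print(by_prefix)
print(by_substring)
```

If only tags of the form pset* should survive, startswith is the stricter choice.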
