
Filter pandas DataFrame column list with other list

Input DataFrame:

data = { "id" : ['[1,2]','[2,4]','[4,3]'],
             "name" : ['a','b','c'] }
df = pd.DataFrame(data)

filterstr = [1,2]

Expected output:

id     name
[1,2]   a      
[2,4]   b

Tried code: df1 = df[df.id.map(lambda x: np.isin(np.array(x), [[ str([i]) for i in filter]]).all())]

This works for a single value in the id column, but not for two values like '[1,2]'. I am not sure where I am going wrong.

Taking exactly what you've given:

data = { "id" : ['[1,2]','[2,4]','[4,3]'],
             "name" : ['a','b','c'] }
df = pd.DataFrame(data)

filterstr = [1,2]

I do:

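import ast
df['id'] = df['id'].apply(ast.literal_eval)  # Parse each string such as '[1,2]' into a real list.
# Keep rows whose list shares at least one element with filterstr.
output = df[df.id.map(lambda ids: any(x in filterstr for x in ids))]
print(output)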

Output:

       id name
0  [1, 2]    a
1  [2, 4]    b
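
As a side note on why the attempt in the question finds nothing for '[1,2]': each cell is one string, so np.array(x) is a zero-dimensional string array, and np.isin compares the whole text '[1,2]' against '[1]' and '[2]'. A minimal sketch of that behaviour, assuming NumPy is imported as np and using filterstr in place of the question's filter (the names cell and candidates are illustrative only):

```python
import ast
import numpy as np

filterstr = [1, 2]
cell = '[1,2]'  # one value from the id column, still a string

# The question's comparison list: each filter value rendered as '[1]', '[2]'.
candidates = [str([i]) for i in filterstr]

# np.array(cell) holds the whole string as a single element, so isin tests
# '[1,2]' against '[1]' and '[2]' and finds no match.
whole_string_match = bool(np.isin(np.array(cell), candidates).all())

# A single-element cell such as '[1]' happens to equal one candidate exactly,
# which is why the attempt appeared to work for single values.
single_match = bool(np.isin(np.array('[1]'), candidates).all())

# Parsing the string first gives real integers that can be compared one by one.
parsed = ast.literal_eval(cell)
element_match = any(x in filterstr for x in parsed)
```

Note also that the expected output keeps [2,4], which shares only one value with the filter, so the test has to be "any element matches" rather than "all elements match".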

Creating the DataFrame:

data = { "id" : [[1,2],[2,4],[4,3]],  # Store real lists: no quotes around the brackets.
         "name" : ['a','b','c'] }
df = pd.DataFrame(data)
df

Result:

       id name
0  [1, 2]    a
1  [2, 4]    b
2  [4, 3]    c

Then let's create a variable that will serve as our boolean filter:

reg = [1,2]
filter = df.id.apply(lambda x: any(i in reg for i in x))  # True if any element of the list is in reg.

Intermediate result:

0     True
1     True
2    False
Name: id, dtype: bool

Then select only the rows where the filter is True:

df = df[filter]
df

Final result:

       id name
0  [1, 2]    a
1  [2, 4]    b
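
The same boolean filter can probably also be built without a Python-level lambda, by exploding the lists into one row per element and testing them with isin. This is only a sketch under the assumption that the id column already holds real lists (as in the DataFrame above) and that a pandas version providing Series.explode is installed; the names exploded, mask and result are illustrative:

```python
import pandas as pd

df = pd.DataFrame({"id": [[1, 2], [2, 4], [4, 3]], "name": ["a", "b", "c"]})
reg = [1, 2]

# One row per list element; the original row label is repeated in the index.
exploded = df["id"].explode()

# Test every element, then collapse back to one boolean per original row.
mask = exploded.isin(reg).groupby(level=0).any()

# Boolean indexing keeps only the rows whose list contains a value from reg.
result = df[mask]
```

Replacing .any() with .all() in the groupby step would instead keep only rows whose every element appears in reg.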
