Filter a pandas DataFrame column of lists with another list
Input DataFrame:
import numpy as np
import pandas as pd

data = { "id" : ['[1,2]','[2,4]','[4,3]'],
"name" : ['a','b','c'] }
df = pd.DataFrame(data)
filterstr = [1,2]
Expected output:
id name
[1,2] a
[2,4] b
Tried code:
df1 = df[df.id.map(lambda x: np.isin(np.array(x), [[ str([i]) for i in filterstr]]).all())]
This works for a single value in the id column, but not for two values such as '[1,2]'. I'm not sure where I am going wrong.
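One way to see what the attempt is comparing is to inspect the pieces in isolation. The sketch below assumes the filter list is the `filterstr` defined above; the names `candidates` and `cell` are illustrative only. Each cell is a single string, so `np.array(x)` wraps one string rather than a list of numbers, and it is compared against the strings `'[1]'` and `'[2]'`:

```python
import numpy as np

filterstr = [1, 2]

# What the list comprehension in the attempt produces: strings, not numbers.
candidates = [str([i]) for i in filterstr]

# A cell from the id column is one string, so this is a 0-d string array.
cell = np.array('[1,2]')
print(cell.shape)

# The whole string '[1,2]' is not equal to '[1]' or '[2]', so no match.
print(bool(np.isin(cell, candidates)))

# A single-value cell such as '[1]' happens to equal one candidate string.
print(bool(np.isin(np.array('[1]'), candidates)))
```

This is consistent with the behaviour described: a single-value cell matches by string equality, while a two-value cell never does.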
Taking exactly what you've given:
data = { "id" : ['[1,2]','[2,4]','[4,3]'],
"name" : ['a','b','c'] }
df = pd.DataFrame(data)
filterstr = [1,2]
I do:
df['id'] = df['id'].apply(eval) # Convert from string to list.
# Keep rows whose list shares at least one element with filterstr.
output = df[df.id.map(lambda ids: any(x in filterstr for x in ids))]
print(output)
Output:
id name
0 [1, 2] a
1 [2, 4] b
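If the lists really do arrive as strings, note that `eval` will execute whatever a cell contains. A minimal sketch of the same filter using `ast.literal_eval` from the standard library, which only parses Python literals. The names `parsed`, `allowed` and `has_overlap` are illustrative and not from the original posts:

```python
import ast

import pandas as pd

df = pd.DataFrame({"id": ["[1,2]", "[2,4]", "[4,3]"], "name": ["a", "b", "c"]})
filterstr = [1, 2]

# literal_eval accepts only literals (lists, numbers, strings, ...),
# so a cell cannot trigger arbitrary code execution.
parsed = df["id"].apply(ast.literal_eval)

# isdisjoint is False when the row's list shares any element with the set.
allowed = set(filterstr)
has_overlap = parsed.apply(lambda ids: not allowed.isdisjoint(ids))

# The original string column is left untouched; only the mask uses parsed lists.
output = df[has_overlap]
print(output)
```

Here the `id` column keeps its original string form in the result, because the parsed lists are used only to build the mask.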
Creating the DataFrame:
data = { "id" : [[1,2],[2,4],[4,3]], # store real lists: no quotes around the []
"name" : ['a','b','c'] }
df = pd.DataFrame(data)
df
Result:
index | id | name
---|---|---
0 | 1,2 | a
1 | 2,4 | b
2 | 4,3 | c
Then let's create a variable that will serve as our boolean filter:
reg = [1,2]
# True where the row's list contains at least one value from reg
mask = df.id.apply(lambda x: any(i in reg for i in x))
Intermediate result:
0 True
1 True
2 False
Name: id, dtype: bool
Then select only the rows where the mask is True:
df = df[mask]
df
Final result:
index | id | name
---|---|---
0 | 1,2 | a
1 | 2,4 | b
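For larger frames, the per-row lambda could be replaced by pandas operations. A sketch, assuming pandas 0.25 or later (for `Series.explode`) and a unique index; the names `hits`, `any_mask` and `all_mask` are illustrative. It also shows the stricter variant that keeps a row only when every element is in `reg`:

```python
import pandas as pd

df = pd.DataFrame({"id": [[1, 2], [2, 4], [4, 3]], "name": ["a", "b", "c"]})
reg = [1, 2]

# explode yields one row per list element, repeating the original index label.
hits = df["id"].explode().isin(reg)

# Collapse back to one boolean per original row.
any_mask = hits.groupby(level=0).any()  # at least one element is in reg
all_mask = hits.groupby(level=0).all()  # every element is in reg

print(df[any_mask])
print(df[all_mask])
```

With the sample data, `any_mask` keeps rows 0 and 1, matching the result above, while `all_mask` keeps only row 0.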