Pandas: How to check if a list-type column is in dataframe
How can I create a new list column from a list column?
My dataframe:

    id   x  list_id
     1  20  [2, 4]
     2  10  [1, 3]
     3  10  [1]
     4  30  [1, 2]
What I want:

    id   x  list_id  list_x
     1  20  [2, 4]   [10, 30]
     2  10  [1, 3]   [20, 10]
     3  10  [1]      [20]
     4  30  [1, 2]   [20, 10]
My first idea is to iterate over each row and check whether the id is in the list:

    for index, row in df.iterrows():
        if df['id'].isin(row['list_id']):
            do_something

But it's not working. Any suggestions?
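A likely reason the loop fails: `df['id'].isin(...)` returns a boolean Series with one value per row, and putting a Series in an `if` raises a `ValueError` because it has no single truth value. The sketch below is one way to repair the loop, using the Series as a row mask instead of a condition. The names `mask` and `raised` are illustrative and not from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "x": [20, 10, 10, 30],
    "list_id": [[2, 4], [1, 3], [1], [1, 2]],
})

# Using a boolean Series directly as an `if` condition raises ValueError.
try:
    if df["id"].isin([2, 4]):
        pass
    raised = False
except ValueError:
    raised = True

# Repaired loop: use the boolean Series as a mask to select the matching x values.
list_x = []
for _, row in df.iterrows():
    mask = df["id"].isin(row["list_id"])
    list_x.append(df.loc[mask, "x"].tolist())

df["list_x"] = list_x
print(df)
```

This keeps the row-by-row structure of the original attempt; the answers below do the same work more compactly.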
Use a list comprehension:

    df.loc[:,'list_x'] = [df.x[df['id'].isin(l)].values for l in df.list_id]
Full example with dummy data:

    import pandas as pd

    data = {
        'id': [1,2,3,4],
        'x': [20,10,10,30],
        'list_id': [[2,4],[1,3],[1],[1,2]],
    }
    df = pd.DataFrame(data)
    df.loc[:,'list_x'] = [df.x[df['id'].isin(l)].values for l in df.list_id]
Output:

    print(df)

       list_id   x    list_x
    1   [2, 4]  20  [10, 30]
    2   [1, 3]  10  [20, 10]
    3      [1]  10      [20]
    4   [1, 2]  30  [20, 10]
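One thing to keep in mind: `isin` selects rows in dataframe order, so each `list_x` follows the order of `id` in `df`, not the order of the ids inside `list_id`. If the list order matters, a dictionary lookup is one possible alternative (it is not part of the answer above). This sketch assumes ids are unique and that every id in `list_id` exists in `df`; otherwise the lookup raises a `KeyError`. The name `x_by_id` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "x": [20, 10, 10, 30],
    "list_id": [[2, 4], [1, 3], [1], [1, 2]],
})

# Map each id to its x value (assumes ids are unique).
x_by_id = df.set_index("id")["x"].to_dict()

# Look up every id in the order it appears in list_id.
df["list_x"] = [[x_by_id[i] for i in ids] for ids in df["list_id"]]

# Order is preserved: ids [4, 2] give [30, 10], where isin would give [10, 30].
ordered = [x_by_id[i] for i in [4, 2]]
by_isin = df["x"][df["id"].isin([4, 2])].tolist()

print(df)
```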
Creative Solution

Using numpy object arrays with set elements
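    import numpy as np

    # One single-element set per id
    i = np.array([set([x]) for x in df.id.values.tolist()])
    # Object array holding each x wrapped in a one-element list
    x = np.empty(i.shape, dtype=object)
    x[:] = [[x] for x in df.x.values.tolist()]
    # Same-shaped object array of empty lists, used where there is no match
    y = np.empty_like(x)
    y.fill([])
    # One set per list_id entry
    j = np.array([set(x) for x in df.list_id.values.tolist()])
    # i <= j[:, None] is a broadcast subset test; summing lists along axis 1 concatenates them
    df.assign(list_x=np.where(i <= j[:, None], x, y).sum(1))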
       id   x list_id    list_x
    0   1  20  [2, 4]  [10, 30]
    1   2  10  [1, 3]  [20, 10]
    2   3  10     [1]      [20]
    3   4  30  [1, 2]  [20, 10]
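To make the logic of the set comparison easier to follow, here is a plain-Python sketch of what the broadcast subset test appears to compute. For each row's `list_id`, it checks whether `{id}` is a subset of `set(list_id)` for every id, then keeps the matching `x` values. It uses no numpy, and the names `mask` and `list_x` are illustrative only.

```python
ids = [1, 2, 3, 4]
xs = [20, 10, 10, 30]
list_ids = [[2, 4], [1, 3], [1], [1, 2]]

# mask[r][c] is True when id c is contained in the list of row r
# (the same test as {id} <= set(list_id)).
mask = [[{i} <= set(lst) for i in ids] for lst in list_ids]

# Keep the x values where the mask is True, one list per row.
list_x = [[x for x, keep in zip(xs, row) if keep] for row in mask]

print(list_x)
```

As in the numpy version, the values come out in the order of `ids`, not in the order of each list.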
Timing

    %timeit df.assign(list_x=[df.x[df['id'].isin(l)].values for l in df.list_id])
    1000 loops, best of 3: 1.21 ms per loop

    %%timeit
    i = np.array([set([x]) for x in df.id.values.tolist()])
    x = np.empty(i.shape, dtype=object)
    x[:] = [[x] for x in df.x.values.tolist()]
    y = np.empty_like(x)
    y.fill([])
    j = np.array([set(x) for x in df.list_id.values.tolist()])
    df.assign(list_x=np.where(i <= j[:, None], x, y).sum(1))
    1000 loops, best of 3: 371 µs per loop