Pandas: How to check if a list-type column is in dataframe

How can I create a new list column from an existing list column?

My dataframe:

id    x    list_id
1     20   [2, 4]
2     10   [1, 3]
3     10   [1]
4     30   [1, 2]

What I want:

id    x    list_id    list_x
1     20   [2, 4]     [10, 30]
2     10   [1, 3]     [20, 10]
3     10   [1]        [20]
4     30   [1, 2]     [20, 10]

My first idea was to iterate over each row and check whether the id is in the list:

for index, row in df.iterrows():
  if ( df['id'].isin(row['list_id']) ):
     do_something

But it's not working. Any suggestions?

Use a list comprehension:

df['list_x'] = [df.x[df['id'].isin(l)].values for l in df.list_id]
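
To see why the comprehension works where the loop in the question did not, it may help to look at a single row. The sketch below rebuilds the question's sample data and is only an illustration; the names `wanted`, `mask`, `selected` and `ambiguous` are made up here. `isin` returns a boolean Series with one entry per row, which cannot be used directly as an `if` condition but can be used to select rows.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'x': [20, 10, 10, 30],
    'list_id': [[2, 4], [1, 3], [1], [1, 2]],
})

wanted = df.loc[0, 'list_id']          # the ids listed in the first row: [2, 4]
mask = df['id'].isin(wanted)           # boolean Series, one entry per row of df
selected = df.loc[mask, 'x'].tolist()  # x values of the rows whose id is wanted

# Using the Series itself as a condition raises ValueError, because the
# truth value of a multi-element Series is ambiguous.
ambiguous = False
try:
    if mask:
        pass
except ValueError:
    ambiguous = True

print(mask.tolist())   # [False, True, False, True]
print(selected)        # [10, 30]
```

The list comprehension simply repeats the `mask` and `selected` steps for every entry of `list_id`.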

Full example with dummy data:

import pandas as pd

data = {
    'id': [1, 2, 3, 4],
    'x': [20, 10, 10, 30],
    'list_id': [[2, 4], [1, 3], [1], [1, 2]],
}

df = pd.DataFrame(data)

# For each list of ids, collect the x values of the rows whose id is in that list
df['list_x'] = [df.x[df['id'].isin(l)].values for l in df.list_id]

Output

print(df)

   id   x list_id    list_x
0   1  20  [2, 4]  [10, 30]
1   2  10  [1, 3]  [20, 10]
2   3  10     [1]      [20]
3   4  30  [1, 2]  [20, 10]
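
One thing to keep in mind: `isin` picks rows in the DataFrame's own order, so the order of ids inside each `list_id` entry and any repeated ids are ignored. For the sample data that happens to give the wanted result. If the order within each list matters, a dictionary lookup is one possible alternative. This is a different technique from the one above, sketched under the assumptions that `id` is unique and that every id in `list_id` exists in the frame (a missing id would raise `KeyError`); `x_by_id` is a name made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'x': [20, 10, 10, 30],
    'list_id': [[2, 4], [1, 3], [1], [1, 2]],
})

# Build the id -> x mapping once (assumes ids are unique).
x_by_id = df.set_index('id')['x'].to_dict()

# Look up each id in the order it appears in list_id.
df['list_x'] = [[x_by_id[i] for i in ids] for ids in df['list_id']]

print(df)
```

Because the mapping is built once, each row only costs a few dictionary lookups instead of a full `isin` scan of the `id` column.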

Creative Solution
Using numpy object arrays with set elements

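import numpy as np

# i[k] is the one-element set {id of row k}
i = np.array([set([x]) for x in df.id.values.tolist()])
# x[k] is the one-element list [x of row k], stored in an object array
x = np.empty(i.shape, dtype=object)
x[:] = [[x] for x in df.x.values.tolist()]
# y holds empty lists, used wherever an id is not wanted
y = np.empty_like(x)
y.fill([])
# j[r] is the set of ids listed in row r
j = np.array([set(x) for x in df.list_id.values.tolist()])

# i <= j[:, None] is an elementwise subset test broadcast to an n-by-n grid;
# summing along axis 1 concatenates the selected one-element lists
df.assign(list_x=np.where(i <= j[:, None], x, y).sum(1))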

   id   x list_id    list_x
0   1  20  [2, 4]  [10, 30]
1   2  10  [1, 3]  [20, 10]
2   3  10     [1]      [20]
3   4  30  [1, 2]  [20, 10]
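
The key step in this approach is the comparison `i <= j[:, None]`: for sets, `<=` means "is a subset of", and on object arrays NumPy applies it element by element while broadcasting. The sketch below isolates that step on the sample ids so the resulting grid can be inspected. It fills the object arrays in explicit loops for clarity, which is a choice made for this illustration, and `mask` is a name made up here.

```python
import numpy as np

ids = [1, 2, 3, 4]
list_ids = [[2, 4], [1, 3], [1], [1, 2]]

# Object arrays whose elements are Python sets.
i = np.empty(len(ids), dtype=object)
j = np.empty(len(list_ids), dtype=object)
for k, value in enumerate(ids):
    i[k] = {value}            # singleton set for the id of row k
for r, wanted in enumerate(list_ids):
    j[r] = set(wanted)        # ids wanted by row r

# mask[r, k] is True when {id of row k} is a subset of the ids wanted by row r.
mask = (i <= j[:, None]).astype(bool)

print(mask)
```

Row `r` of `mask` marks which rows of the frame contribute their `x` value to row `r` of `list_x`.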

Timing

%timeit df.assign(list_x=[df.x[df['id'].isin(l)].values for l in df.list_id])

1000 loops, best of 3: 1.21 ms per loop

%%timeit 
i = np.array([set([x]) for x in df.id.values.tolist()])
x = np.empty(i.shape, dtype=object)
x[:] = [[x] for x in df.x.values.tolist()]
y = np.empty_like(x)
y.fill([])
j = np.array([set(x) for x in df.list_id.values.tolist()])

df.assign(list_x=np.where(i <= j[:, None], x, y).sum(1))

1000 loops, best of 3: 371 µs per loop
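
The `%timeit` magics above only work in IPython or Jupyter. Outside of those, a rough equivalent can be put together with the standard `timeit` module. The sketch below wraps the two approaches in functions named `with_isin` and `with_sets` (names made up here) and uses 200 calls per repeat, an arbitrary choice. Results depend on the machine and on the pandas and NumPy versions, and with only four rows they say little about how either approach scales.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'x': [20, 10, 10, 30],
    'list_id': [[2, 4], [1, 3], [1], [1, 2]],
})

def with_isin(df):
    # List comprehension: one isin() mask per entry of list_id.
    return df.assign(list_x=[df.x[df['id'].isin(l)].values for l in df.list_id])

def with_sets(df):
    # Object arrays of sets compared with a broadcast subset test.
    i = np.array([set([v]) for v in df.id.values.tolist()])
    x = np.empty(i.shape, dtype=object)
    x[:] = [[v] for v in df.x.values.tolist()]
    y = np.empty_like(x)
    y.fill([])
    j = np.array([set(v) for v in df.list_id.values.tolist()])
    return df.assign(list_x=np.where(i <= j[:, None], x, y).sum(1))

def as_lists(frame):
    # Normalise arrays and lists to plain lists so results can be compared.
    return [list(v) for v in frame['list_x']]

# Check that both approaches agree before comparing their timings.
results_match = as_lists(with_isin(df)) == as_lists(with_sets(df))

# Best of 3 repeats of 200 calls each, stored as seconds per call.
timings = {}
for fn in (with_isin, with_sets):
    runs = timeit.repeat(lambda: fn(df), number=200, repeat=3)
    timings[fn.__name__] = min(runs) / 200
    print(f'{fn.__name__}: {timings[fn.__name__] * 1e6:.0f} µs per loop')
```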
