Data-frame columns to indicate elements in lists

Question

The raw data is as follows:

all_names = ['Darren','John','Kate','Mike','Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']

What I want is a data frame whose columns indicate which names appear in each list (1 if present, 0 if absent).

I tried looping over the lists and building new lists, appending 0 for each missing name and 1 otherwise.

It is clumsy and troublesome, especially as the number of lists grows.

new_list_0 = []
for _ in all_names:
    if _ not in list_0:
        new_list_0.append(0)
    else:
        new_list_0.append(1)

new_list_1 = []
for _ in all_names:
    if _ not in list_1:
        new_list_1.append(0)
    else:
        new_list_1.append(1)

import pandas as pd

data = [all_names, new_list_0,new_list_1]
column_names = data.pop(0)
df = pd.DataFrame(data, columns=column_names)

Output:

   Darren  John  Kate  Mike  Nancy
0       0     1     0     1      0
1       0     0     1     0      1

Is there a smarter way? Thank you.

Answer 1

Let us try str.get_dummies and reindex:

df=pd.Series([list_0,list_1]).str.join(',').str.get_dummies(',').reindex(columns=all_names,fill_value=0)
Out[160]: 
   Darren  John  Kate  Mike  Nancy
0       0     1     0     1      0
1       0     0     1     0      1
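
To make the one-liner above easier to follow, here is a sketch that splits the same chain into its three steps. The intermediate names `joined` and `dummies` are only for illustration and are not part of the original answer.

```python
import pandas as pd

all_names = ['Darren', 'John', 'Kate', 'Mike', 'Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']

# Step 1: one comma-joined string per list, e.g. 'John,Mike'
joined = pd.Series([list_0, list_1]).str.join(',')

# Step 2: split on ',' and one-hot encode; only names that occur get a column
dummies = joined.str.get_dummies(',')

# Step 3: add the names that never occur (here 'Darren') as all-zero columns
# and order the columns like all_names
df = dummies.reindex(columns=all_names, fill_value=0)
print(df)
```

Because the names are joined and split on a comma, this approach presumably only suits names that do not themselves contain a comma; a different separator could be passed to both `str.join` and `str.get_dummies` if needed.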

Answer 2

You can use a pandas Series:

x = pd.Series(all_names)
pd.concat([x.isin(list_0), x.isin(list_1)], axis=1).astype(int).T
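
As written, the snippet above yields columns labelled 0 to 4 rather than the names. A possible variant, assuming the lists are collected in a list called `lists` (a name introduced here for illustration), indexes the Series by the names so they carry over as column labels after the transpose:

```python
import pandas as pd

all_names = ['Darren', 'John', 'Kate', 'Mike', 'Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']
lists = [list_0, list_1]  # illustrative container; extend with more lists as needed

# Index the Series by the names themselves so isin() results are labelled
x = pd.Series(all_names, index=all_names)

# One boolean column per list, cast to 0/1, then transpose to one row per list
df = pd.concat([x.isin(lst) for lst in lists], axis=1).astype(int).T
print(df)
```

The comprehension keeps the code the same length no matter how many lists there are.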

Answer 3

Using dict.fromkeys() + fillna:

import pandas as pd

all_names = ['Darren', 'John', 'Kate', 'Mike', 'Nancy']

list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']

df = (
    pd.DataFrame([dict.fromkeys(x, 1) for x in [list_0, list_1]],
                 columns=all_names)
).fillna(0)

   Darren  John  Kate  Mike  Nancy
0     0.0   1.0   0.0   1.0    0.0
1     0.0   0.0   1.0   0.0    1.0
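
The result above is float because the missing entries start out as NaN before `fillna(0)`. If integer 0/1 values are wanted, as in the question, one option is to add a cast at the end. This is a small sketch of that adjustment, not part of the original answer:

```python
import pandas as pd

all_names = ['Darren', 'John', 'Kate', 'Mike', 'Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']

# Each list becomes a dict such as {'John': 1, 'Mike': 1}; names absent
# from a dict show up as NaN in the frame
df = pd.DataFrame(
    [dict.fromkeys(x, 1) for x in [list_0, list_1]],
    columns=all_names,
)

# Replace NaN with 0, then cast the float columns to integers
df = df.fillna(0).astype(int)
print(df)
```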

Answer 4

Using normal pandas operations and list comprehensions.

import pandas as pd


all_names = ['Darren','John','Kate','Mike','Nancy']
list_0 = ['John', 'Mike']
list_1 = ['Kate', 'Nancy']

lists = [list_0, list_1]

# DataFrame.append was removed in pandas 2.0, so build every row first
# and construct the frame in a single call.
rows = [[int(name in item) for name in all_names] for item in lists]
df = pd.DataFrame(rows, columns=all_names)

print(df)

Output:

   Darren  John  Kate  Mike  Nancy
0       0     1     0     1      0
1       0     0     1     0      1
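
Since the question mentions a growing number of lists, the list-comprehension idea can be wrapped in a small function. The sketch below is one way to do that; `indicator_frame` is a hypothetical helper name, and converting each list to a set is an added assumption meant to keep membership tests cheap for long lists.

```python
import pandas as pd


def indicator_frame(all_names, lists):
    """Return a 0/1 frame with one row per list and one column per name.

    Hypothetical helper, not taken from any of the answers.
    """
    # Convert each list to a set once so every `in` check is a hash lookup
    rows = [
        [int(name in members) for name in all_names]
        for members in map(set, lists)
    ]
    return pd.DataFrame(rows, columns=all_names)


all_names = ['Darren', 'John', 'Kate', 'Mike', 'Nancy']
df = indicator_frame(all_names, [['John', 'Mike'], ['Kate', 'Nancy'], []])
print(df)
```

An empty list simply produces a row of zeros, and names that are not in `all_names` are ignored.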


Answer 1
2 Accepted 2020-08-09 14:52:01

Answer 2
1 2020-08-09 14:52:11

Answer 3
1 2020-08-09 14:52:21

Answer 4
1 2020-08-09 15:19:33
