Mapping a list to 1s and 0s

Question

I have two lists, my_genre and list_of_genres. I want a function that checks whether my_genre[index] is in list_of_genres and, if so, converts list_of_genres[index2] into a 1.

list_of_genres = ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy', 'Drama', 'Romance', 'Action', 'Thriller', 'Sci-Fi', 'Crime', 'Horror', 'Mystery', 'IMAX', 'Documentary', 'War', 'Musical', 'Western', 'Film-Noir']


my_genre = ['Action', 'Crime', 'Drama', 'Thriller']

Expected result:

[0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 0 0]
Data type: np.array

Ultimately, I want to apply the function that does this to a pandas column that contains the genres.

Answer 1

NumPy's isin is what you are looking for.

import numpy as np

results = np.isin(list_of_genres, my_genre).astype(int)

It's the same for pandas.

import pandas as pd

list_of_genres = ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy', 'Drama', 'Romance', 'Action', 'Thriller', 'Sci-Fi', 'Crime', 'Horror', 'Mystery', 'IMAX', 'Documentary', 'War', 'Musical', 'Western', 'Film-Noir']
my_genre = ['Action', 'Crime', 'Drama', 'Thriller']

df = pd.DataFrame({"genres" : list_of_genres})
df["my_genre"]  = df["genres"].isin(my_genre).astype(int)
print(df)
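
As a quick check, the sketch below runs the NumPy version end to end and compares it with the expected result from the question. The argument order matters: the first argument to np.isin determines the shape of the output, so list_of_genres goes first. The name `expected` is introduced here purely for illustration.

```python
import numpy as np

list_of_genres = ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy',
                  'Drama', 'Romance', 'Action', 'Thriller', 'Sci-Fi', 'Crime',
                  'Horror', 'Mystery', 'IMAX', 'Documentary', 'War', 'Musical',
                  'Western', 'Film-Noir']
my_genre = ['Action', 'Crime', 'Drama', 'Thriller']

# One flag per entry of list_of_genres (the first argument sets the shape).
results = np.isin(list_of_genres, my_genre).astype(int)

# Expected vector copied from the question (`expected` is an illustrative name).
expected = np.array([0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0])
print(np.array_equal(results, expected))

# Swapping the arguments answers a different question:
# which entries of my_genre appear in list_of_genres?
swapped = np.isin(my_genre, list_of_genres).astype(int)
print(swapped)
```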

Answer 2

A map()-based solution producing a list:

ll = list(map(int, map(my_genre.__contains__, list_of_genres)))
print(ll)
# [0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

For the result to be a numpy.ndarray, you could use np.fromiter():

import numpy as np

arr = np.fromiter(map(my_genre.__contains__, list_of_genres), dtype=int)
print(arr)
# [0 0 0 0 0 1 0 1 1 0 1 0 0 0 0 0 0 0 0]

For larger inputs, np.isin() should be the fastest. For inputs of this size, the map() approach is ~6 times faster than np.isin(), ~65 times faster than the pandas solution, and ~40% faster than the equivalent generator expression.

%timeit np.isin(list_of_genres, my_genre).astype(int)                                                                                        
# 15.8 µs ± 385 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.fromiter(map(my_genre.__contains__, list_of_genres), dtype=int)                                                                   
# 2.55 µs ± 27.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.fromiter((my_genre.__contains__(x) for x in list_of_genres), dtype=int)                                                           
# 4.14 µs ± 19.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit df["genres"].isin(my_genre).astype(int)                                                                                              
# 167 µs ± 2.26 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

This can be sped up further by converting my_genre to a set before applying the in / .__contains__ operator:

%timeit np.fromiter(map(set(my_genre).__contains__, list_of_genres), dtype=int)                                                              
# 1.9 µs ± 7.17 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
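
The set-based variant can be wrapped in a small reusable function. This is only a sketch: the function name `encode_genres` and its parameter names are made up for illustration, and the set is built once per call so that each membership test is a hash lookup rather than a scan of the list.

```python
import numpy as np

def encode_genres(selected, vocabulary):
    """Return a 0/1 NumPy array with one entry per item in `vocabulary`."""
    selected_set = set(selected)  # built once; lookups are O(1) on average
    return np.fromiter(
        (g in selected_set for g in vocabulary),
        dtype=int,
        count=len(vocabulary),  # lets NumPy preallocate the output
    )

list_of_genres = ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy',
                  'Drama', 'Romance', 'Action', 'Thriller', 'Sci-Fi', 'Crime',
                  'Horror', 'Mystery', 'IMAX', 'Documentary', 'War', 'Musical',
                  'Western', 'Film-Noir']
my_genre = ['Action', 'Crime', 'Drama', 'Thriller']

arr = encode_genres(my_genre, list_of_genres)
print(arr)
```

For lists as short as these the gain from the set is small; it matters more as my_genre grows.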

Answer 3

Here it is, although your question is poorly formulated.

list_of_genres = ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy', 'Drama', 'Romance', 'Action', 'Thriller', 'Sci-Fi', 'Crime', 'Horror', 'Mystery', 'IMAX', 'Documentary', 'War', 'Musical', 'Western', 'Film-Noir']
my_genre = ['Action', 'Crime', 'Drama', 'Thriller']

idx = [1 if g in my_genre else 0 for g in list_of_genres]

Output:

Out[13]: [0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

If you want a NumPy array, simply convert the list with numpy.asarray(). To apply it to a DataFrame, change my_genre and list_of_genres accordingly.
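
One way the DataFrame case might look is sketched below. It assumes that each row of the column holds a Python list of genres; the question does not say how the column is stored, so the `movies` frame, its titles, and the helper `to_indicator` are all hypothetical. The same comprehension is applied row by row and the results are stacked into one indicator column per genre.

```python
import numpy as np
import pandas as pd

list_of_genres = ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy',
                  'Drama', 'Romance', 'Action', 'Thriller', 'Sci-Fi', 'Crime',
                  'Horror', 'Mystery', 'IMAX', 'Documentary', 'War', 'Musical',
                  'Western', 'Film-Noir']

# Hypothetical sample data: each row stores its genres as a list.
movies = pd.DataFrame({
    "title": ["Movie A", "Movie B"],
    "genres": [["Action", "Crime", "Drama", "Thriller"], ["Comedy", "Romance"]],
})

def to_indicator(genres):
    """Map one row's genre list to a 0/1 vector over list_of_genres."""
    return np.asarray([1 if g in genres else 0 for g in list_of_genres])

# Stack the per-row vectors into a 2-D array: rows are movies, columns are genres.
vectors = np.vstack([to_indicator(g) for g in movies["genres"]])
indicators = pd.DataFrame(vectors, columns=list_of_genres, index=movies.index)

print(indicators.loc[0].tolist())
```

If the column instead stores genres as a delimited string, each value would need to be split into a list first.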

Answer 4

Try this:

>>> list_of_genres = ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy', 'Drama', 'Romance', 'Action', 'Thriller', 'Sci-Fi', 'Crime', 'Horror', 'Mystery', 'IMAX', 'Documentary', 'War', 'Musical', 'Western', 'Film-Noir']


>>> my_genre = ['Action', 'Crime', 'Drama', 'Thriller']

Output:

>>> [1 if el in my_genre else 0 for el in list_of_genres]

[0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

Answer 5

If you want to use pandas, as your tags suggest, you can do:

import pandas as pd
list_of_genres = ['Adventure', 'Animation', 'Children', 'Comedy',
                  'Fantasy', 'Drama', 'Romance', 'Action', 'Thriller',
                  'Sci-Fi', 'Crime', 'Horror', 'Mystery', 'IMAX',
                  'Documentary', 'War', 'Musical', 'Western', 'Film-Noir']

my_genre = ['Action', 'Crime', 'Drama', 'Thriller']

df = pd.DataFrame({"genre": list_of_genres})

df["genre"].apply(lambda x: x in my_genre).astype(int)

# or even faster

df["genre"].isin(my_genre).astype(int)

Answer 6

This should do it as a nice little one-liner:

import numpy as np

list_of_genres = ['Adventure', 'Animation', 'Children', 'Comedy', 'Fantasy', 'Drama', 'Romance', 'Action', 'Thriller', 'Sci-Fi', 'Crime', 'Horror', 'Mystery', 'IMAX', 'Documentary', 'War', 'Musical', 'Western', 'Film-Noir']
my_genre = ['Action', 'Crime', 'Drama', 'Thriller']

# int(True) == 1 and int(False) == 0, so each membership test becomes a flag.
result = np.array([int(n in my_genre) for n in list_of_genres])

Output:

array([0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0])

Answer 7

You can use a list comprehension as a one-line solution:

bool_list = [1 if item in my_genre else 0 for item in list_of_genres]

If you are new to this and don't quite understand list comprehensions, you can write it out as a for loop:

bool_list = []
for item in list_of_genres:
    # Append 1 when the genre is one of mine, otherwise 0.
    if item in my_genre:
        bool_list.append(1)
    else:
        bool_list.append(0)

7 answers

Answer 1
5 2019-10-15 11:03:18

Answer 2
1 2019-10-15 11:24:53

Answer 3
0 2019-10-15 10:59:29

Answer 4
0 2019-10-15 11:01:34

Answer 5
0 2019-10-15 11:01:54

Answer 6
0 2019-10-15 11:07:40

Answer 7
0 2019-10-15 11:18:03
