Column stored as lists: how to split into columns in pandas (Python)?

Question

Assume the "Tags" column is stored as shown below. How can I split it into multiple columns, or combine it into one list?

Desired result: all tags combined into a single list, with duplicates filtered out.

"Tags"
['Saudi', 'law', 'Saudi Arabia', 'rules']
['Hindi', 'Tamil', 'imposition', 'cbse', 'neet', 'Tamil Nadu', 'India']
['Stephen', 'Hawkins', 'Tamil', 'predictions', 'future', 'science', 'scientist', 'top 5', 'five']
['Bigg Boss', 'Tamil', 'Kamal', 'big', 'boss']
['Mary', 'real', 'story', 'Tamil', 'history']
['football', 'Tamil', 'FIFA', '2018', 'world cup', 'MG', 'top', '10', 'ten']
['India', 'Tamil', 'poor', 'rich', 'money', 'MG', 'why', 'Indians']

Answer 1

Try:

df["Tags"].explode().unique()

Or:

import numpy as np
np.unique(df["Tags"].sum())

Edit:

If the lists are stored as strings, maybe you need:

import ast
df["Tags"].apply(ast.literal_eval).explode().unique()
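
As a minimal, self-contained sketch of the first approach (the two-row DataFrame is made up for illustration, reusing a few tags from the question): `explode()` puts each list element on its own row, and `unique()` then drops repeats while keeping the order of first appearance. `np.unique`, by contrast, returns the values sorted.

```python
import pandas as pd

# Hypothetical sample data: each cell of "Tags" holds a real Python list.
df = pd.DataFrame({
    "Tags": [
        ["Saudi", "law", "Saudi Arabia", "rules"],
        ["India", "Tamil", "law", "Saudi"],
    ]
})

# explode() gives one row per list element; unique() drops repeats
# and keeps values in order of first appearance.
unique_tags = df["Tags"].explode().unique().tolist()
print(unique_tags)
# ['Saudi', 'law', 'Saudi Arabia', 'rules', 'India', 'Tamil']
```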

Answer 2

If you need a list without duplicates and performance is important, flatten the lists with a generator expression and pass the result to set:

L = list(set(y for x in df['Tags'] for y in x))

If the lists may be saved as strings, use:

import ast

L = list(set(y for x in df['Tags'] for y in ast.literal_eval(x)))

print(L)
['FIFA', 'Mary', 'world cup', 'rich', 'story', 'Tamil', 'rules', 'neet', 'money', 'Kamal', 'Hindi', 'big', 'cbse', 'imposition', 'football', 'MG', 'history', 'predictions', 'why', 'Tamil Nadu', 'top 5', 'ten', '10', 'Bigg Boss', 'India', 'Stephen', 'top', 'poor', 'law', 'Saudi', 'real', 'Indians', 'future', 'boss', 'five', '2018', 'scientist', 'Saudi Arabia', 'science', 'Hawkins']
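
List-valued cells commonly come back as strings such as "['a', 'b']" after a round trip through CSV, which is the case `ast.literal_eval` handles. The sketch below is one way to cover both situations in a single pass. The helper name `as_list` and the sample data are assumptions made for illustration, not part of the original answer.

```python
import ast

import pandas as pd

# Hypothetical sample: lists saved as strings, as after a CSV round trip.
df = pd.DataFrame({
    "Tags": [
        "['Hindi', 'Tamil', 'India']",
        "['India', 'Tamil', 'poor']",
    ]
})


def as_list(value):
    """Return value as a list, parsing it first if it is a string."""
    if isinstance(value, str):
        # literal_eval only evaluates Python literals, unlike eval().
        return ast.literal_eval(value)
    return list(value)


# Set comprehension over every tag in every row removes duplicates.
unique_tags = {tag for row in df["Tags"] for tag in as_list(row)}
print(sorted(unique_tags))
# ['Hindi', 'India', 'Tamil', 'poor']
```

Sorting before printing gives a stable result, since the iteration order of a set is arbitrary.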

Answer 3

You could flatten the lists and use set():

out = []
for lst in df['Tags'].tolist():
    out.extend(lst)

out = list(set(out))

Output:

['cbse', '2018', 'future', 'India', '10', 'Indians', 'money', 
'Hindi', 'rules', 'poor', 'Kamal', 'neet', 'top 5', 'world cup', 
'five', 'law', 'ten', 'Stephen', 'Tamil', 'Mary', 'Bigg Boss', 
'top', 'scientist', 'boss', 'Saudi Arabia', 'big', 'real', 'story', 
'why', 'Hawkins', 'predictions', 'football', 'rich', 'science', 
'imposition', 'Saudi', 'FIFA', 'history', 'Tamil Nadu', 'MG']

Using the same code for the lists below:

lsts = [['thamizh', 'kannada', 'karnataka', 'bangalore', 'mysore', 
'bengaluru', 'Bengaluru', 'malayalam', 'kerala', 'chennai', 'yash',
 'kgf', 'songs', 'kannada songs', 'news', 'today'], 
 ['songs', 'kannada songs', 'news', 'today'], 
['mysore', 'bengaluru', 'Bengaluru', 'malayalam',]]

Output:

['today', 'songs', 'malayalam', 'bangalore', 'karnataka', 'kerala', 
'bengaluru', 'mysore', 'kgf', 'Bengaluru', 'chennai', 'yash', 
'thamizh', 'kannada', 'news', 'kannada songs']
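
Note that `set()` does not preserve order, which is why the outputs above list the tags in an arbitrary order. If the order of first appearance matters, one alternative (not from the original answer) is `dict.fromkeys`, which keeps insertion order in Python 3.7+. A minimal sketch with a shortened, made-up version of the lists:

```python
from itertools import chain

# Shortened sample of nested tag lists (illustrative only).
lsts = [
    ["songs", "kannada songs", "news", "today"],
    ["mysore", "news", "songs", "bengaluru"],
]

# chain.from_iterable flattens one level; dict.fromkeys drops later
# duplicates while keeping the first occurrence of each tag in order.
out = list(dict.fromkeys(chain.from_iterable(lsts)))
print(out)
# ['songs', 'kannada songs', 'news', 'today', 'mysore', 'bengaluru']
```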
