How to get a list for each unique element of a pandas DataFrame column?

I am trying to get, for each unique string in the code column of a pandas DataFrame, the list of matching values from the title column:

import pandas as pd

catalog = {'code': ['A001', 'A001', 'A001', 'A002', 'A002'], 'title': ['director', 'president', 'vice president', 'sales director', 'sales vice president']}

catalog=pd.DataFrame(catalog)

## unique column values ##
codes = catalog['code'].unique()

for code in codes:
    titles = catalog[catalog == code]['title'].tolist()
    print(titles)

This gives the following output:

[nan, nan, nan, nan, nan]
[nan, nan, nan, nan, nan]

The expected output would look like this:

['director', 'president', 'vice president']
['sales director', 'sales vice president']

What am I missing? Is there any other way to accomplish this task?

Try groupby with unique:

catalog.groupby('code')['title'].unique()
code
A001     [director, president, vice president]
A002    [sales director, sales vice president]
Name: title, dtype: object
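
Note that the result above is a Series whose values are arrays, not Python lists. If plain lists like those in the expected output are needed, one option is to convert each group explicitly. The following is a minimal sketch; the name titles_by_code is introduced here for illustration only:

```python
import pandas as pd

catalog = pd.DataFrame({
    'code': ['A001', 'A001', 'A001', 'A002', 'A002'],
    'title': ['director', 'president', 'vice president',
              'sales director', 'sales vice president'],
})

# unique() yields one array of distinct titles per code.
unique_titles = catalog.groupby('code')['title'].unique()

# Convert each array to a plain Python list, keyed by code.
titles_by_code = {code: list(arr) for code, arr in unique_titles.items()}

for code, titles in titles_by_code.items():
    print(code, titles)
```

A dict keeps the code next to its titles, which the original loop would otherwise have to track separately.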

Instead of iterating through the unique codes, it's easier to use a groupby:

catalog.groupby("code").title.apply(list)

code
A001    [director, president, vice president]
A002    [sales director, sales vice president]
Name: title, dtype: object
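
One difference between apply(list) and unique() is worth keeping in mind: apply(list) keeps repeated titles within a group, while unique() drops them. The sample data in the question has no repeats, so the sketch below uses a small hypothetical frame (df) with a duplicated title to show the difference:

```python
import pandas as pd

# Hypothetical data: 'director' appears twice under code A001.
df = pd.DataFrame({
    'code': ['A001', 'A001', 'A001', 'A002'],
    'title': ['director', 'director', 'president', 'sales director'],
})

# apply(list) collects every row of the group, duplicates included.
all_titles = df.groupby('code')['title'].apply(list)

# unique() keeps only the first occurrence of each title per group.
distinct_titles = df.groupby('code')['title'].unique()

print(all_titles['A001'])             # ['director', 'director', 'president']
print(list(distinct_titles['A001']))  # ['director', 'president']
```

Which one is appropriate depends on whether repeated titles under the same code should be reported once or every time they occur.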

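The problem in your code is that catalog == code compares every cell of the DataFrame with code, not just the code column. Indexing with the resulting boolean DataFrame keeps only the matching cells and fills everything else with NaN, and since no title ever equals a code, the whole title column becomes NaN. Compare against the column instead: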

for code in codes:
    titles = catalog[catalog['code'] == code]['title'].tolist()
    print(titles)

Or, using .loc:

for code in codes:
    titles = catalog.loc[catalog['code'] == code,'title'].tolist()
    print(titles)

['director', 'president', 'vice president']
['sales director', 'sales vice president']
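
To see why the original loop printed only NaN, it may help to look at the two masks side by side. This sketch reuses the catalog from the question; mask_frame and mask_rows are illustrative names, not part of the original code:

```python
import pandas as pd

catalog = pd.DataFrame({
    'code': ['A001', 'A001', 'A001', 'A002', 'A002'],
    'title': ['director', 'president', 'vice president',
              'sales director', 'sales vice president'],
})

# Comparing the whole DataFrame gives a boolean DataFrame of the same shape.
mask_frame = catalog == 'A001'

# Indexing with it keeps matching cells and sets all others to NaN,
# so the title column is entirely NaN.
masked = catalog[mask_frame]
print(masked)

# Comparing one column gives a boolean Series, which selects whole rows.
mask_rows = catalog['code'] == 'A001'
selected = catalog.loc[mask_rows, 'title'].tolist()
print(selected)
```

The boolean Series is what row filtering expects, which is why both corrected loops build the mask from catalog['code'].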
