How to create a dictionary of key : column_name and value : unique values in column in python from a dataframe

I am trying to create a dictionary of key:value pairs where each key is a column name of a dataframe and each value is a list containing all the unique values in that column. Ultimately I want to be able to filter key:value pairs out of the dict based on conditions. This is what I have been able to do so far:

for col in col_list[1:]:
    _list = []
    _list.append(footwear_data[col].unique())
    list_name = ''.join([str(col),'_list'])

product_list = ['shoe','footwear']
color_list = []
size_list = []

Here product, color and size are all column names, and the dict keys should be named accordingly, e.g. color_list. Ultimately I will need to access each key:value_list pair in the dictionary. Expected output:

KEY          VALUE
color_list:  ["red","blue","black"]
size_list:   ["9","XL","32","10 inches"]

Can someone please help me with this? A snapshot of the data is attached.

With a DataFrame like this:

import pandas as pd
df = pd.DataFrame(
    [["Women", "Slip on", 7, "Black", "Clarks"],
     ["Women", "Slip on", 8, "Brown", "Clarcks"],
     ["Women", "Slip on", 7, "Blue", "Clarks"]],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"],
)

print(df)

Output:

  Category Sub Category  Size  Color    Brand
0    Women      Slip on     7  Black   Clarks
1    Women      Slip on     8  Brown  Clarcks
2    Women      Slip on     7   Blue   Clarks

You can convert your DataFrame into a dict, building the new dict by mapping the columns of the DataFrame, as in this example:

new_dict = {"color_list": list(df["Color"]), "size_list": list(df["Size"])}
# OR:
#new_dict = {"color_list": [k for k in df["Color"]], "size_list": [k for k in df["Size"]]}

print(new_dict)

Output:

{'color_list': ['Black', 'Brown', 'Blue'], 'size_list': [7, 8, 7]}

To keep only the unique values, you can use set, as in this example (a set does not preserve order, so the order of the resulting lists may vary):

new_dict = {"color_list": list(set(df["Color"])), "size_list": list(set(df["Size"]))}
print(new_dict)

Output:

{'color_list': ['Brown', 'Blue', 'Black'], 'size_list': [8, 7]}

Or, as @Ami Tavory said in his answer, to get the unique values for every column of your DataFrame, you can simply do this:

new_dict = {k:list(df[k].unique()) for k in df.columns}
print(new_dict)

Output:

{'Brand': ['Clarks', 'Clarcks'],
 'Category': ['Women'],
 'Color': ['Black', 'Brown', 'Blue'],
 'Size': [7, 8],
 'Sub Category': ['Slip on']}
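
The question asks for keys such as color_list rather than the raw column names. A minimal sketch of one way to get them, assuming the same df as above. The key naming (lower-case, spaces replaced by underscores, a "_list" suffix) and the name unique_by_column are illustrative choices, not something from the original answer. drop_duplicates().tolist() is used here instead of list(... .unique()) so the values come back as plain Python objects:

```python
import pandas as pd

df = pd.DataFrame(
    [["Women", "Slip on", 7, "Black", "Clarks"],
     ["Women", "Slip on", 8, "Brown", "Clarcks"],
     ["Women", "Slip on", 7, "Blue", "Clarks"]],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"],
)

# Key naming is an assumption: lower-case, spaces -> underscores, "_list" suffix.
# drop_duplicates() keeps the first occurrence of each value, in original order.
unique_by_column = {
    col.lower().replace(" ", "_") + "_list": df[col].drop_duplicates().tolist()
    for col in df.columns
}

print(unique_by_column["color_list"])
print(unique_by_column["size_list"])
```

Each column's values can then be looked up by key, for example unique_by_column["brand_list"].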

I am trying to create a dictionary of key:value pairs where key is the column name of a dataframe and value will be a list containing all the unique values in that column.

You could use a simple dictionary comprehension for that.

Say you start with

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1], 'b': [1, 4, 5]})

Then the following comprehension solves it:

>>> {c: list(df[c].unique()) for c in df.columns}
{'a': [1, 2], 'b': [1, 4, 5]}
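
The question also mentions filtering key:value pairs out of the dict based on conditions. A second comprehension over the resulting dict is probably enough for that. This is only a sketch: the dict is written out literally so the snippet runs on its own, and both conditions (and the names filtered and containing_two) are hypothetical examples:

```python
# Result of the comprehension above, written out literally.
unique_values = {'a': [1, 2], 'b': [1, 4, 5]}

# Hypothetical condition: keep only columns with more than two distinct values.
filtered = {col: vals for col, vals in unique_values.items() if len(vals) > 2}
print(filtered)  # {'b': [1, 4, 5]}

# Hypothetical condition on the values: keep columns that contain the value 2.
containing_two = {col: vals for col, vals in unique_values.items() if 2 in vals}
print(containing_two)  # {'a': [1, 2]}
```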

If I understand your question correctly, you may need a set instead of a list. In this piece of code, you might add set to get the unique values of the given list.

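unique_values = {}  # maps e.g. 'color_list' to the set of values in that column
for col in col_list[1:]:
    list_name = ''.join([str(col), '_list'])
    # set() drops duplicate values (element order is not preserved)
    unique_values[list_name] = set(footwear_data[col])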

Sample usage:

>>> a_list = [7, 8, 7, 9, 10, 9]
>>> set(a_list)
{8, 9, 10, 7}
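
As the sample shows, a set does not keep the original order of the values. If order matters, one alternative (not part of the original answer) is dict.fromkeys, which removes duplicates while keeping the first occurrence of each value in place on Python 3.7+. The name unique_in_order is illustrative:

```python
a_list = [7, 8, 7, 9, 10, 9]

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops duplicates without reordering the remaining values.
unique_in_order = list(dict.fromkeys(a_list))
print(unique_in_order)  # [7, 8, 9, 10]
```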

Here is how I did it; let me know if it helps.

import pandas as pd

df = pd.read_csv("/path/to/csv/file")

dic = {}
for x in list(df):  # list(df) yields the column names
    # unique() keeps values in order of first appearance;
    # list() turns the result into a plain (non-nested) list
    dic[str(x) + "_list"] = list(df[x].unique())

print(dic)

Output:

{'Category_list': ['W', 'M'], 'Sub_list': ['SO', 'FOR'], 'Size_list': ['7', '8', '9', '10 inches', 'XL'], 'Color_list': ['Blue', 'Black', 'Orange', 'Red'], 'Brand_list': ['Clarks']}

My CSV file:

Category,Sub,Size,Color,Brand
W,SO,7,Blue,Clarks
W,SO,7,Blue,Clarks
W,SO,7,Black,Clarks
W,SO,8,Orange,Clarks
W,FOR,8,Red,Clarks
M,FOR,9,Black,Clarks
M,FOR,10 inches,Blue,Clarks
M,FOR,XL,Blue,Clarks
