How to create a dictionary of key : column_name and value : unique values in column in Python from a DataFrame
I am trying to create a dictionary of key:value pairs where the key is the column name of a dataframe and the value is a list containing all the unique values in that column. Ultimately I want to be able to filter out key:value pairs from the dict based on conditions.
This is what I have been able to do so far:
for col in col_list[1:]:
    _list = []
    _list.append(footwear_data[col].unique())
    list_name = ''.join([str(col),'_list'])
product_list = ['shoe','footwear']
color_list = []
size_list = []
Here product, color and size are all column names, and the dict keys should be named accordingly, like color_list etc. Ultimately I will need to access each key:value_list in the dictionary.
Expected output:
KEY VALUE
color_list : ["red","blue","black"]
size_list: ["9","XL","32","10 inches"]
Can someone please help me with this? A snapshot of the data is attached.
With a DataFrame like this:
import pandas as pd
df = pd.DataFrame([["Women", "Slip on", 7, "Black", "Clarks"], ["Women", "Slip on", 8, "Brown", "Clarcks"], ["Women", "Slip on", 7, "Blue", "Clarks"]], columns= ["Category", "Sub Category", "Size", "Color", "Brand"])
print(df)
Output:
  Category Sub Category  Size  Color    Brand
0    Women      Slip on     7  Black   Clarks
1    Women      Slip on     8  Brown  Clarcks
2    Women      Slip on     7   Blue   Clarks
You can build your new dict by mapping the columns of the DataFrame to lists, like this example:
new_dict = {"color_list": list(df["Color"]), "size_list": list(df["Size"])}
# OR:
#new_dict = {"color_list": [k for k in df["Color"]], "size_list": [k for k in df["Size"]]}
print(new_dict)
Output:
{'color_list': ['Black', 'Brown', 'Blue'], 'size_list': [7, 8, 7]}
In order to have unique values, you can use set, like this example:
new_dict = {"color_list": list(set(df["Color"])), "size_list": list(set(df["Size"]))}
print(new_dict)
Output:
{'color_list': ['Brown', 'Blue', 'Black'], 'size_list': [8, 7]}
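Note that set does not keep the order in which values first appear, so the order of the lists above can differ from the DataFrame. If that order matters, one possible alternative (not part of the original answer, shown here only as a sketch) is dict.fromkeys, which keeps the first occurrence of each value:

```python
import pandas as pd

df = pd.DataFrame(
    [["Women", "Slip on", 7, "Black", "Clarks"],
     ["Women", "Slip on", 8, "Brown", "Clarcks"],
     ["Women", "Slip on", 7, "Blue", "Clarks"]],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"],
)

# dict.fromkeys drops duplicates while keeping first-seen order,
# unlike set, whose iteration order is arbitrary
ordered_dict = {
    "color_list": list(dict.fromkeys(df["Color"])),
    "size_list": list(dict.fromkeys(df["Size"])),
}
print(ordered_dict)
```

The name ordered_dict is only illustrative; the keys match the ones used in the example above.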
Or, as @Ami Tavory said in his answer, to get the unique values for every column of your DataFrame, you can simply do this:
new_dict = {k:list(df[k].unique()) for k in df.columns}
print(new_dict)
Output:
{'Brand': ['Clarks', 'Clarcks'],
 'Category': ['Women'],
 'Color': ['Black', 'Brown', 'Blue'],
 'Size': [7, 8],
 'Sub Category': ['Slip on']}
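The question asks for keys such as color_list rather than the bare column names. A minimal sketch of that variant, assuming the key should be the lower-cased column name with spaces replaced by underscores plus a "_list" suffix (that naming rule is a guess, and unique_lists is a hypothetical name). It also calls .tolist() so the values are plain Python objects rather than NumPy scalars:

```python
import pandas as pd

df = pd.DataFrame(
    [["Women", "Slip on", 7, "Black", "Clarks"],
     ["Women", "Slip on", 8, "Brown", "Clarcks"],
     ["Women", "Slip on", 7, "Blue", "Clarks"]],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"],
)

# Assumed naming rule: "Sub Category" -> "sub_category_list"
unique_lists = {
    col.lower().replace(" ", "_") + "_list": df[col].unique().tolist()
    for col in df.columns
}

# Each list is then available under its own key
print(unique_lists["color_list"])
print(unique_lists["size_list"])
```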
I am trying to create a dictionary of key:value pairs where key is the column name of a dataframe and value will be a list containing all the unique values in that column.
You could use a simple dictionary comprehension for that.
Say you start with
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 1], 'b': [1, 4, 5]})
Then the following comprehension solves it:
>>> {c: list(df[c].unique()) for c in df.columns}
{'a': [1, 2], 'b': [1, 4, 5]}
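The question also mentions filtering key:value pairs out of the dict based on conditions, without saying which conditions. As a sketch only, assuming a hypothetical condition of "keep columns with more than two unique values", a second comprehension over the result would do it:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1], 'b': [1, 4, 5]})

# Same comprehension as above; .tolist() yields plain Python values
unique_values = {c: df[c].unique().tolist() for c in df.columns}

# Hypothetical condition: keep only columns with more than two unique values
filtered = {k: v for k, v in unique_values.items() if len(v) > 2}
print(filtered)
```

Any other test on the key k or the list v can be swapped into the if clause.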
If I understand your question correctly, you may need set instead of list. In this piece of code, you could add set to get the unique values of the given list:
unique_values = {}
for col in col_list[1:]:
    list_name = ''.join([str(col), '_list'])
    # Apply set to the column values (not to the name string) to drop duplicates
    unique_values[list_name] = set(footwear_data[col])
Sample usage:
>>> a_list = [7, 8, 7, 9, 10, 9]
>>> set(a_list)
{8, 9, 10, 7}
Here is how I did it; let me know if it helps:
import pandas as pd

df = pd.read_csv("/path/to/csv/file")
colList = list(df)  # column names
dic = {}
for x in colList:
    _list = []
    # Note: this wraps the unique values in an outer list, as the output shows
    _list.append(list(set(list(df[x]))))
    list_name = ''.join([str(x), '_list'])
    dic[list_name] = _list
print(dic)
Output:
{'Color_list': [['Blue', 'Orange', 'Black', 'Red']], 'Size_list': [['9', '8', '10 inches', 'XL', '7']], 'Brand_list': [['Clarks']], 'Sub_list': [['SO', 'FOR']], 'Category_list': [['M', 'W']]}
My CSV file:
Category,Sub,Size,Color,Brand
W,SO,7,Blue,Clarks
W,SO,7,Blue,Clarks
W,SO,7,Black,Clarks
W,SO,8,Orange,Clarks
W,FOR,8,Red,Clarks
M,FOR,9,Black,Clarks
M,FOR,10 inches,Blue,Clarks
M,FOR,XL,Blue,Clarks
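In the output of this answer each value is a list nested inside another list, because the unique values are appended to _list as a single element. If flat lists are wanted, as in the question's expected output, a shorter variant might look like the sketch below. It reads the same CSV rows from an in-memory string (an assumption made so the example is self-contained) and uses Series.unique, which keeps first-seen order:

```python
import io

import pandas as pd

# Same rows as the CSV file above, held in a string for the example
csv_text = """Category,Sub,Size,Color,Brand
W,SO,7,Blue,Clarks
W,SO,7,Blue,Clarks
W,SO,7,Black,Clarks
W,SO,8,Orange,Clarks
W,FOR,8,Red,Clarks
M,FOR,9,Black,Clarks
M,FOR,10 inches,Blue,Clarks
M,FOR,XL,Blue,Clarks
"""

df = pd.read_csv(io.StringIO(csv_text))

# One flat list of unique values per column, keyed as "<column>_list"
dic = {str(col) + "_list": df[col].unique().tolist() for col in df.columns}
print(dic["Color_list"])
print(dic["Size_list"])
```

Because the Size column mixes numbers with text such as "XL", pandas reads it as strings, so its unique values are strings too.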