Sorry for the confusing title. I have a Pandas DataFrame and I want to convert two of its columns into a dictionary, with one column as the keys and the other as the values. However, many rows in the first column share the same value, so when I use to_dict() only one row per key is kept and I lose the rest of the data. Is there a way to work around this?
I have tried solving this recursively but I haven't been able to figure it out.
EDIT: added code
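import pandas as pd

data = pd.read_csv('file')
datalist = []
data2list = []
# Collect the values of column1 (intended as the dictionary keys)
for i in range(len(data.index)):
    datalist.append(data.loc[i, 'column1'])
# Collect the values of column2 (intended as the dictionary values)
for i in range(len(data.index)):
    data2list.append(data.loc[i, 'column2'])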
Now datalist has all the values from column1, which I want to be the keys, and data2list has all the values from column2, which I want to be the values in the dictionary.
The problem, however, is that the DataFrame looks something like this:
  column1 column2
0    key1  value1
1    key1  value2
2    key2  value3
3    key2  value4
I want the dictionary to look like this:
dict = {"key1": [value1, value2], "key2": [value3, value4]}
Python dictionaries do not support repeated keys. You could solve this by adjusting the values in your first column so that the keys are not repeated. Alternatively, you could build a dictionary that maps each unique key in the first column to a list of all its values. Since your data is in a Pandas DataFrame, you could do:
import pandas as pd
# Your data
data = pd.DataFrame({'column1': ['key1', 'key1', 'key2', 'key2'],
                     'column2': ['value1', 'value2', 'value3', 'value4']})
# Grouped dict
data_dict = data.groupby('column1').column2.apply(list).to_dict()
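If you would rather not use groupby, the same dictionary of lists can be built in plain Python. The following is only a minimal sketch, assuming the four-row sample from the question; the names `overwritten` and `grouped` are made up for illustration. It also shows why a direct key-to-value conversion loses rows: each repeated key overwrites the previous value.

```python
from collections import defaultdict

import pandas as pd

# Sample data mirroring the table in the question
data = pd.DataFrame({
    "column1": ["key1", "key1", "key2", "key2"],
    "column2": ["value1", "value2", "value3", "value4"],
})

# A plain dict keeps only the last value seen for each repeated key.
overwritten = dict(zip(data["column1"], data["column2"]))

# Append every value to a list under its key instead of overwriting.
grouped = defaultdict(list)
for key, value in zip(data["column1"], data["column2"]):
    grouped[key].append(value)
grouped = dict(grouped)  # convert back to a regular dict

print(overwritten)  # {'key1': 'value2', 'key2': 'value4'}
print(grouped)      # {'key1': ['value1', 'value2'], 'key2': ['value3', 'value4']}
```

For this sample, `grouped` should be equal to the result of the groupby one-liner, so which one to use is mostly a matter of taste.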