I am trying to convert a DataFrame to a dictionary (dictionaries are faster when filtering on a key). I am currently using:
from time import time

t3 = time()
r = {}
for i in df.index.unique():
    # one df.loc lookup per unique index label
    r[i] = []
    r[i].append(df.loc[i].values)
print(round(time() - t3, 1), "s")
This type of conversion is slow. Is there an alternative? I want the index of the DataFrame as the key and the rows as the values, with multiple values under a single key.
Use pandas.DataFrame.to_dict after transposing, so that the index becomes the keys and each row's values become the dictionary values:
import pandas as pd
df = pd.DataFrame({'col1': [1, 2], 'col2': ['a', 'b']})
r = df.T.to_dict('list')
print(r)
Output:
{0: [1, 'a'], 1: [2, 'b']}
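As a related sketch, to_dict can also be called with orient="index", which skips the transpose and keeps the column labels inside each row. The sample frame below is the same one as above; note that orient="index" expects a unique index, so this only applies when no label repeats.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': ['a', 'b']})

# Transposed frame: each index label maps to a plain list of row values.
as_lists = df.T.to_dict(orient='list')

# orient="index": each index label maps to a {column: value} dict.
# This orientation requires the index labels to be unique.
as_records = df.to_dict(orient='index')

print(as_lists)
print(as_records)
```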
I was able to convert my DataFrame, which has many duplicate index labels, to a dictionary using:
dicti = {}
for line in df.itertuples():
    # itertuples() exposes the row label as the Index field (capital I)
    if line.Index not in dicti:
        dicti[line.Index] = []
    # list(line) holds the index label followed by the row values
    dicti[line.Index].append(list(line))
This ran in about 5 seconds for 600k rows.
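For an index with repeated labels, the same grouping can also be sketched with groupby on the index level. This is a minimal sketch, not a benchmarked replacement: the sample data and the label "a" are made up for illustration, and whether it is faster than the loop above depends on the data.

```python
import pandas as pd

# Hypothetical sample data; the index label "a" appears twice.
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": ["x", "y", "z"]},
    index=["a", "a", "b"],
)

# Group rows by index label (level=0) and store each group's rows
# as a list of lists, so one key can hold several rows.
r = {key: group.values.tolist() for key, group in df.groupby(level=0)}

print(r)
```

Unlike list(line) from itertuples(), the row lists here do not include the index label itself, since it is already the dictionary key.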