Convert Python dataframe to dictionary with index

I am trying to convert a DataFrame to a dictionary (dictionaries are faster when filtering on a key). I am currently using:

from time import time

t3 = time()
r = {}
for i in df.index.unique():
    # df.loc[i] returns every row that shares the index label i
    r[i] = [df.loc[i].values]
print(round(time() - t3, 1), "s")

This type of conversion is slow. Is there an alternative? I want the index of the DataFrame as the key and the rows as the values, with multiple values under a single key.

Use pandas.DataFrame.to_dict after transposing to get the index labels as keys and each row's values as the value:

import pandas as pd

df = pd.DataFrame({'col1': [1, 2], 'col2': ['a', 'b']})
r = df.T.to_dict('list')
print(r)

Output:

{0: [1, 'a'], 1: [2, 'b']}
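Note that dictionary keys are unique, so the transposed to_dict call cannot hold more than one row per index label. For an index with duplicates, as in the question, one option is to group on the index first. The following is a minimal sketch under that assumption; the sample data and the index labels "x" and "y" are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data with a duplicated index label ("x")
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": ["a", "b", "c"]},
    index=["x", "x", "y"],
)

# Group rows by index label; each key maps to a list of row-value lists
r = {key: group.values.tolist() for key, group in df.groupby(level=0)}
print(r)
```

Each key then maps to a list with one inner list per row that carries that label.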

I was able to convert my DataFrame, which has many duplicate index values, to a dictionary using:

dicti = {}
for line in df.itertuples():
    # itertuples() names the index field "Index"; line.index is the tuple method
    key = line.Index
    if key not in dicti:
        dicti[key] = []
    dicti[key].append(list(line))

This ran in about 5 seconds for 600k rows.
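A variation on the same loop, offered as a sketch rather than a measured improvement: pairing the index with the row values via zip avoids building a namedtuple per row, and collections.defaultdict removes the membership check. Unlike list(line) above, each stored row here does not include the index label itself. The sample data is hypothetical and timings were not measured:

```python
from collections import defaultdict

import pandas as pd

# Hypothetical sample data with a duplicated index label ("x")
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": ["a", "b", "c"]},
    index=["x", "x", "y"],
)

rows_by_key = defaultdict(list)
# Convert all rows to plain Python lists once, then pair each with its label
for key, row in zip(df.index, df.to_numpy().tolist()):
    rows_by_key[key].append(row)

print(dict(rows_by_key))
```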


 