
Reading from Excel and converting the data to a dictionary in Python

I have some data in Excel that represents a graph, and it looks like this:

1  2  4.5
1  3  6.6
2  4  7.3
3  4  5.1

The first two elements in each row are the two nodes (vertices) joined by an edge, and the last element is the weight of that edge. For example, node "1" is connected to node "2" and the weight is 4.5.

I import this data into Python with the following code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

training_data_x = pd.read_excel("/Users/mac/Downloads/navid.xlsx",header=None)

# DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() is the replacement
x = training_data_x.to_numpy()

So "x" here holds the edge list of the graph, one row per edge. What I am trying to do is convert x to a dictionary of dictionaries in Python, which I need in another piece of code. I am fairly new to Python, but I think a suitable dictionary looks like this:

gr = {'1': {'2': 4.5, '3': 6.6},
      '2': {'4': 7.3},
      '3': {'4':5.1}}

In fact, "gr" should be the output of my code. I think I should use "pandas.DataFrame.to_dict", but I am having a hard time using it. I would really appreciate your help.

If you want to rely on pandas' groupby (split-apply-combine) functionality in addition to the pandas.DataFrame.to_dict method, you could do the following:

import pandas as pd

file_path = "/Users/mac/Downloads/navid.xlsx"
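# Parentheses replace the backslash continuations: a space after a
# trailing backslash is a syntax error.
gr = (
    pd.read_excel(file_path, header=None, index_col=0)
    .groupby(level=0)                                   # one group per source node
    .apply(lambda x: dict(x.to_records(index=False)))   # {target: weight} per group
    .to_dict()
)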

This should work for all pandas versions above 0.17.
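Since read_excel parses the cells as numbers, the keys produced by the pipeline above would most likely be numeric rather than the strings shown in the question. If string keys are wanted, a minimal sketch along these lines may help. It assumes the three-column layout from the question and uses an in-memory DataFrame as a stand-in for the Excel file; the names df, src, dst and weight are illustrative only.

```python
import pandas as pd

# Sample rows copied from the question; stands in for
# pd.read_excel(file_path, header=None).
df = pd.DataFrame([[1, 2, 4.5], [1, 3, 6.6], [2, 4, 7.3], [3, 4, 5.1]])

gr = {}
for src, dst, weight in df.itertuples(index=False):
    # Cast node ids to int first so that 1.0 would become '1', not '1.0'.
    gr.setdefault(str(int(src)), {})[str(int(dst))] = float(weight)

print(gr)
# {'1': {'2': 4.5, '3': 6.6}, '2': {'4': 7.3}, '3': {'4': 5.1}}
```

This trades the groupby one-liner for an explicit loop, which makes the key conversion easy to see and adjust.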

My advice: save your xlsx file as a csv. Now, using vanilla Python:

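import csv

gr = {}
# newline='' is what the csv module documentation recommends for files passed to csv.reader
with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        e1, e2, w = row  # source node, target node, weight (all strings)
        gr.setdefault(e1, {})[e2] = float(w)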

Perhaps even better, use a defaultdict:

import csv
from collections import defaultdict
gr = defaultdict(dict)
with open('data.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        e1, e2, w = row
        gr[e1][e2] = float(w)
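One thing to keep in mind with defaultdict: looking up a node that is not present silently creates an empty entry for it. A small sketch of how that could be avoided once the graph is built, using an in-memory string in place of data.csv (the sample variable is just a stand-in for the file):

```python
import csv
import io
from collections import defaultdict

# In-memory stand-in for data.csv, with the rows from the question.
sample = io.StringIO("1,2,4.5\n1,3,6.6\n2,4,7.3\n3,4,5.1\n")

gr = defaultdict(dict)
for e1, e2, w in csv.reader(sample):
    gr[e1][e2] = float(w)

# Convert to a plain dict so later lookups of unknown nodes
# no longer add empty entries as a side effect.
gr = dict(gr)

print(gr)
# {'1': {'2': 4.5, '3': 6.6}, '2': {'4': 7.3}, '3': {'4': 5.1}}
print(gr.get('4', {}))  # node '4' has no outgoing edges
# {}
```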

EDIT: Note that I converted the weights to float manually. You can probably get away with passing quoting=csv.QUOTE_NONNUMERIC to csv.reader, i.e. csv.reader(f, quoting=csv.QUOTE_NONNUMERIC), if you don't mind your keys being floats as well.
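To illustrate what that option does to the keys, here is a short sketch that reads the question's rows from an in-memory string instead of data.csv (the sample variable is an assumption made for the demo):

```python
import csv
import io

# In-memory stand-in for data.csv, with the rows from the question.
sample = io.StringIO("1,2,4.5\n1,3,6.6\n2,4,7.3\n3,4,5.1\n")

gr = {}
# QUOTE_NONNUMERIC makes the reader convert every unquoted field to float,
# so the node ids become floats too.
for e1, e2, w in csv.reader(sample, quoting=csv.QUOTE_NONNUMERIC):
    gr.setdefault(e1, {})[e2] = w

print(gr)
# {1.0: {2.0: 4.5, 3.0: 6.6}, 2.0: {4.0: 7.3}, 3.0: {4.0: 5.1}}
```

If string keys like '1' are required, the manual float(w) conversion shown earlier is the safer choice.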
