Adjacency Matrix of Non-numeric tuples

I have a massive dictionary of items in a co-occurrence format: basically, conditional word vectors. A simplified version of the dictionary looks something like this:

reservoir = {
    ('a', 'b'): 2,
    ('a', 'c'): 3,
    ('b', 'a'): 1,
    ('b', 'c'): 3,
    ('c', 'a'): 1,
    ('c', 'b'): 2,
    ('c', 'd'): 5,
}

To save storage, I don't store a pair at all if it never co-occurs. For example, a and b never occur with d, so the dictionary has no entry for ('a', 'd') or ('b', 'd').

The result I'm trying to get is a matrix in which, for every tuple, the first key selects the row, the second key selects the column, and the value fills the cell. Missing pairs should show as 0, so it would look like this:

  a b c d
a 0 2 3 0
b 1 0 3 0
c 1 2 0 5
d 0 0 0 0

I found some information in this post: Adjacency List and Adjacency Matrix in Python, but it's not quite what I'm looking to do. None of my attempts so far have worked. Any help would be amazing.

Thanks again,

You really just need to get the labels for the rows and columns. From there, it's just a few for loops:

from __future__ import print_function

import itertools

reservoir = {
    ('a', 'b'): 2,
    ('a', 'c'): 3,
    ('b', 'a'): 1,
    ('b', 'c'): 3,
    ('c', 'a'): 1,
    ('c', 'b'): 2,
    ('c', 'd'): 5
}

# Collect every label that appears in any key, as either row or column
fields = sorted(set(itertools.chain.from_iterable(reservoir)))

print(' ', *fields)

for row in fields:
    print(row, end=' ')

    for column in fields:
        print(reservoir.get((row, column), 0), end=' ')

    print()

Your table will start getting ugly when the cells get more than one digit, so I'll leave that to you to figure out. You'll just need to find the maximal length of the field for each column before printing them.
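
As one possible way to handle the width problem, here is a minimal sketch that first builds the matrix as a list of rows and then right-aligns each column to its widest entry. The helper names build_matrix and format_table are made up for this example and are not from any library; it also assumes the dictionary has at least one key.

```python
import itertools


def build_matrix(pairs):
    """Return (labels, rows); rows[i][j] is the value for (labels[i], labels[j])."""
    # Every label that appears in any key, as either row or column
    labels = sorted(set(itertools.chain.from_iterable(pairs)))
    # Pairs that were never stored default to 0
    rows = [[pairs.get((r, c), 0) for c in labels] for r in labels]
    return labels, rows


def format_table(labels, rows):
    """Return the matrix as text, with each column padded to its widest cell."""
    label_width = max(len(str(label)) for label in labels)
    col_widths = [
        max(len(str(labels[j])), max(len(str(row[j])) for row in rows))
        for j in range(len(labels))
    ]
    # Header line: blank corner cell, then the column labels
    header = [''.ljust(label_width)]
    header += [str(label).rjust(w) for label, w in zip(labels, col_widths)]
    lines = [' '.join(header)]
    # One line per row label
    for label, row in zip(labels, rows):
        cells = [str(label).ljust(label_width)]
        cells += [str(value).rjust(w) for value, w in zip(row, col_widths)]
        lines.append(' '.join(cells))
    return '\n'.join(lines)


reservoir = {
    ('a', 'b'): 2,
    ('a', 'c'): 3,
    ('b', 'a'): 1,
    ('b', 'c'): 3,
    ('c', 'a'): 1,
    ('c', 'b'): 2,
    ('c', 'd'): 5,
}

labels, rows = build_matrix(reservoir)
print(format_table(labels, rows))
```

Keeping the matrix as a plain list of rows also means you can hand it to something else (NumPy, a CSV writer) instead of printing it.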
