Indexing a matrix from a text file in Python and converting it to a nested dictionary

I am trying to read this matrix from a .txt file:

-1 2 3 -1 -1 -1 -1 -1
2 -1 -1 4 -1 -1 -1 -1
3 -1 -1 -1 4 5 -1 -1
-1 4 -1 -1 -1 -1 3 -1
-1 -1 4 -1 -1 -1 -1 5
-1 -1 5 -1 -1 -1 2 3
-1 -1 -1 3 -1 2 -1 -1
-1 -1 -1 -1 5 3 -1 -1

and I want to transform it into:

    A   B   C   D   E   F   G   H
A   -1  2   3   -1  -1  -1  -1  -1
B   2   -1  -1  4   -1  -1  -1  -1
C   3   -1  -1  -1  4   5   -1  -1
D   -1  4   -1  -1  -1  -1  3   -1
E   -1  -1  4   -1  -1  -1  -1  5
F   -1  -1  5   -1  -1  -1  2   3
G   -1  -1  -1  3   -1  2   -1  -1
H   -1  -1  -1  -1  5   3   -1  -1

I need to read the matrix as a dictionary so I can use it with my Dijkstra algorithm to find a path. I also want to convert each "-1" to "9999" (a very high value) so that the algorithm works properly.
I'm using the code below to read the file and convert it into a dictionary, but it doesn't add an index such as 'A', 'B', 'C', ... 'H' for each row and column as shown above. How can I do that? I'd like to know whether there is a way to do it in vanilla Python, and whether there is another way, perhaps using a module.


import collections

with open('matrix.txt', 'r') as f:
    columns = next(f).split()
    matrix = collections.defaultdict(dict)
    for line in f:
        items = line.split()
        row, vals = items[0], items[1:]
        for col, val in zip(columns, vals):
            matrix[col][row] = int(val)
print(matrix)

It is important to mention that I cannot assign the letters by hand. Instead, the program should be able to read any matrix (e.g. a 4x4 matrix or a 12x12 one) and assign an index to each row and column. For example, a 4x4 matrix should have only four letters on its columns and rows (A B C D), and so on.
I'm new to Python. I was reading about pandas, but I don't know whether I can make the solution generic enough to work even with a 25x25 matrix and assign a letter to each column and row.

For example, when I use the code above with the SECOND MATRIX it prints:

defaultdict(<class 'dict'>, {'A': {'A': -1, 'B': 2, 'C': 3, 'D': -1, 'E': -1, 'F': -1, 'G': -1, 'H': -1}, 'B': {'A': 2, 'B': -1, 'C': -1, 'D': 4, 'E': -1, 'F': -1, 'G': -1, 'H': -1}, 'C': {'A': 3, 'B': -1, 'C': -1, 'D': -1, 'E': 4, 'F': 5, 'G': -1, 'H': -1}, 'D': {'A': -1, 'B': 4, 'C': -1, 'D': -1, 'E': -1, 'F': -1, 'G': 3, 'H': -1}, 'E': {'A': -1, 'B': -1, 'C': 4, 'D': -1, 'E': -1, 'F': -1, 'G': -1, 'H': 5}, 'F': {'A': -1, 'B': -1, 'C': 5, 'D': -1, 'E': -1, 'F': -1, 'G': 2, 'H': 3}, 'G': {'A': -1, 'B': -1, 'C': -1, 'D': 3, 'E': -1, 'F': 2, 'G': -1, 'H': -1}, 'H': {'A': -1, 'B': -1, 'C': -1, 'D': -1, 'E': 5, 'F': 3, 'G': -1, 'H': -1}})


But if I use it with the FIRST MATRIX it prints:

defaultdict(<class 'dict'>, {'-1': {'2': -1, '3': -1, '-1': -1}, '2': {'2': -1, '3': -1, '-1': -1}, '3': {'2': 4, '3': -1, '-1': -1}})
So how can I automatically assign an index to each row and column?
Here is the full code and both .txt files, in case you are not sure what I'm trying to do:
dijkstra.py
matrixA
matrixB

You don't: those labels are not part of the matrix. Instead, simply use the upper-case alphabet string whenever needed. For instance:

letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Read in your NxN matrix

print("  " + "  ".join(letters[:N]))

for row in range(N):
    print(letters[row] + "  ", end = "")
    # Now print your row of values, properly formatted.
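
A runnable version of this idea might look like the sketch below. The helper name `format_labeled` and the fixed column width are assumptions made for illustration, and the matrix is passed in as a list of lists rather than read from a file.

```python
import string


def format_labeled(rows, width=4):
    """Return the matrix as text with letter labels on rows and columns.

    `rows` is assumed to be a square list of lists with at most 26 rows;
    `width` is an assumed fixed column width.
    """
    n = len(rows)
    letters = string.ascii_uppercase[:n]
    # Header line: a blank corner cell, then one letter per column.
    lines = [" " * width + "".join(f"{c:<{width}}" for c in letters)]
    for label, row in zip(letters, rows):
        cells = "".join(f"{value:<{width}}" for value in row)
        lines.append(f"{label:<{width}}" + cells)
    # Drop trailing padding so each line ends at its last value.
    return "\n".join(line.rstrip() for line in lines)


demo = [[-1, 2], [2, -1]]
print(format_labeled(demo))
```

The labels exist only in the printed text here; the underlying data stays a plain list of lists.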

So, I think this could work:

import string

with open('matrix-a.txt', 'r') as f:
    lines = f.readlines()

letters = string.ascii_uppercase

lines = [line.split() for line in lines]
matrix = {}
for row in range(len(lines)):
    inner = {}
    for letter, line in zip(letters, lines[row]):
        line = int(line)
        if line == -1:
            line = 9999
        inner[letter] = line
    matrix[letters[row]] = inner

If I print out matrix, I get:

{'A': {'A': 9999, 'B': 2, 'C': 3, 'D': 9999, 'E': 9999, 'F': 9999, 'G': 9999, 'H': 9999}, 'B': {'A': 2, 'B': 9999, 'C': 9999, 'D': 4, 'E': 9999, 'F': 9999, 'G': 9999, 'H': 9999}, 'C': {'A': 3, 'B': 9999, 'C': 9999, 'D': 9999, 'E': 4, 'F': 5, 'G': 9999, 'H': 9999}, 'D': {'A': 9999, 'B': 4, 'C': 9999, 'D': 9999, 'E': 9999, 'F': 9999, 'G': 3, 'H': 9999}, 'E': {'A': 9999, 'B': 9999, 'C': 4, 'D': 9999, 'E': 9999, 'F': 9999, 'G': 9999, 'H': 5}, 'F': {'A': 9999, 'B': 9999, 'C': 5, 'D': 9999, 'E': 9999, 'F': 9999, 'G': 2, 'H': 3}, 'G': {'A': 9999, 'B': 9999, 'C': 9999, 'D': 3, 'E': 9999, 'F': 2, 'G': 9999, 'H': 9999}, 'H': {'A': 9999, 'B': 9999, 'C': 9999, 'D': 9999, 'E': 5, 'F': 3, 'G': 9999, 'H': 9999}}

Which seems to be the dictionary behavior you are looking for, correct?
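
To show how a nested dictionary of this shape can be consumed, here is a minimal Dijkstra sketch. It is not the asker's dijkstra.py (which is not shown here); the function name `shortest_distances` and the choice to skip any weight at or above the 9999 sentinel are assumptions for illustration.

```python
import heapq

NO_EDGE = 9999  # sentinel used above for "no connection"


def shortest_distances(graph, start):
    """Dijkstra over a nested dict where graph[u][v] is the edge weight."""
    dist = {node: float("inf") for node in graph}
    dist[start] = 0
    queue = [(0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph[u].items():
            if w >= NO_EDGE:
                continue  # treat the sentinel as a missing edge
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(queue, (d + w, v))
    return dist


# Small hand-written graph in the same shape as the dictionary above.
graph = {
    "A": {"A": 9999, "B": 2, "C": 3},
    "B": {"A": 2, "B": 9999, "C": 9999},
    "C": {"A": 3, "B": 9999, "C": 9999},
}
print(shortest_distances(graph, "B"))
```

Skipping the sentinel explicitly, rather than relying on 9999 being "large enough", keeps unreachable nodes at infinity instead of reporting a misleading finite distance.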

If you want to play code golf (note that this one-liner leaves the -1 values as they are):

matrix = {letters[row]: {letter: int(line) for letter, line in zip(letters, lines[row])} for row in range(len(lines))}
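
One caveat: `string.ascii_uppercase` has only 26 letters, so with a larger matrix `zip` would silently drop the extra columns and `letters[row]` would raise an `IndexError`. If bigger matrices are ever needed, one option is spreadsheet-style labels (A..Z, AA, AB, ...). The sketch below is an assumption about how that could be done; `make_labels` and `to_nested_dict` are hypothetical helper names, not part of the answer above.

```python
import string


def make_labels(n):
    """Return n spreadsheet-style labels: A..Z, AA, AB, ..."""
    labels = []
    for i in range(n):
        label = ""
        k = i
        while True:
            k, rem = divmod(k, 26)
            label = string.ascii_uppercase[rem] + label
            if k == 0:
                break
            k -= 1  # bijective base 26: there is no "zero" digit
        labels.append(label)
    return labels


def to_nested_dict(rows, missing=-1, replacement=9999):
    """Build {row_label: {col_label: value}} from a square list of lists."""
    labels = make_labels(len(rows))  # assumes as many columns as rows
    return {
        r: {c: (replacement if v == missing else v) for c, v in zip(labels, row)}
        for r, row in zip(labels, rows)
    }


print(make_labels(28)[-3:])
print(to_nested_dict([[-1, 2], [2, -1]]))
```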

Use pandas and string:

import pandas as pd
import string

# read file
df = pd.read_csv('matrix.txt', sep='\\s+', header=None)

# optional: convert -1 to 9999
# (the output shown below was produced without this line)
df.replace(-1, 9999, inplace=True)

# letters
letters = string.ascii_uppercase

# create column names and index
df.columns = list(letters[:len(df.columns)])
df.index = list(letters[:len(df.index)])

# convert to dict
my_dict = df.to_dict()

print(my_dict)

{'A': {'A': -1, 'B': 2, 'C': 3, 'D': -1, 'E': -1, 'F': -1, 'G': -1, 'H': -1},
 'B': {'A': 2, 'B': -1, 'C': -1, 'D': 4, 'E': -1, 'F': -1, 'G': -1, 'H': -1},
 'C': {'A': 3, 'B': -1, 'C': -1, 'D': -1, 'E': 4, 'F': 5, 'G': -1, 'H': -1},
 'D': {'A': -1, 'B': 4, 'C': -1, 'D': -1, 'E': -1, 'F': -1, 'G': 3, 'H': -1},
 'E': {'A': -1, 'B': -1, 'C': 4, 'D': -1, 'E': -1, 'F': -1, 'G': -1, 'H': 5},
 'F': {'A': -1, 'B': -1, 'C': 5, 'D': -1, 'E': -1, 'F': -1, 'G': 2, 'H': 3},
 'G': {'A': -1, 'B': -1, 'C': -1, 'D': 3, 'E': -1, 'F': 2, 'G': -1, 'H': -1},
 'H': {'A': -1, 'B': -1, 'C': -1, 'D': -1, 'E': 5, 'F': 3, 'G': -1, 'H': -1}}

Solution

Try this. You can get a nested dictionary (JSON-like) from the dataframe (df) using df.to_dict().

import pandas as pd
import string
filename = 'matrix.txt'
df = (pd
      .read_csv(filename, sep=r'\s+', header=None)
      .replace(-1, 9999, inplace=False)
)
headers = string.ascii_uppercase
df.columns = list(headers[:len(df.columns)])
df.index = list(headers[:len(df.index)])
print(df)
nested_dict = df.to_dict()

Example

import pandas as pd
import string
from io import StringIO

s = """
-1 2 3 -1 -1 -1 -1 -1
2 -1 -1 4 -1 -1 -1 -1
3 -1 -1 -1 4 5 -1 -1
-1 4 -1 -1 -1 -1 3 -1
-1 -1 4 -1 -1 -1 -1 5
-1 -1 5 -1 -1 -1 2 3
-1 -1 -1 3 -1 2 -1 -1
-1 -1 -1 -1 5 3 -1 -1
"""
df = (pd
      .read_csv(StringIO(s), sep=r'\s+', header=None)
      .replace(-1, 9999, inplace=False)
)
headers = string.ascii_uppercase
df.columns = list(headers[:len(df.columns)])
df.index = list(headers[:len(df.index)])
print(df)

Output:

      A     B     C     D     E     F     G     H
A  9999     2     3  9999  9999  9999  9999  9999
B     2  9999  9999     4  9999  9999  9999  9999
C     3  9999  9999  9999     4     5  9999  9999
D  9999     4  9999  9999  9999  9999     3  9999
E  9999  9999     4  9999  9999  9999  9999     5
F  9999  9999     5  9999  9999  9999     2     3
G  9999  9999  9999     3  9999     2  9999  9999
H  9999  9999  9999  9999     5     3  9999  9999
