How do I create a two-dimensional array in Python?

Question

I want to create a two-dimensional array in Python like this:

     n1 n2 n3 n4 n5

w1   1  4  0  1 10

w2   3  0  7  0  3

w3   0  12 9  5  4

w4   9  0  0  9  7

Here w1, w2, ... are different words, and n1, n2, n3, ... are different blogs.
How can I do this?

Answer 1

Assuming the text of each blog is available as a string, and you supply a list of such strings called blogs, this is how you can create the matrix.

import re
# Sample input for the following code.
blogs = ["This is a blog.","This is another blog.","Cats? Cats are awesome."]
# This is a list that will contain dictionaries counting the wordcounts for each blog
wordcount = []
# This is a list of all unique words in all blogs.
wordlist = []
# Consider each blog sequentially
for blog in blogs:
    # Remove all the non-alphanumeric, non-whitespace characters,
    # and then split the string at all whitespace after converting to lowercase.
    # eg: "That's not mine." -> "Thats not mine" -> ["thats","not","mine"]
    words = re.sub(r"[^\w\s]", "", blog).lower().split()
    # Add a new dictionary to the list. As it is at the end,
    # it can be referred to by wordcount[-1]
    wordcount.append({})
    # Consider each word in the list generated above.
    for word in words:
        # If that word has been encountered before, increment the count
        if word in wordcount[-1]: wordcount[-1][word]+=1
        # Else, create a new entry in the dictionary
        else: wordcount[-1][word]=1
        # If it is not already in the list of unique words, add it.
        if word not in wordlist: wordlist.append(word)

# We now have wordlist, which has a unique list of all words in all blogs.
# and wordcount, which contains len(blogs) dictionaries, containing word counts.
# Matrix is the table that you need of wordcounts. The number of rows will be
# equal to the number of unique words, and the number of columns = no. of blogs.
matrix = []
# Consider each word in the unique list of words (corresponding to each row)
for word in wordlist:
    # Add as many columns as there are blogs, all initialized to zero.
    matrix.append([0]*len(wordcount))
    # Consider each blog one by one
    for i in range(len(wordcount)):
        # Check if the currently selected word appears in that blog
        if word in wordcount[i]:
            # If yes, increment the counter for that blog/column
            matrix[-1][i]+=wordcount[i][word]

# For printing matrix, first generate the column headings
temp = "\t"
for i in range(len(blogs)):
    temp+="Blog "+str(i+1)+"\t"

print(temp)
# Then generate each row, with the word at the starting, and tabs between numbers.

for i in range(len(matrix)):
    temp = wordlist[i]+"\t"
    for j in matrix[i]: temp += str(j)+"\t"
    print(temp)

Now matrix[i][j] contains the number of times the word wordlist[i] appears in the blog blogs[j].
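The same table can also be built more compactly with collections.Counter from the standard library. This is a minimal sketch, not the answer's own code: the helper name build_matrix and the `\w+` tokenisation are assumptions made for illustration (unlike the punctuation-stripping approach above, `\w+` would split "That's" into "that" and "s").

```python
import re
from collections import Counter

def build_matrix(blogs):
    """Return (wordlist, matrix) where matrix[i][j] counts wordlist[i] in blogs[j]."""
    # One Counter per blog; the \w+ tokenisation is an assumption of this sketch.
    counts = [Counter(re.findall(r"\w+", blog.lower())) for blog in blogs]

    # Collect unique words in first-seen order (relies on Python 3.7+ dict ordering).
    wordlist = []
    seen = set()
    for blog_counts in counts:
        for word in blog_counts:
            if word not in seen:
                seen.add(word)
                wordlist.append(word)

    # A Counter returns 0 for missing keys, so no membership test is needed.
    matrix = [[blog_counts[word] for blog_counts in counts] for word in wordlist]
    return wordlist, matrix

blogs = ["This is a blog.", "This is another blog.", "Cats? Cats are awesome."]
wordlist, matrix = build_matrix(blogs)
```

As before, each row corresponds to a word and each column to a blog, so matrix[wordlist.index("cats")] gives the counts of "cats" across all blogs.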

Answer 2

If tuples in a list or dictionary don't work for you, consider using pandas:

from pandas import DataFrame
In [554]: print(DataFrame({'n1':[1,3,0,9], 'n2':[4,0,12,0], 'n3':[0,7,9,0], 'n4':[1,0,5,9], 'n5':[10,3,4,7]},index=['w1','w2','w3','w4']))
    n1  n2  n3  n4  n5
w1   1   4   0   1  10
w2   3   0   7   0   3
w3   0  12   9   5   4
w4   9   0   0   9   7
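Once the table is a DataFrame, cells, rows and columns can be looked up by label. A brief sketch, assuming pandas is installed; the variable names df, count, row and col are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"n1": [1, 3, 0, 9], "n2": [4, 0, 12, 0], "n3": [0, 7, 9, 0],
     "n4": [1, 0, 5, 9], "n5": [10, 3, 4, 7]},
    index=["w1", "w2", "w3", "w4"],
)

count = df.loc["w3", "n2"]     # single cell: row label, then column label
row = df.loc["w1"].tolist()    # counts of word w1 in every blog
col = df["n5"].tolist()        # counts of every word in blog n5
```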

Answer 3

I wouldn't create any lists or a 2-d array at all, but rather a dictionary keyed by tuples of the x and y headers. For example:

data["w1", "n1"] = 1

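This can be thought of as a "sparse matrix" representation. Depending on what operations you want to perform on the data, you may instead want a dict of dicts, where the keys of the outer dict are either the x headers or the y headers, and the keys of the inner dicts are the other.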
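A dict-of-dicts variant might look like the following sketch. The name by_word and the choice of words as the outer key are assumptions for illustration; swapping the two levels gives the per-blog view instead.

```python
from collections import defaultdict

# Outer key: word (row header); inner key: blog (column header).
by_word = defaultdict(dict)
by_word["w1"]["n1"] = 1
by_word["w1"]["n2"] = 4
by_word["w2"]["n1"] = 3

# Everything recorded for one word, without scanning every key.
w1_counts = by_word["w1"]

# Cells that were never stored are simply absent; .get() treats them as zero.
missing = by_word["w2"].get("n2", 0)
```

Compared with tuple keys, this makes it cheap to fetch a whole row, at the cost of making whole-column lookups more awkward.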

Assuming the tuple-keyed representation, with the data table as input:

text = """\
     n1 n2 n3 n4 n5

w1   1  4  0  1 10

w2   3  0  7  0  3

w3   0  12 9  5  4

w4   9  0  0  9  7
"""

data = {}
lines = text.splitlines()
xheaders = lines.pop(0).split()
for line in lines:
    if not line.strip():
        continue
    elems = line.split()
    yheader = elems[0]
    for (xheader, datum) in zip(xheaders, elems[1:]):
        data[xheader, yheader] = int(datum)
print(data)
print(sorted(data.items()))

This prints (the order of the first, unsorted dictionary line may differ between Python versions):

{('n3', 'w4'): 0, ('n4', 'w2'): 0, ('n2', 'w2'): 0, ('n1', 'w4'): 9, ('n3', 'w3'): 9, ('n2', 'w3'): 12, ('n3', 'w2'): 7, ('n2', 'w4'): 0, ('n5', 'w3'): 4, ('n2', 'w1'): 4, ('n4', 'w1'): 1, ('n5', 'w2'): 3, ('n5', 'w1'): 10, ('n4', 'w3'): 5, ('n4', 'w4'): 9, ('n1', 'w3'): 0, ('n1', 'w2'): 3, ('n5', 'w4'): 7, ('n1', 'w1'): 1, ('n3', 'w1'): 0}
[(('n1', 'w1'), 1), (('n1', 'w2'), 3), (('n1', 'w3'), 0), (('n1', 'w4'), 9), (('n2', 'w1'), 4), (('n2', 'w2'), 0), (('n2', 'w3'), 12), (('n2', 'w4'), 0), (('n3', 'w1'), 0), (('n3', 'w2'), 7), (('n3', 'w3'), 9), (('n3', 'w4'), 0), (('n4', 'w1'), 1), (('n4', 'w2'), 0), (('n4', 'w3'), 5), (('n4', 'w4'), 9), (('n5', 'w1'), 10), (('n5', 'w2'), 3), (('n5', 'w3'), 4), (('n5', 'w4'), 7)]
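If a dense table is needed later, the tuple-keyed dictionary can be expanded back into rows. This is a sketch under stated assumptions: it uses a small stand-in data dictionary with the same (column, row) key order as the code above, and it orders the headers with a plain lexical sort, which would misplace a header such as n10.

```python
# Tuple-keyed data in the same (column header, row header) key order as above.
data = {("n1", "w1"): 1, ("n2", "w1"): 4, ("n1", "w2"): 3, ("n2", "w2"): 0}

# Recover the distinct headers from the keys.
xheaders = sorted({x for x, _ in data})
yheaders = sorted({y for _, y in data})

# One list per row header; data.get(..., 0) fills cells that were never stored.
rows = [[data.get((x, y), 0) for x in xheaders] for y in yheaders]
```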

Answer 4

One way is to use numpy:

>>> from numpy import array
>>> array( [ (1,4,0,1,10), (3,0,7,0,3), (0,12,9,5,4), (9,0,0,9,7) ] )
array([[ 1,  4,  0,  1, 10],
       [ 3,  0,  7,  0,  3],
       [ 0, 12,  9,  5,  4],
       [ 9,  0,  0,  9,  7]])
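A numpy array carries no row or column labels, so cells are addressed by position. The sketch below, assuming numpy is installed, shows indexing and per-axis totals; the variable names are illustrative.

```python
import numpy as np

counts = np.array([(1, 4, 0, 1, 10), (3, 0, 7, 0, 3), (0, 12, 9, 5, 4), (9, 0, 0, 9, 7)])

shape = counts.shape             # (number of rows, number of columns)
cell = counts[2, 1]              # row index 2 (w3), column index 1 (n2)
per_blog = counts.sum(axis=0)    # column totals, one per blog
per_word = counts.sum(axis=1)    # row totals, one per word
```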

Answer 5

If you just want a two-dimensional array without doing any analysis, you can write it like this:

a = [
    [1, 4, 0, 1, 10],
    [3, 0, 7, 0, 3],
    [0, 12, 9, 5, 4],
    [9, 0, 0, 9, 7]
]
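With a plain list of lists, a cell is read as a[row][column]. The sketch below also shows one way to build an empty grid and a common pitfall when doing so; the names grid and aliased are illustrative.

```python
a = [
    [1, 4, 0, 1, 10],
    [3, 0, 7, 0, 3],
    [0, 12, 9, 5, 4],
    [9, 0, 0, 9, 7],
]

cell = a[2][1]                    # row index 2, column index 1
column = [row[1] for row in a]    # one column across all rows

# Empty 4x5 grid: the comprehension creates four separate row lists.
grid = [[0] * 5 for _ in range(4)]
grid[0][0] = 1

# Pitfall: this repeats ONE row object four times, so a write shows up in every row.
aliased = [[0] * 5] * 4
aliased[0][0] = 1
```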

5 answers

Answer 1
1 accepted 2012-10-20 14:28:00

Answer 2
0 2012-10-20 13:45:55

Answer 3
0 2012-10-20 14:05:08

Answer 4
0 2012-10-20 15:06:02

Answer 5
0 2012-10-21 07:16:33
