Python: How to read and store data with a varying number of columns into lists?

Question

My data looks like this:

...
 5 4 3 16 22 247 0         1.168         0.911         0.944         3.205         0.000         0.562
 6 4 4 17 154 93 309 0         0.930         0.919         0.903         0.917         3.852         0.000         1.419
 7 3 2 233 311 0         0.936         0.932         1.874         2.000        -0.807
...

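The data consists of integers and floats, but I want to collect only the integers and work with their values. However, the total number of columns varies from row to row. Fortunately, the third column gives the number of integer columns that follow it. For example, the first row has "3" in the third column, and 3 integers follow it. The next row has "4" in the third column, so 4 integers follow the third column in that row. The last row has "2", so it has 2 subsequent integers.

Previously, I wrote code that creates empty lists and then fills them with the data, for example: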

   at_index = [None]*nline
   at_type = [None]*nline
   num_of_bonds = [None]*nline
   neighbor_id1 = [None]*nline
   neighbor_id2 = [None]*nline
   neighbor_id3 = [None]*nline
   neighbor_id4 = [None]*nline
   neighbor_id5 = [None]*nline
   for i1 in range(nline):
      ### Split each line based on spaces
          line = data_lines[i1].split()
          at_index[i1] = int(line[0])
          at_type[i1] = int(line[1])
          num_of_bonds[i1] = int(line[2])
          if num_of_bonds[i1] == 2:
             neighbor_id1[i1] = int(line[3])
             neighbor_id2[i1] = int(line[4])
          if num_of_bonds[i1] == 3:
             neighbor_id1[i1] = int(line[3])
             neighbor_id2[i1] = int(line[4])
             neighbor_id3[i1] = int(line[5])
          if num_of_bonds[i1] == 4:
             neighbor_id1[i1] = int(line[3])
             neighbor_id2[i1] = int(line[4])
             neighbor_id3[i1] = int(line[5])
             neighbor_id4[i1] = int(line[6])

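But this attempt failed, because the last if condition ("num_of_bonds[i1] == 4") overwrote all the data in 'neighbor_id1' through 'neighbor_id4'. It seems I need distinct list names, such as "neighbor1_id1" and "neighbor4_id3", but that would require me to create all of those empty arrays before doing anything else.

How can I read and store data with a "dynamic number of columns" in a clean, tidy way, while still being able to use the elements of each column? Thanks.

Best,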

Answer 1

This produces the output you need:

>>> lines = ['5 4 3 16 22 247 0 1.168 0.911 0.944 3.205 0.000 0.562',
     '6 4 4 17 154 93 309 0 0.930 0.919 0.903 0.917 3.852 0.000 1.419',
     '7 3 2 233 311 0 0.936 0.932 1.874 2.000 -0.807']
>>> def getInt(lines):
    result = []
    for line in lines:
        items = line.split()
        for i in range(1,int(items[2])+1):
            result.append(items[2+i])
    return result

>>> res = getInt(lines)
>>> res
['16', '22', '247', '17', '154', '93', '309', '233', '311']
>>>

To get the values grouped by row, modify the code as follows:

>>> def getInt(lines):
    result = []
    for line in lines:
        row = []
        items = line.split()
        for i in range(1,int(items[2])+1):
            row.append(items[2+i])
        result.append(row)
    return result

>>> res = getInt(lines)
>>> res
[['16', '22', '247'], ['17', '154', '93', '309'], ['233', '311']]
>>> res[0]
['16', '22', '247']

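Based on your requirements:
1. We need to iterate over each row and over each item in its columns; here this is done manually, without using the enumerate function.
2. Keep track of the row and column positions and compare the values.
3. In the previous script I forgot to cast the values to int; see the comment in the code below.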

>>> lines = ['5 4 3 16 22 247 0 1.168 0.911 0.944 3.205 0.000 0.562',
     '6 4 4 17 154 233 309 0 0.930 0.919 0.903 0.917 3.852 0.000 1.419',
     '7 3 2 233 311 0 0.936 0.932 1.874 2.000 -0.807']
>>> def getInt(lines):
    result = []
    for line in lines:
        row = []
        items = line.split()
        for i in range(1,int(items[2])+1):
            row.append(int(items[2+i])) # old line row.append(items[2+i])
        result.append(row)
    return result

>>> def getPos(result, item):
    row_pos = 0
    for i in result:
        row_pos +=1
        for j in range(len(i)):
            if i[j]==item:
                print("Item %s found in position : (%s,%s)" % (item, row_pos,j))

>>> res = getInt(lines)
>>> getPos(res, 22)
Item 22 found in position : (1,1)
>>> getPos(res, 233)
Item 233 found in position : (2,2)
Item 233 found in position : (3,0)
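
If the positions are needed as values rather than printed text, a variant along these lines could return them instead. This is only a sketch: `find_positions` is a made-up name, and unlike `getPos` above it uses enumerate and zero-based indices for both row and column.

```python
def find_positions(rows, item):
    """Return zero-based (row, column) pairs where item occurs in rows."""
    return [(r, c)
            for r, row in enumerate(rows)
            for c, value in enumerate(row)
            if value == item]


# Same nested list that getInt(lines) produces for the sample above
rows = [[16, 22, 247], [17, 154, 233, 309], [233, 311]]
print(find_positions(rows, 22))
print(find_positions(rows, 233))
```

An item that does not occur simply yields an empty list, so the caller can test the result directly.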

Hope this helps.

Answer 2

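Just split, slice and dice

All you really need to know is:

how to use str.split to split a string into whitespace-separated fields
how slicing and indexing work

Implementation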

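# st is assumed to hold the whole input as one multi-line string
for line in st.splitlines():
    line = line.split()
    line = line[:3+int(line[2])]  # the 3 leading fields plus the counted integers
    print(line)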

Extend this idea to your problem.
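
One way that extension might look is sketched below. It assumes the raw text is held in a string (called `data` here, a hypothetical name) and that each row should keep its first two integer fields plus a variable-length list of the counted integers. The function name `parse_bonds` and the dictionary keys are made up for illustration; the keys loosely follow the list names used in the question.

```python
def parse_bonds(text):
    """Parse lines of the form: index type n id_1 ... id_n <other columns>."""
    records = []
    for raw in text.splitlines():
        fields = raw.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        n = int(fields[2])  # third column: how many integers follow
        records.append({
            "index": int(fields[0]),
            "type": int(fields[1]),
            "neighbors": [int(x) for x in fields[3:3 + n]],
        })
    return records


data = """\
 5 4 3 16 22 247 0 1.168 0.911 0.944 3.205 0.000 0.562
 6 4 4 17 154 93 309 0 0.930 0.919 0.903 0.917 3.852 0.000 1.419
 7 3 2 233 311 0 0.936 0.932 1.874 2.000 -0.807
"""

records = parse_bonds(data)
for rec in records:
    print(rec["index"], rec["type"], rec["neighbors"])
```

Storing the counted integers in one list per row avoids separate neighbor_id1 ... neighbor_id5 lists: the k-th neighbor of a row is `rec["neighbors"][k]`, and `len(rec["neighbors"])` gives the count.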
