
Simplify Python code for reading data from files and storing it in a numpy array

I have an inp file, data.inp, that I need to read from Python:

*Heading
** Job name: inp6_1 Model name: Model-1
*Node
      1,          50.,          20.,          40.
      2,         100.,          20.,          40.
      3,         100.,          20.,           0.
      4,          50.,          20.,           0.
      5,         100.,           0.,          40.
      6,         100.,           0.,           0.
      7,          50.,           0.,           0.
      8,          50.,           0.,          40.
      9,           0.,          40.,          40.
*Element, type=C3D8
  1,  82, 336, 712, 294,   1,  15, 168,  46
  2, 336, 337, 713, 712,  15,  16, 169, 168
  3, 337, 338, 714, 713,  16,  17, 170, 169
*Elset, elset=Set-1, instance=Part-1-1, generate
 321,  951,   10
*End Assembly

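The goal is to store all the numbers between "*Node" and "*Element" in the inp file. Below is my current code. It works, but it is rather verbose because I use with open(filepath, 'r') as file: twice: the first time to get the line numbers, the second time to read those lines and store them in a numpy array. I tried putting both for loops under a single with, but that does not work; in that case I only get one line of numbers from the inp file.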

My working code:

from io import StringIO

import numpy as np


def findNodes(filepath):
    with open(filepath, 'r') as file:
        for num, line in enumerate(file,1):
            if '*Node' in line:
                nodeLineStart = num
            if '*Element' in line:
                nodeLineEnd = num

    # Start with zero rows; shape [1, 4] would leave an uninitialized first row.
    xx = np.empty(shape=[0, 4])
    with open(filepath, 'r') as file:
        for num, lin in enumerate(file, 1):
            if nodeLineStart+1 <= num <= nodeLineEnd-1:
                text = lin.replace(" ", "")
                lines = np.genfromtxt(StringIO(text), delimiter=",").reshape(1, 4)
                xx = np.append(xx, lines, axis=0)
    return xx

ndarray = findNodes('data.inp')

Instead of reading line by line, you can read the whole file as a single string and split it:

# Read as single string
with open(filepath, 'r') as file:
    contents = file.read()

# find *Node and *Element and get substring in between
first = "*Node"
second = "*Element"
numbers = contents[contents.find(first)+len(first):contents.find(second)]

# Remove commas, split string at whitespace characters and convert numbers to floats
numbers = [float(x) for x in numbers.replace(',', '').split()]

Or, keeping your basic structure and using a numpy array as the return type:

def findNodes(filepath, first="*Node", second="*Element"):
    with open(filepath, 'r') as file:
        contents = file.read()
    numbers = contents[contents.find(first)+len(first):contents.find(second)]
    return np.array([float(x) for x in numbers.replace(',', '').split()])

findNodes("data.inp")
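Note that the function above returns a flat one-dimensional array, and that str.find returns -1 when a keyword is missing, which silently produces a wrong slice. The following is a minimal sketch, assuming each node line holds exactly four values (node number plus x, y, z) as in the sample file. It reshapes the result to one row per node and raises an error if either keyword is absent. The name find_nodes_2d and its ncols parameter are hypothetical, chosen only for illustration.

```python
import numpy as np


def find_nodes_2d(contents, first="*Node", second="*Element", ncols=4):
    """Return the numbers between `first` and `second` as an (n, ncols) array.

    `contents` is the whole file as one string (e.g. from file.read()).
    """
    start = contents.find(first)
    if start == -1:
        raise ValueError(f"keyword {first!r} not found")
    start += len(first)
    # Search for the closing keyword only after the opening one.
    stop = contents.find(second, start)
    if stop == -1:
        raise ValueError(f"keyword {second!r} not found after {first!r}")
    numbers = [float(x) for x in contents[start:stop].replace(",", " ").split()]
    return np.array(numbers).reshape(-1, ncols)


sample = (
    "*Heading\n*Node\n 1, 50., 20., 40.\n 2, 100., 20., 40.\n"
    "*Element, type=C3D8\n 1, 82, 336\n"
)
nodes = find_nodes_2d(sample)
print(nodes.shape)
```

With a real file, the string would come from file.read() exactly as in the answer above.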

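BernieD has already given a good answer, but if you do want to read the file line by line, you can use a flag variable to track whether the current line lies between the start and stop keywords: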

def findNodes(filepath):
    datalist = []
    datastream = False
    
    with open(filepath, 'r') as file:
        for line in file:
            if '*Element' in line:
                datastream = False
            if datastream:
                datalist.append([float(n) for n in 
                                 line.replace(' ', '').replace('\n', '').split(',')])
            if '*Node' in line:
                datastream = True

    return np.array(datalist)


ndarray = findNodes('data.inp')

This is from a script I wrote for parsing inp files:

def read_input(ifn):
    """Read an Abaqus INP file, read its sections.
    Return the section headings and the lines.
    """
    with open(ifn) as inf:
        lines = [ln.strip() for ln in inf.readlines()]
    # Remove comments
    lines = [ln for ln in lines if not ln.startswith("**")]
    # Find section headers
    headings = [(ln[1:], n) for n, ln in enumerate(lines) if ln.startswith("*")]
    # Filter the headings so that every heading has a start-of-data and
    # end-of-data index.
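    # Sentinel marking the end of the last section; len(lines) (rather than -1)
    # keeps the final line of the last section inside the slice.
    headings.append(("end", len(lines)))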
    ln = [h[1] for h in headings]
    headings = [
        (name, start + 1, end) for (name, start), end in zip(headings[:-1], ln[1:])
    ]
    return headings, lines


def retrieve_nodes(headings, lines):
    """Extract the nodes out of lines.
    Return a dict of nodes, indexed by the node number.
    A node is a 3-tuple of coordinate strings.
    The node coordinates are *not* converted to floats, so as to not lose precision.

    Arguments:
        headings (list): list of (name, start, end) tuples.
        lines (list): list of lines.

    Returns:
        A dict of nodes (x,y,z)-tuples indexed by the node number.
    """
    nodes = {}
    for h in headings:
        if h[0].lower().startswith("node"):
            for ln in lines[h[1]:h[2]]:
                idx, x, y, z = ln.split(",")
                nodes[int(idx)] = (x.strip(), y.strip(), z.strip())
            # Assuming there is only one NODE section.
            break
    return nodes
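
If a numpy array is needed after all, the dict can be converted at the very end. This is a minimal sketch, assuming the dict maps node numbers to 3-tuples of coordinate strings as described in the docstring above. nodes_to_array is a hypothetical helper and not part of the original script.

```python
import numpy as np


def nodes_to_array(nodes):
    """Convert {node_number: (x, y, z) strings} into an (n, 4) float array.

    Rows are sorted by node number; column 0 holds the node number.
    """
    rows = [
        [float(idx)] + [float(c) for c in coords]
        for idx, coords in sorted(nodes.items())
    ]
    return np.array(rows).reshape(-1, 4)


# Example input in the shape retrieve_nodes() is documented to return.
example = {2: ("100.", "20.", "40."), 1: ("50.", "20.", "40.")}
print(nodes_to_array(example))
```

Converting to float only at this last step keeps the original coordinate strings available for as long as exact text matters.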

Try this (it works for any number of lines, because it loads the array with numpy.loadtxt()):

import re
from io import StringIO

import numpy as np


def findNodes(filepath):
    with open(filepath, 'r') as fr:
        alldata = fr.read()
    # Lookbehind/lookahead isolate the text between "*Node" and "*Element"
    ioData = StringIO(re.findall(r"(?<=\*Node\n)[\d.,\s\n]+(?=\n\*Element)", alldata)[0])
    return np.loadtxt(ioData, delimiter=',')


print(findNodes('data.inp'))

Below is the printed output of the returned numpy array:

[[  1.  50.  20.  40.]
 [  2. 100.  20.  40.]
 [  3. 100.  20.   0.]
 [  4.  50.  20.   0.]
 [  5. 100.   0.  40.]
 [  6. 100.   0.   0.]
 [  7.  50.   0.   0.]
 [  8.  50.   0.  40.]
 [  9.   0.  40.  40.]]

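P.S. Besides numpy, you also need to import the Python re module and StringIO from io for numpy.loadtxt(). Both re and io are part of Python's standard library. The regular expression uses a positive lookbehind and a positive lookahead to find the appropriate part of the text.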
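
To make the lookaround behaviour concrete, here is a small sketch that applies the same kind of pattern to an in-memory string instead of a file. The lookbehind (?<=\*Node\n) and the lookahead (?=\n\*Element) must match but are not part of the result, so only the block of numbers is returned. Passing ndmin=2 to loadtxt is an addition that is not in the answer above; it keeps the result two-dimensional even when the block contains a single line.

```python
import re
from io import StringIO

import numpy as np

text = (
    "*Heading\n*Node\n      1,  50.,  20.,  40.\n"
    "*Element, type=C3D8\n  1, 82, 336\n"
)

# The lookbehind and lookahead anchor the match without being captured.
match = re.search(r"(?<=\*Node\n)[\d.,\s]+(?=\n\*Element)", text)
block = match.group(0)

# ndmin=2 keeps a single data line as shape (1, 4) instead of (4,).
data = np.loadtxt(StringIO(block), delimiter=",", ndmin=2)
print(data.shape)
```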

There are several ways to read the file and find the desired block of lines. One that is very close to what you are already doing:

Read the file once to get the limits:

In [125]: with open('test.csv', 'r') as file:
     ...:         for num, line in enumerate(file,1):
     ...:             if '*Node' in line:
     ...:                 nodeLineStart = num
     ...:             if '*Element' in line:
     ...:                 nodeLineEnd = num
     ...:                 

In [126]: nodeLineStart, nodeLineEnd
Out[126]: (3, 13)

Then make a single genfromtxt call using the line-limit parameters:

In [128]: data = np.genfromtxt('test.csv', delimiter=',', skip_header=nodeLineStart, max_rows=nodeLineEnd-nodeLineStart-1)

In [129]: data
Out[129]: 
array([[  1.,  50.,  20.,  40.],
       [  2., 100.,  20.,  40.],
       [  3., 100.,  20.,   0.],
       [  4.,  50.,  20.,   0.],
       [  5., 100.,   0.,  40.],
       [  6., 100.,   0.,   0.],
       [  7.,  50.,   0.,   0.],
       [  8.,  50.,   0.,  40.],
       [  9.,   0.,  40.,  40.]])

Or use loadtxt:

In [133]: data = np.loadtxt('test.csv', delimiter=',', skiprows=nodeLineStart, max_rows=nodeLineEnd-nodeLineStart-1)

Or use a single readlines call and pass only the required block of lines to the reader:

In [138]: with open('test.csv', 'r') as file: 
     ...:     lines=file.readlines()
     ...:     for num, line in enumerate(lines,1):
     ...:             if '*Node' in line:
     ...:                 nodeLineStart = num
     ...:             if '*Element' in line:
     ...:                 nodeLineEnd = num
In [140]: data = np.loadtxt(lines[nodeLineStart:nodeLineEnd-1], delimiter=',')

np.genfromtxt/loadtxt work with anything that gives them a set of lines: a file iterator, a StringIO, or just a list of strings.
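
As a small illustration of that last point, the sketch below passes the same lines to np.loadtxt first as a list of strings and then as a StringIO object. The sample lines are made up for the example.

```python
from io import StringIO

import numpy as np

lines = ["      1,  50.,  20.,  40.\n", "      2, 100.,  20.,  40.\n"]

# A plain list of strings works as input ...
from_list = np.loadtxt(lines, delimiter=",")

# ... and so does a file-like object wrapping the same text.
from_stringio = np.loadtxt(StringIO("".join(lines)), delimiter=",")

print(np.array_equal(from_list, from_stringio))
```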

