Search for a string with a variable number in Python

I have a text file that contains several blocks of lines in the following format:

ELEMENT=      1 PLY=  1
-----------------------
 Code 1425                                    
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 3  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 4  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00

The number after the word ELEMENT goes from 1 to 60. My first goal is to read this text file and stop at every occurrence of ELEMENT, from ELEMENT = 1 to ELEMENT = 60.

My test script reads the first occurrence of ELEMENT. I would now like to go through all 60 occurrences, so I tried to put a variable after ELEMENT in the search pattern. In this example I initialized it to 2 to see whether it would work and, as you can guess, it doesn't (see the example code below).

elem= 2
lines = open("myfile.txt", "r" ).readlines()

for line in lines:
 if re.search( r"ELEMENT=      %i" (line, elem) ):
   words = line.split()

   energy = float( words[1] )

   print "%f" % energy
   break

I get the following error:

File "recup.py", line 42, in <module>
if re.search( r"ELEMENT=      %i" (line, elem) ):
TypeError: 'str' object is not callable

My question, then, is: how do I put a variable into my search?

Just iterate over the blocks:

import re

txt='''\
ELEMENT=      1 PLY=  1
-----------------------
 Code 1425                                    
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 3  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 4  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00

ELEMENT=      2 PLY=  22
-----------------------
 Code 1426                                 
    GP= 5  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 6  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 7  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 8  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00    
    '''

for i, m in enumerate(re.finditer(r'^ELEMENT=\s+(\d+.*?)(?=^ELEMENT|\Z)', txt, re.M | re.S)):
    print('Group {}===:\n{}'.format(i, m.group(1)))

This finds each block of lines starting with ELEMENT and ending at either the next block or the end of the file. Then parse each block found into whatever you need.

Prints:

Group 0===:
1 PLY=  1
-----------------------
 Code 1425                                    
    GP= 1  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 2  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 3  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 4  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00


Group 1===:
2 PLY=  22
-----------------------
 Code 1426                                 
    GP= 5  4.324E-03 -1.350E-03 -2.974E-03  3.084E-04  0.000E+00  0.000E+00
    GP= 6  1.435E-03 -3.529E-04 -1.082E-03  1.183E-04  0.000E+00  0.000E+00
    GP= 7  7.742E-03 -3.542E-03 -4.200E-03  4.714E-04  0.000E+00  0.000E+00
    GP= 8  4.842E-03 -2.378E-03 -2.463E-03  3.040E-04  0.000E+00  0.000E+00  
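
As a rough sketch of the "parse the block" step, each matched block could be turned into rows of floats. This is written for Python 3. The helper name `parse_block` and the shortened sample text are made up for illustration, and the code assumes every data row starts with `GP=` followed by a space and the point number.

```python
import re

# Shortened sample in the same layout as the file (illustrative only).
txt = '''\
ELEMENT=      1 PLY=  1
-----------------------
 Code 1425
    GP= 1  4.324E-03 -1.350E-03
    GP= 2  1.435E-03 -3.529E-04

ELEMENT=      2 PLY=  22
-----------------------
 Code 1426
    GP= 5  7.742E-03 -3.542E-03
'''

def parse_block(block):
    """Return {gp_number: [floats]} for the GP rows of one block (hypothetical helper)."""
    rows = {}
    for line in block.splitlines():
        words = line.split()
        # A GP row looks like: GP= 1  4.324E-03 ...
        if len(words) > 2 and words[0] == 'GP=':
            rows[int(words[1])] = [float(w) for w in words[2:]]
    return rows

# Group 1 is the element number, group 2 is the rest of the block.
pattern = r'^ELEMENT=\s+(\d+)(.*?)(?=^ELEMENT|\Z)'
parsed = {}
for m in re.finditer(pattern, txt, re.M | re.S):
    parsed[int(m.group(1))] = parse_block(m.group(2))

print(parsed[2])
```

Keying the result by element number means a specific element can be looked up directly, for example `parsed[1][2]` for the second GP row of element 1.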

I'm not entirely sure what you're trying to do, but if you're trying to test which ELEMENT you're on, this would be a better way:

import re

elem = 2
lines = open("myfile.txt", "r").readlines()

for line in lines:
  if re.match(r"ELEMENT=", line):
    words = line.split()
    if int(words[1]) == elem:
      # Do whatever you're trying to do.
      pass

If the line you are searching for always starts with "ELEMENT", there is an easy way to work around this:

lines = open("myfile.txt", "r").readlines()
for line in lines:
  if line.startswith("ELEMENT"):
    words = line.split()
    print("ELEMENT : " + words[1] + ", PLY : " + words[3])

With this, the line contents are printed every time an "ELEMENT" line is found. You can easily extract the "CODE" and "GP" line contents using the same trick ;).
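
A possible way to extend the same trick to the "Code" and "GP" lines is sketched below (Python 3). The function name `read_elements` and the dictionary layout are my own choices rather than anything from the question, and the sample lines are a shortened copy of the format shown there.

```python
def read_elements(lines):
    """Group Code and GP rows under the ELEMENT line that precedes them."""
    elements = {}
    current = None
    for line in lines:
        words = line.split()
        if not words:
            continue  # blank separator line
        if words[0].startswith("ELEMENT"):
            # ELEMENT=  <number>  PLY=  <ply>
            current = {"ply": int(words[3]), "code": None, "gp": {}}
            elements[int(words[1])] = current
        elif current is not None and words[0] == "Code":
            current["code"] = int(words[1])
        elif current is not None and words[0].startswith("GP"):
            current["gp"][int(words[1])] = [float(w) for w in words[2:]]
    return elements

sample = """\
ELEMENT=      1 PLY=  1
-----------------------
 Code 1425
    GP= 1  4.324E-03 -1.350E-03
    GP= 2  1.435E-03 -3.529E-04
""".splitlines()

result = read_elements(sample)
print(result[1]["code"], sorted(result[1]["gp"]))
```

Because the function only iterates over lines, an open file object can be passed in place of `sample`.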

A few simple changes:

elem = 2
lines = open("myfile.txt", "r").readlines()

for line in lines:
    words = line.split()
    # Blank lines split into an empty list, so check before indexing.
    if words and words[0].startswith('ELEMENT'):
        number = int(words[1])
        if number == elem:
            print("%d" % number)
            break

Don't try to compare floats with ==; it seldom turns out well.
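
To illustrate the point about `==` and floats, here is a small sketch. It needs Python 3.5 or later because it uses `math.isclose`, and the tolerance value is an arbitrary choice for the example.

```python
import math

a = 0.1 + 0.2
b = 0.3

print(a == b)                            # exact comparison of two floats
print(math.isclose(a, b, rel_tol=1e-9))  # tolerance-based comparison

# Element numbers are whole numbers, so converting with int() and
# comparing integers avoids the problem entirely.
print(int("2") == 2)
```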

If I understood your question correctly, you can "plant" the variable into the search like this:

if re.search( r"ELEMENT=      {}".format(elem), line ):
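
The original error comes from the missing `%` operator: `r"..." (line, elem)` is parsed as a call on the string, hence "'str' object is not callable". A fuller sketch of the formatted-pattern approach follows (Python 3). The function name `find_element` and the sample lines are hypothetical. It uses `\s+` instead of a fixed run of spaces and a trailing `\b` so that searching for 2 does not also match 20. Whether the padding in the real file actually varies is an assumption.

```python
import re

def find_element(lines, elem):
    """Return the first line whose ELEMENT number equals elem, or None."""
    # \s+ tolerates any amount of padding; \b stops 2 from matching 20.
    pattern = re.compile(r"ELEMENT=\s+{}\b".format(elem))
    for line in lines:
        if pattern.search(line):
            return line
    return None

sample = [
    "ELEMENT=      1 PLY=  1",
    "ELEMENT=     20 PLY=  3",
    "ELEMENT=      2 PLY=  22",
]

print(find_element(sample, 2))
```

Calling it in a loop such as `for elem in range(1, 61)` would visit each element number in turn.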
