How to modify this script to search for multiple keywords?

I am trying to modify a script, and it is proving difficult, so I came for help. The script is supposed to extract data from some .out files and write it to a .txt file. The problem is that I have two different keywords to look for. Below I provide the script, with comments marking the parts I am not able to modify, followed by two examples of input files.

#!/usr/bin/env python
# -*- coding: utf-8

#~ Data analysis
import glob, subprocess, shutil, os, math
from funciones import *
for namefile in glob.glob("*.mol2"):
    lstmol2 = []
    lstG=[]
    os.chdir("some_directory")
    searchprocess="grep -i -H 'CURRENT VALUE OF HEAT OF FORMATION =' *.out | sort -k 4 > firstfile.txt" 
#~I need also to look for 'CURRENT BEST VALUE OF HEAT OF FORMATION ='
    os.system(searchprocess)

    fileout=open("results.txt","w")   
    filein=open("firstfile.txt", "r")
    #~ write data in results.txt
    fileout.write('\t %s \n' %("  HOF"))

    for line in filein:
        linediv=line.split()
        HOF=float(linediv[8])

        #~ or [10] (for the keyword I need to add), but in both cases I need the float. The data for both keywords must be included in this file.
        lstG.append(HOF)
    fileout.close()
    filein.close()

Input data, type 1:

foofoofooofoofoofoofoofoo
foofoofooofoofoofoofoofoov
foofoofooofoofoofoofoofoo
CURRENT VALUE OF HEAT OF FORMATION = 1928
foofoofooofoofoofoofoofoo
foofoofooofoofoofoofoofoov

Input data, type 2:

foofoofooofoofoofoofoofoo
foofoofooofoofoofoofoofoov
foofoofooofoofoofoofoofoo
CURRENT BEST VALUE OF HEAT OF FORMATION = 1930
foofoofooofoofoofoofoofoo
foofoofooofoofoofoofoofoov

Update your grep command to match the optional word with the ? operator. Use the -E flag to enable extended regular expressions so you don't have to escape the regex operators, and always put single quotes around the pattern:

searchprocess="grep -E -i -H 'CURRENT( BEST)? VALUE OF HEAT OF FORMATION =' *.out | sort -k 4 > firstfile.txt"
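To check the pattern without running grep, the same expression can be tried with Python's re module. This is only a minimal sketch: the sample lines are copied from the two input examples in the question, and the name HOF_PATTERN is a hypothetical one chosen for illustration.

```python
import re

# Same pattern as the grep -E -i call: " BEST" is optional and case is ignored.
# HOF_PATTERN is a hypothetical name, not part of the original script.
HOF_PATTERN = re.compile(
    r"CURRENT( BEST)? VALUE OF HEAT OF FORMATION =", re.IGNORECASE
)

samples = [
    "CURRENT VALUE OF HEAT OF FORMATION = 1928",
    "CURRENT BEST VALUE OF HEAT OF FORMATION = 1930",
    "foofoofooofoofoofoofoofoo",
]

# True for the two keyword lines, False for the filler line.
matches = [bool(HOF_PATTERN.search(s)) for s in samples]
print(matches)
```

Both keyword variants match, while the filler line does not.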

@PrestonHager is correct that you should change linediv[8] to linediv[-1]: when BEST is present, the number is at linediv[9], but linediv[-1] gives the desired result in both cases.
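The effect of the negative index can be seen in a small sketch. It assumes that each matched line ends with the number and that grep -H output has whitespace between the file name prefix and the keyword; the file names a.out and b.out and the helper last_field_as_float are hypothetical.

```python
def last_field_as_float(line):
    """Return the last whitespace-separated field of a line as a float."""
    return float(line.split()[-1])

# Hypothetical grep -H style lines; the file names are made up for illustration.
grep_lines = [
    "a.out: CURRENT VALUE OF HEAT OF FORMATION = 1928",
    "b.out: CURRENT BEST VALUE OF HEAT OF FORMATION = 1930",
]

# The number sits at index 8 without BEST and at index 9 with it,
# but index -1 picks it up in both cases.
hofs = [last_field_as_float(line) for line in grep_lines]
print(hofs)
```

If the real .out lines carry a unit or other text after the number, the last field would no longer be the value and a different index would be needed.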
