Python pattern match from a file

Experts, I am just trying to match a pattern in my raw data file so that I can list the services that are not running in HTML format.

I put together the code below with some help from Googling, but it is not working. Any help would be greatly appreciated.

code:

Html_file= open("result1.html","w")
html_str = """
<table border=1>
     <tr>
       <th bgcolor=fe9a2e>Hostname</th>
       <th bgcolor=fe9a2e>Service</th>
     </tr>
"""
Html_file.write(html_str)
fh=open(sys.argv[1],"r")
for line in fh:
        pat_match=re.match("^HostName:"\s+(.*?)\".*", line)
        pat_match1=re.match("^Service Status:"\s+(.*not.*?)\".*", line)
        if pat_match:
                Html_file.write("""<TR><TD bgcolor=fe9a2e>""" + pat_match.group(1) + """</TD>\n""")
        elif pat_match1:
                Html_file.write("""<TR><TD><TD>""" + pat_match1.group(2) + """</TD></TD></TR>\n""")

raw data:

HostName: dbfoxn001.examle.com
Service Status:  NTP is Running on the host dbfoxn001.examle.com
Service Status:  NSCD is not Running on the host dbfoxn001.examle.com
Service Status:  SSSD is Running on the host dbfoxn001.examle.com
Service Status:  Postfix  is Running on the host dbfoxn001.examle.com
Service Status:  Automount is Running on the host dbfoxn001.examle.com
HostName: dbfoxn002.examle.com                   SSH Authentication failed

Required Result:

Hostname                        Service
dbfoxn001.examle.com            NSCD is not Running on the host dbfoxn001.examle.com

Your first problem is that your regex is not properly embedded in a string: the " right after HostName: (and after Service Status:) ends the string literal early. You need to either escape or remove the offending "s.

Beyond that, the regex itself doesn't match your input data; for example, you are trying to match some "s which aren't in your input data. I have rewritten the regexes as follows:

^HostName:\s*(.+)
^Service Status:\s*(.+is not Running.*)
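
As a quick check, here is a minimal sketch that runs the two patterns above against a few lines copied from the sample data in the question. The names HOST_RE, STATUS_RE and classify are my own, made up for illustration; they are not part of the original code.

```python
import re

# The two patterns from above, written as raw strings so that \s
# reaches the regex engine unchanged.
HOST_RE = re.compile(r"^HostName:\s*(.+)")
STATUS_RE = re.compile(r"^Service Status:\s*(.+is not Running.*)")

# Lines copied from the raw data in the question.
sample_lines = [
    "HostName: dbfoxn001.examle.com",
    "Service Status:  NTP is Running on the host dbfoxn001.examle.com",
    "Service Status:  NSCD is not Running on the host dbfoxn001.examle.com",
]


def classify(line):
    """Return ("host", text), ("down", text), or None if nothing matches."""
    host = HOST_RE.match(line)
    if host:
        return ("host", host.group(1))
    status = STATUS_RE.match(line)
    if status:
        return ("down", status.group(1))
    return None


results = [classify(line) for line in sample_lines]
for line, result in zip(sample_lines, results):
    print(result, "<-", line)
```

Only the hostname line and the "is not Running" line produce a match; the "NTP is Running" line is skipped, which is the filtering the question asks for.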

Lastly, your Python code does not seem to generate the HTML you want. I assume the HTML for your sample table should look like this:

<table border=1>
  <tr>
    <th bgcolor=fe9a2e>Hostname</th>
    <th bgcolor=fe9a2e>Service</th>
  </tr>
  <tr>
    <td>dbfoxn001.examle.com</td>
    <td>NSCD is not Running on the host dbfoxn001.examle.com</td>
  </tr>
</table>

To that end, I store the hostname in a variable instead of writing it to the file immediately, and write it out each time a failing status is parsed. I have also added the missing final </table> and closed the open files:

import sys
import re

result = open("result1.html","w")
table_header = """
<table border=1>
     <tr>
       <th bgcolor=fe9a2e>Hostname</th>
       <th bgcolor=fe9a2e>Service</th>
     </tr>
"""
result.write(table_header)
input_file = open(sys.argv[1], "r")
hostname = ""  # most recent HostName seen; stays empty until the first one appears
for line in input_file:
        # raw strings keep \s intact for the regex engine
        host_match = re.match(r"^HostName:\s*(.+)", line)
        status_match = re.match(r"^Service Status:\s*(.+is not Running.*)", line)
        if host_match:
                hostname = host_match.group(1)
        elif status_match:
                result.write("""<tr><td>""" + hostname + """</td><td>""" + status_match.group(1) + """</td></tr>\n""")
result.write("""</table>""")
input_file.close()
result.close()
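
If you want something easier to test, the same idea can be split into a parsing step and a rendering step. This is only a sketch under a few assumptions of mine: the function names failed_services and render_table are made up, the hostname pattern uses (\S+) instead of (.+) so that trailing text such as "SSH Authentication failed" is not captured as part of the hostname, and html.escape is applied in case the raw data ever contains characters like < or &.

```python
import html
import re

# Assumption: the hostname is the first whitespace-free token after "HostName:".
HOST_RE = re.compile(r"^HostName:\s*(\S+)")
STATUS_RE = re.compile(r"^Service Status:\s*(.+is not Running.*)")


def failed_services(lines):
    """Yield (hostname, status) for every 'is not Running' status line."""
    hostname = ""  # most recent hostname seen so far
    for line in lines:
        host = HOST_RE.match(line)
        if host:
            hostname = host.group(1)
            continue
        status = STATUS_RE.match(line)
        if status:
            yield hostname, status.group(1).rstrip()


def render_table(rows):
    """Build the HTML table as one string, escaping the cell contents."""
    parts = [
        "<table border=1>",
        "  <tr><th bgcolor=fe9a2e>Hostname</th><th bgcolor=fe9a2e>Service</th></tr>",
    ]
    for hostname, status in rows:
        parts.append(
            "  <tr><td>%s</td><td>%s</td></tr>"
            % (html.escape(hostname), html.escape(status))
        )
    parts.append("</table>")
    return "\n".join(parts) + "\n"


# Sample input copied from the raw data in the question.
raw = """HostName: dbfoxn001.examle.com
Service Status:  NTP is Running on the host dbfoxn001.examle.com
Service Status:  NSCD is not Running on the host dbfoxn001.examle.com
Service Status:  SSSD is Running on the host dbfoxn001.examle.com
HostName: dbfoxn002.examle.com                   SSH Authentication failed
"""

rows = list(failed_services(raw.splitlines()))
page = render_table(rows)
print(page)
```

To use it on a real file, you could pass the open file object to failed_services and write the returned string to result1.html inside a with block, so both files are closed even if an error occurs.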
