
Python: iterate through a file and extract specific lines to a list for SQL insertion

I have a log file that is output as .txt. Unfortunately, the log lists each job's details one field per line, with the jobs stacked underneath each other, which makes it difficult to extract each record and insert it into a table.

I want to pull out one record per job, like this:

(Job ID = 4126, Owner = User One, Job Status = Executing, Description = Running document Document1 Dashboard, Creation Time = 09 June 2022 11:51:23 BST, Project ID = 511C117A11E2E7CC029B0080EF659E7B, Project = Testing Project, Job Duration = 5366)

(Job ID = 5682, Owner = User Two, Job Status = Waiting For Children., Description = Running document Safety Dashboard, Creation Time = 09 June 2022 13:11:59 BST, Project ID = 511C117A11E2E7CC029B0080EF659E7B, Project = Testing Project, Job Duration = 530)

(Job ID = 5683, Owner = User Three, Job Status = Executing, Description = Running First report , Creation Time = 09 June 2022 13:11:59 BST, Project ID = 511C117A11E2E7CC029B0080EF659E7B, Project = Testing Project, Job Duration = 530)

(Job ID = 5684, Owner = User Four, Job Status = Executing, Description = Running Second report , Creation Time = 09 June 2022 13:11:59 BST, Project ID = 511C117A11E2E7CC029B0080EF659E7B, Project = Testing Project, Job Duration = 530)


09/06/22 13:20:48 BST Version 11.27.0 (Build 11.2365435459.0000.2270)
09/06/22 13:20:48 BST Connected:Administrator
09/06/22 13:20:48 BST Executing task(s)...
09/06/22 13:20:48 BST Checking syntax...
09/06/22 13:20:48 BST Syntax checking has been completed.
Job ID = 4126
Owner = User One
Job Status = Executing
Description = Running document Document1 Dashboard
Creation Time = 09 June 2022 11:51:23 BST
Project ID = 511C117A11E2E7CC029B0080EF659E7B
Project = Testing Project
Job Duration = 5366
Job ID = 5682
Owner = User Two
Job Status = Waiting For Children.
Description = Running document Safety Dashboard
Creation Time = 09 June 2022 13:11:59 BST
Project ID = 511C117A11E2E7CC029B0080EF659E7B
Project = Testing Project
Job Duration = 530
Job ID = 5683
Owner = User Three
Job Status = Executing
Description = Running First report 
Creation Time = 09 June 2022 13:11:59 BST
Project ID = 511C117A11E2E7CC029B0080EF659E7B
Project = Testing Project
Job Duration = 530
Job ID = 5684
Owner = User Four
Job Status = Executing
Description = Running Second report 
Creation Time = 09 June 2022 13:11:59 BST
Project ID = 511C117A11E2E7CC029B0080EF659E7B
Project = Testing Project
Job Duration = 530
=================================================
09/06/22 13:20:49 BST Execution Time: 00:00:01
09/06/22 13:20:49 BST Successfully disconnected. Prod: Administrator
###################################################################
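
One way the text log could be read directly is to treat every "Key = Value" line as a field and start a new record each time a "Job ID" line appears. The following is only a minimal sketch under that assumption; the names parse_jobs and FIELD_RE are made up for illustration, and the sample string is a shortened copy of the log above.

```python
import re

# Matches "Key = Value" lines. Timestamped lines and the "====" / "####"
# separators do not start with a letter-only key followed by " = ", so they are skipped.
FIELD_RE = re.compile(r"^([A-Za-z ]+?) = (.*)$")


def parse_jobs(lines):
    """Group 'Key = Value' lines into one dict per job (hypothetical helper)."""
    jobs = []
    current = None
    for line in lines:
        match = FIELD_RE.match(line.rstrip("\n"))
        if not match:
            continue
        key, value = match.group(1), match.group(2).strip()
        if key == "Job ID":
            # Assumption: every record begins with its "Job ID" line.
            current = {}
            jobs.append(current)
        if current is not None:
            current[key] = value
    return jobs


# Shortened copy of the log shown above.
sample = """09/06/22 13:20:48 BST Syntax checking has been completed.
Job ID = 4126
Owner = User One
Job Status = Executing
Description = Running document Document1 Dashboard
Creation Time = 09 June 2022 11:51:23 BST
Project ID = 511C117A11E2E7CC029B0080EF659E7B
Project = Testing Project
Job Duration = 5366
Job ID = 5682
Owner = User Two
Job Status = Waiting For Children.
Description = Running document Safety Dashboard
Creation Time = 09 June 2022 13:11:59 BST
Project ID = 511C117A11E2E7CC029B0080EF659E7B
Project = Testing Project
Job Duration = 530
=================================================
09/06/22 13:20:49 BST Execution Time: 00:00:01
"""

jobs = parse_jobs(sample.splitlines())
for job in jobs:
    print(job["Job ID"], job["Owner"], job["Job Status"], job["Job Duration"])
```

Each dict then holds one job's fields keyed by the labels used in the log, so the values can be passed on to whatever does the insert.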

I couldn't work with the text file directly, but I managed to export the required results as XML, convert them to CSV, and then convert that to a text file that I could use to insert into a table.

# EXPORT to XML and convert to CSV
from xml.etree import ElementTree
import csv

# PARSE XML
xml = ElementTree.parse("Job_Monitoring.xml")

# CREATE CSV FILE (newline='' stops the csv module writing blank rows on Windows)
with open("output.csv", 'w', encoding='utf-8', newline='') as csvfile:
    csvfile_writer = csv.writer(csvfile)

    # ADD THE HEADER TO CSV FILE
    csvfile_writer.writerow(["Owner", "JobId", "Description", "Project",
                             "CreationTime", "JobStatus", "JobDuration"])

    # FOR EACH JOB
    for jobs in xml.findall("Row"):
        # SKIP ROWS THAT HAVE NO CHILD ELEMENTS
        if len(jobs) == 0:
            continue
        Owner = jobs.find("Owner")
        JobID = jobs.find("JobId")
        Description = jobs.find("Description")
        Project = jobs.find("Project")
        CreationTime = jobs.find("CreationTime")  # not used; CURRENT_TIMESTAMP is inserted instead
        JobStatus = jobs.find("JobStatus")
        JobDuration = jobs.find("JobDuration")
        # BUILD THE INSERT STATEMENT AS A LIST OF CSV FIELDS
        csv_line = [
            "INSERT INTO SCHEMA.TABLE_NAME (USERNAME, JOB_ID, REPORT_NAME, "
            "PROJECT_NAME, TIME_EXECUTED, JOB_STATUS, JOB_DURATION) "
            "VALUES ('" + Owner.text + "'",
            JobID.text,
            "'" + Description.text + "'",
            "'" + Project.text + "'",
            "CURRENT_TIMESTAMP",
            "'" + JobStatus.text + "'",
            JobDuration.text + ");",
        ]

        # ADD A NEW ROW TO CSV FILE (inside the loop, so rows without data are never written)
        csvfile_writer.writerow(csv_line)

Then I performed a cleanup to remove the header row and stray characters, and saved the result as a text file:

#read the CSV file contents into a string
with open("output.csv", "rt", encoding="utf-8") as fin:
    data = fin.read()
#remove the header row (the column order must match the header written above)
data = data.replace("Owner,JobId,Description,Project,CreationTime,JobStatus,JobDuration", '')
#remove the double quotes csv.writer puts around the first field because it contains commas
data = data.replace('"', '')
#write the resulting data to the output text file
with open("output.txt", "wt", encoding="utf-8") as fout:
    fout.write(data)
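
Building the INSERT statements by string concatenation breaks as soon as a value contains an apostrophe, and it leaves the statements open to SQL injection. A possible alternative is to skip the CSV and text steps and pass the XML values to the database driver as bound parameters. The sketch below assumes an in-memory SQLite database as a stand-in for the real one, a hypothetical table name JOB_MONITORING, and a made-up sample XML string that follows the Row/Owner/JobId layout used in the script above; the second row's description is invented to show an apostrophe being handled.

```python
import sqlite3
from xml.etree import ElementTree

# Made-up sample in the same layout the script above reads from Job_Monitoring.xml.
SAMPLE_XML = """<Rows>
  <Row>
    <Owner>User One</Owner>
    <JobId>4126</JobId>
    <Description>Running document Document1 Dashboard</Description>
    <Project>Testing Project</Project>
    <CreationTime>09 June 2022 11:51:23 BST</CreationTime>
    <JobStatus>Executing</JobStatus>
    <JobDuration>5366</JobDuration>
  </Row>
  <Row>
    <Owner>User Two</Owner>
    <JobId>5682</JobId>
    <Description>Running document Owner's Dashboard</Description>
    <Project>Testing Project</Project>
    <CreationTime>09 June 2022 13:11:59 BST</CreationTime>
    <JobStatus>Waiting For Children.</JobStatus>
    <JobDuration>530</JobDuration>
  </Row>
</Rows>"""


def rows_from_xml(root):
    """Yield one tuple of column values per <Row> element (hypothetical helper)."""
    for row in root.findall("Row"):
        yield (
            row.findtext("Owner"),
            int(row.findtext("JobId")),
            row.findtext("Description"),
            row.findtext("Project"),
            row.findtext("JobStatus"),
            int(row.findtext("JobDuration")),
        )


# Assumption: SQLite stands in for the real database; the table name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE JOB_MONITORING ("
    "USERNAME TEXT, JOB_ID INTEGER, REPORT_NAME TEXT, PROJECT_NAME TEXT, "
    "TIME_EXECUTED TEXT DEFAULT CURRENT_TIMESTAMP, "
    "JOB_STATUS TEXT, JOB_DURATION INTEGER)"
)

root = ElementTree.fromstring(SAMPLE_XML)
# The "?" placeholders let the driver quote each value, apostrophes included.
conn.executemany(
    "INSERT INTO JOB_MONITORING "
    "(USERNAME, JOB_ID, REPORT_NAME, PROJECT_NAME, JOB_STATUS, JOB_DURATION) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows_from_xml(root),
)
conn.commit()

inserted = conn.execute(
    "SELECT JOB_ID, REPORT_NAME FROM JOB_MONITORING ORDER BY JOB_ID"
).fetchall()
print(inserted)
```

The "?" placeholder is what sqlite3 uses; other database drivers may expect a different placeholder style, so the statement would need adjusting for the actual target database.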


 