
Python script not iterating through array

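I recently started learning Python, and at work we needed a way to simplify searching log files for specific keywords, so that it is easier to identify IPs to add to a block list.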

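I decided to write a Python script that takes in a log file and a file containing a list of key terms, searches the log file for those key terms, and then writes every line whose session ID matches a line where a key term was found to a new file.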

import sys
import time
import linecache
from datetime import datetime

def timeStamped(fname, fmt='%Y-%m-%d-%H-%M-%S_{fname}'):
    return datetime.now().strftime(fmt).format(fname=fname)

importFile = open('rawLog.txt', 'r') #pulling in log file
importFile2 = open('keyWords.txt', 'r') #pulling in keywords
exportFile = open(timeStamped('ParsedLog.txt'), 'w') #writing the parsed log

FILE = importFile.readlines()
keyFILE = importFile2.readlines()

logLine = 1  #for debugging purposes when testing
parseString = '' 
holderString = ''
sessionID = []
keyWords= []
j = 0

for line in keyFILE: #go through each line in the keyFile 
        keyWords = line.split(',') #add each word to the array

print(keyWords)#for debugging purposes when testing, this DOES give all the correct results


for line in FILE:
        if keyWords[j] in line:
                parseString = line[29:35] #pulling in session ID
                sessionID.append(parseString) #saving session IDs to a list
        elif importFile == '' and j < len(keyWords):  #if importFile is at end of file and we are not at the end of the array
                importFile.seek(0) #goes back to the start of the file
                j+=1        #advance the keyWords array

        logLine +=1 #for debugging purposes when testing
importFile2.close()              
print(sessionID) #for debugging purposes when testing



importFile.seek(0) #goes back to the start of the file


i = 0
for line in FILE:
        if sessionID[i] in line[29:35]: #checking if the sessionID matches (doing it this way since I ran into issues where some sessionIDs matched parts of the log file that were not sessionIDs
                holderString = line #pulling the line of log file
                exportFile.write(holderString)#writing the log file line to a new text file
                print(holderString) #for debugging purposes when testing
                if i < len(sessionID):
                    i+=1

importFile.close()
exportFile.close()

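The script does not iterate over my keyWords list. I have probably made some silly rookie mistake, but I don't have enough experience to see where I went wrong. When I check the output, it only searches rawLog.txt for the first item in the keyWords list.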

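The third loop does return results based on the sessionIDs collected by the second loop, and it does attempt to iterate through them (this raises an out-of-range exception because the check on i against the length of the sessionID list never stops it in time, since sessionID only holds 1 value).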

The program does successfully create and name the new log file, using the date and time followed by ParsedLog.txt.

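If the elif is never True, j is never incremented, so you need to either increment it unconditionally or check whether the elif condition can ever actually be True: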

for line in FILE:
    if keyWords[j] in line:
        parseString = line[29:35]  # pulling in session ID
        sessionID.append(parseString)  # saving session IDs to a list
    elif importFile == '' and j < len(keyWords):  # intended as an end-of-file check, but never True
        importFile.seek(0)  # goes back to the start of the file
    j += 1  # always increase; note that keyWords[j] goes out of range once j reaches len(keyWords)

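Looking at the loop above: you create the file object earlier in your code with importFile = open('rawLog.txt', 'r'), so the comparison elif importFile == '' will never be True, because importFile is a file object, not a string.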
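To make that concrete, here is a small sketch. The temporary file and its contents are invented for illustration and stand in for rawLog.txt. It shows that a file object never compares equal to an empty string, whereas read() returning '' is what actually signals end of file.

```python
import os
import tempfile

# Throwaway file standing in for rawLog.txt (hypothetical contents).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\n')
    path = tmp.name

with open(path, 'r') as import_file:
    # A file object is never equal to a string, so this is always False.
    compared_to_empty = (import_file == '')

    import_file.readlines()  # consume the whole file
    # Once the file is exhausted, read() returns an empty string.
    at_eof = (import_file.read() == '')

os.remove(path)
print(compared_to_empty, at_eof)
```

If an end-of-file check were really needed, testing the value returned by read() or readline() would be the way to do it, not comparing the file object itself.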

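You assign FILE = importFile.readlines(), which does exhaust the file iterator while building the FILE list. You then call importFile.seek(0), but you never actually use the file object again anywhere.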

So basically you loop over FILE once, j never increases, and then your code moves on to the next block.
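The following sketch illustrates why seek(0) has no effect on the loop. It uses io.StringIO as a stand-in for the opened log file, and the sample lines are made up. The list returned by readlines() is independent of the file position, so rewinding the file does not restart or extend a loop over that list.

```python
import io

# io.StringIO stands in for the opened log file (hypothetical contents).
import_file = io.StringIO('line one\nline two\nline three\n')

log_lines = import_file.readlines()  # reads everything into a list
leftover = import_file.readlines()   # the file position is now at the end

import_file.seek(0)                  # rewinds the file, not the list
visited = [line.strip() for line in log_lines]  # still a single pass over 3 items

print(leftover)
print(visited)
```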

What you actually need is a nested loop. Use any to check whether any word from keyWords appears in each line, and forget about your elif:

for line in FILE:
    if any(word in line for word in keyWords):
        parseString = line[29:35]  # pulling in session ID
        sessionID.append(parseString)  # saving session IDs to a list

The same logic applies to your next loop:

for line in FILE:
    # checking if the sessionID matches (done this way because some sessionIDs
    # matched parts of the log file that were not sessionIDs)
    if any(sess in line[29:35] for sess in sessionID):
        exportFile.write(line)  # writing the log file line to a new text file

holderString = line does nothing except create a second name for the same object as line, so you can simply call exportFile.write(line) and drop the assignment.
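Putting the two loops together, a minimal sketch might look like this. The function names, the sample log lines and the keywords are invented for illustration; the only detail taken from the question is that the session ID sits at line[29:35]. The sketch stores IDs in a set and compares the slice for exact equality instead of using in, which assumes session IDs are always exactly six characters long.

```python
def collect_session_ids(log_lines, keywords):
    """Return the session IDs of every line that contains any keyword."""
    return {line[29:35] for line in log_lines
            if any(word in line for word in keywords)}


def lines_for_sessions(log_lines, session_ids):
    """Return the lines whose session ID slice is one of session_ids."""
    return [line for line in log_lines if line[29:35] in session_ids]


def make_line(session_id, message):
    # Hypothetical log layout: 29 characters of prefix, then the session ID.
    return '2020-01-01 00:00:00 host'.ljust(29) + session_id + ' ' + message + '\n'


log_lines = [
    make_line('AAA111', 'login failed'),
    make_line('BBB222', 'page view'),
    make_line('AAA111', 'logout'),
    make_line('CCC333', 'sql injection attempt'),
]
keywords = ['failed', 'injection']

session_ids = collect_session_ids(log_lines, keywords)
matched = lines_for_sessions(log_lines, session_ids)
print(sorted(session_ids))
print(len(matched))
```

Using a set also avoids storing the same session ID several times when more than one of its lines contains a keyword.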

On a side note, use lowercase with underscores for variable names (holderString -> holder_string), and open your files using with, since it also closes them for you.

with open('rawLog.txt') as import_file:
    log_lines = import_file.readlines()

I also changed FILE to log_lines; more descriptive names make your code easier to follow.
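As a sketch of how both input files could be read this way: the function name read_inputs and the demo files below are hypothetical. Stripping whitespace from each keyword and collecting keywords from every line with extend are additions that are not in the original code. They are included because split(',') leaves the trailing newline attached to the last keyword of a line, and reassigning the list inside the loop keeps only the last line's keywords.

```python
import os
import tempfile


def read_inputs(log_path, keywords_path):
    """Return (log_lines, keywords) read from the two files."""
    with open(log_path) as import_file, open(keywords_path) as keyword_file:
        log_lines = import_file.readlines()
        keywords = []
        for line in keyword_file:
            # strip() drops the newline left on the last keyword of each line
            keywords.extend(word.strip() for word in line.split(',') if word.strip())
    # Both files are closed once the with block ends.
    return log_lines, keywords


# Demo with throwaway files (names and contents are made up).
with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, 'rawLog.txt')
    keywords_path = os.path.join(tmp_dir, 'keyWords.txt')
    with open(log_path, 'w') as f:
        f.write('first log line\nsecond log line\n')
    with open(keywords_path, 'w') as f:
        f.write('failed, injection\ndenied\n')

    log_lines, keywords = read_inputs(log_path, keywords_path)

    with open(log_path) as import_file:
        pass
    closed_after_with = import_file.closed

print(keywords)
print(closed_after_with)
```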

It seems to me that your second loop needs an inner loop rather than an inner if statement. For example:

for line in FILE:
    for word in keyWords:
        if word in line:
            parseString = line[29:35]  # pulling in session ID
            sessionID.append(parseString)  # saving session IDs to a list
            break  # Assuming there will only be one keyword per line, else remove this
    logLine += 1  # for debugging purposes when testing
importFile2.close()
print(sessionID)  # for debugging purposes when testing

That is, assuming I have understood you correctly.
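One thing the explicit inner loop offers is knowing which keyword matched. The sketch below is a hypothetical variation on the loop above: the function name find_matches and the sample data are invented, and enumerate takes the place of the manual logLine counter.

```python
def find_matches(log_lines, keywords):
    """Return (line_number, session_id, keyword) for each line with a keyword."""
    matches = []
    # enumerate replaces the manual logLine counter; numbering starts at 1.
    for line_number, line in enumerate(log_lines, start=1):
        for word in keywords:
            if word in line:
                matches.append((line_number, line[29:35], word))
                break  # stop at the first keyword found on this line
    return matches


# Hypothetical log layout: 29 characters of prefix, then the session ID.
prefix = '2020-01-01 00:00:00 host'.ljust(29)
log_lines = [
    prefix + 'AAA111 login failed\n',
    prefix + 'BBB222 page view\n',
    prefix + 'CCC333 failed injection attempt\n',
]

matches = find_matches(log_lines, ['injection', 'failed'])
print(matches)
```

Because of the break, a line containing several keywords is recorded once, under whichever keyword comes first in the keyword list.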

