Appending a Pandas Series to a Dataframe in a loop

I am trying to append my nmap scan results to a dataframe.

def vulnScan(targets):
    portInfo =[]
    columnNames = ["Port","Protocol","State","Service"]
    for target in targets:
        portsDF = pd.DataFrame(columns = columnNames)
        print("Executing: nmap -Pn "+target[1])
        result = subprocess.run(['nmap','-Pn',target[1]], universal_newlines = True, stdout = subprocess.PIPE)
        for line in result.stdout.split("\n"):
            if "/" in line and "Starting" not in line:
                tableInfo = line.split(" ")
                port = tableInfo[0].split("/")[0]
                protocol = tableInfo[0].split("/")[1]
                status = tableInfo[1]
                service = tableInfo[3]
                print(port,protocol,status,service)
                newRow = pd.Series(data=[port,protocol,status,service],index=["Port","Protocol","State","Service"])
                portsDF = portsDF.append(newRow, ignore_index=True)
                print(tabulate(portsDF, headers="keys",tablefmt='psql'))
        portInfo = portInfo.append([target[0],portsDF])
    print("")
    print(tabulate(portInfo, headers="keys", tablefmt='psql'))

However, as you can see from the output, the dataframe is never populated.

80 tcp
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+
135 tcp open msrpc
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+
139 tcp open netbios-ssn
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+
443 tcp open https
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+
445 tcp open microsoft-ds
+--------+------------+---------+-----------+
| Port   | Protocol   | State   | Service   |
|--------+------------+---------+-----------|
+--------+------------+---------+-----------+

+-----------------+-------------------------------------------+
| 0               | 1                                         |
|-----------------+-------------------------------------------|
| DESKTOP-30UOSMD | Empty DataFrame                           |
|                 | Columns: [Port, Protocol, State, Service] |
|                 | Index: []                                 |
+-----------------+-------------------------------------------+

I am not sure what I am missing. I have checked the documentation at https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.append.html and I think I am using append() correctly.

Update

Richard's answer seems to work, but it results in portInfo no longer being a list; it is now of class NoneType.

def vulnScan(targets):
    portInfo = []
    print(type(portInfo))
    columnNames = ["Port","Protocol","State","Service"]
    for target in targets:
        rows = []
        print("Executing: nmap -Pn "+target[1])
        result = subprocess.run(['nmap','-Pn',target[1]], universal_newlines = True, stdout = subprocess.PIPE)
        for line in result.stdout.split("\n"):
            #This could be improved "/" indicates a row in table output
            if "/" in line and "Starting" not in line:
                tableInfo = line.split(" ")
                port = tableInfo[0].split("/")[0]
                protocol = tableInfo[0].split("/")[1]
                status = tableInfo[1]
                service = tableInfo[3]
                print(port,protocol,status,service)
                newRow = pd.Series(data=[port,protocol,status,service],index=["Port","Protocol","State","Service"])
                rows.append(newRow)

        portsDF = pd.DataFrame(rows, columns = columnNames)
        print(tabulate(portsDF, headers="keys", tablefmt='psql'))
        portInfo = portInfo.append([target[0],portsDF])
        print(type(portInfo))
        print(portInfo)

Output:

<class 'list'>
Executing: nmap -Pn 192.168.1.86
80 tcp
135 tcp open msrpc
139 tcp open netbios-ssn
443 tcp open https
445 tcp open microsoft-ds
+----+--------+------------+---------+--------------+
|    |   Port | Protocol   | State   | Service      |
|----+--------+------------+---------+--------------|
|  0 |     80 | tcp        |         |              |
|  1 |    135 | tcp        | open    | msrpc        |
|  2 |    139 | tcp        | open    | netbios-ssn  |
|  3 |    443 | tcp        | open    | https        |
|  4 |    445 | tcp        | open    | microsoft-ds |
+----+--------+------------+---------+--------------+
<class 'NoneType'>
None

The portInfo list should contain a list object holding the hostname (a string) and the port information (a dataframe).
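
The NoneType in the output above comes from the Python list, not from pandas: list.append modifies the list in place and returns None, so assigning its result back to portInfo replaces the list with None. A minimal sketch of that behaviour, using made-up placeholder values (the hostname and the string standing in for the dataframe are assumptions for illustration only):

```python
# list.append mutates the list in place and returns None.
port_info = []
returned = port_info.append(["DESKTOP-EXAMPLE", "ports dataframe placeholder"])

print(returned)     # the return value of append
print(port_info)    # the list itself now holds one [hostname, data] entry

# Reassigning the result is what turns the variable into None.
broken = []
broken = broken.append(["DESKTOP-EXAMPLE", "ports dataframe placeholder"])
print(type(broken))
```

Calling portInfo.append([target[0], portsDF]) on its own line, without the assignment, should therefore keep portInfo a list.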

pandas.DataFrame.append is not in-place, so it returns a new object, as the documentation page you linked states. So you would normally do the following:

portsDF = portsDF.append(newRow, ignore_index=True)

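That assignment also works inside a loop, since a Python for loop does not create a new scope. In this case, however, you are populating the dataframe row by row, and every call to append copies the whole dataframe, which gets slow as it grows. DataFrame.append was also deprecated in pandas 1.4 and removed in pandas 2.0.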

So in this case, I would create a list, append each row to it, and then build portsDF from that list once the loop has finished:

columnNames = ["Port","Protocol","State","Service"]
for target in targets:
    # New code:
    rows = []

    print("Executing: nmap -Pn "+target[1])
    result = subprocess.run(['nmap','-Pn',target[1]], universal_newlines = True, stdout = subprocess.PIPE)
    for line in result.stdout.split("\n"):
        if "/" in line and "Starting" not in line:
            tableInfo = line.split(" ")
            port = tableInfo[0].split("/")[0]
            protocol = tableInfo[0].split("/")[1]
            status = tableInfo[1]
            service = tableInfo[3]
            print(port,protocol,status,service)
            
            newRow = pd.Series(data=[port,protocol,status,service],index=["Port","Protocol","State","Service"])
            # New Code:
            rows.append(newRow)
    
    # New code:
    portsDF = pd.DataFrame(rows, columns=columnNames)
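
As a self-contained illustration of the same pattern, the sketch below uses hard-coded rows in place of real nmap output (the sample values are assumptions, borrowed from the output shown in the question) and builds the dataframe once, after the loop:

```python
import pandas as pd

column_names = ["Port", "Protocol", "State", "Service"]

# Stand-in for parsed nmap lines; real code would fill these from the scan.
sample = [
    ("135", "tcp", "open", "msrpc"),
    ("139", "tcp", "open", "netbios-ssn"),
    ("443", "tcp", "open", "https"),
]

rows = []
for port, protocol, state, service in sample:
    # Collect each row in a plain Python list; no dataframe is copied here.
    rows.append(pd.Series([port, protocol, state, service], index=column_names))

# One DataFrame construction after the loop instead of one copy per row.
ports_df = pd.DataFrame(rows, columns=column_names)
print(ports_df)
```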

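Since a Series is meant to hold atomic values of the same type, avoid using it for a row that spans multiple columns. Instead, build a list of dictionaries and pass it to the DataFrame constructor outside the loop.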

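The code below separates out the command-line call, for organization and exception handling. In addition, target[0] can be a hashable value, used as a key (rather than as a list element) to identify each dataframe object in a list of dataframe dictionaries.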

import subprocess
import pandas as pd

def run_cmd(t):
    print("Executing: nmap -Pn "+t)
    result = subprocess.run(
        ['nmap','-Pn',t], 
        universal_newlines = True, 
        stdout = subprocess.PIPE
    ) 
        
    return result.stdout.split("\n")
         
def vulnScan(targets): 
    portInfo = []
    for target in targets: 
        rows = [] 
        output_lines = run_cmd(target[1])
        for line in output_lines:
            if "/" in line and "Starting" not in line: 
                tableInfo = line.split(" ")
                d = {
                    "port": tableInfo[0].split("/")[0],
                    "protocol": tableInfo[0].split("/")[1],
                    "status": tableInfo[1],
                    "service": tableInfo[3]
                }
                rows.append(d)

        portDF = {target[0]: pd.DataFrame(rows)}
        portInfo.append(portDF)

    return portInfo
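
The line parsing can be exercised without running nmap at all. The sketch below is one possible way to do it, not part of the answer above: parse_port_line is a hypothetical helper, and the sample lines are abbreviated, made-up stand-ins for nmap's port table. It uses str.split() with no argument, which splits on any run of whitespace, so the state and service fields do not depend on how many spaces nmap uses to pad its columns.

```python
def parse_port_line(line):
    """Parse one port-table line such as '135/tcp open  msrpc' into a dict.

    Returns None for lines that do not look like a port row.
    """
    fields = line.split()  # splits on any run of whitespace
    if not fields or "/" not in fields[0]:
        return None
    port, _, protocol = fields[0].partition("/")
    if not port.isdigit():
        return None
    return {
        "port": port,
        "protocol": protocol,
        # Missing columns become empty strings instead of raising IndexError.
        "status": fields[1] if len(fields) > 1 else "",
        "service": fields[2] if len(fields) > 2 else "",
    }

# Illustrative input only; real nmap output contains more lines than this.
sample_output = [
    "Starting Nmap ( https://nmap.org )",
    "PORT    STATE SERVICE",
    "135/tcp open  msrpc",
    "443/tcp open  https",
]
rows = [row for row in map(parse_port_line, sample_output) if row is not None]
print(rows)
```

The resulting list of dictionaries has the same shape as rows in vulnScan, so it could be passed to pd.DataFrame in the same way.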
