
Python - Parsing a text file into a CSV file

I have a text file containing the output of a command that I ran with Netmiko against a Cisco WLC to retrieve data on the things that are causing interference on our WiFi network. I stripped the original 600k lines of output down to just what I needed, a couple thousand lines like this:

AP Name.......................................... 010-HIGH-FL4-AP04
Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   
WiMax Mobile               11       0       -84         Fri Dec 15 17:09:45 2017   
WiMax Fixed                11       0       -68         Tue Dec 12 09:29:30 2017   
AP Name.......................................... 010-2nd-AP04
Microwave Oven             11       10      -61         Sat Dec 16 11:20:36 2017   
WiMax Fixed                11       0       -78         Mon Dec 11 12:33:10 2017   
AP Name.......................................... 139-FL1-AP03
Microwave Oven             6        18      -51         Fri Dec 15 12:26:56 2017   
AP Name.......................................... 010-HIGH-FL3-AP04
Microwave Oven             11       10      -55         Mon Dec 18 07:51:23 2017   
WiMax Mobile               11       0       -83         Wed Dec 13 16:16:26 2017   

The goal is to end up with a CSV file that strips out the 'AP Name ...' prefix and puts what is left (the AP name itself) on the same line as the information from each line below it. The problem is that some APs have two or more lines below the AP name and some have one or none. I have been at it for 8 hours and cannot find the best way to make this happen.

This is the latest version of the code I was trying to use. Any suggestions for making this work? I just want something I can load into Excel and create a report with:

with open(outfile_name, 'w') as out_file:
    with open('wlc-interference_raw.txt', 'r')as in_file:
        #Variables
        _ap_name = ''
        _temp = ''
        _flag = False
        for i in in_file:
            if 'AP Name' in i:
                #write whatever was put in the temp file to disk because new ap now
                #add another temp variable in case an ap has more than 1 interferer and check if new AP name
                out_file.write(_temp)
                out_file.write('\n')
                #print(_temp)
                _ap_name = i.lstrip('AP Name.......................................... ')
                _ap_name = _ap_name.rstrip('\n')
                _temp = _ap_name
                #print(_temp)
            elif '----' in i:
                pass
            elif 'Class Type' in i:
                pass
            else:
                line_split = i.split()
                for x in line_split:
                    _temp += ','
                    _temp += x
                _temp += '\n'

I think your best option is to read all the lines of the file and split them into sections, each starting with an 'AP Name' line. Then you can parse each section.

Example

s = """AP Name.......................................... 010-HIGH-FL4-AP04
Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   
WiMax Mobile               11       0       -84         Fri Dec 15 17:09:45 2017   
WiMax Fixed                11       0       -68         Tue Dec 12 09:29:30 2017   
AP Name.......................................... 010-2nd-AP04
Microwave Oven             11       10      -61         Sat Dec 16 11:20:36 2017   
WiMax Fixed                11       0       -78         Mon Dec 11 12:33:10 2017   
AP Name.......................................... 139-FL1-AP03
Microwave Oven             6        18      -51         Fri Dec 15 12:26:56 2017   
AP Name.......................................... 010-HIGH-FL3-AP04
Microwave Oven             11       10      -55         Mon Dec 18 07:51:23 2017   
WiMax Mobile               11       0       -83         Wed Dec 13 16:16:26 2017"""

import re

class AP:
    """ 
    A class holding each section of the parsed file
    """
    def __init__(self):
        self.header = ""
        self.content = []

sections = []
section = None
for line in s.split('\n'):  # Or 'for line in file:'
    # Starting new section
    if line.startswith('AP Name'):
        # If previously had a section, add to list
        if section is not None:
            sections.append(section)  
        section = AP()
        section.header = line
    else:
        if section is not None:
            section.content.append(line)
if section is not None:  # Guard against input with no 'AP Name' line at all
    sections.append(section)  # Add last section outside of loop


for section in sections:
    # Remove the 'AP Name.....' prefix with a regex; lstrip() treats its argument
    # as a set of characters and could also eat the start of the AP name itself
    ap_name = re.sub(r'^AP Name\.*\s*', '', section.header).strip()
    for line in section.content:
        if not line.strip():
            continue  # Skip blank lines
        # You can extract the date separately, if needed
        # Splitting on more than one space using a regex
        fields = re.split(r'\s\s+', line.strip())
        print(",".join([ap_name] + fields))

Output

010-HIGH-FL4-AP04,Microwave Oven,11,10,-59,Mon Dec 18 08:21:23 2017
010-HIGH-FL4-AP04,WiMax Mobile,11,0,-84,Fri Dec 15 17:09:45 2017
010-HIGH-FL4-AP04,WiMax Fixed,11,0,-68,Tue Dec 12 09:29:30 2017
010-2nd-AP04,Microwave Oven,11,10,-61,Sat Dec 16 11:20:36 2017
010-2nd-AP04,WiMax Fixed,11,0,-78,Mon Dec 11 12:33:10 2017
139-FL1-AP03,Microwave Oven,6,18,-51,Fri Dec 15 12:26:56 2017
010-HIGH-FL3-AP04,Microwave Oven,11,10,-55,Mon Dec 18 07:51:23 2017
010-HIGH-FL3-AP04,WiMax Mobile,11,0,-83,Wed Dec 13 16:16:26 2017
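The code comment above notes that the date can be extracted separately. A minimal sketch of that step, assuming the timestamp is always the last comma-separated field and always has the form shown in the sample data; `parse_timestamp` is a hypothetical helper name, not part of the original answer:

```python
from datetime import datetime


def parse_timestamp(text):
    """Convert a string such as 'Mon Dec 18 08:21:23 2017' to a datetime."""
    # Assumption: %a and %b match the English day/month abbreviations,
    # which holds in the default C locale and in English locales.
    return datetime.strptime(text.strip(), "%a %b %d %H:%M:%S %Y")


# One row taken from the output above
row = "010-HIGH-FL4-AP04,Microwave Oven,11,10,-59,Mon Dec 18 08:21:23 2017"
fields = row.split(",")
# Rewrite the last field as 'YYYY-MM-DD HH:MM:SS'
fields[-1] = parse_timestamp(fields[-1]).isoformat(sep=" ")
print(",".join(fields))
# prints: 010-HIGH-FL4-AP04,Microwave Oven,11,10,-59,2017-12-18 08:21:23
```

A spreadsheet is more likely to recognise the rewritten field as a date/time value than the original text form, though that depends on the application's regional settings.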

Tip:

You don't need Python to write the CSV file; you can redirect the script's output to a file from the command line:

python script.py > output.csv
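If you would rather write the file from Python itself, the standard `csv` module handles the quoting and line endings. The following is only a sketch under a few assumptions: `parse_rows` and `write_csv` are hypothetical names, and the `----` and `Class Type` lines mentioned in the question are assumed to be headers that can be skipped.

```python
import csv
import re


def parse_rows(lines):
    """Yield one list per interferer line: [ap_name, field, field, ...]."""
    ap_name = None
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and the table header lines carry no data
        if not line or line.startswith('----') or line.startswith('Class Type'):
            continue
        if line.startswith('AP Name'):
            # Drop the 'AP Name.....' prefix and remember the current AP
            ap_name = re.sub(r'^AP Name\.*\s*', '', line)
            continue
        if ap_name is not None:
            # Columns are separated by two or more whitespace characters
            yield [ap_name] + re.split(r'\s{2,}', line)


def write_csv(lines, out_file):
    """Write every parsed row to an already opened text file."""
    csv.writer(out_file).writerows(parse_rows(lines))
```

To use it, open `wlc-interference_raw.txt` for reading and the output file with `open('output.csv', 'w', newline='')`, then call `write_csv(in_file, out_file)`. An AP with no interferer lines produces no rows, so it drops out of the report.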

