
Python: Parse Text File

My file contains a 2 by 100 matrix, one row per line with two comma-separated values. A sample of the file and my current code are shown below.

I would like to create one list per column: the first element of each row goes into a list of currents, and the second element goes into a list of temperatures.

Is there a better way to make the code look fancier?

-12,30
-34,50
-33,89
-900,9
-2,37
-7,17
-8,28
-12,30
-34,50
-33,89

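def parse_log(fname):
    with open(fname, "r") as f:
        samples = f.readlines()
    # Skip lines that start with '*'. list() is needed on Python 3,
    # where filter() and map() return lazy iterators.
    samples = list(filter(lambda x: not x.startswith('*'), samples))
    print(samples)
    # First column: current, with the sign flipped.
    current = list(map(lambda x: -1 * int(x.split(',')[0]), samples))
    print(current)
    # Second column: temperature.
    temperature = list(map(lambda x: int(x.split(',')[1]), samples))
    print(temperature)
    return (current, temperature)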

To avoid calling split twice for every line, I'd suggest the following solution:

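def parse_log(fname):
    with open(fname) as f:
        # Drop lines that start with '*' and strip surrounding whitespace.
        samples = [line.strip() for line in f
                   if not line.startswith('*')]
    # Split each line only once. list() is needed on Python 3, where
    # map() returns a lazy iterator that cannot be indexed.
    ints = [list(map(int, line.split(","))) for line in samples]
    currents = [-x[0] for x in ints]
    temperatures = [x[1] for x in ints]
    return currents, temperatures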

This is a simple version that would IMO be reasonable for log files up to a few megabytes (it doesn't try to minimize memory usage or computing time during parsing):

def parse_log(fname):
    # Each row becomes an [int, int] pair; lines starting with "*" are skipped.
    with open(fname) as f:
        data = [[int(v) for v in x.split(",")] for x in f if x[:1] != "*"]
    return ([-current for current, temp in data],
            [temp for current, temp in data])
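
The version above builds an intermediate list of rows before splitting it into columns. As a minimal sketch of a variant that fills both columns in a single pass, assuming the same file layout (two comma-separated integers per line, lines starting with `*` skipped), something like the following could work. The name `parse_log_single_pass` is hypothetical, and skipping blank lines is an added assumption that the original answers do not make.

```python
def parse_log_single_pass(fname):
    """Return (currents, temperatures) read from fname in one pass."""
    currents, temperatures = [], []
    with open(fname) as f:
        for line in f:
            # Skip lines starting with '*'. Skipping blank lines is an
            # assumption added here, not part of the original answers.
            if line.startswith("*") or not line.strip():
                continue
            # Split once and unpack the two fields.
            current, temp = line.split(",")
            currents.append(-int(current))   # flip the sign, as in the question
            temperatures.append(int(temp))   # int() ignores the trailing newline
    return currents, temperatures
```

Only the two result lists are kept in memory, at the cost of a plain loop instead of comprehensions.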

Using generator expressions:

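def parse_log(fname):
    with open(fname, "r") as file:
        # zip(*pairs) transposes (current, temp) rows into two column tuples.
        # list() forces evaluation while the file is still open (Python 3).
        return list(zip(*(
            (int(current) * -1, int(temp)) for current, temp in
                (line.strip().split(',')
                    for line in file if not line.startswith('*'))
        )))

print(parse_log(filename))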

[(12, 34, 33, 900, 2, 7, 8, 12, 34, 33), (30, 50, 89, 9, 37, 17, 28, 30, 50, 89)]

Warning: this isn't necessarily better, as it's probably harder to read and to understand what's going on. Be sure to document it properly with a docstring.
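
Following that advice, here is a sketch of how the generator-expression version might be documented. The docstring wording and the name `parse_log_documented` are illustrative assumptions, not part of the original answer; the result is wrapped in `list()` so that the file is fully read before it closes on Python 3.

```python
def parse_log_documented(fname):
    """Parse a log of comma-separated "current,temperature" integer rows.

    Lines starting with '*' are skipped. The sign of each current value
    is flipped, matching the parse_log function in the question.

    Returns a list of two tuples: [currents, temperatures].
    """
    with open(fname, "r") as file:
        # Lazily split each kept line into its two string fields.
        rows = (line.strip().split(",")
                for line in file if not line.startswith("*"))
        # Lazily convert each row to a (negated current, temperature) pair.
        pairs = ((-int(current), int(temp)) for current, temp in rows)
        # zip(*pairs) transposes rows into columns; list() consumes the
        # generators while the file is still open.
        return list(zip(*pairs))
```

The docstring is then available through `help(parse_log_documented)`, and naming the intermediate generators makes each step easier to follow.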
