
Python 3.7 not parsing text file properly

I am trying to write what should be a basic Python script that does the following:

  1. Read in a log file from a hardcoded path (example file below)
  2. Create an array for each line of the file, with two or three elements
  3. Print out that array.

Here is an example log file from the SciMark benchmark test:

**                                                              **
** SciMark2 Numeric Benchmark, see http://math.nist.gov/scimark **
** for details. (Results can be submitted to pozo@nist.gov)     **
**                                                              **
Using       2.00 seconds min time per kenel.
Composite Score:          55.11
FFT             Mflops:   35.99    (N=1024)
SOR             Mflops:   60.25    (100 x 100)
MonteCarlo:     Mflops:    3.21
Sparse matmult  Mflops:   16.10    (N=1000, nz=5000)
LU              Mflops:   15.02    (M=100, N=100)

Ideally, I would create an array like this:

array = [
['Composite Score', 55.11, ''],
['FFT Mflops', 35.99, '(N=1024)'],
['SOR Mflops', 60.25, '(100 x 100)'],
['MonteCarlo Mflops', 3.21, ''],
['Sparse matmult Mflops', 16.10, '(N=1000, nz=5000)'],
['LU Mflops', 15.02, '(M=100, N=100)']]

I have tried to do this with the following Python code:

import csv

with open ('/SciMarkResults.txt') as file:
    lines = file.readlines()

print(len(lines))
new_lines = lines[5:]

def get_data(readfile):
    types = (line.split('\n') for line in readfile)
    return types

a = get_data(new_lines)

print(a)

This produces the following output:

11
<generator object get_data.<locals>.<genexpr> at 0x7ff45b5c5ba0>

I have considered using regular expressions, but that does not seem to be the preferred solution.

I have not been able to work out why the lines are not being split properly. Simply printing new_lines yields:

['Composite Score:          460.11\n', 'FFT             Mflops:   315.99    (N=1024)\n', 'SOR             Mflops:   860.25    (100 x 100)\n', 'MonteCarlo:     Mflops:    93.21\n', 'Sparse matmult  Mflops:   416.10    (N=1000, nz=5000)\n', 'LU              Mflops:   615.02    (M=100, N=100)\n']

Any advice would be appreciated.

Instead of using

types = (line.split('\n') for line in readfile)

which is a generator expression, you could use

types = [line.split('\n') for line in readfile]

which is a list comprehension. A generator expression is lazy, so print(a) shows only the generator object; a list comprehension builds the list immediately, so print(a) shows its contents.
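To make the difference concrete, here is a small sketch using two of the lines from the new_lines output in the question. The names lazy and eager are only illustrative. Note that split('\n') on a line ending in a newline leaves a trailing empty string in each result.

```python
# Two sample lines copied from the new_lines output in the question.
new_lines = [
    'Composite Score:          460.11\n',
    'FFT             Mflops:   315.99    (N=1024)\n',
]

# Generator expression: nothing is computed yet, so print() shows
# only the generator object itself, not the split lines.
lazy = (line.split('\n') for line in new_lines)
print(lazy)

# List comprehension: every line is split immediately, so print()
# shows the resulting list of lists.
eager = [line.split('\n') for line in new_lines]
print(eager)

# A generator can also be turned into a list after the fact.
print(list(lazy) == eager)  # True
```

Wrapping the original generator in list(...) would therefore work just as well as switching to square brackets.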

The same has been answered above by @jdehesa.
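Splitting on '\n' only removes the newline; it does not separate the label, the number, and the note. A minimal sketch of one way to build the two- or three-element rows described in the question without regular expressions is shown below. The function name parse_line and the variable rows are hypothetical, and the sketch assumes every result line has the form "label: number (optional note)" with no colon after the number.

```python
def parse_line(line):
    """Split one result line into [label, value, note] (illustrative helper)."""
    # Everything before the last colon is the label; the remainder holds
    # the number and, optionally, a parenthesized note.
    label, _, rest = line.strip().rpartition(':')
    # Collapse runs of spaces and drop the extra colon in 'MonteCarlo:'.
    label = ' '.join(label.replace(':', '').split())
    # The first whitespace-delimited field is the number; the rest is the note.
    value, _, note = rest.strip().partition(' ')
    return [label, float(value), note.strip()]


# Result lines as they appear in the sample log file in the question.
result_lines = [
    'Composite Score:          55.11\n',
    'FFT             Mflops:   35.99    (N=1024)\n',
    'SOR             Mflops:   60.25    (100 x 100)\n',
    'MonteCarlo:     Mflops:    3.21\n',
    'Sparse matmult  Mflops:   16.10    (N=1000, nz=5000)\n',
    'LU              Mflops:   15.02    (M=100, N=100)\n',
]

rows = [parse_line(line) for line in result_lines]
for row in rows:
    print(row)
```

In the original script the same list comprehension could be applied to new_lines, since lines[5:] already skips the banner and the "Using ... seconds" line, which this helper is not written to handle.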
