Reading formatted multi-lines in Python

I would like to read some formatted data in Python. The format of the data is similar to this:

00:00:00
1 1 1
1 1 1
1 1 1

00:00:02
3 3 3
3 3 3
3 3 3

I could successfully read it in C/C++ using the following code:

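#include <cstdio>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string hour;
    int x0,y0,z0, x1,y1,z1, x2,y2,z2;

    // Each record is a timestamp line followed by three rows of three integers.
    while(cin >> hour)
    {
        scanf("%d %d %d\n%d %d %d\n%d %d %d\n", &x0, &y0, &z0, &x1, &y1, &z1, &x2, &y2, &z2);
        cout << hour << endl; //check the reading
    }
    return 0;
}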

The problem is that I cannot find a Python method that reads a formatted multi-line string as simply as scanf does. Some examples using np.genfromtxt came close to what I needed, as did some using struct.unpack, but my skills weren't enough to make them work with multiple lines. I could probably use split() with readline() to extract the formatted data, but it is driving me nuts that a program in C/C++ would be simpler than one in Python. Is there any way to do something similar to the C/C++ code in Python?


Here is the answer I arrived at with Joril's help:

from scanf import sscanf
import sys

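FORMAT = "%s\n%d %d %d\n%d %d %d\n%d %d %d\n"

data = ''
for line in sys.stdin:
    if line != '\n':
        # Accumulate the lines of the current record.
        data += line
    else:
        # A blank line ends the record: parse it and start over.
        print(sscanf(data, FORMAT))
        data = ''
if data:
    # Parse the last record when the input does not end with a blank line.
    print(sscanf(data, FORMAT))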

And as output I got something like:

('00:00:00', 1, 1, 1, 1, 1, 1, 1, 1, 1)
('00:00:02', 3, 3, 3, 3, 3, 3, 3, 3, 3)

You can definitely use regular expressions. Here is more or less equivalent code in Python, without a loop:

import re
import sys

# The pattern spans four lines, so read all of stdin rather than a single line.
text = sys.stdin.read()
res = re.match(
    r'(?P<hour>\d\d):(?P<minute>\d\d):(?P<second>\d\d)\n'
    r'(?P<x0>\d+) (?P<y0>\d+) (?P<z0>\d+)\n'
    r'(?P<x1>\d+) (?P<y1>\d+) (?P<z1>\d+)\n'
    r'(?P<x2>\d+) (?P<y2>\d+) (?P<z2>\d+)',
    text)

if res:
    print(res.groupdict())

I would split the data into lines first and then parse it, though.
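
A minimal sketch of that split-then-parse approach, assuming records are separated by blank lines and each one is a timestamp line followed by rows of whitespace-separated integers. The function name `parse_records` and the `sample` string are hypothetical, chosen only for illustration.

```python
def parse_records(text):
    """Parse blank-line-separated records into (timestamp, rows) tuples."""
    records = []
    # Split on blank lines so that each block holds exactly one record.
    for block in text.strip().split("\n\n"):
        if not block.strip():
            continue  # skip empty input
        lines = block.splitlines()
        hour = lines[0].strip()
        # Every remaining line holds whitespace-separated integers.
        rows = [[int(tok) for tok in line.split()] for line in lines[1:]]
        records.append((hour, rows))
    return records


# Hypothetical sample mirroring the data format from the question.
sample = "00:00:00\n1 1 1\n1 1 1\n1 1 1\n\n00:00:02\n3 3 3\n3 3 3\n3 3 3\n"
for hour, rows in parse_records(sample):
    print(hour, rows)
```

This keeps the timestamp as a string and the numbers as nested lists; flattening the rows would give a tuple similar to the sscanf output shown earlier.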

Well, the Python FAQ says:

Is there a scanf() or sscanf() equivalent?

Not as such.

For simple input parsing, the easiest approach is usually to split the line into whitespace-delimited words using the split() method of string objects and then convert decimal strings to numeric values using int() or float(). split() supports an optional “sep” parameter which is useful if the line uses something other than whitespace as a separator.

For more complicated input parsing, regular expressions are more powerful than C's sscanf() and better suited for the task.

But it looks like someone made a module that does exactly what you want:
https://hkn.eecs.berkeley.edu/~dyoo/python/scanf
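
If you would rather stay within the standard library, the FAQ's regular-expression suggestion can be sketched roughly as below. It assumes every record has the exact shape shown in the question (one hh:mm:ss line, then three rows of three non-negative integers). The names `RECORD` and `scan_records` are hypothetical, not part of any library.

```python
import re

# One record: a timestamp, then three rows of three integers.
# \s+ between rows tolerates any line-ending style.
RECORD = re.compile(
    r"(\d\d:\d\d:\d\d)\s+"
    r"(\d+) (\d+) (\d+)\s+"
    r"(\d+) (\d+) (\d+)\s+"
    r"(\d+) (\d+) (\d+)"
)


def scan_records(text):
    """Return one (timestamp, n1, ..., n9) tuple per record found in text."""
    results = []
    for match in RECORD.finditer(text):
        hour, *numbers = match.groups()
        # Convert the nine captured digit strings to integers.
        results.append((hour, *map(int, numbers)))
    return results


# Hypothetical sample mirroring the data format from the question.
sample = "00:00:00\n1 1 1\n1 1 1\n1 1 1\n\n00:00:02\n3 3 3\n3 3 3\n3 3 3\n"
for record in scan_records(sample):
    print(record)
```

Using finditer() over the whole input handles every record without tracking blank lines by hand, and the resulting tuples have the same shape as the sscanf output above.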
