Reading formatted multi-lines in Python

Question

I would like to read some formatted data in Python. The format of the data is something similar to this:

I could successfully simulate the reading in C/C++ using the following code:

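#include <cstdio>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string hour;
    int x0,y0,z0, x1,y1,z1, x2,y2,z2;

    // Each record is a time stamp followed by three lines of three integers.
    while(cin >> hour)
    {
        scanf("%d %d %d\n%d %d %d\n%d %d %d\n", &x0, &y0, &z0, &x1, &y1, &z1, &x2, &y2, &z2);
        cout << hour << endl; //check the reading
    }
    return 0;
}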

The problem is that I cannot find a Python method that reads a formatted multi-line string as simply as scanf can. Some examples using np.genfromtxt came close to what I needed, as did some using struct.unpack, but my skills weren't enough to make them work with multiple lines. I could probably use split() with some readline calls to get exactly the formatted data, but it drives me nuts that a program in C/C++ would be simpler than one in Python. Is there any way to do something similar to the C/C++ code in Python?

Here is the answer after Joril's help:

from scanf import sscanf
import sys

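data = ''
for line in sys.stdin:
    if line != '\n':
        data += line  # accumulate the lines of the current record
    else:
        # A blank line ends the record: parse the accumulated block at once.
        print(sscanf(data, "%s\n%d %d %d\n%d %d %d\n%d %d %d\n"))
        data = ''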

As output I got something like:

('00:00:00', 1, 1, 1, 1, 1, 1, 1, 1, 1)
('00:00:02', 3, 3, 3, 3, 3, 3, 3, 3, 3)

Answer 1

You can definitely use regular expressions. Here is more or less matching code in Python, without the loop:

import re
import sys

# Read the whole record; input() would return only its first line.
data = sys.stdin.read()
res = re.match(
    r'(?P<hour>\d\d):(?P<minute>\d\d):(?P<second>\d\d)\n'
    r'(?P<x0>\d+) (?P<y0>\d+) (?P<z0>\d+)\n'
    r'(?P<x1>\d+) (?P<y1>\d+) (?P<z1>\d+)\n'
    r'(?P<x2>\d+) (?P<y2>\d+) (?P<z2>\d+)',
    data, re.MULTILINE)

if res:
    print(res.groupdict())
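
If the input holds several records, the same kind of pattern can probably be applied repeatedly with re.finditer. This is only a sketch: the sample text, the RECORD name, and the blank line between records are assumptions inferred from the output shown in the question, not something the question states outright.

```python
import re

# One record: a time stamp line followed by three lines of three integers.
RECORD = re.compile(
    r'(?P<hour>\d\d:\d\d:\d\d)\n'
    r'(?P<x0>\d+) (?P<y0>\d+) (?P<z0>\d+)\n'
    r'(?P<x1>\d+) (?P<y1>\d+) (?P<z1>\d+)\n'
    r'(?P<x2>\d+) (?P<y2>\d+) (?P<z2>\d+)')

# Made-up sample input with two blank-line-separated records.
sample = "00:00:00\n1 1 1\n1 1 1\n1 1 1\n\n00:00:02\n3 3 3\n3 3 3\n3 3 3\n"

records = []
for m in RECORD.finditer(sample):
    fields = m.groupdict()
    hour = fields.pop('hour')
    # Convert the nine remaining captured strings to integers.
    records.append((hour, {name: int(value) for name, value in fields.items()}))

for hour, values in records:
    print(hour, values)
```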

I would split the data into lines first and then parse, though.
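
A minimal sketch of that line-based approach, assuming records are separated by a blank line and each record is a time stamp followed by three lines of three integers (the layout is inferred from the sscanf format string in the question; parse_records and the sample text are hypothetical names made up for illustration):

```python
def parse_records(text):
    """Yield one (hour, x0, y0, z0, ..., z2) tuple per blank-line-separated record."""
    for block in text.strip().split("\n\n"):
        lines = block.splitlines()
        hour = lines[0].strip()
        # Flatten the three numeric lines into nine integers.
        numbers = [int(tok) for line in lines[1:4] for tok in line.split()]
        yield (hour,) + tuple(numbers)


# Made-up sample input with two records.
sample = """00:00:00
1 1 1
1 1 1
1 1 1

00:00:02
3 3 3
3 3 3
3 3 3
"""

for record in parse_records(sample):
    print(record)
```

Because it is a generator, the same function could be fed sys.stdin.read() instead of the sample string.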

Answer 2

Well, the Python FAQ says:

Is there a scanf() or sscanf() equivalent?

Not as such.

For simple input parsing, the easiest approach is usually to split the line into whitespace-delimited words using the split() method of string objects and then convert decimal strings to numeric values using int() or float(). split() supports an optional “sep” parameter which is useful if the line uses something other than whitespace as a separator.

For more complicated input parsing, regular expressions are more powerful than C's sscanf() and better suited for the task.

But it looks like someone made a module that does exactly what you want:
https://hkn.eecs.berkeley.edu/~dyoo/python/scanf
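
For comparison, the split() plus int() approach that the FAQ describes might look roughly like this on single lines. The two input strings are invented for illustration; the second one shows the optional sep argument with a colon-separated time stamp.

```python
line = "10 20 30"
# With no argument, split() breaks the line on any run of whitespace.
x, y, z = (int(tok) for tok in line.split())

stamp = "00:00:02"
# Passing ":" as sep handles fields that are not whitespace-delimited.
hours, minutes, seconds = (int(part) for part in stamp.split(":"))

print(x, y, z)
print(hours, minutes, seconds)
```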

2 answers

Answer 1
2 2015-01-10 23:29:47

Answer 2
1 Accepted 2015-01-10 23:11:59
