Python: parsing a colon-separated formatted string

Question

I need to write a Python script (I'm new to Python and would like to use this as practice) to parse a message in the following format:

T:L:x1:x2:x3:...T1:L1:y1:y2:y3...Tn:Ln:z1:z2:z3:...

where T holds a type, L is the length, and x1..xn is the actual data of that type (and likewise for T1 through Tn). Each field is separated by a : symbol.

For example:

1:4:a:5:6:7:2:10:72:75:63:6f:6e:74:72:6f:6c:6c:65:72:2e:6f:72:67

(Type1=1, Length1=4, Type2=2, Length2=16)

The parsed messages should be stored in a dictionary (I think this is the most appropriate data structure, but I'd be glad to hear other suggestions).

So I am probably going to split the text, extract the type and length, then walk further, extract L bytes, and store them in a dict with T as the key.

So I will run a loop. How do I determine the end of the string so that I can break out of the loop?
The actual data (x1-x3..., for example) has to be stored in the dictionary with the : removed. I'm not sure how to do that.

I'd appreciate learning about a more efficient approach to parsing the string. Thanks!

Answer 1

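Something like this should work. The length is parsed as hexadecimal, because the example gives Length2=16 for the length field 10:

ss = "1:4:a:5:6:7:2:10:72:75:63:6f:6e:74:72:6f:6c:6c:65:72:2e:6f:72:67".split(":")

d = {}
idx = 0
while idx < len(ss):             # stop once every field has been consumed
    key = ss[idx]                # type field
    idx += 1
    length = int(ss[idx], 16)    # length field, read as hexadecimal
    idx += 1
    arr = ss[idx:idx+length]     # the next `length` data fields
    d[key] = arr
    idx += length

Output d:

{'1': ['a', '5', '6', '7'],
 '2': ['72', '75', '63', '6f', '6e', '74', '72', '6f', '6c', '6c', '65', '72', '2e', '6f', '72', '67']}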
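The question also asks how to detect the end of the input and how to store the data with the : removed. A minimal sketch of one way to do both, assuming the length field is hexadecimal (the example gives Length2=16 for the field 10) and that each type appears only once. The function name parse_tlv and its error messages are illustrative, not part of the original answer:

```python
def parse_tlv(message, sep=":"):
    """Parse 'T:L:x1:...:xL' records into {type: joined_data}.

    Assumptions: the length field is hexadecimal, and each type
    appears at most once (a repeated type would overwrite the earlier one).
    """
    fields = message.split(sep)
    records = {}
    idx = 0
    # The loop ends on its own once idx has moved past the last field.
    while idx < len(fields):
        if idx + 1 >= len(fields):
            raise ValueError(f"type {fields[idx]!r} has no length field")
        key = fields[idx]
        length = int(fields[idx + 1], 16)
        start = idx + 2
        values = fields[start:start + length]
        if len(values) != length:
            raise ValueError(
                f"type {key!r}: expected {length} values, got {len(values)}"
            )
        records[key] = "".join(values)  # data stored without the separators
        idx = start + length
    return records


message = "1:4:a:5:6:7:2:10:72:75:63:6f:6e:74:72:6f:6c:6c:65:72:2e:6f:72:67"
print(parse_tlv(message))
```

Slicing past the end of a list does not raise in Python, so the explicit length check is what catches a truncated message.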

Answer 2

Create an iterator over your string:

message = '1:4:a:5:6:7:2:10:72:75:63:6f:6e:74:72:6f:6c:6c:65:72:2e:6f:72:67'

code = iter(message.split(':'))
data = {}

for t in code:
    # The length field is hexadecimal: '10' means 16 data fields.
    length = int(next(code), 16)
    # Pull exactly `length` fields from the same iterator.
    data[t] = [next(code) for _ in range(length)]

Output:

>>> data
{'1': ['a', '5', '6', '7'],
 '2': ['72', '75', '63', '6f', '6e', '74', '72', '6f', '6c', '6c', '65', '72', '2e', '6f', '72', '67']}
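On the question of other data structures: a dict keeps only the last record for any type that occurs twice. A rough sketch of an alternative that yields (type, payload) pairs in order, using itertools.islice on the same kind of iterator. It assumes that the length and every data field are hexadecimal byte values, which the question does not state outright; the name iter_records is made up for this example:

```python
from itertools import islice


def iter_records(message, sep=":"):
    """Yield (type, payload) pairs, with payload as a bytes object.

    Assumes the length and every data field are hexadecimal byte values.
    """
    fields = iter(message.split(sep))
    for type_field in fields:
        # next() with a default avoids a StopIteration inside the generator.
        length_field = next(fields, None)
        if length_field is None:
            raise ValueError(f"type {type_field!r} has no length field")
        length = int(length_field, 16)
        chunk = list(islice(fields, length))  # take up to `length` fields
        if len(chunk) != length:
            raise ValueError(f"type {type_field!r}: truncated data")
        yield type_field, bytes(int(value, 16) for value in chunk)


message = "1:4:a:5:6:7:2:10:72:75:63:6f:6e:74:72:6f:6c:6c:65:72:2e:6f:72:67"
records = list(iter_records(message))
print(records)
print(records[1][1].decode("ascii"))
```

If the types are known to be unique, dict(iter_records(message)) gives back a dictionary keyed by type.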


Answer 1
1 2021-10-23 22:09:31

Answer 2
0 2021-10-23 22:11:36
