How do I split records in Python?
I'm trying to split records in Python using the split function, but I can't get the output I want.
Here are the contents of my .txt file:
10000 {(10000,200,300,A),(10000,200,300,B)},{(10000,200,300,C),(10000,200,300,D)}
10001 {(10001,200,300,E),(10001,200,300,F)},{(10001,200,300,G),(10001,200,300,H)}
Here is the desired output:
10000 10000,200,300,A
10000 10000,200,300,B
10000 10000,200,300,C
10000 10000,200,300,D
10001 10001,200,300,E
10001 10001,200,300,F
10001 10001,200,300,G
10001 10001,200,300,H
Any help would be appreciated, thanks.
Here is a simple way to get the desired result; it only needs the sub and findall functions from the re module:
from re import sub, findall

string = """
10000 {(10000,200,300,A),(10000,200,300,B)},{(10000,200,300,C),(10000,200,300,D)}
10001 {(10001,200,300,E),(10001,200,300,F)},{(10001,200,300,G),(10001,200,300,H)}
"""

# our results go here
results = []

# loop through each line in the string
for line in string.split("\n"):
    # get rid of leading and trailing whitespace
    line = line.strip()
    # ignore empty lines
    if len(line) > 0:
        # get the line's id (named record_id so it does not shadow the built-in id)
        record_id = line.split("{")[0].strip()
        # get all values wrapped in parentheses on this line only
        # (searching the whole string here would repeat every record under every id)
        for match in findall(r"\(.*?\)", line):
            # strip the parentheses and add the string to the results list
            results.append("{} {}".format(record_id, sub(r"[()]", "", match)))

# display the results
print(results)
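The pattern used in the loop depends on lazy matching, so here is a minimal sketch of what findall returns for a single input line. It uses a capture group instead of sub to drop the parentheses, which is a variation on the code above rather than the same code; the variable names lazy and greedy are only for illustration.

```python
import re

line = "10000 {(10000,200,300,A),(10000,200,300,B)},{(10000,200,300,C),(10000,200,300,D)}"

# Lazy quantifier: each match stops at the first closing parenthesis,
# and the capture group returns only the text inside the parentheses.
lazy = re.findall(r"\((.*?)\)", line)

# Greedy quantifier: a single match that runs from the first "(" to the last ")".
greedy = re.findall(r"\((.*)\)", line)

print(lazy)
print(len(greedy))
```

With the greedy form the four groups collapse into one match, which is why the `?` after `.*` matters here.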
Here is the same record-splitting logic wrapped in a function:
from re import sub, findall

def get_records(string):
    # our results go here
    results = []
    # loop through each line in the string
    for line in string.split("\n"):
        # get rid of leading and trailing whitespace
        line = line.strip()
        # ignore empty lines
        if len(line) > 0:
            # get the line's id (named record_id so it does not shadow the built-in id)
            record_id = line.split("{")[0].strip()
            # get all values wrapped in parentheses on this line only
            for match in findall(r"\(.*?\)", line):
                # strip the parentheses and add the string to the results list
                results.append("{} {}".format(record_id, sub(r"[()]", "", match)))
    # return the results list
    return results
You would then use the function like this:
# print the results
print(get_records("""
10000 {(10000,200,300,A),(10000,200,300,B)},{(10000,200,300,C),(10000,200,300,D)}
10001 {(10001,200,300,E),(10001,200,300,F)},{(10001,200,300,G),(10001,200,300,H)}
"""))
Good luck.