简体   繁体   中英

Python Regex: find all lines that start with '{' and end with '}'

I am receiving data over a socket, a bunch of JSON strings. However, I receive a set amount of bytes, so sometimes the last of my JSON strings is cut-off. I will typically get the following:

{"pitch":-30.778193,"yaw":-124.63285,"roll":-8.977466}
{"pitch":-30.856342,"yaw":-124.57556,"roll":-7.7220345}
{"pitch":-31.574106,"yaw":-124.65623,"roll":-7.911794}
{"pitch":-30.479567,"yaw":-124.24301,"roll":-8.730827}
{"pitch":-29.30239,"yaw":-123.97949,"roll":-8.134723}
{"pitch":-29.84712,"yaw":-124.584465,"roll":-8.588374}
{"pitch":-31.072054,"yaw":-124.707466,"roll":-8.877062}
{"pitch":-31.493435,"yaw":-124.75457,"roll":-9.019922}
{"pitch":-29.591925,"yaw":-124.960815,"roll":-9.379437}
{"pitch":-29.37105,"yaw":-125.14427,"roll":-9.642341}
{"pitch":-29.483717,"yaw":-125.16528,"roll":-9.687177}
{"pitch":-30.903332,"yaw":-124.603935,"roll":-9.423098}
{"pitch":-30.211857,"yaw":-124.471664,"roll":-9.116135}
{"pitch":-30.837414,"yaw":-125.18984,"roll":-9.824204}
{"pitch":-30.526165,"yaw":-124.85788,"roll":-9.158611}
{"pitch":-30.333513,"yaw":-123.68705,"roll":-7.9481263}
{"pitch":-30.903502,"yaw":-123.78847,"roll":-8.209373}
{"pitch":-31.194769,"yaw":-124.79708,"roll":-8.709783}
{"pitch":-30.816765,"yaw":-125

With Python, I would like to create a string array of the first 18 complete { data... } strings.

Here is what I have tried: cleanData = re.search('{.*}', data) but it seems like this is only giving me the very first { data... } entry. How can I get the full string array of complete { } sets?

To get all, you can use re.finditer or re.findall .

>>> re.findall(r'{.*}', s)
['{"pitch":-30.778193,"yaw":-124.63285,"roll":-8.977466}', '{"pitch":-30.856342,"yaw":-124.57556,"roll":-7.7220345}', '{"pitch":-31.574106,"yaw":-124.65623,"roll":-7.911794}', '{"pitch":-30.479567,"yaw":-124.24301,"roll":-8.730827}', '{"pitch":-29.30239,"yaw":-123.97949,"roll":-8.134723}', '{"pitch":-29.84712,"yaw":-124.584465,"roll":-8.588374}', '{"pitch":-31.072054,"yaw":-124.707466,"roll":-8.877062}', '{"pitch":-31.493435,"yaw":-124.75457,"roll":-9.019922}', '{"pitch":-29.591925,"yaw":-124.960815,"roll":-9.379437}', '{"pitch":-29.37105,"yaw":-125.14427,"roll":-9.642341}', '{"pitch":-29.483717,"yaw":-125.16528,"roll":-9.687177}', '{"pitch":-30.903332,"yaw":-124.603935,"roll":-9.423098}', '{"pitch":-30.211857,"yaw":-124.471664,"roll":-9.116135}', '{"pitch":-30.837414,"yaw":-125.18984,"roll":-9.824204}', '{"pitch":-30.526165,"yaw":-124.85788,"roll":-9.158611}', '{"pitch":-30.333513,"yaw":-123.68705,"roll":-7.9481263}', '{"pitch":-30.903502,"yaw":-123.78847,"roll":-8.209373}', '{"pitch":-31.194769,"yaw":-124.79708,"roll":-8.709783}']
>>> 

OR

>>> [x.group() for x in re.finditer(r'{.*}', s)]
['{"pitch":-30.778193,"yaw":-124.63285,"roll":-8.977466}', '{"pitch":-30.856342,"yaw":-124.57556,"roll":-7.7220345}', '{"pitch":-31.574106,"yaw":-124.65623,"roll":-7.911794}', '{"pitch":-30.479567,"yaw":-124.24301,"roll":-8.730827}', '{"pitch":-29.30239,"yaw":-123.97949,"roll":-8.134723}', '{"pitch":-29.84712,"yaw":-124.584465,"roll":-8.588374}', '{"pitch":-31.072054,"yaw":-124.707466,"roll":-8.877062}', '{"pitch":-31.493435,"yaw":-124.75457,"roll":-9.019922}', '{"pitch":-29.591925,"yaw":-124.960815,"roll":-9.379437}', '{"pitch":-29.37105,"yaw":-125.14427,"roll":-9.642341}', '{"pitch":-29.483717,"yaw":-125.16528,"roll":-9.687177}', '{"pitch":-30.903332,"yaw":-124.603935,"roll":-9.423098}', '{"pitch":-30.211857,"yaw":-124.471664,"roll":-9.116135}', '{"pitch":-30.837414,"yaw":-125.18984,"roll":-9.824204}', '{"pitch":-30.526165,"yaw":-124.85788,"roll":-9.158611}', '{"pitch":-30.333513,"yaw":-123.68705,"roll":-7.9481263}', '{"pitch":-30.903502,"yaw":-123.78847,"roll":-8.209373}', '{"pitch":-31.194769,"yaw":-124.79708,"roll":-8.709783}']
>>> 

You need re.findall() (or re.finditer )

>>> import re
>>> for r in re.findall(r'{.*}', data)[:18]:
    print r


{"pitch":-30.778193,"yaw":-124.63285,"roll":-8.977466}
{"pitch":-30.856342,"yaw":-124.57556,"roll":-7.7220345}
{"pitch":-31.574106,"yaw":-124.65623,"roll":-7.911794}
{"pitch":-30.479567,"yaw":-124.24301,"roll":-8.730827}
{"pitch":-29.30239,"yaw":-123.97949,"roll":-8.134723}
{"pitch":-29.84712,"yaw":-124.584465,"roll":-8.588374}
{"pitch":-31.072054,"yaw":-124.707466,"roll":-8.877062}
{"pitch":-31.493435,"yaw":-124.75457,"roll":-9.019922}
{"pitch":-29.591925,"yaw":-124.960815,"roll":-9.379437}
{"pitch":-29.37105,"yaw":-125.14427,"roll":-9.642341}
{"pitch":-29.483717,"yaw":-125.16528,"roll":-9.687177}
{"pitch":-30.903332,"yaw":-124.603935,"roll":-9.423098}
{"pitch":-30.211857,"yaw":-124.471664,"roll":-9.116135}
{"pitch":-30.837414,"yaw":-125.18984,"roll":-9.824204}
{"pitch":-30.526165,"yaw":-124.85788,"roll":-9.158611}
{"pitch":-30.333513,"yaw":-123.68705,"roll":-7.9481263}
{"pitch":-30.903502,"yaw":-123.78847,"roll":-8.209373}
{"pitch":-31.194769,"yaw":-124.79708,"roll":-8.709783}

Extracting lines that start and end with a specific character can be done without any regex, use str.startswith and str.endswith methods when iterating through the lines in a file:

results = []
with open(filepath, 'r') as f:
    for line in f:
        if line.startswith('{') and line.rstrip('\n').endswith('}'):
            results.append(line.rstrip('\n'))

Note the .rstrip('\\n') is used before .endswith to make sure the final newline does not interfere with the } check at the end of the string.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM