
How do I grab text from a file using a Python regex?

I have a large text file that contains one long block of GPS information, and I have written a Python script that takes coordinates and inserts them into an XML file. I just need a function to loop through the file and extract the coordinates.

The file consists of text like the following:

{u'bearing': 0, u'altitude': 0, u'time': 1423728072412L, u'longitude': -118.38120859999991, u'provider': u'network', u'latitude': 34.052508400000001, u'speed': 0, u'accuracy': 20}{u'bearing': 0, u'altitude': 0, u'time': 1423728072412L, u'longitude': -118.38120859999992, u'provider': u'network', u'latitude': 34.052508400000001, u'speed': 0, u'accuracy': 20}

I would like to use some sort of regex that lets me find and grab every instance of the value after u'longitude': and the value after u'latitude':. The document repeats this line about 1000 times, with different values each time.

Thanks in advance for any help or a nudge towards the right direction.

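You can apply ast.literal_eval() to each line in the file and read the coordinates from the resulting dictionary. This assumes each dictionary is on its own line and that the code runs under Python 2, because the L suffix on the time value is Python 2 long-integer syntax that Python 3 rejects: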

from ast import literal_eval

with open('input.txt') as f:
    for line in f:
        # Parse the dictionary literal on this line into a real dict
        d = literal_eval(line)
        print(d['longitude'])
        print(d['latitude'])
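The sample in the question has the dictionaries concatenated on a single line, so a raw line cannot be passed to literal_eval() directly. A minimal sketch of one workaround, assuming Python 3 and assuming that `}{` never occurs inside a value; the function name `parse_concatenated_dicts` is hypothetical, not part of any library:

```python
import re
from ast import literal_eval


def parse_concatenated_dicts(text):
    """Parse back-to-back Python 2 dict reprs into a list of dicts."""
    # Drop the Python 2 long suffix (e.g. 1423728072412L), which Python 3 rejects.
    text = re.sub(r'(?<=\d)L\b', '', text)
    # Put a newline between adjacent dictionaries, then parse each one separately.
    chunks = re.sub(r'\}\s*\{', '}\n{', text).splitlines()
    return [literal_eval(chunk) for chunk in chunks if chunk.strip()]


sample = ("{u'time': 1423728072412L, u'longitude': -118.38120859999991, "
          "u'latitude': 34.052508400000001}"
          "{u'time': 1423728072412L, u'longitude': -118.38120859999992, "
          "u'latitude': 34.052508400000001}")

records = parse_concatenated_dicts(sample)
coords = [(d['longitude'], d['latitude']) for d in records]
```

Stripping the suffix with a regex is a shortcut: it would also alter a string value that happens to contain a digit followed by L, so it is only reasonable if the data looks like the sample.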

As a side note, consider serializing the data as JSON instead of dumping dictionaries into a text file. The json module would help with that.
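To illustrate the JSON suggestion, here is a minimal sketch that writes one JSON object per line and reads the coordinates back. The helper names `dump_records` and `load_coordinates` and the file name `gps.jsonl` are hypothetical, and the one-object-per-line layout is an assumption rather than something the question specifies:

```python
import json
import os
import tempfile


def dump_records(path, records):
    """Write each record as one JSON object per line."""
    with open(path, 'w') as f:
        for record in records:
            f.write(json.dumps(record) + '\n')


def load_coordinates(path):
    """Return (longitude, latitude) pairs from a file written by dump_records."""
    coords = []
    with open(path) as f:
        for line in f:
            if line.strip():  # ignore blank lines
                d = json.loads(line)
                coords.append((d['longitude'], d['latitude']))
    return coords


records = [
    {'time': 1423728072412, 'longitude': -118.38120859999991, 'latitude': 34.052508400000001},
    {'time': 1423728072412, 'longitude': -118.38120859999992, 'latitude': 34.052508400000001},
]
path = os.path.join(tempfile.mkdtemp(), 'gps.jsonl')  # hypothetical file name
dump_records(path, records)
json_coords = load_coordinates(path)
```

With this layout, each line can be parsed on its own with json.loads(), so no regex or literal_eval() is needed on the reading side.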

(?<=longitude':)\s*([^,}]*)|(?<=latitude':)\s*([^,}]*)

Try this. See the demo:

https://regex101.com/r/jG2wO4/3

import re
p = re.compile(r'(?<=longitude\':)\s*([^,}]*)|(?<=latitude\':)\s*([^,}]*)')
test_str = "{u'bearing': 0, u'altitude': 0, u'time': 1423728072412L, u'longitude': -118.38120859999991, u'provider': u'network', u'latitude': 34.052508400000001, u'speed': 0, u'accuracy': 20}{u'bearing': 0, u'altitude': 0, u'time': 1423728072412L, u'longitude': -118.38120859999992, u'provider': u'network', u'latitude': 34.052508400000001, u'speed': 0, u'accuracy': 20}"

# Each match is a (longitude, latitude) tuple; the group that did not match is ''
p.findall(test_str)

If the file is not big, you can read it in one go and apply this regex. Otherwise, read it line by line, apply the regex to each line, and keep appending the results to a list or dictionary.
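Both reading strategies can be sketched as follows. This is a variation on the regex above, not the same pattern: it first isolates each `{...}` record and then searches for the two keys inside it, so each longitude stays paired with its latitude. The function names are hypothetical, and the sketch assumes that values contain no nested braces and that no record spans a line break:

```python
import re

RECORD = re.compile(r'\{[^{}]*\}')
LONGITUDE = re.compile(r"'longitude':\s*([^,}]*)")
LATITUDE = re.compile(r"'latitude':\s*([^,}]*)")


def extract_coordinates(text):
    """Return a list of (longitude, latitude) float pairs found in text."""
    coords = []
    for record in RECORD.finditer(text):
        lon = LONGITUDE.search(record.group())
        lat = LATITUDE.search(record.group())
        if lon and lat:  # skip records missing either key
            coords.append((float(lon.group(1)), float(lat.group(1))))
    return coords


def extract_from_file(path):
    """Apply extract_coordinates to one line at a time and collect the results."""
    coords = []
    with open(path) as f:
        for line in f:
            coords.extend(extract_coordinates(line))
    return coords


sample = ("{u'time': 1423728072412L, u'longitude': -118.38120859999991, "
          "u'provider': u'network', u'latitude': 34.052508400000001, u'accuracy': 20}"
          "{u'time': 1423728072412L, u'longitude': -118.38120859999992, "
          "u'provider': u'network', u'latitude': 34.052508400000001, u'accuracy': 20}")
pairs = extract_coordinates(sample)
```

For a small file, `extract_coordinates(open(path).read())` does the same work in one go. Note that if the whole file really is a single line, reading line by line still loads all of it at once.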
