Parsing txt file into dictionary in Python

Question

There are a lot of posts about parsing a text file in Python but I have a special case where the txt file isn't always pretty.

In a perfect world, the key and value would be separated by an equals sign on the same line and you could iterate through line by line and store the values into a dictionary. But of course this isn't a perfect world. Here is a snippet of my txt file:

Map ID  = 
26
Device Type = iPhone OS
Tutorial viewed = false
Last 5 errors = (
    142,
    752,
    142,
    752,
    752
)

IP of Device     = XXX.XX.XXX.XX

It is very inconsistent in terms of keeping things on the same line. For example, sometimes

Device Type = iPhone OS

sometimes

Device Type = iPhone
OS

and sometimes

Device Type = 
iPhone OS

What is the best way to go through these files so I can get a dictionary similar to the code below no matter what kind of horrible formatting occurs:

{'Map ID': 26,
 'Device Type': iPhone OS,
 'Tutorial viewed': false,
 'Last 5 errors': {142, 752, 142, 752, 752},
 'IP of Device': XXX.XX.XXX.XX}

There are also many lines in the txt file that don't contain equals signs; some need to be ignored and some are delimited by a colon (:), but that's another story.

Answer 1

Assuming that at least the entire key is always on the same line as the equals sign, you can iterate through the lines, add a new entry if the line is a 'key' line and add to the last key's entry otherwise:

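d = {}
for line in infile:  # infile is an open file object (or any iterable of lines)
    if "=" in line:
        # 'key' line: start a new entry, splitting at the first "=".
        key, val = map(str.strip, line.split("=", 1))
        d[key] = val
    else:
        # Continuation line: append to the most recent key's value.
        d[key] += line.strip()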

Also, = must never appear in a value. Output for your example:

{'IP of Device': 'XXX.XX.XXX.XX', 'Device Type': 'iPhone OS', 'Map ID': '26', 
 'Tutorial viewed': 'false', 'Last 5 errors': '(142,752,142,752,752)'}
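
The same loop can be wrapped in a function with one extra guard, as a rough sketch. The name `parse_key_values` and the `sample` text (copied from the question) are only for illustration; the guard simply ignores lines that show up before any key has been seen, which may or may not be what you want for your files.

```python
def parse_key_values(lines):
    """Collect 'key = value' pairs, appending continuation lines to the last key."""
    d = {}
    key = None
    for line in lines:
        if "=" in line:
            # Split at the first '=' only.
            key, val = (part.strip() for part in line.split("=", 1))
            d[key] = val
        elif key is not None:
            # No '=' on this line: treat it as more of the previous value.
            d[key] += line.strip()
        # Lines seen before the first key are ignored.
    return d


sample = """Map ID  =
26
Device Type = iPhone OS
Tutorial viewed = false
Last 5 errors = (
    142,
    752,
    142,
    752,
    752
)

IP of Device     = XXX.XX.XXX.XX
"""

result = parse_key_values(sample.splitlines())
```

Note that continuation lines are joined without a separator, so a value broken as `iPhone` / `OS` across two lines comes back as `iPhoneOS`. If that matters, joining with a space is probably the easier fix.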

Answer 2

Assuming that the delimiter (in this case '=') is never part of the data values, I'd do something like this:

mydict = {}
key, val = None, ''
for line in dirty_file:
    if '=' in line:
        if key is not None:
            mydict[key] = val  # You might want to do type conversions here
        # Strip whitespace around both parts so keys like 'Map ID  ' come out clean.
        key, val = (part.strip() for part in line.split('=', 1))
    else:
        val += line.strip()

if key is not None:  # For the final item
    mydict[key] = val
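
For the type conversions mentioned in the comment above, something along these lines might do as a starting point. This is only a sketch: `convert_value` is a made-up helper, and it assumes values are either `true`/`false`, integers, a parenthesised comma-separated group, or plain strings. It returns a list rather than a set for the parenthesised case, since a set would drop the repeated error codes.

```python
def convert_value(raw):
    """Best-effort conversion of a raw string value; falls back to the string itself."""
    text = raw.strip()
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    if text.startswith("(") and text.endswith(")"):
        # Parenthesised, comma-separated items: convert each one in turn.
        items = [item for item in text[1:-1].split(",") if item.strip()]
        return [convert_value(item) for item in items]
    try:
        return int(text)
    except ValueError:
        # Not an integer: keep it as a string (e.g. 'iPhone OS').
        return text


# Raw string values as produced by the parsing loop.
raw = {
    "Map ID": "26",
    "Device Type": "iPhone OS",
    "Tutorial viewed": "false",
    "Last 5 errors": "(142,752,142,752,752)",
    "IP of Device": "XXX.XX.XXX.XX",
}
converted = {key: convert_value(val) for key, val in raw.items()}
```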

Answer 3

The way I see it, you need to aggregate lines so that each aggregated chunk contains exactly one "=" sign, as that is your best bet for a separator. The logic for parsing the error tuple into a set or the "false" string into a boolean is up to your implementation, but don't forget to strip the newlines after the initial parsing. A piece of code might look like this:

split = myText.split("=")
firstKey = split[0].strip()
secondSplit = split[1].split("\n")
firstVal = secondSplit[:-1]          # lines that make up the first value
secondKey = secondSplit[-1].strip()  # last line before the next "=" is the next key

This is just an example, not a generalization. You would have to come up with the logic that treats the first and last pieces as special cases, while the middle ones are all treated pretty much the same way.
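
One way that generalization could look, as a sketch only: `parse_by_splitting` and the `text` sample are illustrative names, and the code assumes that every key fits on a single line, that "=" never appears in a value, and that the last line before each "=" is always a key.

```python
def parse_by_splitting(text):
    """Split the whole text on '=' and peel the next key off the end of each piece."""
    pieces = text.split("=")
    result = {}
    # First piece: special case, it only holds the first key (on its last line).
    key = pieces[0].strip().split("\n")[-1].strip()
    for piece in pieces[1:-1]:
        lines = piece.rstrip().split("\n")
        # Everything except the last line belongs to the current key's value...
        result[key] = "".join(part.strip() for part in lines[:-1])
        # ...and the last line is the next key.
        key = lines[-1].strip()
    # Last piece: special case, it is entirely the value of the last key.
    if len(pieces) > 1:
        result[key] = "".join(part.strip() for part in pieces[-1].split("\n"))
    return result


text = """Map ID  =
26
Device Type = iPhone OS
Tutorial viewed = false
Last 5 errors = (
    142,
    752,
    142,
    752,
    752
)

IP of Device     = XXX.XX.XXX.XX
"""

parsed = parse_by_splitting(text)
```

Lines without an "=" that should be ignored would end up glued onto the previous value here, so they would need filtering before or after this step.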

Answer 4

Don't know how the rest of your file looks but this might work:

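d = {}
key = None
value = ''
with open(infile) as fin:  # infile is the path to the text file
    for line in fin:
        if '=' in line:  # New key starting.
            if key is not None:
                d[key] = value  # Store the previous, now complete, entry.
            # Split once, at the first '='.
            key, value = (part.strip() for part in line.split('=', 1))
        else:  # Only additional value in line.
            value += line.strip()

if key is not None:  # Store the final entry, which the loop never reaches.
    d[key] = value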

Can't comment yet unfortunately, but you're right, I changed the dictionary name.

4 answers

solution1
3 ACCEPTED 2014-06-17 13:24:09

solution2
2 2014-06-17 13:24:10

solution3
0 2014-06-17 13:15:51

solution4
0 2014-06-17 13:28:38
