Text to dictionary doesn't work

I have the following text file in the same folder as my Python Code.

78459581
Black Ballpoint Pen
12345670
Football
49585922
Perfume
83799715
Shampoo

I have written this Python code.

file = open("ProductDatabaseEdit.txt", "r")
d = {}
for line in file:
    x = line.split("\n")
    a=x[0]
    b=x[1]
    d[a]=b

print(d)

This is the result I receive.

b=x[1]  # IndexError: list index out of range

My dictionary should appear as follows:

{"78459581" : "Black Ballpoint Pen",
 "12345670" : "Football",
 "49585922" : "Perfume",
 "83799715" : "Shampoo"}

What am I doing wrong?

Each line the file iterator yields ends at its line break, so line.split("\n") will never give you a second line of text.

You could cheat and do:

for first_line in file:
    second_line = next(file)

You can simplify your solution by passing a generator expression to dict(); this is probably the most Pythonic solution I can think of:

>>> with open("in.txt") as f:
...   my_dict = dict((line.strip(), next(f).strip()) for line in f)
... 
>>> my_dict
{'12345670': 'Football', '49585922': 'Perfume', '78459581': 'Black Ballpoint Pen', '83799715': 'Shampoo'}

Here in.txt contains the data as described in the question. You need to call strip() on each line; otherwise your keys and values would keep a trailing \n character.
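One caveat with calling next() inside a generator expression: if the file has an odd number of lines, the last key has no value line, and on Python 3.7+ the resulting StopIteration surfaces as a RuntimeError. A possible variation (not from the answer above) is to zip one iterator with itself, which silently drops a dangling key. The in-memory data below is a hypothetical stand-in for in.txt:

```python
import io

# Hypothetical in-memory stand-in for in.txt, with a dangling key at the end.
f = io.StringIO("78459581\nBlack Ballpoint Pen\n12345670\nFootball\n49585922\n")

stripped = (line.strip() for line in f)
# zip() pulls a key, then a value, from the same iterator on each step.
my_dict = dict(zip(stripped, stripped))

print(my_dict)
```

Whether dropping the unpaired key is acceptable depends on your data; raising an error may be the better choice if every product must have a name.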

You need to strip the \n, not split on it:

file = open("products.txt", "r")
d = {}
for line in file:
    a = line.strip()
    b = next(file).strip()  # next(file) works in Python 2.6+ and 3.x; file.next() is Python 2 only
    d[a]=b

print(d)

{'12345670': 'Football', '49585922': 'Perfume', '78459581': 'Black Ballpoint Pen', '83799715': 'Shampoo'}

What's going on

When you open a file you get an iterator, which will give you one line at a time when you use it in a for loop.

Your code iterates over the file and splits every line on \n, but each line contains at most one \n, at its very end. For a line that ends in \n the split gives you the text plus an empty string; for a final line with no trailing \n it gives you a list with only one item: the same line you already had. Accessing the second item of that one-item list, which doesn't exist, is why you get the IndexError: list index out of range.
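A quick way to see this is to run split("\n") on a few sample lines. This is only a sketch: the list below imitates what the file iterator would yield, assuming the last line of the file has no trailing newline.

```python
# Lines as the file iterator would yield them; the last one is assumed
# to have no trailing newline.
lines = ["78459581\n", "Black Ballpoint Pen\n", "Shampoo"]

results = [line.split("\n") for line in lines]
for line, parts in zip(lines, results):
    print(repr(line), "->", parts)

# A line ending in "\n" splits into the text plus an empty string;
# a line without "\n" splits into a one-item list, so parts[1] would fail.
```

In neither case does the second element hold the next product line, so split cannot pair keys with values here.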

How to fix it

What you need is this:

file = open('products.txt','r')
d = {}
for line in file:
    d[line.strip()] = next(file).strip()

On every iteration you add a new key to the dictionary (by assigning a value to a key that didn't exist yet) and assign the next line as its value. The next() function simply tells the file iterator "please move on to the next line". So, to drive the point home: in the first iteration the first line becomes a key and the second line its value; in the second iteration the third line becomes a key and the fourth line its value; and so on.
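The pairing can be tried without a file on disk. The sketch below uses io.StringIO as a hypothetical stand-in for the opened file, and passes a default to next() so that a key with no following line gets an empty value instead of raising StopIteration (that default is an addition, not part of the code above):

```python
import io

# Stand-in for the opened text file (hypothetical in-memory data).
file = io.StringIO("78459581\nBlack Ballpoint Pen\n12345670\nFootball\n")

d = {}
for line in file:
    # The for loop consumed one line (the key); next() consumes the one after it.
    value = next(file, "")  # "" if a key has no value line after it
    d[line.strip()] = value.strip()

print(d)
```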

You need the .strip() method every time because each line read from the file still carries its trailing newline character (\n); strip() removes it, along with any other leading or trailing whitespace.

Or...

You can also get the same result using a dictionary comprehension:

file = open('products.txt','r')
d = {line.strip():next(file).strip() for line in file}

Basically, this is a shorter version of the code above. It's shorter but less readable, which is not necessarily what you want (a matter of taste).

In my solution I tried not to use any loops. Therefore, I first load the text data with pandas:

import pandas as pd
file = pd.read_csv("test.txt", header = None)

Then I separate the keys and values for the dict:

keys, values = file[0::2].values, file[1::2].values

Then, we can directly zip these two as lists and create a dict:

result = dict(zip(list(keys.flatten()), list(values.flatten())))

To create this solution I used the information from the questions "How to remove every other element of an array in python? (The inverse of np.repeat()?)" and "Map two lists into a dictionary in Python".

You can loop over a list two items at a time:

file = open("ProductDatabaseEdit.txt", "r")
data = file.readlines()
d = {}

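for i in range(0, len(data), 2):
    # data[i] is the key line, data[i+1] the value line; strip() drops the trailing newline
    d[data[i].strip()] = data[i+1].strip()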

Try this code (where the data is in /tmp/tmp5.txt):

#!/usr/bin/env python3

d = dict()
iskey = True
with open("/tmp/tmp5.txt") as infile:
    for line in infile:
        if iskey:
            _key = line.strip()
        else:
            _value = line.strip()
            d[_key] = _value
        iskey = not iskey

print(d)

Which gives you:

{'12345670': 'Football', '49585922': 'Perfume', '78459581': 'Black Ballpoint Pen', '83799715': 'Shampoo'}
