
Python read text file into dictionary, list of strings

I am trying to read a text file into a dictionary. The text file contains each person's name, networks, and friends' names. The key of the dictionary is the person's name, and the value is that person's networks. Here is the text file:

Pritchett, Mitchell\n
Law Association\n
Dunphy, Claire\n
Tucker, Cameron\n
Dunphy, Luke\n
\n\n
Tucker, Cameron\n
Clown School\n
Wizard of Oz Fan Club\n
Pritchett, Mitchell\n
Pritchett, Gloria\n
\n\n
Dunphy, Alex\n
Orchestra\n
Chess Club\n
Dunphy, Luke\n

Here is what I did

def person_to_networks(file):

I get an error for the line 'if "\\n" and "," in lst[0]'. It says list index out of range. Please help me. I can't figure out what is wrong with this code.

On the first pass through the loop, you try to access lst[0] while lst is still [].

On the first line, at least, lst is an empty list ( [] ). You need to append some values to lst before indexing into it.
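The failure described above can be reproduced in isolation. The snippet below is a minimal sketch, not the asker's code: `first_item_has_comma` is a hypothetical helper name, and it only shows that indexing an empty list raises IndexError while a truthiness check avoids it.

```python
lst = []  # the accumulator starts out empty, as in the question

def first_item_has_comma(items):
    """Return True only if items is non-empty and its first element has a comma."""
    # bool(items) is False for an empty list, so items[0] is never evaluated then
    return bool(items) and "," in items[0]

try:
    lst[0]  # indexing an empty list raises IndexError
except IndexError as exc:
    error_message = str(exc)

print(error_message)
print(first_item_has_comma(lst))
print(first_item_has_comma(["Pritchett, Mitchell"]))
```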


You probably want to make the following changes:

Change if "\\n" and "," in lst[0]: to if "\\n" and "," in line[0]:

Change elif "," not in lst[1:]: to elif "," not in line[1:]:

new_person_friends in the last line is not defined. You need to replace it with the correct variable name.
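A separate point about the condition quoted in the question: Python parses `x and y in z` as `x and (y in z)`, so a non-empty string literal on the left is always truthy and only the comma test has any effect. The sketch below illustrates this with a made-up `line` value; it is not taken from the asker's code.

```python
line = "Pritchett, Mitchell"  # has a comma but no newline character

# Parsed as: "\n" and ("," in line). The literal "\n" is a non-empty string,
# so it is always truthy and only the comma test decides the result.
as_written = bool("\n" and "," in line)

# Testing each substring explicitly gives a different answer here.
both_present = "\n" in line and "," in line

print(as_written)
print(both_present)
```

If both substrings really must be present, each one needs its own `in` test.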


When line is "\\n", lst is cleared after networks is updated.
Your data contains "\\n\\n", which means two consecutive empty lines. On the second "\\n", lst is already an empty list because the first "\\n" was processed.
You need to change your code to avoid that problem, for example: if line == '\\n' and lst != []:
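Since the body of the original function is not shown, the following is only a hypothetical reconstruction of a loop with that guard. It assumes the input is an iterable of lines ending in real newline characters, that the first line of each block is the person, and that network names never contain a comma.

```python
def person_to_networks(lines):
    """Map each person's name to a list of their networks (sketch)."""
    result = {}
    lst = []  # lines collected for the block currently being read
    # A trailing "" acts as a sentinel so the last block is also stored.
    for line in list(lines) + [""]:
        line = line.strip()
        if line:
            lst.append(line)
        elif lst:  # blank line AND something collected: the guard
            # lst[0] is the person; later entries without a comma are networks
            result[lst[0]] = [item for item in lst[1:] if "," not in item]
            lst = []
        # a second consecutive blank line falls through and is ignored
    return result

sample = [
    "Pritchett, Mitchell\n", "Law Association\n", "Dunphy, Claire\n",
    "\n", "\n",
    "Tucker, Cameron\n", "Clown School\n", "Pritchett, Mitchell\n",
]
print(person_to_networks(sample))
```

With a real file, the lines could be passed in as `person_to_networks(open_file)`, where `open_file` is a file object opened for reading.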

You get that error because you initialize lst as an empty list [ ] and then check its first element, which does not exist.

You say that you want to turn your file into a dictionary. I suggest this simpler code for that:

import re  # regular expressions, used to switch the name order

# read the whole file; 'data' is the file name used in this example
with open('data', 'r') as f:
    data = f.read()

result = {}  # maps each person's name to their list of networks
data = data.replace('\\n', '')  # remove any literal backslash-n text
data = data.split('\n\n')       # split into blocks at blank lines
for block in data:
    block = block.split('\n')   # split each block into lines
    nets = []
    for line in block:
        if ',' not in line and line != '':  # networks have no comma
            nets.append(line)
    block[0] = re.sub(r'(\w+),\s(\w+)', r'\2, \1', block[0])  # ADDED to switch first name and last name
    result[block[0]] = nets     # store this person's networks
print(result)

Without the line marked ADDED (see the edit below), this gives the following result for your example file:

{'Pritchett, Mitchell': ['Law Association'], 'Tucker, Cameron': ['Clown School', 'Wizard of Oz Fan Club'], 'Dunphy, Alex': ['Orchestra', 'Chess Club']}

If this is not what you want, please describe in more detail what you need.

Edit: to switch the first name and last name, you only need one extra line that makes the switch before you update the dictionary. I added that line to the code above. It uses a regex (don't forget to add "import re", as at the beginning of my code):

'(\w+),\s(\w+)' # used to find the first name and last name and store them in \1 and \2 match groups.
'\2, \1'        # to replace the place of the match groups as required.
 OR '\2 \1'     # if you don't want the comma 

You can change the replacement string however you like, e.g. to remove the comma.
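As a quick check of the two replacement strings, here is a small sketch using `re.sub` on one name from the example file. The hyphenated name is a made-up input, included only to show that `\w+` does not match a hyphen, so the pattern assumes single-word names.

```python
import re

pattern = r'(\w+),\s(\w+)'  # group 1: last name, group 2: first name
name = 'Pritchett, Mitchell'

with_comma = re.sub(pattern, r'\2, \1', name)
without_comma = re.sub(pattern, r'\2 \1', name)

# Hypothetical input: only "Jones, Anna" matches, so the result is mangled.
hyphenated = re.sub(pattern, r'\2, \1', 'Smith-Jones, Anna')

print(with_comma)
print(without_comma)
print(hyphenated)
```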

After the switch, the output looks like this (the order of the keys may differ):

{'Alex, Dunphy': ['Orchestra', 'Chess Club'], 'Cameron, Tucker': ['Clown School', 'Wizard of Oz Fan Club'], 'Mitchell, Pritchett': ['Law Association']}

Edit: another way to switch the first and last names (remove "import re" and the previously added line, and replace that line with these three lines at the same indentation):

s = block[0].split(', ')
s.reverse()
block[0] = ', '.join(s)  # or use ' '.join(s) if you don't want the comma
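The three lines above could also be wrapped in a small function. This is only a sketch: `swap_name` and its `sep` parameter are hypothetical names, and the hyphenated name is a made-up input showing that splitting on ', ' does not depend on which characters the names contain.

```python
def swap_name(name, sep=', '):
    """Reverse the parts of a 'Last, First' name and rejoin them with sep."""
    parts = name.split(', ')
    parts.reverse()
    return sep.join(parts)

print(swap_name('Dunphy, Alex'))
print(swap_name('Dunphy, Alex', ' '))
print(swap_name('Smith-Jones, Anna'))  # hypothetical name
```

A string without ', ' in it, such as a network name, comes back unchanged.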

Hope this helps.
