
Reading and formatting text files into Python

I am trying to read a text file into a dictionary using Python. When I open the file, it reads as follows:

SS,City,State,Country,Pop,Age,Random    
('321', 'Houston', 'TX', 'US', '84549', '45', 2000)        
('654', 'Miami', 'FL', 'US', '99999', '55', -2001)    
('940', 'Dallas', 'TX', 'US', '3243', '30', 324113)    

When I read the file into a dictionary, I get extra characters that I do not see in the text file. I have tried stripping and removing characters but can't seem to get anything to work. Here is what happens when I print my dictionary:

("('321'", " 'Houston'", " 'TX'", " 'US'", " '84549'", " '45'", ' 2000)')
("('654'", " 'Miami'", " 'FL'", " 'US'", " '99999'", " '55'", ' -2001)')
("('940'", " 'Dallas'", " 'TX'", " 'US'", " '3243'", " '30'", ' 324113)')

Below is the code I have so far.

locations={}
with open ("locations.txt") as lct:
    z=lct.readline()
    for line in lct:
        line=line.strip().split(",")
        ss, city, state, cntry, pop, age, random = line
    if state == "TX":
        locations[ss] = Texas(ss,city,state,cntry,pop,age,random)
    elif state == "FL":
        locations[ss] = Florida(ss,city,state,cntry,pop,age,random)

I would like the lines to display as follows:
('321', 'Houston', 'TX', 'US', '84549', '45', '2000')

Any suggestions?

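You can slice off the surrounding parentheses before splitting, then strip the leftover spaces and quotes from each field.

locations = {}
with open("locations.txt") as lct:
    z = lct.readline()  # skip the header line
    for line in lct:
        # [1:-1] drops "(" and ")"; strip(" '") removes the spaces and quotes around each field
        line = [field.strip(" '") for field in line.strip()[1:-1].split(",")]
        ss, city, state, cntry, pop, age, random = line
        # keep the if inside the loop so every line is stored, not only the last one
        if state == "TX":
            locations[ss] = Texas(ss, city, state, cntry, pop, age, random)
        elif state == "FL":
            locations[ss] = Florida(ss, city, state, cntry, pop, age, random)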
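
If a field could ever contain a comma inside its quotes, a plain split(",") would cut it in two. A minimal sketch, assuming the same line layout as in the question, that lets the standard csv module handle the quoting once the parentheses are sliced off (parse_line is a hypothetical helper name, not part of the original code):

```python
import csv

def parse_line(line):
    """Return the fields of one "('a', 'b', 123)" style line as strings."""
    inner = line.strip()[1:-1]  # drop the surrounding parentheses
    # quotechar="'" treats single quotes as field quoting;
    # skipinitialspace=True ignores the blank that follows each comma
    reader = csv.reader([inner], quotechar="'", skipinitialspace=True)
    return next(reader)

print(parse_line("('321', 'Houston', 'TX', 'US', '84549', '45', 2000)\n"))
```

The returned list can be unpacked into ss, city, state and so on exactly as in the loop above.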

The file is just plain text, so the parentheses, spaces, and quotation marks are read as part of each field. A regex can help you pull out only the values:

    import re
    text = "('321', 'Houston', 'TX', 'US', '84549', '45', 2000)"
    pattern = r"-?\w+"   # the optional "-" keeps the sign of values such as -2001
    print(re.findall(pattern, text))

> ['321', 'Houston', 'TX', 'US', '84549', '45', '2000']

So your code will look like this:

import re                                         # added line
pattern = r"-?\w+"                                # added line

locations = {}
with open("locations.txt") as lct:
    z = lct.readline()
    for line in lct:
        l = re.findall(pattern, line)             # changed line
        ss, city, state, cntry, pop, age, random = l
        # keep the if inside the loop so every line is stored
        if state == "TX":
            locations[ss] = Texas(ss, city, state, cntry, pop, age, random)
        elif state == "FL":
            locations[ss] = Florida(ss, city, state, cntry, pop, age, random)
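
One caveat: a pattern built on \w+ splits a multi-word city such as 'San Antonio' into two matches, which breaks the seven-way unpacking. A hedged variant, assuming quoted fields never contain a single quote themselves, matches either a whole quoted field or a bare integer. FIELD and split_fields are illustrative names, and the sample row is made up for the demonstration:

```python
import re

# Either a whole single-quoted field or a bare (possibly negative) integer
FIELD = re.compile(r"'[^']*'|-?\d+")

def split_fields(line):
    """Return the fields of one line with the surrounding quotes removed."""
    return [match.strip("'") for match in FIELD.findall(line)]

# Hypothetical sample row with a two-word city and a negative number
print(split_fields("('210', 'San Antonio', 'TX', 'US', '1500', '33', -42)"))
```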

Just replace the three characters (, ) and ' with an empty string and your problem will be solved.

Please use the code below:

locations = {}
with open("locations.txt") as lct:
    z = lct.readline()  # skip the header line
    for line in lct:
        # str.replace returns a new string, so the result must be assigned back
        line = line.replace("(", "").replace(")", "").replace("'", "")
        # split on commas and strip the space left after each one
        line = [field.strip() for field in line.split(",")]
        ss, city, state, cntry, pop, age, random = line
        if state == "TX":
            locations[ss] = Texas(ss, city, state, cntry, pop, age, random)
        elif state == "FL":
            locations[ss] = Florida(ss, city, state, cntry, pop, age, random)
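
The same character removal can be done in one pass with str.translate. This is only a sketch of an alternative, not the code from the answer; DROP and clean_fields are hypothetical names:

```python
# Translation table that deletes "(", ")" and "'" from a string
DROP = str.maketrans("", "", "()'")

def clean_fields(line):
    """Remove the unwanted characters, split on commas, and trim each field."""
    return [field.strip() for field in line.translate(DROP).split(",")]

print(clean_fields("('654', 'Miami', 'FL', 'US', '99999', '55', -2001)\n"))
```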

Since each line of the text is valid Python syntax, using eval is easy. Only do this with files you trust, because eval executes whatever code a line contains.

text = """('321', 'Houston', 'TX', 'US', '84549', '45', 2000)
('654', 'Miami', 'FL', 'US', '99999', '55', -2001)
('940', 'Dallas', 'TX', 'US', '3243', '30', 324113)"""

locations = {}
func = {'TX': Texas, 'FL': Florida}
for line in text.split('\n'):
    args = eval(line)
    ss, state = args[0], args[2]
    if state in func:
        # look up the class for this state, then call it with the parsed fields
        locations[ss] = func[state](*args)
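
A safer variant of the same idea is ast.literal_eval from the standard library, which accepts only Python literals and so cannot run arbitrary code. The sketch below assumes stand-in namedtuple classes for Texas and Florida, since the real classes are not shown in the question, and converts every value to str so the last field comes out as '2000' as requested:

```python
import ast
from collections import namedtuple

# Hypothetical stand-ins for the asker's Texas and Florida classes
FIELD_NAMES = "ss city state cntry pop age random"
Texas = namedtuple("Texas", FIELD_NAMES)
Florida = namedtuple("Florida", FIELD_NAMES)

text = """('321', 'Houston', 'TX', 'US', '84549', '45', 2000)
('654', 'Miami', 'FL', 'US', '99999', '55', -2001)
('940', 'Dallas', 'TX', 'US', '3243', '30', 324113)"""

constructors = {"TX": Texas, "FL": Florida}
locations = {}
for line in text.splitlines():
    # literal_eval parses the tuple; str() turns the trailing int into a string
    args = [str(value) for value in ast.literal_eval(line)]
    ss, state = args[0], args[2]
    if state in constructors:
        locations[ss] = constructors[state](*args)

print(tuple(locations["321"]))
```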


 