
Importing a file as a dictionary in Python

I have a file with data like this. Lines starting with '>' serve as identifiers.

>test1
this is line 1
hi there
>test2
this is line 3
how are you
>test3
this is line 5 and
who are you

I'm trying to create a dictionary like this:

{'>test1':'this is line 1hi there','>test2':'this is line 3how are you','>test3':'this is line 5 andwho are you'}

I've read the file in, but I can't get the data into this shape. I want to strip the newline character at the end of each line so that the lines under each identifier are joined into a single string. No separating space is needed, as shown above. Any help would be appreciated.

This is what I've tried so far:

new_dict = {}
db = open("/home/ak/Desktop/python_files/smalltext.txt")

for line in db:
    if '>' in line:
        new_dict[line]=''
    else:
        new_dict[line]=new_dict[line].append(line)

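Using your approach, you need to remember the current identifier in a variable and extend its value with += (strings have no append method). It would be: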

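new_dict = {}
key = None

# The with block closes the file automatically.
with open("/home/ak/Desktop/python_files/smalltext.txt", 'r') as db:
    for line in db:
        if line.startswith('>'):
            key = line.strip()    # strip the trailing newline from the identifier
            new_dict[key] = ''
        elif key is not None:     # ignore any text before the first identifier
            new_dict[key] += line.strip()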
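
The same loop can be wrapped in a function that accepts any iterable of lines, which makes it easy to try without the file on disk. This is only a sketch: the name parse_records and the sample list are made up for illustration, and it assumes that identifier lines start with '>' and that blank lines can be skipped.

```python
def parse_records(lines):
    """Build {identifier: joined text} from an iterable of lines."""
    records = {}
    key = None
    for line in lines:
        line = line.strip()
        if not line:
            continue              # skip blank lines
        if line.startswith('>'):
            key = line            # remember the current identifier
            records[key] = ''
        elif key is not None:     # ignore text before the first identifier
            records[key] += line
    return records


# Hypothetical sample data mirroring the first two records of the file
sample = [
    ">test1\n", "this is line 1\n", "hi there\n",
    ">test2\n", "this is line 3\n", "how are you\n",
]
result = parse_records(sample)
print(result)
```

An open file object is itself an iterable of lines, so parse_records(db) should work the same way inside a with open(...) as db: block.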

Here is a solution using groupby:

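from itertools import groupby

f_name = "/home/ak/Desktop/python_files/smalltext.txt"  # path from the question

kvs = []
with open(f_name) as f:
    # groupby yields alternating runs of identifier lines (k is True) and text lines (k is False)
    for k, v in groupby((e.rstrip() for e in f), lambda s: s.startswith('>')):
        kvs.append(''.join(v) if k else '\n'.join(v))

# Even positions hold the identifiers, odd positions hold the joined text
print(dict(zip(kvs[0::2], kvs[1::2])))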

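The dict for the sample file above (the text lines are joined with '\n' here; use ''.join(v) in both branches to drop the newlines):

{'>test1': 'this is line 1\nhi there', 
 '>test2': 'this is line 3\nhow are you', 
 '>test3': 'this is line 5 and\nwho are you'}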
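
If the goal is the exact output from the question (no newline between the joined lines), one possible variation pairs each identifier with the run of text that follows it instead of relying on even and odd positions. This is a sketch under assumptions: group_records and the sample list are hypothetical names, and blank lines are dropped before grouping.

```python
from itertools import groupby


def group_records(lines):
    """Map each '>' identifier to the text lines that follow it, joined without separators."""
    stripped = (line.strip() for line in lines)
    non_blank = (line for line in stripped if line)
    records = {}
    current = None
    for is_header, group in groupby(non_blank, key=lambda s: s.startswith('>')):
        if is_header:
            # A run of consecutive identifiers: each gets an empty entry,
            # and the last one owns the text run that follows.
            for current in group:
                records[current] = ''
        elif current is not None:
            records[current] = ''.join(group)
    return records


# Hypothetical sample data mirroring the first two records of the file
sample = [
    ">test1\n", "this is line 1\n", "hi there\n",
    ">test2\n", "this is line 3\n", "how are you\n",
]
result = group_records(sample)
print(result)
```

Tracking the current identifier explicitly means that an identifier with no text after it still gets an entry, whereas the zip-based pairing would shift every later key by one position.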

You can use a regex:

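import re

fn = "/home/ak/Desktop/python_files/smalltext.txt"  # path from the question

di = {}
# Group 1 is the identifier line; group 2 is everything up to the next '>' line or the end of the text
pat = re.compile(r'^(>.*?)$(.*?)(?=^>|\Z)', re.S | re.M)
with open(fn) as f:
    txt = f.read()
    for m in pat.finditer(txt):
        di[m.group(1)] = m.group(2).strip()

print(di)

# {'>test1': 'this is line 1\nhi there', '>test2': 'this is line 3\nhow are you', '>test3': 'this is line 5 and\nwho are you'}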
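
The regex version keeps the newline between the text lines. To get the single-line values asked for in the question, the captured block can be split into lines and joined again. The sketch below reuses the same pattern; parse_text and the sample string are hypothetical names used only for illustration.

```python
import re

# Same pattern as above: group 1 is the identifier, group 2 the text block after it
PAT = re.compile(r'^(>.*?)$(.*?)(?=^>|\Z)', re.S | re.M)


def parse_text(txt):
    """Return {identifier: text block with its line breaks removed}."""
    return {
        m.group(1): ''.join(m.group(2).splitlines())
        for m in PAT.finditer(txt)
    }


# Hypothetical sample text mirroring the first two records of the file
txt = ">test1\nthis is line 1\nhi there\n>test2\nthis is line 3\nhow are you\n"
result = parse_text(txt)
print(result)
```

Using splitlines() removes only the line breaks, so spaces inside each line are kept as they are.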


 