
Loading .txt file as a dict, but exclude commented rows

I have some data in a txt file and I would like to load it into a list of dicts. I would normally use csv.DictReader(open('file')), but this data does not have the field names in the first row. Instead, a number of commented-out rows precede the data. The commented rows are also not always at the beginning of the file; sometimes they are at the end.

However, all lines should always have the same fields, and I guess I could hard-code these field names (or keys), as they shouldn't change.

Sample Data

# twitter data
# retrieved at: 07.08.2014
# total number of records: 5
# exported by: userXYZ
# fields: date, time, username, source
10.12.2013; 02:00; tweeterA; web
10.12.2013; 02:01; tweeterB; iPhone
10.13.2013; 02:04; tweeterC; android
10.13.2013; 02:08; tweeterC; web
10.13.2013; 02:10; tweeterD; iPhone

Below is what I've been able to figure out so far, but I need some help getting it worked out.

My Code

header = ['date', 'time', 'username', 'source']
data = []

for line in open('data.txt'):
    if not line.startswith('#'):
        data.append(line)

Desired Format

[{'date': '10.12.2013', 'time': '02:00', 'username': 'tweeterA', 'source': 'web'},
 {'date': '10.12.2013', 'time': '02:01', 'username': 'tweeterB', 'source': 'iPhone'},
 {'date': '10.13.2013', 'time': '02:04', 'username': 'tweeterC', 'source': 'android'},
 {'date': '10.13.2013', 'time': '02:08', 'username': 'tweeterC', 'source': 'web'},
 {'date': '10.13.2013', 'time': '02:10', 'username': 'tweeterD', 'source': 'iPhone'}]

If you want a list of dicts, with one dict per line, try the following:

list_of_dicts = [
    dict(zip(header, line.strip().split('; ')))
    for line in open('data.txt')
    if not line.strip().startswith('#')
]

for line in open('data.txt'):
    if not line.strip().startswith('#'):
        # strip() removes the trailing newline so it does not end up in the last field
        data.append(line.strip().split("; "))

At least, assuming I understand you correctly.

Or, more succinctly:

data = [line.strip().split("; ") for line in open("data.txt") if not line.strip().startswith("#")]
list_of_dicts = map(lambda row: dict(zip(header, row)), data)

Depending on your version of Python, map may return an iterator instead of a list (it does in Python 3), in which case just wrap it in list():

list_of_dicts = list(map(lambda row: dict(zip(header, row)), data))
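Since the question mentions csv.DictReader, here is a minimal sketch of the same idea using it, on the assumption that the field names are hard-coded as in the question's header list. The function name load_rows is hypothetical; the fieldnames, delimiter and skipinitialspace arguments are standard csv options. Comment lines are filtered out before the reader sees them, so it does not matter whether they appear at the start or the end of the file.

```python
import csv

# Hard-coded field names, as in the question (assumed never to change).
header = ['date', 'time', 'username', 'source']


def load_rows(path, fieldnames=header):
    """Return a list of dicts, skipping blank lines and lines starting with '#'.

    load_rows is a hypothetical helper name used for illustration only.
    """
    with open(path, newline='') as f:
        # Lazily drop comment and blank lines before csv parses anything.
        data_lines = (
            line for line in f
            if line.strip() and not line.lstrip().startswith('#')
        )
        reader = csv.DictReader(
            data_lines,
            fieldnames=fieldnames,   # no header row in the file itself
            delimiter=';',
            skipinitialspace=True,   # ignore the space after each ';'
        )
        # Consume the reader while the file is still open.
        return [dict(row) for row in reader]
```

Calling load_rows('data.txt') on the sample data should give the list of dicts shown under Desired Format. Unlike the split("; ") versions, this also tolerates a missing space after the semicolon.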


 