How to read a file line by line with space separated values?

Question

I have a text file in an unusual format that contains millions of lines.

My file format is something like this:

12122.AA.K IRIR-93I3KD-OEPE-IE,6373,893939,09/12/2093,,N,EC,3838-38939-393
12123.AA.K KKKS-93I3KD-OEPE-IE,9393,039033,09/12/2093,,N,EC,3838-38939-393
12122.AA.K PEOEP-93I3KD-OEPE-IE,9033,930392,09/12/2093,,N,EC,3838-38939-393
12124.AA.K MDJDK-93I3KD-OEPE-IE,3930,272882,09/12/2093,,N,EC,3838-38939-393
12125.AA.K EOEPE-93I3KD-OEPE-IE,8393,039393,09/12/2093,,N,EC,3838-38939-393

In Python, I want to split each line into a key and a value:

Key: 12122.AA.K
Value: IRIR-93I3KD-OEPE-IE,6373,893939,09/12/2093,,N,EC,3838-38939-393

As you can see, the key and value are separated by a single space.

What's an efficient way of doing this in Python?

Answer 1

with open(filename) as f:
    # Split on the first space only; strip the newline so it doesn't end up in the value
    mapping = dict(line.rstrip('\n').split(' ', 1) for line in f)
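One thing to watch for: the sample in the question contains the key 12122.AA.K twice, and a plain dict keeps only the last value for a repeated key. If you need every value, here is a minimal sketch, assuming that is what you want. The names read_pairs and group_by_key are made up for illustration, and blank lines are skipped as an extra precaution.

```python
from collections import defaultdict


def read_pairs(lines):
    """Yield (key, value) tuples from lines shaped like '<key> <value>'."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines instead of failing on them
        # partition splits on the first space only, so the value stays intact
        key, _, value = line.partition(" ")
        yield key, value


def group_by_key(lines):
    """Collect all values per key, so repeated keys are not overwritten."""
    grouped = defaultdict(list)
    for key, value in read_pairs(lines):
        grouped[key].append(value)
    return grouped
```

You would call it as group_by_key(f) inside a with open(filename) as f: block. The file is still read one line at a time, so only the resulting dict is held in memory.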

Answer 2

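You could try this dictionary comprehension:

with open('file.txt', 'r') as file:
    # Split each line once on the first space; rstrip drops the trailing newline
    thedict = {key: value for key, value in (e.rstrip('\n').split(' ', 1) for e in file)}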

Answer 3

It's going to be overkill, but you can also use the built-in csv module.

While it's designed to work with comma-separated values by default, it does provide a way to register a custom dialect to match custom file formats, such as files with space-separated values. The Dialects and Formatting Parameters section of the documentation describes a delimiter attribute, which you can set to a space (" ").

import csv
from pprint import pprint

csv.register_dialect("my_custom_dialect", delimiter=" ")

mapping1 = {}
with open("test.txt") as f:
    reader = csv.reader(f, dialect="my_custom_dialect")
    for row in reader:
        # Each row is a list of strings separated by the delimiter
        key, value = row
        mapping1[key] = value
pprint(mapping1)

{'12122.AA.K': 'IRIR-93I3KD-OEPE-IE,6373,893939,09/12/2093,,N,EC,3838-38939-393',
 '12123.AA.K': 'KKKS-93I3KD-OEPE-IE,9393,039033,09/12/2093,,N,EC,3838-38939-393',
 '12124.AA.K': 'PEOEP-93I3KD-OEPE-IE,9033,930392,09/12/2093,,N,EC,3838-38939-393',
 '12125.AA.K': 'MDJDK-93I3KD-OEPE-IE,3930,272882,09/12/2093,,N,EC,3838-38939-393',
 '12126.AA.K': 'EOEPE-93I3KD-OEPE-IE,8393,039393,09/12/2093,,N,EC,3838-38939-393'}
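If you only need the dialect in one place, registering it is optional: csv.reader also accepts delimiter as a keyword argument. A small sketch of that variant follows. It uses io.StringIO as a stand-in for the open file, with two lines from the question's sample, and the length check for malformed rows is my own addition.

```python
import csv
import io

# Stand-in for an open file; the two lines come from the question's sample.
sample = io.StringIO(
    "12122.AA.K IRIR-93I3KD-OEPE-IE,6373,893939,09/12/2093,,N,EC,3838-38939-393\n"
    "12123.AA.K KKKS-93I3KD-OEPE-IE,9393,039033,09/12/2093,,N,EC,3838-38939-393\n"
)

# Pass the delimiter straight to csv.reader; no register_dialect call is needed.
mapping = {}
for row in csv.reader(sample, delimiter=" "):
    if len(row) != 2:
        continue  # skip blank lines or lines without exactly one space
    key, value = row
    mapping[key] = value
```

With a real file, replace sample with the file object from with open("test.txt", newline="") as f.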

If your file has headers, then you can leverage csv's DictReader to access each row's values as a dict.

KEY VALUE
12122.AA.K IRIR-93I3KD-OEPE-IE,6373,893939,09/12/2093,,N,EC,3838-38939-393
12123.AA.K KKKS-93I3KD-OEPE-IE,9393,039033,09/12/2093,,N,EC,3838-38939-393
12124.AA.K PEOEP-93I3KD-OEPE-IE,9033,930392,09/12/2093,,N,EC,3838-38939-393

import csv
from pprint import pprint

csv.register_dialect("my_custom_dialect", delimiter=" ")

mapping2 = {}
with open("test_with_headers.txt") as f:
    reader = csv.DictReader(f, dialect="my_custom_dialect")
    for row in reader:
        # 'row' is a dictionary with the headers as the key
        mapping2[row["KEY"]] = row["VALUE"]
pprint(mapping2)

{'12122.AA.K': 'IRIR-93I3KD-OEPE-IE,6373,893939,09/12/2093,,N,EC,3838-38939-393',
 '12123.AA.K': 'KKKS-93I3KD-OEPE-IE,9393,039033,09/12/2093,,N,EC,3838-38939-393',
 '12124.AA.K': 'PEOEP-93I3KD-OEPE-IE,9033,930392,09/12/2093,,N,EC,3838-38939-393'}
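DictReader can also be used on a file without a header line by supplying the column names yourself through its fieldnames parameter. This is only a sketch: the names "KEY" and "VALUE" are arbitrary labels chosen here, and io.StringIO stands in for the open file.

```python
import csv
import io

# Stand-in for a headerless file; the line comes from the question's sample.
sample = io.StringIO(
    "12124.AA.K MDJDK-93I3KD-OEPE-IE,3930,272882,09/12/2093,,N,EC,3838-38939-393\n"
)

# fieldnames supplies the column names, so the first line is treated as data.
reader = csv.DictReader(sample, fieldnames=["KEY", "VALUE"], delimiter=" ")
mapping3 = {row["KEY"]: row["VALUE"] for row in reader}
```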

3 answers

solution1
1 2017-11-02 23:28:11

solution2
0 2017-11-02 23:18:44

solution3
0 2021-11-23 09:24:46
