How to remove blank lines between rows in .dat files using Python?

I am dealing with a data file, which has only two columns:

 1 100
 2 200
 3 300
 4 400
 5 500

 6 600
 7 700
 8 800
 9 900
10 1000

11 1100
12 1200
13 1300
.
.
. 

This file is in .dat format, and I load it with np.loadtxt. I want to remove the blank lines that appear at random between the rows. There are too many of them to remove by hand, so I am wondering whether there is a way to do this in Python.

Please give suggestions on it.
Thank you!

Your best bet is to use pandas.read_csv(), which skips blank lines by default (skip_blank_lines=True). Read the file as space-delimited with no header row:

>>> import pandas as pd
>>> df = pd.read_csv("<your_dat_file>", delimiter=" ", header=None, skipinitialspace=True)
>>> df
     0     1
0    1   100
1    2   200
2    3   300
3    4   400
4    5   500
5    6   600
6    7   700
7    8   800
8    9   900
9   10  1000
10  11  1100
11  12  1200
12  13  1300
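If the goal is a cleaned file on disk rather than just a DataFrame, the same idea can be extended to write the rows back out. The following is only a sketch under some assumptions: the sample text stands in for the real file, the column names "x" and "y" are made up for illustration, and sep=r"\s+" (split on any run of whitespace) is swapped in for the delimiter=" " settings used above.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real .dat file (note the blank lines).
sample = " 1 100\n 2 200\n\n 3 300\n\n10 1000\n"

# sep=r"\s+" splits on any run of whitespace; blank lines are skipped
# because skip_blank_lines defaults to True.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None, names=["x", "y"])

# With no path given, to_csv returns the text instead of writing a file.
cleaned = df.to_csv(sep=" ", header=False, index=False)
print(cleaned, end="")
```

To write straight to disk, pass a path as the first argument of to_csv. Note that this normalizes the column alignment, so the leading padding of the original rows is not preserved.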

I actually view this as a base Python problem, and so would suggest:

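import re

# Copy the file line by line, keeping only lines that contain
# at least one non-whitespace character.
with open("data_file.txt", "r") as fin, open("data_file_out.txt", "w") as fout:
    for line in fin:  # iterate lazily instead of loading everything with readlines()
        if re.search(r'\S', line):
            fout.write(line)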

The file data_file_out.txt generated by the above should contain the same contents as your current file, with empty lines removed ("empty" being defined here as lines that have either no content or only whitespace characters).
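The same filter can also be written without a regular expression, since str.strip() returns an empty string for a line that holds only whitespace. Here is a minimal sketch wrapped in a reusable function; the function name remove_blank_lines, the file names, and the sample data are all hypothetical and chosen only for illustration.

```python
import tempfile
from pathlib import Path


def remove_blank_lines(src, dst):
    """Copy src to dst, skipping empty or whitespace-only lines.

    Returns the number of lines that were dropped.
    """
    dropped = 0
    with open(src, "r") as fin, open(dst, "w") as fout:
        for line in fin:
            if line.strip():  # non-empty once surrounding whitespace is removed
                fout.write(line)
            else:
                dropped += 1
    return dropped


# Demonstration on a throwaway file with made-up sample data.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample.dat"
    dst = Path(tmp) / "sample_clean.dat"
    src.write_text(" 1 100\n 2 200\n\n 3 300\n   \n10 1000\n")
    n_dropped = remove_blank_lines(src, dst)
    cleaned_text = dst.read_text()

print(n_dropped)
print(cleaned_text, end="")
```

Unlike a round trip through a parser, this keeps every surviving line exactly as it was, including its leading padding.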
