
How to extract multiple columns from a space delimited .DAT file in python

I'm quite new to coding and have no formal education in the subject (most of my experience comes from stumbling through Google searches), and I have a task I would like help with.

I have 38 files which look something like this:

NGANo: 000a16d_1

Zeta: 0.050000

Ds5-95: 5.290000

Comments:

Period, SD, SV, SA

0.010000 0.000433 0.013167 170.812839
0.020000 0.001749 0.071471 172.720229
0.030000 0.004014 0.187542 176.055129
0.040000 0.007631 0.468785 189.322248
0.050000 0.012815 0.912067 203.359441
0.060000 0.019246 1.556853 210.602517
0.070000 0.025400 1.571091 206.360018

They're all .DAT files with four columns of data (Period, SD, SV, SA), separated by a single space in each row. Additionally, each line of data ends with two trailing spaces.

The only data I need is the SA column. I'd like to take the SA data and the title (000a16d_1 in this example) from each of the 38 files and put them all on the same sheet of an Excel spreadsheet, one column after the next, with each column containing just the title followed by the SA data.

I've tried a few different things, but I'm stuck on how to separate the rows of data from one column into four. I'm not sure whether I should use numpy or pandas. I believe everything up to the second-to-last line is correct, because print(table) does print the rows of data; I just don't understand how to split the single column into multiple columns. Here is my current code. All assistance is appreciated.

import pandas as pd
import numpy as np
import os
import xlsxwriter
#
path = "C:/Users/amihi/Downloads/Plotter_Output"
dirs = os.listdir(path)
#
#
for file in dirs:
    table = pd.read_table(file, skiprows=4)
    SA = table.loc[:,"SA"]
    print(SA)

You could also do this without pandas if you wanted. The code below handles only the table section of each file; it does not extract the information at the top of the file (such as the NGANo title).

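import os

path = "C:/Users/amihi/Downloads/Plotter_Output"
finalColumns = []
for file in os.listdir(path):
    # One list per column (Period, SD, SV, SA), reset for every file
    columns = [[], [], [], []]
    with open(os.path.join(path, file), "r") as f:
        for l in f:
            # split() with no arguments also drops the trailing spaces and newline
            splitted = l.split()
            if len(splitted) != len(columns):
                continue  # blank lines and header lines such as "Zeta: 0.050000"
            try:
                values = [float(item) for item in splitted]
            except ValueError:
                continue  # the "Period, SD, SV, SA" header row
            for counter, item in enumerate(values):
                columns[counter].append(item)
    # Keep only the SA column (the fourth one) from this file
    finalColumns.append(columns[3])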
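
The loop above leaves out the title. As a rough sketch, assuming the title always sits on a line starting with NGANo: and that every data row holds exactly four numbers, a small helper could return both the title and the SA values. The name parse_dat_lines is made up for this example, and the sample lines are copied from the file shown in the question.

```python
def parse_dat_lines(lines):
    """Return (title, sa_values) from the lines of one .DAT file.

    Assumes the title is on a line starting with "NGANo:" and that every
    data row holds exactly four whitespace-separated numbers.
    """
    title = None
    sa_values = []
    for line in lines:
        parts = line.split()  # no-argument split() ignores trailing spaces
        if not parts:
            continue  # blank line
        if parts[0] == "NGANo:" and len(parts) > 1:
            title = parts[1]
            continue
        if len(parts) != 4:
            continue  # other header lines such as "Zeta: 0.050000"
        try:
            row = [float(p) for p in parts]
        except ValueError:
            continue  # the "Period, SD, SV, SA" header row
        sa_values.append(row[3])  # SA is the fourth column
    return title, sa_values


# Sample lines taken from the file in the question (two trailing spaces kept).
sample = [
    "NGANo: 000a16d_1\n",
    "Zeta: 0.050000\n",
    "Ds5-95: 5.290000\n",
    "Comments:\n",
    "Period, SD, SV, SA\n",
    "0.010000 0.000433 0.013167 170.812839  \n",
    "0.020000 0.001749 0.071471 172.720229  \n",
]
title, sa = parse_dat_lines(sample)
print(title, sa)
```

An open file object can be passed in place of the sample list, since iterating over a file yields its lines one at a time.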

When adding to your other file, simply loop through finalColumns; each item is the list of SA values from one input file and should become a new column in your output.
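
As a sketch of that last step, assuming each file has already been reduced to a title and a list of SA values, the columns can be laid side by side and written out. The example below writes a CSV file, which Excel opens directly, using only the standard library. The function name write_columns, the output file name, and the example titles and values are placeholders, and shorter columns are padded with empty cells.

```python
import csv
import os
import tempfile
from itertools import zip_longest


def write_columns(out_path, titles, columns):
    """Write one column per file: the title on top, then its SA values."""
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(titles)
        # zip_longest turns the list of columns into rows and pads
        # shorter columns with empty strings.
        for row in zip_longest(*columns, fillvalue=""):
            writer.writerow(row)


# Made-up values standing in for two parsed files.
titles = ["file_a", "file_b"]
columns = [[170.812839, 172.720229, 176.055129], [1.5, 2.5]]

# A temporary directory is used so the example runs anywhere;
# replace it with the folder where the output should go.
out_path = os.path.join(tempfile.gettempdir(), "sa_columns.csv")
write_columns(out_path, titles, columns)
```

If a real .xlsx workbook is needed, the same titles and columns could instead be placed in a pandas DataFrame and saved with DataFrame.to_excel, which requires an Excel writer package such as openpyxl or xlsxwriter to be installed.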
