Reading CSV file with multiple rows containing header

Question

I have CSV files generated by an instrument. Each file contains multiple datasets, and each dataset is preceded by a 'condition' line, followed by the header and then the data. I want to read the file and turn the 'condition' into a column for the corresponding dataset. The output can be either a single file or one file per dataset. The condition, the headers, and the data are all separated by tabs in the CSV file.

I can't figure out how to even begin. I have a screenshot of the example input and desired output. Any insights or directions would be appreciated. Thank you!

Answer 1

Here is one possible solution:


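# Open the input file and split its contents on line breaks
with open('file.csv', 'r') as mfile:
    lines = mfile.read().split("\n")

# CAUTION: if your file uses ";" or a tab ("\t") instead of ",", change the delimiter here!
delimiter = ','
# Change this to the header line of your file
header = 'header1,header2'

condition = ''
new_lines = []
for line in lines:
    if not line.strip():
        # Skip blank lines (e.g. the one produced by a trailing newline)
        continue
    if len(line.split(delimiter)) == 1:
        # A line with a single field is a condition
        condition = line
    elif line == header:
        # Skip the header that is repeated before each dataset
        pass
    else:
        # Data line: append the current condition as an extra column
        new_lines.append(line + delimiter + condition)

with open('outfile.csv', 'w') as mfile:
    # The added column is left unnamed; write header + delimiter + 'condition' to label it
    mfile.write(header + '\n')
    for line in new_lines:
        mfile.write(line + '\n')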

I used this as the content of file.csv (input):

condidtion1
header1,header2
2,3
2,3
2,3
2,3
condidtion2
header1,header2
3,4
3,4
3,4
3,4
3,4
3,4

After running the code, outfile.csv looks like this (output):

header1,header2
2,3,condidtion1
2,3,condidtion1
2,3,condidtion1
2,3,condidtion1
3,4,condidtion2
3,4,condidtion2
3,4,condidtion2
3,4,condidtion2
3,4,condidtion2
3,4,condidtion2
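
The hard-coded header comparison above can be avoided if one assumes that the line immediately following each condition line is always that dataset's header. Below is a minimal sketch under that assumption. The function name `tag_rows` and its `delimiter` parameter are illustrative and not part of the original answer, and a condition line is assumed to contain exactly one field (no trailing delimiter).

```python
def tag_rows(lines, delimiter=","):
    """Return (header, rows) with the condition appended to every data row.

    Assumptions: a condition is a line with a single field, and the line
    right after each condition is the header for that dataset.
    """
    header = None
    rows = []
    condition = ""
    expect_header = False
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(delimiter)
        if len(fields) == 1:
            # Single field: start of a new dataset
            condition = fields[0]
            expect_header = True
        elif expect_header:
            # First multi-field line after a condition is the header;
            # keep only the first one and label the added column
            if header is None:
                header = fields + ["condition"]
            expect_header = False
        else:
            rows.append(fields + [condition])
    return header, rows


sample = ["condition1", "header1,header2", "2,3", "2,3",
          "condition2", "header1,header2", "3,4"]
header, rows = tag_rows(sample)
```

For a tab-separated file, the same function could be called as `tag_rows(lines, delimiter="\t")`, provided the condition lines carry no trailing tab.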

Answer 2

This should solve your issue:

import csv

# Read all lines of the tab-separated input file
with open('test.tsv', 'r') as file:
    lines = file.readlines()
# lines = ['Condition 1\t\n', 'Header 1\tHeader 2\n', '2\t3\n', '2\t3\n', '2\t3\n', 'Condition 2\t\n', 'Header 1\tHeader 2\n', '2\t3\n', '2\t3\n', '2\t3\n']
current_condition = ''
final_output = [['Header 1', 'Header 2', 'condition']]
for line in lines:
    row = line.rstrip().split('\t')
    if row == ['']:
        # Skip blank lines
        continue
    if len(row) == 1:
        # A single field marks the start of a new dataset
        current_condition = row[0]
    elif row != ['Header 1', 'Header 2']:
        # Data row: keep both values and add the current condition
        final_output.append([
            row[0],
            row[1],
            current_condition
        ])

# newline='' lets the csv module control the line endings
with open('output.csv', 'w', newline='') as fout:
    writer = csv.writer(fout)
    writer.writerows(final_output)
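
The question also allows one output file per dataset. A rough sketch of that variant follows, assuming tab-separated input in which each condition line has exactly one non-empty field, is followed by its header, and is unique within the file. The names `split_datasets` and `write_datasets` and the output file naming scheme are hypothetical choices made for illustration.

```python
import csv
import os


def split_datasets(lines, delimiter="\t"):
    """Group rows by condition: {condition: [header, row, row, ...]}."""
    datasets = {}
    current = None
    for line in lines:
        fields = line.rstrip("\r\n").split(delimiter)
        # Drop trailing empty fields left by trailing tabs
        # (assumes data rows never end in a genuinely empty cell)
        while fields and fields[-1] == "":
            fields.pop()
        if not fields:
            continue  # blank line
        if len(fields) == 1:
            # Single field: start of a new dataset
            current = fields[0]
            datasets[current] = []
        elif current is not None:
            # Header and data rows both go into the current dataset
            datasets[current].append(fields)
    return datasets


def write_datasets(datasets, out_dir="."):
    """Write each dataset to <out_dir>/<condition>.csv and return the paths."""
    paths = []
    for condition, rows in datasets.items():
        # Assumes the condition text is safe to use as a file name
        path = os.path.join(out_dir, condition.replace(" ", "_") + ".csv")
        with open(path, "w", newline="") as fout:
            csv.writer(fout).writerows(rows)
        paths.append(path)
    return paths


lines = ["Condition 1\t\n", "Header 1\tHeader 2\n", "2\t3\n",
         "Condition 2\t\n", "Header 1\tHeader 2\n", "3\t4\n"]
datasets = split_datasets(lines)
```

Calling `write_datasets(datasets, "some_directory")` would then create one CSV file per condition, each starting with its own header row.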

2 answers

solution1
0 2020-05-08 23:39:19

solution2
0 2020-05-09 02:29:27
