Text file converted to CSV with Python differs from the same file converted with Excel

I have a program that parses a huge file of data output from an eye tracker. The raw file comes to me in text format, but I need a CSV file to do data analysis on.

What I had been doing is opening the text file in Excel, saving it as a .csv file, and then running it through my parser. That works fine, but it's laborious, so I want to add a piece of code at the beginning of my parser that takes the raw text file, turns it into a CSV file, and then runs the parser on the newly created CSV file.

The code I am attempting to use is as follows (modified from here):

txt_file = subjectNum + ".asc"
csv_file = "subject_" + subjectNum + ".csv"
in_txt = csv.reader(open(txt_file, "r"), delimiter = '\t')
out_csv = csv.writer(open(csv_file, 'w'))
out_csv.writerows(in_txt)

This generates a file just fine, but the parser then fails to process it the same way as it does the "manually-generated" files that I get when doing the conversion through Excel. The parser does create the files, but they are empty.

Also, my source text file is 17.8 MB. When I convert it into CSV using Excel, the resulting file is 16 MB and contains 237,218 rows. When I use the code above to convert the text file into CSV, the resulting file is 17.8 MB and contains 236,104 rows.

It seems like I am missing something in the code above that happens when I convert manually with Excel.

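You need to close the file after writing to make sure it's written to disk entirely. Opening the files in a with statement does this automatically when the block exits.

Also, when using the csv module you should always open the files in binary mode (Python 2) or with newline="" (Python 3).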

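import csv

# Python 2: binary mode lets the csv module control line endings.
# Both files are closed (and flushed to disk) when the with block exits.
with open(txt_file, "rb") as infile, open(csv_file, "wb") as outfile:
    in_txt = csv.reader(infile, delimiter="\t")
    out_csv = csv.writer(outfile)
    out_csv.writerows(in_txt)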
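
For Python 3, the same idea could be sketched roughly as below. The function name tsv_to_csv and the returned row count are assumptions added for illustration, not part of the original code. The count is only there so the result can be compared against the row count of the Excel-generated file.

```python
import csv


def tsv_to_csv(txt_path, csv_path):
    """Copy a tab-delimited text file to a comma-delimited CSV file.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    rows_written = 0
    # Python 3: newline="" lets the csv module handle line endings itself.
    with open(txt_path, "r", newline="") as infile, \
            open(csv_path, "w", newline="") as outfile:
        reader = csv.reader(infile, delimiter="\t")
        writer = csv.writer(outfile)
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    # Both files are closed (and flushed to disk) once the with block exits.
    return rows_written
```

With the variable names from the question, it might be called as tsv_to_csv(subjectNum + ".asc", "subject_" + subjectNum + ".csv") before the parser opens the CSV file.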
