
How to turn a comma-separated value TXT into a CSV for machine learning

How do I turn this format of TXT file into a CSV file?

Date,Open,high,low,close  
1/1/2017,1,2,1,2  
1/2/2017,2,3,2,3  
1/3/2017,3,4,3,4  

As you can see, it already has comma-separated values. I tried using numpy.

>>> import numpy as np
>>> table = np.genfromtxt("171028 A.txt", comments="%")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\Smith\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\lib\npyio.py", line 1551, in genfromtxt
    fhd = iter(np.lib._datasource.open(fname, 'rb'))
  File "C:\Users\Smith\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\lib\_datasource.py", line 151, in open
    return ds.open(path, mode)
  File "C:\Users\Smith\AppData\Local\Continuum\anaconda3\lib\site-packages\numpy\lib\_datasource.py", line 501, in open
    raise IOError("%s not found." % path)
OSError: 171028 A.txt not found.

I have (S&P) 500 txt files to do this with.

You can use the csv module; more information is in its documentation.

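import csv

txt_file = 'mytext.txt'
csv_file = 'mycsv.csv'

# Open both files in a with block so they are closed (and flushed) on exit.
# newline='' lets the csv module handle line endings itself.
with open(txt_file, 'r', newline='') as src, open(csv_file, 'w', newline='') as dst:
    in_txt = csv.reader(src, delimiter=',')
    out_csv = csv.writer(dst)
    out_csv.writerows(in_txt)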
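
For the roughly 500 files mentioned in the question, the same csv approach could be looped over a folder. This is only a minimal sketch: the function name `convert_folder` and the idea that all the .txt files sit in one directory are assumptions, not something stated in the question. It also strips the trailing spaces that appear after the last column in the sample data.

```python
import csv
from pathlib import Path


def convert_folder(src_dir, dst_dir):
    """Write a .csv copy of every .txt file in src_dir into dst_dir.

    Hypothetical helper: returns the list of CSV paths it wrote.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for txt_path in sorted(src_dir.glob("*.txt")):
        csv_path = dst_dir / (txt_path.stem + ".csv")
        with open(txt_path, "r", newline="") as src, \
                open(csv_path, "w", newline="") as dst:
            reader = csv.reader(src, delimiter=",")
            writer = csv.writer(dst)
            for row in reader:
                # Remove stray whitespace around each cell, e.g. "2  " -> "2".
                writer.writerow([cell.strip() for cell in row])
        written.append(csv_path)
    return written
```

Calling `convert_folder("txt_files", "csv_files")` (both folder names are placeholders) would leave the originals untouched and write one CSV per input file.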

Per @dclarke's comment, check the directory from which you run the code. As you coded the call, the file must be in that directory. When I have it there, the code runs without error (although the resulting table is a single line with four nan values). When I move the file elsewhere, I reproduce your error quite nicely.

Either move the file into the directory you run the code from, add a local link to the file there, or change the file name in your program to use the proper path to the file (either relative or absolute).
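
To make the path problem easier to spot, the load could check the file first and report where Python actually looked. The sketch below is an illustration under assumptions: `load_prices` is a hypothetical helper name, and passing `delimiter=","` with `names=True` is one way to get real columns instead of the nan values mentioned above, not necessarily what the asker intended.

```python
from pathlib import Path

import numpy as np


def load_prices(path):
    """Load a comma-separated price file, failing early with a clear message."""
    path = Path(path)
    if not path.is_file():
        # Report the fully resolved path and the current working directory.
        raise FileNotFoundError(
            f"{path.resolve()} not found (current directory is {Path.cwd()})"
        )
    # delimiter="," splits each line on commas, names=True takes the field
    # names from the header row, and dtype=None lets NumPy infer each
    # column's type (the Date column stays text).
    return np.genfromtxt(
        str(path), delimiter=",", names=True, dtype=None, encoding="utf-8"
    )
```

The result is a structured array, so a column can be read by name, for example `load_prices("171028 A.txt")["close"]`.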
