I have data files that look like this:
ABE200501.dat
ABE200502.dat
ABE200503.dat
...
So I first combine these files into all.dat and do a little bit of cleanup:
fout=open("all.dat","w")
for year in range(2000,2017):
    for month in range(1,13):
        try:
            for line in open("ABE"+ str(year) +"%02d"%(month)+".dat"):
                fout.write(line.replace("[", " ").replace("]", " ").replace('"', " ").replace('`', " "))
        except:
            pass
fout.close()
Later I read the combined file with pandas:
df = pd.read_csv("all.dat", skipinitialspace=True, error_bad_lines=False, sep=' ',
                 names = ['stationID','time','vis','day_type','vis2','day_type2','dir','speed','dir_max','speed_max','visual_range', 'unknown'])
Is it possible to keep the combined file directly in RAM instead of writing it to my hard disk? That would save me a lot of unnecessary disk space.
Thanks!
The StringIO module lets you treat strings as files. (This is the Python 2 module; in Python 3 the equivalent class is io.StringIO.)
Example from the docs:
import StringIO
output = StringIO.StringIO()
output.write('First line.\n')
print >>output, 'Second line.'
# Retrieve file contents -- this will be
# 'First line.\nSecond line.\n'
contents = output.getvalue()
# Close object and discard memory buffer --
# .getvalue() will now raise an exception.
output.close()
For your own code:
import StringIO
import pandas as pd

fout = StringIO.StringIO()
# treat fout as a file handle like usual:
# parse the input files, writing each cleaned line to fout
contents = fout.getvalue() # the combined text as one string, a kind of virtual file
                           # that can be "opened" again by StringIO
fout.close()
# ...
# wrap the string in a new StringIO object so it can be read like a file
fin = StringIO.StringIO(contents)
df = pd.read_csv(fin, skipinitialspace=True, error_bad_lines=False, sep=' ',
                 names = ['stationID','time','vis','day_type','vis2','day_type2','dir','speed','dir_max','speed_max','visual_range', 'unknown'])
pandas accepts both pathname strings and file handles as input.
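For Python 3, the same idea might look roughly like the sketch below. It uses io.StringIO from the standard library; the helper names clean and combine_to_buffer, and the directory parameter, are made up for illustration and are not part of the original code. The file name pattern and the characters being replaced are taken from the question.

```python
import io
import os


def clean(line):
    """Replace brackets, double quotes and backticks with spaces."""
    for ch in '[]"`':
        line = line.replace(ch, " ")
    return line


def combine_to_buffer(prefix="ABE", years=range(2000, 2017), directory="."):
    """Concatenate the monthly .dat files into an in-memory text buffer.

    Hypothetical helper: nothing is written to disk.
    """
    buf = io.StringIO()
    for year in years:
        for month in range(1, 13):
            name = "%s%d%02d.dat" % (prefix, year, month)
            path = os.path.join(directory, name)
            try:
                with open(path) as fin:
                    for line in fin:
                        buf.write(clean(line))
            except FileNotFoundError:
                continue  # no file for this month, skip it
    buf.seek(0)  # rewind so the next reader starts at the beginning
    return buf
```

The returned buffer can then be passed to pd.read_csv in place of "all.dat", with the same keyword arguments as before. Rewinding with seek(0) avoids the extra getvalue() copy. Note that on recent pandas versions error_bad_lines=False has been replaced by on_bad_lines='skip'.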