I created a list of files like this:
merge_files = []
for i in range(2, 12):
    merge_files.append(pandas.read_csv(final_user_study_path + "/P" + str(i) + "/DataCollection/data/merge.csv"))
I want to create one giant CSV file containing the data from all the files in this list.
Is this the most efficient way to do this?
I recommend the Unix shell. If the files have no headers, or only the first one has a header:
cat file1.csv file2.csv ... fileN.csv > result.csv
If every file has a header, strip it from all but the first (replace N with the number of files):
cat file1.csv > result.csv
for i in {2..N}; do tail -n +2 "file$i.csv" >> result.csv; done
If the files are in different directories, use the path to each file:
cat path1/file.csv path2/file.csv > result.csv
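If a shell is not available, the same idea (copy the files through without parsing them, keeping only the first header) can be sketched in Python. This is a minimal sketch, not a tested replacement for the commands above: the helper name concat_csv_raw and the demo file names are made up for illustration, and it assumes every file has exactly one header line and ends with a newline.

```python
import shutil
import tempfile
from pathlib import Path


def concat_csv_raw(paths, out_path):
    """Append CSV files byte-for-byte, keeping only the first file's header."""
    with open(out_path, "wb") as out:
        for n, path in enumerate(paths):
            with open(path, "rb") as src:
                if n > 0:
                    src.readline()  # skip the header line of every file but the first
                shutil.copyfileobj(src, out)  # stream the rest without parsing it


# Small demonstration with two throwaway files (hypothetical names).
tmp = Path(tempfile.mkdtemp())
(tmp / "file1.csv").write_bytes(b"a,b\n1,2\n")
(tmp / "file2.csv").write_bytes(b"a,b\n3,4\n")
concat_csv_raw([tmp / "file1.csv", tmp / "file2.csv"], tmp / "result.csv")
print((tmp / "result.csv").read_bytes().decode())
```

Because nothing is parsed, this avoids building DataFrames in memory, but it also cannot detect files whose columns differ.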
The pandas way is to call concat on the DataFrames. This is useful if you also want to process the data along the way (filtering, removing duplicates, etc.):
import io
import pandas as pd
Let's create two in-memory files:
csv1 = "a,b\n1,2"
csv2 = "a,b\n3,4"
file1 = io.StringIO(csv1)
file2 = io.StringIO(csv2)
Loop over them and concat:
pd.concat((pd.read_csv(i) for i in [file1,file2])).to_csv(index=False)
Results in:
'a,b\n1,2\n3,4\n'
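To illustrate the point about doing extra operations, here is a small sketch that removes duplicate rows and filters after concatenating. The sample data and the filter condition (a > 1) are invented for the example; ignore_index=True is used so the combined frame gets a fresh 0..n-1 index instead of repeating each file's own row numbers.

```python
import io

import pandas as pd

# Two in-memory "files" that share one row (3,4).
file1 = io.StringIO("a,b\n1,2\n3,4")
file2 = io.StringIO("a,b\n3,4\n5,6")

# Concatenate, renumbering the rows from 0.
combined = pd.concat((pd.read_csv(f) for f in [file1, file2]), ignore_index=True)

# Drop the repeated row, then keep only rows where column "a" is greater than 1.
deduped = combined.drop_duplicates().reset_index(drop=True)
filtered = deduped[deduped["a"] > 1]

print(filtered.to_csv(index=False))
```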
Adapted for you in a readable way (my preferred way):
files = []
for i in range(2, 12):
    path = "{}/P{}/DataCollection/data/merge.csv".format(final_user_study_path, i)
    files.append(path)
pd.concat((pd.read_csv(i) for i in files)).to_csv("output.csv", index=False)
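Once everything is in one file, it may be hard to tell which participant a row came from. One possible variation, sketched here with in-memory data, tags each row with its source before concatenating. The helper name concat_with_source and the column name "participant" are assumptions for illustration, not part of the question.

```python
import io

import pandas as pd


def concat_with_source(sources):
    """Concatenate CSVs, recording which source each row came from.

    `sources` maps a label (e.g. "P2") to a path or file-like object.
    """
    frames = [
        # "participant" is a made-up column name for this sketch.
        pd.read_csv(src).assign(participant=label)
        for label, src in sources.items()
    ]
    return pd.concat(frames, ignore_index=True)


# Stand-ins for the real merge.csv files.
demo = {
    "P2": io.StringIO("a,b\n1,2"),
    "P3": io.StringIO("a,b\n3,4"),
}
merged = concat_with_source(demo)
print(merged)
```

With the real data, the dictionary could be built in the same loop that builds the paths, using "P" + str(i) as the label.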