I have several CSV files named file1, file2, file3, etc. They all have exactly the same layout; only the floats change:
filename, column1, column2, ... columnN
asdfasd.jpg 23.23, 21.24, 1e-06
ersdadfsd.jpg 223.23, 1.23, 1
assd.jpg 23.23, 1e-08, 232.1
...
I would like to get an identical-looking table in which every field contains the mean of that field across all files. How can this be done efficiently?
import pandas as pd

# list_of_file holds the paths of file1, file2, file3, ...
all_csv = []
for one_file in list_of_file:
    all_csv.append(pd.read_csv(one_file))
df = pd.concat(all_csv).groupby('filename').mean()
should do what you want.
As an example, with two CSV files:
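To get a table that looks like the inputs, it may help to keep `filename` as an ordinary column and preserve the original row order. The following is only a sketch: `mean_of_csvs` is a hypothetical helper name, the in-memory strings stand in for real files with made-up values, and `skipinitialspace=True` assumes the files put a space after each comma as in the sample above.

```python
import io
import pandas as pd

def mean_of_csvs(sources, key="filename"):
    """Average every numeric column across several same-shaped CSV files."""
    # skipinitialspace=True drops the blank that follows each comma
    frames = [pd.read_csv(src, skipinitialspace=True) for src in sources]
    # sort=False keeps the keys in order of first appearance;
    # as_index=False keeps the key as a regular column
    return pd.concat(frames).groupby(key, sort=False, as_index=False).mean()

# Two in-memory stand-ins for file1 and file2 (made-up values)
file1 = io.StringIO("filename, column1, column2\nb.jpg, 1.0, 10.0\na.jpg, 3.0, 30.0\n")
file2 = io.StringIO("filename, column1, column2\nb.jpg, 3.0, 20.0\na.jpg, 5.0, 50.0\n")

result = mean_of_csvs([file1, file2])
print(result)
```

Passing real paths instead of `io.StringIO` objects works the same way, since `pd.read_csv` accepts both.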
>>> df1 = pd.DataFrame({'name': ['a', 'b'], 'v1': [1, 2,], 'v2': [3, 4]}) # your first csv
>>> df2 = pd.DataFrame({'name': ['a', 'b'], 'v1': [5, 6,], 'v2': [7, 8]}) # your second csv
>>> df3 = pd.concat([df1, df2])
>>> df3
  name  v1  v2
0    a   1   3
1    b   2   4
0    a   5   7
1    b   6   8
>>> df3.groupby('name').mean()
# groupby('name') builds one sub-DataFrame per distinct name (a and b);
# mean() then computes the mean of each sub-DataFrame column by column.
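For reference, the two-frame example can be run end to end as a script. This sketch reuses the same `df1` and `df2`; the variable names `means` and `table` are introduced here only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'name': ['a', 'b'], 'v1': [1, 2], 'v2': [3, 4]})
df2 = pd.DataFrame({'name': ['a', 'b'], 'v1': [5, 6], 'v2': [7, 8]})

means = pd.concat([df1, df2]).groupby('name').mean()
print(means)
# 'a': v1 = (1 + 5) / 2 = 3.0, v2 = (3 + 7) / 2 = 5.0
# 'b': v1 = (2 + 6) / 2 = 4.0, v2 = (4 + 8) / 2 = 6.0

# reset_index() turns the 'name' index back into an ordinary column
table = means.reset_index()
print(table)
```

After `groupby`, `name` becomes the index of the result, which is why `reset_index()` is needed if the output should have the same columns as the inputs.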