Concatenate multiple .csv dataframes with multiindex
I am concatenating multiple dfs that look like this:
                      X                  Y
                   mean  std size      mean  std size
In_X
(10.424, 10.43]  10.425  NaN    1  0.003786  NaN    1
(10.43, 10.435]    10.4  NaN    0       NaN  NaN    0
When I didn't have multiindex dfs, I was using:
import glob
import pandas as pd

extension = 'csv'
all_filenames = glob.glob('*.{}'.format(extension))
all_dfs = pd.concat([pd.read_csv(f) for f in all_filenames])
But this introduces a row:
mean std size mean std size
every time a new df is concatenated to all_dfs. How can I keep only the original multiindex header and avoid introducing the second-level header as a row in the concatenated df?
By default, read_csv takes only the first row as the header. You want to specify a two-row header with the header parameter:
all_dfs = pd.concat([pd.read_csv(f, header=[0, 1]) for f in all_filenames])
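As a minimal sketch of the two-row header approach, the following writes two small frames to a temporary directory and reads them back. The file names and the temporary directory are hypothetical, and index_col=0 is an assumption that is not in the answer above: it is included on the guess that the files were written with DataFrame.to_csv, so that the In_X line is read as the index name rather than as a data row.

```python
import glob
import os
import tempfile

import pandas as pd

# Sample frames shaped like the ones in the question (values taken from it).
columns = pd.MultiIndex.from_product([["X", "Y"], ["mean", "std", "size"]])
nan = float("nan")
first = pd.DataFrame(
    [[10.425, nan, 1, 0.003786, nan, 1]],
    index=pd.Index(["(10.424, 10.43]"], name="In_X"),
    columns=columns,
)
second = pd.DataFrame(
    [[10.4, nan, 0, nan, nan, 0]],
    index=pd.Index(["(10.43, 10.435]"], name="In_X"),
    columns=columns,
)

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file names; each CSV starts with two header rows.
    first.to_csv(os.path.join(tmpdir, "part1.csv"))
    second.to_csv(os.path.join(tmpdir, "part2.csv"))

    all_filenames = sorted(glob.glob(os.path.join(tmpdir, "*.csv")))
    # header=[0, 1] rebuilds the two-level columns;
    # index_col=0 (an assumption) restores In_X as the index.
    all_dfs = pd.concat(
        [pd.read_csv(f, header=[0, 1], index_col=0) for f in all_filenames]
    )

print(all_dfs)
```

If the files have no index column, dropping index_col=0 should leave the rest of the sketch unchanged.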
Convert your multi-index columns to regular columns like this:
df.columns = df.columns.map('_'.join)
And then use pd.concat.
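A rough sketch of this flattening approach, using in-memory frames in place of the CSV files. The frame names df_a and df_b and the use of ignore_index=True are assumptions made for illustration only.

```python
import pandas as pd

# Two hypothetical frames with the same two-level columns as in the question.
columns = pd.MultiIndex.from_product([["X", "Y"], ["mean", "std", "size"]])
nan = float("nan")
df_a = pd.DataFrame([[10.425, nan, 1, 0.003786, nan, 1]], columns=columns)
df_b = pd.DataFrame([[10.4, nan, 0, nan, nan, 0]], columns=columns)

flattened = []
for df in (df_a, df_b):
    df = df.copy()
    # Join each (level0, level1) tuple into a single name such as "X_mean".
    df.columns = df.columns.map("_".join)
    flattened.append(df)

# ignore_index=True is an illustrative choice; drop it to keep the original index.
all_dfs = pd.concat(flattened, ignore_index=True)
print(list(all_dfs.columns))
# ['X_mean', 'X_std', 'X_size', 'Y_mean', 'Y_std', 'Y_size']
```

The trade-off is that the result has single-level columns, so selecting everything under X needs a name filter rather than all_dfs["X"].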