DataFrame column format

Question

I have several datasets that I concatenate into one master file with the code below.

import pandas as pd

# filenames is the list of Excel file paths (defined elsewhere)
dfs = (pd.read_excel(f, index_col=0, skiprows=8, skipfooter=1, usecols=[0, 1])
       for f in filenames)
all_data = pd.concat(dfs, axis=1)

The data in each file looks like this (attached): ID VALUE

The concatenated master file now looks like this:

ID value  value value value value value value value

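However, we would like to rename each value column in the master file to the name of the file it came from, for example:

Essentially, we want to replace the default column name (value) in the DataFrame with the file name, one column per file. Please advise.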

Answer 1

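You can change the column names simply by assigning a new list to the df.columns attribute. Because ID was read in as the index (index_col=0), it is not one of the columns, so in this case your list would look like [file1, file2, ...].

Since the file names are contained in the paths in the filenames list, we can extract just the file name from each path and build a new list from those.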
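As a minimal sketch of that assignment, the snippet below builds a small stand-in for the concatenated frame; the data and the names file1 and file2 are made up for illustration. It assumes ID is the index, as it would be with index_col=0 in the question, so the new list names only the value columns.

```python
import pandas as pd

# Stand-in for one file's data: ID is the index, "value" is the only column.
df = pd.DataFrame({"value": [1, 2]}, index=pd.Index([10, 20], name="ID"))

# Concatenating side by side produces duplicate "value" columns.
wide = pd.concat([df, df], axis=1)

# Assign one new name per column; the list length must match the column count.
wide.columns = ["file1", "file2"]
print(wide)
```

If the list has a different length than the number of columns, pandas raises a ValueError, which is a quick check that every file got a name.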

from pathlib import PureWindowsPath

columns = []
for path in filenames:  # loop through the list of file paths
    # PureWindowsPath splits on '\' (and '/'); .stem is the file name
    # without its extension, so both .xls and .xlsx are handled
    columns.append(PureWindowsPath(path).stem)
# all_data.columns holds the column names; assigning the new list renames them
all_data.columns = columns
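An alternative, sketched here under assumptions, is to name each column before concatenating rather than afterwards. The helper names column_name and combine and the sample paths are hypothetical, and small in-memory frames stand in for the results of pd.read_excel so the example runs without any Excel files.

```python
from pathlib import PureWindowsPath
from typing import Mapping

import pandas as pd


def column_name(path: str) -> str:
    """Return the file name without directory or extension (Windows-style paths)."""
    return PureWindowsPath(path).stem


def combine(frames_by_path: Mapping[str, pd.DataFrame]) -> pd.DataFrame:
    """Take the single value column of each frame, name it after its file, and join."""
    series = [
        df.iloc[:, 0].rename(column_name(path))
        for path, df in frames_by_path.items()
    ]
    # Concatenating named Series along axis=1 uses each Series name as the column name.
    return pd.concat(series, axis=1)


# Hypothetical stand-ins for what pd.read_excel(..., index_col=0) would return.
ids = pd.Index(["a", "b"], name="ID")
frames = {
    r"C:\data\file1.xls": pd.DataFrame({"value": [1.0, 2.0]}, index=ids),
    r"C:\data\file2.xlsx": pd.DataFrame({"value": [3.0, 4.0]}, index=ids),
}

all_data = combine(frames)
print(all_data)
```

Naming each column at the point where its file is read keeps the name tied to the data, so the result does not depend on the order of a separate list matching the order of the columns.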

Answer 1: 0 · Accepted · 2021-06-22 21:36:39
