DataFrame column format

Question

I have multiple datasets that are concatenated into one master file using the code below.

import pandas as pd

# filenames is a list of paths to the Excel files
dfs = (pd.read_excel(f, index_col=0, skiprows=8, skipfooter=1, usecols=[0, 1])
       for f in filenames)
all_data = pd.concat(dfs, axis=1)

The data in each file looks like this (attached): ID VALUE

The concatenated master file currently looks like this:

ID value  value value value value value value value

However, we want to rename each value column in the master file to the name of the file it came from, for example:

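Basically, replace the default column name (value) in the DataFrame with the file name, so that each file has its own named column. Please advise.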

Answer 1

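You can change the column names simply by assigning a different list to the df.columns attribute. In this case the list would look like [file1, file2, ...]. ID is not part of it, because index_col=0 makes ID the index rather than a column.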

Since the file names are contained in the paths in the filenames list, we can extract just the file names and build a new list from them.

import os

columns = []
for path in filenames:  # loop through your filenames list
    # Split the path on '\' and keep the last element, the file name.
    # The backslash is doubled because it is the escape character in Python strings.
    file = path.split('\\')[-1]
    # Drop the extension (.xls here); splitext also handles longer ones such as .xlsx.
    file = os.path.splitext(file)[0]
    columns.append(file)
# all_data.columns holds the column names; assigning our list renames the columns.
all_data.columns = columns
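
For anyone who wants to try the renaming without the Excel files at hand, here is a minimal self-contained sketch of the same idea. The paths, the values and the helper name column_names_from_paths are hypothetical and only for illustration, and the two in-memory frames stand in for what pd.read_excel(..., index_col=0) would return. It assumes Windows-style paths, like the answer above, and uses PureWindowsPath from the standard library to take the file name without its extension.

```python
from pathlib import PureWindowsPath

import pandas as pd


def column_names_from_paths(paths):
    """Return each path's file name without directory or extension."""
    # PureWindowsPath parses both '\\' and '/' separators on any OS.
    return [PureWindowsPath(p).stem for p in paths]


# Hypothetical file paths, for illustration only.
filenames = [r"C:\data\north.xls", r"C:\data\south.xlsx"]

# Stand-ins for the frames read from Excel: ID is the index,
# and each frame has a single column called "value".
frames = [
    pd.DataFrame({"value": [1, 2]}, index=pd.Index([10, 11], name="ID")),
    pd.DataFrame({"value": [3, 4]}, index=pd.Index([10, 11], name="ID")),
]

# Concatenate side by side, then replace the repeated "value" labels
# with one name per file, in the same order as filenames.
all_data = pd.concat(frames, axis=1)
all_data.columns = column_names_from_paths(filenames)
print(all_data)
```

This only works if the list of names has exactly one entry per column and follows the same order as the frames passed to pd.concat, which is why both are derived from the same filenames list.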
