Data frame column formatting

I have multiple data sets that are concatenated into one master file using the code below.

import pandas as pd

# Read the ID and value columns of each workbook, then join the results side by side.
dfs = (pd.read_excel(f, index_col=0, skiprows=8, skipfooter=1, usecols=[0, 1])
       for f in filenames)
all_data = pd.concat(dfs, axis=1)

The data in each file looks like this (attached): ID VALUE

The concatenated master file now looks like this:

ID value  value value value value value value value 

However, we would like to rename each value column in the master file to the name of the file it came from, such as:


Basically, we want to replace the default column name (value) in the data frame with the file names, one column per file. Please guide.

You can change column names by simply setting the df.columns attribute to a new list. Because ID is read with index_col=0, it becomes the index rather than a column, so in this case the list only needs one name per file: [file1, file2, ...]
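A minimal sketch of that assignment, using two small in-memory frames in place of the Excel files. The IDs, the values, and the names file1 and file2 are made up for illustration:

```python
import pandas as pd

# Stand-ins for two files read with index_col=0:
# ID is the index and "value" is the only column.
a = pd.DataFrame({"value": [1.0, 2.0]}, index=pd.Index([101, 102], name="ID"))
b = pd.DataFrame({"value": [3.0, 4.0]}, index=pd.Index([101, 102], name="ID"))

demo = pd.concat([a, b], axis=1)    # columns are now ["value", "value"]
demo.columns = ["file1", "file2"]   # one new name per column, in order
print(demo)
```

The list must have exactly as many entries as the frame has columns; otherwise pandas raises a ValueError.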

Since those filenames are part of each path in your filenames list, we can pull out just the filename from each path and build a new list from them.

columns = []
for path in filenames:  # loop through your filenames list
    # Split the path on the backslash and keep the last element, i.e. the filename.
    # The backslash is doubled because a single backslash is the escape character in a Python string literal.
    file = path.split('\\')[-1]
    file = file[:-4]  # the filename ends in .xls; this removes those last 4 characters
    columns.append(file)
# all_data.columns holds the dataframe's column names; assigning our list to it renames the columns.
all_data.columns = columns
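The slicing above assumes Windows-style paths and a four-character extension such as .xls. As a hedged alternative, the sketch below derives the names with the standard library's pathlib instead. The helper name column_names_from_paths is hypothetical, and the sample paths are invented for illustration:

```python
from pathlib import PureWindowsPath


def column_names_from_paths(paths):
    """Return each path's file name without its directory or extension."""
    # PureWindowsPath accepts both backslash and forward-slash separators,
    # and .stem drops the final suffix whatever its length (.xls, .xlsx, ...).
    return [PureWindowsPath(p).stem for p in paths]


sample_paths = [r"C:\data\file1.xls", "reports/file2.xlsx"]  # made-up paths
names = column_names_from_paths(sample_paths)
print(names)
```

With the variables from the question, this would be used as all_data.columns = column_names_from_paths(filenames), which relies on pd.concat keeping the columns in the same order as filenames.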

