Add a column with update time to a data frame

I am new to Python and I am experimenting with the pandas library. I have a dataframe called COMBO that combines several Excel spreadsheets into one master spreadsheet. I want to add a new column at the end showing the date and time at which I combined the spreadsheets, but I am having a hard time making it work. I get an error: TypeError: cannot concatenate a non-NDFrame object. Here is what I have tried so far:

import pandas as pd
import datetime


df1= pd.read_excel(r'W:\sheets\sh1.xlsx')
df2= pd.read_excel(r'W:\sheets\sh2.xlsx')
df3= pd.read_excel(r'W:\sheets\sh3.xlsx')
df4= pd.read_excel(r'W:\sheets\sh4.xlsx')
df5= pd.read_excel(r'W:\sheets\sh5.xlsx')
df6= pd.read_excel(r'W:\sheets\sh6.xlsx')


combo = pd.concat([df1,df2,df3,df4,df5,df6])

now = datetime.datetime.now()  #defines NOW
ts = str(now) #converts it into string
timestamp = pd.DataFrame([]) #opens an empty dataframe

for row in combo.iterrows():
    timestamp.append(ts)

master = pd.concat([combo,timestamp],axis=1)

master.to_excel(r'W:\sheets\mastersheet.xlsx',index = False)

Basically, I concatenate the sheets first, then get the date and time, then create a new, empty dataframe called TIMESTAMP. Then, for every row in COMBO, I append the date and time to the empty dataframe TIMESTAMP (so that I end up with a column that has the same number of rows as COMBO, all with the same timestamp). At the end, I concatenate TIMESTAMP to COMBO to generate the MASTER sheet.

I am not sure whether this approach is the correct one, but it is not working. Any help would be appreciated.

Thanks

Given your expected output, it may be simpler to add the timestamp column directly, without creating another dataframe. Assigning a single value to a new column fills every row with that value:

import datetime
import pandas as pd

combo = pd.concat(pd.read_excel(r'W:\sheets\sh%d.xlsx' % i) for i in range(1, 7))
combo['timestamp'] = str(datetime.datetime.now())  # the scalar is repeated on every row
combo.to_excel(r'W:\sheets\mastersheet.xlsx', index=False)
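
To try the idea without the Excel files, here is a minimal sketch that uses two small in-memory dataframes as stand-ins for the sheets. The column names `id` and `value` are made up for illustration, and `ignore_index=True` is an optional extra that is not part of the answer above; it only renumbers the rows after concatenation.

```python
import datetime

import pandas as pd

# Hypothetical stand-ins for the frames that pd.read_excel would return.
frames = [
    pd.DataFrame({"id": [1, 2], "value": ["a", "b"]}),
    pd.DataFrame({"id": [3], "value": ["c"]}),
]

# ignore_index=True renumbers the rows 0..n-1 instead of keeping 0, 1, 0.
combo = pd.concat(frames, ignore_index=True)

# Assigning a scalar fills every row of the new column with the same value.
combo["timestamp"] = str(datetime.datetime.now())

print(combo)
```

Every row ends up with the same timestamp string, so no loop over the rows and no second dataframe is needed.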
