AttributeError: 'pywintypes.datetime' object has no attribute 'nanosecond'

Question

I have some code to open an excel file and save it as a pandas dataframe, it was originally used in Python 2.7 and I am currently trying to make it work under Python 3.

Originally, I used the code in @myidealab from this other post: From password-protected Excel file to pandas DataFrame . It currently looks like this:

data_file = <path_for_file>
# Load excel file
xlApp = win32com.client.Dispatch("Excel.Application")
xlApp.Visible = False
pswd = getpass.getpass('password: ')
xldatabase = xlApp.Workbooks.Open(data_file, False, True, None, pswd)

dfdatabase = [] 
for sh in xldatabase.Sheets:
    xlsheet = xldatabase.Worksheets(sh.Name)

    # Get last_row
    row_num = 0
    cell_val = ''
    while cell_val != None:
        row_num += 1
        cell_val = xlsheet.Cells(row_num, 1).Value
    last_row = row_num - 1

    # Get last_column
    col_num = 0
    cell_val = '' 
    while cell_val != None:
        col_num += 1
        cell_val = xlsheet.Cells(1, col_num).Value
    last_col = col_num - 1

# Get content
content = xlsheet.Range(xlsheet.Cells(1, 1), xlsheet.Cells(last_row, last_col)).Value

# Load each sheet as a dataframe
dfdatabase.append(pd.DataFrame(list(content[1:]), columns=content[0]))

Now, I am getting the following error:

AttributeError: 'pywintypes.datetime' object has no attribute 'nanosecond'

The problem seems to boil down to the lines bellow:

# Get content
content = xlsheet.Range(xlsheet.Cells(1, 1), xlsheet.Cells(last_row, last_col)).Value

# Load each sheet as a dataframe
dfdatabase.append(pd.DataFrame(list(content[1:]), columns=content[0]))

The xlsheet.Range().Value is reading the data and assigning pywintymes descriptors to the data, which pd.DataFrame() fails to interpret.

Did anyone ran into this issue before? Is there a way that I can specifically tell xlsheet.Range().Value how to read the values in a way that pandas can interpret?

Any help will be welcome! Thank you.

Answer 1

This solves the issue, assuming you know beforehand the size/formatting of your dates/times in the excel sheet. Might be there are other more elegant ways to solve it, nonetheless.

Note: content is initially a tuple. Position [0] is the array containing the headers and the remaining positions contain the data.

import datetime
import pywintypes

...

content = xlsheet.Range(xlsheet.Cells(1, 1), xlsheet.Cells(last_row, last_col)).Value
head = content[0]
data = list(content[1:])

for x in range(0,len(data)):
    data[x] = list(data[x])
    for y in range(0,len(data[x])):
        if isinstance(data[x][y], pywintypes.TimeType):
            temp = str(data[x][y]).rstrip("+00:00").strip()
            if len(temp)>10:
                data[x][y] = datetime.datetime.strptime(temp, "%Y-%m-%d%H:%M")
            elif len(temp)>5 and len(temp)<=10:
                data[x][y] = datetime.datetime.strptime(temp, "%Y-%m-%d")
            elif len(temp)<=5:
                data[x][y] = datetime.datetime.strptime(temp, "%H:%M")
            print(data[x][y])

# Load each sheet as a dataframe
dfdatabase.append(pd.DataFrame(data, columns=head))

Used this as references: python-convert-pywintyptes-datetime-to-datetime-datetime

AttributeError: 'pywintypes.datetime' object has no attribute 'nanosecond'

Question

1 answers

solution1
0 ACCPTED 2019-08-08 11:51:03

AttributeError: 'pywintypes.datetime' object has no attribute 'nanosecond'

Question

1 answers

solution1 0 ACCPTED 2019-08-08 11:51:03

solution1
0 ACCPTED 2019-08-08 11:51:03