
csv module writes time as decimal

I am running into an issue with some data in an .xls file (example below).

  A            B           C         D         E        F
John Smith     8:00AM      9:00AM    10:00AM    5:00PM  8.00

When I write it to a CSV file using the Python csv module, it comes out as

John,Smith,0.333333333,0.375,0.416666667,0.708333333,0.333333333

Now the interesting part: if I manually save the .xls file as an MS-DOS CSV, I get the desired output of

John,Smith,8:00 AM,9:00 AM,10:00 AM,5:00 PM,8:00

Here is the function I am running. It's a bit messy, so I apologize in advance.

def csv_gen(filepath, saveto):
    for files in glob.glob("*.xls"):
        shutil.copy(filepath + "\\" + files, saveto)
        with xlrd.open_workbook(files) as wb:
            sh = wb.sheet_by_index(0)
            newfile = saveto + files[:-4] + '.csv'
            now = datetime.datetime.now()
            dates = now.strftime("%m-%d-%Y")
            filestart = [saveto + files]
            time = [dates]
            with open(newfile, 'wb') as f:
                c = csv.writer(f,delimiter=',')
                list =  range(sh.nrows)
                last = range(sh.nrows)[-1]
                list.remove(0)
                list.remove(3)
                list.remove(2)
                list.remove(1)
                list.remove(last)
                #Iterate through data and show values of the rows
                for r in list:
                    lines = sh.row_values(r)
                    del lines[:4]
                    stuff = lines + filestart + time
                    #Remove blanks so csv doesn't have unneeded data
                    if lines[0] == '':
                        del stuff[:]
                    #Write to csv file with new data
                    if any(field.strip() for field in stuff):
                        c.writerow(stuff)
            shutil.move(newfile, mergeloc)

I don't understand why the output comes out this way. I have tried setting the csv writer's dialect to 'excel', but the output is still the same.

Update:

If I save the document as a CSV with workBook.SaveAs(test.csv, 24), where 24 is the code for MS-DOS CSV, I get the desired output of

John,Smith,8:00 AM,9:00 AM,10:00 AM,5:00 PM,8:00

But when the csv module picks it up, removes some blank rows, deletes a few things at the end, and writes the rows out, I get the decimals again:

John,Smith,0.333333333,0.375,0.416666667,0.708333333,0.333333333

The purpose of the csv module here is to modify rows and delete blank rows.

Update

 for r in list:
     cells = sh.row_values(r)
     csv_row = [cells[0]]
     for col_value in cells[1:]:
         csv_row.append(datetime.time(*xlrd.xldate_as_tuple(col_value, 0)[3:]))

I used row_values so that I get just the value of each cell rather than something like xldate:0.33333. I then added a * so that the tuple is unpacked into positional arguments.

That doesn't look like a problem in the csv module to me; it looks like something is going wrong when reading the .xls file.
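As a quick check of that claim, here is a minimal sketch (written for Python 3, with a hypothetical helper name `write_row`) showing that csv.writer writes whatever value it is handed: a float goes out as a float, and a string goes out unchanged. If decimals appear in the output, they were already floats before the writer saw them.

```python
import csv
import io

def write_row(row):
    """Write one row with csv.writer and return the resulting text."""
    buf = io.StringIO()
    csv.writer(buf).writerow(row)
    return buf.getvalue()

# The float is written as a float; the time string passes through untouched.
line = write_row(["John", "Smith", 0.375, "9:00 AM"])
print(repr(line))
```

So the conversion has to happen on the values read from the workbook, before they reach writerow.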

According to the xlrd docs, dates in Excel worksheets are a pretty awful mess:

Dates in Excel spreadsheets

In reality, there are no such things. What you have are floating point numbers and pious hope. There are several problems with Excel dates:

I did a quick test with a new .xls file containing the contents you provided. Python has no problem reading the file. I don't have Excel on my machine, so I made the file in LibreOffice and saved it as .xls. In that test, though, the fields come out as unicode strings on the Python side.

You should be able to use xlrd.xldate_as_tuple(xldate, datemode) to convert the float into a Python date tuple. Doing

print xlrd.xldate_as_tuple(0.333333333,0)

prints out

(0, 0, 0, 8, 0, 0)
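The tuple above follows from how Excel stores a time of day: as a fraction of a 24-hour day, so 0.333333333 is one third of a day, or 8:00. The sketch below reproduces that arithmetic with the standard library only (Python 3). The helper name `fraction_to_time` is made up for illustration and is not part of xlrd.

```python
import datetime

def fraction_to_time(day_fraction):
    """Convert a fraction of a 24-hour day to a datetime.time."""
    # Round to whole seconds so 0.333333333 lands on 8:00:00, not 7:59:59.
    total_seconds = int(round(day_fraction * 86400)) % 86400
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return datetime.time(hours, minutes, seconds)

for value in (0.333333333, 0.375, 0.416666667, 0.708333333):
    print(value, "->", fraction_to_time(value))
```

These are the four decimals from the question, and they map back to 8:00, 9:00, 10:00 and 17:00.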

UPDATE

So you probably want something like the following, changing the for loop that goes over your rows:

...
for r_idx in list:
    cells = sh.row(r_idx)
    csv_row = [cells[0].value] # the first row value should be ok as just a string
    for cell in cells[1:]:
        # add the date time column values to the converted csv row
        # (time fields only; * unpacks the tuple into positional arguments)
        csv_row.append( datetime.time(*xlrd.xldate_as_tuple(cell.value, 0)[3:]) )
    ...
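To get text like "8:00 AM" in the CSV rather than a time object, one option is to format each value as a string before writing the row. The following is a minimal standard-library sketch (Python 3), not the asker's actual pipeline. The helpers `format_excel_time` and `convert_row` are hypothetical names, and the sketch assumes that every float in a row is a time of day, which will not hold for sheets that also contain ordinary numbers.

```python
import csv
import io

def format_excel_time(day_fraction):
    """Format a fraction of a day as a 12-hour clock string, e.g. '8:00 AM'."""
    total_minutes = int(round(day_fraction * 24 * 60)) % (24 * 60)
    hour24, minute = divmod(total_minutes, 60)
    suffix = "AM" if hour24 < 12 else "PM"
    hour12 = hour24 % 12 or 12  # 0 and 12 both display as 12
    return "%d:%02d %s" % (hour12, minute, suffix)

def convert_row(cells):
    """Assumption: every float in the row is a time; everything else passes through."""
    return [format_excel_time(c) if isinstance(c, float) else c for c in cells]

row = ["John", "Smith", 0.333333333, 0.375, 0.416666667, 0.708333333]
buf = io.StringIO()
csv.writer(buf).writerow(convert_row(row))
print(buf.getvalue())
```

With a real workbook it would be safer to check each cell's type (xlrd marks date cells with xlrd.XL_CELL_DATE) than to treat every float as a time.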


 