
Converting txt to xlsx while setting the cell property for number cells as number

Related question: Error in converting txt to xlsx using python

I have the following code, which I revised thanks to Anand S Kumar.

import csv
import openpyxl

import sys


def convert(input_path, output_path):
    """
    Read a csv file (with no quoting), and save its contents in an excel file.
    """
    wb = openpyxl.Workbook()
    ws = wb.worksheets[0]

    with open(input_path) as f:
        reader = csv.reader(f, delimiter='\t', quoting=csv.QUOTE_NONE)
        for row_index, row in enumerate(reader, 1):
            for col_index, value in enumerate(row, 1):
                ws.cell(row=row_index, column=col_index).value = value

    wb.save(output_path)


def main():
    try:
        input_path, output_path = sys.argv[1:]
    except ValueError:
        print('Usage: python %s input_path output_path' % (sys.argv[0],))
    else:
        convert(input_path, output_path)


if __name__ == '__main__':
    main()

The problem is that this writes the xlsx file with purely numeric cells stored as plain text.

So I had to open the xlsx file manually in MS Excel and then click "Convert to number".

Can this code convert txt to xlsx in a way that automatically sets the cell property to number when the cell is purely numeric?

I think the issue is that when you read data using the csv module, every value is read in as a string. Example:

a.csv looks like this:

1,2,3
3,4,5
4,5,6

Code and result:

>>> import csv
>>> with open('a.csv','r') as f:
...     reader = csv.reader(f)
...     for row in reader:
...             print(row)
...
['1', '2', '3']
['3', '4', '5']
['4', '5', '6']

And in your particular code, you assign the value returned by the csv module directly to the openpyxl cell, so the cells end up holding strings instead of numbers.

The best solution here, if you know which columns are expected to hold integers, is to add a check in your code that converts those values to int before writing them to Excel. Example:

int_cols = {2, 4, 5}  # All columns, 1-indexed, that contain integers.
with open(input_path) as f:
    reader = csv.reader(f, delimiter='\t', quoting=csv.QUOTE_NONE)
    for row_index, row in enumerate(reader, 1):
        for col_index, value in enumerate(row, 1):
            if col_index in int_cols:
                ws.cell(row=row_index, column=col_index).value = int(value)
            else:
                ws.cell(row=row_index, column=col_index).value = value

If there are floats, you can use similar logic for them: define a set of float columns, and if col_index is in that set, convert the value to float before saving.
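
One way to handle integer and float columns together is a mapping from column index to conversion function. The following is only a minimal sketch: the column numbers are made up, and `converters` and `convert_row` are hypothetical names, not part of csv or openpyxl.

```python
# Hypothetical mapping: 1-indexed column number -> conversion function.
# The column numbers are assumptions; adjust them to your own file.
converters = {2: int, 3: float}


def convert_row(row, converters):
    """Return a copy of row with the configured columns converted."""
    return [
        # Columns without an entry keep their string value.
        converters.get(col_index, str)(value)
        for col_index, value in enumerate(row, 1)
    ]


print(convert_row(['abc', '42', '3.5'], converters))
# ['abc', 42, 3.5]
```

Each converted value can then be assigned to ws.cell(...).value in the same loop as before.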


If by the line

Can this code convert txt to xlsx in a way that automatically sets the cell property as number, if the cell is purely number?

you mean that you want to store a number in every cell that contains only digits (not even decimals), then you can use a function like the one below:

def int_or_str(x):
    try:
        return int(x)
    except ValueError:
        return x

Then in your code, change the line that sets the value to:

ws.cell(row=row_index, column=col_index).value = int_or_str(value)

Use float() instead of int() in the above function if you want to convert floats as well.
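
Note that using float() alone would also turn integer strings into floats ('12' becomes 12.0). A possible variation, assuming you want integers to stay integers, tries int first and then float; `to_number` is a hypothetical name chosen for this sketch.

```python
def to_number(x):
    """Return x as int if possible, else as float, else unchanged."""
    for cast in (int, float):
        try:
            return cast(x)
        except ValueError:
            continue
    return x


print([to_number(v) for v in ['12', '3.14', 'abc', '']])
# [12, 3.14, 'abc', '']
```

Keep in mind that float() also accepts strings such as 'nan', 'inf' and '1e5', so this may convert more cells than intended if your text columns can contain such values.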

There are two things that may be causing your issue:

  1. You can/should convert your value from CSV to int or float like this:

     ws.cell(row=row_index, column=col_index).value = int(value)  # or float(value)

  2. Your csv.reader settings are restrictive; you should make sure that your file really uses tabs as the delimiter and that its values are really not quoted.
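
The second point can be checked without involving Excel. The sketch below uses a made-up in-memory sample line to show that with csv.QUOTE_NONE the quote characters remain part of the value, which would make a later int() call fail.

```python
import csv
import io

# Made-up sample: one tab-delimited line whose first field is quoted.
sample = '"1"\t2\n'

# With QUOTE_NONE the quote characters are kept as ordinary text.
unquoted = next(csv.reader(io.StringIO(sample), delimiter='\t',
                           quoting=csv.QUOTE_NONE))
# With the default quoting the surrounding quotes are stripped.
default = next(csv.reader(io.StringIO(sample), delimiter='\t'))

print(unquoted)  # ['"1"', '2']
print(default)   # ['1', '2']
```

If the first form shows up in your data, either drop quoting=csv.QUOTE_NONE or strip the quotes before converting.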

openpyxl does support a guess_types parameter for workbooks, which converts strings to numbers where possible. That makes this kind of thing easier where there is no ambiguity. But you are generally best off managing the conversion yourself.
