Is there a way to replace and trim values in individual fields when converting dbf to csv with the Python library dbf by Ethan Furman?

Question

I am using the Python library dbf, by Ethan Furman, to convert a number of dbf files to csv. It works extremely well for that. I would like to further edit some of the fields during the conversion process but am unsure how to do it. Specifically, I would like to replace string fields that contain only one or more blanks with empty strings (e.g. replace " " with ""), and date fields that contain "00000000" with empty strings "". I would very much appreciate it if someone could describe how to edit the fields and write out the updated records during the conversion process. Obviously, I could write a simple secondary script to edit the csv files output during conversion, but I would like to do it all in one step if possible. Here is the code I am using to convert the files:

import csv
import dbf
import os
import sys

folder = sys.argv[1]

for dirpath, dirnames, filenames in os.walk(folder):
    for filename in filenames:
        if filename.endswith('.DBF'):
            # os.walk yields bare file names, so join with dirpath
            path = os.path.join(dirpath, filename)
            db = dbf.Table(path, ignore_memos=True)
            db.open()
            # write the csv next to the source file
            csv_fn = path[:-4] + ".csv"
            dbf.export(db, filename=csv_fn, format='csv', header=True)
            db.close()

Answer 1

By default, when using a dbf table the data types returned are simple -- i.e. int, str, bool, datetime.datetime, etc. But you can make your own data types and have those used instead by specifying them in the default_data_types parameter:

db = dbf.Table(
        filename,
        ignore_memos=True,
        default_data_types={
            'C': my_white_space_stripping_data_type,
            'D': my_empty_date_str_data_type,
            },
        )

Fortunately, dbf comes with four enhanced data types already:

Char -- automatically strips trailing whitespace, and ignores trailing whitespace for comparisons
Logical -- supports True, False, and None (None is returned when the field value is not true or false -- I've seen '?', ' ', and other weird garbage)
Date -- supports an empty date, such as 00000000, and displays it as ''
DateTime -- supports an empty date/time, and displays it as ''

Typically, if you're using one of the enhanced data types you probably want them all, so instead of the dictionary you can just pass a string:

db = dbf.Table(
        filename,
        ignore_memos=True,
        default_data_types='enhanced_data_types',
        )

Now, when a csv file is exported, trailing whitespace is dropped, and empty date fields become ''.

Keep in mind that empty logical fields will become '?' instead of '', so you may want the longer form, passing a dict to default_data_types and overriding only C and D.
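The longer form might look like the sketch below. It assumes the enhanced types are exposed as dbf.Char and dbf.Date and can be passed directly as values in the default_data_types dict; the helper names csv_path_for and convert_folder are made up for this example, and the block has not been checked against every dbf release.

```python
import os


def csv_path_for(dbf_path):
    """Return the .csv path that sits next to the given .dbf file."""
    return os.path.splitext(dbf_path)[0] + ".csv"


def convert_folder(folder):
    """Export every .dbf file under folder to csv, overriding only C and D."""
    # Third-party library (pip install dbf); imported here so that
    # csv_path_for above can be used without it installed.
    import dbf

    for dirpath, dirnames, filenames in os.walk(folder):
        for filename in filenames:
            if not filename.lower().endswith(".dbf"):
                continue
            path = os.path.join(dirpath, filename)
            table = dbf.Table(
                path,
                ignore_memos=True,
                # Assumption: dbf.Char and dbf.Date are the enhanced types
                # described above. Logical fields keep their default type.
                default_data_types={"C": dbf.Char, "D": dbf.Date},
            )
            table.open()
            try:
                dbf.export(
                    table,
                    filename=csv_path_for(path),
                    format="csv",
                    header=True,
                )
            finally:
                # Close the table even if the export fails.
                table.close()


# Example call (the folder path is a placeholder):
# convert_folder("path/to/dbf/folder")
```

Leaving L and T out of the dict means logical and date/time fields keep their default types, so empty logical values should not be exported as '?'.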

Answer 1: accepted, 2021-10-24 15:42:17
