
Replace digits from text

I have a text file in which I need to replace digits with whitespace.

I tried first splitting the text file into individual words and then checking whether each word is a digit or not:

def replace_digits_symbols():
    text_file = open_file()
    for word in text_file:
        for char in word:
            if char.isdigit():
                word.replace(char, " ")
    print(text_file)

It should replace them with spaces, but nothing happens.

The str.replace method returns a new string and does not alter the original string in place (Python strings are immutable), which is why calling word.replace(char, " ") without using its return value does nothing. You can instead use str.join with a generator expression that iterates through each character in a line and outputs a space instead of the original character if it is a digit:

with open('file') as file:
    for line in file:
        # Each line keeps its trailing newline, so suppress the one print adds.
        print(''.join(' ' if char.isdigit() else char for char in line), end='')
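The point about str.replace can be checked directly. The following is a minimal sketch, not the asker's actual program: the function name replace_digits_loop and the sample string are made up for illustration. It keeps the shape of the original loop but assigns the result of str.replace back to the variable.

```python
def replace_digits_loop(text):
    # Hypothetical helper: same loop shape as the question, but the
    # return value of str.replace is assigned back to the variable.
    for char in text:  # iterates over the original string object
        if char.isdigit():
            text = text.replace(char, " ")
    return text


sample = "abc123"  # made-up sample input
sample.replace("1", " ")  # return value discarded, so sample is unchanged
print(repr(sample))
print(repr(replace_digits_loop(sample)))
```

The first print shows the string untouched by the discarded call. The second shows every digit replaced once the result is reassigned.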

Here is the complete code for this process:

import string

def helper(text):
    # Replace each ASCII digit (0-9) with a single space.
    for digit in string.digits:
        text = text.replace(digit, ' ')
    return text

def replace_digits(file_name):
    # Open both files in one with statement so both are closed on exit.
    with open(file_name) as file_ptr, open("processed.txt", 'w') as output_file:
        for line_no, line in enumerate(file_ptr, start=1):
            print("Processing Line No : {}".format(line_no))
            output_file.write(helper(line))

replace_digits("test.txt")

test.txt contains:

this1is5sample0text
this10is552sample0text
this10is5sample0text
this10is52sample0text
this0is52sample0text

and the result, written to processed.txt, is:

this is sample text
this  is   sample text
this  is sample text
this  is  sample text
this is  sample text
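As a possible alternative to the per-digit loop in helper, the same replacement can be done in a single pass with str.translate or re.sub. This is only a sketch, and the function names below are invented for illustration. Both variants cover only the ASCII digits 0-9, whereas str.isdigit also accepts some other Unicode digit characters.

```python
import re
import string

# Translation table mapping each ASCII digit to a space.
DIGITS_TO_SPACES = str.maketrans(string.digits, " " * len(string.digits))


def replace_digits_translate(text):
    # Hypothetical helper: single pass over the text with str.translate.
    return text.translate(DIGITS_TO_SPACES)


def replace_digits_regex(text):
    # Hypothetical helper: [0-9] matches ASCII digits only.
    return re.sub(r"[0-9]", " ", text)


print(replace_digits_translate("this10is552sample0text"))
print(replace_digits_regex("this10is552sample0text"))
```

Both functions replace each digit with exactly one space, so runs of digits become runs of spaces, as in the output shown above.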
