How to search and replace text in a file?

How do I search and replace text in a file using Python 3?

Here is my code:

import os
import sys
import fileinput

print ("Text to search for:")
textToSearch = input( "> " )

print ("Text to replace it with:")
textToReplace = input( "> " )

print ("File to perform Search-Replace on:")
fileToSearch  = input( "> " )
#fileToSearch = 'D:\dummy1.txt'

tempFile = open( fileToSearch, 'r+' )

for line in fileinput.input( fileToSearch ):
    if textToSearch in line :
        print('Match Found')
    else:
        print('Match Not Found!!')
    tempFile.write( line.replace( textToSearch, textToReplace ) )
tempFile.close()


input( '\n\n Press Enter to exit...' )

Input file:

hi this is abcd hi this is abcd
This is dummy text file.
This is how search and replace works abcd

When I search and replace 'ram' by 'abcd' in the above input file, it works like a charm. But when I do it the other way round, i.e. replacing 'abcd' by 'ram', some junk characters are left at the end.

Replacing 'abcd' by 'ram':

hi this is ram hi this is ram
This is dummy text file.
This is how search and replace works rambcd

As pointed out by michaelb958, you cannot replace in place with data of a different length, because this will put the rest of the sections out of place. I disagree with the other posters suggesting you read from one file and write to another. Instead, I would read the file into memory, fix the data up, and then write it out to the same file in a separate step.

# Read in the file
with open('file.txt', 'r') as file :
  filedata = file.read()

# Replace the target string
filedata = filedata.replace('ram', 'abcd')

# Write the file out again
with open('file.txt', 'w') as file:
  file.write(filedata)

This approach is fine unless the file is too big to load into memory in one go, or you are concerned about potential data loss if the process is interrupted during the second step, in which you write data to the file.

fileinput already supports in-place editing. It redirects stdout to the file in this case:

#!/usr/bin/env python3
import fileinput

with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(text_to_search, replacement_text), end='')

As Jack Aidley had posted and JF Sebastian pointed out, this code will not work:

# Read in the file
filedata = None
with file = open('file.txt', 'r') :
  filedata = file.read()

# Replace the target string
filedata.replace('ram', 'abcd')

# Write the file out again
with file = open('file.txt', 'w') :
  file.write(filedata)

But this code WILL work (I've tested it):

f = open(filein,'r')
filedata = f.read()
f.close()

newdata = filedata.replace("old data","new data")

f = open(fileout,'w')
f.write(newdata)
f.close()

Using this method, filein and fileout can be the same file, because Python (tested with 3.3) truncates the file when it is opened for writing.

You can do the replacement like this:

f1 = open('file1.txt', 'r')
f2 = open('file2.txt', 'w')
for line in f1:
    f2.write(line.replace('old_text', 'new_text'))
f1.close()
f2.close()

You can also use pathlib.

from pathlib import Path  # standard library in Python 3; pathlib2 is only a backport
path = Path(file_to_search)
text = path.read_text()
text = text.replace(text_to_search, replacement_text)
path.write_text(text)

(pip install python-util)

from pyutil import filereplace

filereplace("somefile.txt","abcd","ram")

This will replace all occurrences of "abcd" with "ram".
The function also supports regex by specifying regex=True:

from pyutil import filereplace

filereplace("somefile.txt","\\w+","ram",regex=True)

Disclaimer: I'm the author ( https://github.com/MisterL2/python-util )

Late answer, but this is what I use to find and replace inside a text file:

with open("test.txt") as r:
  text = r.read().replace("THIS", "THAT")
with open("test.txt", "w") as w:
  w.write(text)

With a single with block, you can search and replace your text:

with open('file.txt','r+') as f:
    filedata = f.read()
    filedata = filedata.replace('abc','xyz')
    f.seek(0)       # rewind; truncate(0) alone leaves the position at the old end
    f.truncate()
    f.write(filedata)

This answer works for me. Open the file in read mode. Read the file as a string. Replace the text as intended. Close the file. Open the file again in write mode. Finally, write the replaced text to the same file.

file_name = "file_name"  # path of the file to edit

try:
    with open(file_name, "r") as text_file:
        texts = text_file.read()
        texts = texts.replace("to_replace", "replace_string")
    with open(file_name, "w") as text_file:
        text_file.write(texts)
except FileNotFoundError:
    print("Could not find the file you are trying to read.")

Your problem stems from reading from and writing to the same file. Rather than opening fileToSearch for writing, open an actual temporary file, and then, after you're done and have closed tempFile, use os.rename to move the new file over fileToSearch.
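The temporary-file approach described above could look roughly like the sketch below. The function name replace_in_file and its parameters are made up for illustration. It uses os.replace rather than os.rename, because os.replace also overwrites an existing destination on Windows. The replacement is done line by line, so search text that spans a line break will not match.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Stream `path` line by line into a temporary file, then swap it in."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the final move stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, text=True)
    try:
        # newline="" keeps the original line endings untouched.
        with os.fdopen(fd, "w", encoding=encoding, newline="") as dst, \
                open(path, "r", encoding=encoding, newline="") as src:
            for line in src:
                dst.write(line.replace(old, new))
        # Move the finished copy over the original in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed before the swap: clean up and leave the original alone.
        os.remove(tmp_path)
        raise
```

Because the original file is only touched by the final os.replace call, an interruption while writing leaves it intact. Note that the new file gets the permissions of the temporary file, not those of the original.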

My variant, which replaces a word throughout the entire file.

It reads the whole file into memory.

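import os
import sys

def replace_word(infile, old_word, new_word):
    if not os.path.isfile(infile):
        print("Error on replace_word, not a regular file: " + infile)
        sys.exit(1)

    # Read the whole file into memory, then overwrite it with the replaced text
    with open(infile, 'r') as f1:
        text = f1.read()
    with open(infile, 'w') as f2:
        f2.write(text.replace(old_word, new_word))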

I had the same issue. The problem is that when you read a .txt file into a variable with read(), you get one string, so iterating over it yields single characters rather than lines or words.

filepath = 'file.txt'  # placeholder path
swapString = []
with open(filepath) as f:
    # Iterate over lines; iterating over f.read() would yield single characters
    for line in f:
        swapString.append(line.replace('this', 'that'))
print(swapString)

You can use sed, awk or grep from Python (with some restrictions). Here is a very simple example. It changes Banana to BananaToothpaste in the file. You can edit and use it. (I tested it and it worked. Note: if you are testing under Windows, you should install the "sed" command and set the path first.)

import os

file = "a.txt"
oldtext = "Banana"
newtext = "BananaToothpaste"
# GNU sed syntax; BSD/macOS sed needs a suffix argument after -i (e.g. -i '')
command = 'sed -i "s/{}/{}/g" {}'.format(oldtext, newtext, file)
os.system(command)
print('This command was applied:  ' + command)

If you want to see the result directly, use "type" on Windows or "cat" on Linux:

# On Windows:
print(os.popen("type " + file).read())
# On Linux:
print(os.popen("cat " + file).read())

I recommend checking out this small program. Regular expressions are the way to go.

https://github.com/khranjan/pythonprogramming/tree/master/findandreplace

I have done this:

#!/usr/bin/env python3

import fileinput
import os

Dir = input ("Source directory: ")
os.chdir(Dir)

Filelist = os.listdir()
print('File list: ',Filelist)

NomeFile = input ("Insert file name: ")

CarOr = input ("Text to search: ")

CarNew = input ("New text: ")

# The with block closes the file, so no explicit close() is needed
with fileinput.FileInput(NomeFile, inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(CarOr, CarNew), end='')

I modified Jayram Singh's post slightly in order to replace every instance of a '!' character with a number that increments with each instance. I thought it might be helpful to someone who wants to modify a character that occurs more than once per line and needs a running counter. Hope that helps someone. PS: I'm very new at coding, so apologies if my post is inappropriate in any way, but this worked for me.

f1 = open('file1.txt', 'r')
f2 = open('file2.txt', 'w')
n = 1  

# If a character is '!', write [n] instead and increment n;
# otherwise copy the character to file2 unchanged.

for line in f1:
    for char in line:
        if char == '!':
            f2.write(f'[{n}]')
            n += 1
        else:
            f2.write(char)
f1.close()
f2.close()

def word_replace(filename,old,new):
    c=0
    with open(filename,'r+',encoding ='utf-8') as f:
        a=f.read()
        b=a.split()
        for i in range(0,len(b)):
            if b[i]==old:
                c=c+1
        old=old.center(len(old)+2)
        new=new.center(len(new)+2)
        d=a.replace(old,new,c)
        f.truncate(0)
        f.seek(0)
        f.write(d)
    print('All words have been replaced!!!')

Besides the answers already mentioned, here is an explanation of why you have some random characters at the end:
You are opening the file in r+ mode, not w mode. The key difference is that w mode clears the contents of the file as soon as you open it, whereas r+ doesn't.
This means that if your file content is "123456789" and you write "www" to it, you get "www456789". It overwrites the characters with the new input, but leaves any remaining input untouched.
You can clear a section of the file contents by using truncate(<startPosition>), but you are probably best off saving the updated file content to a string first, then calling seek(0) and truncate() and writing it all at once.
Or you can use my library :D
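The "123456789" example above can be reproduced with a short script. This is only a sketch on a throwaway temporary file; the variable names leftover and cleaned are made up for illustration.

```python
import os
import tempfile

# Work on a throwaway file so nothing real is touched.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

with open(path, "w") as f:
    f.write("123456789")

# 'r+' keeps the old content; writing starts at offset 0 and overwrites in place.
with open(path, "r+") as f:
    f.write("www")
with open(path) as f:
    leftover = f.read()

# Read, rewind, write, then truncate at the current position to drop the old tail.
with open(path, "r+") as f:
    data = f.read().replace("456789", "x")
    f.seek(0)
    f.write(data)
    f.truncate()
with open(path) as f:
    cleaned = f.read()

os.remove(path)
print(leftover, cleaned)  # www456789 wwwx
```

Calling truncate() after the write cuts the file at the current position, so it works whether the new text is shorter or longer than the old one.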

I tried this and used readlines instead of read:

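with open('dummy.txt', 'r') as file:
    lines = file.readlines()
print(f'Before replacement {lines}')

# Replace the text in each line ('abcd' and 'ram' are the strings from the question)
lines = [line.replace('abcd', 'ram') for line in lines]

print(f'After replacement {lines}')
with open('dummy.txt', 'w') as f:
    f.writelines(lines)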

Using re.subn gives more control over the substitution process, for example matching a word split across two lines or matching case-insensitively. It also returns the number of substitutions made, which can be used to skip rewriting the file when the string is not found.

import re

file = 'file.txt'  # path to file (placeholder)

# they can be also raw string and regex
textToSearch = r'Ha.*O' # here an example with a regex
textToReplace = 'hallo'

# read and replace
with open(file, 'r') as fd:
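    # sample case-insensitive find-and-replace; flags must be passed by keyword,
    # because the fourth positional argument of re.subn is count
    text, counter = re.subn(textToSearch, textToReplace, fd.read(), flags=re.I)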

# check if there is at least a  match
if counter > 0:
    # edit the file
    with open(file, 'w') as fd:
        fd.write(text)

# summary result
print(f'{counter} occurrence(s) of "{textToSearch}" were replaced with "{textToReplace}".')

Some regex notes:

  • add the re.I flag, short form of re.IGNORECASE, for a case-insensitive match
  • for multi-line replacement, use re.subn(r'\n*'.join(textToSearch), textToReplace, fd.read()) or, depending on the data, '\n{,1}' as the joiner. Notice that in this case textToSearch must be a plain string, not a regex!
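
The two notes above can be combined into one small helper. This is a sketch, not part of the original answer: the name replace_literal and its parameters are hypothetical. It treats the search text as a literal string by escaping it with re.escape, and it passes flags by keyword because the fourth positional argument of re.subn is count.

```python
import re


def replace_literal(text, search, replacement, ignore_case=False, across_newlines=False):
    """Replace a literal `search` string in `text`; return (new_text, count)."""
    if across_newlines:
        # Allow an optional line break between any two characters of the search text.
        pattern = r"\n?".join(re.escape(ch) for ch in search)
    else:
        pattern = re.escape(search)
    flags = re.IGNORECASE if ignore_case else 0
    # A function as replacement inserts the string as-is, without backslash processing.
    return re.subn(pattern, lambda _match: replacement, text, flags=flags)


print(replace_literal("Hello HELLO", "hello", "hi", ignore_case=True))  # ('hi hi', 2)
```

With across_newlines=True, a word that was wrapped onto the next line still matches, and the line break inside it is removed by the replacement.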

def findReplace(find, replace):
    import os

    # Walk the parent directory of the current working directory
    src = os.path.join(os.getcwd(), os.pardir)
    for path, dirs, files in os.walk(os.path.abspath(src)):
        for name in files:
            # Only process Python source files
            if name.endswith('.py'):
                filepath = os.path.join(path, name)
                with open(filepath) as f:
                    s = f.read()
                s = s.replace(find, replace)
                with open(filepath, "w") as f:
                    f.write(s)

Like so:

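def find_and_replace(file, word, replacement):
  with open(file, 'r+') as f:
    text = f.read()
    f.seek(0)      # go back to the start before writing
    f.write(text.replace(word, replacement))
    f.truncate()   # drop leftover characters if the new text is shorter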

I worked this out as a course exercise: open a file, find and replace a string, and write the result to a new file.

class Letter:

    def __init__(self):

        with open("./Input/Names/invited_names.txt", "r") as file:
            # read the list of names
            list_names = [line.rstrip() for line in file]
            with open("./Input/Letters/starting_letter.docx", "r") as f:
                # read letter
                file_source = f.read()
            for name in list_names:
                with open(f"./Output/ReadyToSend/LetterTo{name}.docx", "w") as f:
                    # replace [name] with name of the list in the file
                    replace_string = file_source.replace('[name]', name)
                    # write to a new file
                    f.write(replace_string)


brief = Letter()
