
Delete line that contains a string in a txt file python

I'm trying to delete a line in a txt file which contains a variable (email).

I want to delete the whole line which contains the email, e.g. mano@gmail.com, not just the variable. This is what I've come up with so far, but it doesn't seem to work.

with open("wappoint.txt.txt", "r") as w:
    lines = w.readlines()
with open("wappoint.txt.txt", "w") as w:
    for line in lines:
        if email.strip("\n") != email:
            w.write(line)

The contents of the txt file are:

vasv@gmail.com, 1
mano@gmail.com, 3

Are you looking for this?

with open("wappoint.txt", "r") as w:
    lines = w.readlines()
with open("wappoint.txt", "w") as w:
    for line in lines:
        if email not in line:
            w.write(line)

This removes the line if it contains the email.

It seems like you just want to check if email occurs in the line.

Your code performs an (in)equality comparison, when it should instead check for a substring (i.e. whether email occurs in line).

A suitable condition is:

if email not in line:
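
To see the difference concretely, here is a minimal sketch using the two sample lines from the question. It assumes email holds the address without a trailing newline; the names kept_original and kept_fixed are hypothetical and only used for illustration.

```python
email = "mano@gmail.com"
lines = ["vasv@gmail.com, 1\n", "mano@gmail.com, 3\n"]

# Original condition: compares email with itself (stripping changes nothing
# here), so it is always False and no line would be written back.
kept_original = [line for line in lines if email.strip("\n") != email]

# Substring check: keeps only the lines that do not contain the email.
kept_fixed = [line for line in lines if email not in line]

print(kept_original)
print(kept_fixed)
```

Under that assumption the original condition keeps nothing at all, which would leave the file empty, while the substring check keeps only the other address.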

There are a number of considerations to address here:

  1. If your file is large, it isn't a good idea to load it all in memory.
  2. If some exception occurs during processing (maybe even a KeyboardInterrupt), it is often desirable to leave your original file untouched (so, we'll try to make your operation ACID).
  3. If multiple concurrent processes try to modify your file, you would like some guarantee that, at least, yours is safe (also ACID).
  4. You may (or may not) want a backup for your file.

There are a number of possibilities (see e.g. this SO question). In my experience, however, I got mixed results with fileinput: it makes it easy to modify one or several files in place, optionally creating a backup for each, but unfortunately it writes eagerly to each file (possibly leaving it incomplete when an exception occurs). I put an example at the end for reference.

What I've found to be the simplest and safest approach is to use a temporary file (in the same directory as the file you are processing, and named uniquely but in a recognizable manner), do your operation from src to tmp, then mv tmp src, which, at least for practical purposes, is atomic on most POSIX filesystems.

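import os
import tempfile

def acceptall(line):
    return True

def filefilter(filename, filterfunc=acceptall, backup=None):
    if backup:
        backup = f'{filename}{backup}'  # leave None if no backup wanted
    # the temp file lives next to the source, so the final rename stays on one filesystem
    # (note: tempfile.mktemp is deprecated; it only picks an unused name)
    dirname, basename = os.path.split(filename)
    tmpname = tempfile.mktemp(prefix=f'.{basename}-', dir=dirname or '.')
    with open(tmpname, 'w') as tmp, open(filename, 'r') as src:
        for line in src:
            if filterfunc(line):
                tmp.write(line)  # keep only the lines the filter accepts
    if backup:
        os.rename(filename, backup)
    os.rename(tmpname, filename)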

Example for your case:

filefilter('wappoint.txt.txt', lambda line: email not in line)

Using a regex to exclude multiple email addresses (case-insensitive and only fully matching), and generating a .bak backup file:

import re

matcher = re.compile(r'.*\b(bob|fred|jeff)@foo\.com\b', re.IGNORECASE)
filefilter(filename, lambda line: not matcher.match(line), backup='.bak')
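
As a quick check of what that pattern accepts and rejects, the sketch below runs it against a few made-up lines. The sample addresses are hypothetical and chosen only to exercise the word boundaries and the case-insensitive flag.

```python
import re

matcher = re.compile(r'.*\b(bob|fred|jeff)@foo\.com\b', re.IGNORECASE)

samples = [
    "Bob@Foo.com, 1\n",         # matches: the comparison ignores case
    "bobby@foo.com, 2\n",       # no match: "bobby" is not "bob"
    "alice@foo.com, 3\n",       # no match: name is not in the list
    "jeff@foo.community, 4\n",  # no match: no word boundary after "com"
]

# Same predicate as the lambda passed to filefilter above.
kept = [s for s in samples if not matcher.match(s)]
print(kept)
```

Only the first sample line would be dropped; the other three are kept.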

We can also simulate what happens if an exception is raised in the middle (e.g. on the first matching line):

def flaky(line):
    if email in line:
        1 / 0
    return True

filefilter(filename, flaky)

That will raise ZeroDivisionError upon the first matching line. But notice how your file is not modified at all in that case (and no backup is made). As a side effect, the temporary file remains (this is consistent with other utilities, e.g. rsync, that leave incomplete .filename-<random> temp files at the destination when interrupted).
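
If the leftover temporary file is unwanted, one possible variation is to delete it when the filter fails. The sketch below is not the function above but a hypothetical variant (filefilter_clean is a made-up name) that swaps in tempfile.mkstemp and os.replace and omits the backup option for brevity.

```python
import os
import tempfile

def filefilter_clean(filename, filterfunc=lambda line: True):
    """Filter filename in place; remove the temp file if anything fails."""
    dirname, basename = os.path.split(filename)
    # mkstemp creates the file and returns an already-open descriptor
    fd, tmpname = tempfile.mkstemp(
        prefix=f'.{basename}-', dir=dirname or '.', text=True)
    try:
        with os.fdopen(fd, 'w') as tmp, open(filename, 'r') as src:
            for line in src:
                if filterfunc(line):
                    tmp.write(line)
        os.replace(tmpname, filename)  # swap the filtered copy into place
    except BaseException:
        # also covers KeyboardInterrupt; the original file is left untouched
        os.unlink(tmpname)
        raise
```

One trade-off to be aware of: mkstemp creates the file readable and writable only by the current user, so the resulting file may end up with different permissions than the original.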


As promised, here is also an example using fileinput, but with the caveats explained earlier:

import fileinput

with fileinput.input(filename, inplace=True, backup='.bak') as f:
    for line in f:
        if email not in line:
            print(line, end='')  # this prints back to filename
