Searching for a string in a file using Python: failure

I am using this code to search for email addresses in a particular file and write them into another file. I use the 'in' operator to make sure the addresses are not duplicated. But the code after the for line in f: line never gets executed. Can anyone point out the mistake I have made here?

tempPath = input("Please Enter the Path of the File\n")
temp_file = open(tempPath, "r")
fileContent = temp_file.read()
temp_file.close()

pattern_normal = re.compile("[-a-zA-Z0-9._]+@[-a-zA-Z0-9_]+.[a-zA-Z0-9_.]+")

pattern_normal_list = pattern_normal.findall(str(fileContent))

with open('emails_file.txt', 'a+') as f:            
    for item in pattern_normal_list:            
        for line in f:
            if line in item:
                print("duplicate")
            else:
                print("%s" %item)
                f.write("%s" %item)
                f.write('\n')

New solution:

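import re

tempPath = input("Please Enter the Path of the File\n")
with open(tempPath, "r") as temp_file:
    fileContent = temp_file.read()

# Escape the dot before the domain suffix so it matches a literal "."
pattern_normal = re.compile(r"[-a-zA-Z0-9._]+@[-a-zA-Z0-9_]+\.[a-zA-Z0-9_.]+")

# set() removes duplicates found within the input file
addresses = list(set(pattern_normal.findall(fileContent)))
with open('new_emails.txt', 'a+') as f:
    for address in addresses:
        # One address per line, so a later append starts on its own line
        f.write(address + '\n')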

I think your logic was wrong; this works:

addresses = ['test@wham.com', 'heffa@wham.com']

with open('emails_file.txt', 'a+') as f:
    f.seek(0)  # in Python 3, 'a+' opens positioned at the end of the file; rewind before reading
    fdata = f.read()
    for mail in addresses:
        if mail not in fdata:
            f.write(mail + '\n')

Without reading too much into your code, it looks like you're looping line by line and checking whether the address you're also looping through exists in that line; if it doesn't, you append the e-mail to the file. But in 99 out of 100 lines the address will not be in the line, so you'll get unwanted additions.

Output of my code snippet:

[Torxed@faparch ~]$ cat emails_file.txt 
test@wham.com
Torxed@whoever.com
[Torxed@faparch ~]$ python test.py 
[Torxed@faparch ~]$ cat emails_file.txt 
test@wham.com
Torxed@whoever.com
heffa@wham.com
[Torxed@faparch ~]$ 
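
The check described in this answer can also be done against whole lines rather than the raw file text, which avoids accidental substring matches. The following is only a sketch under assumptions: the helper name append_new_emails and the temporary demo file are hypothetical and not part of the original code.

```python
import os
import tempfile

def append_new_emails(path, addresses):
    """Append addresses that are not yet a line in the file; return those added."""
    with open(path, 'a+') as f:
        f.seek(0)  # 'a+' starts at the end of the file in Python 3
        seen = {line.strip() for line in f}  # existing addresses, one per line
        added = []
        for mail in addresses:
            if mail not in seen:
                f.write(mail + '\n')  # append mode always writes at the end
                seen.add(mail)
                added.append(mail)
    return added

# Hypothetical demo file, created only for this sketch
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'emails_file.txt')
    with open(demo_path, 'w') as f:
        f.write('test@wham.com\nTorxed@whoever.com\n')
    first_run = append_new_emails(demo_path, ['test@wham.com', 'heffa@wham.com'])
    second_run = append_new_emails(demo_path, ['test@wham.com', 'heffa@wham.com'])
    with open(demo_path) as f:
        final_lines = f.read().splitlines()

print(first_run)
print(second_run)
print(final_lines)
```

Keeping the known addresses in a set means each address is compared against complete lines, and running the helper a second time adds nothing.
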
for line in f:

Shouldn't you first call f.readlines()?

lines = f.readlines()
for line in lines:

Check this.
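
As a hedged aside on why the body of for line in f: may never run: in Python 3, a file opened with 'a+' starts positioned at the end of the file, so iterating over it (or calling f.readlines()) returns nothing until the position is moved back with f.seek(0). A minimal sketch, where the helper name read_lines_in_append_mode and the temporary demo file are assumptions made for illustration:

```python
import os
import tempfile

def read_lines_in_append_mode(path, rewind):
    """Open path with 'a+' and return the lines that iteration yields."""
    with open(path, 'a+') as f:
        if rewind:
            f.seek(0)  # move the read position back to the start
        return [line.rstrip('\n') for line in f]

# Hypothetical demo file, created only for this sketch
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'emails_file.txt')
    with open(demo_path, 'w') as f:
        f.write('test@wham.com\n')
    without_seek = read_lines_in_append_mode(demo_path, rewind=False)
    with_seek = read_lines_in_append_mode(demo_path, rewind=True)

print(without_seek)  # []
print(with_seek)     # ['test@wham.com']
```
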
