
Appending multiple for-loop outputs to a list

I am using RegEx to extract some data from a txt file. I wrote the for-loops below to extract emails and birthdates and tried to append the outputs to a list. But when I print my list, only the output appended by the first loop is printed. The birthdate RegEx works fine when run by itself. I'm sure I'm doing something very basic wrong.

f = open("/Users/me/Desktop/scrape.txt", "r", encoding="utf8")

list = []

for i in f:
    if re.findall(r"((?i)[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.])", i):
        list.append(i)

for k in f:
    if re.findall(r'\d\d-\d\d-\d\d\d\d', k):
        list.append(k)

print(list)
f.close()

Try this:

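import re

matches = []
with open("/Users/me/Desktop/scrape.txt", "r", encoding="utf8") as f:
    for i in f:  # visit every line once and test both patterns on it
        # (?i) has to lead the pattern; Python 3.11+ rejects it anywhere else
        if re.findall(r"(?i)([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.])", i):
            matches.append(i)
        if re.findall(r'\d\d-\d\d-\d\d\d\d', i):
            matches.append(i)

print(matches)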

In your code, the first for loop reads f all the way to the end of the file. The second for loop then has nothing left to iterate over, so its body never runs.

To make your code do what you intended, close the file after the first loop and reopen it before the second loop, so that f points to the beginning of the file again:

import re

f = open("/Users/me/Desktop/scrape.txt", "r", encoding="utf8")

list = []

for i in f:
    # (?i) has to lead the pattern; Python 3.11+ rejects it anywhere else
    if re.findall(r"(?i)([a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.])", i):
        list.append(i)

f.close()

f = open("/Users/me/Desktop/scrape.txt", "r", encoding="utf8")
for k in f:
    if re.findall(r'\d\d-\d\d-\d\d\d\d', k):
        list.append(k)

print(list)
f.close()
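
As an alternative to closing and reopening, the file position can be rewound with f.seek(0). The following is only a sketch of that idea: the function name collect_matches and the constants EMAIL and DATE are made up for illustration, and the patterns are the ones from the question with the case-insensitive flag passed as re.IGNORECASE.

```python
import re

# Patterns taken from the question; the names EMAIL and DATE are illustrative.
EMAIL = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]", re.IGNORECASE)
DATE = re.compile(r"\d\d-\d\d-\d\d\d\d")


def collect_matches(path):
    """Return lines containing an email, followed by lines containing a date."""
    matches = []
    with open(path, "r", encoding="utf8") as f:
        for line in f:
            if EMAIL.search(line):
                matches.append(line)
        # The first loop left the file position at the end; rewind it
        # so the second loop sees every line again.
        f.seek(0)
        for line in f:
            if DATE.search(line):
                matches.append(line)
    return matches
```

Calling collect_matches("/Users/me/Desktop/scrape.txt") should give the same list the two-loop version was meant to build, with the file opened only once.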

You try to read the same file twice. The second for-loop will not do anything. Have a look at this to understand:

f = open("/Users/me/Desktop/scrape.txt", "r", encoding="utf8")
print(list(f))
print("second time:")
print(list(f))

Output:

['1234567890abcdefghijklmopqrstuvwxyz'] # or whatever your content is :)
second time:
[]

To fix this, you can store the contents of the file in a list (as long as you are not dealing with huge files, of course):

f = open("/Users/me/Desktop/scrape.txt", "r", encoding="utf8")
content = list(f)


for i in content:
   ... 

for k in content:
   ... 

In your specific example it would be cleaner (and faster) to do all the processing in a single for-loop, though. Either way, the mistake was trying to read from the same file twice without resetting it.
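
A rough sketch of the single-loop variant, assuming you still want the email lines listed before the date lines as in the two-loop version. The name scan_lines and the constants EMAIL and DATE are hypothetical, and the patterns are the ones from the question with the case-insensitive flag passed as re.IGNORECASE.

```python
import re

# Patterns taken from the question; the names EMAIL and DATE are illustrative.
EMAIL = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]", re.IGNORECASE)
DATE = re.compile(r"\d\d-\d\d-\d\d\d\d")


def scan_lines(lines):
    """Check both patterns in one pass over any iterable of lines."""
    emails, dates = [], []
    for line in lines:
        if EMAIL.search(line):
            emails.append(line)
        if DATE.search(line):
            dates.append(line)
    # Keep the ordering of the two-loop version: email lines first, then date lines.
    return emails + dates
```

Because scan_lines accepts any iterable of strings, an open file object can be passed to it directly, for example inside `with open("/Users/me/Desktop/scrape.txt", "r", encoding="utf8") as f:`. The file is then read only once, so the exhausted-iterator problem does not come up.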
