Python reading URLs from file until last line
I have a script which basically checks a domain from a text file and finds its email addresses. I want to add multiple domain names (line by line); the script should take each domain, run the function on it, and move on to the next line when it finishes. I tried to google for a specific solution, but I'm not sure how to find an appropriate answer.
import re
import urllib.request
from fake_useragent import UserAgent  # assuming fake_useragent, which provides ua.random below

ua = UserAgent()

def extractUrl(url):
    try:
        print("Searching emails... please wait")
        count = 0
        listUrl = []
        req = urllib.request.Request(
            url,
            data=None,
            headers={
                'User-Agent': ua.random
            })
        try:
            conn = urllib.request.urlopen(req, timeout=10)
            status = conn.getcode()
            contentType = conn.info().get_content_type()
            html = conn.read().decode('utf-8')
            emails = re.findall(
                r'[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}', html)
            for email in emails:
                if email not in listUrl:
                    count += 1
                    print(str(count) + " - " + email)
                    listUrl.append(email)
            print(str(count) + " emails were found")
        except Exception as e:
            print("Failed to fetch " + url + ": " + str(e))
    except Exception as e:
        print("Error: " + str(e))

# current approach: only reads the first line of the file
f = open("demo.txt", "r")
url = f.readline()
extractUrl(url)
Python files are iterable, so it's basically as simple as:
for line in f:
extractUrl(line)
But you may want to do it right (ensure you close the file whatever happens, ignore possible empty lines, etc.):
# use `with open(...)` to ensure the file will be correctly closed
with open("demo.txt", "r") as f:
    # use `enumerate` to get line numbers too -
    # we might need them for error reporting
    for lineno, line in enumerate(f, 1):
        # make sure the line is clean (no leading / trailing whitespace)
        line = line.strip()
        # skip empty lines
        if not line:
            continue
        # ok, this one _should_ work - but something could go wrong
        try:
            extractUrl(line)
        except Exception as e:
            # mentioning the line number in the error report helps debugging
            print("oops, failed to get urls for line {} ('{}') : {}".format(lineno, line, e))