How to open multiple TXT files under a for loop and assign a name to each file

First I scrape the td elements that contain the different job names (with links). Then I want to take the data scraped from each of those td links (the corresponding job's own page) and save it to a separate txt file, one file per page. How can I do this? If you know a way, please share your thoughts!

import requests
from bs4 import BeautifulSoup

main = "https://deltaimmigration.com.au/Australia-jobs/"

def First():
    r = requests.get(main)
    soup = BeautifulSoup(r.text, 'html5lib')
    links = []
    with open("links.txt", 'w', newline="", encoding="UTF-8") as f:
        for item in soup.findAll("td", {'width': '250'}):
            item = item.contents[1].get("href")[3:]
            item = f"https://deltaimmigration.com.au/{item}"
            f.write(item+"\n")
            links.append(item)
    print(f"We Have Collected {len(links)} urls")
    return links

def Second():
    links = First() 
    with requests.Session() as req:
        for link in links:
            print(f"Extracting {link}")
            r = req.get(link, timeout=100)
            soup = BeautifulSoup(r.text, 'html5lib')
            for item in soup.findAll("table", {'width': '900'}):
                return item  # NOTE: returns after the first table of the first link only

def Third():
    r = requests.get(main)
    soup = BeautifulSoup(r.text, 'html5lib')
    result = Second()
    for item in soup.findAll("td", {'width': '250'}):
        with open(item.text + '.txt', 'w', newline="", encoding="UTF-8") as f:
            f.write('result')  # NOTE: writes the literal string 'result', not the scraped result

Third()       

I tried the following:

with open(item.text + '.txt', 'w', newline="", encoding="UTF-8") as f:

but I got this error:

File "e:/test/check.py", line 10, in Third
    with open(item.text + '.txt', 'w', newline="", encoding="UTF-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: ' Vegetable Grower (Aus)/market Gardener (NZ).txt'
import requests
from bs4 import BeautifulSoup

main = "https://deltaimmigration.com.au/Australia-jobs/"


def First():
    r = requests.get(main)
    soup = BeautifulSoup(r.text, 'html5lib')
    links = []
    names = []
    with open("links.txt", 'w', newline="", encoding="UTF-8") as f:
        for item in soup.findAll("td", {'width': '250'}):
            name = item.contents[1].text
            item = item.contents[1].get("href")[3:]
            item = f"https://deltaimmigration.com.au/{item}"
            f.write(item+"\n")
            links.append(item)
            names.append(name)
    print(f"We Have Collected {len(links)} urls")
    return links, names


def Second():
    links, names = First()
    with requests.Session() as req:
        for link, name in zip(links, names):
            print(f"Extracting {link}")
            r = req.get(link)
            soup = BeautifulSoup(r.text, 'html5lib')
            for item in soup.findAll("table", {'width': '900'}):
                with open(f"{name}.txt", 'w', newline="", encoding="UTF-8") as f:
                    f.write(item.text)


Second()
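Compared with the first attempt, this version threads the job names through First() together with the links, writes each job's table inside the same loop that fetches it (so the early return no longer cuts the run short after the first table), and reuses a single requests.Session for all requests. The name.replace('/', '_') call is the guard against the FileNotFoundError shown above.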
