"No such file or directory" while web scraping, although the folder does exist

I want to scrape all the .csv files from the URL list in my code, like this:

import os

import requests
from bs4 import BeautifulSoup

os.makedirs("Project Data ISPU SPKU DKI JAKARTA 2010 - 2021", exist_ok=True)

change_directory = r"C:\Users\EVOSYS\Documents\PROJECT-ISPU-DKI-JAKARTA-main\Project Data ISPU SPKU DKI JAKARTA 2010 - 2021"

os.chdir(change_directory)
print("Current Working directory has been changed to :", os.getcwd())

URLS = [
        'https://data.jakarta.go.id/dataset/indeks-standar-pencemaran-udara-ispu-tahun-2020',
        'https://data.jakarta.go.id/dataset/indeks-standar-pencemaran-udara-ispu-tahun-2021',
        'https://data.jakarta.go.id/dataset/data-indeks-standar-pencemar-udara-ispu-di-provinsi-dki-jakarta-tahun-2019'
       ]


for url in URLS:
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    folder = url.split("/")[-1]
    os.makedirs(folder, exist_ok=True)

    for a in soup.select('a[href$=".csv"]'):
        file_name = a["href"].split("/")[-1]
        
        path = os.path.join(folder, file_name)

        print("Downloading {} ...".format(path), end=" ")
        
        with open(path, "wb") as f_out:
            f_out.write(requests.get(a["href"]).content)
        print("OK.")

But for the 2019 URL it gives an error:

FileNotFoundError: [Errno 2] No such file or directory: 'data-indeks-standar-pencemar-udara-ispu-di-provinsi-dki-jakarta-tahun-2019\\Indeks-Standar-Pencemar-Udara-di-Provinsi-DKI-Jakarta-Bulan-Januari-Tahun-2019.csv'

I already checked that the folder for the 2019 data exists, but the error still says that it does not. All the URLs use the same attribute (href) to get the .csv files.

You may need to run your program as an administrator, for example by opening the terminal as admin (or by running Python with sudo on systems that have it).

You likely need to run it as an administrator, since the code appears to write under the C:\ drive. Furthermore, I think you should remove the username from the file path and edit this post.

Happy coding!

You are creating a folder in the current working directory (typically where the .py file is run from) and then hard-coding the directory to change into. If your current working directory is not in Users..\Documents, then the folder you created will not be there.
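The effect described above can be reproduced in isolation. The following is only a minimal sketch: it uses two temporary directories as stand-ins for the script's starting directory and the hard-coded target, and the names start_dir, target_dir, and "data" are made up for illustration. A relative path passed to os.makedirs is resolved against the working directory at the moment of the call, so the folder does not follow a later os.chdir.

```python
import os
import tempfile

original_cwd = os.getcwd()

with tempfile.TemporaryDirectory() as start_dir, \
        tempfile.TemporaryDirectory() as target_dir:
    # Stand-in for "wherever the script happens to be started from".
    os.chdir(start_dir)
    os.makedirs("data", exist_ok=True)  # relative path: created inside start_dir
    created_in_start = os.path.isdir(os.path.join(start_dir, "data"))

    # Stand-in for the hard-coded os.chdir(...) in the question.
    os.chdir(target_dir)
    visible_after_chdir = os.path.isdir("data")  # now resolved against target_dir

    # Leave the temporary directories before they are deleted.
    os.chdir(original_cwd)

print("created in starting directory:", created_in_start)
print("visible after chdir:", visible_after_chdir)
```

The folder exists, but only under the directory that was current when it was created, which is why working with absolute paths avoids the confusion.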

There's no need to change the current working directory. Just build the full path to where you want the data stored.

import os
from bs4 import BeautifulSoup
import requests


outputPath = r"C:\Users\EVOSYS\Documents\PROJECT-ISPU-DKI-JAKARTA-main\Project Data ISPU SPKU DKI JAKARTA 2010 - 2021"
os.makedirs(outputPath, exist_ok=True)


URLS = [
        'https://data.jakarta.go.id/dataset/indeks-standar-pencemaran-udara-ispu-tahun-2020',
        'https://data.jakarta.go.id/dataset/indeks-standar-pencemaran-udara-ispu-tahun-2021',
        'https://data.jakarta.go.id/dataset/data-indeks-standar-pencemar-udara-ispu-di-provinsi-dki-jakarta-tahun-2019'
       ]


for url in URLS:
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    folder = url.split("/")[-1]
    folder = os.path.join(outputPath, folder)
    os.makedirs(folder, exist_ok=True)

    for a in soup.select('a[href$=".csv"]'):
        file_name = a["href"].split("/")[-1]
        
        path = os.path.join(folder, file_name)

        print("Downloading {} ...".format(path), end=" ")
        
        with open(path, "wb") as f_out:
            f_out.write(requests.get(a["href"]).content)
        print("OK.")
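
As a possible variation on the code above, the destination path can be computed by a small helper so that it is easy to print and inspect before anything is downloaded. This is only a sketch under assumptions: build_destination is a hypothetical helper that is not part of the original answer, the root "output" is a placeholder, and the example.com file URL is invented purely to show the shape of the result.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def build_destination(output_root, page_url, file_url):
    """Return <absolute output_root>/<last part of page_url>/<last part of file_url>.

    Hypothetical helper for illustration only.
    """
    # URL paths always use "/", so parse them as POSIX paths on every OS.
    folder = PurePosixPath(urlparse(page_url).path).name
    file_name = PurePosixPath(urlparse(file_url).path).name
    # absolute() anchors the result to the current working directory once,
    # so later directory changes cannot affect where the file ends up.
    return Path(output_root).absolute() / folder / file_name


dest = build_destination(
    "output",  # placeholder root, not the path from the question
    "https://data.jakarta.go.id/dataset/indeks-standar-pencemaran-udara-ispu-tahun-2020",
    "https://example.com/files/sample.csv",  # invented URL for illustration
)
print(dest)
# Before writing, the parent folder could be created with:
# dest.parent.mkdir(parents=True, exist_ok=True)
```

Printing the full absolute path before opening the file makes it easier to check by hand whether the folder named in a FileNotFoundError is really the one that exists on disk.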
