
Looping when scraping data

I am trying to scrape data using a loop, and this is the code:

import requests
import json
import pandas as pd

parameters = ['a:1','a:2','a:3','a:4','a:3','a:4','a:5','a:6','a:7','a:8','a:9','a:10']

results = pd.DataFrame()
for item in parameters:
    key, value = item.split(':')
    url = "https://xxxx.000webhostapp.com/getNamesEnc02Motasel2.php?keyword=%s&type=2&limit=%s" %(key, value)
    r = requests.get(url)
    cont = json.loads(r.content)
    temp_df = pd.DataFrame(cont)
    results = results.append(temp_df)

results.to_csv('ScrapeData.csv', index=False)

This method works great, but the problem is that I need the parameters to go all the way up to 'a:1000', and I think there must be a better way to loop from 'a:1' to 'a:1000' than writing out every parameter by hand as in my code.

I really need your help.

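value = 1
key = 'a'
while value <= 1000:
    # Same URL pattern as in the question; %s formats the integer directly
    url = "https://xxxx.000webhostapp.com/getNamesEnc02Motasel2.php?keyword=%s&type=2&limit=%s" % (key, value)
    # ... request, parse, and append steps as in the question ...
    value += 1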

......

Use a counter

You can use a for i in range(start, end) loop. Note that range stops just before end, so end has to be one more than the last value you want. Like this:

import json

import pandas as pd
import requests

key = 'a'
frames = []  # one DataFrame per request

# Goes from 1 to 1000 (including both)
for value in range(1, 1001):
    url = f'https://xxxx.000webhostapp.com/getNamesEnc02Motasel2.php?keyword={key}&type=2&limit={value}'
    r = requests.get(url)
    cont = json.loads(r.content)
    frames.append(pd.DataFrame(cont))

# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
results = pd.concat(frames)
results.to_csv('ScrapeData.csv', index=False)
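
If you would rather leave the original loop from the question untouched, another option is to generate the parameters list instead of typing it out. This is only a minimal sketch: build_parameters is a hypothetical helper name used here for illustration, not something from the question or from any library.

```python
def build_parameters(key, start, end):
    """Return ['key:start', ..., 'key:end'], with both ends included."""
    # range() stops just before its second argument, hence end + 1
    return [f"{key}:{value}" for value in range(start, end + 1)]


parameters = build_parameters("a", 1, 1000)
print(parameters[:3])   # ['a:1', 'a:2', 'a:3']
print(len(parameters))  # 1000
```

The resulting list has the same 'a:N' shape as the hand-written one, so the existing item.split(':') line should keep working as before.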


 