
How do I loop over a range without repeating previous results?

How do I get this for loop to stop repeating its previous output while still using a range? Every time the loop moves on to the next number, it prints the results for all the previous numbers again. Instead of going through 0-20 once, it prints 0-1, 0-2, 0-3, 0-4, and so on. I want it to go through 0-20 once without duplicating itself.

import time
from selenium import webdriver
import selenium
from selenium.webdriver.chrome import service
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd

#class scraperdata():

ser = Service(r"C:\Program Files (x86)\chromedriver.exe")  # raw string so the backslashes are not treated as escapes
options = webdriver.ChromeOptions()
options.add_experimental_option('excludeSwitches', ['enable-logging'])
driver = webdriver.Chrome(options=options,service=ser)
driver.get('https://soundcloud.com/jujubucks')
print(driver.title)

wait = WebDriverWait(driver,30)

wait.until(EC.element_to_be_clickable((By.ID,"onetrust-accept-btn-handler"))).click()

try:  
    song_list = []

    i = 1
    for _ in range(20):
        song_contents = driver.find_element(By.XPATH, "//li[@class='soundList__item'][{}]".format(i))
        driver.execute_script("arguments[0].scrollIntoView(true);",song_contents)
        search = song_contents.find_element(By.XPATH, ".//a[contains(@class,'soundTitle__username')]/span").text
        search_song = song_contents.find_element(By.XPATH, ".//a[contains(@class,'soundTitle__title')]/span").text
        search_date = song_contents.find_element(By.XPATH, ".//time[contains(@class,'relativeTime')]/span").text
        search_plays = song_contents.find_element(By.XPATH, ".//span[contains(@class,'sc-ministats-small')]/span").text
        i+=1
        if _ == Exception:
            break

        option ={
        'Artist': search, 
        'Song_title': search_song, 
        'Date': search_date,
        'Streams': search_plays
        }
        song_list.append(option)

        df = pd.DataFrame(song_list)
        print(df)

except Exception:
    pass        

driver.quit()

Output

Stream Juju Bucks music | Listen to songs, albums, playlists for free on SoundCloud
       Artist                              Song_title               Date   Streams
0  Juju Bucks  Squad Too Deep Ft. Cool Prince (Outro)  Posted 1 year ago  31 plays
       Artist                              Song_title               Date   Streams
0  Juju Bucks  Squad Too Deep Ft. Cool Prince (Outro)  Posted 1 year ago  31 plays
1  Juju Bucks            Tropikana ft. P-Dogg Amazing  Posted 1 year ago  48 plays
       Artist                              Song_title               Date   Streams
0  Juju Bucks  Squad Too Deep Ft. Cool Prince (Outro)  Posted 1 year ago  31 plays
1  Juju Bucks            Tropikana ft. P-Dogg Amazing  Posted 1 year ago  48 plays
2  Juju Bucks              Party Ka Mngani Ft. X-Poll  Posted 1 year ago  72 plays
       Artist                              Song_title               Date    Streams
0  Juju Bucks  Squad Too Deep Ft. Cool Prince (Outro)  Posted 1 year ago   31 plays
1  Juju Bucks            Tropikana ft. P-Dogg Amazing  Posted 1 year ago   48 plays
2  Juju Bucks              Party Ka Mngani Ft. X-Poll  Posted 1 year ago   72 plays
3  Juju Bucks      Joy Ft. Black Sushi & Gavin Bowden  Posted 1 year ago  122 plays

The for loop's range is fine. The problem is that each iteration appends a new item to song_list, which is created before the loop and so keeps growing, and then builds and prints a DataFrame from the whole list. Each print therefore shows every song collected so far. Moving song_list = [] into the loop would make each print show only the current song.

However, song_list would then no longer hold all the songs when the loop ends. You probably don't want to print inside the loop at all. Print once, after the loop.
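To see why the output grows, here is a minimal sketch that leaves Selenium out entirely. The titles and the rows_per_print helper list are made up for illustration; the real script reads its values from the page.

```python
# Hypothetical stand-in data; the real script gets these values from Selenium.
titles = ["Song A", "Song B", "Song C"]

song_list = []        # created once, before the loop, so it keeps growing
rows_per_print = []   # records how many rows an in-loop print would show

for title in titles:
    song_list.append({"Song_title": title})
    # A print placed here would show the whole list collected so far,
    # so we record its current length instead of printing it.
    rows_per_print.append(len(song_list))

print(rows_per_print)   # [1, 2, 3]
print(len(song_list))   # 3
```

The list itself never contains duplicates. Only the printing is repeated, which is why printing once after the loop is enough.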

You should move the DataFrame creation outside of the for loop:

for _ in range(20):
    …
    song_list.append(option)  
df = pd.DataFrame(song_list)
print(df)
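As a rough, self-contained illustration of that pattern, the sketch below replaces the Selenium lookups with a hard-coded list (the scraped variable is an assumption for the example, filled with the first three rows of the question's output). The dictionaries are collected inside the loop, and the DataFrame is built and printed once afterwards.

```python
import pandas as pd

# Hypothetical stand-in for the values the question's script scrapes.
scraped = [
    ("Juju Bucks", "Squad Too Deep Ft. Cool Prince (Outro)", "Posted 1 year ago", "31 plays"),
    ("Juju Bucks", "Tropikana ft. P-Dogg Amazing", "Posted 1 year ago", "48 plays"),
    ("Juju Bucks", "Party Ka Mngani Ft. X-Poll", "Posted 1 year ago", "72 plays"),
]

song_list = []
for artist, title, date, plays in scraped:
    # Only collect data inside the loop.
    song_list.append({
        "Artist": artist,
        "Song_title": title,
        "Date": date,
        "Streams": plays,
    })

# Build and print the DataFrame once, after the loop has finished.
df = pd.DataFrame(song_list)
print(df)
```

In the original script these last two lines would sit after the for loop but still inside the try block.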


 