
How to convert a string in list into url?

How can I convert a string in a list into a URL? I tried url.parse, but it didn't work.

!pip install selenium
from urllib.parse import urlparse
from urllib.parse import quote
from urllib.request import urlopen
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

browser = webdriver.Chrome(executable_path='./chromedriver.exe')
wait = WebDriverWait(browser, 5)
output = []
for i in range(1, 2):  # Iterate from page 1 to the last page
    browser.get("https://tw.mall.yahoo.com/search/product?p=%E5%B1%88%E8%87%A3%E6%B0%8F&pg={}".format(i))

    # Wait until the product grid has been rendered
    wait.until(EC.presence_of_element_located((By.XPATH, "//ul[@class='gridList']")))

    product_links = browser.find_elements(By.XPATH, "//ul[@class='gridList']/li/a")

    for link in product_links:
        print(f"{link.get_attribute('href')}")
        # Note: each href is wrapped in its own one-element list here
        output.append([link.get_attribute('href')])


for b in output[:3]:
    print(b)

That is the full code. I am trying to turn each string into a URL, but it doesn't work.

I think what you're trying to do is:

# import the parser
from urllib.parse import urlparse

# put the links in a list (each link is wrapped in its own one-element list)
b = [['https://tw.mall.yahoo.com/item/p033088522688'],
     ['https://tw.mall.yahoo.com/item/p0330103147501'],
     ['https://tw.mall.yahoo.com/item/p033097510324']]

# go through each element of the list and parse the string inside it
for i in range(len(b)):
    print(urlparse(b[i][0]))

That's not a list of URLs. You can define the list like this:

b = ['https://tw.mall.yahoo.com/item/p033088522688', 'https://tw.mall.yahoo.com/item/p0330103147501', 'https://tw.mall.yahoo.com/item/p033097510324']

After that, iterate over the list with a for loop to get each string, and parse the string as a URL inside the loop. You first need to import urlparse; in Python 3 it lives in urllib.parse.

from urllib.parse import urlparse

b = ['https://tw.mall.yahoo.com/item/p033088522688', 'https://tw.mall.yahoo.com/item/p0330103147501', 'https://tw.mall.yahoo.com/item/p033097510324']

for el in b:
    parsedUrl = urlparse(el)
    # do something with parsedUrl

You can find more about the urlparse library here (the page documents the Python 2 module; in Python 3 the same functions live in urllib.parse): https://pymotw.com/2/urlparse/
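As a rough illustration of what "do something with parsedUrl" could look like, the sketch below reads the scheme, host, and path from each parsed result. Pulling the last path segment out as an item ID is only an assumption about how these Yahoo links are laid out, and the name item_ids is made up for the example.

```python
from urllib.parse import urlparse

urls = [
    'https://tw.mall.yahoo.com/item/p033088522688',
    'https://tw.mall.yahoo.com/item/p0330103147501',
    'https://tw.mall.yahoo.com/item/p033097510324',
]

for u in urls:
    parts = urlparse(u)
    # ParseResult exposes the pieces of the URL as attributes
    print(parts.scheme, parts.netloc, parts.path)

# Assumption: the item ID is the last segment of the path
item_ids = [urlparse(u).path.rsplit('/', 1)[-1] for u in urls]
print(item_ids)
```

urlparse does not fetch anything; it only splits the string into its components.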

Well, at first:

NameError: name 'url' is not defined

This error is being thrown because either you're not importing the correct library or there is no object with the name url .
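A small sketch of that failure, assuming nothing called url has been imported or assigned anywhere in the script: the call fails with NameError before any parsing happens, while the imported urlparse function works on the same string.

```python
from urllib.parse import urlparse

link = 'https://tw.mall.yahoo.com/item/p033088522688'

try:
    url.parse(link)  # `url` was never imported or assigned
except NameError as exc:
    error_message = str(exc)
    print(error_message)

# The imported function parses the same string without trouble
parsed = urlparse(link)
print(parsed.netloc)
```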

Your variable b is a list of strings, from which you can access any element using b[index], where index is the position of the string in the list (e.g. b[0] gives https://tw.mall.yahoo.com/item/p033088522688, and so on).

In Python you define a list by putting its items between square brackets, separated by commas. Where you have gone wrong is that, rather than listing the URLs separated by commas, you have defined each one as its own list.
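If the nested list has already been built, for example by appending [href] instead of href in the scraping loop, it can be flattened before parsing. This is only a sketch: the variable names output and flat_urls are illustrative, and the three links stand in for whatever the scraper collected.

```python
from urllib.parse import urlparse

# Shape produced by output.append([href]): a list of one-element lists
output = [
    ['https://tw.mall.yahoo.com/item/p033088522688'],
    ['https://tw.mall.yahoo.com/item/p0330103147501'],
    ['https://tw.mall.yahoo.com/item/p033097510324'],
]

# Flatten into a plain list of strings
flat_urls = [href for row in output for href in row]

# Each element is now a string, so urlparse accepts it directly
parsed = [urlparse(href) for href in flat_urls]
for p in parsed:
    print(p.netloc, p.path)
```

Appending the string itself, output.append(href), avoids the extra nesting in the first place.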

The reason for your current error (as alluded to in the comments) is likely that nothing named url has been imported. Note that urllib.parse is a module, not a function; the function to call is urlparse.

import urllib.parse as url
urls = ['https://tw.mall.yahoo.com/item/p033088522688','https://tw.mall.yahoo.com/item/p0330103147501','https://tw.mall.yahoo.com/item/p033097510324']
for url_string in urls:
    print(url.urlparse(url_string))
