Extract part of a URL using Python 3.x

I am trying to extract just the ICID bit from my URL:

url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"

So what I am after really is:

ICID=secondary_pricing_goldplus_cust_paymentpage_anon

The ICID bit could be anywhere in the URL: beginning, middle, or end. I am trying the following, but it doesn't work (any help would be highly appreciated):

from urllib.parse import urlparse
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
obj = urlparse(url)
print(obj)
query = obj.query
print (query)
path_list = query.split("ICID(.+?)&")
print (path_list)

You're halfway there. Using urlparse is the right first step, but you then want parse_qs (also from urllib.parse) to parse the query string into a dict that maps each parameter name to a list of its values:

from urllib.parse import urlparse, parse_qs
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
query = urlparse(url).query
path_list = parse_qs(query)['ICID']

Output:

>>> print(path_list)
['secondary_pricing_goldplus_cust_paymentpage_anon']
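Note that parse_qs(query)['ICID'] raises a KeyError when the parameter is absent, and that it returns only the value, not the full ICID=... pair the question asks for. A minimal sketch of one way to handle both, assuming a hypothetical helper named extract_param and a shortened example.com URL (neither is from the original answer):

```python
from urllib.parse import urlparse, parse_qs


def extract_param(url, name):
    """Return 'name=value' for the first occurrence of name, or None if absent."""
    params = parse_qs(urlparse(url).query)
    values = params.get(name)  # .get avoids a KeyError for a missing parameter
    if not values:
        return None
    return f"{name}={values[0]}"


# Hypothetical, shortened URL used only for illustration
url = "https://example.com/checkout/?productId=abc&ICID=secondary_pricing&redirectTo=https%3A%2F%2Fexample.com%2F"
print(extract_param(url, "ICID"))
print(extract_param(url, "missing"))
```

Because parse_qs works on parsed parameter names, the position of ICID in the query string does not matter. Keep in mind that parse_qs percent-decodes values, so the returned value can differ from the raw text in the URL.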

I would approach this by splitting the url string at the "&" sign:

url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"

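# drop everything up to the "?" so that a leading ICID parameter
# does not drag the scheme, host and path along with it
query = url.split("?", 1)[-1]
split_list = query.split("&")

# iterate through the key=value pairs and pick the one whose key is "ICID"
# (a plain '"ICID" in pair' test would also match values that contain "ICID")
for pair in split_list:
    if pair.startswith("ICID="):
        # pair is your final string
        print(pair)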

