Calling a function inside another function's arguments

Question


import requests
from bs4 import BeautifulSoup
import pandas as pd

header = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.11',
       'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
       'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
       'Accept-Encoding': 'none',
       'Accept-Language': 'en-US,en;q=0.8',
       'Connection': 'keep-alive'}

def url_parse(url):
    if url.endswith('/'):
        baseUrl = url.rstrip('/')  # strip only the trailing slash, not every '/'
    else:
        baseUrl = url
    return url, baseUrl

def scrap_hrefs(url,baseUrl):
    resp = requests.get(url, headers= header)
    respData = BeautifulSoup(resp.content, 'html.parser')     
    allHrefs = respData.select('[href]')
    return allHrefs, baseUrl
    
    
def get_hrefs(allHrefs, baseUrl):
    for i in range(0,len(allHrefs)):
        if allHrefs[i]['href'].startswith('/'):
            allHrefs[i]= baseUrl + allHrefs[i]['href']
        else:
            allHrefs[i]= allHrefs[i]['href']
    return allHrefs

def store_hrefs(allHrefs):
    links = {'links' : allHrefs}
    df = pd.DataFrame(links)
    df.to_csv("autoliv_home_page_links.csv")
    return df
    
def run_scraper(url) :
    store_hrefs(get_hrefs(scrap_hrefs(url_parse(url))))
     
    
run_scraper('https://www.example.com/')

When I run the code above, it gives me the following error: scrap_hrefs() missing 1 required positional argument: 'baseUrl'

The url_parse() function returns two things, and the scrap_hrefs() function accepts two arguments. So why does it fail?

Answer 1

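Add a * in front of url_parse (edit: and in front of scrap_hrefs, which also returns a tuple): store_hrefs(get_hrefs(*scrap_hrefs(*url_parse(url))))

Every function call in Python returns exactly one object. When you say url_parse returns two things, it actually returns a single tuple made up of two elements (but it is still one tuple).

That tuple is passed as the first argument of scrap_hrefs(), so scrap_hrefs() is missing its second argument.

Placing a * in front of a tuple or list when calling a function tells Python to take all the elements of that tuple or list and pass them to the function as if they were separate positional arguments. As a result, scrap_hrefs sees two arguments: the two elements of the tuple returned by url_parse.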
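To see the difference in isolation, here is a minimal sketch that swaps the scraper functions for stand-ins so it runs without network access. The names make_pair and join_pair are hypothetical and are not part of the question's code.

```python
def make_pair(url):
    # Returns ONE object: a 2-element tuple, like url_parse() in the question.
    return url, url.rstrip('/')


def join_pair(url, base_url):
    # Requires two separate positional arguments, like scrap_hrefs().
    return f"{url} -> {base_url}"


result = make_pair('https://www.example.com/')
print(type(result).__name__, len(result))

error_message = None
try:
    # The whole tuple is bound to `url`; nothing is left for `base_url`.
    join_pair(make_pair('https://www.example.com/'))
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# The * spreads the tuple's elements across the positional parameters.
joined = join_pair(*make_pair('https://www.example.com/'))
print(joined)
```

The first call raises the same kind of TypeError as in the question, while the starred call succeeds.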

Answer 2

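The problem is that url_parse(url) returns a tuple, whereas the scrap_hrefs function expects two separate arguments rather than a tuple, so you need to unpack the tuple as follows:

scrap_hrefs(*url_parse(url))

For example, if you print a tuple without unpacking it, like this: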

# A tuple is created
z = (10, 100)
   
print(z)

Output:

(10, 100)

But if you unpack it:

# unpacked tuple
print(*z)

Output:

10 100

For more information on unpacking tuples in Python, see this link.
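As an alternative to nesting starred calls, the tuples can be unpacked into named variables one step at a time. This is only a sketch and not something either answer proposed: scrap_hrefs here is a stand-in that returns fixed sample hrefs instead of making an HTTP request, and the snake_case names are assumptions.

```python
def url_parse(url):
    # Stand-in for the question's function: returns a 2-tuple.
    return url, url.rstrip('/')


def scrap_hrefs(url, base_url):
    # Stand-in that skips the HTTP request and returns fixed sample hrefs.
    return ['/about', 'https://www.example.com/contact'], base_url


def get_hrefs(all_hrefs, base_url):
    # Prefix relative links with the base URL; keep absolute links as-is.
    return [base_url + h if h.startswith('/') else h for h in all_hrefs]


def run_scraper(url):
    # Tuple unpacking on assignment: each returned tuple is split into names.
    url, base_url = url_parse(url)
    all_hrefs, base_url = scrap_hrefs(url, base_url)
    return get_hrefs(all_hrefs, base_url)


links = run_scraper('https://www.example.com/')
print(links)
```

Naming the intermediate values makes each step easier to inspect in a debugger, at the cost of a few extra lines.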

2 answers

Answer 1
3 2021-01-18 07:03:26

Answer 2
2 2021-01-18 07:04:27
