
While loop data not appending to list outside of while loop

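I'm trying to scrape data, write it to a pd Series, and then go into a while loop that appends the remaining pages of the website to the original Series (which lives outside the while loop) after each iteration. I'm not sure why this isn't working. Here is where I'm stuck: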

import requests
import pandas as pd
from bs4 import BeautifulSoup

current_url = 'https://www.yellowpages.com/search?search_terms=hvac&geo_location_terms=97080'

def get_data_run(current_url):
    company_names1 = get_company_name(current_url)
    print(company_names1) #1
    page = 1
    max_page = 3
    company_names1 = paginate(current_url, page, max_page, company_names1)
    print(company_names1) #2



def paginate(current_url, page, max_page, company_names1):
    while (page <= max_page):
            new_url = current_url + f"&page={page}"
            print(new_url)
            company_names = get_company_name(new_url)
            company_names1.append(company_names)
            print(company_names) #3
            print(company_names1) #4
            
            page +=1
            if page == max_page:
                return company_names1

def get_company_name(url):
    company_names = []
    page = requests.get(url)
    soup = BeautifulSoup(page.content, 'lxml')
    box = list(soup.findAll("div", {"class": "result"}))
    for i in range(len(box)):
        try:
            company_names.append(box[i].find("a", {"class": "business-name"}).text.strip())
        except Exception:
            company_names.append("null")
        else: 
            continue
    company_names = pd.Series(company_names, dtype='string')
    return company_names


get_data_run(current_url)

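I've numbered the different prints of company_names1 and company_names. Every time, company_names1 prints the same series of companies, even after company_names has been appended in the while loop. What I can't understand is that when I print company_names (#3), it does print the next page's company names. I don't see why they aren't being appended inside the while loop, and therefore why the combined Series isn't returned from the function and printed at print #2. Thanks!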

Update: here is some sample output.

When I print #3:

(pyfinance) justinbenfit@MacBook-Pro-3 yellowpages_scrape % /usr/local/anaconda3/envs/pyfinance/bin/python /Users/justinbenfit/Desktop/yellowpages_scrape/test.py
0             Honke Heating & Air Conditioning
1                   Climate Kings Heating & Ac
2                  Mike's Truck & Auto Service
3          One Hour Heating & Air Conditioning
4                 Morgan Heating & Cooling Inc
5       Rnr Heating Venting & Air Conditioning
6                           Universal HVAC Inc
7                                   Mr Furnace
8                Affordable Excellence Heating
9                           Green Air Products
10                        David Eugene Neketin
11                  Century Heating & Air Cond
12                            Appliance Wizard
13             Precision Energy Solutions Inc.
14      Portland Heating & Air Conditioning Co
15                                         Mhc
16     American Pride Heating and Cooling, LLC
17                            Tri Star Western
18                 Comfort Zone Heat & Air Inc
19                          Don's Air-Care Inc
20                   Chuck's Heating & Cooling
21    Mt. Hood Heating Cooling & Refrigeration
22                   Chuck's Heating & Cooling
23                                 Mr. Furnace
24                  America's Same Day Service
25         Arctic Commercial Refrigeration LLC
26                          Apex Refrigeration
27        Ben's Heating & Air Conditioning LLC
28                       David's Appliance Inc
29                   Wolcott Heating & Cooling
dtype: string
0                                              Air-Trix
1                                      Johnstone Supply
2                            Buss Heating & Cooling Inc
3                                     The Heat Exchange
4                   Hoodview Heating & Air Conditioning
5                Loomis Heating Cooling & Refrigeration
6                       All About Air Heating & Cooling
7                                        Hanson Heating
8                              Sparks Heating & Cooling
9                              Interior Comfort Systems
10                              P D X Heating & Cooling
11                                      Apcom Power Inc
12                                     Area Heating Inc
13    Four Seasons Heating Air Conditioning & Servic...
14                                  Perfect Climate Inc
15                           Combustion Consultants Inc
16                            Classic Heat Source, Inc.
17                               Multnomah Heating, Inc
18     Apollo Plumbing, Heating & Air Conditioning - OR
19                             Art's Furnace & Air Cond
20                                      Kurchel Heating
21                               P & O Construction Inc
22                                Systems Management NW
23                                   Bridgetown Heating
24             Amana Heating & Air Conditioning Systems
25                                         QualitySmith
26                                   Wilbert Jr, Wilson
27                 Faith Heating & Air Conditioning Inc
28    Northwest Commercial Heating & Air Conditionin...
29                                     Heat Master Corp
dtype: string

When I print #1, #2, and #4:

0             Honke Heating & Air Conditioning
1                   Climate Kings Heating & Ac
2                  Mike's Truck & Auto Service
3          One Hour Heating & Air Conditioning
4                 Morgan Heating & Cooling Inc
5       Rnr Heating Venting & Air Conditioning
6                           Universal HVAC Inc
7                                   Mr Furnace
8                Affordable Excellence Heating
9                           Green Air Products
10                        David Eugene Neketin
11                  Century Heating & Air Cond
12                            Appliance Wizard
13             Precision Energy Solutions Inc.
14      Portland Heating & Air Conditioning Co
15                                         Mhc
16     American Pride Heating and Cooling, LLC
17                            Tri Star Western
18                 Comfort Zone Heat & Air Inc
19                          Don's Air-Care Inc
20                   Chuck's Heating & Cooling
21                   Chuck's Heating & Cooling
22                                 Mr. Furnace
23    Mt. Hood Heating Cooling & Refrigeration
24                  America's Same Day Service
25         Arctic Commercial Refrigeration LLC
26                          Apex Refrigeration
27        Ben's Heating & Air Conditioning LLC
28                       David's Appliance Inc
29                   Wolcott Heating & Cooling
dtype: string

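The problem is that you are treating a pd.Series like a list. list.append modifies the list in place, whereas Series.append leaves the original Series untouched and returns a new one. Appending to a list works like this: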

lst = [1,2,3]
lst.append(4)
print(lst)
# [1, 2, 3, 4]

The object is changed without any explicit assignment. If you do the same with a Series, this is what happens:

series = pd.Series([1,2,3])
series.append(pd.Series([4]))
print(series)

The output is:

0    1
1    2
2    3
dtype: int64
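
As a side note, a Series is not strictly immutable: individual elements can be reassigned in place. What matters here is that combining two Series always produces a new object. The sketch below illustrates the difference using object identity; it uses pd.concat rather than Series.append so that it also runs on recent pandas versions, and the variable names are only illustrative.

```python
import pandas as pd

lst = [1, 2, 3]
same_list = lst
lst.append(4)                 # mutates the existing list object
print(same_list is lst)       # True: still one object, now with four items

series = pd.Series([1, 2, 3])
series.iloc[0] = 10           # element assignment changes the Series in place
result = pd.concat([series, pd.Series([4])])  # returns a new Series
print(result is series)       # False: a different object
print(series.tolist())        # [10, 2, 3]: the original gained no rows
```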

So, to update a Series, you have to assign the result back to the original name (or to a new one). Without the assignment, the new Series is simply discarded:

series = pd.Series([1,2,3])
series = series.append(pd.Series([4]))
print(series)

Output:

0    1
1    2
2    3
0    4
dtype: int64
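
Note that Series.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on a current installation the snippet above raises an AttributeError. A minimal equivalent using pd.concat might look like the following; the names extra, combined, and renumbered are only illustrative.

```python
import pandas as pd

series = pd.Series([1, 2, 3])
extra = pd.Series([4])

# pd.concat returns a new Series; the result must be assigned to be kept.
combined = pd.concat([series, extra])

# ignore_index=True renumbers the rows 0..n-1 instead of repeating label 0.
renumbered = pd.concat([series, extra], ignore_index=True)

print(combined.index.tolist())    # [0, 1, 2, 0]
print(renumbered.index.tolist())  # [0, 1, 2, 3]
print(len(series))                # 3, the original is unchanged
```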

In your paginate function, change this line:

company_names1.append(company_names)

to:

company_names1 = company_names1.append(company_names)

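and everything should work. On pandas 2.0 and later, where Series.append no longer exists, write company_names1 = pd.concat([company_names1, company_names]) instead.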
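
Two other details in the question's paginate are worth a look. Because page is incremented before the `if page == max_page` check, the function returns before the last page is fetched, and it implicitly returns None if that check never matches. Page 1 also appears to be requested twice, once without the page parameter and once with `&page=1`. One way to sidestep all of this is to collect each page's Series in an ordinary list and concatenate once at the end. The following is only a sketch under assumptions: fetch_names is a hypothetical stand-in for get_company_name that returns made-up names without touching the network, and the example.com URL is a placeholder.

```python
import pandas as pd

def fetch_names(url):
    """Hypothetical stand-in for get_company_name: fake names, no network."""
    page = url.rsplit("=", 1)[-1]
    return pd.Series([f"Company {page}-{i}" for i in range(2)], dtype="string")

def paginate(base_url, first_page, max_page, fetch=fetch_names):
    """Fetch pages first_page..max_page (inclusive) and return one Series."""
    pages = []
    for page in range(first_page, max_page + 1):
        # list.append mutates the list in place, so no reassignment is needed
        pages.append(fetch(f"{base_url}&page={page}"))
    # Concatenate once at the end; ignore_index renumbers the rows 0..n-1.
    return pd.concat(pages, ignore_index=True)

names = paginate("https://example.com/search?q=hvac", 1, 3)
print(len(names))  # 6
```

In the real scraper, get_company_name would be passed as the fetch argument. Concatenating once at the end also avoids copying the growing Series on every iteration.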

