简体   繁体   中英

How to scrape a website using beautifulsoup

I am trying to scrape a website https://remittanceprices.worldbank.org/en/corridor/Australia/China for Fee field.

url = 'https://remittanceprices.worldbank.org/en/corridor/Australia/China'
r = requests.get(url,verify=False)
soup = BeautifulSoup(r.text,'lxml')
rows = [i.get_text("|").split("|") for i in soup.select('#tab-1 .corridor-row')]
for row in rows:
    #a,b,c,d,e = row[2],row[15],row[18],row[21],row[25]
    #print(a,b,c,d,e,sep='|')
    print('{0[2]}|{0[15]}|{0[18]}|{0[21]}|{0[25]}').format(row)

But I am receiving a response AttributeError with the above code.

Can anyone help me out?

The problem is you are using .format() with print() , not the string. .format() is a method of str type and print() actually returns None , so try:

url = 'https://remittanceprices.worldbank.org/en/corridor/Australia/China'
r = requests.get(url,verify=False)
soup = BeautifulSoup(r.text,'lxml')
rows = [i.get_text("|").split("|") for i in soup.select('#tab-1 .corridor-row')]
for row in rows:
    #a,b,c,d,e = row[2],row[15],row[18],row[21],row[25]
    #print(a,b,c,d,e,sep='|')
    print('{0[2]}|{0[15]}|{0[18]}|{0[21]}|{0[25]}'.format(row))

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM