I am trying to write a web scraper for wordreference.com. This is the code:
import bs4, sys, requests
from urllib import request

def scraper():
    url = f"https://wordreference.com/iten/{sys.argv[1]}"
    r = requests.get(url)
    html = r.text
    soup = bs4.BeautifulSoup(html, "html.parser")
    x = soup.find_all("td", "ToWrd")
    print(x[1].contents)

if __name__ == "__main__":
    scraper()
The code runs, but the output is not what I want. Input:
python3 scraper.py cane
Output:
['dog ', <em class="tooltip POS2">n<span><i>noun</i>: Refers to person, place, thing, quality, etc.</span></em>]
I want to print only the first element of the list, the one that contains 'dog'. How can I do that?
Here is the code:
print(x[1].contents[0])
x[1].contents returns a list of the tag's children, so to get the string 'dog ' you need to index it with 0. Note the trailing space: call .strip() on the result if you want just 'dog'.
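To try this without hitting the site, here is a minimal sketch that runs the same lookup on a static HTML snippet. SAMPLE_HTML is made up to resemble the output shown in the question, and first_text is a hypothetical helper name, so the real page structure may differ:

```python
import bs4

# Assumption: a hand-written snippet modelled on the output in the question,
# not the actual wordreference.com markup.
SAMPLE_HTML = """
<table>
  <tr><td class="ToWrd">English</td></tr>
  <tr><td class="ToWrd">dog <em class="tooltip POS2">n<span><i>noun</i>: Refers to person, place, thing, quality, etc.</span></em></td></tr>
</table>
"""

def first_text(cell):
    """Return the first child of a tag as a string, without surrounding whitespace."""
    return str(cell.contents[0]).strip()

soup = bs4.BeautifulSoup(SAMPLE_HTML, "html.parser")
cells = soup.find_all("td", "ToWrd")  # a string as the second argument matches the CSS class
word = first_text(cells[1])
print(word)
```

In the original scraper, the same idea would be print(x[1].contents[0].strip()). It may also be worth checking that x has at least two elements before indexing, since the word you look up might not be found.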