When I use
import urllib2
from bs4 import BeautifulSoup

page = urllib2.urlopen("https://somewebpage.com")
soup = BeautifulSoup(page, "html.parser")
soup.get_text()
I get a result that looks like a list of table rows, but it isn't one. It comes back as a single text value:
["<a href='/path<a>","tableNameAAA","FINISHED","SUCCEEDED","<br title='100.0'> <div class='ui-progressbar ui-widget ui-widget-content ui-corner-all' title='100.0%'> ,"0"],
["<a href='/path<a>","tableNameBBB","INPROCESS","SUCCEEDED","<br title='100.0'> <div class='ui-progressbar ui-widget ui-widget-content ui-corner-all' title='100.0%'> ,"0"],...
How do I convert this to a list so I can iterate through it? I tried list(soup.get_text()), but that splits the string into individual characters:
...v', u'>', u'"', u',', u'"', u'<', u'a', u' ', u'...
What I expect when I iterate is a sequence of lists: [list1],[list2]
instead of what I have now, which is the single string "[list1],[list2]"
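In the end, I just stripped out all the single quotes and then built a list of the lists for all the tables. This could probably have been done without BS, but it works.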
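The approach described above might look roughly like the sketch below. It is not the exact code that was used: `parse_rows` and the `sample` string are hypothetical, and the sample is a simplified stand-in for the real page text. It assumes every field is closed with a double quote, which the truncated rows shown in the question are not, so the real text may need more cleanup first.

```python
import json

def parse_rows(text):
    """Turn text like '["a","b"],["c","d"]' into a list of lists.

    Hypothetical helper: assumes each row is a bracketed, comma-separated
    group of double-quoted strings, i.e. valid JSON once wrapped in [ ].
    """
    # Drop single quotes, surrounding whitespace and any trailing comma.
    cleaned = text.replace("'", "").strip().rstrip(",")
    # Wrap the comma-separated rows in brackets so they form one JSON array.
    return json.loads("[" + cleaned + "]")

# Simplified, made-up sample in the same shape as the rows in the question.
sample = (
    '["/path","tableNameAAA","FINISHED","SUCCEEDED","0"],\n'
    '["/path","tableNameBBB","INPROCESS","SUCCEEDED","0"],'
)

rows = parse_rows(sample)
for row in rows:
    # Each row is now a real list, so fields can be indexed directly.
    print(row[1] + " " + row[2])
```

As the answer notes, nothing here depends on BeautifulSoup once the raw text is in hand; the parsing is plain string handling plus `json.loads`.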