I'm struggling to grab a tag that doesn't have a class or id. It is just an a href followed by the link.
Here is the HTML. There is more, but this is a short excerpt. I'm trying to grab the a href="url is here", but I can't just grab "a" because that would grab every link on the page.
<table>
<tbody>
<tr class="">
<td class="col1 align">
<a href="url is here">
1
</a>
</td>
<td class="col2">
<a href="www.example.com">
<img class="avatar" src="www.example.com" alt="le me">
le me
<img class="test" alt="test" title="test" src="test-icon.png">
</a>
</td>
<td class="col3 align">
<a href="www.example.com">
2,715
</a>
</td>
<td class="col4 align">
<a href="www.example.com">
5,400,000,000
</a>
</td>
</tr>
My code:
source_code = requests.get(url)
plain_text = source_code.text
soup = BeautifulSoup(plain_text)
for link in soup.findAll():
    username = link.get()
    print(username)
I left the findAll() and get() arguments blank because nothing I tried worked. I'm not sure what else to do.
You can select all <a> tags and use the has_attr method to skip any tag that has a class or id attribute:
for link in soup.findAll('a'):
    # Skip links that carry a class or id attribute
    if link.has_attr('class') or link.has_attr('id'):
        continue
    username = link.get('href')
    print(username)