My code is given below:
import requests
import re
from bs4 import BeautifulSoup
page = requests.get(
"https://catalog.data.gov/dataset?q=&sort=metadata_created+desc")
soup = BeautifulSoup(page.content, 'html.parser')
# value = soup.find_all(class_='new-results')
for hit in soup.findAll(attrs={'class': 'dataset-heading'}):
    print(hit.text)
My results come back as several rows, e.g.:
Culverts
Iowa Geographic Map Server
Potential Vorticity based parameterization for specification of Upper troposphere/lower stratosphere ozone in atmospheric models
A demonstration of the uncertainty in predicting the estrogenic activity of individual chemicals and mixtures from an in vitro estrogen receptor transcriptional activation assay (T47D-KBluc) to the in vivo uterotrophic assay using oral exposure
data for MRPAT simulation
Waterline ATS BG disinfection data
Computer Code for Industrial Wireless Measurement Analysis and Scenario Generation
My question:
How can I get only the first row, in this case 'Culverts'?
In other words, how do I get the first row from the bs4 findAll results?
Try soup.find instead of soup.findAll. It returns only the first result.
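As a minimal sketch of the difference, the snippet below parses a small inline HTML string instead of fetching the live page. The markup is an assumption made for illustration; the real page structure on catalog.data.gov may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real search results page.
html = """
<div>
  <h3 class="dataset-heading"><a href="#">Culverts</a></h3>
  <h3 class="dataset-heading"><a href="#">Iowa Geographic Map Server</a></h3>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag (or None if nothing matches).
first = soup.find(class_="dataset-heading")
first_text = first.get_text(strip=True)

# find_all() returns every match; indexing [0] gives the same first element.
all_hits = soup.find_all(class_="dataset-heading")
first_via_list = all_hits[0].get_text(strip=True)

print(first_text)
```

If the full list is needed elsewhere anyway, indexing the find_all result with [0] avoids a second search; otherwise find is the shorter option.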
I modified your code a little bit.
import requests
import re
from bs4 import BeautifulSoup
page = requests.get(
"https://catalog.data.gov/dataset?q=&sort=metadata_created+desc")
soup = BeautifulSoup(page.content, 'html.parser')
# value = soup.find_all(class_='new-results')
#for hit in soup.find(attrs={'class': 'dataset-heading'}).text:
a = soup.find(attrs={'class': 'dataset-heading'}).text
print(a)
As @Sid said, use find to get only the first element. There is no need for a for loop or findAll.
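One caveat: find returns None when nothing matches, so calling .text on the result directly raises AttributeError if the class is missing (for example, if the page layout changes). A hedged sketch of a guard follows; the helper name first_heading and its default class argument are hypothetical, chosen only for this example.

```python
from bs4 import BeautifulSoup


def first_heading(html, css_class="dataset-heading"):
    """Return the text of the first tag with css_class, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    hit = soup.find(class_=css_class)
    # Guard against a missing element before reading its text.
    if hit is None:
        return None
    return hit.get_text(strip=True)


# Usage with inline HTML (assumed markup, not the live page):
print(first_heading('<h3 class="dataset-heading"> Culverts </h3>'))
print(first_heading("<p>no results</p>"))
```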