
How to get the first row of a bs4 findAll result using Python?

My code is given below:

import requests
import re

from bs4 import BeautifulSoup


page = requests.get(
    "https://catalog.data.gov/dataset?q=&sort=metadata_created+desc")

soup = BeautifulSoup(page.content, 'html.parser')

# value = soup.find_all(class_='new-results')

for hit in soup.findAll(attrs={'class': 'dataset-heading'}):
    print(hit.text)

My results come back as several rows, e.g.:

Culverts

Iowa Geographic Map Server

Potential Vorticity based parameterization for specification of Upper troposphere/lower stratosphere ozone in atmospheric models

A demonstration of the uncertainty in predicting the estrogenic activity of individual chemicals and mixtures from an in vitro estrogen receptor transcriptional activation assay (T47D-KBluc) to the in vivo uterotrophic assay using oral exposure

data for MRPAT simulation

Waterline ATS BG disinfection data

Computer Code for Industrial Wireless Measurement Analysis and Scenario Generation

My question:

How can I get only the first row, in this case 'Culverts'?

In other words, how do I get the first row from the bs4 findAll results?

Try soup.find instead of soup.findAll.

It returns only the first matching element rather than a list of all matches.
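To make the difference concrete, here is a minimal sketch that runs against an inline HTML string instead of the live page. The markup in SAMPLE_HTML is an assumption made for illustration; the real catalog page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real catalog page.
SAMPLE_HTML = """
<ul>
  <li><h3 class="dataset-heading"><a href="#">Culverts</a></h3></li>
  <li><h3 class="dataset-heading"><a href="#">Iowa Geographic Map Server</a></h3></li>
  <li><h3 class="dataset-heading"><a href="#">data for MRPAT simulation</a></h3></li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all() returns every match as a list-like ResultSet.
all_hits = soup.find_all(class_="dataset-heading")

# find() returns only the first match (the same element as all_hits[0]).
first_hit = soup.find(class_="dataset-heading")

# get_text(strip=True) trims the surrounding whitespace from the text.
first_title = first_hit.get_text(strip=True)
print(first_title)
```

Indexing the list with all_hits[0] would also work, but find stops at the first match and avoids building the full list.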

I modified your code slightly:

import requests
import re

from bs4 import BeautifulSoup


page = requests.get(
    "https://catalog.data.gov/dataset?q=&sort=metadata_created+desc")

soup = BeautifulSoup(page.content, 'html.parser')
# find() returns only the first matching element, so no loop is needed
a = soup.find(attrs={'class': 'dataset-heading'}).text
print(a)

As @Sid said, use find to get only the first element. There is no need for a for loop or findAll.
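One caveat: find returns None when nothing matches, so calling .text on the result directly raises an AttributeError if the class is missing from the page. A rough sketch of a guarded version follows; the helper name first_text and the inline HTML are hypothetical, introduced only for illustration.

```python
from bs4 import BeautifulSoup


def first_text(soup, css_class):
    """Return the stripped text of the first element with css_class, or None."""
    tag = soup.find(class_=css_class)
    # find() yields None when there is no match, so guard before reading text.
    return tag.get_text(strip=True) if tag is not None else None


# Hypothetical markup, not the real catalog page.
html = (
    '<div>'
    '<h3 class="dataset-heading"> Culverts </h3>'
    '<h3 class="dataset-heading">Other</h3>'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

found = first_text(soup, "dataset-heading")
missing = first_text(soup, "no-such-class")

# select_one() is the CSS-selector counterpart of find().
via_selector = soup.select_one(".dataset-heading").get_text(strip=True)

print(found, missing, via_selector)
```

Whether to use find or select_one is mostly a matter of taste; both return a single element or None.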
