
BeautifulSoup (Python): unable to scrape data from a website

I have been using Python with BeautifulSoup to scrape data, and so far it has worked on every site I tried. But I am stuck on the following website.

Target Site: LyricsHindiSong

My goal is to scrape song lyrics from the website mentioned above. But every time, it either gives a blank result or raises a "'NoneType' object has no attribute ..." kind of error.

I have been struggling with this for the last 15 days and have not been able to figure out where the problem is or how to fix it.

The following is the code I am using.

import pymysql
import requests
from bs4 import Beautifulsoup

r=requests.get("https://www.lyricshindisong.in/2020/04/chnda-re-chnda-re-chhupe-rahana.html")
soup=Beautifulsoup(r.content,'html5lib')
pageTitle=soup.find('h1').text.strip()
targetContent=soup.find('div',{'style':'margin:25px; color:navy;font-size:18px;'})
print(pageTitle)
print(targetContent.text.strip())

It fails with a "'NoneType' object has no attribute 'text'" error. When I check the page in the browser's inspect window, both elements are present, so I cannot understand where the problem is. At the very least, it should have printed the page title.

Hope you understand my requirement. Please guide me. Thanks.

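You misspelled the class name imported from the bs4 library: it is BeautifulSoup (capital S), not Beautifulsoup. You also used the find method, which returns only the first matching element, instead of find_all, which returns every match.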
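
To illustrate the difference between find and find_all, here is a minimal sketch. It assumes a small inline HTML string (hypothetical, not the real page) and the built-in "html.parser" so that it runs without a network connection or html5lib. It also shows that find returns None when nothing matches, which is where a "'NoneType' object has no attribute" error comes from once .text is accessed on the result.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML standing in for the real page.
SAMPLE_HTML = """
<h1> Sample Song </h1>
<div style="color:navy;">first stanza</div>
<div style="color:navy;">second stanza</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns only the first matching tag (or None if there is no match).
first = soup.find("div", {"style": "color:navy;"})

# find_all() returns a list-like result with every matching tag.
all_divs = soup.find_all("div", {"style": "color:navy;"})

# No div has this style, so find() returns None here;
# calling missing.text would raise AttributeError.
missing = soup.find("div", {"style": "color:red;"})

first_text = first.text
stanza_texts = [div.text for div in all_divs]
```

Note that the style attribute is compared as an exact string, so any difference in spacing or order between the selector and the served HTML makes the lookup return nothing.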

Full code:

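import requests
from bs4 import BeautifulSoup  # note the capital "S" in "Soup"

# Download the lyrics page.
url = "https://www.lyricshindisong.in/2020/04/chnda-re-chnda-re-chhupe-rahana.html"
response = requests.get(url)

# Parse the HTML (the html5lib parser must be installed).
soup = BeautifulSoup(response.content, 'html5lib')

# The song title is in the first <h1>; the lyrics are in every <div> with this inline style.
title = soup.find('h1').text.strip()
content = soup.find_all('div', {'style': 'margin:25px; color:navy;font-size:18px;'})

print(title)

# Print each matching <div> on its own line.
for line in content:
    print(line.text.strip())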

Result:

python answer.py
Chnda Re Chnda Re Chhupe Rahana
चंदा रे, चंदा रे, छुपे रहनासोये मेरी मैना, लेके मेरी निंदिया रे
फूल चमेली धीरे महको, झोका ना लगा जाये नाजुक डाली कजरावाली सपने में मुस्काये लेके मेरी निंदिया रे
हाथ कहीं है, पाँव कहीं है, लागे प्यारी प्यारी ममता गाए, पवन झुलाये, झूले राजकुमारी लेके मेरी निंदिया रे  
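
If the page layout changes, soup.find('h1') can still return None and raise the same error as in the question. A more defensive variant might look like the sketch below. The function name extract_lyrics and the inline sample HTML are hypothetical, introduced only for illustration; the style string is the one from the question. It uses get_text with a space separator so that text on either side of a <br> is not glued together.

```python
from bs4 import BeautifulSoup

# Inline style used in the question to locate the lyrics blocks.
LYRICS_STYLE = "margin:25px; color:navy;font-size:18px;"


def extract_lyrics(html):
    """Return (title, stanzas); title is None and stanzas is [] when absent."""
    soup = BeautifulSoup(html, "html.parser")

    # Guard against a missing <h1> instead of calling .text on None.
    heading = soup.find("h1")
    title = heading.get_text(strip=True) if heading is not None else None

    # find_all() returns an empty result when nothing matches, so no guard is needed.
    stanzas = [
        div.get_text(" ", strip=True)
        for div in soup.find_all("div", {"style": LYRICS_STYLE})
    ]
    return title, stanzas


# Hypothetical sample input, not the real page.
sample = (
    "<h1> Demo Title </h1>"
    '<div style="margin:25px; color:navy;font-size:18px;">line one<br>line two</div>'
)
title, stanzas = extract_lyrics(sample)

# A page with neither element yields (None, []) rather than an exception.
empty_title, empty_stanzas = extract_lyrics("<p>no heading here</p>")
```

With the real site, the same function could be called as extract_lyrics(response.content) after the requests.get call shown in the answer.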
