I was trying to write a simple script with Beautiful Soup that scrapes just two pieces of information from a website and generates a SQL file.
import mechanize
import urlparse
from bs4 import BeautifulSoup
op = mechanize.Browser()
op.open("https://www.mentalhelp.net/symptoms/")
for link in op.links():
print link.text
print urlparse.urljoin(link.base_url, link.url)
get = BeautifulSoup(urllib2.urlopen("https://www.mentalhelp.net/symptoms/").read()).findAll('p')
print get
print "\n"
error:
C:\Python27>python symtoms.py
  File "symtoms.py", line 8
    print link.text
        ^
IndentationError: expected an indented block
I just want a script that scrapes those items and their short descriptions and generates a SQL file with only two fields, "name" and "sug". "name" holds the items and "sug" holds the descriptions.
Indentation is significant in Python: it determines which statements belong to a block, such as the body of a for loop, an if statement, a while loop, or a function.
In the code you gave, the statements after the for line are not indented, but a for loop needs at least one statement in its body. I think you meant the lines below the for line to be inside the loop, so you should indent them.
Code -
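# Note: the script also needs "import urllib2" at the top, since urllib2.urlopen is used below.
for link in op.links():
    # Every statement indented under the for line runs once per link.
    print link.text
    print urlparse.urljoin(link.base_url, link.url)
    get = BeautifulSoup(urllib2.urlopen("https://www.mentalhelp.net/symptoms/").read()).findAll('p')
    print get
    print "\n"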
I am not sure this will get you what you want, but it fixes your current error.
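To see the rule in isolation, here is a minimal sketch (written for Python 3, unlike the Python 2 script above) that compiles two tiny source strings, one with an unindented loop body and one with an indented body. The names broken, fixed and compiles are made up for this illustration.

```python
# Two versions of the same loop, held as source text (illustrative only).
broken = "for item in [1, 2]:\nprint(item)\n"      # loop body is not indented
fixed = "for item in [1, 2]:\n    print(item)\n"   # loop body indented by four spaces


def compiles(source):
    """Return True if the source compiles, False if it raises IndentationError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except IndentationError:
        return False


print(compiles(broken))  # False
print(compiles(fixed))   # True
```

The first string fails for the same reason as the script in the question: the for line is followed by a statement at the same indentation level, so the loop has no body.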
For the new requirement, to get just the classic symptoms and their descriptions, you can use -
soup = BeautifulSoup(urllib2.urlopen("https://www.mentalhelp.net/symptoms/").read())
for div in soup.findAll('div',{'id':'page'}):
    for entrydiv in div.findAll('div',{'class':'h4 entry-title'}):
        print(entrydiv.get_text())
        print(entrydiv.next_sibling.get_text())
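The question also asks for a SQL file with the two fields "name" and "sug", which the loop above does not produce. One possible way to do that, sketched here in Python 3 with only the standard library, is to collect each title and description as a pair and turn the pairs into INSERT statements. The table name symptoms, the helper names sql_quote, build_sql and write_sql, and the sample rows are all assumptions made for this sketch; the rows are placeholders, not text taken from the site.

```python
def sql_quote(value):
    """Quote a string as a SQL literal by doubling any single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_sql(rows, table="symptoms"):
    """Build a CREATE TABLE statement plus one INSERT per (name, sug) pair."""
    lines = ["CREATE TABLE {0} (name TEXT, sug TEXT);".format(table)]
    for name, sug in rows:
        lines.append(
            "INSERT INTO {0} (name, sug) VALUES ({1}, {2});".format(
                table, sql_quote(name.strip()), sql_quote(sug.strip())
            )
        )
    return "\n".join(lines) + "\n"


def write_sql(path, rows):
    """Write the generated statements to a file (path is chosen by the caller)."""
    with open(path, "w") as handle:
        handle.write(build_sql(rows))


# Placeholder rows standing in for the scraped (title, description) pairs.
rows = [
    ("Example item", "Example short description."),
    ("Item with 'quotes'", "Quotes are doubled in the output."),
]
print(build_sql(rows))
```

In the scraping loop, the pairs could be gathered with something like rows.append((entrydiv.get_text(), entrydiv.next_sibling.get_text())) and then passed to write_sql("symptoms.sql", rows). Doubling single quotes covers plain string literals; if the data goes straight into a database rather than a file, parameterized queries from the database driver are the safer choice.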