
beautifulsoup html.parser error

I am trying to use BeautifulSoup to parse HTML data from a URL. However, I keep getting the warning:

"No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

 BeautifulSoup([your markup])

to this:

 BeautifulSoup([your markup], "html.parser")

  markup_type=markup_type))

I currently have

import urllib2
from bs4 import BeautifulSoup

url = "myurl.com"

page = urllib2.urlopen(url).read()
soup = BeautifulSoup(page, "html.parser")

Any ideas?

I had the same problem. I searched but found no real fix, so I commented out the part of the bs4 source that prints the warning. This is how I solved it:

if builder.is_xml:
    markup_type = "XML"
else:
    markup_type = "HTML"
#warnings.warn(self.NO_PARSER_SPECIFIED_WARNING % dict(
#    parser=builder.NAME,
#    markup_type=markup_type))
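
Editing the installed bs4 source is lost on the next upgrade, so a less invasive option may be to filter the warning from your own code with the standard `warnings` module. The sketch below is an alternative to the edit above, not part of the original answer. It assumes the warning text still begins with "No parser was explicitly specified", as quoted in the question, and `parse_quietly` is a hypothetical helper name.

```python
import warnings

from bs4 import BeautifulSoup


def parse_quietly(markup):
    """Parse markup without naming a parser, hiding only the bs4 parser warning."""
    with warnings.catch_warnings():
        # The message argument is a regex matched against the start of the
        # warning text; this prefix is taken from the warning in the question.
        warnings.filterwarnings(
            "ignore", message="No parser was explicitly specified"
        )
        return BeautifulSoup(markup)


soup = parse_quietly("<p>hello</p>")
print(soup.p.get_text())
```

The filter is active only inside the `with` block, so other warnings in the program are still shown. Passing the parser name explicitly remains the simpler fix.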


BeautifulSoup expects you to name a parser explicitly, ideally a better one than the built-in default. Try installing one of the recommended third-party parsers, such as lxml or html5lib. You also need to make sure your target environments have the same parser installed.
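
One way to cope with environments that may or may not have a third-party parser is to ask for it and fall back to the built-in one. This is a minimal sketch, assuming lxml is the preferred parser; `make_soup` is a hypothetical helper name. `FeatureNotFound` is the exception bs4 raises when the requested parser is not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred="lxml"):
    """Parse with the preferred parser, falling back to the stdlib html.parser."""
    try:
        return BeautifulSoup(markup, preferred)
    except FeatureNotFound:
        # The preferred parser is not installed in this environment.
        return BeautifulSoup(markup, "html.parser")


soup = make_soup("<p>hi</p>")
print(soup.p.get_text())
```

Different parsers can build slightly different trees from invalid markup, so the fallback trades consistency for portability.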

The warning itself gives the solution. I just followed it and added the second parameter 'html.parser', which removes the warning.

parsed_html = BeautifulSoup(html,'html.parser')
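
For reference, a self-contained version of that call might look like the following. The HTML string is made up for illustration.

```python
from bs4 import BeautifulSoup

# Illustrative markup; in practice this would be the downloaded page.
html = "<html><head><title>Example</title></head><body><p>Hi</p></body></html>"

# Naming the parser as the second argument avoids the warning.
soup = BeautifulSoup(html, "html.parser")

print(soup.title.string)
print(soup.p.get_text())
```

The same argument can also be passed by keyword as `features="html.parser"`.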

I had the same problem, but I solved it as follows:

if builder.is_xml:
   markup_type = "lxml"
else:
   markup_type = "HTML"

and changed this:

soup = BeautifulSoup(sys.stdin)

to this:

soup = BeautifulSoup(sys.stdin, "html.parser")


 