I am new to Python. I wrote a small program to fetch all the links on a page. I am using Python 2.7, the version that ships with Ubuntu. I put the code together from different sources, but it seems I am either missing a library or using a library meant for a different version of Python.
import sys
from bs4 import *
import urllib2
import re

if len(sys.argv) != 2:
    print "USAGE:"
    print "python test.py Your_URL"
else:
    url = sys.argv[1]
    html_page = urllib2.urlopen(url)
    soup = BeautifulSoup(html_page)
    for link in soup.findAll('a'):
        print link.get('href')
I am getting this error:
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    html_page = urllib2.urlopen(url)
  File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python2.7/urllib2.py", line 421, in open
    protocol = req.get_type()
  File "/usr/lib/python2.7/urllib2.py", line 283, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type: www.cs.odu.edu
I installed bs4 and pip after Python, but I still get the same error:
sudo apt install python
sudo apt install python-pip
sudo pip install bs4
When you enter a URL in a browser without a protocol, the browser defaults to HTTP. urllib2 won't make that assumption for you; you need to prefix the URL with http://.
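For example, you could guard the command-line argument before passing it to urlopen. A minimal sketch (the helper name ensure_scheme is mine, not part of any library):

```python
def ensure_scheme(url):
    # urlopen raises "unknown url type" without an explicit scheme,
    # so mimic the browser's default and prepend http:// when missing.
    if not url.startswith(('http://', 'https://')):
        url = 'http://' + url
    return url

print(ensure_scheme('www.cs.odu.edu'))      # http://www.cs.odu.edu
print(ensure_scheme('https://example.com'))  # already has a scheme, unchanged
```

In your script, call html_page = urllib2.urlopen(ensure_scheme(url)) and the traceback above goes away.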
Duplicate of: ValueError: unknown url type in urllib2, though the url is fine if opened in a browser
Try specifying http or https in front of your URL; it should then work.