Using BeautifulSoup in CGI without installing

I am trying to build a simple scraper in Python that will run on a web server via CGI. It will return a value determined by a parameter passed to it in the URL. I need BeautifulSoup to process HTML pages on the server. However, I'm using HelioHost, which doesn't give me shell access, pip, etc.; I can only use FTP. The BS website says you can extract the package and use it directly without installing it.

So I downloaded the tarball on my Win7 machine and used 7-Zip to strip the bz2 compression and then unpack the tar archive, which gave me a bs4 folder and a setup.py file. I transferred the complete bs4 folder via FTP to my cgi-bin directory, where the Python script is located. My script is:

#!/usr/bin/python
import cgitb
cgitb.enable()


import urllib
import urllib2
from bs4 import *

print "Content-type: text/html\n\n"
print "<html><head><title>CGI Demo</title></head>"
print "<h1>Hello World</h1>"
print "</html>"

But it is giving me an error:

 /home/poiasd/public_html/cgi-bin/lel.py
    6 import urllib
    7 import urllib2
    8 from bs4 import *
    9 
   10 print "Content-type: text/html\n\n"
bs4 undefined
SyntaxError: invalid syntax (__init__.py, line 29) 
      args = ('invalid syntax', ('/home/poiasd/public_html/cgi-bin/bs4/__init__.py', 29, 6, 'from .builder import builder_registry\n')) 
      filename = '/home/poiasd/public_html/cgi-bin/bs4/__init__.py' 
      lineno = 29 
      msg = 'invalid syntax' 
      offset = 6 
      print_file_and_line = None 
      text = 'from .builder import builder_registry\n'

How can I use the bs4 module via CGI? How can I use it without actually installing it? Can I convert the BeautifulSoup I have on my PC into a single BeautifulSoup4.py file that contains all the code?

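You are using a version of Python that doesn't yet support PEP 328 relative imports, e.g. Python 2.4 or older. That is why the interpreter reports a SyntaxError on line 29 of bs4/__init__.py, at the statement from .builder import builder_registry: the leading dot is not valid syntax on such an old interpreter. BeautifulSoup 4 requires Python 2.7 or newer.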
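To confirm which interpreter the server really runs, a small diagnostic script can be uploaded next to the failing one. This is only a sketch: the function name version_report is made up for illustration, and the shebang path is copied from the question rather than verified against HelioHost. It avoids the print statement and other newer syntax so that it should run on very old and on current interpreters alike.

```python
#!/usr/bin/python
# Diagnostic CGI script: reports which interpreter the server actually runs.
# The shebang path above is the one from the question; it is an assumption
# that it points at the interpreter the host uses for CGI.
import sys


def version_report():
    """Return a plain-text summary of the running interpreter."""
    major, minor = sys.version_info[0], sys.version_info[1]
    lines = [
        "Python %d.%d" % (major, minor),
        "executable: %s" % sys.executable,
    ]
    # Explicit relative imports (PEP 328) arrived in Python 2.5.
    if (major, minor) < (2, 5):
        lines.append("relative imports: not supported")
    else:
        lines.append("relative imports: supported")
    return "\n".join(lines)


if __name__ == "__main__":
    # sys.stdout.write behaves the same on Python 2 and Python 3.
    sys.stdout.write("Content-Type: text/plain\r\n\r\n")
    sys.stdout.write(version_report() + "\n")
```

If the page reports a version below 2.5, the SyntaxError in bs4/__init__.py is expected and no amount of re-uploading the bs4 folder will fix it.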

Presumably you cannot upgrade to a newer Python version. In that case you can try using BeautifulSoup 3; it'll have a few bugs and you'll be missing some features, but at least you can get past the syntax error.
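One way to make a script tolerate either version is sketched below. BeautifulSoup 3 is distributed as a single module, BeautifulSoup.py, so it can be uploaded over FTP next to the script, which is close to the single-file setup the question asks about. The helper name load_soup_class is hypothetical, and the sketch assumes that either the bs4 folder or BeautifulSoup.py sits in the same directory as the script.

```python
def load_soup_class():
    """Return a BeautifulSoup class, preferring bs4, or None if neither loads."""
    try:
        # On interpreters without relative imports this raises SyntaxError,
        # not ImportError, so both are caught.
        from bs4 import BeautifulSoup
        return BeautifulSoup
    except (ImportError, SyntaxError):
        pass
    try:
        # BeautifulSoup 3 ships as one file, BeautifulSoup.py.
        from BeautifulSoup import BeautifulSoup
        return BeautifulSoup
    except (ImportError, SyntaxError):
        return None


Soup = load_soup_class()
```

The script can then check whether Soup is None and return a readable error page instead of a traceback. The two versions differ in places (for example, some method names), so code written for bs4 may need small changes to run against BeautifulSoup 3.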

However, I note that HelioHost does list Python 2.7 as supported.
