Web scraping with mechanize and BeautifulSoup - cannot write output file

I need to get lots of data specific to river cruising, so I am working with Alteryx, and for the scraping I want to use Python from the command line. I need to write the output to JSON or CSV, but the output file comes out empty. The hash signs in the code are delimiters for processing the output file in Alteryx, since the scraped text already contains commas. Preferably I would love to map the output to JSON. My code is as follows:

from mechanize import Browser
from bs4 import BeautifulSoup
import lxml

mech = Browser()

url = 'http://www.cruiseshipschedule.com/viking-river-cruises/viking-aegir-schedule/'
page = mech.open(url)

html = page.read()
html.replace('charset="ISO-8859-1"','charset=utf-8')
s = BeautifulSoup(html, "lxml")
content = s.findAll('div', id="content")
link = s.findAll("a")
h1 = s.findAll("h1")

table = s.findAll("table", border="1")

for link in s.findAll("a"):
    linktext = link.text
    linkhref = link.get("href")

for h1 in s.findAll("h1"):
    ship = h1.text

h2_1 = s.h2
h2_1.text
h2_2 = h2_1.find_next('h2')
itinerary_1 = h2_2.text
h2_3 = h2_2.find_next('h2')
itinerary_2 = h2_3.text
h2_4 = h2_3.find_next('h2')
itinerary_3 = h2_4.text

for table in content:
    table0 = s.findAll("table", border='0')

    for tr in s.findAll("table", border='1'):
        trs1 = s.findAll("tr")
        table1 = tr.text.replace('\n','|')
        tds1 = s.findAll('td')
        uls1 = s.findAll('ul')
        lis1 = s.findAll('li')

    for tr in s.findAll("table", border='0'):
        trs2 = s.findAll("tr")
        table2 = tr.text.replace('\n','|')
        tds2 = s.findAll('td')
        uls2 = s.findAll('ul')
        lis2 = s.findAll('li')

all_data=ship+"#"+table1+"#"+table2+"#"+itinerary_1+"#"+itinerary_2+"#"+itinerary_3


all_data = open("Z:/txt files/all_data.txt", "w")
print all_data >> "Z:/txt files/all_data.txt"

To get output to your file, try something like this instead of the last two lines of your code. As written, those lines rebind all_data to a freshly opened file object, discarding the string you just built (which is why the file gets created but stays empty), and the print ... >> redirection needs an open file object on its right-hand side, not a path string:

with open('all_data.txt', 'w') as f:
    f.write(all_data.encode('utf8'))
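
Since you said you would prefer JSON, here is a minimal sketch that maps the scraped values to a dict and writes it with the standard json module. The key names are my own assumptions (rename them to whatever your Alteryx workflow expects), and it assumes the variables from your script (ship, table1, table2, itinerary_1, itinerary_2, itinerary_3) are in scope:

import json

# Collect the scraped values into one record; the key names are
# illustrative, not anything your Alteryx flow requires.
record = {
    "ship": ship,
    "itinerary_1": itinerary_1,
    "itinerary_2": itinerary_2,
    "itinerary_3": itinerary_3,
    "table_border1": table1,
    "table_border0": table2,
}

# ensure_ascii=False keeps accented port names readable; encoding to
# UTF-8 bytes explicitly works the same under Python 2 and 3.
with open("Z:/txt files/all_data.json", "wb") as f:
    f.write(json.dumps(record, ensure_ascii=False, indent=2).encode("utf8"))

If you end up wanting CSV instead, the csv module can keep your "#" as the delimiter, so the commas inside the scraped text survive without the manual string concatenation (again a sketch under the same assumptions, using the Python 2 recipe of encoding each field to UTF-8 before writing):

import csv

# One row with the same fields you joined with "#" by hand.
row = [ship, table1, table2, itinerary_1, itinerary_2, itinerary_3]

with open("Z:/txt files/all_data.csv", "wb") as f:
    writer = csv.writer(f, delimiter="#")
    writer.writerow([field.encode("utf8") for field in row])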
