Return [] when scraping data with bs4

Question

I am trying to scrape data from a website, but so far I have been unsuccessful. I have tried a couple of approaches; the most promising one is shown below. I am trying to get the yearBuild value from the page. Can someone help me out? Any leads would be highly appreciated.

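import bs4 as bs
from selenium import webdriver

# Load the page in a real browser and grab the HTML it currently holds
wd = webdriver.Chrome()
url = "https://www.marinetraffic.com/en/ais/details/ships/mmsi:255805792"
wd.get(url)
html_source = wd.page_source
wd.quit()

# Name the parser explicitly to avoid BeautifulSoup's "no parser specified" warning
soup = bs.BeautifulSoup(html_source, 'html.parser')
elems = soup.select('#yearBuild > b')
print(elems)
print(soup.prettify())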

Here, elems is returned as an empty list.

Answer 1

You can use their API to get info about the ship.

For example:

import re
import json
import requests


url = 'https://www.marinetraffic.com/en/ais/details/ships/mmsi:255805792'

ship_info_url = 'https://www.marinetraffic.com/en/vesselDetails/vesselInfo/shipid:{ship_id}'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}

# r.url is the final URL after any redirects; the ship id is read from it
r = requests.get(url, headers=headers)
ship_id = re.search(r'shipid:(\d+)', r.url)[1]

# Request the vessel details as JSON using that ship id
data = requests.get(ship_info_url.format(ship_id=ship_id), headers=headers).json()

print(json.dumps(data, indent=4))
print('Year Built = ', data['yearBuilt'])

Prints:

{
    "name": "LAILA",
    "nameAis": "LAILA",
    "imo": 9377559,
    "eni": null,
    "mmsi": 255805792,
    "callsign": "CQDP",
    "country": "Portugal",
    "countryCode": "PT",
    "type": "Cargo - Hazard A (Major)",
    "typeSpecific": "Container Ship",
    "typeColor": "7",
    "grossTonnage": 28048,
    "deadweight": 38080,
    "teu": 2700,
    "liquidGas": null,
    "length": 215.5,
    "breadth": 29.87,
    "yearBuilt": 2008,
    "status": "Active",
    "isNavigationalAid": false,
    "correspondingRoamingStationId": null,
    "homePort": null
}
Year Built =  2008
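
One fragile spot in the code above is re.search(...)[1]: if the final URL does not contain a shipid: segment, re.search returns None and the indexing raises a TypeError. A minimal sketch of a guard follows. The helper name extract_ship_id and both sample URLs are made up for illustration; the real redirect target may look different.

```python
import re
from typing import Optional

# Same pattern as in the answer, compiled once for reuse
SHIP_ID_PATTERN = re.compile(r'shipid:(\d+)')


def extract_ship_id(final_url: str) -> Optional[str]:
    """Return the digits following 'shipid:' in the URL, or None if absent.

    Hypothetical helper, not part of the original answer.
    """
    match = SHIP_ID_PATTERN.search(final_url)
    return match.group(1) if match else None


# Made-up URLs for illustration only
with_id = 'https://www.marinetraffic.com/en/ais/details/ships/shipid:123456/mmsi:255805792'
without_id = 'https://www.marinetraffic.com/en/ais/details/ships/mmsi:255805792'

print(extract_ship_id(with_id))     # the captured digits, as a string
print(extract_ship_id(without_id))  # None, instead of a TypeError
```

In the answer's script you would pass r.url to this helper and stop with a clear message when it returns None, rather than requesting the details URL with a missing id.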

Answer 2

Could I suggest using VesselFinder instead of MarineTraffic? The data is the same, but MarineTraffic is hard to scrape because its pages are built with JavaScript, while VesselFinder can be scraped with just BeautifulSoup.

VesselFinder also uses tables to show the data so it's easy to parse with pandas.

Here's the code:

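from io import StringIO

import pandas as pd
import requests

r = requests.get('https://www.vesselfinder.com/vessels/LAILA-IMO-9377559-MMSI-255805792', headers={'User-Agent': 'iPhone'})

# read_html returns a list with one DataFrame per <table> on the page;
# StringIO avoids the deprecation warning newer pandas raises for raw HTML strings
df = pd.read_html(StringIO(r.text))

# Stack the two detail tables, then map column 0 (field name) to column 1 (value)
ship = pd.concat([df[2], df[3]], ignore_index=True).set_index(0).to_dict()[1]

print(ship['Year of Built'])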
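
To make the concat / set_index / to_dict chain easier to follow, here is a small offline sketch that skips the HTTP request and builds two stand-in tables by hand. The values are taken from the JSON shown in Answer 1, but the field labels other than 'Year of Built' are assumptions, and the real page's tables may be laid out differently.

```python
import pandas as pd

# Stand-ins for two of the tables pd.read_html would return. They have no
# header row, so pandas labels the columns 0 (field name) and 1 (value).
# Field labels other than 'Year of Built' are assumed for illustration.
table_a = pd.DataFrame([['IMO number', '9377559'], ['Vessel Name', 'LAILA']])
table_b = pd.DataFrame([['Year of Built', '2008'], ['Flag', 'Portugal']])

ship = (
    pd.concat([table_a, table_b], ignore_index=True)  # stack rows of both tables
    .set_index(0)                                     # field names become the index
    .to_dict()[1]                                     # {field name: value} for column 1
)

print(ship['Year of Built'])  # prints 2008
```

to_dict() returns one inner dictionary per column, keyed by the index, so selecting [1] leaves a plain mapping from field name to value.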

2 answers

solution1
1 2020-06-20 10:30:02

solution2
0 2020-06-20 11:09:11
