Get data from a script tag with Scrapy XPath and use it as CSV

Question

I've been trying to extract data from a script tag using Scrapy (XPath). My main issue is identifying the correct div and script tags. I'm new to XPath and would be thankful for any help!

<script>    
var COUNTRY_SHOP_STATUS = "buy";
var COUNTRY_SHOP_URL = "";
try {
digitalData.page.pathIndicator.depth_2 = "mobile";
digitalData.page.pathIndicator.depth_3 = "mobile";
digitalData.page.pathIndicator.depth_4 = "smartphones";
digitalData.page.pathIndicator.depth_5 = "galaxy-s8";    
digitalData.product.pvi_type_name = "Mobile";
digitalData.product.pvi_subtype_name = "Smartphone";
digitalData.product.model_name = "SM-G950F";
digitalData.product.category = digitalData.page.pathIndicator.depth_3;
} catch(e) {}
</script>

Ultimately, I would like to populate my CSV file with the values of model_name and depth_3, depth_4 and depth_5. I've tried the solutions from similar questions, but they don't seem to work...

Answer 1

You can use a regular expression to extract the required values:

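import re

# Inside a Scrapy callback: take the text of the <script> that defines COUNTRY_SHOP_STATUS.
source = response.xpath("//script[contains(., 'COUNTRY_SHOP_STATUS')]/text()").get()

def get_values(parameter, script):
    # Escape the parameter so "." matches a literal dot, and stop at the closing quote.
    return re.findall(r'%s = "([^"]*)"' % re.escape(parameter), script)[0]

print(get_values("pathIndicator.depth_5", source))
print(get_values("pvi_subtype_name", source))
print(get_values("model_name", source))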
...
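
The question also asks about getting these values into a CSV file, which the answer does not cover. A minimal sketch of that step, using only the standard library and the script text from the question as sample input, might look like the following. The names FIELDS, extract_fields and to_csv are hypothetical helpers introduced for illustration, not part of Scrapy; in a real spider the script text would come from the XPath query rather than a string literal.

```python
import csv
import io
import re

# Sample input copied from the question. In a spider this would be the
# text returned by the XPath query on the <script> tag (assumption).
SCRIPT = '''
var COUNTRY_SHOP_STATUS = "buy";
try {
digitalData.page.pathIndicator.depth_3 = "mobile";
digitalData.page.pathIndicator.depth_4 = "smartphones";
digitalData.page.pathIndicator.depth_5 = "galaxy-s8";
digitalData.product.model_name = "SM-G950F";
digitalData.product.category = digitalData.page.pathIndicator.depth_3;
} catch(e) {}
'''

# Hypothetical column list: the four values the question asks for.
FIELDS = ["model_name", "depth_3", "depth_4", "depth_5"]

def extract_fields(script, fields=FIELDS):
    """Return {field: value} for each `<field> = "value"` assignment."""
    row = {}
    for field in fields:
        # Only quoted assignments match, so the `category = ...depth_3;`
        # line is ignored. Missing fields become empty strings.
        match = re.search(r'\b%s\s*=\s*"([^"]*)"' % re.escape(field), script)
        row[field] = match.group(1) if match else ""
    return row

def to_csv(rows, fields=FIELDS):
    """Serialise a list of row dicts to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

row = extract_fields(SCRIPT)
csv_text = to_csv([row])
print(csv_text)
```

Inside a Scrapy spider it is usually simpler to yield the row dict from the callback and let Scrapy's feed export write the file, for example by running the crawl with -o output.csv, instead of building the CSV by hand.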

1 answer

solution1
2 ACCEPTED 2018-08-25 20:08:56