Extract JS data from a web page using Scrapy

Question

I am crawling a web page using Scrapy.

Some of the data I need is inside a script tag. I extracted the full contents of the script tag using XPath, and it looks like this:

 <script>
 some data

 abc.xyz=[["mohit","gupta","456123"]];

 some data
 </script>

I want the value assigned to abc.xyz, but I am unable to extract it.

Answer 1

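You can use the regular expression abc\.xyz=(.*?); to extract the variable's value (the dots are escaped so that they match literal dots rather than any character). If you also want to turn it into a Python list, you can use ast.literal_eval(), which works here because this JavaScript array literal is also a valid Python literal: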

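from ast import literal_eval
import re

text = """<script>
 some data

 abc.xyz=[["mohit","gupta","456123"]];

 some data
 </script>"""

# Escape the dots so they match literally; the lazy group stops at the first ";".
value = re.search(r'abc\.xyz=(.*?);', text).group(1)
print(value, type(value))

# Safely evaluate the captured text as a Python literal (no arbitrary code runs).
value = literal_eval(value)
print(value, type(value))

prints (on Python 3):

[["mohit","gupta","456123"]] <class 'str'>
[['mohit', 'gupta', '456123']] <class 'list'>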
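
As a possible variation on the approach above, the captured text can be parsed with json.loads() instead of literal_eval(), since a JavaScript array of double-quoted strings is usually valid JSON as well. The sketch below is only illustrative: extract_js_value is a hypothetical helper name, and script_text stands in for whatever string your XPath returned for the script tag. It also assumes the value contains no semicolon, because the lazy match stops at the first one.

```python
import json
import re


# Hypothetical helper; the name and signature are illustrative only.
def extract_js_value(script_text, name="abc.xyz"):
    """Return the JSON-compatible value assigned to `name`, or None if absent."""
    # re.escape keeps the dot in "abc.xyz" literal; DOTALL lets the value span lines.
    pattern = re.escape(name) + r"\s*=\s*(.*?);"
    match = re.search(pattern, script_text, re.DOTALL)
    if match is None:
        return None
    return json.loads(match.group(1))


# Stand-in for the script contents extracted from the page.
script_text = """
 some data

 abc.xyz=[["mohit","gupta","456123"]];

 some data
"""

rows = extract_js_value(script_text)
print(rows)
```

Returning None when the assignment is missing avoids the AttributeError that calling .group(1) on a failed re.search() would raise, which can help when some crawled pages do not contain the variable.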
