
Scraping graph data from a website using Python

Is it possible to capture the graph data from a website? For example, the website here has a number of plots. Is it possible to capture these data using Python code?

Looking at the page source of the link you provided, the chart data is available directly in JSON format at the following link: http://www.fbatoolkit.com/chart_data/1414978499.87

So your scraper might want to do something like this:

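import json
import re

import requests

BASE_URL = 'http://www.fbatoolkit.com'

# Fetch the home page and locate the relative path of the chart data script.
r = requests.get(BASE_URL)
data_link = BASE_URL + '/' + re.search(r'chart_data/[^"]*', r.text).group()

# The file is one JavaScript statement, "window.chart_data = ...;", so strip the
# assignment and trailing semicolon, then parse as JSON (safer than eval on remote text).
data_string = requests.get(data_link).text
chart_data = json.loads(data_string.replace('window.chart_data =', '').strip().rstrip(';'))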

(Edit to explain my process for finding the link.) When I approach a problem like this, the first thing I do is view the page source (Ctrl+U in Chrome for Windows). I searched around for something related to drawing the charts until I found the following JavaScript:

  function make_containers(i){
        var chart = chart_data[i];

I then searched the source for where the variable chart_data is defined. I couldn't find the definition, but I did find this line:

<script type="text/javascript" src="/chart_data/1414978499.87"></script>
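The manual search for that script tag can also be automated. The following is only a rough sketch using Python's standard-library HTML parser instead of a regular expression; the names ScriptSrcCollector and find_chart_data_url, the "chart_data/" marker argument, and the /static/app.js entry in the sample HTML are illustrative assumptions, not part of the site or of the code above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_chart_data_url(html, base_url, marker="chart_data/"):
    """Return the absolute URL of the first script whose src contains marker, or None."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    for src in collector.sources:
        if marker in src:
            return urljoin(base_url, src)
    return None


# Sample input: the script tag quoted above plus one made-up, unrelated script.
sample_html = (
    '<html><head>'
    '<script type="text/javascript" src="/static/app.js"></script>'
    '<script type="text/javascript" src="/chart_data/1414978499.87"></script>'
    '</head></html>'
)
chart_url = find_chart_data_url(sample_html, "http://www.fbatoolkit.com")
print(chart_url)
```

In a real scraper the html argument would be the text of the downloaded page. Resolving the relative src with urljoin avoids hand-building the URL.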

Following this link (you can just click on it in the view-source page in Chrome), I could see that it was a one-line piece of JavaScript that defines this variable. (Notice that in the last line of my example code I had to make a small change to this file's contents so that Python could parse it.)
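To make that small change concrete, here is a minimal sketch of turning such a one-line assignment into a Python object. It assumes the right-hand side is strict JSON, as described above; the helper name parse_js_assignment and the sample payload are made up for illustration, since the real file's contents are not shown here. Parsing with json.loads also copes with JSON literals such as null, which are not valid Python names and would make eval raise a NameError.

```python
import json


def parse_js_assignment(js_text, variable="window.chart_data"):
    """Parse the JSON value assigned to `variable` in a one-statement JS file."""
    name, sep, value = js_text.partition("=")
    if not sep or name.strip() != variable:
        raise ValueError(f"no assignment to {variable} found")
    # Drop surrounding whitespace and the trailing semicolon before parsing.
    return json.loads(value.strip().rstrip(";"))


# Hypothetical payload, shaped like "window.chart_data = ...;" but with made-up data.
sample_js = 'window.chart_data = [{"name": "Toys", "points": [[1, 2.5], [2, null]]}];\n'
parsed = parse_js_assignment(sample_js)
print(parsed[0]["name"], parsed[0]["points"])
```

If the file turned out not to be strict JSON (for example, single-quoted strings), json.loads would raise an error and a different parsing approach would be needed.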
