
How to scrape a JavaScript value in PHP

I'm new to PHP and to coding in general, and I can't figure this out. I'm trying to get the number of kills from this profile page.

At the moment, the string I am trying to get is:

29362

When I view the page source, this number is nowhere to be seen.

When I use Inspect Element, however, I find:

<td class="num">29362</td>

How can I get the content shown in Inspect Element instead of the content shown by viewing the page source?

Using a tool like Firebug for Firefox, or the inspector in Safari or Chrome, you can see that a series of AJAX requests for data is made at page load. I didn't dig through everything those requests return, but I do see the data you're looking for in at least one of them:

http://uberstrike.com/profile/all_stats/631163 

So at page load, JavaScript makes a series of AJAX requests back to the server to get all the data, then manipulates the DOM to insert it into the view.

If you wanted, your PHP could request the URL above directly and json_decode the response. That produces a data structure that includes the number in the kills_all_time property.

Quick and dirty example:

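<?php
// Endpoint that returns the profile's stats as JSON.
$data_url = 'http://uberstrike.com/profile/all_stats/631163';

// Fetch the raw JSON text.
$serialized_data = file_get_contents($data_url);

// Decode into an associative array.
$data = json_decode($serialized_data, true);

// Show the all-time kill count.
var_dump($data['kills_all_time']);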
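The quick example assumes the request succeeds and that the response is valid JSON containing the key. A slightly more defensive sketch of the decoding step might look like the following; the function name extract_kills is made up for illustration, and the sample string is a shortened stand-in for the real response, not the actual payload.

```php
<?php
/**
 * Hypothetical helper: pull kills_all_time out of raw JSON text.
 * Returns null when the text is not valid JSON or the key is missing.
 */
function extract_kills(string $json): ?int
{
    $data = json_decode($json, true);

    // json_decode returns null on malformed input, so check the type first.
    if (!is_array($data) || !isset($data['kills_all_time'])) {
        return null;
    }

    return (int) $data['kills_all_time'];
}

// Shortened sample in the shape of the all_stats response (an assumption).
$sample = '{"kills_record":50,"kills_all_time":29362}';
var_dump(extract_kills($sample));
```

In real use you would pass in the result of file_get_contents($data_url), after checking that it did not return false.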

I looked, and there doesn't appear to be an API currently, so your best option is a server-to-server HTTP request. Fetch the page you want, and from there it is mostly string manipulation.

I would recommend using a string search to find <td class="name">Kills</td>; the kills value will appear right after it. From there it's simply a matter of extracting the number with string functions.
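As a rough sketch of that string-search idea, something like the following could work. It only helps if the HTML you have actually contains the rendered row; the question notes that the raw page source does not include the number. The function name find_kills_in_html and the exact row markup are assumptions pieced together from the two cells quoted in this thread.

```php
<?php
/**
 * Hypothetical helper: find the number in the cell that follows
 * <td class="name">Kills</td> using plain string functions.
 * Returns null if the label or the number cell is not found.
 */
function find_kills_in_html(string $html): ?int
{
    $label = '<td class="name">Kills</td>';
    $open  = '<td class="num">';

    $labelPos = strpos($html, $label);
    if ($labelPos === false) {
        return null;
    }

    // Look for the next number cell after the label.
    $start = strpos($html, $open, $labelPos + strlen($label));
    if ($start === false) {
        return null;
    }
    $start += strlen($open);

    $end = strpos($html, '</td>', $start);
    if ($end === false) {
        return null;
    }

    // Accept the cell only if it holds nothing but digits.
    $text = trim(substr($html, $start, $end - $start));
    return ctype_digit($text) ? (int) $text : null;
}

// Assumed markup, modeled on the cells quoted in this thread.
$row = '<tr><td class="name">Kills</td><td class="num">29362</td></tr>';
var_dump(find_kills_in_html($row));
```

String matching like this breaks as soon as the site changes its markup, so treat it as a fallback.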

To add to what JAAulde has explained, there seems to be a pattern to these AJAX requests. They are all based on the profile ID, which can be found at the end of the URL:

http://uberstrike.com/public_profile/631163

Then in the Safari debugger (which is what I am using) you can see the XHR (XMLHttpRequest) requests, which map directly to API calls:

[Screenshot of the XHR requests in the Safari debugger]

Looking at the data in them shows some really nicely formatted JSON. Great! No scraping! So just go through these URLs to see what you can see:

http://uberstrike.com/profile/items

http://uberstrike.com/profile/user_info/631163

http://uberstrike.com/profile/user_loadout/631163

http://uberstrike.com/profile/all_stats/631163
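Since the per-profile URLs differ only in the endpoint name and the trailing ID, they can be generated for any profile. The sketch below assumes that pattern holds; profile_endpoints is a made-up name, and the items URL is left out because it is listed without an ID.

```php
<?php
/**
 * Hypothetical helper: build the per-profile endpoint URLs listed above.
 * The endpoint names come from the observed requests, not a documented API.
 */
function profile_endpoints(int $profileId): array
{
    $base = 'http://uberstrike.com/profile';
    $urls = [];

    foreach (['user_info', 'user_loadout', 'all_stats'] as $name) {
        $urls[$name] = $base . '/' . $name . '/' . $profileId;
    }

    return $urls;
}

// Print the three URLs for the profile discussed in this thread.
print_r(profile_endpoints(631163));
```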

And looking at the all_stats endpoint shows:

"kills_all_time":29362,

Nice!

So now let's use PHP's json_decode like this:

// Set the URL to the data.
$url = 'http://uberstrike.com/profile/all_stats/631163';

// Get the contents of the URL via file_get_contents.
$all_stats_json = file_get_contents($url);

// Decode the JSON string with the 'true' option so we get an array.
$all_stats_json_decoded = json_decode($all_stats_json, true);

// Dump the results for testing.
echo '<pre>';
print_r($all_stats_json_decoded);
echo '</pre>';

Which will dump an array like this:

Array
(
    [headshots_record] => 24
    [nutshots_record] => 33
    [damage_dealt_record] => 6710
    [damage_received_record] => 31073
    [kills_record] => 50
    [smackdowns_record] => 45
    [headshots_all_time] => 4299
    [nutshots_all_time] => 1925
    [kills_all_time] => 29362
    [deaths_all_time] => 16491
    …

Now to get kills_all_time, just do this:

// Return the 'kills_all_time'.
echo $all_stats_json_decoded['kills_all_time'];

Which gives us:

29362
