
Download Flurry event log using curl

I started using Flurry Analytics and found its analysis tools insufficient and too slow. A simple 3-step funnel took 3 days to process, whereas a query with 3 left joins normally takes 0.001 seconds on a table with 100,000 rows.

Flurry allows downloading raw event data in CSV format from the Event Logs page, so I decided to import all the events and analyze them on my own.

Flurry limits each download to 100,000 records and advises downloading often enough to stay within this limit. They had a raw event download API but abandoned it for some reason, so the only way is to go to the Event Logs page and download the event data manually. As you can imagine, that is very annoying.

So I decided to fetch this data using curl in PHP. I copied the HTTP GET request for the download link, headers included, and got the data. But all the magic is in the session cookies, which I can only copy from an existing session. So to make the curl request succeed I have to:

  1. go to the Flurry site in a browser and log in
  2. go to the Event Logs page, choose the time frame parameters and click download
  3. copy the request headers from a sniffer
  4. paste them into my PHP code
  5. from then on I can make this request in PHP until the session cookies expire
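Steps 3 to 5 can be sketched roughly as follows. This is a minimal illustration, assuming the text copied from the sniffer is the raw `Cookie:` request header; the helper names `parseCookieHeader` and `fetchWithCopiedCookies` are hypothetical and not part of any Flurry or curl API.

```php
<?php
// Split a raw "Cookie:" header copied from the browser into name => value pairs.
// If a name occurs more than once, the last value wins.
function parseCookieHeader(string $header): array
{
    // Drop an optional leading "Cookie:" label.
    $header = preg_replace('/^\s*Cookie:\s*/i', '', $header);
    $cookies = [];
    foreach (explode(';', $header) as $pair) {
        $pair = trim($pair);
        if ($pair === '' || strpos($pair, '=') === false) {
            continue; // skip empty fragments such as a trailing ";"
        }
        // Split on the first "=" only, since values may contain "=" themselves.
        [$name, $value] = explode('=', $pair, 2);
        $cookies[trim($name)] = $value;
    }
    return $cookies;
}

// Replay a GET request with the copied cookies (hypothetical helper).
// Returns the response body, or false if the transfer failed.
function fetchWithCopiedCookies(string $url, array $cookies)
{
    // CURLOPT_COOKIE expects "name=value; name2=value2" without the "Cookie:" label.
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIE, implode('; ', $pairs));
    return curl_exec($ch);
}
```

Parsing the header first makes it easy to check that the session cookie (`JSESSIONID` in the header shown in this question) was actually copied before sending the request.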

I'm not sure, but I suppose the cookies will expire the next day, which makes this whole effort useless.

As I understand it, I should POST the login with curl and then, keeping the same session, perform a GET to download the data. Yet I cannot log in even when copying the whole POST login request body: it answers with the same login page, although it should respond with a 302 redirect to https://dev.flurry.com/fullPageTakeover.do?originalTarget=&isFirstPostLogin=true&defaultTarget=%2Fhome.do

It looks like Flurry is somehow protected against this kind of curl access. Or has somebody succeeded at it?

Here's the code:

    $cookie_file_path = "cookies.txt";
    $LOGINURL         = "https://dev.flurry.com/secure/login.do";
    $MY_EMAIL ="my email";
    $MY_PASS="password";
    $MY_GAME_ID="gameid";

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_HEADER,  0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
    curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
    curl_setopt ($ch, CURLOPT_REFERER, $LOGINURL);


    curl_setopt($ch, CURLOPT_URL, $LOGINURL);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, "loginEmail=$MY_EMAIL&loginPassword=$MY_PASS&__checkbox_rememberMe=true&struts.token.name=struts.token&struts.token=7NB9NWLOYZ8SD8TWR8LGS63REVDI8SQS");

    $result = curl_exec($ch);


    $remotePageUrl = "https://dev.flurry.com/eventsLogCsv.do?projectID=$MY_GAME_ID&versionCut=versionsAll&intervalCut=7Days&stream=true&direction=1&offset=0";
    curl_setopt($ch, CURLOPT_POST, 0);
    curl_setopt($ch, CURLOPT_URL, $remotePageUrl);
    $result = curl_exec($ch);

    echo $result;

I also tried passing the cookies the way the browser does, but nothing helped:

Cookie: _ga=GA1.2.100867533.1424333566; S0hZTkM0RFRXRjJNSlg2TVdXSEs_fit=1424333594147; fid=SG1162A8DEFC14B8428E7C2AFC71D3AEA00C1872F5; JSESSIONID=w34~mvbdvftm9x9dez9dg9b2pmhs; _map_zoomLevel=0;
_map_zoneId=0; __utmt=1; __utmt_~1=1; S0hZTkM0RFRXRjJNSlg2TVdXSEs_fs=eyJiYSI6MTQyNDMzNzkzMzU2OCwicGF1c2VUaW1lc3RhbXAiOjAsImJjIjotMSwiZXZlbnRDb3VudGVyIjowLCJwdXJjaGFzZUNvdW50ZXIiOjAsImVycm9yQ291bnRlciI6MCwidGltZWRFdmVudHMiOltdfQ==;
__utma=83277827.100867533.1424333566.1424333594.1424336847.2; __utmb=83277827.8.10.1424336847; __utmc=83277827; __utmz=83277827.1424333594.1.1.utmcsr=flurry.com|utmccn=(referral)|utmcmd=referral|utmcct=/; __utma=34058230.100867533.1424333566.1424333566.1424336847.2; __utmb=34058230.8.10.1424336847; __utmc=34058230; __utmz=34058230.1424333566.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); _mkto_trk=id:802-TBR-126&token:_mch-flurry.com-1424333577360-64839; S0hZTkM0RFRXRjJNSlg2TVdXSEs_flp=1424338032448

Thanks to silkfire, the Flurry problem is solved!

The struts.token is a CSRF token that is bound to your session and regenerated on each page load. In your code, though, it is static. You need to fetch it from the response to your first cURL request and then inject it into the POST array used for your second request.
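Fetching the token can be isolated into a small helper that returns `null` instead of crashing when the field is missing. This is only a sketch: the function name `extractStrutsToken` is hypothetical, and it assumes the login page carries the token in a hidden `<input name="struts.token">` element.

```php
<?php
// Return the value of the hidden struts.token input, or null if the page has none.
function extractStrutsToken(string $html): ?string
{
    if ($html === '') {
        return null; // DOMDocument::loadHTML() rejects an empty string
    }

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $node  = $xpath->query('//input[@name="struts.token"]')->item(0);

    return $node instanceof DOMElement ? $node->getAttribute('value') : null;
}
```

A `null` result is a useful signal that the first request did not return the login form at all, for example because of a redirect or an error page.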

Also, the page you have to post the login to is /loginAction.do, not /login.do.

This is how I successfully logged in to Flurry:

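    // Login form fields; struts.token is filled in below.
    $post = [
        'loginEmail'        => 'E-MAIL',
        'loginPassword'     => 'PASSWORD',
        'struts.token.name' => 'struts.token',
    ];

    // First request: load the login page to obtain a session and a fresh token.
    $ch = curl_init('https://dev.flurry.com/secure/login.do');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Only set when running under a web server; use a fixed string on the CLI.
    curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
    // An empty string enables in-memory cookie handling without reading a file.
    curl_setopt($ch, CURLOPT_COOKIEFILE, '');

    // Disables certificate verification; acceptable for testing only.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);

    // Suppress warnings about malformed HTML.
    libxml_use_internal_errors(true);

    $dom = new DOMDocument('1.0', 'UTF-8');
    $dom->loadHTML(curl_exec($ch));

    $xpath = new DOMXPath($dom);

    // Read the per-page CSRF token from the hidden form field.
    $post['struts.token'] = $xpath->query('//input[@name="struts.token"]')->item(0)->getAttribute('value');

    // Second request: post the credentials and token on the same handle,
    // so the session cookie from the first request is sent along.
    curl_setopt($ch, CURLOPT_URL, 'https://dev.flurry.com/secure/loginAction.do');
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post));

    $data = curl_exec($ch);

    echo $data;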
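Once the login succeeds, the same handle can be reused for the CSV download the question was after. The following is a rough sketch, not a confirmed recipe: the helper names are hypothetical, the query parameters are copied from the URL in the question, and the meaning of `offset` is not documented here, so it is merely exposed as a parameter.

```php
<?php
// Build the CSV export URL using the query parameters from the question.
function buildEventLogUrl(string $projectId, int $offset = 0): string
{
    $query = http_build_query([
        'projectID'   => $projectId,
        'versionCut'  => 'versionsAll',
        'intervalCut' => '7Days',
        'stream'      => 'true',
        'direction'   => 1,
        'offset'      => $offset,
    ], '', '&');

    return 'https://dev.flurry.com/eventsLogCsv.do?' . $query;
}

// Heuristic: an HTML page instead of CSV usually means the session was rejected
// and the login page came back.
function looksLikeHtml(string $body): bool
{
    return (bool) preg_match('/^\s*<(!doctype|html)\b/i', $body);
}

// Download the event log with an already logged-in handle (hypothetical helper).
// Assumes CURLOPT_RETURNTRANSFER is enabled on $ch, as in the login code.
function downloadEventLog($ch, string $projectId, string $targetFile): bool
{
    curl_setopt($ch, CURLOPT_HTTPGET, true); // switch the handle back from POST to GET
    curl_setopt($ch, CURLOPT_URL, buildEventLogUrl($projectId));

    $body = curl_exec($ch);
    if ($body === false || looksLikeHtml($body)) {
        return false;
    }

    return file_put_contents($targetFile, $body) !== false;
}
```

Calling `downloadEventLog($ch, 'gameid', 'events.csv')` right after the login request would write the export to `events.csv`, or return `false` if the server answered with an HTML page instead.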
