Is there any chance to retrieve DOM results when I click older posts from the site:
http://www.facebook.com/FamilyGuy
using C# or Java? I heard that it is possible to execute a script with onclick
and get results. How I can execute this script:
onclick="(JSCC.get('j4eb9ad57ab8a19f468880561') && JSCC.get('j4eb9ad57ab8a19f468880561').getHandler())(); return false;"
I think older posts
link sends an Ajax
request and appends the response to the page. (I'm not sure. You should check the page source).
You can emulate this behavior in C#
, Java
, and JavaScript
(you already have the code for javascript).
Edit:
It seems that Facebook
uses some sort of internal APIs ( JSCC
) to load the content and it's undocumented.
I don't know about Facebook
Developers' APIs (you may want to check that first) but if you want to emulate exactly what happens in your browser then you can use TamperData
to intercept GET
requests when you click on more posts
link and find the request URL and it's parameters .
After you get this information you have to Login
to your account in your application and get the authentication cookie.
C#
sample code as you requested:
private CookieContainer GetCookieContainer(string loginURL, string userName, string password)
{
var webRequest = WebRequest.Create(loginURL) as HttpWebRequest;
var responseReader = new StreamReader(webRequest.GetResponse().GetResponseStream());
string responseData = responseReader.ReadToEnd();
responseReader.Close();
// Now you may need to extract some values from the login form and build the POST data with your username and password.
// I don't know what exactly you need to POST but again a TamperData observation will help you to find out.
string postData =String.Format("UserName={0}&Password={1}", userName, password); // I emphasize that this is just an example.
// cookie container
var cookies = new CookieContainer();
// post the login form
webRequest = WebRequest.Create(loginURL) as HttpWebRequest;
webRequest.Method = "POST";
webRequest.ContentType = "application/x-www-form-urlencoded";
webRequest.CookieContainer = cookies;
// write the form values into the request message
var requestWriter = new StreamWriter(webRequest.GetRequestStream());
requestWriter.Write(postData);
requestWriter.Close();
webRequest.GetResponse().Close();
return cookies;
}
Then you can perform GET
requests with the cookie you have, on the URL
you've got from analyzing that JSCC.get().getHandler()
requests using TamperData
, and eventually you'll get what you want as a response stream:
var webRequest = WebRequest.Create(url) as HttpWebRequest;
webRequest.CookieContainer = GetCookieContainer(url, userName, password);
var responseStream = webRequest.GetResponse().GetResponseStream();
You can also use Selenium
for browser automation. It also has C#
and Java
APIs (I have no experience using Selenium
).
Facebook loads it's content dynamically with AJAX. You can use a tool like Firebug to examine what kind of request is made, and then replicate it.
Or you can use a browser render engine like webkit to process the JavaScript for you and expose the resulting HTML: http://webscraping.com/blog/Scraping-JavaScript-webpages-with-webkit/
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.